Crate hdfs
hdfs-rs is a library for accessing an HDFS cluster. At its base, it provides FFI bindings to libhdfs. It also provides more idiomatic and abstract Rust APIs that hide manual memory management and some thread-safety problems of libhdfs. The Rust APIs are highly recommended for most users.
Important Note
The original libhdfs implementation allows only one HdfsFs instance per namenode, because libhdfs keeps only a single hdfsFs entry for each namenode. As a result, you need to keep a single HdfsFsCache for the entire program, and you must obtain HdfsFs instances only through that HdfsFsCache. To do so, you need to share the HdfsFsCache instance across all threads in the program. In contrast, an HdfsFs instance itself is thread-safe.
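The sharing rule above can be sketched without a running cluster. The following is a minimal illustration, not the hdfs-rs API: FsHandle and FsCache are hypothetical stand-ins for HdfsFs and HdfsFsCache, and lookup_from_threads is a hypothetical helper. It assumes the cache is shared through Arc<Mutex<...>> and shows that every thread asking for the same namenode receives the same handle.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for `HdfsFs`: one handle per namenode URI.
#[derive(Debug)]
pub struct FsHandle {
    pub namenode: String,
}

/// Hypothetical stand-in for `HdfsFsCache`: hands out the same handle for
/// repeated requests to the same namenode.
#[derive(Default)]
pub struct FsCache {
    handles: HashMap<String, Arc<FsHandle>>,
}

impl FsCache {
    /// Returns the cached handle for `namenode`, creating it on first use.
    pub fn get(&mut self, namenode: &str) -> Arc<FsHandle> {
        let handle = self
            .handles
            .entry(namenode.to_string())
            .or_insert_with(|| Arc::new(FsHandle { namenode: namenode.to_string() }));
        Arc::clone(handle)
    }
}

/// Spawns `n` threads that all look up `namenode` through one shared cache
/// and returns the handles they received.
pub fn lookup_from_threads(namenode: &str, n: usize) -> Vec<Arc<FsHandle>> {
    // One cache for the whole program, shared by every thread.
    let cache = Arc::new(Mutex::new(FsCache::default()));
    let workers: Vec<_> = (0..n)
        .map(|_| {
            let cache = Arc::clone(&cache);
            let namenode = namenode.to_string();
            thread::spawn(move || {
                // The lock is released at the end of this statement.
                let handle = cache.lock().unwrap().get(&namenode);
                handle
            })
        })
        .collect();
    workers.into_iter().map(|w| w.join().unwrap()).collect()
}
```

Calling lookup_from_threads("hdfs://localhost:8020/", 4) yields four Arc values that all point to one FsHandle. Whether the real HdfsFsCache can be wrapped the same way depends on its Send and Sync bounds, which this sketch does not verify.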
Usage
Add this to your Cargo.toml:
[dependencies]
hdfs = "0.0.4"
or
[dependencies.hdfs]
git = "https://github.com/hyunsik/hdfs-rs.git"
and add this to your crate root:
extern crate hdfs;
hdfs-rs uses libhdfs, which is a JNI-based native implementation and therefore requires a proper CLASSPATH. The exec.sh script in the root of the source tree runs your program with the proper CLASSPATH. exec.sh requires HADOOP_HOME, so first set the HADOOP_HOME shell environment variable as follows:
export HADOOP_HOME=<hadoop install dir>
Then, you can execute your program as follows:
./exec.sh your_program arg1 arg2
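Because a missing CLASSPATH typically surfaces only once the JVM is started, a program may want to check its environment up front. The following is a minimal sketch of such a preflight check; check_jni_env and check_current_env are hypothetical helpers, not part of hdfs-rs, and the check only tests that the variables are non-empty, not that their contents are valid.

```rust
use std::path::PathBuf;

/// Hypothetical preflight check: verifies that both values are present and
/// non-empty, and returns the Hadoop install directory on success.
pub fn check_jni_env(
    hadoop_home: Option<&str>,
    classpath: Option<&str>,
) -> Result<PathBuf, String> {
    let home = match hadoop_home {
        Some(h) if !h.trim().is_empty() => PathBuf::from(h),
        _ => return Err("HADOOP_HOME is not set".to_string()),
    };
    match classpath {
        Some(cp) if !cp.trim().is_empty() => Ok(home),
        _ => Err("CLASSPATH is not set; run the program through exec.sh".to_string()),
    }
}

/// Runs the same check against the current process environment.
pub fn check_current_env() -> Result<PathBuf, String> {
    let home = std::env::var("HADOOP_HOME").ok();
    let classpath = std::env::var("CLASSPATH").ok();
    check_jni_env(home.as_deref(), classpath.as_deref())
}
```

Calling check_current_env() at the start of main, before any HDFS access, turns a missing variable into a readable error message.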
Testing
The tests also require the CLASSPATH, so you should run cargo test through exec.sh:
./exec.sh cargo test
Example
use std::rc::Rc;
use std::cell::RefCell;
use hdfs::{HdfsFs, HdfsFsCache};

// You must get an HdfsFs instance through HdfsFsCache. Also, HdfsFsCache
// must be shared across all threads in the entire program in order to
// avoid the thread-safety problem of the original libhdfs.
let cache = Rc::new(RefCell::new(HdfsFsCache::new()));
let fs: HdfsFs = cache.borrow_mut().get("hdfs://localhost:8020/").ok().unwrap();

match fs.mkdir("/data") {
    Ok(_) => { println!("/data has been created") },
    Err(_) => { panic!("/data creation has failed") }
};
Modules
minidfs | MiniDfs Cluster: a mini HDFS cluster for easily building unit tests
native | libhdfs FFI binding APIs
Structs
BlockHosts | Includes hostnames where a particular block of a file is stored.
FileStatus | Interface that represents the client side information for a file or directory.
HdfsFile | open hdfs file
HdfsFs | Hdfs Filesystem
HdfsFsCache | HdfsFsCache which caches HdfsFs instances.
HdfsUtil | Hdfs Utility
RzBuffer | A buffer returned from zero-copy read. This buffer will be automatically freed when its lifetime is finished.
RzOptions | Options for zero-copy read
Enums
HdfsErr | Errors which can occur during accessing Hdfs cluster