Crate hdfs
hdfs-rs is a library for accessing an HDFS cluster. At its base, it provides libhdfs FFI APIs. It also provides more idiomatic and abstract Rust APIs that hide the manual memory management and some of the thread-safety problems of libhdfs. The Rust APIs are highly recommended for most users.
Important Note
The original libhdfs implementation allows only one HdfsFs instance per
namenode, because libhdfs keeps only a single hdfsFs entry for each namenode.
As a result, you need to keep a single HdfsFsCache for the entire program, and
you must obtain HdfsFs instances only through that HdfsFsCache. To do so, you need to share
the HdfsFsCache instance across all threads in the program.
In contrast, an HdfsFs instance itself is thread-safe.
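The sharing pattern described in this note can be sketched without the crate itself. The block below is only an illustration: `FsHandle`, `FsCache`, and `handles_from_threads` are hypothetical stand-ins, not hdfs-rs types, and wrapping the cache in `Arc<Mutex<...>>` is an assumption about how one might share it across threads. Whether the real HdfsFsCache can be wrapped this way depends on its actual type and trait bounds.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for a per-namenode filesystem handle (not an hdfs-rs type).
#[derive(Debug)]
struct FsHandle {
    namenode: String,
}

/// Hypothetical stand-in for a cache that keeps exactly one handle per namenode.
#[derive(Default)]
struct FsCache {
    handles: HashMap<String, Arc<FsHandle>>,
}

impl FsCache {
    /// Returns the existing handle for `namenode`, creating it on first use.
    fn get(&mut self, namenode: &str) -> Arc<FsHandle> {
        self.handles
            .entry(namenode.to_string())
            .or_insert_with(|| {
                Arc::new(FsHandle {
                    namenode: namenode.to_string(),
                })
            })
            .clone()
    }
}

/// Spawns `n` threads that all look up the same namenode through one shared cache
/// and collects the handles they received.
fn handles_from_threads(
    cache: &Arc<Mutex<FsCache>>,
    namenode: &str,
    n: usize,
) -> Vec<Arc<FsHandle>> {
    let workers: Vec<_> = (0..n)
        .map(|_| {
            let cache = Arc::clone(cache);
            let namenode = namenode.to_string();
            thread::spawn(move || {
                // Only the cache lookup is serialized; the handle is used lock-free afterwards.
                let mut guard = cache.lock().unwrap();
                guard.get(&namenode)
            })
        })
        .collect();
    workers.into_iter().map(|w| w.join().unwrap()).collect()
}
```

Because every thread goes through the same cache, all of them end up holding the same handle for a given namenode, which is the property the note asks for. The lock guards only the lookup, matching the statement that the filesystem instance itself can be used from several threads.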
Usage
Add this to your Cargo.toml:

    [dependencies]
    hdfs = "0.0.4"

or

    [dependencies.hdfs]
    git = "https://github.com/hyunsik/hdfs-rs.git"

and this to your crate root:

    extern crate hdfs;

hdfs-rs uses libhdfs, which is a JNI-based native implementation. It therefore
requires a proper CLASSPATH. The exec.sh script in the source code root
runs your program with the proper CLASSPATH. exec.sh requires HADOOP_HOME,
so first set the HADOOP_HOME shell environment variable as follows:
export HADOOP_HOME=<hadoop install dir>
Then, you can execute your program as follows:
./exec.sh your_program arg1 arg2
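A program that forgets to go through exec.sh tends to fail deep inside JNI start-up, so it can help to check the environment early. The following is a minimal sketch, not part of hdfs-rs: `missing_vars` and `check_hdfs_env` are hypothetical helper names, and treating both HADOOP_HOME and CLASSPATH as required at run time assumes that exec.sh passes both on to the program it launches.

```rust
use std::env;

/// Variables named in the text above; assumed to be visible to the launched program.
const REQUIRED_VARS: [&str; 2] = ["HADOOP_HOME", "CLASSPATH"];

/// Returns the names of required variables that are unset or blank.
/// `lookup` abstracts the environment so the check can run without touching real variables.
fn missing_vars<F>(lookup: F) -> Vec<&'static str>
where
    F: Fn(&str) -> Option<String>,
{
    REQUIRED_VARS
        .iter()
        .copied()
        .filter(|name| lookup(*name).map_or(true, |value| value.trim().is_empty()))
        .collect()
}

/// Checks the real process environment and describes what is missing.
fn check_hdfs_env() -> Result<(), String> {
    let missing = missing_vars(|name| env::var(name).ok());
    if missing.is_empty() {
        Ok(())
    } else {
        Err(format!(
            "missing environment variables: {} (was the program started through exec.sh?)",
            missing.join(", ")
        ))
    }
}
```

Calling `check_hdfs_env()` at the top of `main` and exiting on `Err` gives a readable message before any libhdfs call is attempted.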
Testing
The tests also require the CLASSPATH, so you should run cargo test
through exec.sh:
./exec.sh cargo test
Example
    use std::rc::Rc;
    use std::cell::RefCell;
    use hdfs::{HdfsFs, HdfsFsCache};

    // You must get an HdfsFs instance through HdfsFsCache. Also, HdfsFsCache
    // must be shared across all threads in the entire program in order to
    // avoid the thread-safety problem of the original libhdfs.
    let cache = Rc::new(RefCell::new(HdfsFsCache::new()));
    let fs: HdfsFs = cache.borrow_mut().get("hdfs://localhost:8020/").ok().unwrap();
    match fs.mkdir("/data") {
        Ok(_) => { println!("/data has been created") },
        Err(_) => { panic!("/data creation has failed") }
    };
Modules
| minidfs | MiniDfs Cluster: a mini HDFS cluster for easily building unit tests |
| native | libhdfs native (FFI) binding APIs |
Structs
| BlockHosts | Includes hostnames where a particular block of a file is stored. |
| FileStatus | Interface that represents the client side information for a file or directory. |
| HdfsFile | An open HDFS file |
| HdfsFs | HDFS filesystem |
| HdfsFsCache | A cache of HdfsFs instances. |
| HdfsUtil | HDFS utility |
| RzBuffer | A buffer returned from zero-copy read. This buffer will be automatically freed when its lifetime is finished. |
| RzOptions | Options for zero-copy read |
Enums
| HdfsErr | Errors which can occur while accessing an HDFS cluster |