A distributed file system that provides high-throughput access to application data.
This service can be used to:
[Webhdfs][crate::services::Webhdfs] is powered by hdfs's RESTful HTTP API.
HDFS support requires enabling the `services-hdfs` feature.
- `root`: Set the work dir for the backend.
- `name_node`: Set the name node for the backend.
- `kerberos_ticket_cache_path`: Set the kerberos ticket cache path for the backend; this should be gotten via `klist` after `kinit`.
- `user`: Set the user for the backend.
- `enable_append`: Enable the append capability. Default is false.

Refer to [HdfsBuilder]'s public API docs for more information.
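For a Kerberos-secured cluster, the ticket cache path and user can be supplied alongside the other options. A minimal sketch, assuming the builder methods mirror the configuration keys above and using placeholder values:

```rust
use opendal_service_hdfs::Hdfs;

fn main() {
    // Placeholder values for a Kerberos-secured cluster; the ticket cache path
    // is the one reported by `klist` after a successful `kinit`.
    let _builder = Hdfs::default()
        .name_node("hdfs://127.0.0.1:9000")
        .root("/tmp")
        .kerberos_ticket_cache_path("/tmp/krb5cc_1000")
        .user("hdfs");
}
```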
HDFS needs the following environment variables to be set correctly.
- `JAVA_HOME`: the path to the Java home, which can be found via `java -XshowSettings:properties -version`.
- `HADOOP_HOME`: the path to the Hadoop home; opendal relies on this env to discover hadoop jars and set `CLASSPATH` automatically.

Most of the time, setting `JAVA_HOME` and `HADOOP_HOME` is enough. But there are some edge cases:
If you meet an error like the following:

```shell
error while loading shared libraries: libjvm.so: cannot open shared object file: No such file or directory
```
Java's libraries are not included in the pkg-config search path; please set `LD_LIBRARY_PATH`:
```shell
export LD_LIBRARY_PATH=${JAVA_HOME}/lib/server:${LD_LIBRARY_PATH}
```
The path of `libjvm.so` could be different; please keep an eye on it.
If you meet an error like the following:

```shell
(unable to get stack trace for java.lang.NoClassDefFoundError exception: ExceptionUtils::getStackTrace error.)
```
This means `CLASSPATH` is not set correctly or your Hadoop installation is incorrect.

To set `CLASSPATH`:
```shell
export CLASSPATH=$(find $HADOOP_HOME -iname "*.jar" | xargs echo | tr ' ' ':'):${CLASSPATH}
```
If you want to connect via the `cluster_name` defined in your Hadoop configuration instead of a single `namenode:port`, set `HADOOP_CONF_DIR` to the folder that contains the configuration:

```shell
export HADOOP_CONF_DIR=<path of the config folder>
```

Add `HADOOP_CONF_DIR` to `CLASSPATH`:

```shell
export CLASSPATH=$HADOOP_CONF_DIR:$HADOOP_CLASSPATH:$CLASSPATH
```

Then use the `cluster_name` specified in the `core-site.xml` file (located in the `HADOOP_CONF_DIR` folder) to replace `namenode:port`:

```rust
builder.name_node("hdfs://cluster_name");
```
If you encounter an issue during the build process on macOS with an error message similar to:
```shell
ld: unknown file type in $HADOOP_HOME/lib/native/libhdfs.so.0.0.0
clang: error: linker command failed with exit code 1 (use -v to see invocation)
```
This error is likely because the official Hadoop build ships a `libhdfs.so` built for the x86-64 architecture, which is not compatible with the aarch64 architecture required on macOS (Apple Silicon).
To resolve this issue, you can add `hdrs` as a dependency in your Rust application's `Cargo.toml` and enable its `vendored` feature:
```toml
[dependencies]
hdrs = { version = "<version_number>", features = ["vendored"] }
```
Enabling the `vendored` feature ensures that `hdrs` includes the necessary `libhdfs.so` library built for the correct architecture.
A full example of building an HDFS-backed `Operator`:

```rust
use opendal_core::Operator;
use opendal_service_hdfs::Hdfs;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create the fs backend builder.
    let builder = Hdfs::default()
        // Set the name node for hdfs.
        // If the string starts with a protocol type such as file://, hdfs://, or gs://,
        // this protocol type will be used.
        .name_node("hdfs://127.0.0.1:9000")
        // Set the root for hdfs, all operations will happen under this root.
        //
        // NOTE: the root must be an absolute path.
        .root("/tmp")
        // Enable the append capability for hdfs.
        //
        // Note: HDFS running in non-distributed mode doesn't support append.
        .enable_append(true);

    // `Accessor` provides the low-level APIs; we will use `Operator` normally.
    let op: Operator = Operator::new(builder)?.finish();

    Ok(())
}
```
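Once the `Operator` is built, data access goes through its usual async methods. A minimal follow-up sketch (the path, payload, and `roundtrip` helper are illustrative assumptions, not HDFS-specific API):

```rust
use opendal_core::Operator;

// Illustrative helper: write, read back, stat, and delete a small object
// under the configured root. Path and payload are placeholders.
async fn roundtrip(op: &Operator) -> Result<(), Box<dyn std::error::Error>> {
    op.write("hello.txt", b"Hello, HDFS!".to_vec()).await?;

    let content = op.read("hello.txt").await?;
    let meta = op.stat("hello.txt").await?;
    println!("read {} bytes, content_length = {}", content.len(), meta.content_length());

    op.delete("hello.txt").await?;
    Ok(())
}
```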