blob: 2c33988926ddc46b26a542f3f4c48b05cb5b6c9e [file] [log] [blame]
HDFSPROXY is an HTTPS proxy server that exposes the same HSFTP interface as a
real cluster. It authenticates users via user certificates and enforce access
control based on configuration files.
Starting up an HDFSPROXY server is similar to starting up an HDFS cluster.
Simply run "hdfsproxy" shell command. The main configuration file is
hdfsproxy-default.xml, which should be on the classpath. hdfsproxy-env.sh
can be used to set up environmental variables. In particular, JAVA_HOME should
be set. Additional configuration files include user-certs.xml,
user-permissions.xml and ssl-server.xml, which are used to specify allowed user
certs, allowed directories/files, and ssl keystore information for the proxy,
respectively. The location of these files can be specified in
hdfsproxy-default.xml. Environmental variable HDFSPROXY_CONF_DIR can be used to
point to the directory where these configuration files are located. The
configuration files of the proxied HDFS cluster should also be available on the
classpath (hdfs-default.xml and hdfs-site.xml).
Mirroring those used in HDFS, a few shell scripts are provided to start and
stop a group of proxy servers. The hosts to run hdfsproxy on are specified in
hdfsproxy-hosts file, one host per line. All hdfsproxy servers are stateless
and run independently from each other. Simple load balancing can be set up by
mapping all hdfsproxy server IP addresses to a single hostname. Users should
use that hostname to access the proxy. If an IP address look up for that
hostname returns more than one IP addresses, an HFTP/HSFTP client will randomly
pick one to use.
Command "hdfsproxy -reloadPermFiles" can be used to trigger reloading of
user-certs.xml and user-permissions.xml files on all proxy servers listed in
the hdfsproxy-hosts file. Similarly, "hdfsproxy -clearUgiCache" command can be
used to clear the UGI caches on all proxy servers.