<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<!--
| Generated by Apache Maven Doxia at 2021-06-15
| Rendered using Apache Maven Stylus Skin 1.5
-->
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Apache Hadoop Amazon Web Services support &#x2013; Working with Encrypted S3 Data</title>
<style type="text/css" media="all">
@import url("../../css/maven-base.css");
@import url("../../css/maven-theme.css");
@import url("../../css/site.css");
</style>
<link rel="stylesheet" href="../../css/print.css" type="text/css" media="print" />
<meta name="Date-Revision-yyyymmdd" content="20210615" />
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
</head>
<body class="composite">
<div id="banner">
<a href="http://hadoop.apache.org/" id="bannerLeft">
<img src="http://hadoop.apache.org/images/hadoop-logo.jpg" alt="" />
</a>
<a href="http://www.apache.org/" id="bannerRight">
<img src="http://www.apache.org/images/asf_logo_wide.png" alt="" />
</a>
<div class="clear">
<hr/>
</div>
</div>
<div id="breadcrumbs">
<div class="xright"> <a href="http://wiki.apache.org/hadoop" class="externalLink">Wiki</a>
|
<a href="https://gitbox.apache.org/repos/asf/hadoop.git" class="externalLink">git</a>
&nbsp;| Last Published: 2021-06-15
&nbsp;| Version: 3.3.1
</div>
<div class="clear">
<hr/>
</div>
</div>
<div id="bodyColumn">
<div id="contentBox">
<!---
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<h1>Working with Encrypted S3 Data</h1>
<ul>
<li><a href="#Introduction"> Introduction</a></li>
<li><a href="#How_data_is_encrypted">How data is encrypted</a></li>
<li><a href="#S3_Default_Encryption"> S3 Default Encryption</a></li>
<li><a href="#SSE-S3_Amazon_S3-Managed_Encryption_Keys"> SSE-S3 Amazon S3-Managed Encryption Keys</a>
<ul>
<li><a href="#Enabling_SSE-S3">Enabling SSE-S3</a></li>
<li><a href="#SSE-KMS:_Amazon_S3-KMS_Managed_Encryption_Keys"> SSE-KMS: Amazon S3-KMS Managed Encryption Keys</a></li>
<li><a href="#Enabling_SSE-KMS">Enabling SSE-KMS</a></li>
<li><a href="#the_S3A_fs.s3a.encryption.key_key_only_affects_created_files">The S3A fs.s3a.server-side-encryption.key option only affects created files</a></li></ul></li>
<li><a href="#SSE-C:_Server_side_encryption_with_a_client-supplied_key."> SSE-C: Server side encryption with a client-supplied key.</a>
<ul>
<li><a href="#Enabling_SSE-C">Enabling SSE-C</a></li>
<li><a href="#The_fs.s3a.encryption.key_value_is_used_to_read_and_write_data">The fs.s3a.server-side-encryption.key value is used to read and write data</a></li>
<li><a href="#SSE-C_Warning">SSE-C Warning</a></li></ul></li>
<li><a href="#Encryption_best_practises"> Encryption best practices</a>
<ul>
<li><a href="#Mandate_encryption_through_policies"> Mandate encryption through policies</a></li>
<li><a href="#Use_S3a_per-bucket_configuration_to_control_encryption_settings"> Use S3a per-bucket configuration to control encryption settings</a></li>
<li><a href="#Use_rename.28.29_to_encrypt_files_with_new_keys"> Use rename() to encrypt files with new keys</a></li></ul></li>
<li><a href="#Troubleshooting_Encryption"> Troubleshooting Encryption</a></li></ul>
<div class="section">
<h2><a name="Introduction"></a><a name="introduction"></a> Introduction</h2>
<p>The S3A filesystem client supports Amazon S3&#x2019;s Server Side Encryption for at-rest data encryption. You should read the <a class="externalLink" href="https://docs.aws.amazon.com/AmazonS3/latest/dev/serv-side-encryption.html">AWS documentation</a> for S3 Server Side Encryption for up-to-date information on the encryption mechanisms.</p>
<p>An encryption method configured in <tt>core-site.xml</tt> applies cluster wide: any new file written will be encrypted with this encryption configuration. When the S3A client reads a file, S3 will attempt to decrypt it using the mechanism and keys with which the file was encrypted.</p>
<ul>
<li>It is <b>NOT</b> advised to mix and match encryption types in a bucket.</li>
<li>It is much simpler and safer to encrypt with just one type and key per bucket.</li>
<li>You can use AWS bucket policies to mandate encryption rules for a bucket.</li>
<li>You can use S3A per-bucket configuration to ensure that S3A clients use encryption policies consistent with the mandated rules.</li>
<li>You can use S3 Default Encryption to encrypt data without needing to set anything in the client.</li>
<li>Changing the encryption options on the client does not change how existing files were encrypted, except when the files are renamed.</li>
<li>For all mechanisms other than SSE-C, clients do not need any configuration options set in order to read encrypted data: it is all automatically handled in S3 itself.</li>
</ul></div>
<div class="section">
<h2><a name="How_data_is_encrypted"></a><a name="encryption_types"></a>How data is encrypted</h2>
<p>AWS S3 supports server-side encryption inside the storage system itself. When an S3 client uploading data requests that the data be encrypted, an encryption key is used to encrypt the data as it is saved to S3. It remains encrypted on S3 until deleted: clients cannot change the encryption attributes of an object once uploaded.</p>
<p>The Amazon AWS SDK also offers client-side encryption, in which all the encoding and decoding of data is performed on the client. This is <i>not</i> supported by the S3A client.</p>
<p>The server-side &#x201c;SSE&#x201d; encryption is performed with symmetric AES256 encryption; S3 offers different mechanisms for actually defining the key to use.</p>
<p>There are four key management mechanisms, which in order of simplicity of use, are:</p>
<ul>
<li>S3 Default Encryption</li>
<li>SSE-S3: an AES256 key is generated in S3, and saved alongside the data.</li>
<li>SSE-KMS: an AES256 key is generated in S3, and encrypted with a secret key provided by Amazon&#x2019;s Key Management Service, a key referenced by name in the uploading client.</li>
<li>SSE-C: the client specifies an actual base64 encoded AES-256 key to be used to encrypt and decrypt the data.</li>
</ul></div>
<div class="section">
<h2><a name="S3_Default_Encryption"></a><a name="sse-s3"></a> S3 Default Encryption</h2>
<p>This feature allows the administrators of the AWS account to set the &#x201c;default&#x201d; encryption policy on a bucket: the encryption to use if the client does not explicitly declare an encryption algorithm.</p>
<p><a class="externalLink" href="https://docs.aws.amazon.com/AmazonS3/latest/dev/bucket-encryption.html">S3 Default Encryption for S3 Buckets</a></p>
<p>This supports SSE-S3 and SSE-KMS.</p>
<p>There is no need to set anything up in the client: do it in the AWS console.</p></div>
<div class="section">
<h2><a name="SSE-S3_Amazon_S3-Managed_Encryption_Keys"></a><a name="sse-s3"></a> SSE-S3 Amazon S3-Managed Encryption Keys</h2>
<p>In SSE-S3, all keys and secrets are managed inside S3. This is the simplest encryption mechanism. There is no extra cost for storing data with this option.</p>
<div class="section">
<h3><a name="Enabling_SSE-S3"></a>Enabling SSE-S3</h3>
<p>To write SSE-S3 encrypted files, set <tt>fs.s3a.server-side-encryption-algorithm</tt> in <tt>core-site</tt> to the name of the encryption mechanism; for SSE-S3, currently only <tt>AES256</tt> is supported.</p>
<div>
<div>
<pre class="source">&lt;property&gt;
&lt;name&gt;fs.s3a.server-side-encryption-algorithm&lt;/name&gt;
&lt;value&gt;AES256&lt;/value&gt;
&lt;/property&gt;
</pre></div></div>
<p>Once set, all new data will be stored encrypted. There is no need to set this property when downloading data &#x2014; the data will be automatically decrypted when read using the Amazon S3-managed key.</p>
<p>To learn more, refer to <a class="externalLink" href="http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingServerSideEncryption.html">Protecting Data Using Server-Side Encryption with Amazon S3-Managed Encryption Keys (SSE-S3) in AWS documentation</a>.</p></div>
<div class="section">
<h3><a name="SSE-KMS:_Amazon_S3-KMS_Managed_Encryption_Keys"></a><a name="sse-kms"></a> SSE-KMS: Amazon S3-KMS Managed Encryption Keys</h3>
<p>Amazon offers a pay-per-use key management service, <a class="externalLink" href="https://aws.amazon.com/documentation/kms/">AWS KMS</a>. This service can be used to encrypt data on S3 by defining &#x201c;customer master keys&#x201d;, CMKs, which can be centrally managed and assigned to specific roles and IAM accounts.</p>
<p>The AWS KMS <a class="externalLink" href="http://docs.aws.amazon.com/kms/latest/developerguide/services-s3.html">can be used to encrypt data uploaded to S3</a>.</p>
<blockquote>
<p>The AWS KMS service is <b>not</b> related to the Key Management Service built into Hadoop (<i>Hadoop KMS</i>). The <i>Hadoop KMS</i> primarily focuses on managing keys for <i>HDFS Transparent Encryption</i>. Similarly, HDFS encryption is unrelated to S3 data encryption.</p>
</blockquote>
<p>When uploading data encrypted with SSE-KMS, the sequence is as follows.</p>
<ol style="list-style-type: decimal">
<li>
<p>The S3A client must declare a specific CMK in the property <tt>fs.s3a.server-side-encryption.key</tt>, or leave it blank to use the default configured for that region.</p>
</li>
<li>
<p>The S3A client uploads all the data as normal, now including encryption information.</p>
</li>
<li>
<p>The S3 service encrypts the data with a symmetric key unique to the new object.</p>
</li>
<li>
<p>The S3 service retrieves the chosen CMK from the KMS service, and, if the user has the right to use it, uses it to encrypt the object-specific key.</p>
</li>
</ol>
<p>When downloading SSE-KMS encrypted data, the sequence is as follows.</p>
<ol style="list-style-type: decimal">
<li>The S3A client issues an HTTP GET request to read the data.</li>
<li>S3 sees that the data was encrypted with SSE-KMS, and looks up the specific key in the KMS service.</li>
<li>If and only if the requesting user has been granted permission to use the CMK does the KMS service provide S3 with the key.</li>
<li>As a result, S3 will only decode the data if the user has been granted access to the key.</li>
</ol>
<p>KMS keys can be managed by an organization&#x2019;s administrators in AWS, including having access permissions assigned and removed from specific users, groups, and IAM roles. Only those &#x201c;principals&#x201d; with granted rights to a key may access it, hence only they may encrypt data with the key, <i>and decrypt data encrypted with it</i>. This allows KMS to be used to provide a cryptographically secure access control mechanism for data stored on S3.</p>
<p>Each KMS server is region specific, and accordingly, so is each CMK configured. A CMK defined in one region cannot be used with an S3 bucket in a different region.</p>
<p>Notes</p>
<ul>
<li>Callers are charged for every use of a key, both for encrypting the data in uploads and for decrypting it when reading it back.</li>
<li>Random-access IO on files may result in multiple GET requests of an object during a read sequence (especially for columnar data), and so may require more than one key retrieval to process a single file.</li>
<li>The KMS service is throttled: too many requests may cause requests to fail.</li>
<li>As well as incurring charges, heavy I/O <i>may</i> reach IO limits for a customer. If those limits are reached, they can be increased through the AWS console.</li>
</ul></div>
<div class="section">
<h3><a name="Enabling_SSE-KMS"></a>Enabling SSE-KMS</h3>
<p>To enable SSE-KMS, the property <tt>fs.s3a.server-side-encryption-algorithm</tt> must be set to <tt>SSE-KMS</tt> in <tt>core-site</tt>:</p>
<div>
<div>
<pre class="source">&lt;property&gt;
&lt;name&gt;fs.s3a.server-side-encryption-algorithm&lt;/name&gt;
&lt;value&gt;SSE-KMS&lt;/value&gt;
&lt;/property&gt;
</pre></div></div>
<p>The ID of the specific key used to encrypt the data should also be set in the property <tt>fs.s3a.server-side-encryption.key</tt>:</p>
<div>
<div>
<pre class="source">&lt;property&gt;
&lt;name&gt;fs.s3a.server-side-encryption.key&lt;/name&gt;
&lt;value&gt;arn:aws:kms:us-west-2:360379543683:key/071a86ff-8881-4ba0-9230-95af6d01ca01&lt;/value&gt;
&lt;/property&gt;
</pre></div></div>
<p>Organizations may define a default key in the Amazon KMS; if a default key is set, then it will be used whenever SSE-KMS encryption is chosen and the value of <tt>fs.s3a.server-side-encryption.key</tt> is empty.</p></div>
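<p>As a minimal sketch of the default-key case described above, the following configuration selects SSE-KMS and deliberately omits <tt>fs.s3a.server-side-encryption.key</tt>. It assumes that a default key is available to the account in the bucket&#x2019;s region; no key ID appears in the client configuration.</p>
<div>
<div>
<pre class="source">&lt;!-- Sketch: SSE-KMS relying on the default key. --&gt;
&lt;property&gt;
&lt;name&gt;fs.s3a.server-side-encryption-algorithm&lt;/name&gt;
&lt;value&gt;SSE-KMS&lt;/value&gt;
&lt;/property&gt;
&lt;!-- fs.s3a.server-side-encryption.key is intentionally left unset. --&gt;
</pre></div></div>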
<div class="section">
<h3><a name="the_S3A_fs.s3a.encryption.key_key_only_affects_created_files"></a>The S3A <tt>fs.s3a.server-side-encryption.key</tt> option only affects created files</h3>
<p>With SSE-KMS, the S3A client option <tt>fs.s3a.server-side-encryption.key</tt> sets the key to be used when new files are created. When reading files, this key, and indeed the value of <tt>fs.s3a.server-side-encryption-algorithm</tt>, is ignored: S3 will attempt to retrieve the key and decrypt the file based on the create-time settings.</p>
<p>This means that</p>
<ul>
<li>There&#x2019;s no need to configure any client simply reading data.</li>
<li>It is possible for a client to read data encrypted with one KMS key, and write it with another.</li>
</ul></div></div>
<div class="section">
<h2><a name="SSE-C:_Server_side_encryption_with_a_client-supplied_key."></a><a name="sse-c"></a> SSE-C: Server side encryption with a client-supplied key.</h2>
<p>In SSE-C, the client supplies the secret key needed to read and write data. Every client trying to read or write data must be configured with the same secret key.</p>
<p>SSE-C integration with Hadoop is still stabilizing; issues related to it are still surfacing. It is already clear that SSE-C with a common key <b>must</b> be used exclusively within a bucket if it is to be used at all. This is the only way to ensure that path and directory listings do not fail with &#x201c;Bad Request&#x201d; errors.</p>
<div class="section">
<h3><a name="Enabling_SSE-C"></a>Enabling SSE-C</h3>
<p>To use SSE-C, the configuration option <tt>fs.s3a.server-side-encryption-algorithm</tt> must be set to <tt>SSE-C</tt>, and a base-64 encoding of the key placed in <tt>fs.s3a.server-side-encryption.key</tt>.</p>
<div>
<div>
<pre class="source">&lt;property&gt;
&lt;name&gt;fs.s3a.server-side-encryption-algorithm&lt;/name&gt;
&lt;value&gt;SSE-C&lt;/value&gt;
&lt;/property&gt;
&lt;property&gt;
&lt;name&gt;fs.s3a.server-side-encryption.key&lt;/name&gt;
&lt;value&gt;SGVscCwgSSdtIHRyYXBwZWQgaW5zaWRlIGEgYmFzZS02NC1jb2RlYyE=&lt;/value&gt;
&lt;/property&gt;
</pre></div></div>
<p>All clients must share this same key.</p></div>
<div class="section">
<h3><a name="The_fs.s3a.encryption.key_value_is_used_to_read_and_write_data"></a>The <tt>fs.s3a.server-side-encryption.key</tt> value is used to read and write data</h3>
<p>With SSE-C, the S3A client option <tt>fs.s3a.server-side-encryption.key</tt> sets the key to be used for both reading <i>and</i> writing data.</p>
<p>When reading any file written with SSE-C, the same key must be set in the property <tt>fs.s3a.server-side-encryption.key</tt>.</p>
<p>This is unlike SSE-S3 and SSE-KMS, where the information needed to decode data is kept in AWS infrastructure.</p></div>
<div class="section">
<h3><a name="SSE-C_Warning"></a>SSE-C Warning</h3>
<p>You need to fully understand how SSE-C works in the S3 environment before using this encryption type. Please refer to the Server Side Encryption documentation available from AWS. SSE-C is only recommended for advanced users with advanced encryption use cases. Failure to properly manage encryption keys can cause data loss. Currently, the AWS S3 API (and thus S3A) only supports one encryption key at a time and cannot decrypt objects under a previous key when moving them to a new destination. It is <b>NOT</b> advised to use multiple encryption keys in a bucket; it is recommended to use one key per bucket and to not change this key. This is because whenever a request is made to S3, the actual encryption key must be provided to decrypt the object and access its metadata. Since only one encryption key can be provided at a time, S3A cannot pass the correct key for objects that were encrypted with any other key.</p></div></div>
<div class="section">
<h2><a name="Encryption_best_practises"></a><a name="best_practises"></a> Encryption best practises</h2>
<div class="section">
<h3><a name="Mandate_encryption_through_policies"></a><a name="bucket_policy"></a> Mandate encryption through policies</h3>
<p>Because it is up to the clients to enable encryption on new objects, all clients must be correctly configured in order to guarantee that data is encrypted.</p>
<p>To mandate that all data uploaded to a bucket is encrypted, you can set a <a class="externalLink" href="https://aws.amazon.com/blogs/security/how-to-prevent-uploads-of-unencrypted-objects-to-amazon-s3/">bucket policy</a> declaring that clients must provide encryption information with all data uploaded.</p>
<ul>
<li>Mandating an encryption mechanism on newly uploaded data does not encrypt existing data; existing data will retain whatever encryption (if any) was applied at the time of creation.</li>
</ul>
<p>Here is a policy to mandate <tt>SSE-S3/AES256</tt> encryption on all data uploaded to a bucket. This covers uploads as well as the copy operations which take place when file/directory rename operations are mimicked.</p>
<div>
<div>
<pre class="source">{
&quot;Version&quot;: &quot;2012-10-17&quot;,
&quot;Id&quot;: &quot;EncryptionPolicy&quot;,
&quot;Statement&quot;: [
{
&quot;Sid&quot;: &quot;RequireEncryptionHeaderOnPut&quot;,
&quot;Effect&quot;: &quot;Deny&quot;,
&quot;Principal&quot;: &quot;*&quot;,
&quot;Action&quot;: [
&quot;s3:PutObject&quot;
],
&quot;Resource&quot;: &quot;arn:aws:s3:::BUCKET/*&quot;,
&quot;Condition&quot;: {
&quot;Null&quot;: {
&quot;s3:x-amz-server-side-encryption&quot;: true
}
}
},
{
&quot;Sid&quot;: &quot;RequireAESEncryptionOnPut&quot;,
&quot;Effect&quot;: &quot;Deny&quot;,
&quot;Principal&quot;: &quot;*&quot;,
&quot;Action&quot;: [
&quot;s3:PutObject&quot;
],
&quot;Resource&quot;: &quot;arn:aws:s3:::BUCKET/*&quot;,
&quot;Condition&quot;: {
&quot;StringNotEquals&quot;: {
&quot;s3:x-amz-server-side-encryption&quot;: &quot;AES256&quot;
}
}
}
]
}
</pre></div></div>
<p>To use SSE-KMS, a different restriction must be defined:</p>
<div>
<div>
<pre class="source">{
&quot;Version&quot;: &quot;2012-10-17&quot;,
&quot;Id&quot;: &quot;EncryptionPolicy&quot;,
&quot;Statement&quot;: [
{
&quot;Sid&quot;: &quot;RequireEncryptionHeaderOnPut&quot;,
&quot;Effect&quot;: &quot;Deny&quot;,
&quot;Principal&quot;: &quot;*&quot;,
&quot;Action&quot;: [
&quot;s3:PutObject&quot;
],
&quot;Resource&quot;: &quot;arn:aws:s3:::BUCKET/*&quot;,
&quot;Condition&quot;: {
&quot;Null&quot;: {
&quot;s3:x-amz-server-side-encryption&quot;: true
}
}
},
{
&quot;Sid&quot;: &quot;RequireKMSEncryptionOnPut&quot;,
&quot;Effect&quot;: &quot;Deny&quot;,
&quot;Principal&quot;: &quot;*&quot;,
&quot;Action&quot;: [
&quot;s3:PutObject&quot;
],
&quot;Resource&quot;: &quot;arn:aws:s3:::BUCKET/*&quot;,
&quot;Condition&quot;: {
&quot;StringNotEquals&quot;: {
&quot;s3:x-amz-server-side-encryption&quot;: &quot;aws:kms&quot;
}
}
}
]
}
</pre></div></div>
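<p>A bucket policy can also restrict uploads to one specific KMS key. The statement below is a sketch rather than a tested policy: it relies on the AWS condition key <tt>s3:x-amz-server-side-encryption-aws-kms-key-id</tt>, and <tt>KMS-KEY-ARN</tt> is a placeholder for the full ARN of the key to be mandated.</p>
<div>
<div>
<pre class="source">{
  &quot;Version&quot;: &quot;2012-10-17&quot;,
  &quot;Id&quot;: &quot;EncryptionKeyPolicy&quot;,
  &quot;Statement&quot;: [
    {
      &quot;Sid&quot;: &quot;RequireSpecificKMSKeyOnPut&quot;,
      &quot;Effect&quot;: &quot;Deny&quot;,
      &quot;Principal&quot;: &quot;*&quot;,
      &quot;Action&quot;: [
        &quot;s3:PutObject&quot;
      ],
      &quot;Resource&quot;: &quot;arn:aws:s3:::BUCKET/*&quot;,
      &quot;Condition&quot;: {
        &quot;StringNotEquals&quot;: {
          &quot;s3:x-amz-server-side-encryption-aws-kms-key-id&quot;: &quot;KMS-KEY-ARN&quot;
        }
      }
    }
  ]
}
</pre></div></div>
<p>Clients would then need to name the same key in <tt>fs.s3a.server-side-encryption.key</tt> for their uploads to be accepted.</p>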
<p>To use one of these policies:</p>
<ol style="list-style-type: decimal">
<li>Replace <tt>BUCKET</tt> with the specific name of the bucket being secured.</li>
<li>Locate the bucket in the AWS console <a class="externalLink" href="https://console.aws.amazon.com/s3/home">S3 section</a>.</li>
<li>Select the &#x201c;Permissions&#x201d; tab.</li>
<li>Select the &#x201c;Bucket Policy&#x201d; tab in the permissions section.</li>
<li>Paste the edited policy into the form.</li>
<li>Save the policy.</li>
</ol></div>
<div class="section">
<h3><a name="Use_S3a_per-bucket_configuration_to_control_encryption_settings"></a><a name="per_bucket_config"></a> Use S3A per-bucket configuration to control encryption settings</h3>
<p>In an organisation which has embraced S3 encryption, different buckets inevitably have different encryption policies, such as different keys for SSE-KMS encryption. In particular, as different keys need to be named for different regions, unless you rely on the administrator-managed &#x201c;default&#x201d; key for each S3 region, you will need unique keys.</p>
<p>S3A&#x2019;s per-bucket configuration enables this.</p>
<p>Here, for example, are settings for a bucket in London, <tt>london-stats</tt>:</p>
<div>
<div>
<pre class="source">&lt;property&gt;
&lt;name&gt;fs.s3a.bucket.london-stats.server-side-encryption-algorithm&lt;/name&gt;
&lt;value&gt;AES256&lt;/value&gt;
&lt;/property&gt;
</pre></div></div>
<p>This requests SSE-S3; if matched with a bucket policy then all data will be encrypted as it is uploaded.</p>
<p>A different bucket can use a different policy (here SSE-KMS) and, when necessary, declare a key.</p>
<p>Here is an example bucket in S3 Ireland, which uses SSE-KMS and a KMS key hosted in the AWS-KMS service in the same region.</p>
<div>
<div>
<pre class="source">&lt;property&gt;
&lt;name&gt;fs.s3a.bucket.ireland-dev.server-side-encryption-algorithm&lt;/name&gt;
&lt;value&gt;SSE-KMS&lt;/value&gt;
&lt;/property&gt;
&lt;property&gt;
&lt;name&gt;fs.s3a.bucket.ireland-dev.server-side-encryption.key&lt;/name&gt;
&lt;value&gt;arn:aws:kms:eu-west-1:98067faff834c:key/071a86ff-8881-4ba0-9230-95af6d01ca01&lt;/value&gt;
&lt;/property&gt;
</pre></div></div>
<p>Again, the appropriate bucket policy can be used to guarantee that all callers will use SSE-KMS; it can even mandate the name of the key used to encrypt the data, so guaranteeing that the data can be read by everyone granted access to that key, and by nobody without access to it.</p></div>
<div class="section">
<h3><a name="Use_rename.28.29_to_encrypt_files_with_new_keys"></a><a name="changing-encryption"></a> Use rename() to encrypt files with new keys</h3>
<p>The encryption of an object is set when it is uploaded. If you want to encrypt an unencrypted file, or change the SSE-KMS key of a file, the only way to do so is by copying the object.</p>
<p>How can you do that from Hadoop? With <tt>rename()</tt>.</p>
<p>The S3A client mimics a real filesystem&#x2019;s rename operation by copying all the source files to the destination paths, then deleting the old ones.</p>
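<p>As a sketch, a client could be configured with the new SSE-KMS key before issuing the rename. <tt>NEW-KMS-KEY-ARN</tt> below is a placeholder for the ARN of the replacement key, and the caller is assumed to have access to both the old and the new key.</p>
<div>
<div>
<pre class="source">&lt;!-- NEW-KMS-KEY-ARN is a placeholder; files written or renamed by this client use this key --&gt;
&lt;property&gt;
  &lt;name&gt;fs.s3a.server-side-encryption-algorithm&lt;/name&gt;
  &lt;value&gt;SSE-KMS&lt;/value&gt;
&lt;/property&gt;
&lt;property&gt;
  &lt;name&gt;fs.s3a.server-side-encryption.key&lt;/name&gt;
  &lt;value&gt;NEW-KMS-KEY-ARN&lt;/value&gt;
&lt;/property&gt;
</pre></div></div>
<p>With SSE-KMS the configured key is ignored when reading, so the source objects are decrypted with their create-time key, while the copies made by the rename are written with the key set here.</p>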
<p>Note: this does not work for SSE-C, because you cannot set a different key for reading than for writing, and you must supply that key for reading. There you need to copy one bucket to a different bucket, one with a different key. Use <tt>distCp</tt> for this, with per-bucket encryption policies.</p></div></div>
<div class="section">
<h2><a name="Troubleshooting_Encryption"></a><a name="troubleshooting"></a> Troubleshooting Encryption</h2>
<p>The <a href="./troubleshooting_s3a.html">troubleshooting</a> document covers stack traces which may surface when working with encrypted data.</p></div>
</div>
</div>
<div class="clear">
<hr/>
</div>
<div id="footer">
<div class="xright">
&#169; 2008-2021
Apache Software Foundation
- <a href="http://maven.apache.org/privacy-policy.html">Privacy Policy</a>.
Apache Maven, Maven, Apache, the Apache feather logo, and the Apache Maven project logos are trademarks of The Apache Software Foundation.
</div>
<div class="clear">
<hr/>
</div>
</div>
</body>
</html>