 = Azure Files Component
 :doctitle: Azure Files
 :shortname: azure-files
 :artifactid: camel-azure-files
 :description: Send and receive files to Azure storage file share
 :since: 3.22
 :supportlevel: Preview
 :tabs-sync-option:
 :component-header: Both producer and consumer are supported
 //Manually maintained attributes
 :group: Azure
 :camel-spring-boot-name: azure-files

 *Since Camel {since}*

 *{component-header}*

 This component provides access to Azure Files.

 [CAUTION]
 ====
 This is a preview component, so anything can change in future releases:
 features and behavior can be changed, modified, or even dropped without notice. At the same time, it is consolidated
 enough and sparingly documented, a few users have reported that it works
 in their environments, and it is ready for wider feedback.
 ====

 When consuming from a remote file server, make sure you read the section titled _Consuming Files_
 further below for details related to consuming files.

 Maven users will need to add the following dependency to their `pom.xml`
 for this component:

 [source,xml]
 ----
 <dependency>
     <groupId>org.apache.camel</groupId>
     <artifactId>camel-azure-files</artifactId>
     <version>x.y.z</version>
     <!-- use the same version as your Camel core version -->
 </dependency>
 ----

 == Endpoint URI Format

 ----
 azure-files://account[.file.core.windows.net][:port]/share[/directory]
 ----

 Where *directory* represents the underlying directory. The directory
 is a relative path and does not include the share name. The relative path
 can contain nested folders, such as `inbox/spam`. It defaults to
 the share root directory.

 The `autoCreate` option is supported for the directory:
 when a consumer or producer starts, an additional operation
 is performed to create the directory configured for the endpoint. The default
 value for `autoCreate` is `true`. In contrast, the share must already exist; it
 is not created automatically.

 If no *port* number is provided, Camel uses the default port
 for the protocol (443 for HTTPS).

 You can append query options to the URI in the following format
 `?option=value&option2=value&...`.

 To use this component, you have multiple options to provide the required Azure authentication information:

 - Via Azure Identity, by specifying `credentialType=AZURE_IDENTITY` and providing the required https://github.com/Azure/azure-sdk-for-java/tree/main/sdk/identity/azure-identity#environment-variables[environment variables]. This enables service principal (e.g. app registration) authentication with a secret or certificate, as well as username/password authentication.
 - Via a shared storage account key, by specifying `credentialType=SHARED_ACCOUNT_KEY` and providing the `sharedKey` for your Azure account. This is the simplest way to get started. The `sharedKey` can be generated through your Azure portal.
 - Via Azure SAS, by specifying `credentialType=AZURE_SAS` and providing a SAS token through the `token` parameter.
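
As a rough illustration of how the URI format and the shared-key options fit together, the following sketch assembles an endpoint URI from its parts. The class and method names are hypothetical, the key used in the usage note is a made-up placeholder, and the percent-encoding simply mirrors the encoded form used in the endpoint URI examples in this document; treat it as one possible approach rather than a required one.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of camel-azure-files.
final class AzureFilesUriSketch {

    private AzureFilesUriSketch() {
    }

    /**
     * Builds azure-files://account/share[/directory] with shared-key credentials.
     * The key is percent-encoded so that '+', '/' and '=' are not misread in the query string.
     */
    static String sharedKeyUri(String account, String share, String directory, String sharedKey) {
        StringBuilder uri = new StringBuilder("azure-files://")
                .append(account)
                .append('/')
                .append(share);
        // The directory is optional; without it the endpoint points at the share root.
        if (directory != null && !directory.isEmpty()) {
            uri.append('/').append(directory);
        }
        uri.append("?credentialType=SHARED_ACCOUNT_KEY");
        uri.append("&sharedKey=").append(URLEncoder.encode(sharedKey, StandardCharsets.UTF_8));
        return uri.toString();
    }
}
```

Calling `AzureFilesUriSketch.sharedKeyUri("camelazurefiles", "samples", "inbox/spam", "ab+c/d==")` yields a URI whose query ends in `sharedKey=ab%2Bc%2Fd%3D%3D`.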

 // component-configure options: START

 // component-configure options: END

 // component options: START
 include::partial$component-configure-options.adoc[]
 include::partial$component-endpoint-options.adoc[]
 // component options: END

 // endpoint options: START

 // endpoint options: END
 // component headers: START
 include::partial$component-endpoint-headers.adoc[]
 // component headers: END


 === Endpoint URI Examples

 ----
 azure-files://camelazurefiles.file.core.windows.net/samples?sv=2022-11-02&ss=f&srt=sco&sp=rwdlc&se=2023-06-18T22:29:13Z&st=2023-06-05T14:29:13Z&spr=https&sig=MPsMh8zci0v3To7IT9SKdaFGZV8ezno63m9C8s9bdVQ%3D
 ----

 ----
 azure-files://camelazurefiles/samples/inbox/spam?sharedKey=FAKE502UyuBD...3Z%2BASt9dCmJg%3D%3D&delete=true
 ----

 == Usage

 === Paths

 The path separator is `/`. The absolute paths start with the path separator.
 The absolute paths do not include the share name, and they are relative
 to the share root rather than to the endpoint starting directory.

 *NOTE:* In some places, notably the logs of the underlying libraries, an OS-specific path separator
 appears, and the relative paths are relative to the share root (rather than
 to the current working directory or to the endpoint starting directory),
 so interpret them with care.

 === Concurrency

 This component does not support concurrency on its endpoints.

 === More Information

 This component mimics the FTP component,
 so more samples and details are available on the FTP
 component page.

 This component uses the Azure Java SDK libraries for the actual work.

 === Consuming Files

 By default, the remote consumer leaves the consumed
 files untouched on the remote cloud file server. You have to configure it
 explicitly if you want it to delete the files or move them to another
 location. For example, you can use `delete=true` to delete the files, or
 use `move=.done` to move the files into a `.done` subdirectory.

 In Camel, the `.`-prefixed folders are excluded from
 recursive polling.

 The regular File consumer is different, as by
 default it moves files to a `.camel` subdirectory. The reason Camel does
 *not* do this by default for the remote consumer is that it may lack
 the permissions to move or delete files.

 ==== Body Type Options

 For each matching file, the consumer sends to the Camel exchange
 a message with a selected body type:

   - `byte[]` by default
   - `java.io.InputStream` if `streamDownload=true` is configured
   - `java.io.File` if `localWorkDirectory` is configured

 The body type configuration should be tuned to fit available resources,
 performance targets, route processors, caching, resuming, etc.

 ==== Limitations

 The *readLock* option can be used to force Camel *not* to consume files
 that are currently in the process of being written. However, this option
 is turned off by default, as it requires that the user has write access.
 See the endpoint options table for more details about
 read locks. +
 There are other solutions to avoid consuming files that are currently
 being written; for instance, you can write to a temporary
 destination and move the file after it has been written.

 The `readLock=changed` mode relies only on the last-modified timestamp;
 furthermore, a precision finer than 5 seconds might be problematic.

 When moving files using the `move` or `preMove` option, the target is
 restricted to the share. That prevents the consumer from moving files
 outside the endpoint share.

 === Exchange Properties

 The consumer sets the following exchange properties:

 [width="100%",cols="50%,50%",options="header",]
 |=======================================================================
 |Property |Description

 |`CamelBatchIndex` | The current index out of total number of files being consumed in this batch.

 |`CamelBatchSize` |The total number of files being consumed in this batch.

 |`CamelBatchComplete` | True if there are no more files in this batch.
 |=======================================================================

 === Producing Files

 The Files producer is optimized for two body types:

   - `java.io.InputStream` if the `CamelFileLength` header is set
   - `byte[]`

 In either case, the remote file is first allocated at its declared size
 and then overwritten with the body content. Any inconsistency between
 the declared file length and the stream length results in a corrupted
 remote file.

 ==== Limitations

 The underlying Azure Files service does not allow growing files. The file
 length must be known at creation time. Consequently:

   - The `CamelFileLength` header is significant
     even for producers.
   - No appending mode is supported.
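
Because the declared length and the stream length must agree, it can help to derive both from the same source. The sketch below is a plain-Java illustration with hypothetical class and method names; it only prepares the header value and the stream, and leaves it to the caller to place them on the Camel message.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

// Hypothetical helper, not part of camel-azure-files.
final class UploadPreparationSketch {

    /** Header name as used in the text above. */
    static final String FILE_LENGTH_HEADER = "CamelFileLength";

    private UploadPreparationSketch() {
    }

    /** Headers to send with an InputStream body that is opened from the given local file. */
    static Map<String, Object> headersFor(Path localFile) throws IOException {
        // The length is read from the same file that backs the stream,
        // so the declared length and the streamed length cannot drift apart.
        return Map.of(FILE_LENGTH_HEADER, Long.valueOf(Files.size(localFile)));
    }

    /** The body to send; the caller is responsible for closing it. */
    static InputStream bodyFor(Path localFile) throws IOException {
        return Files.newInputStream(localFile);
    }
}
```

If the content is produced on the fly and its size is not known up front, buffering it into a `byte[]` first is the simpler option, since the array carries its own length.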


 === About Timeouts

 You can use the `connectTimeout` option to set
 a timeout in milliseconds for connecting or disconnecting.

 The `timeout` option applies only as the data timeout, in milliseconds.

 The timeout for metadata operations is the minimum of
 `readLockCheckInterval`, `timeout`, and 20_000 milliseconds.
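
The rule for metadata operations can be written out as a small calculation. This is only a worked illustration of the rule as stated above, with hypothetical names; it is not the component's code.

```java
// Hypothetical helper that restates the rule from the text.
final class MetadataTimeoutSketch {

    /** Upper bound named in the text above. */
    static final long MAX_METADATA_TIMEOUT_MILLIS = 20_000L;

    private MetadataTimeoutSketch() {
    }

    /** Smallest of readLockCheckInterval, timeout, and the 20_000 ms cap. */
    static long metadataTimeoutMillis(long readLockCheckIntervalMillis, long timeoutMillis) {
        return Math.min(Math.min(readLockCheckIntervalMillis, timeoutMillis), MAX_METADATA_TIMEOUT_MILLIS);
    }
}
```

For example, with `readLockCheckInterval=5000` and `timeout=30000` the result is 5000, while with `readLockCheckInterval=60000` and `timeout=30000` the cap applies and the result is 20000.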

 For now, the file upload has no timeout. During the upload,
 the underlying library may log timeout warnings. These are
 recoverable, and the upload can continue.

 === Using Local Work Directory

 Camel supports consuming from remote file servers and downloading the
 files directly into a local work directory. This avoids reading the
 entire remote file content into memory, as it is streamed directly into
 the local file using `FileOutputStream`.

 Camel will store to a local file with the same name as the remote file,
 though with `.inprogress` as an extension while the file is being
 downloaded. Afterward, the file is renamed to remove the `.inprogress`
 suffix. And finally, when the Exchange is complete,
 the local file is deleted.
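
The download-then-rename pattern described above can be sketched in plain Java. This is an illustration of the pattern only, with hypothetical names, and it uses `Files.copy` as a stand-in for the `FileOutputStream` streaming mentioned above; it is not Camel's implementation.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration of the ".inprogress" naming scheme.
final class InProgressDownloadSketch {

    private InProgressDownloadSketch() {
    }

    /** Streams the content into workDirectory and returns the path of the finished file. */
    static Path download(InputStream remoteContent, Path workDirectory, String fileName) throws IOException {
        Files.createDirectories(workDirectory);
        Path inProgress = workDirectory.resolve(fileName + ".inprogress");
        Path finished = workDirectory.resolve(fileName);
        // Stream straight to disk; the content is never held in memory as a whole.
        Files.copy(remoteContent, inProgress, StandardCopyOption.REPLACE_EXISTING);
        // Expose the file under its final name only once the download is complete.
        return Files.move(inProgress, finished, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Anything watching the work directory can treat a name ending in `.inprogress` as incomplete and ignore it until the rename has happened.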

 So if you want to download files from a remote file server and store them
 as local files, then you need to route to a file endpoint such as:

 [source,java]
 ----
 from("azure-files://...&localWorkDirectory=/tmp").to("file://inbox");
 ----

 [TIP]
 ====
 The route above is very efficient, as it avoids reading the entire file content into memory.
 It will download the remote file directly to a local file stream.
 The `java.io.File` handle is then used as the Exchange body. The file producer leverages this fact and can work directly on the work file `java.io.File` handle and perform a `java.io.File.renameTo` to the target filename.
 As Camel knows it's a local work file, it can optimize and use a rename instead of a file copy, as the work file is meant to be deleted anyway.
 ====

 === Custom Filtering

 Camel supports pluggable filtering strategies. To use one, implement
 the built-in `org.apache.camel.component.file.GenericFileFilter` interface in
 Java. You can then configure the endpoint with such a filter to skip
 certain files before they are processed.

 For example, you can build your own filter that only accepts files
 whose names start with `report`.

 You can then configure the route using the *filter* option to
 reference the filter (using `#` notation), for example a bean
 defined in a Spring XML file.

 The `file` argument of `accept(file)` has the following properties:

   - endpoint path: the share name such as `/samples`
   - relative path: a path to the file such as `subdir/a file`
   - directory: `true` if a directory
   - file length: if not a directory, then a length of the file in bytes
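
The decision such a filter makes can be prototyped without Camel on the classpath. In the sketch below, `FileInfo` is a hypothetical stand-in that carries only some of the properties listed above; in a real filter the same check would live in the `accept` method of a `GenericFileFilter` implementation. Accepting directories is an assumption made here so that files in subdirectories are still considered.

```java
import java.util.function.Predicate;

// Hypothetical, Camel-free prototype of a "report" filter decision.
final class ReportFilterSketch {

    /** Stand-in for the argument of accept(file); not a Camel type. */
    static final class FileInfo {
        final String relativePath;
        final boolean directory;
        final long length;

        FileInfo(String relativePath, boolean directory, long length) {
            this.relativePath = relativePath;
            this.directory = directory;
            this.length = length;
        }

        /** Last segment of the relative path, using the '/' separator. */
        String fileName() {
            return relativePath.substring(relativePath.lastIndexOf('/') + 1);
        }
    }

    /** Accepts every directory, and only those files whose name starts with "report". */
    static final Predicate<FileInfo> ACCEPT_REPORTS =
            file -> file.directory || file.fileName().startsWith("report");

    private ReportFilterSketch() {
    }
}
```

With this predicate, `subdir/report-2023.csv` is accepted and `subdir/notes.txt` is skipped.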


 === Filtering using ANT path matcher

 The ANT path matcher is a filter shipped out-of-the-box in the
 *camel-spring* jar, so you need to depend on *camel-spring* if you are
 using Maven. +
 The reason is that it leverages Spring's
 http://static.springsource.org/spring/docs/3.0.x/api/org/springframework/util/AntPathMatcher.html[AntPathMatcher]
 to do the actual matching.

 The file paths are matched with the following rules:

 * `?` matches one character
 * `*` matches zero or more characters
 * `**` matches zero or more directories in a path

 The sample below demonstrates how to use it:

 [source,java]
 ----
 from("azure-files://...&antInclude=**/*.txt").to("...");
 ----
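
To get a feel for the three matching rules, the following sketch translates a pattern into a regular expression. It is a deliberately simplified approximation with hypothetical names, not Spring's `AntPathMatcher`, and it may differ from it in edge cases.

```java
import java.util.regex.Pattern;

// Simplified illustration of the ?, * and ** rules; not Spring's implementation.
final class AntPatternSketch {

    private AntPatternSketch() {
    }

    static boolean matches(String antPattern, String path) {
        return Pattern.matches(toRegex(antPattern), path);
    }

    static String toRegex(String antPattern) {
        StringBuilder regex = new StringBuilder();
        int i = 0;
        while (i < antPattern.length()) {
            char c = antPattern.charAt(i);
            if (antPattern.startsWith("**/", i)) {
                regex.append("(?:[^/]+/)*"); // zero or more whole directories
                i += 3;
            } else if (antPattern.startsWith("**", i)) {
                regex.append(".*");          // anything, including further directories
                i += 2;
            } else if (c == '*') {
                regex.append("[^/]*");       // zero or more characters within one path segment
                i++;
            } else if (c == '?') {
                regex.append("[^/]");        // exactly one character within a path segment
                i++;
            } else {
                regex.append(Pattern.quote(String.valueOf(c))); // literal character
                i++;
            }
        }
        return regex.toString();
    }
}
```

Under this sketch, `**/*.txt` matches both `a.txt` and `inbox/spam/a.txt`, whereas `*.txt` does not match `inbox/a.txt` because `*` does not cross a `/`.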

 === Using a Proxy

 Consult the https://learn.microsoft.com/en-us/azure/developer/java/sdk/proxying[underlying library]
 documentation.


 === Consuming a single file using a fixed name

 The FTP component features a special combination of options
 to optimize _the single file using a fixed name_ use case:

   - `useList=false`
   - `fileName=myFileName.txt`
   - `ignoreFileNotFoundOrPermissionError=true`

 This component has no such optimization, so
 it is necessary to fall back to regular filters (i.e. the list
 permission is needed).

 === Debug logging

 This component logs at the *TRACE* level, which can be helpful when
 troubleshooting problems.