| = Azure Files Component |
| :doctitle: Azure Files |
| :shortname: azure-files |
| :artifactid: camel-azure-files |
| :description: Send and receive files to Azure storage file share |
| :since: 3.22 |
| :supportlevel: Preview |
| :tabs-sync-option: |
| :component-header: Both producer and consumer are supported |
| //Manually maintained attributes |
| :group: Azure |
| :camel-spring-boot-name: azure-files |
| |
| *Since Camel {since}* |
| |
| *{component-header}* |
| |
| This component provides access to Azure Files. |
| |
| [CAUTION] |
| ==== |
| This is a preview component, so anything can change in future releases |
| (features and behavior can be changed, modified, or even dropped without notice). At the same time, it is consolidated |
| enough, it is sparingly documented, a few users have reported that it works |
| in their environment, and it is ready for wider feedback. |
| ==== |
| |
| When consuming from a remote file server, make sure you read the section titled _Consuming Files_ |
| further below. |
| |
| Maven users will need to add the following dependency to their `pom.xml` |
| for this component: |
| |
| [source,xml] |
| ---- |
| <dependency> |
| <groupId>org.apache.camel</groupId> |
| <artifactId>camel-azure-files</artifactId> |
| <version>x.y.z</version> |
| <!-- use the same version as your Camel core version --> |
| </dependency> |
| ---- |
| |
| == Endpoint URI Format |
| |
| ---- |
| azure-files://account[.file.core.windows.net][:port]/share[/directory] |
| ---- |
| |
| Where *directory* represents the underlying directory. The directory |
| is a relative path and does not include the share name. The relative path |
| can contain nested folders, such as `inbox/spam`. It defaults to |
| the share root directory. |
| |
| The `autoCreate` option is supported for the directory: |
| when a consumer or producer starts, an additional operation is |
| performed to create the directory configured for the endpoint. The default |
| value for `autoCreate` is `true`. In contrast, the share must already exist; it |
| is not created automatically. |
| |
| If no *port* number is provided, Camel uses the default port |
| for the protocol (443 for HTTPS). |
| |
| You can append query options to the URI in the following format |
| `?option=value&option2=value&...`. |
| |
| To use this component, you have multiple options to provide the required Azure authentication information: |
| |
| - Via Azure Identity, when specifying `credentialType=AZURE_IDENTITY` and providing the required https://github.com/Azure/azure-sdk-for-java/tree/main/sdk/identity/azure-identity#environment-variables[environment variables]. This enables service principal (e.g. app registration) authentication with a secret or certificate, as well as username and password authentication. |
| - Via a shared storage account key, when specifying `credentialType=SHARED_ACCOUNT_KEY` and providing the `sharedKey` for your Azure account. This is the simplest way to get started. The `sharedKey` can be generated through the Azure portal. |
| - Via Azure SAS, when specifying `credentialType=AZURE_SAS` and providing a SAS token through the `token` parameter. |
| |
| // component-configure options: START |
| |
| // component-configure options: END |
| |
| // component options: START |
| include::partial$component-configure-options.adoc[] |
| include::partial$component-endpoint-options.adoc[] |
| // component options: END |
| |
| // endpoint options: START |
| |
| // endpoint options: END |
| // component headers: START |
| include::partial$component-endpoint-headers.adoc[] |
| // component headers: END |
| |
| |
| === Endpoint URI Examples |
| |
| ---- |
| azure-files://camelazurefiles.file.core.windows.net/samples?sv=2022-11-02&ss=f&srt=sco&sp=rwdlc&se=2023-06-18T22:29:13Z&st=2023-06-05T14:29:13Z&spr=https&sig=MPsMh8zci0v3To7IT9SKdaFGZV8ezno63m9C8s9bdVQ%3D |
| ---- |
| |
| ---- |
| azure-files://camelazurefiles/samples/inbox/spam?sharedKey=FAKE502UyuBD...3Z%2BASt9dCmJg%3D%3D&delete=true |
| ---- |
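
The shared key in the second example is percent-encoded (`%2B` for `+`, `%3D` for `=`). As a minimal sketch of how such a URI could be assembled in plain Java, assuming a made-up key and a hypothetical helper class `AzureFilesUriSketch` that is not part of the component:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of camel-azure-files.
class AzureFilesUriSketch {

    // Builds an endpoint URI whose sharedKey value is percent-encoded,
    // so characters such as '+', '/' and '=' survive URI parsing.
    static String sharedKeyUri(String account, String share, String directory, String sharedKey) {
        String encodedKey = URLEncoder.encode(sharedKey, StandardCharsets.UTF_8);
        return "azure-files://" + account + "/" + share + "/" + directory
                + "?credentialType=SHARED_ACCOUNT_KEY&sharedKey=" + encodedKey;
    }

    public static void main(String[] args) {
        // "abc+/==" is a made-up placeholder, not a real key.
        System.out.println(sharedKeyUri("camelazurefiles", "samples", "inbox/spam", "abc+/=="));
    }
}
```

The account, share, and directory values are taken from the examples above; only the key needs encoding because it is Base64 text.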
| |
| == Usage |
| |
| === Paths |
| |
| The path separator is `/`. Absolute paths start with the path separator. |
| Absolute paths do not include the share name, and they are relative |
| to the share root rather than to the endpoint starting directory. |
| |
| *NOTE:* In some places, notably the logs of the underlying libraries, an OS-specific path separator |
| appears, and relative paths are relative to the share root (rather than |
| to the current working directory or to the endpoint starting directory), |
| so interpret them with care. |
| |
| === Concurrency |
| |
| This component does not support concurrency on its endpoints. |
| |
| === More Information |
| |
| This component mimics the FTP component, |
| so there are more samples and details on the FTP |
| component page. |
| |
| This component uses the Azure Java SDK libraries for the actual work. |
| |
| === Consuming Files |
| |
| By default, the remote consumer leaves the consumed |
| files untouched on the remote file server. You have to configure it |
| explicitly if you want it to delete the files or move them to another |
| location. For example, you can use `delete=true` to delete the files, or |
| use `move=.done` to move the files into a `.done` subdirectory. |
| |
| In Camel, the `.`-prefixed folders are excluded from |
| recursive polling. |
| |
| The regular File consumer is different, as it |
| moves files to a `.camel` subdirectory by default. The reason Camel does |
| *not* do this by default for the remote consumer is that it may lack |
| the permissions to move or delete files. |
| |
| ==== Body Type Options |
| |
| For each matching file, the consumer sends to the Camel exchange |
| a message with a selected body type: |
| |
| - `byte[]` by default |
| - `java.io.InputStream` if `streamDownload=true` is configured |
| - `java.io.File` if `localWorkDirectory` is configured |
| |
| The body type configuration should be tuned to fit available resources, |
| performance targets, route processors, caching, resuming, etc. |
| |
| ==== Limitations |
| |
| The *readLock* option can be used to force Camel *not* to consume files |
| that are currently in the process of being written. However, this option |
| is turned off by default, as it requires that the user has write access. |
| See the endpoint options table for more details about |
| read locks. + |
| There are other solutions to avoid consuming files that are currently |
| being written; for instance, you can write to a temporary |
| destination and move the file after it has been written. |
| |
| The `readLock=changed` strategy relies only on the last-modified timestamp; |
| furthermore, a precision finer than 5 seconds might be problematic. |
| |
| When moving files using the `move` or `preMove` option, the files are |
| restricted to the share. That prevents the consumer from moving files |
| outside the endpoint share. |
| |
| === Exchange Properties |
| |
| The consumer sets the following exchange properties: |
| |
| [width="100%",cols="50%,50%",options="header",] |
| |======================================================================= |
| |Property |Description |
| |
| |`CamelBatchIndex` | The current index out of total number of files being consumed in this batch. |
| |
| |`CamelBatchSize` |The total number of files being consumed in this batch. |
| |
| |`CamelBatchComplete` | True if there are no more files in this batch. |
| |======================================================================= |
| |
| === Producing Files |
| |
| The Files producer is optimized for two body types: |
| |
| - `java.io.InputStream` if `CamelFileLength` header is set |
| - `byte[]` |
| |
| In either case, the remote file is first allocated at its declared size |
| and then overwritten with the body content. Any inconsistency between |
| the declared file length and the stream length results in a corrupted |
| remote file. |
| |
| ==== Limitations |
| |
| The underlying Azure Files service does not allow growing files. The file |
| length must be known at creation time. Consequently: |
| |
| - The `CamelFileLength` header has an important |
| meaning even for producers. |
| - No appending mode is supported. |
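
When the length of a stream is not known in advance, one option is to buffer it into a `byte[]`, which is one of the two body types the producer is optimized for. The following is a minimal sketch under that assumption; the class and method names are hypothetical, and the approach trades memory for a known length, so it only suits content that fits in memory:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration only; not part of camel-azure-files.
class KnownLengthBodySketch {

    // Reads the whole stream into memory so the exact length is known
    // before the remote file is created. The stream is closed afterwards.
    static byte[] toKnownLengthBody(InputStream in) {
        try (in) {
            return in.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] body = toKnownLengthBody(new ByteArrayInputStream(new byte[] {1, 2, 3}));
        // body.length is the value that must agree with the declared file length.
        System.out.println(body.length);
    }
}
```

For large content, keeping the `java.io.InputStream` body and setting the `CamelFileLength` header to the exact stream length avoids the extra buffering.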
| |
| |
| === About Timeouts |
| |
| You can use the `connectTimeout` option to set |
| a timeout in milliseconds for connecting or disconnecting. |
| |
| The `timeout` option applies only as the data timeout, in milliseconds. |
| |
| The timeout for metadata operations is the minimum of |
| `readLockCheckInterval`, `timeout`, and 20,000 milliseconds. |
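
The rule for metadata operations is simple arithmetic. As a small sketch (the class, method, and constant names are hypothetical; only the three inputs come from the text above):

```java
// Hypothetical helper for illustration only; not the component's internal code.
class MetadataTimeoutSketch {

    // Upper bound named in the documentation, in milliseconds.
    static final long METADATA_CAP_MILLIS = 20_000L;

    // Returns the smallest of readLockCheckInterval, timeout, and the 20,000 ms cap.
    static long metadataTimeoutMillis(long readLockCheckInterval, long timeout) {
        return Math.min(Math.min(readLockCheckInterval, timeout), METADATA_CAP_MILLIS);
    }

    public static void main(String[] args) {
        System.out.println(metadataTimeoutMillis(1_000L, 30_000L));
    }
}
```

So raising `timeout` above 20,000 milliseconds affects data transfers but, by this rule, not metadata operations.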
| |
| For now, the file upload has no timeout. During the upload, |
| the underlying library may log timeout warnings. They are |
| recoverable, and the upload can continue. |
| |
| === Using Local Work Directory |
| |
| Camel supports consuming from remote file servers and downloading the |
| files directly into a local work directory. This avoids reading the |
| entire remote file content into memory, as it is streamed directly into |
| the local file using `FileOutputStream`. |
| |
| Camel stores the content in a local file with the same name as the remote file, |
| but with `.inprogress` as an extension while the file is being |
| downloaded. Afterward, the file is renamed to remove the `.inprogress` |
| suffix. Finally, when the Exchange is complete, |
| the local file is deleted. |
| |
| So if you want to download files from a remote file server and store them |
| as local files, you need to route to a file endpoint such as: |
| |
| [source,java] |
| ---- |
| from("azure-files://...&localWorkDirectory=/tmp").to("file://inbox"); |
| ---- |
| |
| [TIP] |
| ==== |
| The route above is very efficient as it avoids reading the entire file content into memory. |
| It downloads the remote file directly to a local file stream. |
| The `java.io.File` handle is then used as the Exchange body. The file producer leverages this fact and can work directly on the work file `java.io.File` handle and perform a `java.io.File.renameTo` to the target filename. |
| As Camel knows it is a local work file, it can optimize and use a rename instead of a file copy, as the work file is meant to be deleted anyway. |
| ==== |
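
The `.inprogress` convention described in this section can be illustrated in plain Java. This is only a sketch of the idea (stream to a temporary name, then rename), not the component's actual implementation; the class and method names are hypothetical:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for illustration only; not the component's internal code.
class InProgressDownloadSketch {

    // Streams the content into "<fileName>.inprogress" and renames it to
    // "<fileName>" once the copy has finished, so readers never see a partial file.
    static Path download(InputStream remote, Path workDir, String fileName) {
        Path inProgress = workDir.resolve(fileName + ".inprogress");
        Path finished = workDir.resolve(fileName);
        try (remote) {
            Files.copy(remote, inProgress, StandardCopyOption.REPLACE_EXISTING);
            return Files.move(inProgress, finished, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path workDir = Paths.get(System.getProperty("java.io.tmpdir"));
        Path file = download(new ByteArrayInputStream("hello".getBytes()), workDir, "sketch.txt");
        System.out.println(file);
    }
}
```

The stream is copied without holding the whole content in memory, which is the point of using a local work directory.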
| |
| === Custom Filtering |
| |
| Camel supports pluggable filtering strategies. The strategy is to implement |
| the built-in `org.apache.camel.component.file.GenericFileFilter` interface in |
| Java. You can then configure the endpoint with such a filter to skip |
| certain files before they are processed. |
| |
| For example, you can build a filter that only accepts files |
| whose names start with `report`. |
| |
| You can then configure the route using the *filter* option to |
| reference the filter (using `#` notation) defined in the |
| Spring XML file. |
| |
| The `file` argument of `accept(file)` has the following properties: |
| |
| - endpoint path: the share name such as `/samples` |
| - relative path: a path to the file such as `subdir/a file` |
| - directory: `true` if a directory |
| - file length: if not a directory, then a length of the file in bytes |
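
The decision logic of a filter that accepts only files whose names start with `report` can be sketched in plain Java. This shows the predicate only; in a real route the same logic would sit inside the `accept` method of a `GenericFileFilter` implementation. The class name and the two parameters are hypothetical stand-ins for the relative path and directory properties listed above:

```java
// Hypothetical predicate for illustration only; it does not implement
// org.apache.camel.component.file.GenericFileFilter itself.
class ReportFilterSketch {

    // relativePath: path to the file such as "subdir/a file"
    // directory: true if the entry is a directory
    static boolean accept(String relativePath, boolean directory) {
        if (directory) {
            // Keep directories so that recursive polling can descend into them.
            return true;
        }
        // Compare only the last path segment, i.e. the file name.
        String name = relativePath.substring(relativePath.lastIndexOf('/') + 1);
        return name.startsWith("report");
    }

    public static void main(String[] args) {
        System.out.println(accept("subdir/report-2023.csv", false));
    }
}
```

Accepting every directory is a design choice of this sketch; a stricter filter could also reject unwanted folders.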
| |
| |
| === Filtering using ANT path matcher |
| |
| The ANT path matcher is a filter shipped out of the box in the |
| *camel-spring* jar, so you need to depend on *camel-spring* if you are |
| using Maven. + |
| The reason is that Camel leverages Spring's |
| http://static.springsource.org/spring/docs/3.0.x/api/org/springframework/util/AntPathMatcher.html[AntPathMatcher] |
| to do the actual matching. |
| |
| The file paths are matched with the following rules: |
| |
| * `?` matches one character |
| * `*` matches zero or more characters |
| * `**` matches zero or more directories in a path |
| |
| The sample below demonstrates how to use it: |
| |
| [source,java] |
| ---- |
| from("azure-files://...&antInclude=**/*.txt").to("..."); |
| ---- |
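
To make the three matching rules concrete, here is a simplified sketch that translates an ANT-style pattern into a regular expression. It only approximates the rules listed above and is not Spring's `AntPathMatcher` implementation; the class and method names are hypothetical:

```java
import java.util.regex.Pattern;

// Hypothetical, simplified matcher for illustration only.
class AntGlobSketch {

    // Translates the three ANT wildcards into regex fragments and quotes
    // every other character literally.
    static Pattern toRegex(String antPattern) {
        StringBuilder regex = new StringBuilder();
        int i = 0;
        while (i < antPattern.length()) {
            if (antPattern.startsWith("**/", i)) {
                regex.append("(?:.*/)?");   // zero or more directories
                i += 3;
            } else if (antPattern.startsWith("**", i)) {
                regex.append(".*");         // anything, including '/'
                i += 2;
            } else if (antPattern.charAt(i) == '*') {
                regex.append("[^/]*");      // zero or more characters within one segment
                i++;
            } else if (antPattern.charAt(i) == '?') {
                regex.append("[^/]");       // exactly one character within one segment
                i++;
            } else {
                regex.append(Pattern.quote(String.valueOf(antPattern.charAt(i))));
                i++;
            }
        }
        return Pattern.compile(regex.toString());
    }

    static boolean matches(String antPattern, String path) {
        return toRegex(antPattern).matcher(path).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("**/*.txt", "inbox/spam/a.txt"));
    }
}
```

Under these simplified rules, `**/*.txt` accepts `.txt` files at any depth, while `*.txt` accepts them only in the starting directory.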
| |
| === Using a Proxy |
| |
| Consult the https://learn.microsoft.com/en-us/azure/developer/java/sdk/proxying[underlying library] |
| documentation. |
| |
| |
| === Consuming a single file using a fixed name |
| |
| The FTP component offers a special combination of options to optimize |
| the _single file using a fixed name_ use case: |
| |
| - `useList=false` |
| - `fileName=myFileName.txt` |
| - `ignoreFileNotFoundOrPermissionError=true` |
| |
| This component offers no such optimization, so |
| it is necessary to fall back to regular filters (i.e. the list |
| permission is needed). |
| |
| === Debug logging |
| |
| This component supports logging at the *TRACE* level, which can be helpful |
| when troubleshooting problems. |