= Azure Files Component
:doctitle: Azure Files
:shortname: azure-files
:artifactid: camel-azure-files
:description: Send and receive files to Azure storage file share
:since: 3.22
:supportlevel: Preview
:tabs-sync-option:
:component-header: Both producer and consumer are supported
//Manually maintained attributes
:group: Azure
:camel-spring-boot-name: azure-files
*Since Camel {since}*
*{component-header}*
This component provides access to Azure Files.
[CAUTION]
====
This is a preview component, so anything can change in future releases
(features and behavior can be changed, modified, or even dropped without notice). At the same time, it is consolidated
enough and sparingly documented, a few users have reported that it works
in their environment, and it is ready for wider feedback.
====
When consuming from a remote file server, make sure you read the section titled _Consuming Files_
further below for details related to consuming files.
Maven users will need to add the following dependency to their `pom.xml`
for this component:
[source,xml]
----
<dependency>
<groupId>org.apache.camel</groupId>
<artifactId>camel-azure-files</artifactId>
<version>x.y.z</version>
<!-- use the same version as your Camel core version -->
</dependency>
----
== Endpoint URI Format
----
azure-files://account[.file.core.windows.net][:port]/share[/directory]
----
Where *directory* represents the underlying directory. The directory
is a relative path and does not include the share name. The relative path
can contain nested folders, such as `inbox/spam`. It defaults to
the share root directory.
The `autoCreate` option is supported for the directory:
when a consumer or producer starts, an additional operation is
performed to create the directory configured for the endpoint. The default
value for `autoCreate` is `true`. In contrast, the share must already exist; it
is not created automatically.
If no *port* number is provided, Camel uses the default port
for the protocol (443 for HTTPS).
You can append query options to the URI in the following format
`?option=value&option2=value&...`.
To use this component, you can provide the required Azure authentication information in one of the following ways:
- Via Azure Identity, by specifying `credentialType=AZURE_IDENTITY` and providing the required https://github.com/Azure/azure-sdk-for-java/tree/main/sdk/identity/azure-identity#environment-variables[environment variables]. This enables service principal (e.g. app registration) authentication with a secret or certificate, as well as username and password authentication.
- Via a shared storage account key, by specifying `credentialType=SHARED_ACCOUNT_KEY` and providing the `sharedKey` for your Azure account. This is the simplest way to get started. The `sharedKey` can be generated through the Azure portal.
- Via Azure SAS, by specifying `credentialType=AZURE_SAS` and providing a SAS token through the `token` parameter.
// component-configure options: START
// component-configure options: END
// component options: START
include::partial$component-configure-options.adoc[]
include::partial$component-endpoint-options.adoc[]
// component options: END
// endpoint options: START
// endpoint options: END
// component headers: START
include::partial$component-endpoint-headers.adoc[]
// component headers: END
=== Endpoint URI Examples
----
azure-files://camelazurefiles.file.core.windows.net/samples?sv=2022-11-02&ss=f&srt=sco&sp=rwdlc&se=2023-06-18T22:29:13Z&st=2023-06-05T14:29:13Z&spr=https&sig=MPsMh8zci0v3To7IT9SKdaFGZV8ezno63m9C8s9bdVQ%3D
----
----
azure-files://camelazurefiles/samples/inbox/spam?sharedKey=FAKE502UyuBD...3Z%2BASt9dCmJg%3D%3D&delete=true
----
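A storage account key is Base64 text, so it usually contains characters such as `+`, `/`, and `=` that have to be percent-encoded when placed in the query string, as in the second example above. The following is a minimal sketch of assembling such a URI in Java (10 or later). The class and method names are hypothetical and not part of the component; only the `credentialType` and `sharedKey` options come from this page.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of camel-azure-files.
final class AzureFilesUriSketch {

    // Builds an endpoint URI that authenticates with a shared account key.
    // Pass an empty directory to address the share root.
    static String sharedKeyUri(String account, String share, String directory, String sharedKey) {
        // Percent-encode the key so that '+', '/' and '=' survive URI parsing.
        String encodedKey = URLEncoder.encode(sharedKey, StandardCharsets.UTF_8);
        String path = directory.isEmpty() ? share : share + "/" + directory;
        return "azure-files://" + account + "/" + path
                + "?credentialType=SHARED_ACCOUNT_KEY&sharedKey=" + encodedKey;
    }
}
```

Calling `sharedKeyUri("camelazurefiles", "samples", "inbox/spam", key)` would produce a URI of the same shape as the second example above.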
== Usage
=== Paths
The path separator is `/`. Absolute paths start with the path separator.
Absolute paths do not include the share name, and they are relative
to the share root rather than to the endpoint starting directory.
*NOTE:* In some places, notably the logs of the underlying libraries, an OS-specific path separator
appears, and relative paths are relative to the share root (rather than
to the current working directory or to the endpoint starting directory),
so interpret them with care.
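To make the convention above concrete, the sketch below shows one way to read a configured path: a leading `/` means the path is taken from the share root, otherwise it is resolved against the endpoint starting directory. The helper is hypothetical and only illustrates the rule; it is not the component's implementation.

```java
// Hypothetical illustration of the path convention; not component code.
final class SharePathSketch {

    // Returns the path relative to the share root, without a leading separator.
    static String toShareRelative(String endpointDirectory, String path) {
        if (path.startsWith("/")) {
            // Absolute path: already relative to the share root.
            return path.substring(1);
        }
        // Relative path: resolved against the endpoint starting directory.
        return endpointDirectory.isEmpty() ? path : endpointDirectory + "/" + path;
    }
}
```

Under this reading, with an endpoint directory of `inbox`, the path `/archive/a.txt` refers to `archive/a.txt` in the share, while `spam/a.txt` refers to `inbox/spam/a.txt`.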
=== Concurrency
This component does not support concurrency on its endpoints.
=== More Information
This component mimics the FTP component.
So, there are more samples and details on the FTP
component page.
This component uses the Azure Java SDK libraries for the actual work.
=== Consuming Files
By default, the remote consumer leaves consumed
files untouched on the remote file server. You have to configure it
explicitly if you want it to delete the files or move them to another
location. For example, you can use `delete=true` to delete the files, or
use `move=.done` to move the files into the `.done` subdirectory.
In Camel, the `.`-prefixed folders are excluded from
recursive polling.
The regular File consumer is different, as it moves files to a `.camel`
subdirectory by default. The reason Camel does
*not* do this by default for the remote consumer is that it may lack
the permissions to move or delete files.
==== Body Type Options
For each matching file, the consumer sends to the Camel exchange
a message with a selected body type:
- `byte[]` by default
- `java.io.InputStream` if `streamDownload=true` is configured
- `java.io.File` if `localWorkDirectory` is configured
The body type configuration should be tuned to fit available resources,
performance targets, route processors, caching, resuming, etc.
==== Limitations
The *readLock* option can be used to force Camel *not* to consume files
that are currently being written. However, this option
is turned off by default, as it requires that the user has write access.
See the endpoint options table for more details about
read locks. +
There are other solutions to avoid consuming files that are currently
being written; for instance, you can write to a temporary
destination and move the file after it has been written.
With `readLock=changed`, the check relies only on the last-modified timestamp;
furthermore, a precision finer than 5 seconds might be problematic.
When moving files using the `move` or `preMove` option, the files are
restricted to the share. That prevents the consumer from moving files
outside the endpoint share.
=== Exchange Properties
The consumer sets the following exchange properties:
[width="100%",cols="50%,50%",options="header",]
|=======================================================================
|Property |Description
|`CamelBatchIndex` |The current index out of the total number of files being consumed in this batch.
|`CamelBatchSize` |The total number of files being consumed in this batch.
|`CamelBatchComplete` |True if there are no more files in this batch.
|=======================================================================
=== Producing Files
The Files producer is optimized for two body types:
- `java.io.InputStream` if `CamelFileLength` header is set
- `byte[]`
In either case, the remote file size is allocated
and then rewritten with body content. Any inconsistency between
declared file length and stream length results in a corrupted
remote file.
==== Limitations
The underlying Azure Files service does not allow growing files. The file
length must be known at its creation time, consequently:
- `CamelFileLength` header has an important
meaning even for producers.
- No appending mode is supported.
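Because the declared length must equal the number of bytes in the stream, it is safest to derive both from the same byte array. The sketch below uses a plain `Map` as a stand-in for the Camel message headers. The class and method names are hypothetical, and storing the length as a `Long` is an assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; the Map stands in for the Camel message headers.
final class FileLengthSketch {

    // Encodes text once so that body and declared length share one source.
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Headers for a stream body: the length is the byte count, not the character count.
    static Map<String, Object> headersFor(byte[] payload) {
        Map<String, Object> headers = new HashMap<>();
        headers.put("CamelFileLength", (long) payload.length);
        return headers;
    }

    // Stream body whose size matches the declared length exactly.
    static InputStream bodyFor(byte[] payload) {
        return new ByteArrayInputStream(payload);
    }
}
```

Note that the length counts bytes: a string with non-ASCII characters encodes to more bytes in UTF-8 than it has characters, so `String.length()` is not a safe substitute.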
=== About Timeouts
You can use the `connectTimeout` option to set
a timeout in millis for connecting or disconnecting.
The `timeout` option applies only as the data timeout in millis.
The timeout for metadata operations is the minimum of
`readLockCheckInterval`, `timeout`, and 20,000 millis.
For now, the file upload has no timeout. During the upload,
the underlying library may log timeout warnings. These are
recoverable, and the upload can continue.
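The metadata timeout rule in this section can be written out as a small calculation. This is only a sketch of the arithmetic with hypothetical class and method names, not the component's code.

```java
// Hypothetical illustration of the metadata timeout rule.
final class MetadataTimeoutSketch {

    // Upper bound stated in this section.
    static final long METADATA_TIMEOUT_CAP_MILLIS = 20_000L;

    // Minimum of readLockCheckInterval, timeout and the fixed cap, all in millis.
    static long metadataTimeoutMillis(long readLockCheckInterval, long timeout) {
        return Math.min(METADATA_TIMEOUT_CAP_MILLIS, Math.min(readLockCheckInterval, timeout));
    }
}
```

For example, with `readLockCheckInterval=5000` and `timeout=30000`, the result is 5,000 millis; with both options set above 20,000, the cap applies.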
=== Using Local Work Directory
Camel supports consuming from remote file servers and downloading the
files directly into a local work directory. This avoids reading the
entire remote file content into memory as it is streamed directly into
the local file using `FileOutputStream`.
Camel will store to a local file with the same name as the remote file,
though with `.inprogress` as an extension while the file is being
downloaded. Afterward, the file is renamed to remove the `.inprogress`
suffix. And finally, when the Exchange is complete,
the local file is deleted.
So if you want to download files from a remote file server and store them
as local files, then you need to route to a file endpoint such as:
[source,java]
----
from("azure-files://...&localWorkDirectory=/tmp").to("file://inbox");
----
[TIP]
====
The route above is very efficient as it avoids reading the entire file content into memory.
It will download the remote file directly to a local file stream.
The `java.io.File` handle is then used as the Exchange body. The file producer leverages this fact and can work directly on the work file `java.io.File` handle and perform a `java.io.File.renameTo` to the target filename.
As Camel knows it's a local work file, it can optimize and use a rename instead of a file copy, as the work file is meant to be deleted anyway.
====
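The download-then-rename behavior described in this section can be sketched in plain Java. This is an illustration of the pattern only, not Camel's implementation: the class and method names are hypothetical, and it uses `java.nio.file.Files.copy` rather than a `FileOutputStream`.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration of the ".inprogress" pattern; not Camel code.
final class WorkDirectorySketch {

    // Streams the remote content to <fileName>.inprogress, then renames it to <fileName>.
    static Path download(Path workDir, String fileName, InputStream remote) {
        try {
            Files.createDirectories(workDir);
            Path inProgress = workDir.resolve(fileName + ".inprogress");
            Path done = workDir.resolve(fileName);
            // Copy straight to disk; the content is never held in memory as a whole.
            Files.copy(remote, inProgress, StandardCopyOption.REPLACE_EXISTING);
            // Rename only once the download has finished.
            return Files.move(inProgress, done, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Readers of the work directory that ignore `.inprogress` files never see a partially downloaded file under its final name.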
=== Custom Filtering
Camel supports pluggable filtering strategies. One strategy is to implement
the built-in `org.apache.camel.component.file.GenericFileFilter` interface in
Java. You can then configure the endpoint with such a filter to skip
certain files before they are processed.
For example, you could build your own filter that only accepts files
whose names start with `report`,
and then configure the route using the *filter* option to
reference the filter (using `#` notation) defined in the
Spring XML file.
The `file` argument of `accept(file)` has the following properties:
- endpoint path: the share name such as `/samples`
- relative path: a path to the file such as `subdir/a file`
- directory: `true` if a directory
- file length: if not a directory, then a length of the file in bytes
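The decision logic of a filter that only accepts files whose names start with `report` can be sketched in plain Java, independent of Camel types. The class and method below are hypothetical; in a real route this check would live inside the `accept` method of a `GenericFileFilter` implementation and read the relative path and directory flag from the `file` argument. Accepting all directories is an assumption made here so that nested folders are not excluded.

```java
// Hypothetical, Camel-independent sketch of the filter decision.
final class ReportFilterSketch {

    // relativePath: path to the file such as "subdir/report-1.csv"
    // directory: true if the entry is a directory
    static boolean accept(String relativePath, boolean directory) {
        if (directory) {
            // Assumption: keep directories so that nested folders are still visited.
            return true;
        }
        // Take the last path segment as the file name.
        String fileName = relativePath.substring(relativePath.lastIndexOf('/') + 1);
        return fileName.startsWith("report");
    }
}
```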
=== Filtering using ANT path matcher
The ANT path matcher is a filter shipped out-of-the-box in the
*camel-spring* jar. So you need to depend on *camel-spring* if you are
using Maven. +
The reason is that we leverage Spring's
http://static.springsource.org/spring/docs/3.0.x/api/org/springframework/util/AntPathMatcher.html[AntPathMatcher]
to do the actual matching.
The file paths are matched with the following rules:
* `?` matches one character
* `*` matches zero or more characters
* `**` matches zero or more directories in a path
The sample below demonstrates how to use it:
[source,java]
----
from("azure-files://...&antInclude=**/*.txt").to("...");
----
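To illustrate the three matching rules listed above, the sketch below translates a pattern into a regular expression. It is a simplified approximation for illustration only, with hypothetical names; it is not Spring's `AntPathMatcher` and does not cover all of its edge cases.

```java
import java.util.regex.Pattern;

// Simplified approximation of the three rules; not Spring's AntPathMatcher.
final class AntPatternSketch {

    static Pattern toRegex(String antPattern) {
        StringBuilder regex = new StringBuilder();
        int i = 0;
        while (i < antPattern.length()) {
            char c = antPattern.charAt(i);
            if (antPattern.startsWith("**/", i)) {
                regex.append("(?:.*/)?");   // zero or more directories
                i += 3;
            } else if (antPattern.startsWith("**", i)) {
                regex.append(".*");         // anything, including separators
                i += 2;
            } else if (c == '*') {
                regex.append("[^/]*");      // zero or more characters within one segment
                i++;
            } else if (c == '?') {
                regex.append("[^/]");       // exactly one character within one segment
                i++;
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
                i++;
            }
        }
        return Pattern.compile(regex.toString());
    }

    static boolean matches(String antPattern, String path) {
        return toRegex(antPattern).matcher(path).matches();
    }
}
```

With this sketch, `**/*.txt` matches both `a.txt` and `inbox/spam/a.txt`, whereas `*.txt` does not match `inbox/a.txt` because `*` stays within one path segment.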
=== Using a Proxy
Consult the https://learn.microsoft.com/en-us/azure/developer/java/sdk/proxying[underlying library]
documentation.
=== Consuming a single file using a fixed name
The FTP component optimizes the _single file using a fixed name_ use case
through a special combination of options:

- `useList=false`
- `fileName=myFileName.txt`
- `ignoreFileNotFoundOrPermissionError=true`

This component has no such optimization, so
it is necessary to fall back to regular filters (i.e. the list
permission is needed).
=== Debug logging
This component logs at the *TRACE* level, which can be helpful when
troubleshooting problems.