| <!DOCTYPE html> |
| <html lang="en"> |
| <!-- |
| Licensed to the Apache Software Foundation (ASF) under one or more |
| contributor license agreements. See the NOTICE file distributed with |
| this work for additional information regarding copyright ownership. |
| The ASF licenses this file to You under the Apache License, Version 2.0 |
| (the "License"); you may not use this file except in compliance with |
| the License. You may obtain a copy of the License at |
| http://www.apache.org/licenses/LICENSE-2.0 |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |
| --> |
| <head> |
| <meta charset="utf-8" /> |
| <title>PutGridFS</title> |
| <link rel="stylesheet" href="../../../../../css/component-usage.css" type="text/css" /> |
| </head> |
| |
| <body> |
| <!-- Processor Documentation ================================================== --> |
| <h2>Description:</h2> |
| <p> |
| This processor writes a file, along with one or more user-defined metadata values, to the configured GridFS bucket. It |
| lets the user set the size of each file chunk during ingestion and can attempt to enforce file uniqueness by file |
| name or hash value at the application level, instead of relying solely on a database index. |
| </p> |
| <h3>GridFS File Attributes</h3> |
| <p> |
| <em>PutGridFS</em> adds flowfile attributes whose names start with a configured prefix to the GridFS document. |
| These attributes can later serve as metadata about the file when working with GridFS. |
| </p> |
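| <p> |
| The prefix selection described above can be sketched in a few lines of Python. This is a minimal illustration, not |
| the processor's implementation: the function name <em>select_metadata</em>, the prefix <em>gridfs.</em> and the |
| sample attributes are all hypothetical, and the sketch assumes attribute names are stored unchanged (the processor |
| may handle the prefix differently). |
| </p> |

```python
def select_metadata(attributes: dict[str, str], prefix: str) -> dict[str, str]:
    """Return the flowfile attributes whose names start with the prefix."""
    # Assumption: names are kept as-is; the real processor may strip the prefix.
    return {
        name: value
        for name, value in attributes.items()
        if name.startswith(prefix)
    }


# Hypothetical flowfile attributes and prefix, chosen only for this example.
flowfile_attributes = {
    "gridfs.owner": "analytics",
    "gridfs.source": "sensor-7",
    "filename": "readings.csv",
}

metadata = select_metadata(flowfile_attributes, "gridfs.")
print(metadata)
```

| <p> |
| Only the two attributes whose names begin with the prefix are selected; <em>filename</em> is left out. |
| </p> |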
| <h3>Chunk Size</h3> |
| <p> |
| GridFS splits a file into chunks, each stored in its own MongoDB document, as the file is ingested into the database. |
| The chunk size property sets the maximum size of each chunk. Leave this property at its default value |
| unless there is a specific business case to increase or decrease it. |
| </p> |
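| <p> |
| To make the effect of the chunk size concrete, the following sketch computes how many chunks a file of a given size |
| would occupy and how large the final chunk would be. The function name <em>chunk_layout</em> is hypothetical, the |
| chunk size of 255 KiB is used purely as an illustrative value, and treating an empty file as zero chunks is an |
| assumption of this sketch. |
| </p> |

```python
def chunk_layout(file_size: int, chunk_size: int) -> tuple[int, int]:
    """Return (number of chunks, size of the last chunk) in bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if file_size < 0:
        raise ValueError("file_size must not be negative")
    if file_size == 0:
        # Assumption: an empty file needs no chunks at all.
        return 0, 0
    # Ceiling division: every chunk is full except possibly the last one.
    count = -(-file_size // chunk_size)
    last = file_size - (count - 1) * chunk_size
    return count, last


# Illustrative value only; not necessarily the processor's default.
ILLUSTRATIVE_CHUNK_SIZE = 255 * 1024  # 261,120 bytes

count, last = chunk_layout(1_000_000, ILLUSTRATIVE_CHUNK_SIZE)
print(count, last)  # prints "4 216640"
```

| <p> |
| A smaller chunk size produces more documents per file, while a larger one produces fewer, larger documents. |
| </p> |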
| <h3>Uniqueness Enforcement</h3> |
| <p> |
| Uniqueness enforcement has four operating modes: |
| </p> |
| <ul> |
| <li>No enforcement at the application level.</li> |
| <li>Enforce by unique file name.</li> |
| <li>Enforce by unique hash value.</li> |
| <li>Use both hash and file name.</li> |
| </ul> |
| <p> |
| By default, the hash value is taken from the <em>hash.value</em> attribute, which can be generated by configuring a |
| <em>HashContent</em> processor upstream of <em>PutGridFS</em>. Both the hash and file name options query the existing |
| data to check whether a file matching those criteria already exists before attempting to write the flowfile contents. |
| </p> |
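| <p> |
| The check-before-write query for the four modes might be built along the following lines. This is a sketch under |
| stated assumptions rather than the processor's actual logic: the mode names, the function |
| <em>build_uniqueness_query</em>, and the field names <em>filename</em> and <em>md5</em> are hypothetical, and the |
| sketch assumes that the combined mode treats a match on either value as a duplicate. |
| </p> |

```python
from typing import Optional


def build_uniqueness_query(
    mode: str, filename: str, hash_value: str
) -> Optional[dict]:
    """Build a filter that finds an existing file, or None for no check."""
    # Assumption: these field names are illustrative, not confirmed.
    name_clause = {"filename": filename}
    hash_clause = {"md5": hash_value}

    if mode == "none":
        return None  # no application-level enforcement
    if mode == "name":
        return name_clause
    if mode == "hash":
        return hash_clause
    if mode == "both":
        # Assumption: a match on either value counts as a duplicate.
        return {"$or": [name_clause, hash_clause]}
    raise ValueError(f"unknown uniqueness mode: {mode}")


# Hypothetical values; the hash would come from the hash.value attribute.
query = build_uniqueness_query("both", "readings.csv", "abc123")
print(query)
```

| <p> |
| If the resulting filter matches an existing file, the write would be skipped; with no enforcement, no query is |
| issued and uniqueness depends entirely on any database index. |
| </p> |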
| </body> |
| </html> |