MINOR: Bump commons-io:commons-io from 2.19.0 to 2.21.0 (#974) Bumps [commons-io:commons-io](https://github.com/apache/commons-io) from 2.19.0 to 2.21.0. <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/apache/commons-io/blob/master/RELEASE-NOTES.txt">commons-io:commons-io's changelog</a>.</em></p> <blockquote> <p>Apache Commons IO 2.21.0 Release Notes</p> <p>The Apache Commons IO team is pleased to announce the release of Apache Commons IO 2.21.0.</p> <h2>Introduction</h2> <p>The Apache Commons IO library contains utility classes, stream implementations, file filters, file comparators, endian transformation classes, and much more.</p> <p>Version 2.21.0: Java 8 or later is required.</p> <h2>New features</h2> <p>o FileUtils#byteCountToDisplaySize() supports Zettabyte, Yottabyte, Ronnabyte and Quettabyte <a href="https://redirect.github.com/apache/commons-io/issues/763">#763</a>. Thanks to strangelookingnerd, Gary Gregory. o Add org.apache.commons.io.FileUtils.ONE_RB <a href="https://redirect.github.com/apache/commons-io/issues/763">#763</a>. Thanks to strangelookingnerd, Gary Gregory. o Add org.apache.commons.io.FileUtils.ONE_QB <a href="https://redirect.github.com/apache/commons-io/issues/763">#763</a>. Thanks to strangelookingnerd, Gary Gregory. o Add org.apache.commons.io.output.ProxyOutputStream.writeRepeat(byte[], int, int, long). Thanks to Gary Gregory. o Add org.apache.commons.io.output.ProxyOutputStream.writeRepeat(byte[], long). Thanks to Gary Gregory. o Add org.apache.commons.io.output.ProxyOutputStream.writeRepeat(int, long). Thanks to Gary Gregory. o Add length unit support in FileSystem limits. Thanks to Piotr P. Karwasz. o Add IOUtils.toByteArray(InputStream, int, int) for safer chunked reading with size validation. Thanks to Piotr P. Karwasz. o Add org.apache.commons.io.file.PathUtils.getPath(String, String). Thanks to Gary Gregory. o Add org.apache.commons.io.channels.ByteArraySeekableByteChannel. 
Thanks to Gary Gregory. o Add IOIterable.asIterable(). Thanks to Gary Gregory. o Add NIO channel support to <code>AbstractStreamBuilder</code>. Thanks to Piotr P. Karwasz. o Add CloseShieldChannel to close-shielded NIO Channels <a href="https://redirect.github.com/apache/commons-io/issues/786">#786</a>. Thanks to Piotr P. Karwasz. o Added IOUtils.checkFromIndexSize as a Java 8 backport of Objects.checkFromIndexSize <a href="https://redirect.github.com/apache/commons-io/issues/790">#790</a>. Thanks to Piotr P. Karwasz.</p> <h2>Fixed Bugs</h2> <p>o When testing on Java 21 and up, enable -XX:+EnableDynamicAgentLoading. Thanks to Gary Gregory. o When testing on Java 24 and up, don't fail FileUtilsListFilesTest for a different behavior in the JRE. Thanks to Gary Gregory. o ValidatingObjectInputStream does not validate dynamic proxy interfaces. Thanks to Stanislav Fort, Gary Gregory. o BoundedInputStream.getRemaining() now reports Long.MAX_VALUE instead of 0 when no limit is set. Thanks to Piotr P. Karwasz. o BoundedInputStream.available() correctly accounts for the maximum read limit. Thanks to Piotr P. Karwasz. o Deprecate IOUtils.readFully(InputStream, int) in favor of toByteArray(InputStream, int). Thanks to Gary Gregory, Piotr P. Karwasz. o IOUtils.toByteArray(InputStream) now throws IOException on byte array overflow. Thanks to Piotr P. Karwasz. o Javadoc general improvements. Thanks to Gary Gregory, Piotr P. Karwasz. o IOUtils.toByteArray() now throws EOFException when not enough data is available <a href="https://redirect.github.com/apache/commons-io/issues/796">#796</a>. Thanks to Piotr P. Karwasz. o Fix IOUtils.skip() usage in concurrent scenarios. Thanks to Piotr P. Karwasz. o [javadoc] Fix XmlStreamReader Javadoc to indicate the correct class that is built <a href="https://redirect.github.com/apache/commons-io/issues/806">#806</a>. 
Thanks to J Hawkins.</p> <h2>Changes</h2> <p>o Bump org.apache.commons:commons-parent from 85 to 91 <a href="https://redirect.github.com/apache/commons-io/issues/774">#774</a>, <a href="https://redirect.github.com/apache/commons-io/issues/783">#783</a>, <a href="https://redirect.github.com/apache/commons-io/issues/808">#808</a>. Thanks to Gary Gregory, Dependabot.</p> <!-- raw HTML omitted --> </blockquote> <p>... (truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/apache/commons-io/commit/54073d3b5fdd2985b98a48040ede95eb59c7ee53"><code>54073d3</code></a> Prepare for the release candidate 2.21.0 RC1</li> <li><a href="https://github.com/apache/commons-io/commit/f141f09d91368543e4f0754cbd649c484768c55c"><code>f141f09</code></a> Prepare for the next release candidate</li> <li><a href="https://github.com/apache/commons-io/commit/adcf1350152faf4dbd8cf53fb2f2649f25dbe227"><code>adcf135</code></a> Add license header</li> <li><a href="https://github.com/apache/commons-io/commit/0f499d060adbd4b36bbd9f47393a7ea6af8149ff"><code>0f499d0</code></a> Use new oak logo</li> <li><a href="https://github.com/apache/commons-io/commit/34a961c3ed58ed96c73836db154ae50f0c45110f"><code>34a961c</code></a> Use HTTPS in URL</li> <li><a href="https://github.com/apache/commons-io/commit/9e511181a03096b77c3a4b9c6077a4ac0b56b510"><code>9e51118</code></a> Use HTTPS in URL</li> <li><a href="https://github.com/apache/commons-io/commit/d715865ee705fdb8ed786582bd6bd4ee996b0665"><code>d715865</code></a> Add dependabot email [skip ci]</li> <li><a href="https://github.com/apache/commons-io/commit/3d6a7e113633e1a33ca254d744c3fcbab61663f3"><code>3d6a7e1</code></a> Javadoc</li> <li><a href="https://github.com/apache/commons-io/commit/ad875d566f273f54094b6b872bf9433be9fd86a7"><code>ad875d5</code></a> Bump actions/upload-artifact from 4.6.2 to 5.0.0 (<a href="https://redirect.github.com/apache/commons-io/issues/810">#810</a>)</li> <li><a 
href="https://github.com/apache/commons-io/commit/bc01dee31ec0ff10aa0841ff245b770fa1ecfade"><code>bc01dee</code></a> Bump github/codeql-action from 4.30.9 to 4.31.2 (<a href="https://redirect.github.com/apache/commons-io/issues/811">#811</a>)</li> <li>Additional commits viewable in <a href="https://github.com/apache/commons-io/compare/rel/commons-io-2.19.0...rel/commons-io-2.21.0">compare view</a></li> </ul> </details> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
The following guides explain the fundamental data structures used in the Java implementation of Apache Arrow.
Generated javadoc documentation is available here.
Refer to Building Apache Arrow for documentation of environment setup and build instructions.
Arrow uses Google's Flatbuffers to transport metadata. In the Java version of the library, the generated Flatbuffers classes can only be used with the same Flatbuffers version that generated them. Arrow packages a version of the arrow-vector module that shades flatbuffers and arrow-format into a single JAR. Using the classifier "shade-format-flatbuffers" in your pom.xml selects this JAR; you can then exclude the original dependency or resolve it to a version of your choosing.
$ flatc --version
flatc version 25.1.24
$ grep "dep.fbs.version" java/pom.xml
<dep.fbs.version>25.1.24</dep.fbs.version>
cd $ARROW_HOME
# remove the existing files
rm -rf java/format/src
# regenerate from the .fbs files
flatc --java -o java/format/src/main/java format/*.fbs
# prepend license header
mvn spotless:apply -pl :arrow-format
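The version check that precedes regeneration can also be automated. The following is a minimal sketch, assuming the `flatc --version` output and the `dep.fbs.version` line of java/pom.xml have already been captured as strings in the formats shown above. The class and method names (`FlatcVersionCheckSketch`, `versionsMatch`) are hypothetical and not part of Arrow.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of Arrow: compares the flatc version with the pom property. */
final class FlatcVersionCheckSketch {
  // Matches the line printed by `flatc --version`, e.g. "flatc version 25.1.24".
  private static final Pattern FLATC = Pattern.compile("flatc version (\\S+)");
  // Matches the property line found by grep in java/pom.xml.
  private static final Pattern POM =
      Pattern.compile("<dep\\.fbs\\.version>([^<]+)</dep\\.fbs\\.version>");

  /** Returns the first captured group, or null when the pattern does not occur. */
  static String extract(Pattern pattern, String text) {
    Matcher m = pattern.matcher(text);
    return m.find() ? m.group(1).trim() : null;
  }

  /** True only when both versions are present and identical. */
  static boolean versionsMatch(String flatcOutput, String pomLine) {
    String flatc = extract(FLATC, flatcOutput);
    String pom = extract(POM, pomLine);
    return flatc != null && flatc.equals(pom);
  }

  public static void main(String[] args) {
    String flatcOutput = "flatc version 25.1.24";
    String pomLine = "<dep.fbs.version>25.1.24</dep.fbs.version>";
    System.out.println(versionsMatch(flatcOutput, pomLine)); // prints "true"
  }
}
```

A mismatch (or missing input) returns false, which is the case in which the generated classes should not be regenerated until flatc and the declared dependency agree.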
There are several system properties and environment variables that users can configure. They trade safety for speed by turning off checks. Typically they are used only in production settings, after the code has been thoroughly tested with the checks enabled.
Bounds checking for memory accesses: Bounds checking is on by default. You can disable it by setting either the system property (arrow.enable_unsafe_memory_access) or the environment variable (ARROW_ENABLE_UNSAFE_MEMORY_ACCESS) to true. When both the system property and the environment variable are set, the system property takes precedence.
Null checking for gets: ValueVector get methods (but not getObject methods) verify by default that the slot is not null. You can disable this check by setting either the system property (arrow.enable_null_check_for_get) or the environment variable (ARROW_ENABLE_NULL_CHECK_FOR_GET) to false. When both the system property and the environment variable are set, the system property takes precedence.
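The precedence rule for both flags can be illustrated with a small sketch. It only models the documented behavior (system property first, then environment variable, then the default) and is not Arrow's own implementation. `ArrowFlagSketch` and `resolveFlag` are hypothetical names, and how Arrow treats values other than true and false is not covered here.

```java
/** Hypothetical illustration of the documented precedence rule; not Arrow API. */
final class ArrowFlagSketch {

  /** Uses the system property value if set, else the environment value, else the default. */
  static boolean resolveFlag(String propertyValue, String envValue, boolean defaultValue) {
    String chosen = propertyValue != null ? propertyValue : envValue;
    return chosen == null ? defaultValue : Boolean.parseBoolean(chosen);
  }

  public static void main(String[] args) {
    // Bounds checking is on by default, so unsafe access defaults to false.
    boolean unsafeAccess = resolveFlag(
        System.getProperty("arrow.enable_unsafe_memory_access"),
        System.getenv("ARROW_ENABLE_UNSAFE_MEMORY_ACCESS"),
        false);
    // Null checking for gets is on by default.
    boolean nullCheckForGet = resolveFlag(
        System.getProperty("arrow.enable_null_check_for_get"),
        System.getenv("ARROW_ENABLE_NULL_CHECK_FOR_GET"),
        true);
    System.out.println("bounds checking enabled: " + !unsafeAccess);
    System.out.println("null check for get enabled: " + nullCheckForGet);
  }
}
```

In this sketch, passing -Darrow.enable_unsafe_memory_access=true on the JVM command line makes the first line report false regardless of what the environment variable contains.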
-Dio.netty.tryReflectionSetAccessible=true should be set. This fixes the error java.lang.UnsupportedOperationException: sun.misc.Unsafe or java.nio.DirectByteBuffer.<init>(long, int) not available, which is thrown by Netty.
To support duplicate fields in a StructVector, enable -Darrow.struct.conflict.policy=CONFLICT_APPEND. By default (CONFLICT_REPLACE), duplicate fields are not preserved: a field with a conflicting name is overwritten. To support different policies for conflicting or duplicate fields, set this JVM flag or use the appropriate static constructor methods of StructVector.
Arrow Java follows the Google style guide here with the following differences:
The NoFinalizer, OverloadMethodsDeclarationOrder, and VariableDeclarationUsageDistance checks are disabled due to the existing code base. These rules should still be followed when possible.
Refer to checkstyle.xml for rule specifics.
When running tests, Arrow Java uses the Logback logger with SLF4J. By default, it uses the logback.xml present in the corresponding module's src/test/resources directory, which has the default log level set to INFO. Arrow Java can be built with an alternate logback configuration file using the following command run in the project root directory:
mvn -Dlogback.configurationFile=file:<path-of-logback-file>
See Logback Configuration for more details.
Integration tests which require more time or more memory can be run by activating the integration-tests profile. This activates the maven failsafe plugin and any class prefixed with IT will be run during the testing phase. The integration tests currently require a larger amount of memory (>4GB) and time to complete. To activate the profile:
mvn -Pintegration-tests <rest of mvn arguments>