| <!DOCTYPE HTML> |
| <html lang="en"> |
| <head> |
| <!-- Generated by javadoc (17) on Wed Jun 10 19:48:32 UTC 2026 --> |
| <title>DataFrame (Apache DataFusion Java 0.2.0-SNAPSHOT)</title> |
| <meta name="viewport" content="width=device-width, initial-scale=1"> |
| <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> |
| <meta name="dc.created" content="2026-06-10"> |
| <meta name="description" content="declaration: package: org.apache.datafusion, class: DataFrame"> |
| <meta name="generator" content="javadoc/ClassWriterImpl"> |
| <link rel="stylesheet" type="text/css" href="../../../stylesheet.css" title="Style"> |
| <link rel="stylesheet" type="text/css" href="../../../script-dir/jquery-ui.min.css" title="Style"> |
| <link rel="stylesheet" type="text/css" href="../../../jquery-ui.overrides.css" title="Style"> |
| <script type="text/javascript" src="../../../script.js"></script> |
| <script type="text/javascript" src="../../../script-dir/jquery-3.7.1.min.js"></script> |
| <script type="text/javascript" src="../../../script-dir/jquery-ui.min.js"></script> |
| </head> |
| <body class="class-declaration-page"> |
| <script type="text/javascript">var evenRowColor = "even-row-color"; |
| var oddRowColor = "odd-row-color"; |
| var tableTab = "table-tab"; |
| var activeTableTab = "active-table-tab"; |
| var pathtoroot = "../../../"; |
| loadScripts(document, 'script');</script> |
| <noscript> |
| <div>JavaScript is disabled on your browser.</div> |
| </noscript> |
| <div class="flex-box"> |
| <header role="banner" class="flex-header"> |
| <nav role="navigation"> |
| <!-- ========= START OF TOP NAVBAR ======= --> |
| <div class="top-nav" id="navbar-top"> |
| <div class="skip-nav"><a href="#skip-navbar-top" title="Skip navigation links">Skip navigation links</a></div> |
| <ul id="navbar-top-firstrow" class="nav-list" title="Navigation"> |
| <li><a href="../../../index.html">Overview</a></li> |
| <li><a href="package-summary.html">Package</a></li> |
| <li class="nav-bar-cell1-rev">Class</li> |
| <li><a href="class-use/DataFrame.html">Use</a></li> |
| <li><a href="package-tree.html">Tree</a></li> |
| <li><a href="../../../index-all.html">Index</a></li> |
| <li><a href="../../../help-doc.html#class">Help</a></li> |
| </ul> |
| </div> |
| <div class="sub-nav"> |
| <div> |
| <ul class="sub-nav-list"> |
| <li>Summary: </li> |
| <li>Nested | </li> |
| <li>Field | </li> |
| <li>Constr | </li> |
| <li><a href="#method-summary">Method</a></li> |
| </ul> |
| <ul class="sub-nav-list"> |
| <li>Detail: </li> |
| <li>Field | </li> |
| <li>Constr | </li> |
| <li><a href="#method-detail">Method</a></li> |
| </ul> |
| </div> |
| <div class="nav-list-search"><label for="search-input">SEARCH:</label> |
| <input type="text" id="search-input" value="search" disabled="disabled"> |
| <input type="reset" id="reset-button" value="reset" disabled="disabled"> |
| </div> |
| </div> |
| <!-- ========= END OF TOP NAVBAR ========= --> |
| <span class="skip-nav" id="skip-navbar-top"></span></nav> |
| </header> |
| <div class="flex-content"> |
| <main role="main"> |
| <!-- ======== START OF CLASS DATA ======== --> |
| <div class="header"> |
| <div class="sub-title"><span class="package-label-in-type">Package</span> <a href="package-summary.html">org.apache.datafusion</a></div> |
| <h1 title="Class DataFrame" class="title">Class DataFrame</h1> |
| </div> |
| <div class="inheritance" title="Inheritance Tree"><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/Object.html" title="class or interface in java.lang" class="external-link">java.lang.Object</a> |
| <div class="inheritance">org.apache.datafusion.DataFrame</div> |
| </div> |
| <section class="class-description" id="class-description"> |
| <dl class="notes"> |
| <dt>All Implemented Interfaces:</dt> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/AutoCloseable.html" title="class or interface in java.lang" class="external-link">AutoCloseable</a></code></dd> |
| </dl> |
| <hr> |
| <div class="type-signature"><span class="modifiers">public final class </span><span class="element-name type-name-label">DataFrame</span> |
| <span class="extends-implements">extends <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/Object.html" title="class or interface in java.lang" class="external-link">Object</a> |
| implements <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/AutoCloseable.html" title="class or interface in java.lang" class="external-link">AutoCloseable</a></span></div> |
| <div class="block">A lazy representation of a query plan, mirroring the Rust DataFusion <code>DataFrame</code>. Created |
| by <a href="SessionContext.html#sql(java.lang.String)"><code>SessionContext.sql(String)</code></a> or other planning entry points and executed by either |
| <a href="#collect(org.apache.arrow.memory.BufferAllocator)"><code>collect(org.apache.arrow.memory.BufferAllocator)</code></a> (materializes every batch on the native heap before returning) or <a href="#executeStream(org.apache.arrow.memory.BufferAllocator)"><code>executeStream(org.apache.arrow.memory.BufferAllocator)</code></a> (yields one batch at a time as Java drains the reader). |
| |
| <p>Instances are <strong>not thread-safe</strong> and must be closed. Both <a href="#collect(org.apache.arrow.memory.BufferAllocator)"><code>collect(org.apache.arrow.memory.BufferAllocator)</code></a> and |
| <a href="#executeStream(org.apache.arrow.memory.BufferAllocator)"><code>executeStream(org.apache.arrow.memory.BufferAllocator)</code></a> consume the DataFrame: a successfully consumed DataFrame cannot be |
| consumed again by either method (or by other executors such as <a href="#count()"><code>count()</code></a>), and <a href="#close()"><code>close()</code></a> on an already-consumed instance is a no-op.</div> |
| </section> |
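The consume-once lifecycle described above can be sketched in plain Java. This is a hypothetical stand-in, not the DataFusion binding itself: the class name `OneShotPlan`, its in-memory row list, the `isConsumed()` helper, and the use of `IllegalStateException` are all assumptions made for illustration. The source does not say which exception the real `DataFrame` throws on a second execution.

```java
import java.util.List;

/** Hypothetical stand-in for a consume-once handle; not the real DataFrame class. */
final class OneShotPlan implements AutoCloseable {
    private final List<String> rows;
    private boolean consumed = false; // set once an executor has taken ownership
    private boolean released = false; // set once the backing resource is given up

    OneShotPlan(List<String> rows) {
        this.rows = List.copyOf(rows);
    }

    /** Executor in the spirit of collect(): hands the rows over and consumes the plan. */
    List<String> collect() {
        ensureUsable();
        consumed = true;
        released = true; // ownership has moved to the returned result
        return rows;
    }

    /** Executor in the spirit of count(): also consumes the plan. */
    long count() {
        ensureUsable();
        consumed = true;
        released = true;
        return rows.size();
    }

    boolean isConsumed() {
        return consumed;
    }

    @Override
    public void close() {
        if (consumed || released) {
            return; // no-op after consumption or after an earlier close
        }
        released = true;
    }

    private void ensureUsable() {
        // Assumption: the sketch signals misuse with IllegalStateException.
        if (consumed) {
            throw new IllegalStateException("plan already consumed");
        }
        if (released) {
            throw new IllegalStateException("plan already closed");
        }
    }

    public static void main(String[] args) {
        try (OneShotPlan plan = new OneShotPlan(List.of("a", "b", "c"))) {
            System.out.println(plan.collect().size());
            try {
                plan.count(); // second executor on the same instance
            } catch (IllegalStateException e) {
                System.out.println("second executor rejected: " + e.getMessage());
            }
        } // close() runs here and does nothing because the plan was consumed
    }
}
```

Wrapping the handle in try-with-resources keeps the "must be closed" rule satisfied on every path: if an executor fails before consuming the plan, `close()` still releases it, and if an executor succeeds, the trailing `close()` is harmless.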
| <section class="summary"> |
| <ul class="summary-list"> |
| <!-- ========== METHOD SUMMARY =========== --> |
| <li> |
| <section class="method-summary" id="method-summary"> |
| <h2>Method Summary</h2> |
| <div id="method-summary-table"> |
| <div class="table-tabs" role="tablist" aria-orientation="horizontal"><button id="method-summary-table-tab0" role="tab" aria-selected="true" aria-controls="method-summary-table.tabpanel" tabindex="0" onkeydown="switchTab(event)" onclick="show('method-summary-table', 'method-summary-table', 3)" class="active-table-tab">All Methods</button><button id="method-summary-table-tab2" role="tab" aria-selected="false" aria-controls="method-summary-table.tabpanel" tabindex="-1" onkeydown="switchTab(event)" onclick="show('method-summary-table', 'method-summary-table-tab2', 3)" class="table-tab">Instance Methods</button><button id="method-summary-table-tab4" role="tab" aria-selected="false" aria-controls="method-summary-table.tabpanel" tabindex="-1" onkeydown="switchTab(event)" onclick="show('method-summary-table', 'method-summary-table-tab4', 3)" class="table-tab">Concrete Methods</button></div> |
| <div id="method-summary-table.tabpanel" role="tabpanel" aria-labelledby="method-summary-table-tab0"> |
| <div class="summary-table three-column-summary"> |
| <div class="table-header col-first">Modifier and Type</div> |
| <div class="table-header col-second">Method</div> |
| <div class="table-header col-last">Description</div> |
| <div class="col-first even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></code></div> |
| <div class="col-second even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#cache()" class="member-name-link">cache</a>()</code></div> |
| <div class="col-last even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Materialize this DataFrame into an in-memory table and return a new DataFrame that scans it.</div> |
| </div> |
| <div class="col-first odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code>void</code></div> |
| <div class="col-second odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#close()" class="member-name-link">close</a>()</code></div> |
| <div class="col-last odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> </div> |
| <div class="col-first even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code>org.apache.arrow.vector.ipc.ArrowReader</code></div> |
| <div class="col-second even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#collect(org.apache.arrow.memory.BufferAllocator)" class="member-name-link">collect</a><wbr>(org.apache.arrow.memory.BufferAllocator allocator)</code></div> |
| <div class="col-last even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Execute the plan and return its record batches as an <code>ArrowReader</code>.</div> |
| </div> |
| <div class="col-first odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code>long</code></div> |
| <div class="col-second odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#count()" class="member-name-link">count</a>()</code></div> |
| <div class="col-last odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Execute the plan and return the number of rows.</div> |
| </div> |
| <div class="col-first even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></code></div> |
| <div class="col-second even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#describe()" class="member-name-link">describe</a>()</code></div> |
| <div class="col-last even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Compute summary statistics (count, null_count, mean, std, min, max, median) over this |
| DataFrame's columns and return them as a new DataFrame.</div> |
| </div> |
| <div class="col-first odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></code></div> |
| <div class="col-second odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#distinct()" class="member-name-link">distinct</a>()</code></div> |
| <div class="col-last odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Deduplicate rows across all columns.</div> |
| </div> |
| <div class="col-first even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></code></div> |
| <div class="col-second even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#dropColumns(java.lang.String...)" class="member-name-link">dropColumns</a><wbr>(<a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a>... columnNames)</code></div> |
| <div class="col-last even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Drop the named columns.</div> |
| </div> |
| <div class="col-first odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></code></div> |
| <div class="col-second odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#except(org.apache.datafusion.DataFrame)" class="member-name-link">except</a><wbr>(<a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a> other)</code></div> |
| <div class="col-last odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Rows present in this DataFrame but not in <code>other</code>, keeping duplicates from the receiver |
| (SQL <code>EXCEPT ALL</code>).</div> |
| </div> |
| <div class="col-first even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></code></div> |
| <div class="col-second even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#exceptDistinct(org.apache.datafusion.DataFrame)" class="member-name-link">exceptDistinct</a><wbr>(<a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a> other)</code></div> |
| <div class="col-last even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Rows present in this DataFrame but not in <code>other</code>, deduplicated (SQL <code>EXCEPT</code>).</div> |
| </div> |
| <div class="col-first odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code>org.apache.arrow.vector.ipc.ArrowReader</code></div> |
| <div class="col-second odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#executeStream(org.apache.arrow.memory.BufferAllocator)" class="member-name-link">executeStream</a><wbr>(org.apache.arrow.memory.BufferAllocator allocator)</code></div> |
| <div class="col-last odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Execute the plan and return its record batches as a streaming <code>ArrowReader</code>.</div> |
| </div> |
| <div class="col-first even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></code></div> |
| <div class="col-second even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#explain(boolean,boolean)" class="member-name-link">explain</a><wbr>(boolean verbose, |
| boolean analyze)</code></div> |
| <div class="col-last even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Return a new DataFrame whose rows describe the plan that would execute this DataFrame.</div> |
| </div> |
| <div class="col-first odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></code></div> |
| <div class="col-second odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#filter(java.lang.String)" class="member-name-link">filter</a><wbr>(<a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a> predicate)</code></div> |
| <div class="col-last odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Apply a SQL predicate to produce a filtered DataFrame.</div> |
| </div> |
| <div class="col-first even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></code></div> |
| <div class="col-second even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#intersect(org.apache.datafusion.DataFrame)" class="member-name-link">intersect</a><wbr>(<a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a> other)</code></div> |
| <div class="col-last even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Rows present in both this DataFrame and <code>other</code>, keeping duplicates from the receiver |
| (SQL <code>INTERSECT ALL</code>).</div> |
| </div> |
| <div class="col-first odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></code></div> |
| <div class="col-second odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#intersectDistinct(org.apache.datafusion.DataFrame)" class="member-name-link">intersectDistinct</a><wbr>(<a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a> other)</code></div> |
| <div class="col-last odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Rows present in both this DataFrame and <code>other</code>, deduplicated (SQL <code>INTERSECT</code>).</div> |
| </div> |
| <div class="col-first even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></code></div> |
| <div class="col-second even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#join(org.apache.datafusion.DataFrame,org.apache.datafusion.JoinType,java.lang.String%5B%5D,java.lang.String%5B%5D)" class="member-name-link">join</a><wbr>(<a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a> right, |
| <a href="JoinType.html" title="enum class in org.apache.datafusion">JoinType</a> type, |
| <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a>[] leftCols, |
| <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a>[] rightCols)</code></div> |
| <div class="col-last even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Equi-join this DataFrame with <code>right</code> on the named columns, using the given <a href="JoinType.html" title="enum class in org.apache.datafusion"><code>JoinType</code></a>.</div> |
| </div> |
| <div class="col-first odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></code></div> |
| <div class="col-second odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#join(org.apache.datafusion.DataFrame,org.apache.datafusion.JoinType,java.lang.String%5B%5D,java.lang.String%5B%5D,java.lang.String)" class="member-name-link">join</a><wbr>(<a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a> right, |
| <a href="JoinType.html" title="enum class in org.apache.datafusion">JoinType</a> type, |
| <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a>[] leftCols, |
| <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a>[] rightCols, |
| <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a> filter)</code></div> |
| <div class="col-last odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Equi-join this DataFrame with <code>right</code>, restricting the result with a residual SQL filter |
| parsed against the <em>combined</em> schema (left columns followed by right columns; columns |
| may be qualified with the relation alias when ambiguous).</div> |
| </div> |
| <div class="col-first even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></code></div> |
| <div class="col-second even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#joinOn(org.apache.datafusion.DataFrame,org.apache.datafusion.JoinType,java.lang.String...)" class="member-name-link">joinOn</a><wbr>(<a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a> right, |
| <a href="JoinType.html" title="enum class in org.apache.datafusion">JoinType</a> type, |
| <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a>... predicates)</code></div> |
| <div class="col-last even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Join this DataFrame with <code>right</code> using arbitrary SQL predicates parsed against the |
| <em>combined</em> schema.</div> |
| </div> |
| <div class="col-first odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></code></div> |
| <div class="col-second odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#limit(int)" class="member-name-link">limit</a><wbr>(int fetch)</code></div> |
| <div class="col-last odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Take the first <code>fetch</code> rows.</div> |
| </div> |
| <div class="col-first even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></code></div> |
| <div class="col-second even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#limit(int,int)" class="member-name-link">limit</a><wbr>(int skip, |
| int fetch)</code></div> |
| <div class="col-last even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Skip <code>skip</code> rows, then take the next <code>fetch</code> rows.</div> |
| </div> |
| <div class="col-first odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></code></div> |
| <div class="col-second odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#repartitionHash(int,java.lang.String...)" class="member-name-link">repartitionHash</a><wbr>(int numPartitions, |
| <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a>... columns)</code></div> |
| <div class="col-last odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Repartition this DataFrame by hashing the named columns into <code>numPartitions</code> output |
| partitions. v1 supports column-name keys only; expression keys are deferred until the Java |
| binding gains an <code>Expr</code> builder.</div> |
| </div> |
| <div class="col-first even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></code></div> |
| <div class="col-second even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#repartitionRoundRobin(int)" class="member-name-link">repartitionRoundRobin</a><wbr>(int numPartitions)</code></div> |
| <div class="col-last even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Repartition this DataFrame using a round-robin scheme across <code>numPartitions</code> output |
| partitions.</div> |
| </div> |
| <div class="col-first odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code>org.apache.arrow.vector.types.pojo.Schema</code></div> |
| <div class="col-second odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#schema()" class="member-name-link">schema</a>()</code></div> |
| <div class="col-last odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Return the Arrow <code>Schema</code> of this DataFrame's output.</div> |
| </div> |
| <div class="col-first even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></code></div> |
| <div class="col-second even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#select(java.lang.String...)" class="member-name-link">select</a><wbr>(<a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a>... columnNames)</code></div> |
| <div class="col-last even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Project the listed columns into a new DataFrame.</div> |
| </div> |
| <div class="col-first odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code>void</code></div> |
| <div class="col-second odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#show()" class="member-name-link">show</a>()</code></div> |
| <div class="col-last odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Execute the plan and print formatted batches to native stdout.</div> |
| </div> |
| <div class="col-first even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code>void</code></div> |
| <div class="col-second even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#show(int)" class="member-name-link">show</a><wbr>(int limit)</code></div> |
| <div class="col-last even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Execute the plan and print the first <code>limit</code> rows to native stdout.</div> |
| </div> |
| <div class="col-first odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></code></div> |
| <div class="col-second odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#sort(org.apache.datafusion.SortExpr...)" class="member-name-link">sort</a><wbr>(<a href="SortExpr.html" title="class in org.apache.datafusion">SortExpr</a>... exprs)</code></div> |
| <div class="col-last odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Order the rows by the supplied sort keys.</div> |
| </div> |
| <div class="col-first even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></code></div> |
| <div class="col-second even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#union(org.apache.datafusion.DataFrame)" class="member-name-link">union</a><wbr>(<a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a> other)</code></div> |
| <div class="col-last even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Concatenate this DataFrame with <code>other</code> by column position, keeping all duplicates (SQL |
| <code>UNION ALL</code>).</div> |
| </div> |
| <div class="col-first odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></code></div> |
| <div class="col-second odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#unionByName(org.apache.datafusion.DataFrame)" class="member-name-link">unionByName</a><wbr>(<a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a> other)</code></div> |
| <div class="col-last odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Concatenate this DataFrame with <code>other</code> by column name, keeping all duplicates.</div> |
| </div> |
| <div class="col-first even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></code></div> |
| <div class="col-second even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#unionByNameDistinct(org.apache.datafusion.DataFrame)" class="member-name-link">unionByNameDistinct</a><wbr>(<a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a> other)</code></div> |
| <div class="col-last even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Concatenate this DataFrame with <code>other</code> by column name, removing duplicates.</div> |
| </div> |
| <div class="col-first odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></code></div> |
| <div class="col-second odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#unionDistinct(org.apache.datafusion.DataFrame)" class="member-name-link">unionDistinct</a><wbr>(<a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a> other)</code></div> |
| <div class="col-last odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Concatenate this DataFrame with <code>other</code> by column position, removing duplicates (SQL |
| <code>UNION DISTINCT</code>, equivalent to plain <code>UNION</code> in standard SQL).</div> |
| </div> |
| <div class="col-first even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></code></div> |
| <div class="col-second even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#unnestColumns(java.lang.String...)" class="member-name-link">unnestColumns</a><wbr>(<a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a>... columns)</code></div> |
| <div class="col-last even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Expand list or struct columns into rows or fields, with default <a href="UnnestOptions.html" title="class in org.apache.datafusion"><code>UnnestOptions</code></a>.</div> |
| </div> |
| <div class="col-first odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></code></div> |
| <div class="col-second odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#unnestColumns(org.apache.datafusion.UnnestOptions,java.lang.String...)" class="member-name-link">unnestColumns</a><wbr>(<a href="UnnestOptions.html" title="class in org.apache.datafusion">UnnestOptions</a> options, |
| <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a>... columns)</code></div> |
| <div class="col-last odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Expand list or struct columns into rows or fields with the supplied <a href="UnnestOptions.html" title="class in org.apache.datafusion"><code>UnnestOptions</code></a>.</div> |
| </div> |
| <div class="col-first even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></code></div> |
| <div class="col-second even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#withColumn(java.lang.String,java.lang.String)" class="member-name-link">withColumn</a><wbr>(<a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a> name, |
| <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a> expr)</code></div> |
| <div class="col-last even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Add a column to this DataFrame computed from a SQL expression.</div> |
| </div> |
| <div class="col-first odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></code></div> |
| <div class="col-second odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#withColumnRenamed(java.lang.String,java.lang.String)" class="member-name-link">withColumnRenamed</a><wbr>(<a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a> oldName, |
| <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a> newName)</code></div> |
| <div class="col-last odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Rename a column.</div> |
| </div> |
| <div class="col-first even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code>void</code></div> |
| <div class="col-second even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#writeCsv(java.lang.String)" class="member-name-link">writeCsv</a><wbr>(<a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a> path)</code></div> |
| <div class="col-last even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Materialize this DataFrame as CSV at <code>path</code>.</div> |
| </div> |
| <div class="col-first odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code>void</code></div> |
| <div class="col-second odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#writeCsv(java.lang.String,org.apache.datafusion.CsvWriteOptions)" class="member-name-link">writeCsv</a><wbr>(<a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a> path, |
| <a href="CsvWriteOptions.html" title="class in org.apache.datafusion">CsvWriteOptions</a> options)</code></div> |
| <div class="col-last odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Materialize this DataFrame as CSV at <code>path</code> with the supplied <a href="CsvWriteOptions.html" title="class in org.apache.datafusion"><code>CsvWriteOptions</code></a>.</div> |
| </div> |
| <div class="col-first even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code>void</code></div> |
| <div class="col-second even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#writeJson(java.lang.String)" class="member-name-link">writeJson</a><wbr>(<a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a> path)</code></div> |
| <div class="col-last even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Materialize this DataFrame as newline-delimited JSON at <code>path</code>.</div> |
| </div> |
| <div class="col-first odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code>void</code></div> |
| <div class="col-second odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#writeJson(java.lang.String,org.apache.datafusion.JsonWriteOptions)" class="member-name-link">writeJson</a><wbr>(<a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a> path, |
| <a href="JsonWriteOptions.html" title="class in org.apache.datafusion">JsonWriteOptions</a> options)</code></div> |
| <div class="col-last odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Materialize this DataFrame as newline-delimited JSON at <code>path</code> with the supplied <a href="JsonWriteOptions.html" title="class in org.apache.datafusion"><code>JsonWriteOptions</code></a>.</div> |
| </div> |
| <div class="col-first even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code>void</code></div> |
| <div class="col-second even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#writeParquet(java.lang.String)" class="member-name-link">writeParquet</a><wbr>(<a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a> path)</code></div> |
| <div class="col-last even-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Materialize this DataFrame as Parquet at <code>path</code>.</div> |
| </div> |
| <div class="col-first odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code>void</code></div> |
| <div class="col-second odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"><code><a href="#writeParquet(java.lang.String,org.apache.datafusion.ParquetWriteOptions)" class="member-name-link">writeParquet</a><wbr>(<a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a> path, |
| <a href="ParquetWriteOptions.html" title="class in org.apache.datafusion">ParquetWriteOptions</a> options)</code></div> |
| <div class="col-last odd-row-color method-summary-table method-summary-table-tab2 method-summary-table-tab4"> |
| <div class="block">Materialize this DataFrame as Parquet at <code>path</code> with the supplied <a href="ParquetWriteOptions.html" title="class in org.apache.datafusion"><code>ParquetWriteOptions</code></a>.</div> |
| </div> |
| </div> |
| </div> |
| </div> |
| <div class="inherited-list"> |
| <h3 id="methods-inherited-from-class-java.lang.Object">Methods inherited from class java.lang.<a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/Object.html" title="class or interface in java.lang" class="external-link">Object</a></h3> |
| <code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/Object.html#clone()" title="class or interface in java.lang" class="external-link">clone</a>, <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/Object.html#equals(java.lang.Object)" title="class or interface in java.lang" class="external-link">equals</a>, <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/Object.html#finalize()" title="class or interface in java.lang" class="external-link">finalize</a>, <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/Object.html#getClass()" title="class or interface in java.lang" class="external-link">getClass</a>, <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/Object.html#hashCode()" title="class or interface in java.lang" class="external-link">hashCode</a>, <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/Object.html#notify()" title="class or interface in java.lang" class="external-link">notify</a>, <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/Object.html#notifyAll()" title="class or interface in java.lang" class="external-link">notifyAll</a>, <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/Object.html#toString()" title="class or interface in java.lang" class="external-link">toString</a>, <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/Object.html#wait()" title="class or interface in java.lang" class="external-link">wait</a>, <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/Object.html#wait(long)" title="class or interface in java.lang" class="external-link">wait</a>, <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/Object.html#wait(long,int)" title="class or interface in java.lang" class="external-link">wait</a></code></div> |
| </section> |
| </li> |
| </ul> |
| </section> |
| <section class="details"> |
| <ul class="details-list"> |
| <!-- ============ METHOD DETAIL ========== --> |
| <li> |
| <section class="method-details" id="method-detail"> |
| <h2>Method Details</h2> |
| <ul class="member-list"> |
| <li> |
| <section class="detail" id="collect(org.apache.arrow.memory.BufferAllocator)"> |
| <h3>collect</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type">org.apache.arrow.vector.ipc.ArrowReader</span> <span class="element-name">collect</span><wbr><span class="parameters">(org.apache.arrow.memory.BufferAllocator allocator)</span></div> |
| <div class="block">Execute the plan and return its record batches as an <code>ArrowReader</code>. |
| |
| <p>Consumes this DataFrame: the native plan is released as soon as the stream is established. |
| The caller is responsible for closing the returned reader, and the supplied allocator must |
| outlive it. |
| |
| <p>This method materializes every batch on the native heap before the first batch crosses the |
| FFI boundary, which can exhaust native memory on the Rust side for unbounded or very large result sets. Prefer |
| <a href="#executeStream(org.apache.arrow.memory.BufferAllocator)"><code>executeStream(BufferAllocator)</code></a> for analytics-scale queries.</div> |
| </section> |
| </li> |
| <li> |
| <section class="detail" id="executeStream(org.apache.arrow.memory.BufferAllocator)"> |
| <h3>executeStream</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type">org.apache.arrow.vector.ipc.ArrowReader</span> <span class="element-name">executeStream</span><wbr><span class="parameters">(org.apache.arrow.memory.BufferAllocator allocator)</span></div> |
| <div class="block">Execute the plan and return its record batches as a streaming <code>ArrowReader</code>. Each call to |
| <code>ArrowReader.loadNextBatch()</code> drives one async <code>stream.next()</code> on the native side, so |
| memory pressure stays bounded by the executor pipeline plus one in-flight batch instead of the |
| full result set. |
| |
| <p>Consumes this DataFrame with the same lifecycle rules as <a href="#collect(org.apache.arrow.memory.BufferAllocator)"><code>collect(BufferAllocator)</code></a>: |
| the native plan is released as soon as the stream is established, the caller closes the |
| returned reader, and the supplied allocator must outlive it. |
| |
| <p>For result sets that fit comfortably in native memory and are read in their entirety, <a href="#collect(org.apache.arrow.memory.BufferAllocator)"><code>collect(BufferAllocator)</code></a> remains a reasonable choice. For TB-scale or unbounded result sets, |
| use this method.</div> |
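| <div class="block">The trade-off between the two execution modes can be sketched with a toy model in plain Java. This is not the library's implementation: <code>StreamingModel</code>, <code>BatchSource</code> and <code>collectAll</code> are hypothetical names introduced only for illustration, and an <code>int[]</code> stands in for a record batch. The sketch counts how many batches exist before the caller reads the first one.</div> |

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Toy model of eager collection versus pull-based streaming; not the DataFusion API. */
final class StreamingModel {

    /** Hypothetical batch producer that records how many batches it has built so far. */
    static final class BatchSource implements Iterator<int[]> {
        private final int totalBatches;
        private int produced = 0;

        BatchSource(int totalBatches) {
            this.totalBatches = totalBatches;
        }

        @Override
        public boolean hasNext() {
            return produced < totalBatches;
        }

        @Override
        public int[] next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            produced++;
            return new int[] {produced}; // stand-in for one record batch
        }

        int produced() {
            return produced;
        }
    }

    /** Eager, collect-like mode: every batch is built before the caller sees the first one. */
    static List<int[]> collectAll(BatchSource source) {
        List<int[]> all = new ArrayList<>();
        while (source.hasNext()) {
            all.add(source.next());
        }
        return all;
    }

    public static void main(String[] args) {
        BatchSource eager = new BatchSource(1000);
        collectAll(eager);
        System.out.println("eager: batches built before first read = " + eager.produced()); // 1000

        BatchSource lazy = new BatchSource(1000);
        lazy.next(); // streaming mode: each pull builds exactly one batch
        System.out.println("streaming: batches built before first read = " + lazy.produced()); // 1
    }
}
```

| <div class="block">In the streaming case the caller decides when the next batch is built, which is why the amount of data held at once stays small regardless of the total result size.</div> |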
| </section> |
| </li> |
| <li> |
| <section class="detail" id="schema()"> |
| <h3>schema</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type">org.apache.arrow.vector.types.pojo.Schema</span> <span class="element-name">schema</span>()</div> |
| <div class="block">Return the Arrow <code>Schema</code> of this DataFrame's output. Non-consuming: the receiver remains |
| usable and must still be closed independently. Schema inspection does not execute the plan. |
| |
| <p>The schema is transferred via Arrow IPC; no <code>BufferAllocator</code> is required because a |
| schema carries no buffer data.</div> |
| </section> |
| </li> |
| <li> |
| <section class="detail" id="explain(boolean,boolean)"> |
| <h3>explain</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type"><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></span> <span class="element-name">explain</span><wbr><span class="parameters">(boolean verbose, |
| boolean analyze)</span></div> |
| <div class="block">Return a new DataFrame whose rows describe the plan that would execute this DataFrame. |
| Non-consuming: the receiver remains usable and must still be closed independently. |
| |
| <p>With <code>verbose=false</code> and <code>analyze=false</code> (the cheap, lazy variant), the result |
| contains the logical plan only. <code>verbose=true</code> adds optimised-plan and physical-plan |
| rows; <code>analyze=true</code> runs the plan and attaches per-operator metrics. Render via <a href="#show()"><code>show()</code></a> or <a href="#collect(org.apache.arrow.memory.BufferAllocator)"><code>collect(BufferAllocator)</code></a>.</div> |
| </section> |
| </li> |
| <li> |
| <section class="detail" id="cache()"> |
| <h3>cache</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type"><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></span> <span class="element-name">cache</span>()</div> |
| <div class="block">Materialise this DataFrame into an in-memory table and return a new DataFrame that scans it. |
| Non-consuming: the receiver remains usable and must still be closed independently. |
| |
| <p>Executes the plan eagerly: the entire result set is held in native memory until the returned |
| DataFrame is closed. Suitable for intermediate results that will be reused across multiple |
| downstream queries.</div> |
| <dl class="notes"> |
| <dt>Throws:</dt> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/RuntimeException.html" title="class or interface in java.lang" class="external-link">RuntimeException</a></code> - if execution fails.</dd> |
| </dl> |
| </section> |
| </li> |
| <li> |
| <section class="detail" id="describe()"> |
| <h3>describe</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type"><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></span> <span class="element-name">describe</span>()</div> |
| <div class="block">Compute summary statistics (count, null_count, mean, std, min, max, median) over this |
| DataFrame's columns and return them as a new DataFrame. Non-consuming: the receiver remains |
| usable and must still be closed independently. |
| |
| <p>Executes the plan: DataFusion runs seven aggregate sub-plans against this DataFrame to build |
| the summary table. Numeric columns receive every statistic; non-numeric columns receive <code> |
| count</code> / <code>null_count</code> / <code>min</code> / <code>max</code> where applicable.</div> |
| <dl class="notes"> |
| <dt>Throws:</dt> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/RuntimeException.html" title="class or interface in java.lang" class="external-link">RuntimeException</a></code> - if execution fails.</dd> |
| </dl> |
| </section> |
| </li> |
| <li> |
| <section class="detail" id="count()"> |
| <h3>count</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type">long</span> <span class="element-name">count</span>()</div> |
| <div class="block">Execute the plan and return the number of rows.</div> |
| </section> |
| </li> |
| <li> |
| <section class="detail" id="show()"> |
| <h3>show</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type">void</span> <span class="element-name">show</span>()</div> |
| <div class="block">Execute the plan and print formatted batches to native stdout.</div> |
| </section> |
| </li> |
| <li> |
| <section class="detail" id="show(int)"> |
| <h3>show</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type">void</span> <span class="element-name">show</span><wbr><span class="parameters">(int limit)</span></div> |
| <div class="block">Execute the plan and print the first <code>limit</code> rows to native stdout.</div> |
| </section> |
| </li> |
| <li> |
| <section class="detail" id="select(java.lang.String...)"> |
| <h3>select</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type"><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></span> <span class="element-name">select</span><wbr><span class="parameters">(<a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a>... columnNames)</span></div> |
| <div class="block">Project the listed columns into a new DataFrame. The receiver remains usable and must still be |
| closed independently.</div> |
| </section> |
| </li> |
| <li> |
| <section class="detail" id="filter(java.lang.String)"> |
| <h3>filter</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type"><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></span> <span class="element-name">filter</span><wbr><span class="parameters">(<a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a> predicate)</span></div> |
| <div class="block">Apply a SQL predicate to produce a filtered DataFrame. The predicate is parsed against this |
| DataFrame's own schema. The receiver remains usable and must still be closed independently.</div> |
| </section> |
| </li> |
| <li> |
| <section class="detail" id="limit(int)"> |
| <h3>limit</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type"><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></span> <span class="element-name">limit</span><wbr><span class="parameters">(int fetch)</span></div> |
| <div class="block">Take the first <code>fetch</code> rows. Equivalent to <a href="#limit(int,int)"><code>limit(int, int)</code></a> with <code>skip = |
| 0</code>. The receiver remains usable and must still be closed independently.</div> |
| </section> |
| </li> |
| <li> |
| <section class="detail" id="limit(int,int)"> |
| <h3>limit</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type"><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></span> <span class="element-name">limit</span><wbr><span class="parameters">(int skip, |
| int fetch)</span></div> |
| <div class="block">Skip <code>skip</code> rows, then take the next <code>fetch</code> rows. Both arguments must be |
| non-negative. The receiver remains usable and must still be closed independently.</div> |
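| <div class="block">The skip-then-fetch semantics can be illustrated on an ordinary list. This is a minimal sketch in plain Java, not the library's code: <code>LimitModel</code> is a hypothetical name, and the exception thrown for negative arguments is an assumption, since the documentation above does not state which exception the real method raises.</div> |

```java
import java.util.List;
import java.util.stream.Collectors;

/** Toy model of limit(skip, fetch) over an in-memory list; not the DataFusion API. */
final class LimitModel {

    /** Skips the first {@code skip} rows, then keeps at most {@code fetch} rows. */
    static <T> List<T> limit(List<T> rows, int skip, int fetch) {
        // Assumption: the exception type here is illustrative only.
        if (skip < 0 || fetch < 0) {
            throw new IllegalArgumentException("skip and fetch must be non-negative");
        }
        return rows.stream().skip(skip).limit(fetch).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> rows = List.of(10, 20, 30, 40, 50);
        System.out.println(limit(rows, 1, 2));  // [20, 30]
        System.out.println(limit(rows, 0, 3));  // [10, 20, 30], same as limit(3)
        System.out.println(limit(rows, 4, 10)); // [50], fetch may exceed the remaining rows
    }
}
```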
| </section> |
| </li> |
| <li> |
| <section class="detail" id="distinct()"> |
| <h3>distinct</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type"><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></span> <span class="element-name">distinct</span>()</div> |
| <div class="block">Deduplicate rows across all columns. The receiver remains usable and must still be closed |
| independently.</div> |
| </section> |
| </li> |
| <li> |
| <section class="detail" id="dropColumns(java.lang.String...)"> |
| <h3>dropColumns</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type"><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></span> <span class="element-name">dropColumns</span><wbr><span class="parameters">(<a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a>... columnNames)</span></div> |
| <div class="block">Drop the named columns and keep all others. The complement of <a href="#select(java.lang.String...)"><code>select(String...)</code></a>. The receiver remains usable |
| and must still be closed independently.</div> |
| </section> |
| </li> |
| <li> |
| <section class="detail" id="withColumnRenamed(java.lang.String,java.lang.String)"> |
| <h3>withColumnRenamed</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type"><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></span> <span class="element-name">withColumnRenamed</span><wbr><span class="parameters">(<a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a> oldName, |
| <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a> newName)</span></div> |
| <div class="block">Rename a column. The receiver remains usable and must still be closed independently.</div> |
| </section> |
| </li> |
| <li> |
| <section class="detail" id="withColumn(java.lang.String,java.lang.String)"> |
| <h3>withColumn</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type"><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></span> <span class="element-name">withColumn</span><wbr><span class="parameters">(<a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a> name, |
| <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a> expr)</span></div> |
| <div class="block">Add a column to this DataFrame computed from a SQL expression. If a column with the given name |
| already exists, it is replaced in place; otherwise the new column is appended. The expression |
| is parsed against this DataFrame's own schema, matching the convention used by <a href="#filter(java.lang.String)"><code>filter(String)</code></a>. The receiver remains usable and must still be closed independently.</div> |
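| <div class="block">The replace-in-place versus append rule can be modelled with an insertion-ordered map from column name to expression. This is a sketch under stated assumptions, not the library's implementation: <code>WithColumnModel</code> is a hypothetical name, and expressions are kept as plain strings rather than being parsed.</div> |

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of withColumn's replace-or-append rule; not the DataFusion API. */
final class WithColumnModel {

    /** Returns a new column map; the input map is left untouched. */
    static Map<String, String> withColumn(Map<String, String> columns, String name, String expr) {
        if (name == null || expr == null) {
            throw new IllegalArgumentException("name and expr must not be null");
        }
        Map<String, String> out = new LinkedHashMap<>(columns);
        // LinkedHashMap keeps the original position when an existing key is overwritten,
        // and appends at the end when the key is new.
        out.put(name, expr);
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("a", "a");
        columns.put("b", "b");
        columns.put("c", "c");

        System.out.println(withColumn(columns, "b", "a + 1").keySet()); // [a, b, c]
        System.out.println(withColumn(columns, "d", "a * 2").keySet()); // [a, b, c, d]
        System.out.println(columns.keySet());                           // [a, b, c]
    }
}
```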
| <dl class="notes"> |
| <dt>Throws:</dt> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/IllegalArgumentException.html" title="class or interface in java.lang" class="external-link">IllegalArgumentException</a></code> - if <code>name</code> or <code>expr</code> is <code>null</code>.</dd> |
| </dl> |
| </section> |
| </li> |
| <li> |
| <section class="detail" id="unnestColumns(java.lang.String...)"> |
| <h3>unnestColumns</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type"><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></span> <span class="element-name">unnestColumns</span><wbr><span class="parameters">(<a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a>... columns)</span></div> |
| <div class="block">Expand list or struct columns into rows or fields, with default <a href="UnnestOptions.html" title="class in org.apache.datafusion"><code>UnnestOptions</code></a> (i.e. |
| <code>preserveNulls = true</code>). The receiver remains usable and must still be closed |
| independently.</div> |
| </section> |
| </li> |
| <li> |
| <section class="detail" id="unnestColumns(org.apache.datafusion.UnnestOptions,java.lang.String...)"> |
| <h3>unnestColumns</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type"><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></span> <span class="element-name">unnestColumns</span><wbr><span class="parameters">(<a href="UnnestOptions.html" title="class in org.apache.datafusion">UnnestOptions</a> options, |
| <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a>... columns)</span></div> |
| <div class="block">Expand list or struct columns into rows or fields with the supplied <a href="UnnestOptions.html" title="class in org.apache.datafusion"><code>UnnestOptions</code></a>. The |
| receiver remains usable and must still be closed independently.</div> |
| <dl class="notes"> |
| <dt>Throws:</dt> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/IllegalArgumentException.html" title="class or interface in java.lang" class="external-link">IllegalArgumentException</a></code> - if <code>options</code> or <code>columns</code> is <code>null</code>.</dd> |
| </dl> |
| </section> |
| </li> |
| <li> |
| <section class="detail" id="union(org.apache.datafusion.DataFrame)"> |
| <h3>union</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type"><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></span> <span class="element-name">union</span><wbr><span class="parameters">(<a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a> other)</span></div> |
| <div class="block">Concatenate this DataFrame with <code>other</code> by column position, keeping all duplicates (SQL |
| <code>UNION ALL</code>). The two schemas must match positionally. Both this DataFrame and <code> |
| other</code> remain usable after the call and must still be closed independently.</div> |
| <dl class="notes"> |
| <dt>Throws:</dt> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/IllegalArgumentException.html" title="class or interface in java.lang" class="external-link">IllegalArgumentException</a></code> - if <code>other</code> is <code>null</code>.</dd> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/RuntimeException.html" title="class or interface in java.lang" class="external-link">RuntimeException</a></code> - if the schemas are incompatible.</dd> |
| </dl> |
| </section> |
| </li> |
| <li> |
| <section class="detail" id="unionDistinct(org.apache.datafusion.DataFrame)"> |
| <h3>unionDistinct</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type"><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></span> <span class="element-name">unionDistinct</span><wbr><span class="parameters">(<a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a> other)</span></div> |
| <div class="block">Concatenate this DataFrame with <code>other</code> by column position, removing duplicates (SQL |
| <code>UNION DISTINCT</code> -- equivalent to plain <code>UNION</code> in standard SQL). Both DataFrames |
| remain usable.</div> |
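| <div class="block">The difference between keeping and removing duplicates in a positional union can be shown on rows held as plain lists. This is an illustrative sketch only: <code>UnionModel</code> and its methods are hypothetical names, and the output order of the deduplicated result here (first occurrence wins) is an artifact of the sketch, not a guarantee made by the documentation above.</div> |

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

/** Toy model of positional UNION ALL versus UNION DISTINCT; not the DataFusion API. */
final class UnionModel {

    /** UNION ALL: concatenate rows, keeping every duplicate. */
    static List<List<Object>> unionAll(List<List<Object>> left, List<List<Object>> right) {
        List<List<Object>> out = new ArrayList<>(left);
        out.addAll(right);
        return out;
    }

    /** UNION DISTINCT: concatenate rows, then keep one copy of each distinct row. */
    static List<List<Object>> unionDistinct(List<List<Object>> left, List<List<Object>> right) {
        return new ArrayList<>(new LinkedHashSet<>(unionAll(left, right)));
    }

    public static void main(String[] args) {
        List<List<Object>> left = List.of(List.of(1, "a"), List.of(2, "b"));
        List<List<Object>> right = List.of(List.of(2, "b"), List.of(3, "c"));

        System.out.println(unionAll(left, right));      // [[1, a], [2, b], [2, b], [3, c]]
        System.out.println(unionDistinct(left, right)); // [[1, a], [2, b], [3, c]]
    }
}
```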
| <dl class="notes"> |
| <dt>Throws:</dt> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/IllegalArgumentException.html" title="class or interface in java.lang" class="external-link">IllegalArgumentException</a></code> - if <code>other</code> is <code>null</code>.</dd> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/RuntimeException.html" title="class or interface in java.lang" class="external-link">RuntimeException</a></code> - if the schemas are incompatible.</dd> |
| </dl> |
| </section> |
| </li> |
| <li> |
| <section class="detail" id="unionByName(org.apache.datafusion.DataFrame)"> |
| <h3>unionByName</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type"><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></span> <span class="element-name">unionByName</span><wbr><span class="parameters">(<a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a> other)</span></div> |
| <div class="block">Concatenate this DataFrame with <code>other</code> by column name, keeping all duplicates. Columns |
| present in only one side are filled with NULL on the other. Both DataFrames remain usable.</div> |
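| <div class="block">The NULL-filling behaviour of a by-name union can be sketched with rows held as maps from column name to value. This is a toy model, not the library's code: <code>UnionByNameModel</code> is a hypothetical name, and the output column order chosen here (left columns first, then columns found only on the right) is an assumption of the sketch.</div> |

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of a by-name union with NULL filling; not the DataFusion API. */
final class UnionByNameModel {

    static List<Map<String, Object>> unionByName(
            List<String> leftColumns, List<Map<String, Object>> left,
            List<String> rightColumns, List<Map<String, Object>> right) {
        // Output schema: every column name that appears on either side, without repeats.
        Set<String> columns = new LinkedHashSet<>(leftColumns);
        columns.addAll(rightColumns);

        List<Map<String, Object>> out = new ArrayList<>();
        for (List<Map<String, Object>> side : List.of(left, right)) {
            for (Map<String, Object> row : side) {
                Map<String, Object> widened = new LinkedHashMap<>();
                for (String column : columns) {
                    // Map.get returns null for a missing key, which models the NULL fill.
                    widened.put(column, row.get(column));
                }
                out.add(widened);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> left = List.of(Map.of("id", 1, "name", "a"));
        List<Map<String, Object>> right = List.of(Map.of("id", 2, "score", 9.5));

        for (Map<String, Object> row :
                unionByName(List.of("id", "name"), left, List.of("id", "score"), right)) {
            System.out.println(row);
        }
        // {id=1, name=a, score=null}
        // {id=2, name=null, score=9.5}
    }
}
```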
| <dl class="notes"> |
| <dt>Throws:</dt> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/IllegalArgumentException.html" title="class or interface in java.lang" class="external-link">IllegalArgumentException</a></code> - if <code>other</code> is <code>null</code>.</dd> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/RuntimeException.html" title="class or interface in java.lang" class="external-link">RuntimeException</a></code> - if column types disagree on a shared name.</dd> |
| </dl> |
| </section> |
| </li> |
| <li> |
| <section class="detail" id="unionByNameDistinct(org.apache.datafusion.DataFrame)"> |
| <h3>unionByNameDistinct</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type"><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></span> <span class="element-name">unionByNameDistinct</span><wbr><span class="parameters">(<a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a> other)</span></div> |
| <div class="block">Concatenate this DataFrame with <code>other</code> by column name, removing duplicates. Columns |
| present in only one side are filled with NULL on the other. Both DataFrames remain usable.</div> |
| <dl class="notes"> |
| <dt>Throws:</dt> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/IllegalArgumentException.html" title="class or interface in java.lang" class="external-link">IllegalArgumentException</a></code> - if <code>other</code> is <code>null</code>.</dd> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/RuntimeException.html" title="class or interface in java.lang" class="external-link">RuntimeException</a></code> - if column types disagree on a shared name.</dd> |
| </dl> |
| </section> |
| </li> |
| <li> |
| <section class="detail" id="intersect(org.apache.datafusion.DataFrame)"> |
| <h3>intersect</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type"><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></span> <span class="element-name">intersect</span><wbr><span class="parameters">(<a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a> other)</span></div> |
| <div class="block">Rows present in both this DataFrame and <code>other</code>, keeping duplicates from the receiver |
| (SQL <code>INTERSECT ALL</code>). Both schemas must match positionally. Both DataFrames remain |
| usable. |
| |
| <p><strong>Implementation note:</strong> DataFusion implements <code>INTERSECT ALL</code> as a |
| left-semi join on equality, not as standard SQL bag intersection. A left row is kept if and |
| only if at least one matching row exists in <code>other</code>. With <code>left = (1, 2, 2, 3)</code> |
| and <code>right = (2, 3)</code>, the result is <code>(2, 2, 3)</code>: both copies of <code>2</code> |
| survive because each finds a match in <code>right</code>. Standard bag intersection, as in |
| PostgreSQL / Spark <code>INTERSECT ALL</code>, keeps <code>min(m, n)</code> copies of a row that occurs |
| <code>m</code> times in <code>left</code> and <code>n</code> times in <code>right</code>, so it would yield |
| <code>(2, 3)</code> here. The two approaches diverge whenever <code>other</code> has fewer copies than |
| <code>this</code> of a row that appears in both.</div> |
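<p>The difference between a left-semi join and bag intersection can be checked with plain Java lists. The sketch below is only a model of the two semantics under the note's example data; <code>IntersectAllSketch</code> and its method names are hypothetical and no DataFusion code is involved.</p>

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Plain-Java model of two possible INTERSECT ALL semantics. */
class IntersectAllSketch {

    /** Left-semi join: keep every left row that has at least one equal row on the right. */
    static List<Integer> semiJoin(List<Integer> left, List<Integer> right) {
        Set<Integer> present = new HashSet<>(right);
        List<Integer> out = new ArrayList<>();
        for (Integer row : left) {
            if (present.contains(row)) {
                out.add(row);
            }
        }
        return out;
    }

    /** Bag intersection: each value survives min(leftCount, rightCount) times. */
    static List<Integer> bagIntersect(List<Integer> left, List<Integer> right) {
        Map<Integer, Integer> remaining = new HashMap<>();
        for (Integer row : right) {
            remaining.merge(row, 1, Integer::sum);
        }
        List<Integer> out = new ArrayList<>();
        for (Integer row : left) {
            int n = remaining.getOrDefault(row, 0);
            if (n > 0) {
                out.add(row);
                remaining.put(row, n - 1); // each right row can pair with one left row only
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> left = List.of(1, 2, 2, 3);
        List<Integer> right = List.of(2, 3);
        System.out.println(semiJoin(left, right));     // [2, 2, 3]
        System.out.println(bagIntersect(left, right)); // [2, 3]
    }
}
```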
| <dl class="notes"> |
| <dt>Throws:</dt> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/IllegalArgumentException.html" title="class or interface in java.lang" class="external-link">IllegalArgumentException</a></code> - if <code>other</code> is <code>null</code>.</dd> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/RuntimeException.html" title="class or interface in java.lang" class="external-link">RuntimeException</a></code> - if the schemas are incompatible.</dd> |
| </dl> |
| </section> |
| </li> |
| <li> |
| <section class="detail" id="intersectDistinct(org.apache.datafusion.DataFrame)"> |
| <h3>intersectDistinct</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type"><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></span> <span class="element-name">intersectDistinct</span><wbr><span class="parameters">(<a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a> other)</span></div> |
| <div class="block">Rows present in both this DataFrame and <code>other</code>, deduplicated (SQL <code>INTERSECT</code>). |
| Both schemas must match positionally. Both DataFrames remain usable.</div> |
| <dl class="notes"> |
| <dt>Throws:</dt> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/IllegalArgumentException.html" title="class or interface in java.lang" class="external-link">IllegalArgumentException</a></code> - if <code>other</code> is <code>null</code>.</dd> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/RuntimeException.html" title="class or interface in java.lang" class="external-link">RuntimeException</a></code> - if the schemas are incompatible.</dd> |
| </dl> |
| </section> |
| </li> |
| <li> |
| <section class="detail" id="except(org.apache.datafusion.DataFrame)"> |
| <h3>except</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type"><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></span> <span class="element-name">except</span><wbr><span class="parameters">(<a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a> other)</span></div> |
| <div class="block">Rows present in this DataFrame but not in <code>other</code>, keeping duplicates from the receiver |
| (SQL <code>EXCEPT ALL</code>). Both schemas must match positionally. Both DataFrames remain usable. |
| |
| <p><strong>Implementation note:</strong> DataFusion implements <code>EXCEPT ALL</code> as a |
| left-anti join on equality, not as standard SQL bag difference. A left row is kept if and only if |
| <em>no</em> matching row exists in <code>other</code>; the multiplicity of matches is irrelevant. |
| With <code>left = (1, 1, 2, 2, 3)</code> and <code>right = (1, 3)</code>, the result is <code>(2, 2)</code>: |
| both copies of <code>2</code> survive (no match in <code>right</code>); both copies of <code>1</code> and the |
| <code>3</code> drop. Standard bag difference, as in PostgreSQL / Spark <code>EXCEPT ALL</code>, keeps |
| <code>max(m - n, 0)</code> copies of a row that occurs <code>m</code> times in <code>left</code> and |
| <code>n</code> times in <code>right</code>, so it would yield <code>(1, 2, 2)</code> here. The two |
| approaches diverge whenever <code>right</code> contains fewer copies than <code>left</code> of a row that |
| appears in both.</div> |
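<p>The same comparison for difference: a left-anti join against bag difference, modelled with plain Java lists on the note's example data. <code>ExceptAllSketch</code> and its method names are hypothetical; this is a sketch of the two semantics, not of DataFusion's execution plan.</p>

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Plain-Java model of two possible EXCEPT ALL semantics. */
class ExceptAllSketch {

    /** Left-anti join: keep a left row only when no equal row exists on the right. */
    static List<Integer> antiJoin(List<Integer> left, List<Integer> right) {
        Set<Integer> present = new HashSet<>(right);
        List<Integer> out = new ArrayList<>();
        for (Integer row : left) {
            if (!present.contains(row)) {
                out.add(row);
            }
        }
        return out;
    }

    /** Bag difference: each value survives max(leftCount - rightCount, 0) times. */
    static List<Integer> bagDifference(List<Integer> left, List<Integer> right) {
        Map<Integer, Integer> toCancel = new HashMap<>();
        for (Integer row : right) {
            toCancel.merge(row, 1, Integer::sum);
        }
        List<Integer> out = new ArrayList<>();
        for (Integer row : left) {
            int n = toCancel.getOrDefault(row, 0);
            if (n > 0) {
                toCancel.put(row, n - 1); // one right row cancels exactly one left row
            } else {
                out.add(row);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> left = List.of(1, 1, 2, 2, 3);
        List<Integer> right = List.of(1, 3);
        System.out.println(antiJoin(left, right));      // [2, 2]
        System.out.println(bagDifference(left, right)); // [1, 2, 2]
    }
}
```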
| <dl class="notes"> |
| <dt>Throws:</dt> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/IllegalArgumentException.html" title="class or interface in java.lang" class="external-link">IllegalArgumentException</a></code> - if <code>other</code> is <code>null</code>.</dd> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/RuntimeException.html" title="class or interface in java.lang" class="external-link">RuntimeException</a></code> - if the schemas are incompatible.</dd> |
| </dl> |
| </section> |
| </li> |
| <li> |
| <section class="detail" id="exceptDistinct(org.apache.datafusion.DataFrame)"> |
| <h3>exceptDistinct</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type"><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></span> <span class="element-name">exceptDistinct</span><wbr><span class="parameters">(<a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a> other)</span></div> |
| <div class="block">Rows present in this DataFrame but not in <code>other</code>, deduplicated (SQL <code>EXCEPT</code>). |
| Both schemas must match positionally. Both DataFrames remain usable.</div> |
| <dl class="notes"> |
| <dt>Throws:</dt> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/IllegalArgumentException.html" title="class or interface in java.lang" class="external-link">IllegalArgumentException</a></code> - if <code>other</code> is <code>null</code>.</dd> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/RuntimeException.html" title="class or interface in java.lang" class="external-link">RuntimeException</a></code> - if the schemas are incompatible.</dd> |
| </dl> |
| </section> |
| </li> |
| <li> |
| <section class="detail" id="sort(org.apache.datafusion.SortExpr...)"> |
| <h3>sort</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type"><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></span> <span class="element-name">sort</span><wbr><span class="parameters">(<a href="SortExpr.html" title="class in org.apache.datafusion">SortExpr</a>... exprs)</span></div> |
| <div class="block">Order the rows by the supplied sort keys. Each <a href="SortExpr.html" title="class in org.apache.datafusion"><code>SortExpr</code></a> names a column and a direction |
| (<a href="SortExpr.html#asc(java.lang.String)"><code>SortExpr.asc(String)</code></a> / <a href="SortExpr.html#desc(java.lang.String)"><code>SortExpr.desc(String)</code></a>); call <a href="SortExpr.html#nullsFirst(boolean)"><code>SortExpr.nullsFirst(boolean)</code></a> to override null placement. |
| |
| <p>An empty <code>exprs</code> array is a no-op (matches DataFusion's <code>sort(vec![])</code>). The |
| receiver remains usable and must still be closed independently.</div> |
| <dl class="notes"> |
| <dt>Throws:</dt> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/IllegalArgumentException.html" title="class or interface in java.lang" class="external-link">IllegalArgumentException</a></code> - if <code>exprs</code> or any element is <code>null</code>.</dd> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/RuntimeException.html" title="class or interface in java.lang" class="external-link">RuntimeException</a></code> - if a sort column does not exist in this DataFrame's schema.</dd> |
| </dl> |
| </section> |
| </li> |
| <li> |
| <section class="detail" id="repartitionRoundRobin(int)"> |
| <h3>repartitionRoundRobin</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type"><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></span> <span class="element-name">repartitionRoundRobin</span><wbr><span class="parameters">(int numPartitions)</span></div> |
| <div class="block">Repartition this DataFrame using a round-robin scheme across <code>numPartitions</code> output |
| partitions. The receiver remains usable and must still be closed independently.</div> |
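<p>A round-robin scheme can be pictured as dealing items to partitions in turn. The following is a simplified per-row model with hypothetical names (<code>RoundRobinSketch</code>, <code>repartition</code>); the engine is understood to distribute record batches rather than single rows, so actual row placement will differ.</p>

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java model of round-robin distribution across a fixed number of partitions. */
class RoundRobinSketch {

    static <T> List<List<T>> repartition(List<T> rows, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        List<List<T>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<>());
        }
        // Item i goes to partition i mod numPartitions, so sizes differ by at most one.
        for (int i = 0; i < rows.size(); i++) {
            partitions.get(i % numPartitions).add(rows.get(i));
        }
        return partitions;
    }

    public static void main(String[] args) {
        System.out.println(repartition(List.of(1, 2, 3, 4, 5), 2)); // [[1, 3, 5], [2, 4]]
    }
}
```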
| <dl class="notes"> |
| <dt>Throws:</dt> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/IllegalArgumentException.html" title="class or interface in java.lang" class="external-link">IllegalArgumentException</a></code> - if <code>numPartitions &lt;= 0</code>.</dd> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/RuntimeException.html" title="class or interface in java.lang" class="external-link">RuntimeException</a></code> - if the underlying repartition plan rejects the request.</dd> |
| </dl> |
| </section> |
| </li> |
| <li> |
| <section class="detail" id="repartitionHash(int,java.lang.String...)"> |
| <h3>repartitionHash</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type"><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></span> <span class="element-name">repartitionHash</span><wbr><span class="parameters">(int numPartitions, |
| <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a>... columns)</span></div> |
| <div class="block">Repartition this DataFrame by hashing the named columns into <code>numPartitions</code> output |
| partitions. v1 supports column-name keys only; expression keys are deferred until the Java |
| binding gains an <code>Expr</code> builder. The receiver remains usable and must still be closed |
| independently.</div> |
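<p>The property that matters for hash repartitioning is that rows with equal key columns land in the same partition. The sketch below illustrates that property with Java's own <code>hashCode</code>; DataFusion uses its own hash function, so the partition numbers here are not the ones the engine would assign. <code>HashRepartitionSketch</code> and <code>partitionFor</code> are hypothetical names.</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Plain-Java model of choosing a partition from the values of the key columns. */
class HashRepartitionSketch {

    static int partitionFor(Map<String, Object> row, int numPartitions, String... columns) {
        if (numPartitions <= 0 || columns == null || columns.length == 0) {
            throw new IllegalArgumentException("need numPartitions > 0 and at least one column");
        }
        // The key is the tuple of values in the named columns, in the order given.
        List<Object> key = new ArrayList<>();
        for (String c : columns) {
            key.add(row.get(c));
        }
        // floorMod keeps the index in [0, numPartitions) even for negative hash codes.
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        Map<String, Object> a = Map.of("region", "eu", "amount", 10);
        Map<String, Object> b = Map.of("region", "eu", "amount", 99);
        // Same key value, different non-key column: both rows map to the same partition.
        System.out.println(partitionFor(a, 4, "region") == partitionFor(b, 4, "region")); // true
    }
}
```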
| <dl class="notes"> |
| <dt>Throws:</dt> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/IllegalArgumentException.html" title="class or interface in java.lang" class="external-link">IllegalArgumentException</a></code> - if <code>numPartitions &lt;= 0</code>, <code>columns</code> is <code>null</code> |
| or empty, or any element of <code>columns</code> is <code>null</code>.</dd> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/RuntimeException.html" title="class or interface in java.lang" class="external-link">RuntimeException</a></code> - if a partition column does not exist in this DataFrame's schema.</dd> |
| </dl> |
| </section> |
| </li> |
| <li> |
| <section class="detail" id="join(org.apache.datafusion.DataFrame,org.apache.datafusion.JoinType,java.lang.String[],java.lang.String[])"> |
| <h3>join</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type"><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></span> <span class="element-name">join</span><wbr><span class="parameters">(<a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a> right, |
| <a href="JoinType.html" title="enum class in org.apache.datafusion">JoinType</a> type, |
| <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a>[] leftCols, |
| <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a>[] rightCols)</span></div> |
| <div class="block">Equi-join this DataFrame with <code>right</code> on the named columns, using the given <a href="JoinType.html" title="enum class in org.apache.datafusion"><code>JoinType</code></a>. The receiver and <code>right</code> both remain usable and must still be closed |
| independently. |
| |
| <p>Equivalent to SQL <code>left &lt;type&gt; JOIN right ON l.leftCols[0] = r.rightCols[0] AND ...</code>. |
| <code>leftCols</code> and <code>rightCols</code> must have the same length.</div> |
| <dl class="notes"> |
| <dt>Throws:</dt> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/IllegalArgumentException.html" title="class or interface in java.lang" class="external-link">IllegalArgumentException</a></code> - if any argument is <code>null</code> or <code>leftCols.length != |
| rightCols.length</code>.</dd> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/IllegalStateException.html" title="class or interface in java.lang" class="external-link">IllegalStateException</a></code> - if either DataFrame is closed or already collected.</dd> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/RuntimeException.html" title="class or interface in java.lang" class="external-link">RuntimeException</a></code> - if join planning fails (column collision in the combined schema, |
| unknown column names, etc.).</dd> |
| </dl> |
| </section> |
| </li> |
| <li> |
| <section class="detail" id="join(org.apache.datafusion.DataFrame,org.apache.datafusion.JoinType,java.lang.String[],java.lang.String[],java.lang.String)"> |
| <h3>join</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type"><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></span> <span class="element-name">join</span><wbr><span class="parameters">(<a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a> right, |
| <a href="JoinType.html" title="enum class in org.apache.datafusion">JoinType</a> type, |
| <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a>[] leftCols, |
| <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a>[] rightCols, |
| <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a> filter)</span></div> |
| <div class="block">Equi-join this DataFrame with <code>right</code>, restricting the result with a residual SQL filter |
| parsed against the <em>combined</em> schema (left columns followed by right columns; columns |
| may be qualified with the relation alias when ambiguous). The receiver and <code>right</code> both |
| remain usable and must still be closed independently. |
| |
| <p>For outer joins, the filter is applied only to matched rows; unmatched rows are passed |
| through with nulls on the unmatched side, matching DataFusion's semantics.</div> |
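<p>The outer-join behaviour described above can be modelled as a left join in which the residual filter is part of the match condition. This is a sketch under assumptions: rows are two-element lists of (key, value), the join key is the first element, and <code>LeftJoinFilterSketch</code> is a hypothetical name rather than part of the binding.</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiPredicate;

/** Plain-Java model of a left outer equi-join with a residual filter. */
class LeftJoinFilterSketch {

    /** Output rows are [leftKey, leftValue, rightKey, rightValue]; right side is null when unmatched. */
    static List<List<Integer>> leftJoin(
            List<List<Integer>> left,
            List<List<Integer>> right,
            BiPredicate<List<Integer>, List<Integer>> filter) {
        List<List<Integer>> out = new ArrayList<>();
        for (List<Integer> l : left) {
            boolean matched = false;
            for (List<Integer> r : right) {
                // A pair matches only if the keys are equal AND the residual filter accepts it.
                if (l.get(0).equals(r.get(0)) && filter.test(l, r)) {
                    out.add(Arrays.asList(l.get(0), l.get(1), r.get(0), r.get(1)));
                    matched = true;
                }
            }
            if (!matched) {
                // The left row is preserved, not dropped, and padded with nulls.
                out.add(Arrays.asList(l.get(0), l.get(1), null, null));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<Integer>> left = List.of(List.of(1, 10), List.of(2, 20));
        List<List<Integer>> right = List.of(List.of(1, 5), List.of(2, 50));
        // Residual filter: the right value must be smaller than the left value.
        System.out.println(leftJoin(left, right, (l, r) -> r.get(1) < l.get(1)));
        // [[1, 10, 1, 5], [2, 20, null, null]]
    }
}
```

<p>Key 2 has an equal key on the right, but the pair fails the filter, so the left row appears once with nulls. Filtering the joined result afterwards would remove that row instead.</p>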
| <dl class="notes"> |
| <dt>Throws:</dt> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/IllegalArgumentException.html" title="class or interface in java.lang" class="external-link">IllegalArgumentException</a></code> - if any argument is <code>null</code> or <code>leftCols.length != |
| rightCols.length</code>.</dd> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/IllegalStateException.html" title="class or interface in java.lang" class="external-link">IllegalStateException</a></code> - if either DataFrame is closed or already collected.</dd> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/RuntimeException.html" title="class or interface in java.lang" class="external-link">RuntimeException</a></code> - if join planning or filter parsing fails.</dd> |
| </dl> |
| </section> |
| </li> |
| <li> |
| <section class="detail" id="joinOn(org.apache.datafusion.DataFrame,org.apache.datafusion.JoinType,java.lang.String...)"> |
| <h3>joinOn</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type"><a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a></span> <span class="element-name">joinOn</span><wbr><span class="parameters">(<a href="DataFrame.html" title="class in org.apache.datafusion">DataFrame</a> right, |
| <a href="JoinType.html" title="enum class in org.apache.datafusion">JoinType</a> type, |
| <a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a>... predicates)</span></div> |
| <div class="block">Join this DataFrame with <code>right</code> using arbitrary SQL predicates parsed against the |
| <em>combined</em> schema. Each predicate is parsed independently and the join evaluates their |
| conjunction. Predicates may reference columns from either side and may be qualified with the |
| relation alias when ambiguous (e.g. <code>"left.x = right.x"</code>). The receiver and <code>right</code> |
| both remain usable and must still be closed independently. |
| |
| <p>DataFusion's optimiser identifies and rewrites equality predicates into hash-join keys |
| automatically, so <code>joinOn(right, INNER, "l.id = r.id")</code> plans equivalently to <a href="#join(org.apache.datafusion.DataFrame,org.apache.datafusion.JoinType,java.lang.String%5B%5D,java.lang.String%5B%5D)"><code>join(DataFrame, JoinType, String[], String[])</code></a> with a single key. Use <code>joinOn</code> when the |
| predicate is not a simple equality, e.g. inequality joins or range conditions.</div> |
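<p>A join on a conjunction of non-equality predicates can be modelled as a nested loop that keeps a pair only when every predicate accepts it. The sketch below shows a range condition on hypothetical data (events and closed windows); <code>JoinOnSketch</code> and <code>innerJoinOn</code> are illustrative names, and the nested loop says nothing about the plan DataFusion would choose.</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

/** Plain-Java model of an inner join whose condition is a conjunction of predicates. */
class JoinOnSketch {

    static <L, R> List<Map.Entry<L, R>> innerJoinOn(
            List<L> left, List<R> right, List<BiPredicate<L, R>> predicates) {
        List<Map.Entry<L, R>> out = new ArrayList<>();
        for (L l : left) {
            for (R r : right) {
                boolean all = true;
                for (BiPredicate<L, R> p : predicates) {
                    if (!p.test(l, r)) {
                        all = false; // one failing predicate rejects the pair
                        break;
                    }
                }
                if (all) {
                    out.add(Map.entry(l, r));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> events = List.of(5, 15, 25);
        List<List<Integer>> windows = List.of(List.of(0, 10), List.of(11, 20));
        // Range condition: window start <= event <= window end, written as two predicates.
        List<BiPredicate<Integer, List<Integer>>> predicates = List.of(
                (e, w) -> e >= w.get(0),
                (e, w) -> e <= w.get(1));
        System.out.println(innerJoinOn(events, windows, predicates));
        // [5=[0, 10], 15=[11, 20]]
    }
}
```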
| <dl class="notes"> |
| <dt>Throws:</dt> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/IllegalArgumentException.html" title="class or interface in java.lang" class="external-link">IllegalArgumentException</a></code> - if <code>right</code> or <code>type</code> is <code>null</code>, or <code> |
| predicates</code> is <code>null</code> or empty, or any predicate is <code>null</code>.</dd> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/IllegalStateException.html" title="class or interface in java.lang" class="external-link">IllegalStateException</a></code> - if either DataFrame is closed or already collected.</dd> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/RuntimeException.html" title="class or interface in java.lang" class="external-link">RuntimeException</a></code> - if predicate parsing or join planning fails.</dd> |
| </dl> |
| </section> |
| </li> |
| <li> |
| <section class="detail" id="writeParquet(java.lang.String)"> |
| <h3>writeParquet</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type">void</span> <span class="element-name">writeParquet</span><wbr><span class="parameters">(<a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a> path)</span></div> |
| <div class="block">Materialize this DataFrame as Parquet at <code>path</code>. The path is treated as a directory |
| unless overridden via <a href="ParquetWriteOptions.html#singleFileOutput(boolean)"><code>ParquetWriteOptions.singleFileOutput(boolean)</code></a>. The receiver |
| remains usable and must still be closed independently.</div> |
| <dl class="notes"> |
| <dt>Throws:</dt> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/RuntimeException.html" title="class or interface in java.lang" class="external-link">RuntimeException</a></code> - if the write fails.</dd> |
| </dl> |
| </section> |
| </li> |
| <li> |
| <section class="detail" id="writeParquet(java.lang.String,org.apache.datafusion.ParquetWriteOptions)"> |
| <h3>writeParquet</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type">void</span> <span class="element-name">writeParquet</span><wbr><span class="parameters">(<a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a> path, |
| <a href="ParquetWriteOptions.html" title="class in org.apache.datafusion">ParquetWriteOptions</a> options)</span></div> |
| <div class="block">Materialize this DataFrame as Parquet at <code>path</code> with the supplied <a href="ParquetWriteOptions.html" title="class in org.apache.datafusion"><code>ParquetWriteOptions</code></a>. The receiver remains usable and must still be closed independently.</div> |
| <dl class="notes"> |
| <dt>Throws:</dt> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/RuntimeException.html" title="class or interface in java.lang" class="external-link">RuntimeException</a></code> - if the write fails (path inaccessible, invalid compression spec, |
| etc.).</dd> |
| </dl> |
| </section> |
| </li> |
| <li> |
| <section class="detail" id="writeCsv(java.lang.String)"> |
| <h3>writeCsv</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type">void</span> <span class="element-name">writeCsv</span><wbr><span class="parameters">(<a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a> path)</span></div> |
| <div class="block">Materialize this DataFrame as CSV at <code>path</code>. The path is treated as a directory unless |
| overridden via <a href="CsvWriteOptions.html#singleFileOutput(boolean)"><code>CsvWriteOptions.singleFileOutput(boolean)</code></a>. The receiver remains usable |
| and must still be closed independently.</div> |
| <dl class="notes"> |
| <dt>Throws:</dt> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/RuntimeException.html" title="class or interface in java.lang" class="external-link">RuntimeException</a></code> - if the write fails.</dd> |
| </dl> |
| </section> |
| </li> |
| <li> |
| <section class="detail" id="writeCsv(java.lang.String,org.apache.datafusion.CsvWriteOptions)"> |
| <h3>writeCsv</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type">void</span> <span class="element-name">writeCsv</span><wbr><span class="parameters">(<a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a> path, |
| <a href="CsvWriteOptions.html" title="class in org.apache.datafusion">CsvWriteOptions</a> options)</span></div> |
| <div class="block">Materialize this DataFrame as CSV at <code>path</code> with the supplied <a href="CsvWriteOptions.html" title="class in org.apache.datafusion"><code>CsvWriteOptions</code></a>. |
| The receiver remains usable and must still be closed independently.</div> |
| <dl class="notes"> |
| <dt>Throws:</dt> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/IllegalArgumentException.html" title="class or interface in java.lang" class="external-link">IllegalArgumentException</a></code> - if <code>path</code> or <code>options</code> is <code>null</code>.</dd> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/RuntimeException.html" title="class or interface in java.lang" class="external-link">RuntimeException</a></code> - if the write fails (path inaccessible, invalid compression spec, |
| etc.).</dd> |
| </dl> |
| </section> |
| </li> |
| <li> |
| <section class="detail" id="writeJson(java.lang.String)"> |
| <h3>writeJson</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type">void</span> <span class="element-name">writeJson</span><wbr><span class="parameters">(<a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a> path)</span></div> |
| <div class="block">Materialize this DataFrame as newline-delimited JSON at <code>path</code>. The path is treated as a |
| directory unless overridden via <a href="JsonWriteOptions.html#singleFileOutput(boolean)"><code>JsonWriteOptions.singleFileOutput(boolean)</code></a>. The |
| receiver remains usable and must still be closed independently.</div> |
| <dl class="notes"> |
| <dt>Throws:</dt> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/RuntimeException.html" title="class or interface in java.lang" class="external-link">RuntimeException</a></code> - if the write fails.</dd> |
| </dl> |
| </section> |
| </li> |
| <li> |
| <section class="detail" id="writeJson(java.lang.String,org.apache.datafusion.JsonWriteOptions)"> |
| <h3>writeJson</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type">void</span> <span class="element-name">writeJson</span><wbr><span class="parameters">(<a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html" title="class or interface in java.lang" class="external-link">String</a> path, |
| <a href="JsonWriteOptions.html" title="class in org.apache.datafusion">JsonWriteOptions</a> options)</span></div> |
| <div class="block">Materialize this DataFrame as newline-delimited JSON at <code>path</code> with the supplied <a href="JsonWriteOptions.html" title="class in org.apache.datafusion"><code>JsonWriteOptions</code></a>. The receiver remains usable and must still be closed independently.</div> |
| <dl class="notes"> |
| <dt>Throws:</dt> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/IllegalArgumentException.html" title="class or interface in java.lang" class="external-link">IllegalArgumentException</a></code> - if <code>path</code> or <code>options</code> is <code>null</code>.</dd> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/RuntimeException.html" title="class or interface in java.lang" class="external-link">RuntimeException</a></code> - if the write fails (path inaccessible, invalid compression spec, |
| etc.).</dd> |
| </dl> |
| </section> |
| </li> |
| <li> |
| <section class="detail" id="close()"> |
| <h3>close</h3> |
| <div class="member-signature"><span class="modifiers">public</span> <span class="return-type">void</span> <span class="element-name">close</span>()</div> |
| <dl class="notes"> |
| <dt>Specified by:</dt> |
| <dd><code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/AutoCloseable.html#close()" title="class or interface in java.lang" class="external-link">close</a></code> in interface <code><a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/AutoCloseable.html" title="class or interface in java.lang" class="external-link">AutoCloseable</a></code></dd> |
| </dl> |
| </section> |
| </li> |
| </ul> |
| </section> |
| </li> |
| </ul> |
| </section> |
| <!-- ========= END OF CLASS DATA ========= --> |
| </main> |
| <footer role="contentinfo"> |
| <hr> |
| <p class="legal-copy"><small>Copyright © 2026. All rights reserved.</small></p> |
| </footer> |
| </div> |
| </div> |
| </body> |
| </html> |