<!DOCTYPE html>
<!--[if IE]><![endif]-->
<html>
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<title>Namespace Lucene.Net.Join
| Apache Lucene.NET 4.8.0-beta00013 Documentation </title>
<meta name="viewport" content="width=device-width">
<meta name="title" content="Namespace Lucene.Net.Join
| Apache Lucene.NET 4.8.0-beta00013 Documentation ">
<meta name="generator" content="docfx 2.56.2.0">
<link rel="shortcut icon" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/logo/favicon.ico">
<link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.vendor.css">
<link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.css">
<link rel="stylesheet" href="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/main.css">
<meta property="docfx:navrel" content="toc.html">
<meta property="docfx:tocrel" content="join/toc.html">
<meta property="docfx:rel" content="https://lucenenet.apache.org/docs/4.8.0-beta00009/">
</head>
<body data-spy="scroll" data-target="#affix" data-offset="120">
<span id="forkongithub"><a href="https://github.com/apache/lucenenet" target="_blank">Fork me on GitHub</a></span>
<div id="wrapper">
<header>
<nav id="autocollapse" class="navbar ng-scope" role="navigation">
<div class="container">
<div class="navbar-header">
<button type="button" class="navbar-toggle" data-toggle="collapse" data-target="#navbar">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<a class="navbar-brand" href="/">
<img id="logo" class="svg" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/logo/lucene-net-color.png" alt="">
</a>
</div>
<div class="collapse navbar-collapse" id="navbar">
<form class="navbar-form navbar-right" role="search" id="search">
<div class="form-group">
<input type="text" class="form-control" id="search-query" placeholder="Search" autocomplete="off">
</div>
</form>
</div>
</div>
</nav>
<div class="subnav navbar navbar-default">
<div class="container hide-when-search">
<ul class="level0 breadcrumb">
<li>
<a href="https://lucenenet.apache.org/docs/4.8.0-beta00009/">API</a>
<span id="breadcrumb">
<ul class="breadcrumb">
<li></li>
</ul>
</span>
</li>
</ul>
</div>
</div>
</header>
<div class="container body-content">
<div id="search-results">
<div class="search-list"></div>
<div class="sr-items">
<p><i class="glyphicon glyphicon-refresh index-loading"></i></p>
</div>
<ul id="pagination"></ul>
</div>
</div>
<div role="main" class="container body-content hide-when-search">
<div class="sidenav hide-when-search">
<a class="btn toc-toggle collapse" data-toggle="collapse" href="#sidetoggle" aria-expanded="false" aria-controls="sidetoggle">Show / Hide Table of Contents</a>
<div class="sidetoggle collapse" id="sidetoggle">
<div id="sidetoc"></div>
</div>
</div>
<div class="article row grid-right">
<div class="col-md-10">
<article class="content wrap" id="_content" data-uid="Lucene.Net.Join">
<h1 id="Lucene_Net_Join" data-uid="Lucene.Net.Join" class="text-break">Namespace Lucene.Net.Join
</h1>
<div class="markdown level0 summary"><!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<p>This module supports index-time and query-time joins.</p>
<h2 id="index-time-joins">Index-time joins</h2>
<p>Index-time joining supports joins at search time, provided that the joined documents are indexed as a single document block using <a class="xref" href="https://lucenenet.apache.org/docs/4.8.0-beta00013/api/core/Lucene.Net.Index.IndexWriter.html#methods">IndexWriter.AddDocuments</a>. This is useful for any normalized content (XML documents or database tables). In database terms, all rows of all joined tables that match a single row of the primary table must be indexed as a single document block, with the parent document last in the block.</p>
<p>When you index in this way, the documents in your index are divided into parent documents (the last document of each block) and child documents (all others). You provide a <span class="xref">Lucene.Net.Search.Filter</span> that identifies the parent documents, as Lucene does not currently record any information about doc blocks.</p>
<p>At search time, use <a class="xref" href="Lucene.Net.Join.ToParentBlockJoinQuery.html">ToParentBlockJoinQuery</a> to remap/join matches from any child <span class="xref">Lucene.Net.Search.Query</span> (i.e., a query that matches only child documents) up to the parent document space. The resulting query can then be used as a clause in any query that matches parent documents.</p>
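<p>As a rough illustration of the steps above, the following sketch indexes one block (two children followed by their parent) and joins a child query up to the parent document space. The class and method names, the field names (<code>docType</code>, <code>name</code>, <code>sku</code>, <code>size</code>), the in-memory <code>RAMDirectory</code>, and the choice of <code>StandardAnalyzer</code> (which lives in the separate Lucene.Net.Analysis.Common package) are assumptions made for this example only.</p>
<pre><code class="lang-csharp">using System.Collections.Generic;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Join;
using Lucene.Net.Search;
using Lucene.Net.Store;
using Lucene.Net.Util;

public static class BlockJoinSketch
{
    // Returns the number of parent documents that have at least one "small" child.
    public static int CountParentsWithSmallChild()
    {
        const LuceneVersion version = LuceneVersion.LUCENE_48;
        using (var dir = new RAMDirectory())
        {
            using (var writer = new IndexWriter(dir,
                new IndexWriterConfig(version, new StandardAnalyzer(version))))
            {
                // Children first, parent last: the whole list is written as one block.
                var block = new List&lt;Document&gt;
                {
                    new Document { new StringField("sku", "A-1", Field.Store.YES),
                                   new StringField("size", "small", Field.Store.NO) },
                    new Document { new StringField("sku", "A-2", Field.Store.YES),
                                   new StringField("size", "large", Field.Store.NO) },
                    new Document { new StringField("docType", "product", Field.Store.NO),
                                   new StringField("name", "shirt", Field.Store.YES) }
                };
                writer.AddDocuments(block);
            }

            using (var reader = DirectoryReader.Open(dir))
            {
                var searcher = new IndexSearcher(reader);

                // Hypothetical marker field: every parent carries docType=product.
                Filter parentsFilter = new FixedBitSetCachingWrapperFilter(
                    new QueryWrapperFilter(new TermQuery(new Term("docType", "product"))));

                // Matches child documents only; the join maps each hit to its parent.
                Query childQuery = new TermQuery(new Term("size", "small"));
                Query joinQuery = new ToParentBlockJoinQuery(childQuery, parentsFilter, ScoreMode.None);

                return searcher.Search(joinQuery, 10).TotalHits;
            }
        }
    }
}
</code></pre>
<p>The parent filter is wrapped in a <code>FixedBitSetCachingWrapperFilter</code> because the block join queries expect a <code>FixedBitSet</code> per sub-reader.</p>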
<p>If you only care about the parent documents matching the query, you can use any collector to collect the parent hits. If you&#39;d also like to see which child documents match for each parent document, use the <a class="xref" href="Lucene.Net.Join.ToParentBlockJoinCollector.html">ToParentBlockJoinCollector</a> to collect the hits. Once the search is done, you retrieve an <span class="xref">Lucene.Net.Search.Grouping.ITopGroups&lt;TGroupValue&gt;</span> instance from the <a class="xref" href="Lucene.Net.Join.ToParentBlockJoinCollector.html#methods">ToParentBlockJoinCollector.GetTopGroups</a> method.</p>
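<p>A minimal sketch of collecting child hits per parent might look like the following. It assumes the caller already has an <code>IndexSearcher</code> over a block-indexed index, a query matching only child documents, and a parent filter backed by <code>FixedBitSet</code>; the class name, method name, and hit limits are hypothetical, and the collector constructor arguments (parent sort, number of parent hits, track scores, track max score) reflect our reading of the 4.8 API.</p>
<pre><code class="lang-csharp">using System;
using Lucene.Net.Join;
using Lucene.Net.Search;

public static class BlockJoinCollectorSketch
{
    public static void PrintChildHits(IndexSearcher searcher, Query childQuery, Filter parentsFilter)
    {
        var joinQuery = new ToParentBlockJoinQuery(childQuery, parentsFilter, ScoreMode.Max);

        // Keep the top 10 parents by relevance; scores are tracked so that
        // children can be ranked by score within each group.
        var collector = new ToParentBlockJoinCollector(Sort.RELEVANCE, 10, true, true);
        searcher.Search(joinQuery, collector);

        // Up to 5 children per parent; a null within-group sort means "by score".
        var groups = collector.GetTopGroups(joinQuery, null, 0, 5, 0, true);
        if (groups == null)
        {
            return; // no parent matched
        }

        foreach (var group in groups.Groups)
        {
            // GroupValue is the parent's document ID.
            Console.WriteLine($"parent doc {group.GroupValue}: {group.TotalHits} matching children");
            foreach (var child in group.ScoreDocs)
            {
                Console.WriteLine($"  child doc {child.Doc}");
            }
        }
    }
}
</code></pre>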
<p>To map/join in the opposite direction, use <a class="xref" href="Lucene.Net.Join.ToChildBlockJoinQuery.html">ToChildBlockJoinQuery</a>. This wraps
any query matching parent documents, creating the joined query
matching only child documents.</p>
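<p>The reverse direction can be sketched in the same way. The class name, method name, and hit limit below are hypothetical; the third constructor argument controls whether child hits receive a score derived from the parent and is set to <code>false</code> here on the assumption that scores are not needed.</p>
<pre><code class="lang-csharp">using Lucene.Net.Join;
using Lucene.Net.Search;

public static class ToChildJoinSketch
{
    // parentQuery must match only parent documents; parentsFilter identifies all parents.
    public static TopDocs FindChildrenOf(IndexSearcher searcher, Query parentQuery, Filter parentsFilter)
    {
        Query childJoin = new ToChildBlockJoinQuery(parentQuery, parentsFilter, false);
        return searcher.Search(childJoin, 10);
    }
}
</code></pre>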
<h2 id="query-time-joins">Query-time joins</h2>
<p>Query-time joining is index-term based and implemented as a two-pass search. The first pass collects all the terms from a fromField that match the fromQuery. The second pass returns all documents whose toField contains any of the terms collected in the first pass.</p>
<p>Query-time joining takes the following inputs:</p>
<ul>
<li><p><code>fromField</code>: The field to join from.</p>
</li>
<li><p><code>fromQuery</code>: The query executed to collect the from terms. This is usually the user-specified query.</p>
</li>
<li><p><code>multipleValuesPerDocument</code>: Whether the fromField contains more than one value per document.</p>
</li>
<li><p><code>scoreMode</code>: Defines how scores are translated to the other side of the join. If you don&#39;t care about scoring,
use <a class="xref" href="Lucene.Net.Join.ScoreMode.html">ScoreMode.None</a>. This disables scoring and is therefore more
efficient (it requires less memory and is faster).</p>
</li>
<li><p><code>toField</code>: The field to join to.</p>
</li>
</ul>
<p>Query-time joining is exposed through a single static method. The caller supplies the inputs described above and an <code>IndexSearcher</code> from which the from terms are collected. The returned query can be executed with the same <code>IndexSearcher</code> or with a different one. Example usage of <a class="xref" href="Lucene.Net.Join.JoinUtil.html#methods">JoinUtil.CreateJoinQuery</a>:</p>
<pre><code class="lang-csharp">string fromField = &quot;from&quot;; // Name of the from field
bool multipleValuesPerDocument = false; // Set to true only when your fromField has multiple values per document in your index
string toField = &quot;to&quot;; // Name of the to field
ScoreMode scoreMode = ScoreMode.Max; // Defines how the scores are translated into the other side of the join
Query fromQuery = new TermQuery(new Term(&quot;content&quot;, searchTerm)); // Query executed to collect from values to join to the to values

Query joinQuery = JoinUtil.CreateJoinQuery(fromField, multipleValuesPerDocument, toField, fromQuery, fromSearcher, scoreMode);
TopDocs topDocs = toSearcher.Search(joinQuery, 10); // Note: toSearcher can be the same as the fromSearcher
// Render topDocs...
</code></pre>
</div>
<div class="markdown level0 conceptual"></div>
<div class="markdown level0 remarks"></div>
<h3 id="classes">Classes
</h3>
<h4><a class="xref" href="Lucene.Net.Join.FixedBitSetCachingWrapperFilter.html">FixedBitSetCachingWrapperFilter</a></h4>
<section><p>A <span class="xref">Lucene.Net.Search.CachingWrapperFilter</span> that caches sets using a <a class="xref" href="https://lucenenet.apache.org/docs/4.8.0-beta00013/api/core/Lucene.Net.Util.FixedBitSet.html">FixedBitSet</a>,
as required for joins. </p>
</section>
<h4><a class="xref" href="Lucene.Net.Join.JoinUtil.html">JoinUtil</a></h4>
<section><p>Utility for query time joining using <span class="xref">Lucene.Net.Join.TermsQuery</span> and <span class="xref">Lucene.Net.Join.TermsCollector</span>.</p>
<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
<h4><a class="xref" href="Lucene.Net.Join.ToChildBlockJoinQuery.html">ToChildBlockJoinQuery</a></h4>
<section><p>Just like <a class="xref" href="Lucene.Net.Join.ToParentBlockJoinQuery.html">ToParentBlockJoinQuery</a>, except this
query joins in reverse: you provide a <span class="xref">Lucene.Net.Search.Query</span> matching
parent documents and it joins down to child
documents.</p>
<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
<h4><a class="xref" href="Lucene.Net.Join.ToParentBlockJoinCollector.html">ToParentBlockJoinCollector</a></h4>
<section><p>Collects parent document hits for a <span class="xref">Lucene.Net.Search.Query</span> containing one or more
BlockJoinQuery clauses, sorted by the
specified parent <span class="xref">Lucene.Net.Search.Sort</span>. Note that this cannot perform
arbitrary joins; rather, it requires that all joined
documents are indexed as a doc block (using
<a class="xref" href="https://lucenenet.apache.org/docs/4.8.0-beta00013/api/core/Lucene.Net.Index.IndexWriter.html#Lucene_Net_Index_IndexWriter_AddDocuments_System_Collections_Generic_IEnumerable_System_Collections_Generic_IEnumerable_Lucene_Net_Index_IIndexableField___Lucene_Net_Analysis_Analyzer_">AddDocuments(IEnumerable&lt;IEnumerable&lt;IIndexableField&gt;&gt;, Analyzer)</a>
or <a class="xref" href="https://lucenenet.apache.org/docs/4.8.0-beta00013/api/core/Lucene.Net.Index.IndexWriter.html#Lucene_Net_Index_IndexWriter_UpdateDocuments_Lucene_Net_Index_Term_System_Collections_Generic_IEnumerable_System_Collections_Generic_IEnumerable_Lucene_Net_Index_IIndexableField___Lucene_Net_Analysis_Analyzer_">UpdateDocuments(Term, IEnumerable&lt;IEnumerable&lt;IIndexableField&gt;&gt;, Analyzer)</a>).
That is, the join is computed
at index time.</p>
<p>The parent <span class="xref">Lucene.Net.Search.Sort</span> must only use
fields from the parent documents; sorting by field in
the child documents is not supported.</p>
<p>You should only use this
collector if one or more of the clauses in the query is
a <a class="xref" href="Lucene.Net.Join.ToParentBlockJoinQuery.html">ToParentBlockJoinQuery</a>. This collector will find those query
clauses and record the matching child documents for the
top scoring parent documents.</p>
<p>Multiple joins (star join) and nested joins and a mix
of the two are allowed, as long as in all cases the
documents corresponding to a single row of each joined
parent table were indexed as a doc block.</p>
<p>For the simple star join you can retrieve the
<span class="xref">Lucene.Net.Search.Grouping.ITopGroups&lt;TGroupValue&gt;</span> instance containing each <a class="xref" href="Lucene.Net.Join.ToParentBlockJoinQuery.html">ToParentBlockJoinQuery</a>&apos;s
matching child documents for the top parent groups,
using <a class="xref" href="Lucene.Net.Join.ToParentBlockJoinCollector.html#Lucene_Net_Join_ToParentBlockJoinCollector_GetTopGroups_Lucene_Net_Join_ToParentBlockJoinQuery_Lucene_Net_Search_Sort_System_Int32_System_Int32_System_Int32_System_Boolean_">GetTopGroups(ToParentBlockJoinQuery, Sort, Int32, Int32, Int32, Boolean)</a>. That is,
for a single query that contains two or more
<a class="xref" href="Lucene.Net.Join.ToParentBlockJoinQuery.html">ToParentBlockJoinQuery</a> clauses representing the star join,
you can then retrieve two or more <span class="xref">Lucene.Net.Search.Grouping.ITopGroups&lt;TGroupValue&gt;</span> instances.</p>
<p>For nested joins, the query will run correctly (i.e.,
match the right parent and child documents); however,
because <span class="xref">Lucene.Net.Search.Grouping.TopGroups&lt;TGroupValue&gt;</span> is currently unable to support nesting
(each group is not able to hold another <span class="xref">Lucene.Net.Search.Grouping.TopGroups&lt;TGroupValue&gt;</span>), you
are only able to retrieve the <span class="xref">Lucene.Net.Search.Grouping.TopGroups&lt;TGroupValue&gt;</span> of the first
join. The <span class="xref">Lucene.Net.Search.Grouping.TopGroups&lt;TGroupValue&gt;</span> of the nested joins will not be
correct.</p>
<p>See <a href="http://lucene.apache.org/core/4_8_0/join/">http://lucene.apache.org/core/4_8_0/join/</a> for a code
sample.</p>
<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
<h4><a class="xref" href="Lucene.Net.Join.ToParentBlockJoinFieldComparer.html">ToParentBlockJoinFieldComparer</a></h4>
<section><p>A field comparer that allows parent documents to be sorted by fields
from the nested / child documents.</p>
<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
<h4><a class="xref" href="Lucene.Net.Join.ToParentBlockJoinFieldComparer.Highest.html">ToParentBlockJoinFieldComparer.Highest</a></h4>
<section><p>Concrete implementation of <a class="xref" href="Lucene.Net.Join.ToParentBlockJoinSortField.html">ToParentBlockJoinSortField</a> that sorts the parent docs with the highest values
in the child / nested docs first.</p>
</section>
<h4><a class="xref" href="Lucene.Net.Join.ToParentBlockJoinFieldComparer.Lowest.html">ToParentBlockJoinFieldComparer.Lowest</a></h4>
<section><p>Concrete implementation of <a class="xref" href="Lucene.Net.Join.ToParentBlockJoinSortField.html">ToParentBlockJoinSortField</a> that sorts the parent docs with the lowest values
in the child / nested docs first.</p>
</section>
<h4><a class="xref" href="Lucene.Net.Join.ToParentBlockJoinQuery.html">ToParentBlockJoinQuery</a></h4>
<section><p>This query requires that you index
children and parent docs as a single block, using the
<a class="xref" href="https://lucenenet.apache.org/docs/4.8.0-beta00013/api/core/Lucene.Net.Index.IndexWriter.html#Lucene_Net_Index_IndexWriter_AddDocuments_System_Collections_Generic_IEnumerable_System_Collections_Generic_IEnumerable_Lucene_Net_Index_IIndexableField___Lucene_Net_Analysis_Analyzer_">AddDocuments(IEnumerable&lt;IEnumerable&lt;IIndexableField&gt;&gt;, Analyzer)</a>
or <a class="xref" href="https://lucenenet.apache.org/docs/4.8.0-beta00013/api/core/Lucene.Net.Index.IndexWriter.html#Lucene_Net_Index_IndexWriter_UpdateDocuments_Lucene_Net_Index_Term_System_Collections_Generic_IEnumerable_System_Collections_Generic_IEnumerable_Lucene_Net_Index_IIndexableField___Lucene_Net_Analysis_Analyzer_">UpdateDocuments(Term, IEnumerable&lt;IEnumerable&lt;IIndexableField&gt;&gt;, Analyzer)</a>
API. In each block, the
child documents must appear first, ending with the parent
document. At search time you provide a <span class="xref">Lucene.Net.Search.Filter</span>
identifying the parents; however, this <span class="xref">Lucene.Net.Search.Filter</span> must provide
a <a class="xref" href="https://lucenenet.apache.org/docs/4.8.0-beta00013/api/core/Lucene.Net.Util.FixedBitSet.html">FixedBitSet</a> per sub-reader.</p>
<p>Once the block index is built, use this query to wrap
any sub-query matching only child docs and join matches in that
child document space up to the parent document space.
You can then use this <span class="xref">Lucene.Net.Search.Query</span> as a clause with
other queries in the parent document space.</p>
<p>See <a class="xref" href="Lucene.Net.Join.ToChildBlockJoinQuery.html">ToChildBlockJoinQuery</a> if you need to join
in the reverse order.</p>
<p>The child documents must be orthogonal to the parent
documents: the wrapped child query must never
return a parent document.</p>
<p>If you&apos;d like to retrieve <span class="xref">Lucene.Net.Search.Grouping.ITopGroups&lt;TGroupValue&gt;</span> for the
resulting query, use the <a class="xref" href="Lucene.Net.Join.ToParentBlockJoinCollector.html">ToParentBlockJoinCollector</a>.
Note that this is not required: if you simply want
to collect the parent documents and don&apos;t need to see
which child documents matched under that parent, then
you can use any collector.</p>
<p><strong>NOTE</strong>: If the overall query contains parent-only
matches, for example you OR a parent-only query with a
joined child-only query, then the resulting collected documents
will be correct, however the <span class="xref">Lucene.Net.Search.Grouping.ITopGroups&lt;TGroupValue&gt;</span> you get
from <a class="xref" href="Lucene.Net.Join.ToParentBlockJoinCollector.html">ToParentBlockJoinCollector</a> will not contain every
child for parents that had matched.</p>
<p>See <a href="http://lucene.apache.org/core/4_8_0/join/">http://lucene.apache.org/core/4_8_0/join/</a> for an
overview. </p>
<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
<h4><a class="xref" href="Lucene.Net.Join.ToParentBlockJoinSortField.html">ToParentBlockJoinSortField</a></h4>
<section><p>A special sort field that allows sorting parent docs based on nested / child level fields.
Based on the sort order it either takes the document with the lowest or highest field value into account.</p>
<div class="lucene-block lucene-experimental">This is a Lucene.NET EXPERIMENTAL API, use at your own risk</div></section>
<h3 id="enums">Enums
</h3>
<h4><a class="xref" href="Lucene.Net.Join.ScoreMode.html">ScoreMode</a></h4>
<section><p>How to aggregate multiple child hit scores into a single parent score.</p>
</section>
</article>
</div>
<div class="hidden-sm col-md-2" role="complementary">
<div class="sideaffix">
<nav class="bs-docs-sidebar hidden-print hidden-xs hidden-sm affix" id="affix">
<!-- <p><a class="back-to-top" href="#top">Back to top</a><p> -->
</nav>
</div>
</div>
</div>
</div>
<footer>
<div class="grad-bottom"></div>
<div class="footer">
<div class="container">
<span class="pull-right">
<a href="#top">Back to top</a>
</span>
Copyright © 2020 The Apache Software Foundation, Licensed under the <a href='http://www.apache.org/licenses/LICENSE-2.0' target='_blank'>Apache License, Version 2.0</a><br> <small>Apache Lucene.Net, Lucene.Net, Apache, the Apache feather logo, and the Apache Lucene.Net project logo are trademarks of The Apache Software Foundation. <br>All other marks mentioned may be trademarks or registered trademarks of their respective owners.</small>
</div>
</div>
</footer>
</div>
<script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.vendor.js"></script>
<script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/docfx.js"></script>
<script type="text/javascript" src="https://lucenenet.apache.org/docs/4.8.0-beta00009/styles/main.js"></script>
</body>
</html>