blob: 9e7b392c745125621604b18553955884f45c287c [file] [log] [blame]
= Searching Nested Child Documents
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.
This section exposes potential techniques which can be used for searching deeply nested documents,
show casing how more complex queries can be constructed using some of Solr's query parsers and Doc Transformers.
These features require `\_root_` and `\_nest_path_` to be declared in the schema. +
Please refer to the <<indexing-nested-documents.adoc#, Indexing Nested Documents>>
section for more details about schema and index configuration.
[NOTE]
This section does not show case faceting on nested documents. For nested document faceting, please refer to the
<<blockjoin-faceting#blockjoin-faceting, Block Join Faceting>> section.
== Query Examples
For the upcoming examples, we'll assume an index containing the same documents covered in <<indexing-nested-documents#example-indexing-syntax,Indexing Nested Documents>>:
include::indexing-nested-documents.adoc[tag=sample-indexing-deeply-nested-documents]
=== Child Doc Transformer
By default, documents that match a query do not include any of their nested children in the response. The `[child]` Doc Transformer Can be used enrich query results with the documents' descendants.
For a detailed explanation of this transformer, and specifics on it's syntax & limitations, please refer to the section <<transforming-result-documents.adoc#child-childdoctransformerfactory, [child] - ChildDocTransformerFactory>>.
A simple query matching all documents with a description that includes "staplers":
[source,curl]
----
$ curl 'http://localhost:8983/solr/gettingstarted/select?omitHeader=true&q=description_t:staplers'
{
"response":{"numFound":1,"start":0,"maxScore":0.30136836,"numFoundExact":true,"docs":[
{
"id":"P11!prod",
"name_s":"Swingline Stapler",
"description_t":"The Cadillac of office staplers ...",
"_version_":1672933224035123200}]
}}
----
The same query with the addition of the `[child]` transformer is shown below. Note that the `numFound` has not changed, we are still matching the same set of documents, but when returning those documents the nested children are also returned as pseudo-fields.
[source,curl]
----
$ curl 'http://localhost:8983/solr/gettingstarted/select?omitHeader=true&q=description_t:staplers&fl=*,[child]'
{
"response":{"numFound":1,"start":0,"maxScore":0.30136836,"numFoundExact":true,"docs":[
{
"id":"P11!prod",
"name_s":"Swingline Stapler",
"description_t":"The Cadillac of office staplers ...",
"_version_":1672933224035123200,
"skus":[
{
"id":"P11!S21",
"color_s":"RED",
"price_i":42,
"_version_":1672933224035123200,
"manuals":[
{
"id":"P11!D41",
"name_s":"Red Swingline Brochure",
"pages_i":1,
"content_t":"...",
"_version_":1672933224035123200}]},
{
"id":"P11!S31",
"color_s":"BLACK",
"price_i":3,
"_version_":1672933224035123200}],
"manuals":[
{
"id":"P11!D51",
"name_s":"Quick Reference Guide",
"pages_i":1,
"content_t":"How to use your stapler ...",
"_version_":1672933224035123200},
{
"id":"P11!D61",
"name_s":"Warranty Details",
"pages_i":42,
"content_t":"... lifetime guarantee ...",
"_version_":1672933224035123200}]}]
}}
----
=== Child Query Parser
The `{!child}` query parser can be used to search for the _descendent_ documents of parent documents matching a wrapped query. For a detailed explanation of this parser, see the section <<other-parsers.adoc#block-join-children-query-parser, Block Join Children Query Parser>>.
Let's consider again the `description_t:staplers` query used above -- if we wrap that query in a `{!child}` query parser then instead of "matching" & returning the product level documents, we instead match all of the _descendent_ child documents of the original query:
[source,curl]
----
$ curl 'http://localhost:8983/solr/gettingstarted/select' -d 'omitHeader=true' -d 'q={!child of="*:* -_nest_path_:*"}description_t:staplers'
{
"response":{"numFound":5,"start":0,"maxScore":0.30136836,"numFoundExact":true,"docs":[
{
"id":"P11!D41",
"name_s":"Red Swingline Brochure",
"pages_i":1,
"content_t":"...",
"_version_":1672933224035123200},
{
"id":"P11!S21",
"color_s":"RED",
"price_i":42,
"_version_":1672933224035123200},
{
"id":"P11!S31",
"color_s":"BLACK",
"price_i":3,
"_version_":1672933224035123200},
{
"id":"P11!D51",
"name_s":"Quick Reference Guide",
"pages_i":1,
"content_t":"How to use your stapler ...",
"_version_":1672933224035123200},
{
"id":"P11!D61",
"name_s":"Warranty Details",
"pages_i":42,
"content_t":"... lifetime guarantee ...",
"_version_":1672933224035123200}]
}}
----
In this example we've used `\*:* -\_nest_path_:*` as our <<other-parsers.adoc#block-mask,`of` parameter>> to indicate we want to consider all documents which don't have a nest path -- i.e., all "root" level document -- as the set of possible parents.
By changing the `of` parameter to match ancestors at specific `\_nest_path_` levels, we can narrow down the list of children we return.
In the query below, we search for all descendants of `skus` (using an `of` parameter that identifies all documents that do _not_ have a `\_nest_path_` with the prefix `/skus/*`) with a `price_i` less then `50`:
[source,curl]
----
$ curl 'http://localhost:8983/solr/gettingstarted/select' -d 'omitHeader=true' --data-urlencode 'q={!child of="*:* -_nest_path_:\\/skus\\/*"}(+price_i:[* TO 50] +_nest_path_:\/skus)'
{
"response":{"numFound":1,"start":0,"maxScore":1.0,"numFoundExact":true,"docs":[
{
"id":"P11!D41",
"name_s":"Red Swingline Brochure",
"pages_i":1,
"content_t":"...",
"_version_":1675662666752851968}]
}}
----
[#double-escaping-nest-path-slashes]
[CAUTION]
.Double Escaping `\_nest_path_` slashes in `of`
====
Note that in the above example, the `/` characters in the `\_nest_path_` were "double escaped" in the `of` parameter:
* One level of `\` escaping is necessary to prevent the `/` from being interpreted as a {lucene-javadocs}/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Regexp_Searches[Regex Query]
* An additional level of "escaping the escape character" is necessary because the `of` local parameter is a quoted string; so we need a second `\` to ensure the first `\` is preserved and passed as is to the query parser.
(You can see that only a single level of of `\` escaping is needed in the body of the query string -- to prevent the Regex syntax -- because it's not a quoted string local param).
You may find it more convenient to use <<local-parameters-in-queries#parameter-dereferencing,parameter references>> in conjunction with <<other-parsers#other-parsers,other parsers>> that do not treat `/` as a special character to express the same query in a more verbose form:
[source,curl]
----
curl 'http://localhost:8983/solr/gettingstarted/select' -d 'omitHeader=true' --data-urlencode 'q={!child of=$block_mask}(+price_i:[* TO 50] +{!field f="_nest_path_" v="/skus"})' --data-urlencode 'block_mask=(*:* -{!prefix f="_nest_path_" v="/skus/"})'
----
====
=== Parent Query Parser
The inverse of the `{!child}` query parser is the `{!parent}` query parser, which lets you search for the _ancestor_ documents of some child documents matching a wrapped query. For a detailed explanation of this parser, see the section <<other-parsers.adoc#block-join-parent-query-parser,Block Join Parent Query Parser>>.
Let's first consider this example of searching for all "manual" type documents that have exactly `1` page:
[source,curl]
----
$ curl 'http://localhost:8983/solr/gettingstarted/select?omitHeader=true&q=pages_i:1'
{
"response":{"numFound":3,"start":0,"maxScore":1.0,"numFoundExact":true,"docs":[
{
"id":"P11!D41",
"name_s":"Red Swingline Brochure",
"pages_i":1,
"content_t":"...",
"_version_":1676585794196733952},
{
"id":"P11!D51",
"name_s":"Quick Reference Guide",
"pages_i":1,
"content_t":"How to use your stapler ...",
"_version_":1676585794196733952},
{
"id":"P22!D42",
"name_s":"Red Mont Blanc Brochure",
"pages_i":1,
"content_t":"...",
"_version_":1676585794347728896}]
}}
----
We can wrap that query in a `{!parent}` query to return the details of all products that are ancestors of these manuals:
[source,curl]
----
$ curl 'http://localhost:8983/solr/gettingstarted/select' -d 'omitHeader=true' --data-urlencode 'q={!parent which="*:* -_nest_path_:*"}(+_nest_path_:\/skus\/manuals +pages_i:1)'
{
"response":{"numFound":2,"start":0,"maxScore":1.4E-45,"numFoundExact":true,"docs":[
{
"id":"P11!prod",
"name_s":"Swingline Stapler",
"description_t":"The Cadillac of office staplers ...",
"_version_":1676585794196733952},
{
"id":"P22!prod",
"name_s":"Mont Blanc Fountain Pen",
"description_t":"A Premium Writing Instrument ...",
"_version_":1676585794347728896}]
}}
----
In this example we've used `\*:* -\_nest_path_:*` as our <<other-parsers#block-mask,`which` parameter>> to indicate we want to consider all documents which don't have a nest path -- i.e., all "root" level document -- as the set of possible parents.
By changing the `which` parameter to match ancestors at specific `\_nest_path_` levels, we can change the type of ancestors we return. In the query below, we search for `skus` (using an `which` parameter that identifies all documents that do _not_ have a `\_nest_path_` with the prefix `/skus/*`) that are the ancestors of `manuals` with exactly `1` page:
[source,curl]
----
$ curl 'http://localhost:8983/solr/gettingstarted/select' -d 'omitHeader=true' --data-urlencode 'q={!parent which="*:* -_nest_path_:\\/skus\\/*"}(+_nest_path_:\/skus\/manuals +pages_i:1)'
{
"response":{"numFound":2,"start":0,"maxScore":1.4E-45,"numFoundExact":true,"docs":[
{
"id":"P11!S21",
"color_s":"RED",
"price_i":42,
"_version_":1676585794196733952},
{
"id":"P22!S22",
"color_s":"RED",
"price_i":89,
"_version_":1676585794347728896}]
}}
----
[CAUTION]
====
Note that in the above example, the `/` characters in the `\_nest_path_` were "double escaped" in the `which` parameter, for the <<#double-escaping-nest-path-slashes,same reasons discussed above>> regarding the `{!child} pasers `of` parameter.
====
=== Combining Block Join Query Parsers with Child Doc Transformer
The combination of these two parsers with the `[child] transformer enables seamless creation of very powerful queries.
Here for example is a query where:
* the (sku) documents returned must have a color of "RED"
* the (sku) docments returned must be the descendents of root level (product) documents which have:
** immediate child "manuals" documents which have:
*** "lifetime guarantee" in their content
* each return (sku) document also includes any descendent (manuals) documents it has
[source,curl]
----
$ curl 'http://localhost:8983/solr/gettingstarted/select' -d 'omitHeader=true' -d 'fq=color_s:RED' --data-urlencode 'q={!child of="*:* -_nest_path_:*" filters=$parent_fq}' --data-urlencode 'parent_fq={!parent which="*:* -_nest_path_:*"}(+_nest_path_:"/manuals" +content_t:"lifetime guarantee")' -d 'fl=*,[child]'
{
"response":{"numFound":1,"start":0,"maxScore":1.4E-45,"numFoundExact":true,"docs":[
{
"id":"P11!S21",
"color_s":"RED",
"price_i":42,
"_version_":1676585794196733952,
"manuals":[
{
"id":"P11!D41",
"name_s":"Red Swingline Brochure",
"pages_i":1,
"content_t":"...",
"_version_":1676585794196733952}]}]
}}
----