= JSON Faceting Domain Changes
:solr-root-path: ../../
:example-source-dir: {solr-root-path}solrj/src/test/org/apache/solr/client/ref_guide_examples/
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.
Facet computation operates on a "domain" of documents. By default, this domain consists of the documents matched by the main query. For sub-facets, the domain consists of all documents placed in their bucket by the parent facet.
Users can also override the domain of a facet that partitions data by using an explicit `domain` attribute. Its value is a JSON object whose options restrict, expand, or completely change the original domain before the buckets of the associated facet are computed.
[NOTE]
====
`domain` changes can only be specified on individual facets that do data partitioning -- not statistical/metric facets, or groups of facets.
A `\*:*` query facet with a `domain` change can be used to group multiple sub-facets of any type, for the purpose of applying a common domain change.
====
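As a sketch of the grouping technique described in the note, the following Python snippet builds a request body in which a `\*:*` query facet carries a `domain` filter that is shared by two sub-facets. The facet names (`popular`, `categories`, `avg_price`) are hypothetical, and the fields `cat`, `price`, and `popularity` are assumed to come from the `techproducts` example. The snippet only prints the JSON and does not contact Solr.

```python
import json

# Hypothetical facet names; fields assumed from the techproducts example.
request_body = {
    "query": "*:*",
    "facet": {
        # A query facet matching everything, used only as a grouping container.
        "popular": {
            "type": "query",
            "q": "*:*",
            # The domain change is declared once, on the grouping facet.
            "domain": {"filter": "popularity:[5 TO 10]"},
            "facet": {
                # Both sub-facets are computed over the filtered domain.
                "categories": {"type": "terms", "field": "cat", "limit": 3},
                "avg_price": "avg(price)",
            },
        }
    },
}

print(json.dumps(request_body, indent=2))
```

Because the statistic `avg_price` cannot carry a `domain` of its own, nesting it under the query facet is what lets it share the filtered domain.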
== Adding Domain Filters
The simplest example of a domain change is to specify an additional filter which will be applied to the existing domain. This can be done via the `filter` keyword in the `domain` block of the facet.
[.dynamic-tabs]
--
[example.tab-pane#curl-json-facet-filtered-domain]
====
[.tab-label]*curl*
[source,bash]
----
curl http://localhost:8983/solr/techproducts/query -d '
{
"query": "*:*",
"facet": {
"categories": {
"type": "terms",
"field": "cat",
"limit": 3,
"domain": {
"filter": "popularity:[5 TO 10]"
}
}
}
}'
----
====
[example.tab-pane#solrj-json-facet-filtered-domain]
====
[.tab-label]*SolrJ*
[source,java,indent=0]
----
include::{example-source-dir}JsonRequestApiTest.java[tag=solrj-json-facet-filtered-domain]
----
====
--
The value of `filter` can be a single query to treat as a filter, or a JSON list of filter queries. Each query can be:
* a string containing a query in Solr query syntax.
* a reference to a request parameter containing Solr query syntax, of the form: `{param: <request_param_name>}`. The reference can also point to one or more queries in DSL syntax defined under the <<json-query-dsl.adoc#additional-queries,queries>> key of the JSON Request API.
The referenced parameter may have zero values (that is, be absent) or many values.
** When no values are specified, no filter is applied and no error is thrown.
** When many values are specified, each value is parsed and all of them are applied together as filters (in conjunction).
Here is an example of referencing DSL queries:
[source,bash]
----
curl http://localhost:8983/solr/techproducts/query -d '
{
"query": "*:*",
"queries": {
"sample_filtrs":[
{"field":{"f":"text", "query":"usb"}},
{"field":{"f":"text", "query":"lcd"}}
],
"another_filtr":
{"field":{"f":"text", "query":"usb"}}
},
"facet": {
"usblcd": {
"type": "terms",
"field": "cat",
"limit": 3,
"domain": {
"filter": {"param":"sample_filtrs"}
}
},
"justusb": {
"type": "terms",
"field": "cat",
"limit": 3,
"domain": {
"filter": {"param":"another_filtr"}
}
}
}
}'
----
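The rules for resolving a `filter` value can be modeled with a few lines of Python. This is a toy model for illustration only, not Solr code: the function `resolve_filters` and the parameter name `fq_usb` are hypothetical, and request parameters are represented as a plain dictionary of lists.

```python
def resolve_filters(filter_spec, request_params):
    """Toy model (not Solr code): flatten a `filter` value into query strings.

    All returned queries are meant to be applied together (in conjunction).
    """
    # A single value is treated like a one-element list.
    specs = filter_spec if isinstance(filter_spec, list) else [filter_spec]
    resolved = []
    for spec in specs:
        if isinstance(spec, str):
            # A plain string is a query in Solr query syntax.
            resolved.append(spec)
        elif isinstance(spec, dict) and "param" in spec:
            # A parameter reference may yield zero, one, or many queries.
            values = request_params.get(spec["param"], [])
            if isinstance(values, str):
                values = [values]
            resolved.extend(values)
        else:
            raise ValueError(f"unsupported filter spec: {spec!r}")
    return resolved


params = {"fq_usb": ["text:usb", "text:lcd"]}  # hypothetical request parameters
print(resolve_filters({"param": "fq_usb"}, params))
print(resolve_filters({"param": "missing"}, params))
print(resolve_filters(["inStock:true", {"param": "fq_usb"}], params))
```

An absent parameter resolves to an empty list, which corresponds to "no filter is applied and no error is thrown".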
When a `filter` option is combined with other `domain` changing options, the filtering is applied _after_ the other domain changes take place.
== Filter Exclusions
Domains can also be expanded by using the `excludeTags` keyword to discard or ignore particular tagged query filters.
This is used in the example below to show the top two manufacturers matching a search. The search results match the filter `manu_id_s:apple`, which is tagged `MANU`, but the `manufacturers` facet excludes that tag and therefore operates on a wider domain that ignores the `manu_id_s` filter.
[.dynamic-tabs]
--
[example.tab-pane#curl-json-facet-excludetags-domain]
====
[.tab-label]*curl*
[source,bash]
----
curl http://localhost:8983/solr/techproducts/query -d '
{
"query": "cat:electronics",
"filter": "{!tag=MANU}manu_id_s:apple",
"facet": {
"stock": {"type": "terms", "field": "inStock", "limit": 2},
"manufacturers": {
"type": "terms",
"field": "manu_id_s",
"limit": 2,
"domain": { "excludeTags":"MANU" }
}
}
}'
----
====
[example.tab-pane#solrj-json-facet-excludetags-domain]
====
[.tab-label]*SolrJ*
[source,java,indent=0]
----
include::{example-source-dir}JsonRequestApiTest.java[tag=solrj-json-facet-excludetags-domain]
----
====
--
The value of `excludeTags` can be a single string tag, an array of string tags, or comma-separated tags in a single string.
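The accepted forms can be illustrated with a small Python helper that reduces each of them to a list of tag names. The helper `normalize_exclude_tags` and the tag `PRICE` are hypothetical, and splitting on commas without trimming whitespace is an assumption of this sketch rather than documented Solr behavior.

```python
def normalize_exclude_tags(value):
    """Toy helper: reduce an `excludeTags` value to a list of tag names."""
    if isinstance(value, str):
        # Covers both a single tag and comma-separated tags in one string.
        return value.split(",")
    # Otherwise assume an array (list or tuple) of string tags.
    return list(value)


# The last two forms name the same two tags.
print(normalize_exclude_tags("MANU"))
print(normalize_exclude_tags("MANU,PRICE"))
print(normalize_exclude_tags(["MANU", "PRICE"]))
```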
When an `excludeTags` option is combined with other `domain` changing options, it expands the domain _before_ any other domain changes take place.
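The ordering of domain changes (tag exclusions first, a domain `filter` last) can be pictured with a toy model over in-memory documents. Everything in this sketch is an assumption made for illustration: the function `facet_domain`, the sample documents, and the representation of queries as Python predicates. It does not reflect how Solr is implemented.

```python
def facet_domain(docs, main_query, filters, exclude_tags=(),
                 transform=None, domain_filter=None):
    """Toy model of how a top-level facet domain is assembled."""
    # 1. excludeTags: ignore request filters carrying an excluded tag.
    kept = [pred for tag, pred in filters if tag not in exclude_tags]
    domain = [d for d in docs if main_query(d) and all(p(d) for p in kept)]
    # 2. Any other domain change (for example a join) would run here.
    if transform is not None:
        domain = transform(domain)
    # 3. The domain filter is applied after everything else.
    if domain_filter is not None:
        domain = [d for d in domain if domain_filter(d)]
    return domain


def ids(documents):
    return [d["id"] for d in documents]


docs = [  # hypothetical sample documents
    {"id": 1, "cat": "electronics", "manu": "apple", "popularity": 10},
    {"id": 2, "cat": "electronics", "manu": "belkin", "popularity": 1},
    {"id": 3, "cat": "electronics", "manu": "canon", "popularity": 7},
    {"id": 4, "cat": "memory", "manu": "apple", "popularity": 6},
]
main_query = lambda d: d["cat"] == "electronics"
filters = [("MANU", lambda d: d["manu"] == "apple")]

print(ids(facet_domain(docs, main_query, filters)))
print(ids(facet_domain(docs, main_query, filters, exclude_tags=("MANU",))))
print(ids(facet_domain(docs, main_query, filters, exclude_tags=("MANU",),
                       domain_filter=lambda d: d["popularity"] >= 5)))
```

In this model, excluding `MANU` widens the domain from one document to three, and the domain filter then narrows that widened set.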
See also the section on <<faceting.adoc#tagging-and-excluding-filters,multi-select faceting>>.
== Arbitrary Domain Query
A `query` domain can be specified when you wish to compute a facet against an arbitrary set of documents, regardless of the original domain. The most common use case is to compute a top-level facet against a specific subset of the collection, regardless of the main query. It can also be useful on nested facets when building <<json-facet-api.adoc#relatedness-and-semantic-knowledge-graphs,Semantic Knowledge Graphs>>.
Example:
[.dynamic-tabs]
--
[example.tab-pane#curl-json-facet-query-domain]
====
[.tab-label]*curl*
[source,bash]
----
curl http://localhost:8983/solr/techproducts/query -d '
{
"query": "apple",
"facet": {
"popular_categories": {
"type": "terms",
"field": "cat",
"domain": { "query": "popularity:[8 TO 10]" },
"limit": 3
}
}
}'
----
====
[example.tab-pane#solrj-json-facet-query-domain]
====
[.tab-label]*SolrJ*
[source,java,indent=0]
----
include::{example-source-dir}JsonRequestApiTest.java[tag=solrj-json-facet-query-domain]
----
====
--
The value of `query` can be a single query, or a JSON list of queries. Each query can be:
* a string containing a query in Solr query syntax.
* a reference to a request parameter containing Solr query syntax, of the form: `{param: <request_param_name>}`.
The referenced parameter may have zero values (that is, be absent) or many values.
** When no values are specified, no error is thrown.
** When many values are specified, each value is parsed and used as queries.
NOTE: While a `query` domain can be combined with an additional domain `filter`, it is not possible to also use `excludeTags`, because the tags would be meaningless: the `query` domain already completely ignores the top-level query and all previous filters.
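As a sketch of that combination, the Python snippet below builds a request whose facet uses both a `query` domain and a domain `filter`, and includes a small client-side check that rejects `excludeTags` alongside `query`. The helper `check_domain`, the facet name `popular_in_stock`, and the filter `inStock:true` are assumptions chosen for illustration.

```python
import json


def check_domain(domain):
    """Hypothetical client-side check mirroring the rule in the note."""
    if "query" in domain and "excludeTags" in domain:
        raise ValueError("a `query` domain cannot be combined with `excludeTags`")
    return domain


request_body = {
    "query": "apple",
    "facet": {
        "popular_in_stock": {  # hypothetical facet name
            "type": "terms",
            "field": "cat",
            "limit": 3,
            # The query replaces the domain; the filter then narrows it.
            "domain": check_domain({
                "query": "popularity:[8 TO 10]",
                "filter": "inStock:true",
            }),
        }
    },
}

print(json.dumps(request_body, indent=2))
```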
== Block Join Domain Changes
When a collection contains <<indexing-nested-documents.adoc#, Nested Documents>>, the `blockChildren` or `blockParent` domain options can be used to transform an existing domain containing one type of document into a domain containing the documents with the specified relationship (child or parent of) to the documents from the original domain.
Both of these options work similarly to the corresponding <<other-parsers.adoc#block-join-query-parsers,Block Join Query Parsers>> by taking in a single String query that exclusively matches all parent documents in the collection. If `blockParent` is used, then the resulting domain will contain all parent documents of the children from the original domain. If `blockChildren` is used, then the resulting domain will contain all child documents of the parents from the original domain.
[source,json,subs="verbatim,callouts"]
----
{
"colors": { // <1>
"type": "terms",
"field": "sku_color", // <2>
"facet" : {
"brands" : {
"type": "terms",
"field": "product_brand", // <3>
"domain": {
"blockParent": "doc_type:product"
}
}}}}
----
<1> This example assumes we have parent documents corresponding to Products, with child documents corresponding to individual SKUs with unique colors, and that our original query was against SKU documents.
<2> The `colors` facet will be computed against all of the original SKU documents matching our search.
<3> For each bucket in the `colors` facet, the set of all matching SKU documents will be transformed into the set of corresponding parent Product documents. The resulting `brands` sub-facet will count how many Product documents (that have SKUs with the associated color) exist for each Brand.
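The opposite direction can be sketched with `blockChildren`. The Python snippet below builds a facet body under the assumption that the original query was against Product documents this time, reusing the hypothetical fields `product_brand`, `sku_color`, and `doc_type` from the example above. It only prints the JSON.

```python
import json

# Assumes the main query matched parent (Product) documents.
facet_body = {
    "brands": {
        "type": "terms",
        "field": "product_brand",
        "facet": {
            "colors": {
                "type": "terms",
                "field": "sku_color",
                "domain": {
                    # As with blockParent, the value is a query that matches
                    # all parent documents in the collection.
                    "blockChildren": "doc_type:product"
                },
            }
        },
    }
}

print(json.dumps(facet_body, indent=2))
```

For each bucket in the `brands` facet, the matching Product documents would be transformed into their child SKU documents, so the `colors` sub-facet would count SKU documents per color within each brand.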
== Join Query Domain Changes
A `join` domain change option can be used to specify arbitrary `from` and `to` fields to use in transforming from the existing domain to a related set of documents.
This works similarly to the <<other-parsers.adoc#join-query-parser,Join Query Parser>>, and has the same limitations when dealing with multi-shard collections.
Example:
[source,json]
----
{
"colors": {
"type": "terms",
"field": "sku_color",
"facet": {
"brands": {
"type": "terms",
"field": "product_brand",
"domain" : {
"join" : {
"from": "product_id_of_this_sku",
"to": "id"
},
"filter": "doc_type:product"
}
}
}
}
}
----
`join` domain changes support an optional `method` parameter, which allows users to specify which join implementation they would like to use in this domain transform. Solr offers several join implementations, each with different performance characteristics. For more information on these implementations and their tradeoffs, see the `method` parameter documentation <<other-parsers.adoc#parameters,here>>. Join domain changes support all `method` values except `crossCollection`.
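A minimal sketch of a `join` domain with an explicit `method` follows, written as a Python helper that builds the `domain` block. The helper `join_domain` is hypothetical, and `topLevelDV` is used only as an example value, on the assumption that it is among the methods listed in the Join Query Parser documentation for your Solr version.

```python
import json


def join_domain(from_field, to_field, method=None):
    """Hypothetical helper that builds a `join` domain block."""
    if method == "crossCollection":
        # Join domain changes support every method except this one.
        raise ValueError("join domain changes do not support crossCollection")
    join = {"from": from_field, "to": to_field}
    if method is not None:
        join["method"] = method
    return {"join": join}


domain = join_domain("product_id_of_this_sku", "id", method="topLevelDV")
domain["filter"] = "doc_type:product"  # applied after the join

brands_facet = {"type": "terms", "field": "product_brand", "domain": domain}
print(json.dumps(brands_facet, indent=2))
```

Omitting `method` leaves the choice of implementation to Solr.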
== Graph Traversal Domain Changes
A `graph` domain change option works similarly to the `join` domain option, but can traverse multiple hops `from` the existing domain `to` other documents.
This works very similarly to the <<other-parsers.adoc#graph-query-parser,Graph Query Parser>>, supporting all of its optional parameters, and has the same limitations when dealing with multi-shard collections.
Example:
[source,json]
----
{
"related_brands": {
"type": "terms",
"field": "brand",
"domain": {
"graph": {
"from": "related_product_ids",
"to": "id",
"maxDepth": 3
}
}
}
}
----
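The effect of `maxDepth` can be pictured with a toy breadth-first traversal over in-memory documents. This is a simplified model, not Solr's implementation: the function `graph_domain` and the sample documents are hypothetical, the `to` field is assumed to be single-valued, a negative `max_depth` is treated as unlimited, and the starting documents are kept in the result.

```python
def graph_domain(docs, start_ids, from_field, to_field, max_depth=-1):
    """Toy model: expand a domain by following from -> to edges."""
    by_id = {d["id"]: d for d in docs}
    visited = set(start_ids)
    frontier = set(start_ids)
    depth = 0
    while frontier and (max_depth < 0 or depth < max_depth):
        # Collect the edge values leaving the current frontier.
        edge_values = set()
        for doc_id in frontier:
            edge_values.update(by_id[doc_id].get(from_field, []))
        # Documents whose `to` field matches an edge value are reached.
        reached = {d["id"] for d in docs if d.get(to_field) in edge_values}
        frontier = reached - visited
        visited |= frontier
        depth += 1
    return visited


docs = [  # hypothetical chain A -> B -> C -> D
    {"id": "A", "related_product_ids": ["B"]},
    {"id": "B", "related_product_ids": ["C"]},
    {"id": "C", "related_product_ids": ["D"]},
    {"id": "D", "related_product_ids": []},
]

print(sorted(graph_domain(docs, {"A"}, "related_product_ids", "id", max_depth=1)))
print(sorted(graph_domain(docs, {"A"}, "related_product_ids", "id", max_depth=3)))
```

In this model, each additional level of depth allows the domain to grow by one more hop along the `related_product_ids` links.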