blob: 246c3559fff4c22a6a6b64d2b039165bbc3af1d1 [file] [log] [blame]
= Common Query Parameters
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.
Several query parsers share supported query parameters.
The following sections describe Solr's common query parameters, which are supported by the <<requesthandlers-and-searchcomponents-in-solrconfig#searchhandlers,Search RequestHandlers>>.
== defType Parameter
The defType parameter selects the query parser that Solr should use to process the main query parameter (`q`) in the request. For example:
`defType=dismax`
If no `defType` parameter is specified, then by default, the <<the-standard-query-parser.adoc#,The Standard Query Parser>> is used. (e.g., `defType=lucene`)
== sort Parameter
The `sort` parameter arranges search results in either ascending (`asc`) or descending (`desc`) order. The parameter can be used with either numerical or alphabetical content. The directions can be entered in either all lowercase or all uppercase letters (i.e., both `asc` and `ASC` are accepted).
Solr can sort query responses according to:
* Document scores
* <<function-queries.adoc#sort-by-function,Function results>>
* The value of any primitive field (numerics, string, boolean, dates, etc.) which has `docValues="true"` (or `multiValued="false"` and `indexed="true"`, in which case the indexed terms will used to build DocValue like structures on the fly at runtime)
* A SortableTextField which implicitly uses `docValues="true"` by default to allow sorting on the original input string regardless of the analyzers used for Searching.
* A single-valued TextField that uses an analyzer (such as the KeywordTokenizer) that produces only a single term per document. TextField does not support `docValues="true"`, but a DocValue-like structure will be built on the fly at runtime.
** *NOTE:* If you want to be able to sort on a field whose contents you want to tokenize to facilitate searching, <<copying-fields.adoc#,use a `copyField` directive>> in the Schema to clone the field. Then search on the field and sort on its clone.
In the case of primitive fields, or SortableTextFields, that are `multiValued="true"` the representative value used for each doc when sorting depends on the sort direction: The minimum value in each document is used for ascending (`asc`) sorting, while the maximal value in each document is used for descending (`desc`) sorting. This default behavior is equivalent to explicitly sorting using the 2 argument `<<function-queries.adoc#field-function,field()>>` function: `sort=field(name,min) asc` and `sort=field(name,max) desc`
The table below explains how Solr responds to various settings of the `sort` parameter.
// TODO: Change column width to %autowidth.spread when https://github.com/asciidoctor/asciidoctor-pdf/issues/599 is fixed
[cols="30,70",options="header"]
|===
|Example |Result
| |If the sort parameter is omitted, sorting is performed as though the parameter were set to `score desc`.
|score desc |Sorts in descending order from the highest score to the lowest score.
|price asc |Sorts in ascending order of the price field
|div(popularity,price) desc |Sorts in descending order of the result of the function `popularity / price`
|inStock desc, price asc |Sorts by the contents of the `inStock` field in descending order, then when multiple documents have the same value for the `inStock` field, those results are sorted in ascending order by the contents of the price field.
|categories asc, price asc |Sorts by the lowest value of the (multivalued) `categories` field in ascending order, then when multiple documents have the same lowest `categories` value, those results are sorted in ascending order by the contents of the price field.
|===
Regarding the sort parameter's arguments:
* A sort ordering must include a field name (or `score` as a pseudo field), followed by whitespace (escaped as + or `%20` in URL strings), followed by a sort direction (`asc` or `desc`).
* Multiple sort orderings can be separated by a comma, using this syntax: `sort=<field name>+<direction>,<field name>+<direction>],...`
** When more than one sort criteria is provided, the second entry will only be used if the first entry results in a tie. If there is a third entry, it will only be used if the first AND second entries are tied. And so on.
** If documents tie in all of the explicit sort criteria, Solr uses each document's Lucene document ID as the final tie-breaker.
This internal property is subject to change during segment merges and document updates, which can lead to unexpected result ordering changes.
Users looking to avoid this behavior can add an additional sort criteria on a unique or rarely-shared field such as `id` to prevent ties from occurring (e.g., `price desc,id asc`).
== start Parameter
When specified, the `start` parameter specifies an offset into a query's result set and instructs Solr to begin displaying results from this offset.
The default value is `0`. In other words, by default, Solr returns results without an offset, beginning where the results themselves begin.
Setting the `start` parameter to some other number, such as `3`, causes Solr to skip over the preceding records and start at the document identified by the offset.
You can use the `start` parameter this way for paging. For example, if the `rows` parameter is set to 10, you could display three successive pages of results by setting start to 0, then re-issuing the same query and setting start to 10, then issuing the query again and setting start to 20.
== rows Parameter
You can use the `rows` parameter to paginate results from a query. The parameter specifies the maximum number of documents from the complete result set that Solr should return to the client at one time.
The default value is `10`. That is, by default, Solr returns 10 documents at a time in response to a query.
== fq (Filter Query) Parameter
The `fq` parameter defines a query that can be used to restrict the superset of documents that can be returned, without influencing score. It can be very useful for speeding up complex queries, since the queries specified with `fq` are cached independently of the main query. When a later query uses the same filter, there's a cache hit, and filter results are returned quickly from the cache.
When using the `fq` parameter, keep in mind the following:
* The `fq` parameter can be specified multiple times in a query. Documents will only be included in the result if they are in the intersection of the document sets resulting from each instance of the parameter. In the example below, only documents which have a popularity greater then 10 and have a section of 0 will match.
+
[source,text]
----
fq=popularity:[10 TO *]&fq=section:0
----
* Filter queries can involve complicated Boolean queries. The above example could also be written as a single `fq` with two mandatory clauses like so:
+
[source,text]
----
fq=+popularity:[10 TO *] +section:0
----
* The document sets from each filter query are cached independently. Thus, concerning the previous examples: use a single `fq` containing two mandatory clauses if those clauses appear together often, and use two separate `fq` parameters if they are relatively independent. (To learn about tuning cache sizes and making sure a filter cache actually exists, see <<the-well-configured-solr-instance.adoc#,The Well-Configured Solr Instance>>.)
* It is also possible to use <<the-standard-query-parser.adoc#differences-between-lucenes-classic-query-parser-and-solrs-standard-query-parser,filter(condition) syntax>> inside the `fq` to cache clauses individually and - among other things - to achieve union of cached filter queries.
* As with all parameters: special characters in an URL need to be properly escaped and encoded as hex values. Online tools are available to help you with URL-encoding. For example: http://meyerweb.com/eric/tools/dencoder/.
== fl (Field List) Parameter
The `fl` parameter limits the information included in a query response to a specified list of fields. The fields must be either `stored="true"` or `docValues="true"``.`
The field list can be specified as a space-separated or comma-separated list of field names. The string "score" can be used to indicate that the score of each document for the particular query should be returned as a field. The wildcard character `*` selects all the fields in the document which are either `stored="true"` or `docValues="true"` and `useDocValuesAsStored="true"` (which is the default when docValues are enabled). You can also add pseudo-fields, functions and transformers to the field list request.
This table shows some basic examples of how to use `fl`:
// TODO: Change column width to %autowidth.spread when https://github.com/asciidoctor/asciidoctor-pdf/issues/599 is fixed
[cols="30,70",options="header"]
|===
|Field List |Result
|id name price |Return only the id, name, and price fields.
|id,name,price |Return only the id, name, and price fields.
|id name, price |Return only the id, name, and price fields.
|id score |Return the id field and the score.
|* |Return all the `stored` fields in each document, as well as any `docValues` fields that have `useDocValuesAsStored="true"`. This is the default value of the fl parameter.
|* score |Return all the fields in each document, along with each field's score.
|*,dv_field_name |Return all the `stored` fields in each document, and any `docValues` fields that have `useDocValuesAsStored="true"` and the docValues from dv_field_name even if it has `useDocValuesAsStored="false"`
|===
=== Functions with fl
<<function-queries.adoc#,Functions>> can be computed for each document in the result and returned as a pseudo-field:
[source,text]
----
fl=id,title,product(price,popularity)
----
=== Document Transformers with fl
<<transforming-result-documents.adoc#,Document Transformers>> can be used to modify the information returned about each documents in the results of a query:
[source,text]
----
fl=id,title,[explain]
----
=== Field Name Aliases
You can change the key used to in the response for a field, function, or transformer by prefixing it with a `_"displayName_:`". For example:
[source,text]
----
fl=id,sales_price:price,secret_sauce:prod(price,popularity),why_score:[explain style=nl]
----
[source,json]
----
{
"response": {
"numFound": 2,
"start": 0,
"docs": [{
"id": "6H500F0",
"secret_sauce": 2100.0,
"sales_price": 350.0,
"why_score": {
"match": true,
"value": 1.052226,
"description": "weight(features:cache in 2) [DefaultSimilarity], result of:",
"details": [{
"..."
}]}}]}}
----
== debug Parameter
The `debug` parameter can be specified multiple times and supports the following arguments:
* `debug=query`: return debug information about the query only.
* `debug=timing`: return debug information about how long the query took to process.
* `debug=results`: return debug information about the score results (also known as "explain").
** By default, score explanations are returned as large string values, using newlines and tab indenting for structure & readability, but an additional `debug.explain.structured=true` parameter may be specified to return this information as nested data structures native to the response format requested by `wt`.
* `debug=all`: return all available debug information about the request request. (alternatively usage: `debug=true`)
For backwards compatibility with older versions of Solr, `debugQuery=true` may instead be specified as an alternative way to indicate `debug=all`
The default behavior is not to include debugging information.
== explainOther Parameter
The `explainOther` parameter specifies a Lucene query in order to identify a set of documents. If this parameter is included and is set to a non-blank value, the query will return debugging information, along with the "explain info" of each document that matches the Lucene query, relative to the main query (which is specified by the `q` parameter). For example:
[source,text]
----
q=supervillians&debugQuery=on&explainOther=id:juggernaut
----
The query above allows you to examine the scoring explain info of the top matching documents, compare it to the explain info for documents matching `id:juggernaut`, and determine why the rankings are not as you expect.
The default value of this parameter is blank, which causes no extra "explain info" to be returned.
== timeAllowed Parameter
This parameter specifies the amount of time, in milliseconds, allowed for a search to complete. If this time expires before the search is complete, any partial results will be returned, but values such as `numFound`, <<faceting.adoc#,facet>> counts, and result <<the-stats-component.adoc#,stats>> may not be accurate for the entire result set. In case of expiration, if `omitHeader` isn't set to `true` the response header contains a special flag called `partialResults`. When using `timeAllowed` in combination with <<pagination-of-results.adoc#using-cursors,`cursorMark`>>, and the `partialResults` flag is present, some matching documents may have been skipped in the result set. Additionally, if the `partialResults` flag is present, `cursorMark` can match `nextCursorMark` even if there may be more results
[source,json]
----
{
"responseHeader": {
"status": 0,
"zkConnected": true,
"partialResults": true,
"QTime": 20,
"params": {
"q": "*:*"
}
},
"response": {
"numFound": 77,
"start": 0,
"docs": [ "..." ]
}
}
----
This value is only checked at the time of:
. Query Expansion, and
. Document collection
. Doc Values reading
As this check is periodically performed, the actual time for which a request can be processed before it is aborted would be marginally greater than or equal to the value of `timeAllowed`. If the request consumes more time in other stages, custom components, etc., this parameter is not expected to abort the request. Regular search, JSON Facet and the Analytics component abandon requests in accordance with this parameter.
== segmentTerminateEarly Parameter
This parameter may be set to either `true` or `false`.
If set to `true`, and if <<indexconfig-in-solrconfig.adoc#mergepolicyfactory,the mergePolicyFactory>> for this collection is a {solr-javadocs}/solr-core/org/apache/solr/index/SortingMergePolicyFactory.html[`SortingMergePolicyFactory`] which uses a `sort` option compatible with <<sort Parameter,the sort parameter>> specified for this query, then Solr will be able to skip documents on a per-segment basis that are definitively not candidates for the current page of results.
If early termination is used, a `segmentTerminatedEarly` header will be included in the `responseHeader`.
Similar to using <<timeAllowed Parameter,the `timeAllowed` Parameter>>, when early segment termination happens values such as `numFound`, <<faceting.adoc#,Facet>> counts, and result <<the-stats-component.adoc#,Stats>> may not be accurate for the entire result set.
The default value of this parameter is `false`.
== omitHeader Parameter
This parameter may be set to either `true` or `false`.
If set to `true`, this parameter excludes the header from the returned results. The header contains information about the request, such as the time it took to complete. The default value for this parameter is `false`. When using parameters such as <<common-query-parameters.adoc#timeallowed-parameter,`timeAllowed`>>, and <<solrcloud-query-routing-and-read-tolerance.adoc#shards-tolerant-parameter,`shards.tolerant`>>, which can lead to partial results, it is advisable to keep the header, so that the `partialResults` flag can be checked, and values such as `numFound`, `nextCursorMark`, <<faceting.adoc#,Facet>> counts, and result <<the-stats-component.adoc#,Stats>> can be interpreted in the context of partial results.
== wt Parameter
The `wt` parameter selects the Response Writer that Solr should use to format the query's response. For detailed descriptions of Response Writers, see <<response-writers.adoc#,Response Writers>>.
If you do not define the `wt` parameter in your queries, JSON will be returned as the format of the response.
== cache Parameter
Solr caches the results of all queries and filter queries by default. To disable result caching, set the `cache=false` parameter.
You can also use the `cost` option to control the order in which non-cached filter queries are evaluated. This allows you to order less expensive non-cached filters before expensive non-cached filters.
For very high cost filters, if `cache=false` and `cost>=100` and the query implements the `PostFilter` interface, a Collector will be requested from that query and used to filter documents after they have matched the main query and all other filter queries. There can be multiple post filters; they are also ordered by cost.
For most queries the default behavior is `cost=0`, but some types of queries (such as `{!frange}`) default to `cost=100`, because they are most efficient when used as a `PostFilter`.
This is an example of 3 regular filters, where all matching documents generated by each are computed up front and cached independently:
[source,text]
q=some keywords
fq=quantity_in_stock:[5 TO *]
fq={!frange l=10 u=100}mul(popularity,price)
fq={!frange cost=200 l=0}pow(mul(sum(1, query('tag:smartphone')), div(1,avg_rating)), 2.3)
These are the same filters run without caching.
The simple range query on the `quantity_in_stock` field will be run in parallel with the main query like a traditional Lucene filter, while the 2 `frange` filters will only be checked against each document has already matched the main query and the `quantity_in_stock` range query -- first the simpler `mul(popularity,price)` will be checked (because of its implicit `cost=100`) and only if it matches will the final very complex filter (with its higher `cost=200`) be checked.
[source,text]
q=some keywords
fq={!cache=false}quantity_in_stock:[5 TO *]
fq={!frange cache=false l=10 u=100}mul(popularity,price)
fq={!frange cache=false cost=200 l=0}pow(mul(sum(1, query('tag:smartphone')), div(1,avg_rating)), 2.3)
== logParamsList Parameter
By default, Solr logs all parameters of requests. Set this parameter to restrict which parameters of a request are logged. This may help control logging to only those parameters considered important to your organization.
For example, you could define this like:
`logParamsList=q,fq`
And only the 'q' and 'fq' parameters will be logged.
If no parameters should be logged, you can send `logParamsList` as empty (i.e., `logParamsList=`).
TIP: This parameter not only applies to query requests, but to any kind of request to Solr.
== echoParams Parameter
The `echoParams` parameter controls what information about request parameters is included in the response header.
The `echoParams` parameter accepts the following values:
* `explicit`: This is the default value. Only parameters included in the actual request, plus the `_` parameter (which is a 64-bit numeric timestamp) will be added to the `params` section of the response header.
* `all`: Include all request parameters that contributed to the query. This will include everything defined in the request handler definition found in `solrconfig.xml` as well as parameters included with the request, plus the `_` parameter. If a parameter is included in the request handler definition AND the request, it will appear multiple times in the response header.
* `none`: Entirely removes the `params` section of the response header. No information about the request parameters will be available in the response.
Here is an example of a JSON response where the echoParams parameter was not included, so the default of `explicit` is active. The request URL that created this response included three parameters - `q`, `wt`, and `indent`:
[source,json]
----
{
"responseHeader": {
"status": 0,
"QTime": 0,
"params": {
"q": "solr",
"indent": "true",
"wt": "json",
"_": "1458227751857"
}
},
"response": {
"numFound": 0,
"start": 0,
"docs": []
}
}
----
This is what happens if a similar request is sent that adds `echoParams=all` to the three parameters used in the previous example:
[source,json]
----
{
"responseHeader": {
"status": 0,
"QTime": 0,
"params": {
"q": "solr",
"df": "text",
"preferLocalShards": "false",
"indent": "true",
"echoParams": "all",
"rows": "10",
"wt": "json",
"_": "1458228887287"
}
},
"response": {
"numFound": 0,
"start": 0,
"docs": []
}
}
----
== minExactCount Parameter
When this parameter is used, Solr will count the number of hits accurately at least until this value. After that, Solr can skip over documents that don't have a score high enough to enter in the top N. This can greatly improve performance of search queries. On the other hand, when this parameter is used, the `numFound` may not be exact, and may instead be an approximation.
The `numFoundExact` boolean attribute is included in all responses, indicating if the `numFound` value is exact or an approximation. If it's an approximation, the real number of hits for the query is guaranteed to be greater or equal `numFound`.
More about approximate document counting and `minExactCount`:
* The documents returned in the response are guaranteed to be the docs with the top scores. This parameter will not make Solr skip documents that are to be returned in the response, it will only allow Solr to skip counting docs that, while they match the query, their score is low enough to not be in the top N.
* Providing `minExactCount` doesn't guarantee that Solr will use approximate hit counting (and thus, provide the speedup). Some types of queries, or other parameters (like if facets are requested) will require accurate counting.
* Approximate counting can only be used when sorting by `score desc` first (which is the default sort in Solr). Other fields can be used after `score desc`, but if any other type of sorting is used before score, then the approximation won't be applied.
* When doing distributed queries across multiple shards, each shard will accurately count hits until `minExactCount` (which means the query could be hitting `numShards * minExactCount` docs and `numFound` in the response would still be accurate)
For example:
[source,text]
q=quick brown fox&minExactCount=100&rows=10
[source,json]
----
"response": {
"numFound": 153,
"start": 0,
"numFoundExact": false,
"docs": [{"doc1"}]
}
----
Since `numFoundExact=false`, we know the number of documents matching the query is greater or equal to 153. If we specify a higher value for `minExactCount`:
[source,text]
q=quick brown fox&minExactCount=200&rows=10
[source,json]
----
"response": {
"numFound": 163,
"start": 0,
"numFoundExact": true,
"docs": [{"doc1"}]
}
----
In this case we know that `163` is the exact number of hits for the query. Both queries must have returned the same number of documents in the top 10.