= Format of solr.xml
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.
The `solr.xml` file defines some global configuration options that apply to all or many cores.
This section will describe the default `solr.xml` file included with Solr and how to modify it for your needs. For details on how to configure `core.properties`, see the section <<defining-core-properties.adoc#,Defining core.properties>>.
== Defining solr.xml
You can find `solr.xml` in your `$SOLR_HOME` directory (usually `server/solr` or `/var/solr/data`) or optionally in ZooKeeper when using SolrCloud. The default `solr.xml` file looks like this:
[source,xml]
----
<solr>
<int name="maxBooleanClauses">${solr.max.booleanClauses:1024}</int>
<str name="sharedLib">${solr.sharedLib:}</str>
<str name="allowPaths">${solr.allowPaths:}</str>
<solrcloud>
<str name="host">${host:}</str>
<int name="hostPort">${solr.port.advertise:0}</int>
<str name="hostContext">${hostContext:solr}</str>
<bool name="genericCoreNodeNames">${genericCoreNodeNames:true}</bool>
<int name="zkClientTimeout">${zkClientTimeout:30000}</int>
<int name="distribUpdateSoTimeout">${distribUpdateSoTimeout:600000}</int>
<int name="distribUpdateConnTimeout">${distribUpdateConnTimeout:60000}</int>
<str name="zkCredentialsProvider">${zkCredentialsProvider:org.apache.solr.common.cloud.DefaultZkCredentialsProvider}</str>
<str name="zkACLProvider">${zkACLProvider:org.apache.solr.common.cloud.DefaultZkACLProvider}</str>
</solrcloud>
<shardHandlerFactory name="shardHandlerFactory"
class="HttpShardHandlerFactory">
<int name="socketTimeout">${socketTimeout:600000}</int>
<int name="connTimeout">${connTimeout:60000}</int>
<str name="shardsWhitelist">${solr.shardsWhitelist:}</str>
</shardHandlerFactory>
</solr>
----
As you can see, the discovery Solr configuration is "SolrCloud friendly". However, the presence of the `<solrcloud>` element does _not_ mean that the Solr instance is running in SolrCloud mode. Unless `-DzkHost` or `-DzkRun` is specified at startup time, this section is ignored.
== Solr.xml Parameters
=== The <solr> Element
There are no attributes that you can specify in the `<solr>` tag, which is the root element of `solr.xml`. The lists below describe the child nodes of each XML element in `solr.xml`.
`adminHandler`::
This attribute does not need to be set.
+
If used, this attribute should be set to the FQN (Fully qualified name) of a class that inherits from CoreAdminHandler. For example, `<str name="adminHandler">com.myorg.MyAdminHandler</str>` would configure the custom admin handler (MyAdminHandler) to handle admin requests.
+
If this attribute isn't set, Solr uses the default admin handler, `org.apache.solr.handler.admin.CoreAdminHandler`.
`collectionsHandler`::
As above, for custom CollectionsHandler implementations.
`infoHandler`::
As above, for custom InfoHandler implementations.
`coreLoadThreads`::
Specifies the number of threads that will be assigned to load cores in parallel.
`replayUpdatesThreads`::
Specifies the number of threads that will be assigned to replay updates in parallel.
This pool is shared for all cores of the node.
The default value is equal to the number of processors.
`coreRootDirectory`::
The root of the core discovery tree. Defaults to `$SOLR_HOME` (by default, `server/solr`).
`managementPath`::
Currently non-operational.
`sharedLib`::
Specifies the path to a common library directory that will be shared across all cores. Any JAR files in this directory will be added to the search path for Solr plugins. If the specified path is not absolute, it will be relative to `$SOLR_HOME`. Custom handlers may be placed in this directory. Note that specifying `sharedLib` will not remove `$SOLR_HOME/lib` from Solr's class path.
`allowPaths`::
Solr will normally only access folders relative to `$SOLR_HOME`, `$SOLR_DATA_HOME` or `coreRootDirectory`. If you need to, for example, create a core outside of these paths, you can explicitly allow the path with `allowPaths`. It is a comma-separated string of file system paths to allow. The special value of `*` will allow any path on the system.
`shareSchema`::
This attribute, when set to `true`, ensures that multiple cores pointing to the same Schema resource file refer to the same IndexSchema Object. Sharing the IndexSchema Object makes loading the core faster. If you use this feature, make sure that no core-specific property is used in your Schema file.
`transientCacheSize`::
Defines how many cores with `transient=true` can be loaded before the least recently used core is swapped out for a new core.
`configSetBaseDir`::
The directory under which configsets for Solr cores can be found. Defaults to `$SOLR_HOME/configsets`.
[[global-maxbooleanclauses]]
`maxBooleanClauses`::
Sets the maximum number of clauses allowed in any boolean query.
+
This global limit provides a safety constraint on the number of clauses allowed in any boolean queries against any collection -- regardless of whether those clauses were explicitly specified in a query string, or were the result of query expansion/re-writing from a more complex type of query based on the terms in the index.
+
In default configurations this property uses the value of the `solr.max.booleanClauses` system property if specified. This is the same system property used in the `_default` configset for the <<query-settings-in-solrconfig#maxbooleanclauses,`<maxBooleanClauses>` setting of `solrconfig.xml`>> making it easy for Solr administrators to increase both values (in all collections) without needing to search through and update all of their configs.
+
[source,xml]
----
<int name="maxBooleanClauses">${solr.max.booleanClauses:1024}</int>
----
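As a rough illustration of how several of the `<solr>` child elements described above can sit together in one file, consider the fragment below. It is only a sketch: every path and number in it is an assumption chosen for the example, not a recommended setting, and the element types (`int`, `str`, `bool`) simply follow the pattern of the default file shown earlier.

[source,xml]
----
<solr>
  <!-- Illustrative values only; adjust or remove for a real installation -->
  <int name="coreLoadThreads">4</int>
  <!-- Hypothetical absolute path used as the root of core discovery -->
  <str name="coreRootDirectory">/var/solr/cores</str>
  <!-- Relative path, so it is resolved against $SOLR_HOME -->
  <str name="sharedLib">lib-shared</str>
  <!-- Comma-separated list of extra paths Solr may access (hypothetical) -->
  <str name="allowPaths">/mnt/solr-backups,/mnt/solr-extra</str>
  <bool name="shareSchema">false</bool>
  <int name="transientCacheSize">16</int>
  <str name="configSetBaseDir">/var/solr/configsets</str>
</solr>
----

Any element left out of the file keeps the default described in its entry above.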
=== The <solrcloud> Element
This element defines several parameters that relate to SolrCloud. This section is ignored unless the Solr instance is started with either `-DzkRun` or `-DzkHost`.
`distribUpdateConnTimeout`::
Used to set the underlying `connTimeout` for intra-cluster updates.
`distribUpdateSoTimeout`::
Used to set the underlying `socketTimeout` for intra-cluster updates.
`host`::
The hostname Solr uses to access cores.
`hostContext`::
The URL context path.
`hostPort`::
The port Solr uses to access cores, and advertise Solr node locations through liveNodes.
This option is only necessary if a Solr instance is listening on a different port than it wants other nodes to contact it at.
For example, if the Solr node is running behind a proxy or in a cloud environment that allows for port mapping, such as Kubernetes.
`hostPort` is the port that the Solr instance wants other nodes to contact it at.
+
In the default `solr.xml` file, this is set to `${solr.port.advertise:0}`.
If no port is set in `solr.xml` (i.e., the value is `0`), then Solr will default to the port that Jetty is listening on, defined by `${jetty.port}`.
`leaderVoteWait`::
When SolrCloud is starting up, how long each Solr node will wait for all known replicas for that shard to be found before assuming that any nodes that haven't reported are down.
`leaderConflictResolveWait`::
When trying to elect a leader for a shard, this property sets the maximum time a replica will wait for conflicting state information to be resolved; temporary conflicts in state information can occur when doing rolling restarts, especially when the node hosting the Overseer is restarted.
+
Typically, the default value of `180000` (ms) is sufficient for conflicts to be resolved; you may need to increase this value if you have hundreds or thousands of small collections in SolrCloud.
`zkClientTimeout`::
A timeout for connection to a ZooKeeper server. It is used with SolrCloud.
`zkHost`::
In SolrCloud mode, the URL of the ZooKeeper host that Solr should use for cluster state information.
`genericCoreNodeNames`::
If `true`, node names are not based on the address of the node, but on a generic name that identifies the core. When a different machine takes over serving that core, things will be much easier to understand.
`zkCredentialsProvider` & `zkACLProvider`::
Optional parameters that can be specified if you are using <<zookeeper-access-control.adoc#,ZooKeeper Access Control>>.
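To make the parameters above more concrete, here is a minimal sketch of a `<solrcloud>` section that sets a few of them. The ZooKeeper host names and the `leaderVoteWait` value are placeholders invented for this example; the substitution expressions and the `180000` value for `leaderConflictResolveWait` are taken from the text above.

[source,xml]
----
<solr>
  <solrcloud>
    <!-- Placeholder ZooKeeper addresses; replace with your own ensemble -->
    <str name="zkHost">zk1.example.com:2181,zk2.example.com:2181</str>
    <str name="host">${host:}</str>
    <!-- 0 means: fall back to the port Jetty is listening on -->
    <int name="hostPort">${solr.port.advertise:0}</int>
    <str name="hostContext">${hostContext:solr}</str>
    <int name="zkClientTimeout">${zkClientTimeout:30000}</int>
    <!-- Illustrative value in milliseconds, not a documented default -->
    <int name="leaderVoteWait">120000</int>
    <int name="leaderConflictResolveWait">180000</int>
  </solrcloud>
</solr>
----

As noted at the start of this section, none of these settings take effect unless Solr is started with `-DzkRun` or `-DzkHost`.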
=== The <logging> Element
`class`::
The class to use for logging. The corresponding JAR file must be available to Solr, perhaps through a `<lib>` directive in `solrconfig.xml`.
`enabled`::
Either `true` or `false`: whether to enable logging or not.
==== The <logging><watcher> Element
`size`::
The number of log events that are buffered.
`threshold`::
The logging level above which your particular logging implementation will record. For example, when using Log4j, one might specify `DEBUG`, `WARN`, `INFO`, etc.
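A `<logging>` section using the parameters above might look roughly like the following. This is a sketch under assumptions: the class name `com.example.MyLogWatcher` is hypothetical, the `size` and `threshold` values are arbitrary, and the typed child elements mirror the style used elsewhere in `solr.xml` rather than a form confirmed by this page.

[source,xml]
----
<solr>
  <logging>
    <!-- Hypothetical class; its JAR must be available to Solr -->
    <str name="class">com.example.MyLogWatcher</str>
    <bool name="enabled">true</bool>
    <watcher>
      <!-- Number of log events to buffer (illustrative) -->
      <int name="size">50</int>
      <!-- Level name as understood by the logging implementation -->
      <str name="threshold">WARN</str>
    </watcher>
  </logging>
</solr>
----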
=== The <shardHandlerFactory> Element
If you wish to use a custom shard handler, you can define it in `solr.xml`.
[source,xml]
----
<shardHandlerFactory name="ShardHandlerFactory" class="qualified.class.name">
----
Since this is a custom shard handler, sub-elements are specific to the implementation. The default and only shard handler provided by Solr is the `HttpShardHandlerFactory`, in which case the following sub-elements can be specified:
`socketTimeout`::
The read timeout for intra-cluster query and administrative requests. The default is the same as the `distribUpdateSoTimeout` specified in the `<solrcloud>` section.
`connTimeout`::
The connection timeout for intra-cluster query and administrative requests. Defaults to the `distribUpdateConnTimeout` specified in the `<solrcloud>` section.
`urlScheme`::
The URL scheme to be used in distributed search.
`maxConnectionsPerHost`::
Maximum connections allowed per host. Defaults to `100000`.
`corePoolSize`::
The initial core size of the threadpool servicing requests. Default is `0`.
`maximumPoolSize`::
The maximum size of the threadpool servicing requests. Default is unlimited.
`maxThreadIdleTime`::
The amount of time in seconds that idle threads persist for in the queue, before being killed. Default is `5` seconds.
`sizeOfQueue`::
The maximum size of the backing queue, if the threadpool uses one. By default, no backing queue is used and tasks are handed off directly through a SynchronousQueue.
`fairnessPolicy`::
A boolean to configure if the threadpool favors fairness over throughput. Default is false to favor throughput.
`shardsWhitelist`::
When running Solr in non-cloud mode and if planning to do distributed search (using the "shards" parameter), the list of hosts needs to be whitelisted or Solr will forbid the request. The whitelist can also be configured in `solr.in.sh`.
`replicaRouting`::
A NamedList specifying replica routing preference configuration. This may be used to select and configure replica routing preferences. `default=true` may be used to set the default base replica routing preference. Only positive default status assertions are respected; i.e., `default=false` has no effect. If no explicit default base replica routing preference is configured, the implicit default will be `random`.
[source,xml]
----
<shardHandlerFactory class="HttpShardHandlerFactory">
<lst name="replicaRouting">
<lst name="stable">
<bool name="default">true</bool>
<str name="dividend">routingDividend</str>
<str name="hash">q</str>
</lst>
</lst>
</shardHandlerFactory>
----
Replica routing may also be specified (overriding defaults) per-request, via the `shards.preference` request parameter. If a request contains both `dividend` and `hash`, `dividend` takes priority for routing. For configuring `stable` routing, the `hash` parameter implicitly defaults to a hash of the String value of the main query parameter (i.e., `q`).
+
The `dividend` parameter must be configured explicitly; there is no implicit default. If only `dividend` routing is desired, `hash` may be explicitly set to the empty string, entirely disabling implicit hash-based routing.
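The timeout, connection, and thread-pool sub-elements of `HttpShardHandlerFactory` listed earlier in this section can be combined in a single definition. The fragment below is a sketch only: the `urlScheme`, `maxConnectionsPerHost`, `maximumPoolSize`, and `sizeOfQueue` values are assumptions picked for illustration, while the remaining values repeat the defaults stated above.

[source,xml]
----
<shardHandlerFactory name="shardHandlerFactory" class="HttpShardHandlerFactory">
  <int name="socketTimeout">${socketTimeout:600000}</int>
  <int name="connTimeout">${connTimeout:60000}</int>
  <!-- Illustrative: use HTTPS for distributed search requests -->
  <str name="urlScheme">https</str>
  <!-- Illustrative cap, well below the documented default -->
  <int name="maxConnectionsPerHost">100</int>
  <int name="corePoolSize">0</int>
  <!-- Illustrative upper bound; the default is unlimited -->
  <int name="maximumPoolSize">64</int>
  <!-- Seconds an idle thread is kept before being killed -->
  <int name="maxThreadIdleTime">5</int>
  <!-- Illustrative: use a bounded backing queue instead of direct handoff -->
  <int name="sizeOfQueue">200</int>
  <bool name="fairnessPolicy">false</bool>
</shardHandlerFactory>
----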
=== The <metrics> Element
The `<metrics>` element in `solr.xml` allows you to customize the metrics reported by Solr. You can define system properties that should not be returned, or define custom suppliers and reporters.
In a default `solr.xml` you will not see any `<metrics>` configuration. If you would like to customize the metrics for your installation, see the section <<metrics-reporting.adoc#metrics-configuration,Metrics Configuration>>.
== Substituting JVM System Properties in solr.xml
Solr supports variable substitution of JVM system property values in `solr.xml`, which allows runtime specification of various configuration options. The syntax is `${propertyname[:optional default value]}`. This allows defining a default that can be overridden when Solr is launched. If a default value is not specified, then the property must be specified at runtime or the `solr.xml` file will generate an error when parsed.
Any JVM system property, usually specified using the `-D` flag when starting the JVM, can be used as a variable in the `solr.xml` file.
For example, in the `solr.xml` file shown below, the `socketTimeout` and `connTimeout` values are each set to "60000". However, if you start Solr using `bin/solr -DsocketTimeout=1000`, the `socketTimeout` option of the `HttpShardHandlerFactory` will be overridden with a value of 1000 ms, while the `connTimeout` option will continue to use the default property value of "60000".
[source,xml]
----
<solr>
<shardHandlerFactory name="shardHandlerFactory"
class="HttpShardHandlerFactory">
<int name="socketTimeout">${socketTimeout:60000}</int>
<int name="connTimeout">${connTimeout:60000}</int>
</shardHandlerFactory>
</solr>
----
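The difference between a variable with a default and one without can be sketched as follows. The property names `example.sharedLib` and `example.coreRoot` are hypothetical and exist only for this illustration.

[source,xml]
----
<solr>
  <!-- Default supplied: "lib" is used when -Dexample.sharedLib is absent -->
  <str name="sharedLib">${example.sharedLib:lib}</str>
  <!-- No default: parsing fails unless -Dexample.coreRoot is given at startup -->
  <str name="coreRootDirectory">${example.coreRoot}</str>
</solr>
----

With a file like this, starting Solr with `bin/solr -Dexample.coreRoot=/var/solr/cores` would satisfy the second variable, while the first would fall back to its default.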