= Collection Management Commands
:toclevels: 1
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.
A collection is a single logical index that uses a single Solr configuration file (`solrconfig.xml`) and a single index schema.
[[create]]
== CREATE: Create a Collection
`/admin/collections?action=CREATE&name=_name_`
=== CREATE Parameters
The CREATE action allows the following parameters:
`name`::
The name of the collection to be created. This parameter is required.
`router.name`::
The router name that will be used. The router defines how documents will be distributed among the shards. Possible values are `implicit` or `compositeId`, which is the default.
+
The `implicit` router does not automatically route documents to different shards. Whichever shard you indicate on the indexing request (or within each document) will be used as the destination for those documents.
+
The `compositeId` router hashes the value in the uniqueKey field and looks up that hash in the collection's clusterstate to determine which shard will receive the document, with the additional ability to manually direct the routing.
+
When using the `implicit` router, the `shards` parameter is required. When using the `compositeId` router, the `numShards` parameter is required.
+
For more information, see also the section <<shards-and-indexing-data-in-solrcloud.adoc#document-routing,Document Routing>>.
`numShards`::
The number of shards to be created as part of the collection. This is a required parameter when the `router.name` is `compositeId`.
`shards`::
A comma separated list of shard names, e.g., `shard-x,shard-y,shard-z`. This is a required parameter when the `router.name` is `implicit`.
`replicationFactor`::
The number of replicas to be created for each shard. The default is `1`.
+
This will create an NRT-type replica. If you want another type of replica, see the `tlogReplicas` and `pullReplicas` parameters below. See the section <<shards-and-indexing-data-in-solrcloud.adoc#types-of-replicas,Types of Replicas>> for more information about replica types.
`nrtReplicas`::
The number of NRT (Near-Real-Time) replicas to create for this collection. This type of replica maintains a transaction log and updates its index locally. If you want all of your replicas to be of this type, you can simply use `replicationFactor` instead.
`tlogReplicas`::
The number of TLOG replicas to create for this collection. This type of replica maintains a transaction log but only updates its index via replication from a leader. See the section <<shards-and-indexing-data-in-solrcloud.adoc#types-of-replicas,Types of Replicas>> for more information about replica types.
`pullReplicas`::
The number of PULL replicas to create for this collection. This type of replica does not maintain a transaction log and only updates its index via replication from a leader. This type is not eligible to become a leader and should not be the only type of replicas in the collection. See the section <<shards-and-indexing-data-in-solrcloud.adoc#types-of-replicas,Types of Replicas>> for more information about replica types.
`maxShardsPerNode`::
When creating collections, the shards and/or replicas are spread across all available (i.e., live) nodes, and two replicas of the same shard will never be on the same node.
+
If a node is not live when the CREATE action is called, it will not get any parts of the new collection, which could lead to too many replicas being created on a single live node. Defining `maxShardsPerNode` sets a limit on the number of replicas the CREATE action will spread to each node.
+
If the entire collection cannot fit on the live nodes, no collection will be created at all. The default `maxShardsPerNode` value is `1`. A value of `-1` means unlimited. If a `policy` is also specified, the stricter of `maxShardsPerNode` and the policy rules applies.
`createNodeSet`::
Allows defining the nodes to spread the new collection across. The format is a comma-separated list of node_names, such as `localhost:8983_solr,localhost:8984_solr,localhost:8985_solr`.
+
If not provided, the CREATE operation will create shard-replicas spread across all live Solr nodes.
+
Alternatively, use the special value of `EMPTY` to initially create no shard-replica within the new collection and then later use the <<replica-management.adoc#addreplica,ADDREPLICA>> operation to add shard-replicas when and where required.
`createNodeSet.shuffle`::
Controls whether the shard-replicas created for this collection will be assigned to the nodes specified by `createNodeSet` in a sequential manner, or whether the list of nodes should be shuffled before individual replicas are created.
+
A `false` value makes the results of a collection creation predictable and gives more exact control over the location of the individual shard-replicas, but `true` can be a better choice for ensuring replicas are distributed evenly across nodes. The default is `true`.
+
This parameter is ignored if `createNodeSet` is not also specified.
`collection.configName`::
Defines the name of the configuration (which *must already be stored in ZooKeeper*) to use for this collection.
+
If not provided, Solr will use the configuration of the `_default` configset to create a new (and mutable) configset named `<collectionName>.AUTOCREATED` and will use it for the new collection.
When such a collection is deleted, its autocreated configset will be deleted by default when it is not in use by any other collection.
`router.field`::
If this parameter is specified, the router will look at the value of the field in an input document to compute the hash and identify a shard instead of looking at the `uniqueKey` field. If the field specified is null in the document, the document will be rejected.
+
Please note that <<realtime-get.adoc#,RealTime Get>> or retrieval by document ID would also require the parameter `\_route_` (or `shard.keys`) to avoid a distributed search.
`perReplicaState`::
If `true`, the states of individual replicas will be maintained as individual children of `state.json`. The default is `false`.
`property._name_=_value_`::
Set core property _name_ to _value_. See the section <<defining-core-properties.adoc#,Defining core.properties>> for details on supported properties and values.
`autoAddReplicas`::
When set to `true`, enables automatic addition of replicas when the number of active replicas falls below the value set for `replicationFactor`. This may occur if a replica goes down, for example. The default is `false`, which means new replicas will not be added.
+
While this parameter is provided as part of Solr's set of features to provide autoscaling of clusters, it is available even when you have not implemented any other part of autoscaling (such as a policy). See the section <<solrcloud-autoscaling-auto-add-replicas.adoc#the-autoaddreplicas-parameter,SolrCloud Autoscaling Automatically Adding Replicas>> for more details about this option and how it can be used.
[WARNING]
====
The entries in each core.properties file are vital for Solr to function correctly. Overriding entries can result in unusable collections. Altering these entries by specifying `property._name_=_value_` is an expert-level option and should only be used if you have a thorough understanding of the consequences.
====
`async`::
Request ID to track this action which will be <<collections-api.adoc#asynchronous-calls,processed asynchronously>>.
`rule`::
Replica placement rules. See the section <<rule-based-replica-placement.adoc#,Rule-based Replica Placement>> for details.
`snitch`::
Details of the snitch provider. See the section <<rule-based-replica-placement.adoc#,Rule-based Replica Placement>> for details.
`policy`:: Name of the collection-level policy. See <<solrcloud-autoscaling-policy-preferences.adoc#collection-specific-policy,Defining Collection-Specific Policies>> for details.
`waitForFinalState`::
If `true`, the request will complete only when all affected replicas become active. The default is `false`, which means that the API will return the status of the single action, which may be before the new replica is online and active.
`withCollection`::
The name of the collection with which all replicas of this collection must be co-located. The collection must already exist and must have a single shard named `shard1`.
See <<colocating-collections.adoc#, Colocating collections>> for more details.
`alias`::
Starting with version 8.1, an alias pointing to the new collection can optionally be created
at creation time. This parameter specifies the name of that alias, effectively combining
this operation with <<collection-aliasing.adoc#createalias,CREATEALIAS>>.
Collections are first created in read-write mode but can be put in `readOnly`
mode using the <<collection-management.adoc#modifycollection,MODIFYCOLLECTION>> action.
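Taken together, a CREATE call is an HTTP request carrying these parameters. As a sketch (not part of Solr, and assuming a hypothetical node at `localhost:8983`), the URL can be assembled with Python's standard library; actually sending it requires a running SolrCloud cluster:

```python
from urllib.parse import urlencode

def create_collection_url(base, name, num_shards, replication_factor=1, **extra):
    """Build a Collections API CREATE URL. 'extra' may carry any other
    documented parameter, e.g. collection.configName or maxShardsPerNode."""
    params = {
        "action": "CREATE",
        "name": name,                        # required
        "numShards": num_shards,             # required with the compositeId router
        "replicationFactor": replication_factor,
    }
    params.update(extra)
    return f"{base}/admin/collections?{urlencode(params)}"

url = create_collection_url("http://localhost:8983/solr", "newCollection", 2)
```

Parameters such as `shards` (for the `implicit` router) can be passed through `extra` in the same way.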
=== CREATE Response
The response will include the status of the request and the new core names. If the status is anything other than "success", an error message will explain why the request failed.
=== Examples using CREATE
*Input*
[source,text]
----
http://localhost:8983/solr/admin/collections?action=CREATE&name=newCollection&numShards=2&replicationFactor=1&wt=xml
----
*Output*
[source,xml]
----
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">3764</int>
</lst>
<lst name="success">
<lst>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">3450</int>
</lst>
<str name="core">newCollection_shard1_replica1</str>
</lst>
<lst>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">3597</int>
</lst>
<str name="core">newCollection_shard2_replica1</str>
</lst>
</lst>
</response>
----
[[reload]]
== RELOAD: Reload a Collection
`/admin/collections?action=RELOAD&name=_name_`
The RELOAD action is used when you have changed a configuration in ZooKeeper.
=== RELOAD Parameters
`name`::
The name of the collection to reload. This parameter is required.
`async`::
Request ID to track this action which will be <<collections-api.adoc#asynchronous-calls,processed asynchronously>>.
=== RELOAD Response
The response will include the status of the request and the cores that were reloaded. If the status is anything other than "success", an error message will explain why the request failed.
=== Examples using RELOAD
*Input*
[source,text]
----
http://localhost:8983/solr/admin/collections?action=RELOAD&name=newCollection&wt=xml
----
*Output*
[source,xml]
----
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">1551</int>
</lst>
<lst name="success">
<lst name="10.0.1.6:8983_solr">
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">761</int>
</lst>
</lst>
<lst name="10.0.1.4:8983_solr">
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">1527</int>
</lst>
</lst>
</lst>
</response>
----
[[modifycollection]]
== MODIFYCOLLECTION: Modify Attributes of a Collection
`/admin/collections?action=MODIFYCOLLECTION&collection=_<collection-name>_&__<attribute-name>__=__<attribute-value>__&__<another-attribute-name>__=__<another-value>__&__<yet_another_attribute_name>__=`
It's possible to edit multiple attributes at a time. Changing these values only updates the z-node in ZooKeeper; it does not change the topology of the collection. For instance, increasing `replicationFactor` will _not_ automatically add more replicas to the collection but _will_ allow more ADDREPLICA commands to succeed.
An attribute can be deleted by passing an empty value. For example, `yet_another_attribute_name=` (with no value) will delete the `yet_another_attribute_name` parameter from the collection.
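A sketch of building such a request, assuming a hypothetical collection `coll1` and a previously set custom property `property.foo`; passing the empty value deletes the attribute as described above:

```python
from urllib.parse import urlencode

def modify_collection_url(base, collection, **attrs):
    """Build a MODIFYCOLLECTION URL. An attribute passed with an empty
    string value will be deleted from the collection's state."""
    params = {"action": "MODIFYCOLLECTION", "collection": collection}
    params.update(attrs)
    return f"{base}/admin/collections?{urlencode(params)}"

# Raise replicationFactor and delete a (hypothetical) custom property:
url = modify_collection_url("http://localhost:8983/solr", "coll1",
                            replicationFactor=3, **{"property.foo": ""})
```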
=== MODIFYCOLLECTION Parameters
`collection`::
The name of the collection to be modified. This parameter is required.
`_attribute_=_value_`::
Key-value pairs of attribute names and attribute values.
At least one `_attribute_` parameter is required.
The attributes that can be modified are:
* maxShardsPerNode
* replicationFactor
* autoAddReplicas
* collection.configName
* rule
* snitch
* policy
* withCollection
* readOnly
* async
* other custom properties that use a `property.` prefix
See the <<create,CREATE action>> section above for details on these attributes.
[[readonlymode]]
==== Read-Only Mode
Setting the `readOnly` attribute to `true` puts the collection in read-only mode,
in which any index update requests are rejected. Other collection-level actions (e.g., adding /
removing / moving replicas) are still available in this mode.
The transition from the (default) read-write to read-only mode consists of the following steps:
* the `readOnly` flag is changed in collection state,
* any new update requests are rejected with a 403 FORBIDDEN error code (ongoing
long-running requests are aborted, too),
* a forced commit is performed to flush and commit any in-flight updates.
NOTE: This may potentially take a long time if there are still major segment merges running
in the background.
* a collection <<reload, RELOAD action>> is executed.
Removing the `readOnly` property or setting it to false enables the
processing of updates and reloads the collection.
[[list]]
== LIST: List Collections
Fetch the names of the collections in the cluster.
`/admin/collections?action=LIST`
=== Examples using LIST
*Input*
[source,text]
----
http://localhost:8983/solr/admin/collections?action=LIST
----
*Output*
[source,json]
----
{
"responseHeader":{
"status":0,
"QTime":2011},
"collections":["collection1",
"example1",
"example2"]}
----
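For illustration, the JSON body above can be consumed with any JSON parser; this sketch uses the sample response verbatim:

```python
import json

# Sample LIST response body, copied from the example above.
body = """
{
  "responseHeader": {"status": 0, "QTime": 2011},
  "collections": ["collection1", "example1", "example2"]
}
"""

data = json.loads(body)
assert data["responseHeader"]["status"] == 0   # non-zero signals an error
names = data["collections"]
```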
[[rename]]
== RENAME: Rename a Collection
`/admin/collections?action=RENAME&name=_existingName_&target=_targetName_`
Renaming a collection sets up a standard alias that points to the underlying collection, so
that the same (unmodified) collection can now be referred to in query, index and admin operations
using the new name.
This command does NOT actually rename the underlying Solr collection - it sets up a new one-to-one alias
using the new name, or renames the existing alias so that it uses the new name, while still referring to
the same underlying Solr collection. However, from the user's point of view the collection can now be
accessed using the new name, and the new name can be also referred to in other aliases.
The following limitations apply:
* the existing name must be either a SolrCloud collection or a standard alias referring to a single collection.
Aliases that refer to more than one collection are not supported.
* the existing name must not be a Routed Alias.
* the target name must not be an existing alias.
=== RENAME Command Parameters
`name`::
Name of the existing SolrCloud collection or an alias that refers to exactly one collection and is not
a Routed Alias.
`target`::
Target name of the collection. This will be the new alias that refers to the underlying SolrCloud collection.
The original name (or alias) of the collection will also be replaced in existing aliases so that they
refer to the new name. The target name must not be an existing alias.
=== Examples using RENAME
Assuming there are two actual SolrCloud collections named `collection1` and `collection2`,
and the following aliases already exist:
* `col1 -&gt; collection1`: this resolves to `collection1`.
* `col2 -&gt; collection2`: this resolves to `collection2`.
* `simpleAlias -&gt; col1`: this resolves to `collection1`.
* `compoundAlias -&gt; col1,col2`: this resolves to `collection1,collection2`
The RENAME of `col1` to `foo` will change the aliases to the following:
* `foo -&gt; collection1`: this resolves to `collection1`.
* `col2 -&gt; collection2`: this resolves to `collection2`.
* `simpleAlias -&gt; foo`: this resolves to `collection1`.
* `compoundAlias -&gt; foo,col2`: this resolves to `collection1,collection2`.
If we then rename `collection1` (which is an actual collection name) to `collection2` (which is also
an actual collection name) the following aliases will exist now:
* `foo -&gt; collection2`: this resolves to `collection2`.
* `col2 -&gt; collection2`: this resolves to `collection2`.
* `simpleAlias -&gt; foo`: this resolves to `collection2`.
* `compoundAlias -&gt; foo,col2`: this would now resolve to `collection2,collection2`, so it's reduced to simply `collection2`.
* `collection1` -&gt; `collection2`: this newly created alias effectively hides `collection1` from regular query and
update commands, which are now directed to `collection2`.
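The substitution rules in the first example can be modeled with a plain dictionary. This is an illustrative sketch, not Solr's implementation; `rename` and `resolve` are hypothetical helpers:

```python
def rename(aliases, old, new):
    """Simulate RENAME of an alias: 'new' takes over 'old's target, and
    every other alias that mentions 'old' is rewritten to mention 'new'."""
    aliases = dict(aliases)              # work on a copy
    target = aliases.pop(old)
    updated = {name: ",".join(new if m == old else m
                              for m in members.split(","))
               for name, members in aliases.items()}
    updated[new] = target
    return updated

def resolve(aliases, name):
    """Expand a (possibly chained) alias to unique real collection names."""
    if name not in aliases:
        return [name]                    # a real collection resolves to itself
    out = []
    for member in aliases[name].split(","):
        for coll in resolve(aliases, member):
            if coll not in out:          # duplicates are reduced, as in the text
                out.append(coll)
    return out

aliases = {"col1": "collection1", "col2": "collection2",
           "simpleAlias": "col1", "compoundAlias": "col1,col2"}
aliases = rename(aliases, "col1", "foo")
```

After the simulated RENAME, `compoundAlias` still resolves to both underlying collections, matching the example above.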
[[delete]]
== DELETE: Delete a Collection
`/admin/collections?action=DELETE&name=_collection_`
=== DELETE Parameters
`name`::
The name of the collection to delete. This parameter is required.
`async`::
Request ID to track this action which will be <<collections-api.adoc#asynchronous-calls,processed asynchronously>>.
=== DELETE Response
The response will include the status of the request and the cores that were deleted. If the status is anything other than "success", an error message will explain why the request failed.
=== Examples using DELETE
*Input*
Delete the collection named "newCollection".
[source,text]
----
http://localhost:8983/solr/admin/collections?action=DELETE&name=newCollection&wt=xml
----
*Output*
[source,xml]
----
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">603</int>
</lst>
<lst name="success">
<lst name="10.0.1.6:8983_solr">
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">19</int>
</lst>
</lst>
<lst name="10.0.1.4:8983_solr">
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">67</int>
</lst>
</lst>
</lst>
</response>
----
[[collectionprop]]
== COLLECTIONPROP: Collection Properties
Add, edit or delete a collection property.
`/admin/collections?action=COLLECTIONPROP&name=_collectionName_&propertyName=_propertyName_&propertyValue=_propertyValue_`
=== COLLECTIONPROP Parameters
`name`::
The name of the collection for which the property will be set.
`propertyName`::
The name of the property.
`propertyValue`::
The value of the property. When not provided, the property is deleted.
=== COLLECTIONPROP Response
The response will include the status of the request and the properties that were updated or removed. If the status is anything other than "0", an error message will explain why the request failed.
=== Examples using COLLECTIONPROP
*Input*
[source,text]
----
http://localhost:8983/solr/admin/collections?action=COLLECTIONPROP&name=coll&propertyName=foo&propertyValue=bar&wt=xml
----
*Output*
[source,xml]
----
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">0</int>
</lst>
</response>
----
[[migrate]]
== MIGRATE: Migrate Documents to Another Collection
`/admin/collections?action=MIGRATE&collection=_name_&split.key=_key1!_&target.collection=_target_collection_&forward.timeout=60`
The MIGRATE command is used to migrate all documents having a given routing key to another collection. The source collection will continue to have the same data as-is but it will start re-routing write requests to the target collection for the number of seconds specified by the `forward.timeout` parameter. It is the responsibility of the user to switch to the target collection for reads and writes after the MIGRATE action completes.
The routing key specified by the `split.key` parameter may span multiple shards on both the source and the target collections. The migration is performed shard-by-shard in a single thread. One or more temporary collections may be created by this command during the migrate process, but they are cleaned up automatically at the end.
This is a long-running operation and therefore using the `async` parameter is highly recommended. If the `async` parameter is not specified, the operation is synchronous by default and keeping a large read timeout on the invocation is advised. Even with a large read timeout, the request may still time out, but that doesn't necessarily mean the operation has failed. Users should check the logs, cluster state, and the source and target collections before invoking the operation again.
This command works only with collections using the `compositeId` router. The target collection must not receive any writes while the MIGRATE command is running, otherwise some writes may be lost.
Please note that the MIGRATE API does not perform any de-duplication of documents, so if the target collection contains documents with the same uniqueKey as the documents being migrated, the target collection will end up with duplicates.
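For illustration only, the routing prefix that `split.key` matches can be derived from a document's uniqueKey. This sketch assumes simple single-level `a!123`-style keys and is not Solr code:

```python
def route_prefix(unique_key):
    """Return the routing prefix (up to and including '!') of a
    compositeId-style uniqueKey, or None if the key has no prefix."""
    sep = unique_key.find("!")
    return unique_key[:sep + 1] if sep != -1 else None

# Documents whose prefix matches split.key=a! would be migrated:
docs = ["a!123", "a!456", "b!789", "plain-id"]
migrated = [d for d in docs if route_prefix(d) == "a!"]
```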
=== MIGRATE Parameters
`collection`::
The name of the source collection from which documents will be split. This parameter is required.
`target.collection`::
The name of the target collection to which documents will be migrated. This parameter is required.
`split.key`::
The routing key prefix. For example, if the uniqueKey of a document is "a!123", then you would use `split.key=a!`. This parameter is required.
`forward.timeout`::
The timeout, in seconds, until which write requests made to the source collection for the given `split.key` will be forwarded to the target shard. The default is 60 seconds.
`property._name_=_value_`::
Set core property _name_ to _value_. See the section <<defining-core-properties.adoc#,Defining core.properties>> for details on supported properties and values.
`async`::
Request ID to track this action which will be <<collections-api.adoc#asynchronous-calls,processed asynchronously>>.
=== MIGRATE Response
The response will include the status of the request.
=== Examples using MIGRATE
*Input*
[source,text]
----
http://localhost:8983/solr/admin/collections?action=MIGRATE&collection=test1&split.key=a!&target.collection=test2&wt=xml
----
*Output*
[source,xml]
----
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">19014</int>
</lst>
<lst name="success">
<lst>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">1</int>
</lst>
<str name="core">test2_shard1_0_replica1</str>
<str name="status">BUFFERING</str>
</lst>
<lst>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">2479</int>
</lst>
<str name="core">split_shard1_0_temp_shard1_0_shard1_replica1</str>
</lst>
<lst>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">1002</int>
</lst>
</lst>
<lst>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">21</int>
</lst>
</lst>
<lst>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">1655</int>
</lst>
<str name="core">split_shard1_0_temp_shard1_0_shard1_replica2</str>
</lst>
<lst>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">4006</int>
</lst>
</lst>
<lst>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">17</int>
</lst>
</lst>
<lst>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">1</int>
</lst>
<str name="core">test2_shard1_0_replica1</str>
<str name="status">EMPTY_BUFFER</str>
</lst>
<lst name="192.168.43.52:8983_solr">
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">31</int>
</lst>
</lst>
<lst name="192.168.43.52:8983_solr">
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">31</int>
</lst>
</lst>
<lst>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">1</int>
</lst>
<str name="core">test2_shard1_1_replica1</str>
<str name="status">BUFFERING</str>
</lst>
<lst>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">1742</int>
</lst>
<str name="core">split_shard1_1_temp_shard1_1_shard1_replica1</str>
</lst>
<lst>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">1002</int>
</lst>
</lst>
<lst>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">15</int>
</lst>
</lst>
<lst>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">1917</int>
</lst>
<str name="core">split_shard1_1_temp_shard1_1_shard1_replica2</str>
</lst>
<lst>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">5007</int>
</lst>
</lst>
<lst>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">8</int>
</lst>
</lst>
<lst>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">1</int>
</lst>
<str name="core">test2_shard1_1_replica1</str>
<str name="status">EMPTY_BUFFER</str>
</lst>
<lst name="192.168.43.52:8983_solr">
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">30</int>
</lst>
</lst>
<lst name="192.168.43.52:8983_solr">
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">30</int>
</lst>
</lst>
</lst>
</response>
----
[[reindexcollection]]
== REINDEXCOLLECTION: Re-Index a Collection
`/admin/collections?action=REINDEXCOLLECTION&name=_name_`
The REINDEXCOLLECTION command reindexes a collection using existing data from the
source collection.
NOTE: Reindexing is potentially a lossy operation - some of the existing indexed data that is not
available as stored fields may be lost, so users should use this command
with caution, evaluating the potential impact by using different source and target
collection names first, and preserving the source collection until the evaluation is
complete.
The target collection must not exist (and may not be an alias). If the target
collection name is the same as the source collection's, a unique sequential name
will first be generated for the target collection; after reindexing is done, an alias
will be created that points from the source name to the actual sequentially-named target collection.
When reindexing is started the source collection is put in <<readonlymode,read-only mode>> to ensure that
all source documents are properly processed.
Using optional parameters a different index schema, collection shape (number of shards and replicas)
or routing parameters can be requested for the target collection.
Reindexing is executed as a streaming expression daemon, which runs on one of the
source collection's replicas. It is usually a time-consuming operation so it's recommended to execute
it as an asynchronous request in order to avoid request timeouts. Only one reindexing operation may
execute concurrently for a given source collection. Long-running, erroneous or crashed reindexing
operations may be terminated by using the `abort` option, which also removes partial results.
=== REINDEXCOLLECTION Parameters
`name`::
Source collection name, may be an alias. This parameter is required.
`cmd`::
Optional command. Default command is `start`. Currently supported commands are:
* `start` - default, starts processing if not already running,
* `abort` - aborts an already running reindexing (or clears a left-over status after a crash),
and deletes partial results,
* `status` - returns detailed status of a running reindexing command.
`target`::
Target collection name, optional. If not specified, a unique name will be generated and
after all documents have been copied an alias will be created that points from the source
collection name to the unique sequentially-named collection, effectively "hiding"
the original source collection from regular update and search operations.
`q`::
Optional query to select documents for reindexing. Default value is `\*:*`.
`fl`::
Optional list of fields to reindex. Default value is `*`.
`rows`::
Documents are transferred in batches. Depending on the average document size, large
batch sizes may cause memory issues. The default value is 100.
`configName`::
`collection.configName`::
Optional name of the configset for the target collection. Default is the same as the
source collection.
There are a number of optional parameters that determine the target collection layout. If they
are not specified in the request, their values are copied from the source collection.
The following parameters are currently supported (described in detail in the <<create,CREATE collection>> section):
`numShards`, `replicationFactor`, `nrtReplicas`, `tlogReplicas`, `pullReplicas`, `maxShardsPerNode`,
`autoAddReplicas`, `shards`, `policy`, `createNodeSet`, `createNodeSet.shuffle`, `router.*`.
`removeSource`::
Optional boolean. If true then after the processing is successfully finished the source collection will
be deleted.
`async`::
Optional request ID to track this action which will be <<collections-api.adoc#asynchronous-calls,processed asynchronously>>.
When the reindexing process has completed, the target collection is marked using
`property.rx: "finished"`, and the source collection state is updated to become read-write.
On any error, the command will delete any temporary and target collections and also reset the
source collection's read-only flag.
=== Examples using REINDEXCOLLECTION
*Input*
[source,text]
----
http://localhost:8983/solr/admin/collections?action=REINDEXCOLLECTION&name=newCollection&numShards=3&configName=conf2&q=id:aa*&fl=id,string_s
----
This request specifies a different schema for the target collection, copies only some of the fields, selects only the documents
matching a query, and also potentially re-shapes the collection by explicitly specifying 3 shards. Since the target collection
hasn't been specified in the parameters, a collection with a unique name, e.g., `.rx_newCollection_2`, will be created and on success
an alias pointing from `newCollection` to `.rx_newCollection_2` will be created, effectively replacing the source collection
for the purpose of indexing and searching. The source collection is assumed to be small so a synchronous request was made.
*Output*
[source,json]
----
{
"responseHeader":{
"status":0,
"QTime":10757},
"reindexStatus":{
"phase":"done",
"inputDocs":13416,
"processedDocs":376,
"actualSourceCollection":".rx_newCollection_1",
"state":"finished",
"actualTargetCollection":".rx_newCollection_2",
"checkpointCollection":".rx_ck_newCollection"
}
}
----
As a result a new collection `.rx_newCollection_2` has been created, with selected documents reindexed to 3 shards, and
with an alias pointing from `newCollection` to this one. The status also shows that the source collection
was already an alias to `.rx_newCollection_1`, which was likely a result of a previous reindexing.
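A sketch of checking such a response programmatically, using the sample output above; the field names come from the example, everything else is illustrative:

```python
import json

# Sample REINDEXCOLLECTION response, copied from the example above.
body = """
{
  "responseHeader": {"status": 0, "QTime": 10757},
  "reindexStatus": {
    "phase": "done",
    "inputDocs": 13416,
    "processedDocs": 376,
    "actualSourceCollection": ".rx_newCollection_1",
    "state": "finished",
    "actualTargetCollection": ".rx_newCollection_2",
    "checkpointCollection": ".rx_ck_newCollection"
  }
}
"""

status = json.loads(body)["reindexStatus"]
finished = status["phase"] == "done" and status["state"] == "finished"
```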
[[colstatus]]
== COLSTATUS: Detailed Status of a Collection's Indexes
The COLSTATUS command provides a detailed description of the collection status, including low-level index
information about segments and field data.
This command also checks the compliance of Lucene index field types with the current Solr collection
schema and indicates the names of non-compliant fields, i.e., Lucene fields with field types incompatible
with (or different from) the corresponding Solr field types declared in the current schema. Such incompatibilities may
result from incompatible schema changes or from the migration of
data to a different major Solr release.
`/admin/collections?action=COLSTATUS&collection=coll&coreInfo=true&segments=true&fieldInfo=true&sizeInfo=true`
=== COLSTATUS Parameters
`collection`::
Collection name (optional). If not specified, information about all collections is returned.
`coreInfo`::
Optional boolean. If true then additional information will be provided about
SolrCore of shard leaders.
`segments`::
Optional boolean. If true then segment information will be provided.
`fieldInfo`::
Optional boolean. If true then detailed Lucene field information and the corresponding Solr schema
types will be provided.
`sizeInfo`::
Optional boolean. If true then additional information about index file sizes
and their RAM usage will be provided.
==== Index Size Analysis Tool
The `COLSTATUS` command also provides a tool for analyzing and estimating the composition of raw index data. Please note that
this tool should be used with care because it generates a significant IO load on all shard leaders of the
analyzed collections. The sampling threshold and sampling percent parameters can be adjusted to reduce this
load to some degree.
Size estimates produced by this tool are only approximate and represent the aggregated size of uncompressed
index data. In reality these values would never occur, because Lucene (and Solr) always stores data in a
compressed format; still, these values help you understand what occupies most of the space and the relative size
of each type of data and each field in the index.
In the following sections whenever "size" is mentioned it means an estimated aggregated size of
uncompressed (raw) data.
The following parameters are specific to this tool:
`rawSize`::
Optional boolean. If true then run the raw index data analysis tool (the other boolean options below imply
this option if any of them are true). The command response will include sections that show the estimated breakdown of
data size per field and per data type.
`rawSizeSummary`::
Optional boolean. If true then a more detailed breakdown of data size per field and per type is also included.
`rawSizeDetails`::
Optional boolean. If true then exhaustive details are provided, including the statistical distribution of items per
field and per type as well as the top 20 largest items per field.
`rawSizeSamplingPercent`::
Optional float. When the index is larger than a certain threshold (100k documents per shard), only a part of the
data is actually retrieved and analyzed in order to reduce the IO load, and the final results are then extrapolated.
Values must be greater than 0 and less than or equal to 100.0. The default value is 5.0. Very small values (between 0.0 and 1.0)
may introduce significant estimation errors. Also, values that would result in fewer than 10 documents being sampled
are rejected with an exception.
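The sampling behavior described above can be sketched with some simple arithmetic. This is a simplified model for illustration only, not Solr's actual implementation; the 100k-document threshold and the 10-document minimum are taken from the description above, and the document count is an arbitrary example value:

[source,bash]
----
# Simplified model of the sampling rules (illustration only, not Solr code).
num_docs=500000   # example documents in the shard (sampling applies above 100k)
percent=5.0       # rawSizeSamplingPercent (default 5.0)

# Number of documents that would be sampled: num_docs * percent / 100
sampled=$(awk -v n="$num_docs" -v p="$percent" 'BEGIN { printf "%d", n * p / 100 }')
echo "documents sampled: $sampled"

# Requests that would sample fewer than 10 documents are rejected.
if [ "$sampled" -lt 10 ]; then
  echo "rejected: fewer than 10 documents would be sampled"
fi
----

Sizes measured on the sample are then extrapolated by the inverse of the sampling fraction to estimate totals for the whole shard.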
The response for this command always contains two sections:
* `fieldsBySize` is a map where field names are keys and values are estimated sizes of the raw (uncompressed) data
that belongs to the field. The map is sorted by size so that it's easy to see which field occupies the most space.
* `typesBySize` is a map where data types are keys and values are estimated sizes of the raw (uncompressed) data
of a particular type. This map is also sorted by size.
Optional sections include:
* `summary` section containing a breakdown of data sizes for each field by data type.
* `details` section containing detailed statistical summary of size distribution within each field, per data type.
This section also shows `topN` values by size from each field.
Data types shown in the response can be roughly divided into the following groups:
* `storedFields` - represents the raw uncompressed data in stored fields. For example, for UTF-8 strings this represents
the aggregated sum of the number of bytes in the strings' UTF-8 representation, for long numbers this is 8 bytes per value, etc.
* `terms_terms` - represents the aggregated size of the term dictionary. The size of this data is affected by the
number and length of unique terms, which in turn depends on the field size and the analysis chain.
* `terms_postings` - represents the aggregated size of all term position and offset information, if present.
This information may be absent if position-based searching, such as phrase queries, is not needed.
* `terms_payloads` - represents the aggregated size of all per-term payload data, if present.
* `norms` - represents the aggregated size of field norm information. This information may be omitted if a field
has an `omitNorms` flag in the schema, which is common for fields that don't need weighting or scoring by field length.
* `termVectors` - represents the aggregated size of term vectors.
* `docValues_*` - represents aggregated size of doc values, by type (e.g., `docValues_numeric`, `docValues_binary`, etc).
* `points` - represents aggregated size of point values.
=== COLSTATUS Response
The response will include an overview of the collection status, the number of
active or inactive shards and replicas, and additional index information
of shard leaders.
=== Examples using COLSTATUS
*Input*
[source,text]
----
http://localhost:8983/solr/admin/collections?action=COLSTATUS&collection=gettingstarted&fieldInfo=true&sizeInfo=true
----
*Output*
[source,json]
----
{
"responseHeader": {
"status": 0,
"QTime": 50
},
"gettingstarted": {
"stateFormat": 2,
"znodeVersion": 16,
"properties": {
"autoAddReplicas": "false",
"maxShardsPerNode": "-1",
"nrtReplicas": "2",
"pullReplicas": "0",
"replicationFactor": "2",
"router": {
"name": "compositeId"
},
"tlogReplicas": "0"
},
"activeShards": 2,
"inactiveShards": 0,
"schemaNonCompliant": [
"(NONE)"
],
"shards": {
"shard1": {
"state": "active",
"range": "80000000-ffffffff",
"replicas": {
"total": 2,
"active": 2,
"down": 0,
"recovering": 0,
"recovery_failed": 0
},
"leader": {
"coreNode": "core_node4",
"core": "gettingstarted_shard1_replica_n1",
"base_url": "http://192.168.0.80:8983/solr",
"node_name": "192.168.0.80:8983_solr",
"state": "active",
"type": "NRT",
"force_set_state": "false",
"leader": "true",
"segInfos": {
"info": {
"minSegmentLuceneVersion": "9.0.0",
"commitLuceneVersion": "9.0.0",
"numSegments": 40,
"segmentsFileName": "segments_w",
"totalMaxDoc": 686953,
"userData": {
"commitCommandVer": "1627350608019193856",
"commitTimeMSec": "1551962478819"
}
},
"fieldInfoLegend": [
"I - Indexed",
"D - DocValues",
"xxx - DocValues type",
"V - TermVector Stored",
"O - Omit Norms",
"F - Omit Term Frequencies & Positions",
"P - Omit Positions",
"H - Store Offsets with Positions",
"p - field has payloads",
"s - field uses soft deletes",
":x:x:x - point data dim : index dim : num bytes"
],
"segments": {
"_i": {
"name": "_i",
"delCount": 738,
"softDelCount": 0,
"hasFieldUpdates": false,
"sizeInBytes": 109398213,
"size": 70958,
"age": "2019-03-07T12:34:24.761Z",
"source": "merge",
"version": "9.0.0",
"createdVersionMajor": 9,
"minVersion": "9.0.0",
"diagnostics": {
"os": "Mac OS X",
"java.vendor": "Oracle Corporation",
"java.version": "1.8.0_191",
"java.vm.version": "25.191-b12",
"lucene.version": "9.0.0",
"mergeMaxNumSegments": "-1",
"os.arch": "x86_64",
"java.runtime.version": "1.8.0_191-b12",
"source": "merge",
"mergeFactor": "10",
"os.version": "10.14.3",
"timestamp": "1551962064761"
},
"attributes": {
"Lucene50StoredFieldsFormat.mode": "BEST_SPEED"
},
"largestFiles": {
"_i.fdt": "42.5 MB",
"_i_Lucene80_0.dvd": "35.3 MB",
"_i_Lucene50_0.pos": "11.1 MB",
"_i_Lucene50_0.doc": "10 MB",
"_i_Lucene50_0.tim": "4.3 MB"
},
"ramBytesUsed": {
"total": 49153,
"postings [PerFieldPostings(segment=_i formats=1)]": {
"total": 31023,
"fields": {
"dc": {
"flags": "I-----------",
"schemaType": "text_general"
},
"dc_str": {
"flags": "-Dsrs-------",
"schemaType": "strings"
},
"dc.title": {
"flags": "I-----------",
"docCount": 70958,
"sumDocFreq": 646756,
"sumTotalTermFreq": 671817,
"schemaType": "text_general"
},
"dc.date": {
"flags": "-Dsrn-------:1:1:8",
"schemaType": "pdates"
}
}}}}}}}}}}}
----
Example of using the raw index data analysis tool:
*Input*
[source,text]
----
http://localhost:8983/solr/admin/collections?action=COLSTATUS&collection=gettingstarted&rawSize=true&rawSizeSamplingPercent=0.1
----
*Output*
[source,json]
----
{
"responseHeader": {
"status": 0,
"QTime": 26812
},
"gettingstarted": {
"stateFormat": 2,
"znodeVersion": 33,
"properties": {
"autoAddReplicas": "false",
"maxShardsPerNode": "-1",
"nrtReplicas": "2",
"pullReplicas": "0",
"replicationFactor": "2",
"router": {
"name": "compositeId"
},
"tlogReplicas": "0"
},
"activeShards": 2,
"inactiveShards": 0,
"schemaNonCompliant": [
"(NONE)"
],
"shards": {
"shard1": {
"state": "active",
"range": "80000000-ffffffff",
"replicas": {
"total": 2,
"active": 2,
"down": 0,
"recovering": 0,
"recovery_failed": 0
},
"leader": {
"coreNode": "core_node5",
"core": "gettingstarted_shard1_replica_n2",
"base_url": "http://192.168.0.80:8983/solr",
"node_name": "192.168.0.80:8983_solr",
"state": "active",
"type": "NRT",
"force_set_state": "false",
"leader": "true",
"segInfos": {
"info": {
"minSegmentLuceneVersion": "9.0.0",
"commitLuceneVersion": "9.0.0",
"numSegments": 46,
"segmentsFileName": "segments_4h",
"totalMaxDoc": 3283741,
"userData": {
"commitCommandVer": "1635676266902323200",
"commitTimeMSec": "1559902446318"
}
},
"rawSize": {
"fieldsBySize": {
"revision.text": "7.9 GB",
"revision.text_str": "734.7 MB",
"revision.comment_str": "259.1 MB",
"revision": "239.2 MB",
"revision.sha1": "211.9 MB",
"revision.comment": "201.3 MB",
"title": "114.9 MB",
"revision.contributor": "103.5 MB",
"revision.sha1_str": "96.4 MB",
"revision.id": "75.2 MB",
"ns": "75.2 MB",
"revision.timestamp": "75.2 MB",
"revision.contributor.id": "74.7 MB",
"revision.format": "69 MB",
"id": "65 MB",
"title_str": "26.8 MB",
"revision.model_str": "25.4 MB",
"_version_": "24.9 MB",
"_root_": "24.7 MB",
"revision.contributor.ip_str": "22 MB",
"revision.contributor_str": "21.8 MB",
"revision_str": "15.5 MB",
"revision.contributor.ip": "13.5 MB",
"restrictions_str": "428.7 KB",
"restrictions": "164.2 KB",
"name_str": "84 KB",
"includes_str": "8.8 KB"
},
"typesBySize": {
"storedFields": "7.8 GB",
"docValues_sortedSet": "1.2 GB",
"terms_postings": "788.8 MB",
"terms_terms": "342.2 MB",
"norms": "237 MB",
"docValues_sortedNumeric": "124.3 MB",
"points": "115.7 MB",
"docValues_numeric": "24.9 MB",
"docValues_sorted": "18.5 MB"
}
}
}
}
},
"shard2": {
"state": "active",
"range": "0-7fffffff",
"replicas": {
"total": 2,
"active": 2,
"down": 0,
"recovering": 0,
"recovery_failed": 0
},
"leader": {
"coreNode": "core_node8",
"core": "gettingstarted_shard2_replica_n6",
"base_url": "http://192.168.0.80:8983/solr",
"node_name": "192.168.0.80:8983_solr",
"state": "active",
"type": "NRT",
"force_set_state": "false",
"leader": "true",
"segInfos": {
"info": {
"minSegmentLuceneVersion": "9.0.0",
"commitLuceneVersion": "9.0.0",
"numSegments": 55,
"segmentsFileName": "segments_4d",
"totalMaxDoc": 3284863,
"userData": {
"commitCommandVer": "1635676259742646272",
"commitTimeMSec": "1559902445005"
}
},
"rawSize": {
"fieldsBySize": {
"revision.text": "8.3 GB",
"revision.text_str": "687.5 MB",
"revision": "238.9 MB",
"revision.sha1": "212 MB",
"revision.comment_str": "211.5 MB",
"revision.comment": "201.7 MB",
"title": "115.9 MB",
"revision.contributor": "103.4 MB",
"revision.sha1_str": "96.3 MB",
"ns": "75.2 MB",
"revision.id": "75.2 MB",
"revision.timestamp": "75.2 MB",
"revision.contributor.id": "74.6 MB",
"revision.format": "69 MB",
"id": "67 MB",
"title_str": "29.5 MB",
"_version_": "24.8 MB",
"revision.model_str": "24 MB",
"revision.contributor_str": "21.7 MB",
"revision.contributor.ip_str": "20.9 MB",
"revision_str": "15.5 MB",
"revision.contributor.ip": "13.8 MB",
"restrictions_str": "411.1 KB",
"restrictions": "132.9 KB",
"name_str": "42 KB",
"includes_str": "41 KB"
},
"typesBySize": {
"storedFields": "8.2 GB",
"docValues_sortedSet": "1.1 GB",
"terms_postings": "787.4 MB",
"terms_terms": "337.5 MB",
"norms": "236.6 MB",
"docValues_sortedNumeric": "124.1 MB",
"points": "115.7 MB",
"docValues_numeric": "24.9 MB",
"docValues_sorted": "20.5 MB"
}
}
}
}
}
}
}
}
----
[[backup]]
== BACKUP: Backup Collection
Backs up Solr collections and associated configurations to a shared filesystem, for example a Network File System.
`/admin/collections?action=BACKUP&name=myBackupName&collection=myCollectionName&location=/path/to/my/shared/drive`
The BACKUP command will back up Solr indexes and configurations for a specified collection. The BACKUP command <<making-and-restoring-backups.adoc#,takes one copy from each shard for the indexes>>. For configurations, it backs up the configset that was associated with the collection, along with the collection metadata.
Backup data is stored in the repository based on the provided `name` and `location`.
Each backup location can hold multiple backups for the same collection, allowing users to later restore from any of these "backup points" as desired.
Within a location backups are done incrementally, so that index files uploaded previously are skipped and not duplicated in the backup repository.
[NOTE]
====
Previous versions of Solr supported a different snapshot-based backup method without the incremental support described above.
Solr can still restore from backups that use this old format, but creating new backups of this format is not recommended and snapshot-based backups are officially deprecated.
See the `incremental` parameter below for more information.
====
=== BACKUP Parameters
`collection`::
The name of the collection to be backed up. This parameter is required.
`name`::
What to name the backup that is created. Solr checks that a backup with this name doesn't already exist, and raises an error if it does. This parameter is required.
`location`::
The location on a shared drive for the backup command to write to. This parameter is required, unless a default location is defined on the repository configuration, or set as a <<cluster-node-management.adoc#clusterprop,cluster property>>.
+
If the location path is on a mounted drive, the mount must be available on the node that serves as the overseer, even if the overseer node does not host a replica of the collection being backed up.
Since any node can take the overseer role at any time, a best practice to avoid possible backup failures is to ensure the mount point is available on all nodes of the cluster.
+
Each backup location can only hold backups for one collection; however, the same location can be used for repeated backups of the same collection. Repeated backups of the same collection are done incrementally, so that files unchanged since the last backup are not duplicated in the backup repository.
`async`::
Request ID to track this action which will be <<collections-api.adoc#asynchronous-calls,processed asynchronously>>.
`repository`::
The name of a repository to be used for the backup. If no repository is specified then the local filesystem repository will be used automatically.
`maxNumBackupPoints`::
The upper-bound on how many backups should be retained at the backup location.
If the current number exceeds this bound, older backups will be deleted until only `maxNumBackupPoints` backups remain.
This parameter has no effect if `incremental=false` is specified.
`incremental`::
A boolean parameter allowing users to choose whether to create an incremental (`incremental=true`) or a "snapshot" (`incremental=false`) backup.
If unspecified, backups are done incrementally by default.
Incremental backups are preferred in all known circumstances and snapshot backups are deprecated, so this parameter should only be used after much consideration.
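As a sketch of how these parameters combine, the following assembles a backup request that keeps at most three backup points and runs asynchronously. The collection name, backup name, and request ID are hypothetical; the URL is only printed here, so substitute your own values and submit it with `curl`:

[source,bash]
----
# Hypothetical names; adjust to your deployment.
solr_url="http://localhost:8983/solr"
backup_url="${solr_url}/admin/collections?action=BACKUP"
backup_url="${backup_url}&name=myBackupName&collection=myCollectionName"
backup_url="${backup_url}&location=/path/to/my/shared/drive"
backup_url="${backup_url}&maxNumBackupPoints=3&async=backup-req-1"
echo "$backup_url"
----

The asynchronous request can then be tracked with `action=REQUESTSTATUS` and the same request ID, as described in <<collections-api.adoc#asynchronous-calls,Asynchronous Calls>>.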
[[listbackup]]
== LISTBACKUP: List Backups
Lists information about each backup stored at the specified repository location.
Basic metadata is returned about each backup including: the timestamp the backup was created, the Lucene version used to create the index, and the size of the backup both in number of files and total filesize.
[NOTE]
====
Previous versions of Solr supported a different snapshot-based backup file structure that did not support the storage of multiple backups at the same location.
Solr can still restore backups stored in this old format, but it is deprecated and will be removed in subsequent versions of Solr.
The LISTBACKUP API does not support the deprecated format and attempts to use this API on a location holding an older backup will result in an error message.
====
The file structure used by Solr internally to represent backups changed in 8.9.0.
While backups created prior to this format change can still be restored, the `LISTBACKUP` and `DELETEBACKUP` API commands are only valid on this newer format.
Attempting to use them on a location holding an older backup will result in an error message.
=== LISTBACKUP Parameters
`name`::
The name of the backups to list.
The backup name usually corresponds to the collection name, but this isn't required.
This parameter is required.
`location`::
The repository location to list backups from. This parameter is required, unless a default location is defined on the repository configuration, or set as a <<cluster-node-management.adoc#clusterprop,cluster property>>.
+
If the location path is on a mounted drive, the mount must be available on the node that serves as the overseer, even if the overseer node does not host a replica of the collection being backed up.
Since any node can take the overseer role at any time, a best practice to avoid possible backup failures is to ensure the mount point is available on all nodes of the cluster.
`repository`::
The name of a repository to be used for accessing backup information.
If no repository is specified then the local filesystem repository will be used automatically.
`async`::
Request ID to track this action which will be <<collections-api.adoc#asynchronous-calls,processed asynchronously>>.
=== LISTBACKUP Example
*Input*
[.dynamic-tabs]
--
[example.tab-pane#v1listbackup]
====
[.tab-label]*V1 API*
[source,bash]
----
http://localhost:8983/solr/admin/collections?action=LISTBACKUP&name=myBackupName&location=/path/to/my/shared/drive
----
====
[example.tab-pane#v2listbackup]
====
[.tab-label]*V2 API*
[source,bash]
----
POST http://localhost:8983/v2/collections/backups
{
"list-backups" : {
"name": "myBackupName",
"location": "/path/to/my/shared/drive"
}
}
----
====
--
*Output*
[source,json]
----
{
"responseHeader":{
"status":0,
"QTime":4},
"collection":"books",
"backups":[{
"indexFileCount":0,
"indexSizeMB":0.0,
"shardBackupIds":{
"shard2":"md_shard2_0.json",
"shard1":"md_shard1_0.json"},
"collection.configName":"books",
"backupId":0,
"collectionAlias":"books",
"startTime":"2021-02-09T03:19:52.085653Z",
"indexVersion":"9.0.0"},
{
"indexFileCount":0,
"indexSizeMB":0.0,
"shardBackupIds":{
"shard2":"md_shard2_1.json",
"shard1":"md_shard1_1.json"},
"collection.configName":"books",
"backupId":1,
"collectionAlias":"books",
"startTime":"2021-02-09T03:19:52.268804Z",
"indexVersion":"9.0.0"}]}
----
[[restore]]
== RESTORE: Restore Collection
Restores Solr indexes and associated configurations to a specified collection.
`/admin/collections?action=RESTORE&name=myBackupName&location=/path/to/my/shared/drive&collection=myRestoredCollectionName`
The RESTORE operation will replace the content of a collection with files from the specified backup.
If the provided `collection` value matches an existing collection, Solr will use it for restoration, assuming it is compatible (same number of shards, etc.) with the stored backup files.
If the provided `collection` value doesn't exist, a new collection with that name is created in a way compatible with the stored backup files.
The created collection will have the same number of shards and replicas as the original collection, and routing information will be preserved. Optionally, you can override some of these parameters, as documented below.
While restoring, if a configset with the same name exists in ZooKeeper, Solr will reuse it; otherwise it will upload the backed-up configset to ZooKeeper and use that.
You can use the collection <<collection-aliasing.adoc#createalias,CREATEALIAS>> command to make sure clients don't need to change the endpoint to query or index against the newly restored collection.
=== RESTORE Parameters
`collection`::
The collection where the indexes will be restored into. This parameter is required.
`name`::
The name of the existing backup that you want to restore. This parameter is required.
`location`::
The location on a shared drive for the RESTORE command to read from. Alternately it can be set as a <<cluster-node-management.adoc#clusterprop,cluster property>>.
`async`::
Request ID to track this action which will be <<collections-api.adoc#asynchronous-calls,processed asynchronously>>.
`repository`::
The name of a repository to be used for the backup. If no repository is specified then the local filesystem repository will be used automatically.
`backupId`::
The ID of a specific backup point to restore from.
+
Backup locations can hold multiple backups of the same collection. This parameter allows users to choose which of those backups should be used to restore from. If not specified the most recent backup point is used.
There are also optional parameters that determine the target collection layout.
The following parameters are currently supported (described in detail in the <<create,CREATE collection>> section):
`createNodeSet`, `createNodeSet.shuffle`.
Note: for `createNodeSet` the special value of `EMPTY` is not allowed with this command.
*Overridable Parameters*
Additionally, there are several parameters that may have been set on the original collection that can be overridden when restoring the backup (described in detail in the <<create,CREATE collection>> section):
`collection.configName`, `replicationFactor`, `nrtReplicas`, `tlogReplicas`, `pullReplicas`, `property._name_=_value_`.
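As a sketch, a restore request that targets a specific backup point and overrides one of the original collection's parameters could be assembled as follows. All names and the `backupId` value are illustrative; the URL is only printed here, so substitute your own values and submit it with `curl`:

[source,bash]
----
# Hypothetical names; adjust to your deployment.
solr_url="http://localhost:8983/solr"
restore_url="${solr_url}/admin/collections?action=RESTORE"
restore_url="${restore_url}&name=myBackupName&location=/path/to/my/shared/drive"
restore_url="${restore_url}&collection=myRestoredCollectionName"
# Restore from backup point 1 and override the original replicationFactor.
restore_url="${restore_url}&backupId=1&replicationFactor=1"
echo "$restore_url"
----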
[[deletebackup]]
== DELETEBACKUP: Delete backup files from the remote repository
Deletes backup files stored at the specified repository location.
[NOTE]
====
Previous versions of Solr supported a different snapshot-based backup file structure that did not support the storage of multiple backups at the same location.
Solr can still restore backups stored in this old format, but it is deprecated and will be removed in subsequent versions of Solr.
The DELETEBACKUP API does not support the deprecated format and attempts to use this API on a location holding an older backup will result in an error message.
====
Solr allows storing multiple backups for the same collection at any given logical "location".
These backup points are each given an identifier (`backupId`) which can be used to delete them specifically with this API.
Alternatively Solr can be told to keep the last `maxNumBackupPoints` backups, deleting everything else at the given location.
Deleting backup points in these ways can orphan index files that are no longer referenced by any backup points.
These orphaned files can be detected and deleted using the `purgeUnused` option.
See the parameter descriptions below for more information.
=== DELETEBACKUP Example
*Input*
The following API command deletes the first backup (`backupId=0`) at the specified repository location.
[.dynamic-tabs]
--
[example.tab-pane#v1deletebackup]
====
[.tab-label]*V1 API*
[source,bash]
----
http://localhost:8983/solr/admin/collections?action=DELETEBACKUP&name=myBackupName&location=/path/to/my/shared/drive&backupId=0
----
====
[example.tab-pane#v2deletebackup]
====
[.tab-label]*V2 API*
[source,bash]
----
POST http://localhost:8983/v2/collections/backups
{
"delete-backups" : {
"name": "myBackupName",
"location": "/path/to/my/shared/drive",
"backupId": 0
}
}
----
====
--
*Output*
[source,json]
----
{
"responseHeader":{
"status":0,
"QTime":940},
"deleted":[[
"startTime","2021-02-09T03:19:52.085653Z",
"backupId",0,
"size",28381,
"numFiles",53]],
"collection":"books"}
----
=== DELETEBACKUP Parameters
`name`::
The backup name to delete backup files from. This parameter is required.
`location`::
The repository location to delete backups from. This parameter is required, unless a default location is defined on the repository configuration, or set as a <<cluster-node-management.adoc#clusterprop,cluster property>>.
+
If the location path is on a mounted drive, the mount must be available on the node that serves as the overseer, even if the overseer node does not host a replica of the collection being backed up.
Since any node can take the overseer role at any time, a best practice to avoid possible backup failures is to ensure the mount point is available on all nodes of the cluster.
`repository`::
The name of a repository to be used for deleting backup files. If no repository is specified then the local filesystem repository will be used automatically.
`backupId`::
Explicitly specify a single backup-ID to delete.
Only one of `backupId`, `maxNumBackupPoints`, and `purgeUnused` may be specified per DELETEBACKUP request.
`maxNumBackupPoints`::
Specify how many backups should be retained, deleting all others.
Only one of `backupId`, `maxNumBackupPoints`, and `purgeUnused` may be specified per DELETEBACKUP request.
`purgeUnused`::
Solr's incremental backup support can orphan files if the backups referencing them are deleted.
The `purgeUnused` flag parameter triggers a scan to detect these orphaned files and delete them.
Administrators doing repeated backups at the same location should plan on using this parameter periodically to reclaim disk space.
Only one of `backupId`, `maxNumBackupPoints`, and `purgeUnused` may be specified per DELETEBACKUP request.
`async`::
Request ID to track this action which will be <<collections-api.adoc#asynchronous-calls,processed asynchronously>>.
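For example, a periodic cleanup request that only purges orphaned files at a location could be sketched as follows. The names are illustrative and the URL is only printed here; note that `purgeUnused` is used on its own, since it cannot be combined with `backupId` or `maxNumBackupPoints`:

[source,bash]
----
# Hypothetical names; adjust to your deployment.
solr_url="http://localhost:8983/solr"
purge_url="${solr_url}/admin/collections?action=DELETEBACKUP"
purge_url="${purge_url}&name=myBackupName&location=/path/to/my/shared/drive"
purge_url="${purge_url}&purgeUnused=true"
echo "$purge_url"
----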
[[rebalanceleaders]]
== REBALANCELEADERS: Rebalance Leaders
Reassigns leaders in a collection according to the `preferredLeader` property across active nodes.
`/admin/collections?action=REBALANCELEADERS&collection=collectionName`
This command should be run after the `preferredLeader` property has been assigned via the BALANCESHARDUNIQUE or ADDREPLICAPROP commands.
NOTE: It is not _required_ that all shards in a collection have a `preferredLeader` property. Rebalancing will only attempt to reassign leadership to those replicas that have the `preferredLeader` property set to `true` _and_ are not currently the shard leader _and_ are currently active.
=== REBALANCELEADERS Parameters
`collection`::
The name of the collection to rebalance `preferredLeaders` on. This parameter is required.
`maxAtOnce`::
The maximum number of reassignments to have queued up at once. Values \<=0 use the default value `Integer.MAX_VALUE`.
+
When this number is reached, the process waits for one or more leaders to be successfully assigned before adding more to the queue.
`maxWaitSeconds`::
Defaults to `60`. This is the timeout value when waiting for leaders to be reassigned. If `maxAtOnce` is less than the number of reassignments that will take place, this is the maximum interval for any _single_ wait for at least one reassignment to complete.
+
For example, if 10 reassignments are to take place and `maxAtOnce` is `1` and `maxWaitSeconds` is `60`, the upper bound on the time that the command may wait is 10 minutes.
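The worst-case wait in the example above follows from allowing one `maxWaitSeconds` interval per batch of `maxAtOnce` reassignments, which can be sketched as:

[source,bash]
----
# Worst-case wait: one maxWaitSeconds interval per batch of maxAtOnce reassignments.
reassignments=10
max_at_once=1
max_wait_seconds=60

batches=$(( (reassignments + max_at_once - 1) / max_at_once ))   # rounded up
echo "upper bound: $(( batches * max_wait_seconds )) seconds"
----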
=== REBALANCELEADERS Response
The response will include the status of the request. A status of "0" indicates the request was _processed_, not that all assignments were successful. Examine the "Summary" section for that information.
=== Examples using REBALANCELEADERS
*Input*
Either of these commands would cause all the active replicas that had the `preferredLeader` property set and were _not_ already the preferred leader to become leaders.
[source,text]
----
http://localhost:8983/solr/admin/collections?action=REBALANCELEADERS&collection=collection1&wt=json
http://localhost:8983/solr/admin/collections?action=REBALANCELEADERS&collection=collection1&maxAtOnce=5&maxWaitSeconds=30&wt=json
----
*Output*
In this example:
* In the "alreadyLeaders" section, core_node5 was already the leader, so there were no changes in leadership for shard1.
* In the "inactivePreferreds" section, core_node57 had the `preferredLeader` property set, but the node was not active, so the leader for shard7 was not changed. This is considered successful.
* In the "successes" section, core_node23 was _not_ the leader for shard3, so leadership was assigned to that replica.
The "Summary" section with the "Success" tag indicates that the command rebalanced all _active_ replicas with the `preferredLeader` property set, as required. If a replica cannot be made leader because it is not healthy (for example, it is on a Solr instance that is not running), this is also considered a success.
[source,json]
----
{
"responseHeader":{
"status":0,
"QTime":3054},
"Summary":{
"Success":"All active replicas with the preferredLeader property set are leaders"},
"alreadyLeaders":{
"core_node5":{
"status":"skipped",
"msg":"Replica core_node5 is already the leader for shard shard1. No change necessary"}},
"inactivePreferreds":{
"core_node57":{
"status":"skipped",
"msg":"Replica core_node57 is a referredLeader for shard shard7, but is inactive. No change necessary"}},
"successes":{
"shard3":{
"status":"success",
"msg":"Successfully changed leader of slice shard3 to core_node23"}}}
----
Examining the clusterstate after issuing this call should show that every active replica with the `preferredLeader` property also has the "leader" property set to _true_.
NOTE: The added work done by an NRT leader is quite small and only present when indexing. The primary use-case is to redistribute the leader role if there are a large number of leaders concentrated on a small number of nodes. Rebalancing will likely not improve performance unless the imbalance of leadership roles is measured in multiples of 10.
NOTE: The BALANCESHARDUNIQUE command that distributes the preferredLeader property does not guarantee perfect distribution and in some collection topologies it is impossible to make that guarantee.