blob: e8e9ce123a9633aa5c0c9dc4d0a4c650805075d2 [file] [log] [blame]
= Colocating Collections
:toclevels: 1
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.
Solr provides a way to colocate a collection with another so that cross-collection joins are always possible.
The colocation guarantee applies to all future Collection operations made either via Collections API or by Autoscaling
actions.
A collection may only be colocated with exactly one `withCollection`. However, arbitrarily many collections may be
_linked_ to the same `withCollection`.
== Create a Colocated Collection
The Create Collection API supports a parameter named `withCollection` which can be used to specify a collection
with which the replicas of the newly created collection should be colocated. See <<collection-management.adoc#create,Create Collection API>>.
`/admin/collections?action=CREATE&name=techproducts&numShards=1&replicationFactor=2&withCollection=tech_categories`
In the above example, all replicas of the `techproducts` collection will be colocated on a node with at least one
replica of the `tech_categories` collection.
== Colocating Existing Collections
When collections already exist beforehand, the <<collection-management.adoc#modifycollection, Modify Collection API>> can be
used to set the `withCollection` parameter so that the two collections can be linked. This will *not* trigger
changes to the cluster automatically because moving a large number of replicas immediately might de-stabilize the system.
Instead, it is recommended that the Suggestions UI page should be consulted on the operations that can be performed
to change the cluster manually.
Example:
`/admin/collections?action=MODIFYCOLLECTION&collection=techproducts&withCollection=tech_categories`
== Deleting Colocated Collections
Deleting a collection which has been linked to another will fail unless the link itself is deleted first by using the
<<collection-management.adoc#modifycollection, Modify Collection API>> to un-set the `withCollection` attribute.
Example:
`/admin/collections?action=MODIFYCOLLECTION&collection=techproducts&withCollection=`
== Limitations and Caveats
The collection being used as the `withCollection` must have one shard only and that shard should be named `shard1`. Note
that when using the default router, the shard name is always set to `shard1` but special care must be taken to name the
shard as `shard1` when using the implicit router.
In case new replicas of the `withCollection` have to be added to maintain the colocation guarantees then the new replicas
will be of type `NRT` only. Automatically creating replicas of `TLOG` or `PULL` types is not supported.
In case, replicas have to be moved from one node to another, perhaps in response to a node lost trigger, then the target
nodes will be chosen by preferring nodes that already have a replica of the `withCollection` so that the number of moves
is minimized. However, this also means that unless there are Autoscaling policy violations, Solr will continue to move
such replicas to already loaded nodes instead of preferring empty nodes. Therefore, it is advised to have policy rules
which can prevent such overloading by e.g., setting the maximum number of cores per node to a fixed value.
Example:
`{'cores' : '<8', 'node' : '#ANY'}`
The colocation guarantee is one-way only i.e., a collection 'X' colocated with 'Y' will always have one or more
replicas of 'Y' on any node that has a replica of 'X' but the reverse is not true. There may be nodes which have one or
more replicas of 'Y' but no replicas of 'X'. Such replicas of 'Y' will not be considered a violation of colocation
rules and will not be cleaned up automatically.