| = Colocating Collections |
| :page-toclevels: 1 |
| :page-tocclass: right |
| // Licensed to the Apache Software Foundation (ASF) under one |
| // or more contributor license agreements. See the NOTICE file |
| // distributed with this work for additional information |
| // regarding copyright ownership. The ASF licenses this file |
| // to you under the Apache License, Version 2.0 (the |
| // "License"); you may not use this file except in compliance |
| // with the License. You may obtain a copy of the License at |
| // |
| // http://www.apache.org/licenses/LICENSE-2.0 |
| // |
| // Unless required by applicable law or agreed to in writing, |
| // software distributed under the License is distributed on an |
| // "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| // KIND, either express or implied. See the License for the |
| // specific language governing permissions and limitations |
| // under the License. |
| |
| Solr provides a way to colocate a collection with another so that cross-collection joins are always possible. |
| |
| The colocation guarantee applies to all future Collection operations made either via Collections API or by Autoscaling |
| actions. |
| |
| A collection may only be colocated with exactly one `withCollection`. However, arbitrarily many collections may be |
| _linked_ to the same `withCollection`. |
| |
| == Create a Colocated Collection |
| The Create Collection API supports a parameter named `withCollection` which can be used to specify a collection |
| with which the replicas of the newly created collection should be colocated. See <<collection-management.adoc#create,Create Collection API>>. |
| |
| `/admin/collections?action=CREATE&name=techproducts&numShards=1&replicationFactor=2&withCollection=tech_categories` |
| |
| In the above example, all replicas of the `techproducts` collection will be colocated on a node with at least one |
| replica of the `tech_categories` collection. |
| |
| == Colocating Existing Collections |
| When collections already exist beforehand, the <<collection-management.adoc#modifycollection, Modify Collection API>> can be |
| used to set the `withCollection` parameter so that the two collections can be linked. This will *not* trigger |
| changes to the cluster automatically because moving a large number of replicas immediately might de-stabilize the system. |
| Instead, it is recommended that the Suggestions UI page should be consulted on the operations that can be performed |
| to change the cluster manually. |
| |
| Example: |
| `/admin/collections?action=MODIFYCOLLECTION&collection=techproducts&withCollection=tech_categories` |
| |
| == Deleting Colocated Collections |
| Deleting a collection which has been linked to another will fail unless the link itself is deleted first by using the |
| <<collection-management.adoc#modifycollection, Modify Collection API>> to un-set the `withCollection` attribute. |
| |
| Example: |
| `/admin/collections?action=MODIFYCOLLECTION&collection=techproducts&withCollection=` |
| |
| == Limitations and Caveats |
| |
| The collection being used as the `withCollection` must have one shard only and that shard should be named `shard1`. Note |
| that when using the default router, the shard name is always set to `shard1` but special care must be taken to name the |
| shard as `shard1` when using the implicit router. |
| |
| In case new replicas of the `withCollection` have to be added to maintain the colocation guarantees then the new replicas |
| will be of type `NRT` only. Automatically creating replicas of `TLOG` or `PULL` types is not supported. |
| |
| In case, replicas have to be moved from one node to another, perhaps in response to a node lost trigger, then the target |
| nodes will be chosen by preferring nodes that already have a replica of the `withCollection` so that the number of moves |
| is minimized. However, this also means that unless there are Autoscaling policy violations, Solr will continue to move |
| such replicas to already loaded nodes instead of preferring empty nodes. Therefore, it is advised to have policy rules |
| which can prevent such overloading by e.g., setting the maximum number of cores per node to a fixed value. |
| |
| Example: |
| `{'cores' : '<8', 'node' : '#ANY'}` |
| |
| The colocation guarantee is one-way only i.e., a collection 'X' colocated with 'Y' will always have one or more |
| replicas of 'Y' on any node that has a replica of 'X' but the reverse is not true. There may be nodes which have one or |
| more replicas of 'Y' but no replicas of 'X'. Such replicas of 'Y' will not be considered a violation of colocation |
| rules and will not be cleaned up automatically. |