| = Collection Aliasing |
| :toclevels: 1 |
| // Licensed to the Apache Software Foundation (ASF) under one |
| // or more contributor license agreements. See the NOTICE file |
| // distributed with this work for additional information |
| // regarding copyright ownership. The ASF licenses this file |
| // to you under the Apache License, Version 2.0 (the |
| // "License"); you may not use this file except in compliance |
| // with the License. You may obtain a copy of the License at |
| // |
| // http://www.apache.org/licenses/LICENSE-2.0 |
| // |
| // Unless required by applicable law or agreed to in writing, |
| // software distributed under the License is distributed on an |
| // "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| // KIND, either express or implied. See the License for the |
| // specific language governing permissions and limitations |
| // under the License. |
| |
| A collection alias is a virtual collection which Solr treats the same as a normal collection. The alias collection may point to one or more real collections. |
| |
| Some use cases for collection aliasing: |
| |
| * Time series data |
| * Reindexing content behind the scenes |
| |
| [[createalias]] |
| == CREATEALIAS: Create or Modify an Alias for a Collection |
| |
| The `CREATEALIAS` action will create a new alias pointing to one or more collections. |
| Aliases come in 2 flavors: standard and routed. |
| |
| *Standard aliases* are simple: CREATEALIAS registers the alias name with the names of one or more collections provided |
| by the command. |
| If an existing alias exists, it is replaced/updated. |
| A standard alias can serve as a means to rename a collection, and can be used to atomically swap |
| which backing/underlying collection is "live" for various purposes. |
| When Solr searches an alias pointing to multiple collections, Solr will search all shards of all the collections as an |
| aggregated whole. |
| While it is possible to send updates to an alias spanning multiple collections, standard aliases have no logic for |
| distributing documents among the referenced collections so all updates will go to the first collection in the list. |
| |
| `/admin/collections?action=CREATEALIAS&name=_name_&collections=_collectionlist_` |
| |
| *Routed aliases* are aliases with additional capabilities to act as a kind of super-collection that route |
| updates to the correct collection. Routing is data driven and may be based on a temporal field or on categories |
| specified in a field (normally string based). |
| See <<aliases.adoc#routed-aliases,Routed Aliases>> for some important high-level information |
| before getting started. |
| |
| [source,text] |
| ---- |
| localhost:8983/solr/admin/collections?action=CREATEALIAS&name=timedata&router.start=NOW/DAY&router.field=evt_dt&router.name=time&router.interval=%2B1DAY&router.maxFutureMs=3600000&create-collection.collection.configName=myConfig&create-collection.numShards=2 |
| ---- |
| |
| If run on Jan 15, 2018, the above will create an time routed alias named timedata, that contains collections with names prefixed |
| with `timedata` and an initial collection named `timedata_2018_01_15` will be created immediately. Updates sent to this |
| alias with a (required) value in `evt_dt` that is before or after 2018-01-15 will be rejected, until the last 60 |
| minutes of 2018-01-15. After 2018-01-15T23:00:00 documents for either 2018-01-15 or 2018-01-16 will be accepted. |
| As soon as the system receives a document for an allowable time window for which there is no collection it will |
| automatically create the next required collection (and potentially any intervening collections if `router.interval` is |
| smaller than `router.maxFutureMs`). Both the initial collection and any subsequent collections will be created using |
| the specified configset. All collection creation parameters other than `name` are allowed, prefixed |
| by `create-collection.` |
| |
| This means that one could, for example, partition their collections by day, and within each daily collection route |
| the data to shards based on customer id. Such shards can be of any type (NRT, PULL or TLOG), and rule-based replica |
| placement strategies may also be used. |
| |
| The values supplied in this command for collection creation will be retained |
| in alias properties, and can be verified by inspecting `aliases.json` in ZooKeeper. |
| |
| NOTE: Presently only updates are routed and queries are distributed to all collections in the alias, but future |
| features may enable routing of the query to the single appropriate collection based on a special parameter or perhaps |
| a filter on the routed field. |
| |
| === CREATEALIAS Parameters |
| |
| `name`:: |
| The alias name to be created. This parameter is required. If the alias is to be routed it also functions |
| as a prefix for the names of the dependent collections that will be created. It must therefore adhere to normal |
| requirements for collection naming. |
| |
| `async`:: |
| Request ID to track this action which will be <<collections-api.adoc#asynchronous-calls,processed asynchronously>>. |
| |
| ==== Standard Alias Parameters |
| |
| `collections`:: |
| A comma-separated list of collections to be aliased. The collections must already exist in the cluster. |
| This parameter signals the creation of a standard alias. If it is present all routing parameters are |
| prohibited. If routing parameters are present this parameter is prohibited. |
| |
| ==== Routed Alias Parameters |
| |
| Most routed alias parameters become _alias properties_ that can subsequently be inspected and <<aliasprop,modified>>. |
| |
| `router.name`:: |
| The type of routing to use. Presently only `time` and `category` and `Dimensional[]` are valid. |
| In the case of a multi dimensional routed alias (A. K. A. "DRA", see <<aliases.adoc#dimensional-routed-aliases,Aliases>> |
| documentation), it is required to express all the dimensions in the same order that they will appear in the dimension |
| array. The format for a DRA router.name is Dimensional[dim1,dim2] where dim1 and dim2 are valid router.name |
| values for each sub-dimension. Note that DRA's are very new, and only 2D DRA's are presently supported. Higher |
| numbers of dimensions will be supported soon. See examples below for further clarification on how to configure |
| individual dimensions. This parameter is required. |
| |
| `router.field`:: |
| The field to inspect to determine which underlying collection an incoming document should be routed to. |
| This field is required on all incoming documents. |
| |
| `create-collection.*`:: |
| The `*` wildcard can be replaced with any parameter from the <<collection-management.adoc#create,CREATE>> command except `name`. All other fields |
| are identical in requirements and naming except that we insist that the configset be explicitly specified. |
| The configset must be created beforehand, either uploaded or copied and modified. |
| It's probably a bad idea to use "data driven" mode as schema mutations might happen concurrently leading to errors. |
| |
| ==== Time Routed Alias Parameters |
| |
| `router.start`:: |
| The start date/time of data for this time routed alias in Solr's standard date/time format (i.e., ISO-8601 or "NOW" |
| optionally with <<working-with-dates.adoc#date-math,date math>>). |
| + |
| The first collection created for the alias will be internally named after this value. |
| If a document is submitted with an earlier value for router.field then the earliest collection the alias points to then |
| it will yield an error since it can't be routed. This date/time MUST NOT have a milliseconds component other than 0. |
| Particularly, this means `NOW` will fail 999 times out of 1000, though `NOW/SECOND`, `NOW/MINUTE`, etc. will work |
| just fine. This parameter is required. |
| |
| `TZ`:: |
| The timezone to be used when evaluating any date math in router.start or router.interval. This is equivalent to the |
| same parameter supplied to search queries, but understand in this case it's persisted with most of the other parameters |
| as an alias property. |
| + |
| If GMT-4 is supplied for this value then a document dated 2018-01-14T21:00:00:01.2345Z would be stored in the |
| myAlias_2018-01-15_01 collection (assuming an interval of +1HOUR). |
| + |
| The default timezone is UTC. |
| |
| `router.interval`:: |
| A date math expression that will be appended to a timestamp to determine the next collection in the series. |
| Any date math expression that can be evaluated if appended to a timestamp of the form 2018-01-15T16:17:18 will |
| work here. |
| + |
| This parameter is required. |
| |
| `router.maxFutureMs`:: |
| The maximum milliseconds into the future that a document is allowed to have in `router.field` for it to be accepted |
| without error. If there was no limit, than an erroneous value could trigger many collections to be created. |
| + |
| The default is 10 minutes. |
| |
| `router.preemptiveCreateMath`:: |
| A date math expression that results in early creation of new collections. |
| + |
| If a document arrives with a timestamp that is after the end time of the most recent collection minus this |
| interval, then the next (and only the next) collection will be created asynchronously. Without this setting, collections are created |
| synchronously when required by the document time stamp and thus block the flow of documents until the collection |
| is created (possibly several seconds). Preemptive creation reduces these hiccups. If set to enough time (perhaps |
| an hour or more) then if there are problems creating a collection, this window of time might be enough to take |
| corrective action. However after a successful preemptive creation, the collection is consuming resources without |
| being used, and new documents will tend to be routed through it only to be routed elsewhere. Also, note that |
| router.autoDeleteAge is currently evaluated relative to the date of a newly created collection, and so you may |
| want to increase the delete age by the preemptive window amount so that the oldest collection isn't deleted too |
| soon. Note that it has to be possible to subtract the interval specified from a date, so if prepending a |
| minus sign creates invalid date math, this will cause an error. Also note that a document that is itself |
| destined for a collection that does not exist will still trigger synchronous creation up to that destination collection |
| but will not trigger additional async preemptive creation. Only one type of collection creation can happen |
| per document. |
| Example: `90MINUTES`. |
| + |
| This property is blank by default indicating just-in-time, synchronous creation of new collections. |
| |
| `router.autoDeleteAge`:: |
| A date math expression that results in the oldest collections getting deleted automatically. |
| + |
| The date math is relative to the timestamp of a newly created collection (typically close to the current time), |
| and thus this must produce an earlier time via rounding and/or subtracting. |
| Collections to be deleted must have a time range that is entirely before the computed age. |
| Collections are considered for deletion immediately prior to new collections getting created. |
| Example: `/DAY-90DAYS`. |
| + |
| The default is not to delete. |
| |
| ==== Category Routed Alias Parameters |
| |
| `router.maxCardinality`:: |
| The maximum number of categories allowed for this alias. |
| This setting safeguards against the inadvertent creation of an infinite number of collections in the event of bad data. |
| |
| `router.mustMatch`:: |
| A regular expression that the value of the field specified by `router.field` must match before a corresponding |
| collection will be created. Note that changing this setting after data has been added will not alter the data already |
| indexed. Any valid Java regular expression pattern may be specified. This expression is pre-compiled at the start of |
| each request so batching of updates is strongly recommended. Overly complex patterns will produce cpu |
| or garbage collecting overhead during indexing as determined by the JVM's implementation of regular expressions. |
| |
| ==== Dimensional Routed Alias Parameters |
| |
| |
| `router.#.`:: |
| This prefix denotes which position in the dimension array is being referred to for purposes of dimension configuration. |
| For example in a Dimensional[time,category] router.0.start would be used to set the start time for the time dimension. |
| |
| |
| === CREATEALIAS Response |
| |
| The output will simply be a responseHeader with details of the time it took to process the request. |
| To confirm the creation of the alias, you can look in the Solr Admin UI, under the Cloud section and find the |
| `aliases.json` file. The initial collection for routed aliases should also be visible in various parts of the admin UI. |
| |
| === Examples using CREATEALIAS |
| Create an alias named "testalias" and link it to the collections named "foo" and "bar". |
| |
| [.dynamic-tabs] |
| -- |
| |
| [example.tab-pane#v2createAlias] |
| ==== |
| [.tab-label]*V2 API* |
| *Input* |
| |
| [source,json] |
| ---- |
| { |
| "create-alias":{ |
| "name":"testalias", |
| "collections":["foo","bar"] |
| } |
| } |
| ---- |
| *Output* |
| |
| [source,json] |
| ---- |
| { |
| "responseHeader": { |
| "status": 0, |
| "QTime": 125 |
| } |
| } |
| ---- |
| ==== |
| |
| [example.tab-pane#v1createAlias] |
| ==== |
| [.tab-label]*V1 API* |
| |
| *Input* |
| |
| [source,text] |
| ---- |
| http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=testalias&collections=foo,bar&wt=xml |
| ---- |
| |
| *Output* |
| |
| [source,xml] |
| ---- |
| <response> |
| <lst name="responseHeader"> |
| <int name="status">0</int> |
| <int name="QTime">122</int> |
| </lst> |
| </response> |
| ---- |
| ==== |
| -- |
| |
| A somewhat contrived example demonstrating creating a TRA with many additional collection creation options. |
| Notice that the collection creation parameters follow the v2 API naming convention, not the v1 naming conventions. |
| |
| [.dynamic-tabs] |
| -- |
| |
| [example.tab-pane#v2createTRA] |
| ==== |
| [.tab-label]*V2 API* |
| |
| *Input* |
| |
| POST /api/c |
| |
| [source,json] |
| ---- |
| { |
| "create-alias" : { |
| "name": "somethingTemporalThisWayComes", |
| "router" : { |
| "name": "time", |
| "field": "evt_dt", |
| "start":"NOW/MINUTE", |
| "interval":"+2HOUR", |
| "maxFutureMs":"14400000" |
| }, |
| "create-collection" : { |
| "config":"_default", |
| "router": { |
| "name":"implicit", |
| "field":"foo_s" |
| }, |
| "shards":"foo,bar,baz", |
| "numShards": 3, |
| "tlogReplicas":1, |
| "pullReplicas":1, |
| "maxShardsPerNode":2, |
| "properties" : { |
| "foobar":"bazbam" |
| } |
| } |
| } |
| } |
| ---- |
| |
| *Output* |
| |
| [source,json] |
| ---- |
| { |
| "responseHeader": { |
| "status": 0, |
| "QTime": 1234 |
| } |
| } |
| ---- |
| ==== |
| |
| [example.tab-pane#v1createTRA] |
| ==== |
| [.tab-label]*V1 API* |
| |
| *Input* |
| |
| [source,text] |
| ---- |
| http://localhost:8983/solr/admin/collections?action=CREATEALIAS |
| &name=somethingTemporalThisWayComes |
| &router.name=time |
| &router.start=NOW/MINUTE |
| &router.field=evt_dt |
| &router.interval=%2B2HOUR |
| &router.maxFutureMs=14400000 |
| &create-collection.collection.configName=_default |
| &create-collection.router.name=implicit |
| &create-collection.router.field=foo_s |
| &create-collection.numShards=3 |
| &create-collection.shards=foo,bar,baz |
| &create-collection.tlogReplicas=1 |
| &create-collection.pullReplicas=1 |
| &create-collection.maxShardsPerNode=2 |
| &create-collection.property.foobar=bazbam |
| ---- |
| |
| *Output* |
| |
| [source,xml] |
| ---- |
| <response> |
| <lst name="responseHeader"> |
| <int name="status">0</int> |
| <int name="QTime">1234</int> |
| </lst> |
| </response> |
| ---- |
| ==== |
| -- |
| |
| Another example, this time of a Dimensional Routed Alias demonstrating how to specify parameters for the |
| individual dimensions |
| |
| [.dynamic-tabs] |
| -- |
| |
| [example.tab-pane#v2createDRA] |
| ==== |
| [.tab-label]*V2 API* |
| |
| *Input* |
| |
| POST /api/c |
| [source,json] |
| ---- |
| { |
| "create-alias":{ |
| "name":"dra_test1", |
| "router": { |
| "name": "Dimensional[time,category]", |
| "routerList" : [ { |
| "field":"myDate_tdt", |
| "start":"2019-01-01T00:00:00Z", |
| "interval":"+1MONTH", |
| "maxFutureMs":600000 |
| },{ |
| "field":"myCategory_s", |
| "maxCardinality":20 |
| }] |
| }, |
| "create-collection": { |
| "config":"_default", |
| "numShards":2 |
| } |
| } |
| } |
| ---- |
| *Output* |
| |
| [source,json] |
| ---- |
| { |
| "responseHeader": { |
| "status": 0, |
| "QTime": 1234 |
| } |
| } |
| ---- |
| ==== |
| |
| [example.tab-pane#v1createDRA] |
| ==== |
| [.tab-label]*V1 API* |
| |
| *Input* |
| |
| [source,text] |
| ---- |
| http://localhost:8983/solr/admin/collections?action=CREATEALIAS |
| &name=dra_test1 |
| &router.name=Dimensional[time,category] |
| &router.0.start=2019-01-01T00:00:00Z |
| &router.0.field=myDate_tdt |
| &router.0.interval=%2B1MONTH |
| &router.0.maxFutureMs=600000 |
| &create-collection.collection.configName=_default |
| &create-collection.numShards=2 |
| &router.1.maxCardinality=20 |
| &router.1.field=myCategory_s |
| ---- |
| |
| *Output* |
| |
| [source,xml] |
| ---- |
| <response> |
| <lst name="responseHeader"> |
| <int name="status">0</int> |
| <int name="QTime">1234</int> |
| </lst> |
| </response> |
| ---- |
| ==== |
| -- |
| |
| [[listaliases]] |
| == LISTALIASES: List of all aliases in the cluster |
| |
| `/admin/collections?action=LISTALIASES` |
| |
| The LISTALIASES action does not take any parameters. |
| |
| === LISTALIASES Response |
| |
| The output will contain a list of aliases with the corresponding collection names. |
| |
| === Examples using LISTALIASES |
| |
| *Input* |
| |
| List the existing aliases, requesting information as XML from Solr: |
| |
| [source,text] |
| ---- |
| http://localhost:8983/solr/admin/collections?action=LISTALIASES&wt=xml |
| ---- |
| |
| *Output* |
| |
| [source,xml] |
| ---- |
| <response> |
| <lst name="responseHeader"> |
| <int name="status">0</int> |
| <int name="QTime">0</int> |
| </lst> |
| <lst name="aliases"> |
| <str name="testalias1">collection1</str> |
| <str name="testalias2">collection1,collection2</str> |
| </lst> |
| <lst name="properties"> |
| <lst name="testalias1"/> |
| <lst name="testalias2"> |
| <str name="someKey">someValue</str> |
| </lst> |
| </lst> |
| </response> |
| ---- |
| |
| [[aliasprop]] |
| == ALIASPROP: Modify Alias Properties for a Collection |
| |
| The `ALIASPROP` action modifies the properties (metadata) on an alias. If a key is set with a value that is empty it will be removed. |
| |
| `/admin/collections?action=ALIASPROP&name=_name_&property.someKey=somevalue` |
| |
| WARNING: This command allows you to revise any property. No alias specific validation is performed. |
| Routed aliases may cease to function, function incorrectly or cause errors if property values |
| are set carelessly. |
| |
| === ALIASPROP Parameters |
| |
| `name`:: |
| The alias name on which to set properties. This parameter is required. |
| |
| `property.*`:: |
| The name of the property to be modified replaces '*', the value for the parameter is passed as the value for the property. |
| |
| `async`:: |
| Request ID to track this action which will be <<collections-api.adoc#asynchronous-calls,processed asynchronously>>. |
| |
| === ALIASPROP Response |
| |
| The output will simply be a responseHeader with details of the time it took to process the request. |
| To confirm the creation of the property or properties, you can look in the Solr Admin UI, under the Cloud section and |
| find the `aliases.json` file or use the LISTALIASES api command. |
| |
| === Examples using ALIASPROP |
| |
| *Input* |
| |
| For an alias named "testalias2" and set the value "someValue" for a property of "someKey" and "otherValue" for "otherKey". |
| |
| [source,text] |
| ---- |
| http://localhost:8983/solr/admin/collections?action=ALIASPROP&name=testalias2&property.someKey=someValue&property.otherKey=otherValue&wt=xml |
| ---- |
| |
| *Output* |
| |
| [source,xml] |
| ---- |
| <response> |
| <lst name="responseHeader"> |
| <int name="status">0</int> |
| <int name="QTime">122</int> |
| </lst> |
| </response> |
| ---- |
| |
| [[deletealias]] |
| == DELETEALIAS: Delete a Collection Alias |
| |
| `/admin/collections?action=DELETEALIAS&name=_name_` |
| |
| === DELETEALIAS Parameters |
| |
| `name`:: |
| The name of the alias to delete. This parameter is required. |
| |
| `async`:: |
| Request ID to track this action which will be <<collections-api.adoc#asynchronous-calls,processed asynchronously>>. |
| |
| === DELETEALIAS Response |
| |
| The output will simply be a responseHeader with details of the time it took to process the request. |
| To confirm the removal of the alias, you can look in the Solr Admin UI, under the Cloud section, and |
| find the `aliases.json` file. |
| |
| === Examples using DELETEALIAS |
| |
| *Input* |
| |
| Remove the alias named "testalias". |
| |
| [source,text] |
| ---- |
| http://localhost:8983/solr/admin/collections?action=DELETEALIAS&name=testalias&wt=xml |
| ---- |
| |
| *Output* |
| |
| [source,xml] |
| ---- |
| <response> |
| <lst name="responseHeader"> |
| <int name="status">0</int> |
| <int name="QTime">117</int> |
| </lst> |
| </response> |
| ---- |