blob: a1e388f144624334c093754e78e392102595ef7c [file] [log] [blame]
= Copying Fields
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.
You might want to interpret some document fields in more than one way. Solr has a mechanism for making copies of fields so that you can apply several distinct field types to a single piece of incoming information.
The name of the field you want to copy is the _source_, and the name of the copy is the _destination_. In `schema.xml`, it's very simple to make copies of fields:
[source,xml]
----
<copyField source="cat" dest="text" maxChars="30000" />
----
In this example, we want Solr to copy the `cat` field to a field named `text`. Fields are copied before <<understanding-analyzers-tokenizers-and-filters.adoc#,analysis>> is done, meaning you can have two fields with identical original content, but which use different analysis chains and are stored in the index differently.
In the example above, if the `text` destination field has data of its own in the input documents, the contents of the `cat` field will be added as additional values – just as if all of the values had originally been specified by the client. Remember to configure your fields as `multivalued="true"` if they will ultimately get multiple values (either from a multivalued source or from multiple `copyField` directives).
A common usage for this functionality is to create a single "search" field that will serve as the default query field when users or clients do not specify a field to query. For example, `title`, `author`, `keywords`, and `body` may all be fields that should be searched by default, with copy field rules for each field to copy to a `catchall` field (for example, it could be named anything). Later you can set a rule in `solrconfig.xml` to search the `catchall` field by default. One caveat to this is your index will grow when using copy fields. However, whether this becomes problematic for you and the final size will depend on the number of fields being copied, the number of destination fields being copied to, the analysis in use, and the available disk space.
The `maxChars` parameter, an `int` parameter, establishes an upper limit for the number of characters to be copied from the source value when constructing the value added to the destination field. This limit is useful for situations in which you want to copy some data from the source field, but also control the size of index files.
Both the source and the destination of `copyField` can contain either leading or trailing asterisks, which will match anything. For example, the following line will copy the contents of all incoming fields that match the wildcard pattern `*_t` to the text field.:
[source,xml]
----
<copyField source="*_t" dest="text" maxChars="25000" />
----
[IMPORTANT]
====
The `copyField` command can use a wildcard (*) character in the `dest` parameter only if the `source` parameter contains one as well. `copyField` uses the matching glob from the source field for the `dest` field name into which the source content is copied.
====
Copying is done at the stream source level and no copy feeds into another copy. This means that copy fields cannot be chained i.e., _you cannot_ copy from `here` to `there` and then from `there` to `elsewhere`. However, the same source field can be copied to multiple destination fields:
[source,xml]
----
<copyField source="here" dest="there"/>
<copyField source="here" dest="elsewhere"/>
----