= Working with Dates
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.
== Date Formatting
Solr's date fields (`DatePointField`, `DateRangeField` and the deprecated `TrieDateField`) represent a date as a point in time with millisecond precision. The format used is a restricted form of the canonical representation of dateTime in the http://www.w3.org/TR/xmlschema-2/#dateTime[XML Schema specification] – a restricted subset of https://en.wikipedia.org/wiki/ISO_8601[ISO-8601]. For those familiar with Java date handling, Solr uses {java-javadocs}java/time/format/DateTimeFormatter.html#ISO_INSTANT[DateTimeFormatter.ISO_INSTANT] for formatting, and also for parsing, with some leniency.
`YYYY-MM-DDThh:mm:ssZ`
* `YYYY` is the year.
* `MM` is the month.
* `DD` is the day of the month.
* `hh` is the hour of the day as on a 24-hour clock.
* `mm` is minutes.
* `ss` is seconds.
* `Z` is a literal 'Z' character indicating that this string representation of the date is in UTC.
Note that no time zone can be specified; the string representation of a date is always expressed in Coordinated Universal Time (UTC). Here is an example value:
`1972-05-20T17:33:18Z`
You can optionally include fractional seconds if you wish, although any precision beyond milliseconds will be ignored. Here are example values with sub-seconds:
* `1972-05-20T17:33:18.772Z`
* `1972-05-20T17:33:18.77Z`
* `1972-05-20T17:33:18.7Z`
There must be a leading `'-'` for dates prior to year 0000, and Solr will format dates with a leading `'+'` for years after 9999. Year 0000 is considered year 1 BC; there is no such thing as year 0 AD or BC.
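As an illustration of the format described above, the following Python sketch converts a timezone-aware `datetime` to UTC and renders it in this form, dropping anything finer than milliseconds. The helper name `to_solr_date` is hypothetical and not part of Solr, and because Python's `datetime` only covers years 1 through 9999, the leading `-` and `+` cases are out of scope here.

```python
from datetime import datetime, timedelta, timezone


def to_solr_date(dt: datetime) -> str:
    """Render an aware datetime as YYYY-MM-DDThh:mm:ss[.SSS]Z in UTC.

    Hypothetical helper for illustration; it is not a Solr API.
    """
    if dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    utc = dt.astimezone(timezone.utc)
    # Build the string field by field so the year is always zero-padded.
    base = (
        f"{utc.year:04d}-{utc.month:02d}-{utc.day:02d}"
        f"T{utc.hour:02d}:{utc.minute:02d}:{utc.second:02d}"
    )
    millis = utc.microsecond // 1000  # precision beyond milliseconds is dropped
    return f"{base}.{millis:03d}Z" if millis else f"{base}Z"


# A local time at UTC-7 is converted to UTC before formatting.
example = to_solr_date(
    datetime(1972, 5, 20, 10, 33, 18, 772000, tzinfo=timezone(timedelta(hours=-7)))
)
```

The fractional part is omitted when it is zero, since fractional seconds are optional in the format.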
.Query escaping may be required
[WARNING]
====
As you can see, the date format includes colon characters separating the hours, minutes, and seconds. Because the colon is a special character to Solr's most common query parsers, escaping is sometimes required, depending on exactly what you are trying to do.
This is normally an invalid query: `datefield:1972-05-20T17:33:18.772Z`
These are valid queries: +
`datefield:1972-05-20T17\:33\:18.772Z` +
`datefield:"1972-05-20T17:33:18.772Z"` +
`datefield:[1972-05-20T17:33:18.772Z TO *]`
====
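The two escaping styles shown in the warning above can be sketched in client code as follows. This is a minimal Python illustration with hypothetical helper names (`escape_colons`, `quoted_term_query`); it only deals with the colon and the double quote, and does not attempt to cover every character that is special to Solr's query parsers.

```python
def escape_colons(date_value: str) -> str:
    """Backslash-escape each colon so the value can be used as a bare term."""
    return date_value.replace(":", "\\:")


def quoted_term_query(field: str, date_value: str) -> str:
    """Build field:"value", the quoted alternative to backslash escaping."""
    if '"' in date_value:
        raise ValueError("date values are not expected to contain quotes")
    return f'{field}:"{date_value}"'


escaped = "datefield:" + escape_colons("1972-05-20T17:33:18.772Z")
quoted = quoted_term_query("datefield", "1972-05-20T17:33:18.772Z")
```

Range queries such as `[1972-05-20T17:33:18.772Z TO *]` need neither treatment, as the third valid example shows.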
=== Date Range Formatting
Solr's `DateRangeField` supports the same point-in-time date syntax described above (with _date math_ described below), plus additional syntax for expressing date ranges. One class of examples is truncated dates, which represent the entire date span to the precision indicated. The other class uses the range syntax (`[ TO ]`). Here are some examples:
* `2000-11` – The entire month of November, 2000.
* `1605-11-05` – The Fifth of November.
* `2000-11-05T13` – Likewise but for an hour of the day (1300 to before 1400, i.e., 1pm to 2pm).
* `-0009` – The year 10 BC. Year `0000` is considered 1 BC, so negative years are offset by one from their BC equivalents.
* `[2000-11-01 TO 2014-12-01]` – The specified date range at a day resolution.
* `[2014 TO 2014-12-01]` – From the start of 2014 till the end of the first day of December.
* `[* TO 2014-12-01]` – From the earliest representable time through the end of the day on 2014-12-01.
Limitations: The range syntax doesn't support embedded date math. If you specify a date instance supported by DatePointField with date math truncating it, like `NOW/DAY`, you still get the first millisecond of that day, not the entire day's range. Exclusive ranges (using `{` & `}`) work in _queries_ but not for _indexing_ ranges.
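To make the idea of a truncated date concrete, the sketch below computes the span that such a value covers, as an inclusive start and an exclusive end in UTC. It is a Python illustration only, not Solr's implementation: the function name `truncated_span` is hypothetical, and it handles just four-digit positive years at year, month, day, or hour precision.

```python
from datetime import datetime, timedelta, timezone

# Formats are tried in order; each one implies the precision of the span.
_FORMATS = [
    ("%Y", "year"),
    ("%Y-%m", "month"),
    ("%Y-%m-%d", "day"),
    ("%Y-%m-%dT%H", "hour"),
]


def truncated_span(value: str):
    """Return (start, end_exclusive) in UTC for a truncated date string."""
    for fmt, unit in _FORMATS:
        try:
            start = datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue  # wrong precision, try the next format
        if unit == "year":
            end = start.replace(year=start.year + 1)
        elif unit == "month":
            if start.month == 12:
                end = start.replace(year=start.year + 1, month=1)
            else:
                end = start.replace(month=start.month + 1)
        elif unit == "day":
            end = start + timedelta(days=1)
        else:
            end = start + timedelta(hours=1)
        return start, end
    raise ValueError(f"unsupported truncated date: {value!r}")


november_2000 = truncated_span("2000-11")
one_hour = truncated_span("2000-11-05T13")
```

Here `2000-11` covers everything from the start of 1 November 2000 up to, but not including, the start of 1 December 2000.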
== Date Math
Solr's date field types also support _date math_ expressions, which make it easy to create times relative to fixed moments in time, including the current time, which can be represented using the special value `NOW`.
=== Date Math Syntax
Date math expressions consist of either adding some quantity of time in a specified unit, or rounding the current time by a specified unit. Expressions can be chained and are evaluated left to right.
For example, this represents a point in time two months from now:
`NOW+2MONTHS`
This is one day ago:
`NOW-1DAY`
A slash is used to indicate rounding. This represents the beginning of the current hour:
`NOW/HOUR`
The following example computes (with millisecond precision) the point in time six months and three days into the future and then rounds that time to the beginning of that day:
`NOW+6MONTHS+3DAYS/DAY`
Note that while date math is most commonly used relative to `NOW`, it can be applied to any fixed moment in time as well:
`1972-05-20T17:33:18.772Z+6MONTHS+3DAYS/DAY`
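The left-to-right evaluation of chained expressions can be illustrated with a small evaluator. The Python sketch below is not Solr's parser: the name `eval_date_math` is hypothetical, only the units from `YEAR` down to `SECOND` are handled, everything is computed in UTC, and clamping the day of month when adding months (for example 31 January plus one month) is an assumption of this sketch.

```python
import calendar
import re
from datetime import datetime, timedelta, timezone

_UNITS = ["YEAR", "MONTH", "DAY", "HOUR", "MINUTE", "SECOND"]
_TOKEN = re.compile(r"([+-]\d+|/)([A-Z]+)")
# Fields reset by rounding, ordered to line up with _UNITS.
_RESETS = [("month", 1), ("day", 1), ("hour", 0), ("minute", 0),
           ("second", 0), ("microsecond", 0)]


def _unit(name):
    name = name[:-1] if name.endswith("S") else name  # DAYS -> DAY
    if name not in _UNITS:
        raise ValueError(f"unsupported unit: {name}")
    return name


def _add_months(dt, n):
    year, month0 = divmod(dt.year * 12 + dt.month - 1 + n, 12)
    day = min(dt.day, calendar.monthrange(year, month0 + 1)[1])  # assumed clamping
    return dt.replace(year=year, month=month0 + 1, day=day)


def _add(dt, n, unit):
    if unit == "YEAR":
        return _add_months(dt, 12 * n)
    if unit == "MONTH":
        return _add_months(dt, n)
    return dt + timedelta(**{unit.lower() + "s": n})


def _round(dt, unit):
    # Reset every field finer than the requested unit.
    return dt.replace(**dict(_RESETS[_UNITS.index(unit):]))


def eval_date_math(expr, now):
    """Evaluate a date math expression against an aware UTC `now`."""
    if expr.startswith("NOW"):
        current, rest = now, expr[3:]
    else:
        z = expr.index("Z")  # the fixed moment ends at the literal 'Z'
        current = datetime.fromisoformat(expr[:z]).replace(tzinfo=timezone.utc)
        rest = expr[z + 1:]
    pos = 0
    while pos < len(rest):  # operations are applied strictly left to right
        m = _TOKEN.match(rest, pos)
        if not m:
            raise ValueError(f"cannot parse: {rest[pos:]!r}")
        op, unit = m.group(1), _unit(m.group(2))
        current = _round(current, unit) if op == "/" else _add(current, int(op), unit)
        pos = m.end()
    return current


now = datetime(2013, 11, 14, 10, 30, 15, 500000, tzinfo=timezone.utc)
rounded = eval_date_math("1972-05-20T17:33:18.772Z+6MONTHS+3DAYS/DAY", now)
```

Because rounding is just another step in the chain, `NOW+6MONTHS+3DAYS/DAY` rounds only after both additions have been applied.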
=== Request Parameters That Affect Date Math
==== NOW
The `NOW` parameter is used internally by Solr to ensure consistent date math expression parsing across multiple nodes in a distributed request. However, it can also be specified explicitly to instruct Solr to use an arbitrary moment in time (past or future) in all situations where the special value of `NOW` would affect date math expressions.
It must be specified as a (long-valued) number of milliseconds since the epoch.
Example:
`q=solr&fq=start_date:[* TO NOW]&NOW=1384387200000`
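A client that wants to pin `NOW` needs the epoch-millisecond value for its chosen moment. The Python sketch below shows one way to compute it and attach it to a request; the helper name `epoch_millis` is hypothetical, and the query parameters simply mirror the example above.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_millis(dt: datetime) -> int:
    """Whole milliseconds since the epoch for an aware datetime.

    Integer division of timedeltas avoids floating-point rounding.
    A naive datetime raises TypeError on the subtraction.
    """
    return (dt - _EPOCH) // timedelta(milliseconds=1)


pinned_now = epoch_millis(datetime(2013, 11, 14, tzinfo=timezone.utc))

# urlencode percent-encodes the brackets, colon, and spaces in the filter query.
query_string = urlencode({
    "q": "solr",
    "fq": "start_date:[* TO NOW]",
    "NOW": pinned_now,
})
```

The value `1384387200000` used in the example corresponds to 2013-11-14T00:00:00Z.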
==== TZ
By default, all date math expressions are evaluated relative to the UTC time zone, but the `TZ` parameter can be specified to override this behaviour by forcing all date-based addition and rounding to be relative to the specified {java-javadocs}java/util/TimeZone.html[time zone].
For example, the following request will use range faceting to facet over the current month, "per day", relative to UTC:
[source,text]
----
http://localhost:8983/solr/my_collection/select?q=*:*&facet.range=my_date_field&facet=true&facet.range.start=NOW/MONTH&facet.range.end=NOW/MONTH%2B1MONTH&facet.range.gap=%2B1DAY&wt=xml
----
[source,xml]
----
<int name="2013-11-01T00:00:00Z">0</int>
<int name="2013-11-02T00:00:00Z">0</int>
<int name="2013-11-03T00:00:00Z">0</int>
<int name="2013-11-04T00:00:00Z">0</int>
<int name="2013-11-05T00:00:00Z">0</int>
<int name="2013-11-06T00:00:00Z">0</int>
<int name="2013-11-07T00:00:00Z">0</int>
...
----
In this example, by contrast, the "days" will be computed relative to the specified time zone, including any applicable Daylight Saving Time adjustments:
[source,text]
----
http://localhost:8983/solr/my_collection/select?q=*:*&facet.range=my_date_field&facet=true&facet.range.start=NOW/MONTH&facet.range.end=NOW/MONTH%2B1MONTH&facet.range.gap=%2B1DAY&TZ=America/Los_Angeles&wt=xml
----
[source,xml]
----
<int name="2013-11-01T07:00:00Z">0</int>
<int name="2013-11-02T07:00:00Z">0</int>
<int name="2013-11-03T07:00:00Z">0</int>
<int name="2013-11-04T08:00:00Z">0</int>
<int name="2013-11-05T08:00:00Z">0</int>
<int name="2013-11-06T08:00:00Z">0</int>
<int name="2013-11-07T08:00:00Z">0</int>
...
----
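The shift from `07:00:00Z` to `08:00:00Z` in the output above follows from the end of Daylight Saving Time in Los Angeles on 3 November 2013. The Python sketch below reproduces those day boundaries outside Solr; the function name `local_day_starts_utc` is hypothetical, and the code assumes Python 3.9+ with the IANA time zone database available to `zoneinfo`.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo


def local_day_starts_utc(year, month, first_day, count, tz_name):
    """List the UTC instants at which consecutive local days begin."""
    tz = ZoneInfo(tz_name)
    starts = []
    for offset in range(count):
        # Adding a timedelta to an aware datetime moves the wall clock,
        # so each value is local midnight with that day's UTC offset.
        local_midnight = datetime(year, month, first_day, tzinfo=tz) + timedelta(days=offset)
        as_utc = local_midnight.astimezone(timezone.utc)
        starts.append(as_utc.strftime("%Y-%m-%dT%H:%M:%SZ"))
    return starts


boundaries = local_day_starts_utc(2013, 11, 1, 4, "America/Los_Angeles")
```

Midnight on 3 November is still in daylight time (UTC-7) because the clocks change later that morning, which is why the first `08:00:00Z` boundary appears on 4 November.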
== More DateRangeField Details
`DateRangeField` is almost a drop-in replacement for places where `DatePointField` is used. The only difference is that Solr's XML or SolrJ response formats will expose the stored data as a String instead of a Date. The underlying index data for this field will be a bit larger. Queries that align to units of time of a second or larger should be faster than with `TrieDateField`, especially if they are in UTC.
The main point of `DateRangeField`, as its name suggests, is to allow indexing date ranges. To do that, simply supply strings in the format shown above. It also supports specifying three different relational predicates between the indexed data and the query range:
* `Intersects` (default)
* `Contains`
* `Within`
You can specify the predicate by querying using the `op` local-params parameter like so:
[source,text]
----
fq={!field f=dateRange op=Contains}[2013 TO 2018]
----
Unlike most local parameters, `op` is actually _not_ defined by any query parser (`field`); it is defined by the field type, in this case `DateRangeField`. In the above example, it would find documents with indexed ranges that _contain_ (or equal) the range 2013 through 2018. Multi-valued overlapping indexed ranges in a document are effectively coalesced.
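The three predicates can be pictured as simple interval comparisons. The Python sketch below is only a mental model, not how Solr evaluates them: the function name `matches` is hypothetical, ranges are reduced to closed intervals of whole years, and the meanings of `Intersects` and `Within` are assumed from the usual sense of those names, since only `Contains` is spelled out above.

```python
def matches(indexed, query, op="Intersects"):
    """Compare one indexed (start, end) range against a query range.

    Both ranges are treated as closed intervals for illustration.
    """
    i_start, i_end = indexed
    q_start, q_end = query
    if op == "Intersects":
        # The two ranges share at least one point.
        return i_start <= q_end and q_start <= i_end
    if op == "Contains":
        # The indexed range contains (or equals) the query range.
        return i_start <= q_start and q_end <= i_end
    if op == "Within":
        # The indexed range lies entirely inside the query range.
        return q_start <= i_start and i_end <= q_end
    raise ValueError(f"unknown predicate: {op}")


query = (2013, 2018)
contains_hit = matches((2010, 2020), query, op="Contains")
within_hit = matches((2014, 2016), query, op="Within")
```

Under this model a document indexed with 2010 through 2020 matches the `Contains` query above, while one indexed with 2014 through 2016 matches only `Within` and `Intersects`.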
For a `DateRangeField` example use case, see https://cwiki.apache.org/confluence/display/solr/DateRangeField[Solr's community wiki].