Map/reduce queries, also known as secondary indexes, are one of the most powerful features in PouchDB. However, they can be quite tricky to use, so this guide is designed to dispel some of the mysteries around them.
{% include alert/start.html variant="warning" %}
Many developers make the mistake of using the query() API when the more performant allDocs() API would be a better fit.
{% include alert/end.html %}
{% include anchor.html title="Mappin' and reducin'" hash="mappin-and-reducin" %}
The PouchDB query() API (which corresponds to the _view API in CouchDB) has two modes: temporary queries and persistent queries.
Temporary queries are very slow, and we only recommend them for quick debugging during development. To use a temporary query, you simply pass in a map function:
db.query(function (doc, emit) {
  emit(doc.name);
}, {key: 'foo'}).then(function (result) {
  // found docs with name === 'foo'
}).catch(function (err) {
  // handle any errors
});
In the above example, the result object will contain stubs of documents where the name attribute is equal to 'foo'. To include the document in each row of results, use the include_docs option.
{% include alert/start.html variant="info" %}
The emit pattern is part of the standard CouchDB map/reduce API. What the function basically says is, “for each document, emit doc.name as a key.”
{% include alert/end.html %}
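To make the emit pattern more concrete, here is a minimal in-memory sketch of what a map function conceptually does. The `runMap` helper is hypothetical and is not part of PouchDB; it also sorts keys with plain JavaScript string comparison, which only approximates CouchDB's collation rules.

```javascript
// Hypothetical helper (not part of PouchDB): runs a map function over an
// in-memory array of documents and collects the emitted rows.
function runMap(docs, mapFn) {
  var rows = [];
  docs.forEach(function (doc) {
    // Each emit() call adds one row to the index for the current document.
    function emit(key, value) {
      rows.push({
        id: doc._id,
        key: key === undefined ? null : key,
        value: value === undefined ? null : value
      });
    }
    mapFn(doc, emit);
  });
  // Simplification: plain string comparison instead of CouchDB collation.
  rows.sort(function (a, b) {
    return a.key < b.key ? -1 : a.key > b.key ? 1 : 0;
  });
  return rows;
}

var docs = [
  {_id: 'doc1', name: 'foo'},
  {_id: 'doc2', name: 'bar'},
  {_id: 'doc3', name: 'foo'}
];

// "For each document, emit doc.name as a key."
var rows = runMap(docs, function (doc, emit) {
  emit(doc.name);
});

// Rough equivalent of the {key: 'foo'} option: keep only matching rows.
var matches = rows.filter(function (row) {
  return row.key === 'foo';
});

console.log(matches.map(function (row) { return row.id; }));
// logs the ids doc1 and doc3
```

Note that each row carries only the document's `id`, the emitted `key`, and the emitted `value`, which is why the full document has to be requested separately with include_docs.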
Persistent queries are much faster, and are the intended way to use the query() API in your production apps. To use persistent queries, there are two steps.
First, you create a design document, which describes the map function you would like to use:
// document that tells PouchDB/CouchDB
// to build up an index on doc.name
var ddoc = {
  _id: '_design/my_index',
  views: {
    by_name: {
      map: function (doc) { emit(doc.name); }.toString()
    }
  }
};
// save it
pouch.put(ddoc).then(function () {
  // success!
}).catch(function (err) {
  // some error (maybe a 409, because it already exists?)
});
{% include alert/start.html variant="info" %}
The .toString() at the end of the map function is necessary because JSON cannot represent functions: the map function has to be stored in the design document as a string.
{% include alert/end.html %}
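The following sketch uses only standard JavaScript (no PouchDB calls) to illustrate why the map function has to be converted to a string: `JSON.stringify()` silently drops any property whose value is a function. The variable names are purely illustrative.

```javascript
// A design document whose map is still a real function object.
var withFunction = {
  _id: '_design/my_index',
  views: {
    by_name: {
      map: function (doc) { emit(doc.name); }
    }
  }
};

// The map property disappears when the object is serialized.
console.log(JSON.stringify(withFunction));
// → {"_id":"_design/my_index","views":{"by_name":{}}}

// The same design document with the function converted to its source text.
var withString = {
  _id: '_design/my_index',
  views: {
    by_name: {
      map: function (doc) { emit(doc.name); }.toString()
    }
  }
};

// Now the map function survives a round trip through JSON as a string.
var roundTripped = JSON.parse(JSON.stringify(withString));
console.log(typeof roundTripped.views.by_name.map);
// → string
```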
{% include alert/start.html variant="info" %}
The emit function will be available in scope when the map function is run, so don't pass it in as a parameter.
{% include alert/end.html %}
Then you actually query it, using the name of the design document and the name of the view, separated by a slash:
db.query('my_index/by_name').then(function (res) {
  // got the query results
}).catch(function (err) {
  // some error
});
Note that the first query will be quite slow, because the index isn't built until you query it. To get around this, you can run an empty query to kick off a new build:
db.query('my_index/by_name', {
  limit: 0 // don't return any results
}).then(function (res) {
  // index was built!
}).catch(function (err) {
  // some error
});
After this, your queries will be much faster.
{% include alert/start.html variant="info" %}
CouchDB builds indexes in exactly the same way as PouchDB. So you may want to familiarize yourself with the “stale” option in order to get the best possible performance for your app.
{% include alert/end.html %}
{% include anchor.html title="More about map/reduce" hash="more-about-map-reduce" %}
That was a fairly whirlwind tour of the query() API, so let's get into more detail about how to write your map/reduce functions.
Quick refresher on how indexes work: in relational databases like MySQL and PostgreSQL, you can usually query whatever field you want:
SELECT * FROM pokemon WHERE name = 'Pikachu';
But if you don't want your performance to be terrible, you first add an index:
CREATE INDEX myIndex ON pokemon (name);
The job of the index is to ensure the field is stored in a B-tree within the database, so your queries run in O(log(n)) time instead of O(n) time.
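As a rough illustration of the O(n) versus O(log(n)) difference, the sketch below compares a full scan with a binary search over a sorted array. The sorted array is only a stand-in for a B-tree, and the data and function names are made up for the example.

```javascript
// Build 1,000,000 hypothetical names; zero-padding keeps them in sorted order.
var names = [];
for (var i = 0; i < 1000000; i++) {
  names.push('pokemon' + String(i).padStart(7, '0'));
}

// Without an index: check every entry until a match is found, O(n).
function linearSearch(list, target) {
  var steps = 0;
  for (var j = 0; j < list.length; j++) {
    steps++;
    if (list[j] === target) {
      return {index: j, steps: steps};
    }
  }
  return {index: -1, steps: steps};
}

// With a sorted index: halve the remaining range on each step, O(log(n)).
function binarySearch(sorted, target) {
  var lo = 0;
  var hi = sorted.length - 1;
  var steps = 0;
  while (lo <= hi) {
    steps++;
    var mid = (lo + hi) >>> 1;
    if (sorted[mid] === target) {
      return {index: mid, steps: steps};
    }
    if (sorted[mid] < target) {
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return {index: -1, steps: steps};
}

// Worst case for the scan: the target is the very last entry.
var target = names[names.length - 1];
var scan = linearSearch(names, target);
var indexed = binarySearch(names, target);

console.log(scan.steps);    // 1000000 comparisons
console.log(indexed.steps); // at most 20 comparisons
```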
All of the above is also true in document stores like CouchDB and MongoDB, but conceptually it's a little different. By default, documents are assumed to be schemaless blobs with one primary key (called _id in both Mongo and Couch), and any other keys need to be specified separately. The concepts are largely the same; it's mostly just the vocabulary that's different.
In CouchDB, queries are called map/reduce functions. This is because, like most NoSQL databases, CouchDB is designed to scale well across multiple computers, and to perform efficient query operations in parallel. Basically, the idea is that you divide your query into a map function and a reduce function, each of which may be executed in parallel in a multi-node cluster.
It may sound daunting at first, but in the simplest (and most common) case, you only need the map function. A basic map function might look like this:
function myMapFunction(doc) { emit(doc.name); }
This is functionally equivalent to the SQL index given above. What it essentially says is: “for each document in the database, emit its name as a key.”
And since it's just JavaScript, you're allowed to get as fancy as you want here:
function myMapFunction(doc) {
  if (doc.type === 'pokemon') {
    if (doc.name === 'Pikachu') {
      emit('Pika pi!');
    } else {
      emit(doc.name);
    }
  }
}
Then you can query it:
// find pokemon with name === 'Pika pi!'
pouch.query(myMapFunction, {
  key          : 'Pika pi!',
  include_docs : true
}).then(function (result) {
  // handle result
}).catch(function (err) {
  // handle errors
});

// find the first 5 pokemon whose name starts with 'P'
pouch.query(myMapFunction, {
  startkey     : 'P',
  endkey       : 'P\uffff',
  limit        : 5,
  include_docs : true
}).then(function (result) {
  // handle result
}).catch(function (err) {
  // handle errors
});
{% include alert/start.html variant="info" %}
The pagination options for query() – i.e., startkey/endkey/key/keys/skip/limit/descending – are exactly the same as with allDocs(). For a guide to pagination, read the Bulk operations guide or Pagination strategies with PouchDB.
{% include alert/end.html %}
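The `'P\uffff'` trick in the second query works because `\uffff` is a very high Unicode character, so every string that starts with `'P'` falls between `'P'` and `'P\uffff'`. Here is a small sketch of that range logic over a plain sorted array. The `rangeQuery` helper is hypothetical, and it uses ordinary JavaScript string comparison rather than CouchDB's collation.

```javascript
// Keys as they might appear in an index, already in sorted order.
var keys = ['Bulbasaur', 'Charmander', 'Pidgey', 'Pikachu', 'Psyduck', 'Squirtle'];

// Hypothetical helper: inclusive range scan with an optional limit,
// loosely mirroring the startkey/endkey/limit options.
function rangeQuery(sortedKeys, opts) {
  var result = sortedKeys.filter(function (key) {
    return key >= opts.startkey && key <= opts.endkey;
  });
  return opts.limit === undefined ? result : result.slice(0, opts.limit);
}

// Everything that starts with 'P'.
var startsWithP = rangeQuery(keys, {startkey: 'P', endkey: 'P\uffff'});
console.log(startsWithP);
// logs the three names Pidgey, Pikachu and Psyduck

// Only the first two matches.
var firstTwo = rangeQuery(keys, {startkey: 'P', endkey: 'P\uffff', limit: 2});
console.log(firstTwo);
// logs the two names Pidgey and Pikachu
```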
As for reduce functions, there are a few handy built-ins that do aggregate operations ('_sum', '_count', and '_stats'), and you can typically steer clear of trying to write your own:
// emit the first letter of each pokemon's name
var myMapReduceFun = {
  map: function (doc) {
    emit(doc.name.charAt(0));
  },
  reduce: '_count'
};
// count the pokemon whose names start with 'P'
pouch.query(myMapReduceFun, {
  key: 'P', reduce: true, group: true
}).then(function (result) {
  // handle result
}).catch(function (err) {
  // handle errors
});
If you're adventurous, though, you should check out the CouchDB documentation or the PouchDB documentation for details on reduce functions.
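For intuition about what `'_count'` combined with `group: true` computes, here is a plain JavaScript approximation that counts emitted keys per distinct key. The `countByKey` helper and the sample documents are hypothetical; the real built-in reduce runs inside the database, so treat this only as a sketch of the result.

```javascript
// Map function: emit the first letter of each pokemon's name.
function mapFirstLetter(doc, emit) {
  emit(doc.name.charAt(0));
}

// Hypothetical helper approximating reduce: '_count' with group: true.
// It returns one row per distinct key, holding the number of emits.
function countByKey(docs, mapFn) {
  var counts = {};
  docs.forEach(function (doc) {
    mapFn(doc, function emit(key) {
      counts[key] = (counts[key] || 0) + 1;
    });
  });
  return Object.keys(counts).sort().map(function (key) {
    return {key: key, value: counts[key]};
  });
}

var pokemon = [
  {name: 'Pikachu'},
  {name: 'Pidgey'},
  {name: 'Bulbasaur'},
  {name: 'Psyduck'}
];

var grouped = countByKey(pokemon, mapFirstLetter);
console.log(grouped);
// one row for 'B' with value 1, one row for 'P' with value 3

// Rough equivalent of adding key: 'P' to the query options.
var onlyP = grouped.filter(function (row) { return row.key === 'P'; });
console.log(onlyP[0].value);
// → 3
```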
{% include anchor.html title="PouchDB Find" hash="pouchdb-find" %}
The map/reduce API is complex. Part of this problem will be resolved when the more developer-friendly Cloudant query language is released in CouchDB 2.0, and the equivalent pouchdb-find plugin is finished.
{% include alert/start.html variant="warning" %}
{% markdown %}
pouchdb-find is in beta, but you may find it is already sufficient for simple queries. Eventually it will replace map/reduce as PouchDB's “flagship” query engine.
{% endmarkdown %}
{% include alert/end.html %}
In the meantime, there are a few tricks you can use to avoid unnecessarily complicating your codebase:
Avoid the query() API altogether if you can. You'd be amazed how much you can do with just allDocs(). (In fact, under the hood, the query() API is simply implemented on top of allDocs()!)
{% include anchor.html title="Related API documentation" hash="related-api-documentation" %}
{% include anchor.html title="Next" hash="next" %}
Now that we've learned how to map reduce, map reuse, and map recycle, let's move on to destroy() and compact().