A MongoDB-inspired query language interface for Apache CouchDB.
Mango provides a single HTTP API endpoint that accepts JSON bodies via HTTP POST. These bodies provide a set of instructions that will be handled with the results being returned to the client in the same order as they were specified. The general principle of this API is to be simple to implement on the client side while providing users a more natural conversion to Apache CouchDB than would otherwise exist using the standard RESTful HTTP interface that already exists.
The general API exposes a set of actions that are similar to what MongoDB exposes (although not all of MongoDB's API is supported). These are meant to be loosely and obviously inspired by MongoDB but without too much attention to maintaining the exact behavior.
Each action is specified as a JSON object with a number of keys that affect the behavior. Each action object has at least one field named “action” which must have a string value indicating the action to be performed. For each action there are zero or more fields that will affect behavior. Some of these fields are required and some are optional.
For convenience, the HTTP API will accept a JSON body that is either a single JSON object which specifies a single action or a JSON array that specifies a list of actions that will then be invoked serially. While multiple commands can be batched into a single HTTP request, there are no guarantees about atomicity or isolation for a batch of commands.
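As a rough sketch of the two accepted body shapes, the snippet below builds a single action object and a batch array with Python's json module. The "action" key is the one required above; the action names "insert" and "find" and the "docs" and "selector" keys follow the sections on those actions, and the document contents are made up for illustration.

```python
import json

# A single action: one JSON object with the required "action" key.
single = {"action": "insert", "docs": [{"_id": "Paul", "location": "Omaha"}]}

# A batch: a JSON array of action objects, invoked serially by the server.
# No atomicity or isolation is guaranteed across the batch.
batch = [
    single,
    {"action": "find", "selector": {"_id": "Paul"}},
]

# Either shape is serialized the same way before being sent as the POST body.
single_body = json.dumps(single)
batch_body = json.dumps(batch)
```

Results come back in the same order as the actions in the array, so a client can pair each result with its action by position.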
Query can be enabled by setting the following config:
rpc:multicall(config, set, ["native_query_servers", "query", "{mango_native_proc, start_link, []}"]).
This API adds a single URI endpoint to the existing CouchDB HTTP API. Creating databases, authentication, Map/Reduce views, etc. are all still supported exactly as currently documented. No existing behavior is changed.
The endpoint added is for the URL pattern /dbname/_query and has the following characteristics:
The only HTTP method supported is POST.
The request Content-Type must be application/json.
The response status code will be 200, 4XX, or 5XX.
The response Content-Type will be application/json.
The response Transfer-Encoding will be chunked.
This is intended to be a significantly simpler use of HTTP than the current APIs. This is motivated by the fact that this entire API is aimed at customers who are not as savvy with HTTP or non-relational document stores. Once a customer is comfortable using this API we hope to expose any other “power features” through the existing HTTP API and its adherence to HTTP semantics.
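A minimal sketch of a client request that follows the characteristics listed above, using only Python's standard library. The helper name build_query_request, the base URL http://localhost:5984, and the database name mydb are assumptions made for illustration. The request is built but not sent.

```python
import json
import urllib.request

def build_query_request(base_url, dbname, actions):
    """Build (but do not send) a POST request for the /dbname/_query endpoint.

    `actions` is either one action object or a list of action objects.
    """
    body = json.dumps(actions).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{dbname}/_query",
        data=body,
        headers={"Content-Type": "application/json"},  # required by the endpoint
        method="POST",                                 # the only supported method
    )

# Hypothetical server address and database name.
req = build_query_request(
    "http://localhost:5984",
    "mydb",
    {"action": "find", "selector": {"_id": "Paul"}},
)
```

Sending it would be a matter of passing req to urllib.request.urlopen and decoding the JSON response body.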
This is a list of supported actions that Mango understands. For the time being it is limited to the four normal CRUD actions plus one meta action to create indices on the database.
Insert a document or documents into the database.
Keys:
If the provided document or documents do not contain an “_id” field one will be added using an automatically generated UUID.
It is more performant to specify multiple documents in the “docs” field than it is to specify multiple independent insert actions. Each insert action is submitted as a single bulk update (i.e., _bulk_docs in CouchDB terminology). This, however, does not make any guarantees on the isolation or atomicity of the bulk operation. It is merely a performance benefit.
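To make the performance note concrete, here is a small sketch contrasting one insert action that carries every document with one insert action per document. The helper build_insert is hypothetical, and wrapping a lone document in a list is an assumption of this sketch rather than a documented requirement.

```python
def build_insert(docs):
    """Wrap one or more documents in a single insert action."""
    if isinstance(docs, dict):
        docs = [docs]  # assumption: a lone document is sent as a one-element list
    return {"action": "insert", "docs": list(docs)}

docs = [
    {"_id": "Paul", "location": "Omaha"},
    {"location": "Boston"},  # no "_id": the server adds a generated UUID
]

# Preferred: one action, so all documents go through one bulk update.
one_action = build_insert(docs)

# Slower: one action per document, so one bulk update each.
many_actions = [build_insert(doc) for doc in docs]
```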
Retrieve documents from the database.
Keys:
r as the read quorum. This is obviously less performant than using the document local to the index.
The important thing to note about the find command is that it must execute over a generated index. If a selector is provided that cannot be satisfied using an existing index, the list of basic indices that could be used will be returned.
For the most part, indices are generated in response to the “create_index” action (described below) although there are two special indices that can be used as well. The “_id” is automatically indexed and is similar to every other index. There is also a special “_seq” index to retrieve documents in the order of their update sequence.
It's also quite possible to generate a query that can't be satisfied by any index. In this case an error will be returned stating that fact. Generally speaking the easiest way to stumble onto this is to attempt to OR two separate fields, which would require a complete table scan. In the future I expect to support these more complicated queries using an extended indexing API (which deviates from the current MongoDB model a bit).
Update an existing document in the database.
Keys:
Updates are fairly straightforward other than to mention that the selector (like find) must be satisfiable using an existing index.
On the update field, if the provided JSON object has one or more update operators (described below) then the operation is applied onto the existing document (if one exists); otherwise the entire contents are replaced with exactly the value of the update field.
Remove a document from the database.
Keys:
Deletes behave quite similarly to updates except that they attempt to remove documents from the database. It's important to note that if a document has conflicts it may “appear” that deletes aren't having an effect. This is because the delete operation by default only removes a single revision. Specify "force":true if you would like to attempt to delete all live revisions.
If you wish to delete a specific revision of the document, you can specify it in the selector using the special “_rev” field.
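The delete options described above could be assembled as in the sketch below. The "selector", "force", and "_rev" names come from the text; the action name "delete", the helper build_delete, and the revision string "1-abc" are assumptions for illustration only.

```python
def build_delete(selector, rev=None, force=False):
    """Build a delete action, optionally pinned to one revision or forced."""
    selector = dict(selector)       # copy so the caller's dict is not mutated
    if rev is not None:
        selector["_rev"] = rev      # target one specific revision
    action = {"action": "delete", "selector": selector}
    if force:
        action["force"] = True      # attempt to delete all live revisions
    return action

default_delete = build_delete({"_id": "Paul"})
pinned_delete = build_delete({"_id": "Paul"}, rev="1-abc")  # hypothetical rev
forced_delete = build_delete({"_id": "Paul"}, force=True)
```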
Create an index on the database.
Keys:
missing_is_null adds an entry to the index for the document with a value of null
Any operation that needs to locate a document in the database requires an index that can be used to locate it. By default the only two indices that exist are for the document “_id” and the special “_seq” index.
Indices are created in the background. If you attempt to create an index on a large database and then immediately utilize it, the request may block for a considerable amount of time before the request completes.
Indices can specify multiple fields to index simultaneously. This is roughly analogous to a compound index in SQL with the corresponding tradeoffs. For instance, an index may contain the (ordered set of) fields “foo”, “bar”, and “baz”. If a selector specifying only “bar” is received, it cannot be answered. However, if a selector specifying “foo” and “bar” is received, it can be answered more efficiently than if “foo” and “bar” were only indexed independently.
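The “foo”, “bar”, “baz” example can be read as a leading-prefix rule, which the sketch below encodes. This is a simplification assumed for illustration: the function index_can_answer is hypothetical and the actual index selection logic may consider more than field names.

```python
def index_can_answer(index_fields, selector_fields):
    """Return True if the selector's fields are exactly a leading prefix
    of the (ordered) index fields."""
    wanted = set(selector_fields)
    if not wanted:
        return False
    prefix = index_fields[: len(wanted)]
    return set(prefix) == wanted

index = ["foo", "bar", "baz"]

bar_only = index_can_answer(index, ["bar"])            # False: "bar" is not a leading field
foo_and_bar = index_can_answer(index, ["foo", "bar"])  # True: matches the first two fields
```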
NB: while the index allows the ability to specify sort directions these are currently not supported. The sort direction must currently be specified as “asc” in the JSON. [INTERNAL]: This will require that we patch the view engine as well as the cluster coordinators in Fabric to follow the specified sort orders. The concepts are straightforward but the implementation may need some thought to fit into the current shape of things.
List the indexes that exist in a given database.
Keys:
Delete the specified index from the database.
Keys:
list_indexes action
Indexes require resources to maintain. If you find that an index is no longer necessary then it can be beneficial to remove it from the database.
Shows debugging information for a given selector.
Keys:
This is a useful debugging utility that will show how a given selector is normalized before execution as well as information on what indexes could be used to satisfy it.
If "extended": true is included then the list of existing indices that could be used for this selector is also returned.
This API uses a few defined JSON structures for various operations. Here we'll describe each in detail.
The Mango query language is expressed as a JSON object describing documents of interest. Within this structure it is also possible to express conditional logic using specially named fields. This is inspired by and intended to maintain a fairly close parity to the existing MongoDB behavior.
As an example, the simplest selector for Mango might look like this:
{"_id": "Paul"}
This would match the document named “Paul” (if one exists). Extending this example using other fields might look like this:
{"_id": "Paul", "location": "Boston"}
This would match a document named “Paul” AND having a “location” value of “Boston”. Seeing as though I'm sitting in my basement in Omaha, this is unlikely.
There are two special syntax elements for the object keys in a selector. The first is that the period (full stop, or simply .) character denotes subfields in a document. For instance, here are two equivalent examples:
{"location": {"city": "Omaha"}}
{"location.city": "Omaha"}
If the object's key itself contains a period, the period can be escaped with a backslash, e.g.:
{"location\\.city": "Omaha"}
Note that the double backslash here is necessary to encode an actual single backslash.
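A small sketch of the dotted-key convention: it converts a nested shape into dotted keys and escapes any period that is part of a literal field name. The function to_dotted is hypothetical, and it ignores operators entirely, so it only illustrates the key syntax.

```python
import json

def to_dotted(shape, prefix=""):
    """Flatten nested objects into dotted keys, escaping literal periods."""
    out = {}
    for name, value in shape.items():
        escaped = name.replace(".", "\\.")  # one real backslash before the period
        path = f"{prefix}.{escaped}" if prefix else escaped
        if isinstance(value, dict):
            out.update(to_dotted(value, path))
        else:
            out[path] = value
    return out

# Nested form becomes the equivalent dotted form.
nested = to_dotted({"location": {"city": "Omaha"}})

# A field literally named "location.city" gets its period escaped.
literal = to_dotted({"location.city": "Omaha"})

# JSON encoding doubles the backslash, giving {"location\\.city": "Omaha"}.
literal_json = json.dumps(literal)
```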
The second important syntax element is the use of a dollar sign ($) prefix to denote operators. For example:
{"age": {"$gt": 21}}
In this example, we have created the boolean expression age > 21.
There are two core types of operators in the selector syntax: combination operators and condition operators. In general, combination operators contain groups of condition operators. We'll describe the list of each below.
For the most part every operator must be of the form {"$operator": argument}, though there are two implicit operators for selectors.
First, any JSON object that is not the argument to a condition operator is an implicit $and operator on each field. For instance, these two examples are identical:
{"foo": "bar", "baz": true}
{"$and": [{"foo": {"$eq": "bar"}}, {"baz": {"$eq": true}}]}
And as shown, any field that contains a JSON value that has no operators in it is an equality condition. For instance, these are equivalent:
{"foo": "bar"}
{"foo": {"$eq": "bar"}}
And to be clear, these are also equivalent:
{"foo": {"bar": "baz"}}
{"foo": {"$eq": {"bar": "baz"}}}
Although, the previous example would actually be normalized internally to this:
{"foo.bar": {"$eq": "baz"}}
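The implicit $and, implicit $eq, and dotted-key normalization described above can be sketched in a few lines of Python. This is an illustration of the rules as stated here, not Mango's actual normalizer: the function names are hypothetical, and explicit combination operators such as a top-level "$and" key are deliberately rejected rather than handled.

```python
def _conditions(selector, prefix=""):
    """Return a list of {field: {"$op": arg}} conditions for a selector."""
    conditions = []
    for key, value in selector.items():
        if key.startswith("$"):
            raise ValueError(f"operator {key!r} is not handled by this sketch")
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict) and value:
            if all(k.startswith("$") for k in value):
                conditions.append({path: value})              # explicit operators
            else:
                conditions.extend(_conditions(value, path))   # subfields
        else:
            conditions.append({path: {"$eq": value}})         # implicit $eq
    return conditions

def normalize_selector(selector):
    """Make the implicit $and and $eq operators explicit."""
    conditions = _conditions(selector)
    return conditions[0] if len(conditions) == 1 else {"$and": conditions}

two_fields = normalize_selector({"foo": "bar", "baz": True})
nested = normalize_selector({"foo": {"bar": "baz"}})
explicit = normalize_selector({"age": {"$gt": 21}})
```

Here two_fields becomes an explicit $and of two $eq conditions, nested becomes a single condition on "foo.bar", and explicit is returned unchanged.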
These operators are responsible for combining groups of condition operators. Most familiar are the standard boolean operators plus a few extra for working with JSON arrays.
Each of the combining operators takes a single argument that is either a condition operator or an array of condition operators.
The list of combining operators:
Condition operators are specified on a per field basis and apply to the value indexed for that field. For instance, the basic “$eq” operator matches when the indexed field is equal to its argument. There is currently support for the basic equality and inequality operators as well as a number of meta operators. Some of these operators will accept any JSON argument while some require a specific JSON formatted argument. Each is noted below.
The list of conditional arguments:
(In)equality operators
Object related operators
Array related operators
Misc related operators
Need to describe the syntax for update operators.
The sort syntax is a basic array of field name and direction pairs. It looks like this:
[{field1: dir1} | ...]
Where field1 can be any field (dotted notation is available for sub-document fields) and dir1 can be “asc” or “desc”.
Note that it is highly recommended that you specify a single key per object in your sort ordering so that the order is not dependent on the combination of JSON libraries between your application and the internals of Mango's indexing engine.
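One way to follow the single-key-per-object recommendation is to build the sort array from an ordered list of pairs, as in this sketch. The helper build_sort is hypothetical, and the field names are examples only.

```python
def build_sort(pairs):
    """Turn ordered (field, direction) pairs into a sort array
    with exactly one key per object."""
    sort = []
    for field, direction in pairs:
        if direction not in ("asc", "desc"):
            raise ValueError(f"unknown sort direction: {direction!r}")
        sort.append({field: direction})
    return sort

# Dotted notation addresses a sub-document field.
sort = build_sort([("location.city", "asc"), ("age", "asc")])
```

Because each object holds one key, the ordering is carried by the array itself and does not depend on how any JSON library orders object keys.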
When retrieving documents from the database you can specify that only a subset of the fields are returned. This allows you to limit your results strictly to the parts of the document that are interesting for the local application logic. The fields returned are specified as an array. Unlike MongoDB, only the fields specified are included; there is no automatic inclusion of the “_id” or other metadata fields when a field list is included.
A trivial example:
["foo", "bar", "baz"]
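The effect of a field list can be pictured with the client-side sketch below. It is an illustration only: the function project is hypothetical, it handles top-level fields only, and dropping fields that are absent from the document is an assumption of this sketch.

```python
def project(doc, fields):
    """Keep only the listed top-level fields; nothing is added implicitly."""
    return {name: doc[name] for name in fields if name in doc}

doc = {"_id": "Paul", "foo": 1, "bar": 2, "qux": 3}

# "_id" is not in the field list, so it is not returned.
projected = project(doc, ["foo", "bar", "baz"])
```

To get “_id” back, it has to be named in the field list like any other field.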
Short summary until the full documentation can be brought over.
Issue a query.
Request body is a JSON object that has the selector and the various options like limit/skip etc. Or we could post the selector and put the other options into the query string. Though I'd probably prefer to have it all in the body for consistency.
Response is streamed out like a view.
Request body contains the index definition.
Response body is empty and the result is returned as the status code (200 OK -> created, 3something for exists).
Request body is empty.
Response body is all of the indexes that are available for use by find.
Remove the specified index.
Request body is empty.
Response body is empty. The status code gives enough information.