commit | c4da61c8eb98cedd3cf7a28c293cb1f6d3ec8571 | [log] [tgz] |
---|---|---|
author | Benjamin Anderson <b@banjiewen.net> | Wed Oct 29 12:52:30 2014 -0700 |
committer | Eric Avdey <eiri@eiri.ca> | Thu Nov 24 13:55:18 2016 -0400 |
tree | 567dec1b0f2135b1b8ae662f9aba3a343496bce7 | |
parent | 252467cb4a27637090b5f9006483f5b7ab551699 [diff] |
Chunk missing revisions before attempting to save on target In cases with pathological documents revision patterns (e.g., 10000 open conflicts and tree depth of 300000 on a single document), attempting to replicate the full revision tree in one batch causes the system to crash by attempting to send an oversized message. We've observed messages of > 4GB in the wild. This patch divides the set of revisions-to-replicate for a single document into chunks of a configurable size, thereby allowing operators to keep the system stable when attempting to replicate these troublesome documents. BugzID: 37676
Mem3 is the node membership application for clustered CouchDB. It is used in CouchDB since version 2.0 and tracks two very important things for the cluster:
Both the nodes and shards are tracked in node-local couch databases. Shards are heavily used, so an ETS cache is also maintained for low-latency lookups. The nodes and shards are synchronized via continuous CouchDB replication, which serves as ‘gossip’ in Dynamo parlance. The shards ETS cache is kept in sync based on membership and database event listeners.
A very important point to make here is that CouchDB does not necessarily divide up each database into equal shards across the nodes of a cluster. For instance, in a 20-node cluster, you may have the need to create a small database with very few documents. For efficiency reasons, you may create your database with Q=4 and keep the default of N=3. This means you only have 12 shards total, so 8 nodes will hold none of the data for this database. Given this feature, we even shard use out across the cluster by altering the ‘start’ node for the database's shards.
Splitting and merging shards is an immature feature of the system, and will require attention in the near-term. We believe we can implement both functions and perform them while the database remains online.
Mem3 requires R13B03 or higher and can be built with rebar, which comes bundled in the repository. Rebar needs to be able to find the couch_db.hrl
header file; one way to accomplish this is to set ERL_LIBS to point to the apps subdirectory of a CouchDB checkout, e.g.
ERL_LIBS="/usr/local/src/couchdb/apps" ./rebar compile