d23025ebd7176f6c307ddf49902cf20b33bd55c4 - couchdb-couch-replicator

commit	d23025ebd7176f6c307ddf49902cf20b33bd55c4
author	Nick Vatamaniuc <vatamane@apache.org>	Fri Feb 03 20:49:32 2017 -0500
committer	ILYA Khlopotov <iilyak@apache.org>	Mon Feb 06 12:18:02 2017 -0800
tree	a8500531e43c65aa09d9387eb6aebaba2c063a84
parent	be0060f3fffc308b7532e6b99355f0e0cdede88e

Allow configuring maximum document ID length during replication

Currently, due to a bug in the HTTP parser and the lack of document ID
length enforcement, large document IDs break replication jobs. Large IDs
pass through the _changes feed and revs diffs, but then fail during the
open_revs GET request. The open_revs request keeps retrying until it
gives up after a long enough time; then the replication task crashes and
restarts, repeating the same pattern. The current effective limit is
around 8k. (The default buffer size is 8192, and if the first line of
the request is larger than that, the request fails.)

(See http://erlang.org/pipermail/erlang-questions/2011-June/059567.html
for more information about the possible failure mechanism).
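
As a rough illustration of the limit described above, the following Python
sketch builds an approximate open_revs request line and checks it against the
8192-byte default. The exact query string, the helper names `request_line` and
`fits_in_buffer`, and the treatment of the boundary as "at most 8192 bytes"
are assumptions made for illustration, not the replicator's actual code.

```python
from urllib.parse import quote

# Default buffer size cited in the commit message.
RECBUF_DEFAULT = 8192


def request_line(db_path, doc_id, revs):
    """Build an illustrative HTTP/1.1 request line for an open_revs GET.

    The query parameters are an assumed approximation of what a
    replicator might send; they are not taken from the CouchDB source.
    """
    open_revs = "[" + ",".join('"%s"' % r for r in revs) + "]"
    query = "revs=true&open_revs=" + quote(open_revs, safe="") + "&latest=true"
    return "GET %s/%s?%s HTTP/1.1\r\n" % (db_path, quote(doc_id, safe=""), query)


def fits_in_buffer(line, bufsize=RECBUF_DEFAULT):
    """Return True if the request line is no longer than the buffer."""
    return len(line.encode("ascii")) <= bufsize


short_line = request_line("/db", "doc-1", ["1-abc"])
long_line = request_line("/db", "a" * 9000, ["1-abc"])
print(fits_in_buffer(short_line), fits_in_buffer(long_line))
```

Because the whole request line counts toward the limit, the usable document
ID length is somewhat below 8192 bytes once the method, path, and query
string are included.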

Bypassing the parser bug by increasing the recbuf size will allow
replication to finish. However, that simply spreads the abnormal document
through the rest of the system, which might not always be desirable.

Also, once long document IDs have been inserted in the source DB, simply
deleting them doesn't work, as they'd still appear in the changes feed.
They'd have to be purged or somehow skipped during the replication step.
This commit helps do the latter.

Operators can configure maximum length via this setting:
```
replicator.max_document_id_length=0
```

The default value is 0, which means no maximum is enforced. This preserves
backwards-compatible behavior.

During replication, if a document's ID exceeds the maximum, that document
is skipped and an error is written to the log:

```
Replicator: document id `aaaaaaaaaaaaaaaaaaaaa...` from source db `http://.../cdyno-0000001/` is too long, ignoring.
```

and the `"doc_write_failures"` statistic is incremented.
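
The skip behavior described above can be sketched roughly as follows. This is
a Python illustration only, not the Erlang implementation in this commit: the
function name `filter_doc_ids`, the `stats` dictionary, the 21-character
preview length, and measuring ID length in UTF-8 bytes are all assumptions.

```python
def filter_doc_ids(doc_ids, max_len, stats, source, log=print):
    """Return the IDs that pass the length check.

    max_len == 0 means no maximum is enforced (the default).
    Skipped IDs are logged and counted in stats["doc_write_failures"].
    """
    kept = []
    for doc_id in doc_ids:
        # Assumption: length is measured in UTF-8 bytes.
        too_long = max_len > 0 and len(doc_id.encode("utf-8")) > max_len
        if too_long:
            # Hypothetical preview length; the real truncation may differ.
            preview = doc_id[:21]
            log("Replicator: document id `%s...` from source db `%s` "
                "is too long, ignoring." % (preview, source))
            stats["doc_write_failures"] += 1
        else:
            kept.append(doc_id)
    return kept


stats = {"doc_write_failures": 0}
kept = filter_doc_ids(["short", "a" * 50], 16, stats, "http://.../db/")
print(kept, stats)
```

With `max_len` set to 0 the check is bypassed entirely, which matches the
backwards-compatible default.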

COUCHDB-3291

2 files changed
