ISSUE #1967: make ledger creation and removal robust to zk connectionloss


Descriptions of the changes in this PR:

The bookkeeper project ZooKeeperClient wrapper for the ZooKeeper client
will resend zk node creations and removals upon reconnect after a
ConnectionLoss event. In the event that the original succeeded, the
resent operation will erroneously return LedgerExistException or
NoSuchLedgerExistsException for creation and removal respectively.

For removal, this patch limits the operation by allowing it to always
succeed if the ledger does not exist in order to make it idempotent.
This appears to be the simplest solution as exclusive removal isn't
important.
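
As a rough illustration of the removal semantics described above, the
sketch below uses a hypothetical in-memory map in place of the real
metadata store. The class and method names are assumptions made for this
example and are not taken from the patch.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory stand-in for the ledger metadata store. */
class IdempotentRemovalSketch {
    private final Map<Long, byte[]> ledgers = new ConcurrentHashMap<>();

    void create(long ledgerId, byte[] metadata) {
        ledgers.put(ledgerId, metadata);
    }

    /**
     * Removes a ledger. A missing ledger is treated as success, so a
     * removal that is resent after a connection loss cannot fail just
     * because the first attempt already went through.
     */
    boolean remove(long ledgerId) {
        ledgers.remove(ledgerId); // result ignored on purpose
        return true;
    }

    boolean exists(long ledgerId) {
        return ledgers.containsKey(ledgerId);
    }
}
```

The trade-off is the one noted in the message: a caller can no longer
tell "I removed it" apart from "it was already gone".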

**Note, the above is an actual change to the bk client semantics**

For creation, exclusive creation is clearly important for correctness,
so this patch adds a creator token field to the LedgerMetadata to
disambiguate the above race from a real race. For
AbstractZkLedgerManager, this is simply a random long value.
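
A minimal sketch of the creator token idea, assuming a hypothetical
in-memory store that keeps only the token per ledger id. The names
CreatorTokenSketch, newToken and createExclusive are invented for this
example; the real change stores the token inside the ledger metadata.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical store mapping ledger id to the token of its creator. */
class CreatorTokenSketch {
    private final Map<Long, Long> tokenByLedger = new ConcurrentHashMap<>();

    /** Picks a random long to identify one creation attempt. */
    static long newToken() {
        return ThreadLocalRandom.current().nextLong();
    }

    /**
     * Exclusive create. Returns true if the ledger now exists and was
     * written with our token, which covers both a fresh create and a
     * resent create whose first attempt already succeeded. Returns
     * false if a different creator got there first (a real race).
     */
    boolean createExclusive(long ledgerId, long creatorToken) {
        Long existing = tokenByLedger.putIfAbsent(ledgerId, creatorToken);
        return existing == null || existing.longValue() == creatorToken;
    }
}
```

Here a second call with the same token stands in for the resent
operation, while a call with a different token stands in for a
competing client.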

There's an opportunity for optimization with the above if exclusive
ledger creation failures are expected to be common.  You only actually
need to perform this check if the operation was really resent.  I chose
not to go this route yet because it would require messing with the
ZooKeeperClient interface to surface that information without burdening
other callers.

If the client is set to version 2 or older, this field will be ignored
and the old behavior will be retained.  If the client is version 3 or
newer but creation races with an older client, the new client will
interpret the nonce to be BLANK and thereby detect the race correctly.
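
The compatibility rule above can be sketched as a single decision
function. This is an illustration under stated assumptions: the BLANK
sentinel value of 0 and the names used here are hypothetical and are
not confirmed by the message.

```java
/** Hypothetical decision logic for an "already exists" result on create. */
class TokenCompatSketch {
    /** Assumed sentinel for metadata written without a creator token. */
    static final long BLANK = 0L;

    /** First client version that reads and writes the token, per the message. */
    static final int TOKEN_MIN_VERSION = 3;

    /**
     * Returns true when an existing ledger should be treated as the
     * result of our own resent create rather than a conflict.
     */
    static boolean isOwnResentCreate(int clientVersion, long storedToken, long myToken) {
        if (clientVersion < TOKEN_MIN_VERSION) {
            return false; // token ignored: old behavior, always a conflict
        }
        if (storedToken == BLANK) {
            return false; // written by an older client: a real race
        }
        return storedToken == myToken;
    }
}
```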



Reviewers: Enrico Olivelli <eolivelli@gmail.com>, Sijie Guo <sijie@apache.org>

This closes #2006 from reddycharan/zkretrialrobust, closes #1967
12 files changed
tree: c3cedc73460eb2bce1300b1c4d4278bfb5ebb830
  1. .github/
  2. .test-infra/
  3. .travis_scripts/
  4. bin/
  5. bookkeeper-benchmark/
  6. bookkeeper-common/
  7. bookkeeper-common-allocator/
  8. bookkeeper-dist/
  9. bookkeeper-http/
  10. bookkeeper-proto/
  11. bookkeeper-server/
  12. bookkeeper-stats/
  13. bookkeeper-stats-providers/
  14. buildtools/
  15. circe-checksum/
  16. conf/
  17. cpu-affinity/
  18. deploy/
  19. dev/
  20. docker/
  21. metadata-drivers/
  22. microbenchmarks/
  23. shaded/
  24. site/
  25. stats/
  26. stream/
  27. tests/
  28. tools/
  29. .gitignore
  30. .travis.yml
  31. LICENSE
  32. NOTICE
  33. pom.xml
  34. README.md
README.md

Apache BookKeeper

Apache BookKeeper is a scalable, fault-tolerant, and low-latency storage service optimized for append-only workloads.

It is suitable for the following scenarios:

  • WAL (Write-Ahead-Logging), e.g. HDFS NameNode.
  • Message Store, e.g. Apache Pulsar.
  • Offset/Cursor Store, e.g. Apache Pulsar.
  • Object/Blob Store, e.g. storing state machine snapshots.

Get Started

  • Concepts: Start with the basic concepts of Apache BookKeeper. This will help you to fully understand the other parts of the documentation.
  • Getting Started: Set up BookKeeper to write logs.

Documentation

Developers

You can also read Turning Ledgers into Logs to learn how to turn ledgers into continuous log streams. If you are looking for a high-level log stream API, you can check out DistributedLog.

Administrators

Contributors

Get In Touch

Report a Bug

To file bugs, suggest improvements, or request new features, help us out by opening a GitHub issue or an Apache JIRA.

Need Help?

Subscribe or mail the user@bookkeeper.apache.org list - Ask questions, find answers, and also help other users.

Subscribe or mail the dev@bookkeeper.apache.org list - Join development discussions, propose new ideas and connect with contributors.

Join us on Slack - This is the most immediate way to connect with Apache BookKeeper committers and contributors.

Contributing

We feel that a welcoming open community is important and welcome contributions.

Contributing Code

  1. See Developer Setup to set up your local environment.

  2. Take a look at our open issues: JIRA Issues, GitHub Issues.

  3. Review our coding style and follow our pull requests to learn about our conventions.

  4. Make your changes according to our contribution guide.

Improving Website and Documentation

  1. See Building the website and documentation on how to build the website and documentation.