 <?xml version="1.0" encoding="UTF-8"?>
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
 or more contributor license agreements.  See the NOTICE file
 distributed with this work for additional information
 regarding copyright ownership.  The ASF licenses this file
 to you under the Apache License, Version 2.0 (the
 "License"); you may not use this file except in compliance
 with the License.  You may obtain a copy of the License at

   http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing,
 software distributed under the License is distributed on an
 "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 KIND, either express or implied.  See the License for the
 specific language governing permissions and limitations
 under the License.
 -->

 <!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN"
           "http://forrest.apache.org/dtd/document-v20.dtd">

 <document xmlns="http://maven.apache.org/XDOC/2.0"
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xsi:schemaLocation="http://maven.apache.org/XDOC/2.0 http://maven.apache.org/xsd/xdoc-2.0.xsd">
   <properties>
     <title>
       Apache HBase (TM) ACID Properties
     </title>
   </properties>

   <body>
     <section name="About this Document">
       <p>Apache HBase (TM) is not an ACID-compliant database. However, it does guarantee certain specific
       properties.</p>
       <p>This specification enumerates the ACID properties of HBase.</p>
     </section>
     <section name="Definitions">
       <p>For the sake of common vocabulary, we define the following terms:</p>
       <dl>
         <dt>Atomicity</dt>
         <dd>an operation is atomic if it either completes entirely or not at all</dd>

         <dt>Consistency</dt>
         <dd>
           all actions cause the table to transition from one valid state directly to another
           (e.g., a row will not disappear during an update)
         </dd>

         <dt>Isolation</dt>
         <dd>
           an operation is isolated if it appears to complete independently of any other concurrent transaction
         </dd>

         <dt>Durability</dt>
         <dd>any update that reports &quot;successful&quot; to the client will not be lost</dd>

         <dt>Visibility</dt>
         <dd>an update is considered visible if any subsequent read will see the update as having been committed</dd>
       </dl>
       <p>
         The terms <em>must</em> and <em>may</em> are used as specified by RFC 2119.
         In short, the word &quot;must&quot; implies that, if some case exists where the statement
         is not true, it is a bug. The word &quot;may&quot; implies that, even if the guarantee
         is provided in a current release, users should not rely on it.
       </p>
     </section>
     <section name="APIs to consider">
       <ul>
         <li>Read APIs
         <ul>
           <li>get</li>
           <li>scan</li>
         </ul>
         </li>
         <li>Write APIs
         <ul>
           <li>put</li>
           <li>batch put</li>
           <li>delete</li>
         </ul>
         </li>
         <li>Combination (read-modify-write) APIs
         <ul>
           <li>incrementColumnValue</li>
           <li>checkAndPut</li>
         </ul>
         </li>
       </ul>
     </section>

     <section name="Guarantees Provided">

       <section name="Atomicity">

         <ol>
           <li>All mutations are atomic within a row. Any put will either wholly succeed or wholly fail.[3]
           <ol>
             <li>An operation that returns a &quot;success&quot; code has completely succeeded.</li>
             <li>An operation that returns a &quot;failure&quot; code has completely failed.</li>
             <li>An operation that times out may have succeeded or may have failed. However,
             it will not have partially succeeded or failed.</li>
           </ol>
           </li>
           <li> This is true even if the mutation crosses multiple column families within a row.</li>
           <li> APIs that mutate several rows will <em>not</em> be atomic across the multiple rows.
           For example, a multiput that operates on rows 'a','b', and 'c' may return having
           mutated some but not all of the rows. In such cases, these APIs will return a list
           of result codes, each of which may indicate success, failure, or a timeout as described above.</li>
           <li> The checkAndPut API happens atomically like the typical compareAndSet (CAS) operation
           found in many hardware architectures.</li>
           <li> Mutations are seen to happen in a well-defined order for each row, with no
           interleaving. For example, if one writer issues the mutation &quot;a=1,b=1,c=1&quot; and
           another writer issues the mutation &quot;a=2,b=2,c=2&quot;, the row must either
           be &quot;a=1,b=1,c=1&quot; or &quot;a=2,b=2,c=2&quot; and must <em>not</em> be something
           like &quot;a=1,b=2,c=1&quot;.
           <ol>
             <li>Please note that this is not true <em>across rows</em> for multirow batch mutations.</li>
           </ol>
           </li>
         </ol>
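         <p>
           The per-row guarantees above can be illustrated with a toy in-memory model. The sketch
           below is not the HBase client API: RowAtomicityModel and all of its methods are
           hypothetical names, and a ConcurrentHashMap is assumed as a stand-in for the row-level
           atomicity that a RegionServer provides. It shows a multi-column put applied in one step
           and a checkAndPut that behaves like compare-and-set.
         </p>
<source><![CDATA[
```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

/** Toy model of per-row atomicity. Hypothetical names; NOT the HBase client API. */
final class RowAtomicityModel {
    // Each row key maps to an immutable snapshot of all of its columns.
    private final ConcurrentHashMap<String, Map<String, String>> rows =
        new ConcurrentHashMap<>();

    /** Builds a column map from alternating column/value arguments. */
    static Map<String, String> columns(String... keyValues) {
        Map<String, String> result = new HashMap<>();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            result.put(keyValues[i], keyValues[i + 1]);
        }
        return result;
    }

    /** Applies every column of the put to one row in a single atomic step. */
    void put(String row, Map<String, String> update) {
        rows.compute(row, (key, old) -> merge(old, update));
    }

    /** Applies the put only if the column equals expected (null means absent). */
    boolean checkAndPut(String row, String column, String expected,
                        Map<String, String> update) {
        boolean[] applied = {false};
        rows.compute(row, (key, old) -> {
            String current = (old == null) ? null : old.get(column);
            if (!Objects.equals(current, expected)) {
                return old; // check failed: the row is left untouched
            }
            applied[0] = true;
            return merge(old, update);
        });
        return applied[0];
    }

    /** Returns the complete row as it existed at one point in its history. */
    Map<String, String> get(String row) {
        Map<String, String> snapshot = rows.get(row);
        return (snapshot == null) ? Collections.<String, String>emptyMap() : snapshot;
    }

    // Copies the old row, overlays the update, and freezes the result.
    private static Map<String, String> merge(Map<String, String> old,
                                             Map<String, String> update) {
        Map<String, String> next = new HashMap<>();
        if (old != null) {
            next.putAll(old);
        }
        next.putAll(update);
        return Collections.unmodifiableMap(next);
    }

    public static void main(String[] args) {
        RowAtomicityModel table = new RowAtomicityModel();
        table.put("row1", columns("a", "1", "b", "1", "c", "1"));
        // The check passes because column a currently holds 1.
        boolean first = table.checkAndPut("row1", "a", "1",
            columns("a", "2", "b", "2", "c", "2"));
        // The check fails because column a no longer holds 1; the row is unchanged.
        boolean second = table.checkAndPut("row1", "a", "1", columns("a", "3"));
        System.out.println(first + " " + second + " " + table.get("row1").get("a"));
    }
}
```
]]></source>
         <p>
           Because each row is replaced as a whole, a reader of this model can observe
           &quot;a=1,b=1,c=1&quot; or &quot;a=2,b=2,c=2&quot; but never a mixture of the two.
           Nothing in the model coordinates updates across different rows, which mirrors the
           absence of a multi-row atomicity guarantee.
         </p>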
       </section>
       <section name="Consistency and Isolation">
         <ol>
           <li>All rows returned via any access API will consist of a complete row that existed at
           some point in the table's history.</li>
           <li>This is true across column families, i.e., a get of a full row that occurs concurrently
           with some mutations 1,2,3,4,5 will return a complete row that existed at some point in time
           between mutation i and i+1 for some i between 1 and 5.</li>
           <li>The state of a row will only move forward through the history of edits to it.</li>
         </ol>

         <section name="Consistency of Scans">
         <p>
           A scan is <strong>not</strong> a consistent view of a table. Scans do
           <strong>not</strong> exhibit <em>snapshot isolation</em>.
         </p>
         <p>
           Rather, scans have the following properties:
         </p>

         <ol>
           <li>
             Any row returned by the scan will be a consistent view (i.e. that version
             of the complete row existed at some point in time) [1]
           </li>
           <li>
             A scan will always reflect a view of the data <em>at least as new as</em>
             the beginning of the scan. This satisfies the visibility guarantees
             enumerated below.
           <ol>
             <li>For example, if client A writes data X and then communicates via a side
             channel to client B, any scans started by client B will contain data at least
             as new as X.</li>
             <li>A scan <em>must</em> reflect all mutations committed prior to the construction
             of the scanner, and <em>may</em> reflect some mutations committed subsequent to the
             construction of the scanner.</li>
             <li>Scans must include <em>all</em> data written prior to the scan (except in
             the case where data is subsequently mutated, in which case it <em>may</em> reflect
             the mutation).</li>
           </ol>
           </li>
         </ol>
         <p>
           Those familiar with relational databases will recognize this isolation level as &quot;read committed&quot;.
         </p>
         <p>
           Please note that the guarantees listed above regarding scanner consistency
           are referring to &quot;transaction commit time&quot;, not the &quot;timestamp&quot;
           field of each cell. That is to say, a scanner started at time <em>t</em> may see edits
           with a timestamp value greater than <em>t</em>, if those edits were committed with a
           &quot;forward dated&quot; timestamp before the scanner was constructed.
         </p>
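         <p>
           The scan properties above can be sketched with a toy model in which each call to
           next() reads the live table instead of a snapshot. This is an illustration under
           stated assumptions, not the HBase scanner implementation: ReadCommittedScanModel and
           RowScanner are hypothetical names, and a single string stands in for the complete
           state of a row.
         </p>
<source><![CDATA[
```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

/** Toy model of scan semantics. Hypothetical names; NOT the HBase client API. */
final class ReadCommittedScanModel {
    // Row key -> complete row state; each value stands for one whole, immutable row.
    private final ConcurrentSkipListMap<String, String> rows =
        new ConcurrentSkipListMap<>();

    /** Commits a new complete state for one row. */
    void put(String row, String state) {
        rows.put(row, state);
    }

    /** Opens a scanner that reads the live table one row at a time. */
    RowScanner scan() {
        return new RowScanner();
    }

    final class RowScanner {
        private String lastRow; // null until the first row has been returned

        /** Returns the next row in key order, or null when the scan is exhausted. */
        Map.Entry<String, String> next() {
            Map.Entry<String, String> entry =
                (lastRow == null) ? rows.firstEntry() : rows.higherEntry(lastRow);
            if (entry != null) {
                lastRow = entry.getKey();
            }
            return entry;
        }
    }

    public static void main(String[] args) {
        ReadCommittedScanModel table = new ReadCommittedScanModel();
        table.put("a", "a-v1");
        table.put("b", "b-v1");

        RowScanner scanner = table.scan();
        System.out.println(scanner.next().getValue()); // row a as committed before the scan

        // Mutations committed after the scanner was constructed.
        table.put("a", "a-v2"); // row a was already returned, so this scan never shows a-v2
        table.put("b", "b-v2"); // row b has not been returned yet, so the scan may show b-v2

        System.out.println(scanner.next().getValue()); // in this model: b-v2
    }
}
```
]]></source>
         <p>
           Every row the model returns is a complete committed state, yet the rows taken together
           do not correspond to any single moment in the table's history, which is why the scan
           is not a snapshot.
         </p>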
         </section>
       </section>
       <section name="Visibility">
         <ol>
           <li> When a client receives a &quot;success&quot; response for any mutation, that
           mutation is immediately visible to both that client and any client with whom it
           later communicates through side channels. [3]</li>
           <li> A row must never exhibit so-called &quot;time-travel&quot; properties. That
           is to say, if a series of mutations moves a row sequentially through a series of
           states, any sequence of concurrent reads will return a subsequence of those states.
           <ol>
             <li>For example, if a row's cells are mutated using the &quot;incrementColumnValue&quot;
             API, a client must never see the value of any cell decrease.</li>
             <li>This is true regardless of which read API is used to read back the mutation.</li>
           </ol>
           </li>
           <li> Any version of a cell that has been returned to a read operation is guaranteed to
           be durably stored.</li>
         </ol>

       </section>
       <section name="Durability">
         <ol>
           <li> All visible data is also durable data. That is to say, a read will never return
           data that has not been made durable on disk. [2]</li>
           <li> Any operation that returns a &quot;success&quot; code (e.g., does not throw an exception)
           will be made durable. [3]</li>
           <li> Any operation that returns a &quot;failure&quot; code will not be made durable
           (subject to the Atomicity guarantees above).</li>
           <li> No reasonable failure scenario will affect any of the guarantees of this document.</li>

         </ol>
       </section>
       <section name="Tunability">
         <p>All of the above guarantees must be possible within Apache HBase. For users who would like to trade
         off some guarantees for performance, HBase may offer several tuning options. For example:</p>
         <ul>
           <li>Visibility may be tuned on a per-read basis to allow stale reads or time travel.</li>
           <li>Durability may be tuned to flush data to disk only on a periodic basis.</li>
         </ul>
       </section>
     </section>
     <section name="More Information">
       <p>
       For more information, see the <a href="book.html#client">client architecture</a> or <a href="book.html#datamodel">data model</a> sections in the Apache HBase Reference Guide.
       </p>
     </section>

     <section name="Footnotes">
       <p>[1] A consistent view is not guaranteed for intra-row scanning, i.e., fetching a portion of
           a row in one RPC and then going back to fetch another portion of the row in a subsequent RPC.
           Intra-row scanning happens when you set a limit on how many values to return per Scan#next
           (see <a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setBatch(int)">Scan#setBatch(int)</a>).
       </p>

       <p>[2] In the context of Apache HBase, &quot;durably on disk&quot; implies an hflush() call on the transaction
       log. This does not actually imply an fsync() to magnetic media, but rather just that the data has been
       written to the OS cache on all replicas of the log. In the case of a full datacenter power loss, it is
       possible that the edits are not truly durable.</p>
       <p>[3] Puts will either wholly succeed or wholly fail, provided that they are actually sent
       to the RegionServer.  If the writebuffer is used, Puts will not be sent until the writebuffer is filled
       or it is explicitly flushed.</p>
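       <p>
         The effect described in footnote [3] can be sketched with a toy client-side buffer. This
         is a simplified model, not the HBase client: WriteBufferModel and its methods are
         hypothetical names, and the buffer is assumed to count edits rather than measure their
         size. An edit that is still buffered has not reached the server, so none of the
         guarantees in this document apply to it yet.
       </p>
<source><![CDATA[
```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a client-side write buffer. Hypothetical names; NOT the HBase client API. */
final class WriteBufferModel {
    private final int capacity; // assumption: counts edits, not bytes
    private final List<String> buffer = new ArrayList<>(); // edits still held by the client
    private final List<String> server = new ArrayList<>(); // edits the server has received

    WriteBufferModel(int capacity) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be at least 1");
        }
        this.capacity = capacity;
    }

    /** Buffers one edit and sends the whole buffer once it is full. */
    void put(String edit) {
        buffer.add(edit);
        if (buffer.size() >= capacity) {
            flush();
        }
    }

    /** Sends every buffered edit to the server and empties the buffer. */
    void flush() {
        server.addAll(buffer);
        buffer.clear();
    }

    /** Returns a copy of the edits that have actually been sent. */
    List<String> sentToServer() {
        return new ArrayList<>(server);
    }

    public static void main(String[] args) {
        WriteBufferModel client = new WriteBufferModel(3);
        client.put("edit1");
        client.put("edit2");
        System.out.println(client.sentToServer()); // nothing sent yet: buffer not full
        client.flush();
        System.out.println(client.sentToServer()); // both edits sent after the explicit flush
    }
}
```
]]></source>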

     </section>

   </body>
 </document>