| <?xml version="1.0" encoding="UTF-8"?> |
| <!-- |
| Licensed to the Apache Software Foundation (ASF) under one |
| or more contributor license agreements. See the NOTICE file |
| distributed with this work for additional information |
| regarding copyright ownership. The ASF licenses this file |
| to you under the Apache License, Version 2.0 (the |
| "License"); you may not use this file except in compliance |
| with the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, |
| software distributed under the License is distributed on an |
| "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| KIND, either express or implied. See the License for the |
| specific language governing permissions and limitations |
| under the License. |
| --> |
| <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> |
| <!-- Generated by Apache Maven Doxia at 2018-08-13 --> |
| <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> |
| <head> |
| <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> |
| <title>Apache James Project – Mailbox HBase</title> |
| <style type="text/css" media="all"> |
| @import url("../css/james.css"); |
| @import url("../css/maven-base.css"); |
| @import url("../css/maven-theme.css"); |
| @import url("../css/site.css"); |
| @import url("../js/jquery/css/custom-theme/jquery-ui-1.8.5.custom.css"); |
| @import url("../js/jquery/css/print.css"); |
| @import url("../js/fancybox/jquery.fancybox-1.3.4.css"); |
| </style> |
| <script type="text/javascript" src="../js/jquery/js/jquery-1.4.2.min.js"></script> |
| <script type="text/javascript" src="../js/jquery/js/jquery-ui-1.8.5.custom.min.js"></script> |
| <script type="text/javascript" src="../js/fancybox/jquery.fancybox-1.3.4.js"></script> |
| <link rel="stylesheet" href="../css/print.css" type="text/css" media="print" /> |
| <meta name="Date-Revision-yyyymmdd" content="20180813" /> |
| <meta http-equiv="Content-Language" content="en" /> |
| |
| </head> |
| <body class="composite"> |
| <div id="banner"> |
| <a href="../index.html" id="bannerLeft" title="james-logo.png"> |
| |
| |
| <img src="../images/logos/james-logo.png" alt="James Project" /> |
| </a> |
| <a href="http://www.apache.org/index.html" id="bannerRight"> |
| |
| |
| <img src="images/logos/asf_logo_small.png" alt="The Apache Software Foundation" /> |
| </a> |
| <div class="clear"> |
| <hr/> |
| </div> |
| </div> |
| <div id="breadcrumbs"> |
| |
| |
| <div class="xleft"> |
| <span id="publishDate">Last Published: 2018-08-13</span> |
| </div> |
| <div class="clear"> |
| <hr/> |
| </div> |
| </div> |
| <div id="bodyColumn"> |
| <div id="contentBox"> |
| |
| |
| |
| |
| <div class="section"> |
| <h2><a name="Mailbox_HBase_Responsibility"></a>Mailbox HBase Responsibility</h2> |
| |
| <p>This module provides a mailbox implementation for persisting mailboxes (messages and subscriptions) in an HBase cluster.</p> |
| |
| <p>It only supports the Basic capability.</p> |
| </div> |
| |
| |
| <div class="section"> |
| <h2><a name="Overview"></a>Overview</h2> |
| |
| <p> |
| This section gives an overview of the design and implementation of Mailbox HBase. |
| |
| </p> |
| |
| <div class="section"> |
| <h3><a name="Tables"></a>Tables</h3> |
| |
| <p>The current implementation stores messages, mailboxes and subscriptions in separate tables:</p> |
| |
| <ul> |
| |
| <li>JAMES_MAILBOXES - for storing mailboxes.</li> |
| |
| <li>JAMES_MESSAGES - for storing messages.</li> |
| |
| <li>JAMES_SUBSCRIPTIONS - for storing user subscriptions.</li> |
| </ul> |
| </div> |
| |
| |
| <div class="section"> |
| <h3><a name="Mailbox_UID_generation"></a>Mailbox UID generation</h3> |
| |
| <p>Mailboxes are identified by a |
| <a class="externalLink" href="http://download.oracle.com/javase/6/docs/api/java/util/UUID.html">UUID</a>. |
| </p> |
| </div> |
| |
| |
| <div class="section"> |
| <h3><a name="Message_UID_generation"></a>Message UID generation</h3> |
| |
| <p>The IMAP RFC states that mailboxes should keep message UIDs unique and in ascending order. Mailbox HBase uses |
| <a class="externalLink" href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#incrementColumnValue(byte[],%20byte[],%20byte[],%20long)">incrementColumnValue</a> |
| in the HBaseUidProvider implementation to achieve this. |
| </p> |
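| <p>The idea can be mimicked in memory. This is a minimal sketch, assuming only that incrementColumnValue atomically adds an amount to a counter cell and returns the new value. The class UidCounterSketch and its methods are hypothetical stand-ins built on AtomicLong; they are not the HBaseUidProvider code and do not talk to HBase.</p> |

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical in-memory stand-in for a per-mailbox counter cell.
// No HBase client is involved; AtomicLong plays the role of the atomic increment.
final class UidCounterSketch {
    private final AtomicLong counter = new AtomicLong(0L);

    // Atomically adds 1 and returns the new value, so concurrent callers
    // never receive the same UID and UIDs only ever grow.
    long nextUid() {
        return counter.incrementAndGet();
    }

    // Returns the most recently issued UID without changing it.
    long lastUid() {
        return counter.get();
    }
}
```

| <p>Because the increment and the read happen in a single atomic step, two threads appending to the same mailbox cannot be handed the same UID.</p> |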
| </div> |
| |
| <div class="section"> |
| <h3><a name="HBase_row_keys"></a>HBase row keys</h3> |
| <p>HBase uses row keys to access values. The current design uses the following row key structure:</p> |
| |
| <ul> |
| |
| <li>JAMES_MAILBOXES: row key is mailbox UUID</li> |
| |
| <li>JAMES_MESSAGES: row key is a compound key formed by concatenating the mailbox UUID and the message UID (in reverse order). |
| This way messages are grouped by mailbox and sorted in descending order (most recent first). |
| </li> |
| |
| <li>JAMES_SUBSCRIPTIONS: row key is user name.</li> |
| </ul> |
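| <p>A compound key for JAMES_MESSAGES could be built roughly as follows. This is a sketch under stated assumptions, not the module's actual code: the class MessageRowKeys is hypothetical, the UUID is assumed to be encoded as 16 big-endian bytes, and the "reverse order" is assumed to be obtained by storing Long.MAX_VALUE minus the message UID.</p> |

```java
import java.nio.ByteBuffer;
import java.util.UUID;

// Hypothetical helper; the names and the exact byte layout are assumptions.
final class MessageRowKeys {
    private MessageRowKeys() {}

    // Layout: 16 bytes of mailbox UUID, then 8 bytes of (Long.MAX_VALUE - uid).
    // ByteBuffer writes big-endian by default, so for non-negative values the
    // byte-wise (lexicographic) order of the suffix matches its numeric order.
    // Subtracting from Long.MAX_VALUE flips that order: a larger UID gives a
    // smaller suffix, so newer messages sort first within a mailbox.
    static byte[] messageRowKey(UUID mailboxId, long messageUid) {
        return ByteBuffer.allocate(24)
                .putLong(mailboxId.getMostSignificantBits())
                .putLong(mailboxId.getLeastSignificantBits())
                .putLong(Long.MAX_VALUE - messageUid)
                .array();
    }
}
```

| <p>With this layout, all keys of one mailbox share the same 16-byte prefix, so a scan starting at that prefix visits the mailbox's messages from the highest UID down.</p> |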
| </div> |
| |
| <div class="section"> |
| <h3><a name="Misc"></a>Misc</h3> |
| |
| <p>Message bodies (and, more importantly, large attachments) sent to many users are stored separately for each user. There is no space sharing yet.</p> |
| |
| <p>Message data and message meta-data (flags and properties) are stored in different column families |
| so that column family optimization options can be tuned for each. Keep in mind that message data does not change, while meta-data does. |
| </p> |
| </div> |
| |
| </div> |
| |
| |
| <div class="section"> |
| <h2><a name="Installation"></a>Installation</h2> |
| |
| <p>For the mailbox implementation to work, you have to point it at your HBase cluster. Putting |
| <i>hbase-site.xml</i> on the class path should be enough. Mailbox HBase will pick it up and read all the configuration parameters from it. |
| </p> |
| </div> |
| |
| |
| <div class="section"> |
| <h2><a name="Mailbox_HBase_Classes"></a>Mailbox HBase Classes</h2> |
| |
| <p>This is an overview of the most important classes in the implementation.</p> |
| |
| <div class="section"> |
| <h3><a name="HBaseMailboxManager"></a>HBaseMailboxManager</h3> |
| |
| <p> |
| <b>HBaseMailboxManager</b> extends the |
| <b>StoreMailboxManager</b> class. |
| It has a simple implementation that just overrides the |
| <i>doCreateMailbox</i> method to return an HBaseMailbox implementation and the |
| <i>createMessageManager</i> method to return an HBaseMessageManager implementation. |
| Other than that, it relies on the default StoreMailboxManager implementation. |
| </p> |
| </div> |
| |
| |
| <div class="section"> |
| <h3><a name="HBaseMessageManager"></a>HBaseMessageManager</h3> |
| |
| <p> |
| <b>HBaseMessageManager</b> extends StoreMessageManager and provides an implementation of the getPermanentFlags method. |
| </p> |
| </div> |
| |
| |
| <div class="section"> |
| <h3><a name="Chunked_Streams"></a>Chunked Streams</h3> |
| |
| <p>Message bodies can have varying sizes. Some have attachments of up to 25 MB, some even larger. |
| There are practical limits to the size of an HBase column (see |
| <a class="externalLink" href="http://hbase.apache.org/book.html#supported.datatypes">http://hbase.apache.org/book.html#supported.datatypes</a>). |
| To address this issue, the implementation splits the message into smaller chunks and saves each chunk into a separate column. |
| The columns have increasing integer names starting with 1 and there can be at most Long.MAX_VALUE chunks. |
| </p> |
| |
| <p> |
| The work is done by |
| <b>ChunkInputStream</b> and |
| <b>ChunkOutputStream</b>, which extend |
| InputStream and OutputStream from the java.io package. |
| <br /> |
| Data is retrieved using the HBase Get operation and stored in an internal byte array. |
| Data is written using the HBase Put operation, split into chunks of a configurable size, |
| <b>chunkSize</b>. |
| Things could be more efficient if HBase had streaming support. |
| </p> |
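| <p>The splitting and reassembly can be illustrated without HBase. This is a minimal sketch, assuming fixed-size chunks with a shorter final chunk; the class ChunkSketch and its methods are hypothetical and only model the byte handling, not the Get and Put calls made by ChunkInputStream and ChunkOutputStream.</p> |

```java
import java.util.Arrays;

// Hypothetical sketch of the chunking idea; it does not use the HBase client.
final class ChunkSketch {
    private ChunkSketch() {}

    // Splits data into pieces of at most chunkSize bytes.
    // Element i of the result would be stored under column name i + 1.
    static byte[][] split(byte[] data, int chunkSize) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int count = (data.length + chunkSize - 1) / chunkSize; // ceiling division
        byte[][] chunks = new byte[count][];
        for (int i = 0; i < count; i++) {
            int from = i * chunkSize;
            int to = Math.min(from + chunkSize, data.length);
            chunks[i] = Arrays.copyOfRange(data, from, to);
        }
        return chunks;
    }

    // Reassembles the original bytes by reading the chunks in column order.
    static byte[] join(byte[][] chunks) {
        int total = 0;
        for (byte[] chunk : chunks) {
            total += chunk.length;
        }
        byte[] data = new byte[total];
        int offset = 0;
        for (byte[] chunk : chunks) {
            System.arraycopy(chunk, 0, data, offset, chunk.length);
            offset += chunk.length;
        }
        return data;
    }
}
```

| <p>For example, a 10-byte body split with a chunkSize of 4 yields three chunks of 4, 4 and 2 bytes, which would map to columns 1, 2 and 3.</p> |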
| </div> |
| |
| <div class="section"> |
| <h3><a name="HBaseMessage"></a>HBaseMessage</h3> |
| |
| <p>Extends AbstractMessage and represents a message in the message store. |
| Importantly, the current implementation retrieves just the message meta-data from HBase |
| and uses ChunkInputStream to load the message body only when needed. |
| </p> |
| </div> |
| </div> |
| |
| |
| |
| </div> |
| </div> |
| <div class="clear"> |
| <hr/> |
| </div> |
| <div id="footer"> |
| <div class="xright">Copyright © 2006-2018 |
| <a href="https://www.apache.org/">The Apache Software Foundation</a>. |
| All Rights Reserved. |
| |
| </div> |
| <div class="clear"> |
| <hr/> |
| </div> |
| </div> |
| </body> |
| </html> |