| <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> |
| <!-- |
| Licensed to the Apache Software Foundation (ASF) under one |
| or more contributor license agreements. See the NOTICE file |
| distributed with this work for additional information |
| regarding copyright ownership. The ASF licenses this file |
| to you under the Apache License, Version 2.0 (the |
| "License"); you may not use this file except in compliance |
| with the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, |
| software distributed under the License is distributed on an |
| "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| KIND, either express or implied. See the License for the |
| specific language governing permissions and limitations |
| under the License. |
| --> |
| <html> |
| <head> |
| <meta http-equiv="content-type" content="text/html; charset=UTF-8" /> |
| <meta charset="utf-8" /> |
| <meta name="viewport" content="width=device-width, initial-scale=1.0" /> |
| <meta name="author" content="dev@gora.apache.org" /> |
| |
| <META http-equiv="Content-Type" content="text/html;charset=UTF-8" /> |
| <META name="Description" content="Apache Gora -- Gora Core Module" /> |
| <META name="Keywords" content="Apache Gora NoSQL Framework" /> |
| <META name="Owner" content="dev@gora.apache.org" /> |
| <META name="Robots" content="index, follow" /> |
| <META name="Security" content="Public" /> |
| <META name="Source" content="wiki template" /> |
| <META name="DC.Rights" content="Copyright 2010-2023, The Apache Software Foundation" /> |
| |
| <!-- The styles --> |
| <link href="/resources/css/bootstrap.css" rel="stylesheet"> |
| <style type="text/css"> |
| body { |
| padding-top: 60px; |
| padding-bottom: 40px; |
| } |
| .headerlink { |
| visibility: hidden; |
| } |
| dt:hover > .headerlink, p:hover > .headerlink, td:hover > .headerlink, h1:hover > .headerlink, h2:hover > .headerlink, h3:hover > .headerlink, h4:hover > .headerlink, h5:hover > .headerlink, h6:hover > .headerlink { |
| visibility: visible |
| } </style> |
| <link href="/resources/css/bootstrap-responsive.css" rel="stylesheet"> |
| <link href="/resources/css/gora.css" rel="stylesheet"> |
| |
| <style type="text/css"> |
| .stpulldown-gradient |
| { |
| background: #E1E1E1; |
| background: -moz-linear-gradient(top, #E1E1E1 0%, #A7A7A7 100%); /* firefox */ |
| background: -webkit-gradient(linear, left top, left bottom, color-stop(0%,#E1E1E1), color-stop(100%,#A7A7A7)); /* webkit */ |
| filter: progid:DXImageTransform.Microsoft.gradient( startColorstr='#E1E1E1', endColorstr='#A7A7A7',GradientType=0 ); /* ie */ |
| background: -o-linear-gradient(top, #E1E1E1 0%,#A7A7A7 100%); /* opera */ |
| color: #636363; |
| } |
| #stpulldown .stpulldown-logo |
| { |
| height: 40px; |
| width: 300px; |
| margin-left: 20px; |
| margin-top: 5px; |
| background:url("http://gora.apache.org/resources/img/feather-small.png") no-repeat; |
| } |
| </style> |
| <!-- HTML5 shim, for IE6-8 support of HTML5 elements --> |
| <!--[if lt IE 9]> |
| <script src="http://html5shim.googlecode.com/svn/trunk/html5.js"></script> |
| <![endif]--> |
| |
| <!-- Fav and touch icons --> |
| <link rel="apple-touch-icon-precomposed" sizes="144x144" href="http://twitter.github.com/bootstrap/assets/ico/apple-touch-icon-144-precomposed.png"> |
| <link rel="apple-touch-icon-precomposed" sizes="114x114" href="http://twitter.github.com/bootstrap/assets/ico/apple-touch-icon-114-precomposed.png"> |
| <link rel="apple-touch-icon-precomposed" sizes="72x72" href="http://twitter.github.com/bootstrap/assets/ico/apple-touch-icon-72-precomposed.png"> |
| <link rel="apple-touch-icon-precomposed" href="http://twitter.github.com/bootstrap/assets/ico/apple-touch-icon-57-precomposed.png"> |
| <link rel="shortcut icon" href="/resources/img/feather-small.png"> |
| |
| <title>Apache Gora™ - Gora Core Module</title> |
| </head> |
| |
| <body> |
| <div class="navbar navbar-inverse navbar-fixed-top"> |
| <div class="navbar-inner"> |
| <div class="container"> |
| <a class="btn btn-navbar" data-toggle="collapse" data-target=".nav-collapse"> |
| <span class="icon-bar"></span> |
| <span class="icon-bar"></span> |
| <span class="icon-bar"></span> |
| </a> |
| <a class="brand" href="/index.html"><img src="/resources/img/gora-logo.png" alt="Apache Gora" title="Apache Gora"/></a> |
| <div class="nav-collapse collapse"> |
| <ul class="nav"> |
| <li><a href="/downloads.html">Downloads</a></li> |
| <li class="dropdown"> |
| <a href="#" class="dropdown-toggle" data-toggle="dropdown">Community <b class="caret"></b></a> |
| <ul class="dropdown-menu pull-right"> |
| <li><a href="https://whimsy.apache.org/board/minutes/Gora.html">Board Reporting</a></li> |
| <li><a href="/contribute.html">Contribute</a></li> |
| <li><a href="/mailing_lists.html">Mailing Lists</a></li> |
| <li><a href="/credits.html">People</a></li> |
| <li><a href="/related.html">Related Projects</a></li> |
| </ul> |
| </li> |
| <li class="dropdown"> |
| <a href="#" class="dropdown-toggle" data-toggle="dropdown">Documentation <b class="caret"></b></a> |
| <ul class="dropdown-menu pull-right"> |
| <li><a href="/about.html">About</a></li> |
| <li><a href="/current/index.html">Current Documentation</a></li> |
| <li><a href="/current/api/javadoc.html">JavaDoc Documentation</a></li> |
| <li><a href="/current/tutorial.html">Gora Tutorial</a></li> |
| <li><a href="https://cwiki.apache.org/confluence/display/GORA/">Gora Wiki</a></li> |
| <li><a href="http://en.wikipedia.org/wiki/Apache_Gora">Gora Wikipedia Entry</a></li> |
| </ul> |
| </li> |
| <li class="dropdown"> |
| <a href="#" class="dropdown-toggle" data-toggle="dropdown">Development <b class="caret"></b></a> |
| <ul class="dropdown-menu pull-right"> |
| <li><a href="https://issues.apache.org/jira/browse/GORA">Issue Tracking</a></li> |
| <li><a href="/mailing_lists.html">Mailing Lists</a></li> |
| <li><a href="https://builds.apache.org/view/All/job/gora-trunk/">Nightly Builds</a></li> |
| <li><a href="https://analysis.apache.org/dashboard/index/76356">Sonar Analysis</a></li> |
| <li><a href="/version_control.html">Version Control</a></li> |
| <li><a href="/roadmap.html">Roadmap</a></li> |
| </ul> |
| </li> |
| <li class="dropdown"> |
| <a href="#" class="dropdown-toggle" data-toggle="dropdown"> |
| <img src="/resources/img/feather-small.png" alt="Apache" title="Apache" /> |
| <b class="caret"></b> |
| </a> |
| <ul class="dropdown-menu pull-right"> |
| <li><a href="http://www.apache.org">Apache Home</a></li> |
| <li><a href="http://www.apache.org/licenses/">Apache License</a></li> |
| <li><a href="http://www.apache.org/security/">Security</a></li> |
| <li><a href="http://www.apache.org/foundation/sponsorship.html">Support</a></li> |
| <li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li> |
| </ul> |
| </li> |
| </ul> |
| <form id="search-form" class="navbar-search pull-right" action="http://www.google.com/cse" method="get"> |
| <input value="gora.apache.org" name="sitesearch" type="hidden" /> |
| <input class="search-query" name="q" id="query" type="text" /> |
| </form> |
| <script type="text/javascript" src="http://www.google.com/coop/cse/brand?form=search-form"></script> |
| </div> <!--/.nav-collapse --> |
| </div> <!-- /container --> |
| </div> <!-- /navbar-inner --> |
| </div> <!-- /navbar --> |
| |
| <div class="container top-buffer" id="Gora_Gora Core Module"> |
| |
| <h1 id="overview">Overview<a class="headerlink" href="#overview" title="Permalink">¶</a></h1> |
| <p>This is the main documentation for the DataStores contained within the |
| <code>gora-core</code> module, which (as its name implies) |
| holds most of the core functionality for the Gora project.</p> |
| <p>Every module |
| in Gora depends on <code>gora-core</code>, so most of the generic documentation |
| about the project is gathered here, along with the documentation for <code>AvroStore</code>, |
| <code>DataFileAvroStore</code> and <code>MemStore</code>. In addition, <code>gora-core</code> holds all of the |
| core <strong>MapReduce</strong>, <strong>GoraSparkEngine</strong>, <strong>Persistency</strong>, <strong>Query</strong>, <strong>DataStoreBase</strong> and <strong>Utility</strong> functionality.</p> |
| <div id="toc"><ul><li><a class="toc-href" href="#avrostore" title="AvroStore">AvroStore</a><ul><li><a class="toc-href" href="#description" title="Description">Description</a></li><li><a class="toc-href" href="#goraproperties" title="gora.properties">gora.properties</a></li><li><a class="toc-href" href="#avrostore-xml-mappings" title="AvroStore XML mappings">AvroStore XML mappings</a></li></ul></li><li><a class="toc-href" href="#datafileavrostore" title="DataFileAvroStore">DataFileAvroStore</a><ul><li><a class="toc-href" href="#description_1" title="Description">Description</a></li><li><a class="toc-href" href="#goraproperties_1" title="gora.properties">gora.properties</a></li><li><a class="toc-href" href="#gora-core-mappings" title="Gora Core mappings">Gora Core mappings</a></li></ul></li><li><a class="toc-href" href="#memstore" title="MemStore">MemStore</a><ul><li><a class="toc-href" href="#description_2" title="Description">Description</a></li><li><a class="toc-href" href="#goraproperties_2" title="gora.properties">gora.properties</a></li><li><a class="toc-href" href="#memstore-xml-mappings" title="MemStore XML mappings">MemStore XML mappings</a></li></ul></li><li><a class="toc-href" href="#gorasparkengine" title="GoraSparkEngine">GoraSparkEngine</a><ul><li><a class="toc-href" href="#description_3" title="Description">Description</a></li></ul></li></ul></div> |
| <h1 id="avrostore">AvroStore<a class="headerlink" href="#avrostore" title="Permalink">¶</a></h1> |
| <h2 id="description">Description<a class="headerlink" href="#description" title="Permalink">¶</a></h2> |
| <p>AvroStore can be used for binary-compatible Avro serializations. It supports Binary and JSON serializations.</p> |
| <h2 id="goraproperties">gora.properties<a class="headerlink" href="#goraproperties" title="Permalink">¶</a></h2> |
| <table class="table"> |
| <thead> |
| <tr> |
| <th align="left">Property Key</th> |
| <th align="left">Property Value</th> |
| <th align="left">Required</th> |
| <th align="left">Description</th> |
| </tr> |
| </thead> |
| <tbody> |
| <tr> |
| <td>gora.datastore.default=</td> |
| <td>org.apache.gora.avro.store.AvroStore</td> |
| <td>Yes</td> |
| <td>Implementation of the persistent Java storage class</td> |
| </tr> |
| <tr> |
| <td>gora.avrostore.input.path=</td> |
| <td><em>hdfs://uri/path/to/hdfs/input/path</em> || <em>file:///uri/path/to/local/input/path</em></td> |
| <td>Yes</td> |
| <td>This value should point to the input directory on HDFS (if running Gora in a distributed Hadoop environment) or to an input directory on the local file system (if running Gora locally).</td> |
| </tr> |
| <tr> |
| <td>gora.avrostore.output.path=</td> |
| <td><em>hdfs://uri/path/to/hdfs/output/path</em> || <em>file:///uri/path/to/local/output/path</em></td> |
| <td>Yes</td> |
| <td>This value should point to the output directory on HDFS (if running Gora in a distributed Hadoop environment) or to an output directory on the local file system (if running Gora locally).</td> |
| </tr> |
| <tr> |
| <td>gora.avrostore.codec.type=</td> |
| <td>BINARY || JSON</td> |
| <td>No</td> |
| <td>Specifies the Avro encoder/decoder type to use. Accepts <code>BINARY</code> or <code>JSON</code>, and defaults to <code>BINARY</code> if no value is supplied.</td> |
| </tr> |
| |
| </tbody></table> |
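| <p>As a rough illustration of the table above, a <code>gora.properties</code> file for AvroStore might look like the sketch below. The two <code>file:///tmp/...</code> paths are hypothetical placeholders and should be replaced with real input and output locations.</p> |

```properties
# Sketch only: the two paths below are hypothetical placeholders.
gora.datastore.default=org.apache.gora.avro.store.AvroStore
gora.avrostore.input.path=file:///tmp/gora/input
gora.avrostore.output.path=file:///tmp/gora/output
# Optional: BINARY is used when this key is omitted.
gora.avrostore.codec.type=JSON
```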
| <h2 id="avrostore-xml-mappings">AvroStore XML mappings<a class="headerlink" href="#avrostore-xml-mappings" title="Permalink">¶</a></h2> |
| <p>In the stores covered within the gora-core module, no physical mappings are required.</p> |
| <h1 id="datafileavrostore">DataFileAvroStore<a class="headerlink" href="#datafileavrostore" title="Permalink">¶</a></h1> |
| <h2 id="description_1">Description<a class="headerlink" href="#description_1" title="Permalink">¶</a></h2> |
| <p>DataFileAvroStore is a file-based store which extends <code>AvroStore</code> to use Avro's <code>DataFile{Writer,Reader}</code> as a backend. |
| This datastore supports MapReduce.</p> |
| <h2 id="goraproperties_1">gora.properties<a class="headerlink" href="#goraproperties_1" title="Permalink">¶</a></h2> |
| <p>DataFileAvroStore is configured exactly as AvroStore above, with the following exception:</p> |
| <table class="table"> |
| <thead> |
| <tr> |
| <th align="left">Property Key</th> |
| <th align="left">Property Value</th> |
| <th align="left">Required</th> |
| <th align="left">Description</th> |
| </tr> |
| </thead> |
| <tbody> |
| <tr> |
| <td>gora.datastore.default=</td> |
| <td>org.apache.gora.avro.store.DataFileAvroStore</td> |
| <td>Yes</td> |
| <td>Implementation of the persistent Java storage class</td> |
| </tr> |
| |
| </tbody></table> |
| <h2 id="gora-core-mappings">Gora Core mappings<a class="headerlink" href="#gora-core-mappings" title="Permalink">¶</a></h2> |
| <p>In the stores covered within the gora-core module, no physical mappings are required.</p> |
| <h1 id="memstore">MemStore<a class="headerlink" href="#memstore" title="Permalink">¶</a></h1> |
| <h2 id="description_2">Description<a class="headerlink" href="#description_2" title="Permalink">¶</a></h2> |
| <p>Essentially, this store is a <code>ConcurrentSkipListMap</code>, in which operations perform as follows:</p> |
| <ul> |
| <li><code>put(K key, T obj)</code>: expected average O(log n) time</li> |
| <li><code>get(K key, String[] fields)</code>: expected average O(log n) time</li> |
| <li><code>delete(K key)</code>: expected average O(log n) time</li> |
| </ul> |
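| <p>The operations listed above can be sketched with a plain <code>ConcurrentSkipListMap</code>. The class below is a hypothetical stand-in written for illustration: its name and methods are assumptions, it is not Gora's <code>MemStore</code>, and it ignores the <code>fields</code> argument of <code>get</code>.</p> |

```java
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical stand-in for illustration; this is NOT Gora's MemStore class.
class SkipListStoreSketch<K, T> {
    // Keys are kept in sorted order, so they must be mutually comparable.
    private final ConcurrentSkipListMap<K, T> map = new ConcurrentSkipListMap<>();

    // Inserts or overwrites the value stored under key.
    void put(K key, T obj) {
        map.put(key, obj);
    }

    // Returns the stored value, or null when the key is absent.
    T get(K key) {
        return map.get(key);
    }

    // Removes the key and reports whether anything was removed.
    boolean delete(K key) {
        return map.remove(key) != null;
    }

    public static void main(String[] args) {
        SkipListStoreSketch<Long, String> store = new SkipListStoreSketch<>();
        store.put(2L, "second");
        store.put(1L, "first");
        System.out.println(store.get(1L));    // value stored under key 1
        System.out.println(store.delete(2L)); // true: key 2 existed
        System.out.println(store.get(2L));    // null: key 2 is gone
    }
}
```

| <p>Each of these calls delegates to a single skip-list operation, which is where the expected logarithmic cost comes from.</p> |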
| <h2 id="goraproperties_2">gora.properties<a class="headerlink" href="#goraproperties_2" title="Permalink">¶</a></h2> |
| <p>MemStore is configured exactly as AvroStore above, with the following exception:</p> |
| <table class="table"> |
| <thead> |
| <tr> |
| <th align="left">Property Key</th> |
| <th align="left">Property Value</th> |
| <th align="left">Required</th> |
| <th align="left">Description</th> |
| </tr> |
| </thead> |
| <tbody> |
| <tr> |
| <td>gora.datastore.default=</td> |
| <td>org.apache.gora.memory.store.MemStore</td> |
| <td>Yes</td> |
| <td>Implementation of the Java class used to hold data in memory</td> |
| </tr> |
| |
| </tbody></table> |
| <h2 id="memstore-xml-mappings">MemStore XML mappings<a class="headerlink" href="#memstore-xml-mappings" title="Permalink">¶</a></h2> |
| <p>In the stores covered within the gora-core module, no physical mappings are required.</p> |
| <h1 id="gorasparkengine">GoraSparkEngine<a class="headerlink" href="#gorasparkengine" title="Permalink">¶</a></h1> |
| <h2 id="description_3">Description<a class="headerlink" href="#description_3" title="Permalink">¶</a></h2> |
| <p>GoraSparkEngine is the Spark backend of Gora. Assume that the input and output data stores are:</p> |
| <pre><code>DataStore<K1, V1> inStore; |
| DataStore<K2, V2> outStore; |
| </code></pre> |
| <p>The first step in using GoraSparkEngine is to initialize it:</p> |
| <pre><code>GoraSparkEngine<K1, V1> goraSparkEngine = new GoraSparkEngine<>(K1.class, V1.class); |
| </code></pre> |
| <p>Construct a <code>JavaSparkContext</code>, registering the input data store's value class with Kryo:</p> |
| <pre><code>SparkConf sparkConf = new SparkConf().setAppName("Gora Spark Integration Application").setMaster("local"); |
| Class[] c = new Class[1]; |
| c[0] = inStore.getPersistentClass(); |
| sparkConf.registerKryoClasses(c); |
| JavaSparkContext sc = new JavaSparkContext(sparkConf); |
| </code></pre> |
| <p>A <code>JavaPairRDD</code> can then be retrieved from the input data store:</p> |
| <pre><code>JavaPairRDD<Long, Pageview> goraRDD = goraSparkEngine.initialize(sc, inStore); |
| </code></pre> |
| <p>After that, all Spark functionality can be applied. For example, a count can be run as follows:</p> |
| <pre><code>long count = goraRDD.count(); |
| </code></pre> |
| <p>Map and reduce functions can be run on a <code>JavaPairRDD</code> as well. Assume that the following variable holds the result after map/reduce is applied:</p> |
| <pre><code>JavaPairRDD<String, MetricDatum> mapReducedGoraRdd; |
| </code></pre> |
| <p>The result can be written to the output data store as follows:</p> |
| <pre><code>Configuration sparkHadoopConf = goraSparkEngine.generateOutputConf(outStore); |
| mapReducedGoraRdd.saveAsNewAPIHadoopDataset(sparkHadoopConf); |
| </code></pre> |
| |
| |
| </div> <!-- /container (main block) --> |
| |
| <hr> |
| |
| <div class="container"> |
| <footer> |
| <p>Copyright © 2010-2023 The Apache Software Foundation. Licensed under <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License 2.0</a>. |
| </p> |
| <p>Apache Gora, Gora, Apache, the Apache feather logo, and the Apache Gora project logo are trademarks of The Apache Software Foundation. |
| </p> |
| </footer> |
| |
| </div> <!-- /container --> |
| |
| <!-- The javascript |
| ================================================== --> |
| <!-- Placed at the end of the document so the pages load faster --> |
| <script src="http://ajax.googleapis.com/ajax/libs/jquery/1.8.1/jquery.min.js" type="text/javascript"></script> |
| <script src="/resources/js/bootstrap.min.js"></script> |
| <link rel="stylesheet" href="/resources/css/docco.css"> |
| <script src="//cdn.jsdelivr.net/gh/highlightjs/cdn-release@11.0.1/build/highlight.min.js"></script> |
| <script>hljs.highlightAll();</script> |
| </body> |
| </html> |