| <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> |
| <!-- |
| Licensed to the Apache Software Foundation (ASF) under one |
| or more contributor license agreements. See the NOTICE file |
| distributed with this work for additional information |
| regarding copyright ownership. The ASF licenses this file |
| to you under the Apache License, Version 2.0 (the |
| "License"); you may not use this file except in compliance |
| with the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, |
| software distributed under the License is distributed on an |
| "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| KIND, either express or implied. See the License for the |
| specific language governing permissions and limitations |
| under the License. |
| --> |
| <html> |
| <head> |
| <meta http-equiv="content-type" content="text/html; charset=UTF-8" /> |
| <meta charset="utf-8" /> |
| <meta name="viewport" content="width=device-width, initial-scale=1.0" /> |
| <meta name="author" content="dev@gora.apache.org" /> |
| |
| <meta http-equiv="Content-Type" content="text/html;charset=UTF-8" /> |
| <meta name="Description" content="Apache Gora -- Gora Core Module" /> |
| <meta name="Keywords" content="Apache Gora NoSQL Framework" /> |
| <meta name="Owner" content="dev@gora.apache.org" /> |
| <meta name="Robots" content="index, follow" /> |
| <meta name="Security" content="Public" /> |
| <meta name="Source" content="wiki template" /> |
| <meta |
| name="DC.Rights" |
| content="Copyright 2010-2024, The Apache Software Foundation" |
| /> |
| <link href="/resources/css/bootstrap.min.css" rel="stylesheet" /> |
| <!-- Fav and touch icons --> |
| <link |
| rel="apple-touch-icon-precomposed" |
| sizes="144x144" |
| href="http://twitter.github.com/bootstrap/assets/ico/apple-touch-icon-144-precomposed.png" |
| /> |
| <link |
| rel="apple-touch-icon-precomposed" |
| sizes="114x114" |
| href="http://twitter.github.com/bootstrap/assets/ico/apple-touch-icon-114-precomposed.png" |
| /> |
| <link |
| rel="apple-touch-icon-precomposed" |
| sizes="72x72" |
| href="http://twitter.github.com/bootstrap/assets/ico/apple-touch-icon-72-precomposed.png" |
| /> |
| <link |
| rel="apple-touch-icon-precomposed" |
| href="http://twitter.github.com/bootstrap/assets/ico/apple-touch-icon-57-precomposed.png" |
| /> |
| <link rel="shortcut icon" href="/resources/img/feather-small.png" /> |
| |
| <title>Apache Gora™ - Gora Core Module</title> |
| </head> |
| |
| <body style="padding-top: 100px"> |
| <nav class="navbar navbar-expand-lg navbar-dark bg-dark fixed-top shadow-lg"> |
| <div class="container-fluid"> |
| <a class="navbar-brand" href="/index.html" |
| ><img |
| src="/resources/img/gora-logo.png" |
| alt="Apache Gora" |
| title="Apache Gora" |
| height="50px" |
| /></a> |
| <button |
| class="navbar-toggler" |
| type="button" |
| data-bs-toggle="collapse" |
| data-bs-target="#navbarNav" |
| aria-controls="navbarNav" |
| aria-expanded="false" |
| aria-label="Toggle navigation" |
| > |
| <span class="navbar-toggler-icon"></span> |
| </button> |
| <div class="collapse navbar-collapse" id="navbarNav"> |
| <ul class="navbar-nav me-auto"> |
| <li class="nav-item"> |
| <a class="nav-link" href="/downloads.html">Downloads</a> |
| </li> |
| <li class="nav-item dropdown"> |
| <a |
| class="nav-link dropdown-toggle" |
| href="#" |
| id="navbarDropdown1" |
| role="button" |
| data-bs-toggle="dropdown" |
| aria-expanded="false" |
| >Community</a |
| > |
| <ul class="dropdown-menu" aria-labelledby="navbarDropdown1"> |
| <li> |
| <a |
| class="dropdown-item" |
| href="https://whimsy.apache.org/board/minutes/Gora.html" |
| >Board Reporting</a |
| > |
| </li> |
| <li> |
| <a class="dropdown-item" href="/contribute.html" |
| >Contribute</a |
| > |
| </li> |
| <li> |
| <a class="dropdown-item" href="/mailing_lists.html" |
| >Mailing Lists</a |
| > |
| </li> |
| <li> |
| <a class="dropdown-item" href="/credits.html">People</a> |
| </li> |
| <li> |
| <a class="dropdown-item" href="/related.html" |
| >Related Projects</a |
| > |
| </li> |
| </ul> |
| </li> |
| <li class="nav-item dropdown"> |
| <a |
| class="nav-link dropdown-toggle" |
| href="#" |
| id="navbarDropdown2" |
| role="button" |
| data-bs-toggle="dropdown" |
| aria-expanded="false" |
| >Documentation</a |
| > |
| <ul class="dropdown-menu" aria-labelledby="navbarDropdown2"> |
| <li><a class="dropdown-item" href="/about.html">About</a></li> |
| <li> |
| <a class="dropdown-item" href="/current/index.html" |
| >Current Documentation</a |
| > |
| </li> |
| <li> |
| <a class="dropdown-item" href="/current/api/javadoc.html" |
| >JavaDoc Documentation</a |
| > |
| </li> |
| <li> |
| <a class="dropdown-item" href="/current/tutorial.html" |
| >Gora Tutorial</a |
| > |
| </li> |
| <li> |
| <a |
| class="dropdown-item" |
| href="https://cwiki.apache.org/confluence/display/GORA/" |
| >Gora Wiki</a |
| > |
| </li> |
| </ul> |
| </li> |
| <li class="nav-item dropdown"> |
| <a |
| class="nav-link dropdown-toggle" |
| href="#" |
| id="navbarDropdown3" |
| role="button" |
| data-bs-toggle="dropdown" |
| aria-expanded="false" |
| >Development</a |
| > |
| <ul class="dropdown-menu" aria-labelledby="navbarDropdown3"> |
| <li> |
| <a |
| class="dropdown-item" |
| href="https://issues.apache.org/jira/browse/GORA" |
| >Issue Tracking</a |
| > |
| </li> |
| <li> |
| <a class="dropdown-item" href="/mailing_lists.html" |
| >Mailing Lists</a |
| > |
| </li> |
| <li> |
| <a class="dropdown-item" href="/version_control.html" |
| >Version Control</a |
| > |
| </li> |
| <li> |
| <a class="dropdown-item" href="/roadmap.html">Roadmap</a> |
| </li> |
| </ul> |
| </li> |
| <li class="nav-item dropdown"> |
| <a |
| class="nav-link dropdown-toggle" |
| href="#" |
| id="navbarDropdown4" |
| role="button" |
| data-bs-toggle="dropdown" |
| aria-expanded="false" |
| > |
| <img |
| src="/resources/img/feather-small.png" |
| alt="Apache" |
| title="Apache" |
| /> |
| </a> |
| <ul class="dropdown-menu" aria-labelledby="navbarDropdown4"> |
| <li> |
| <a class="dropdown-item" href="http://www.apache.org" |
| >Apache Home</a |
| > |
| </li> |
| <li> |
| <a |
| class="dropdown-item" |
| href="http://www.apache.org/licenses/" |
| >Apache License</a |
| > |
| </li> |
| <li> |
| <a |
| class="dropdown-item" |
| href="http://www.apache.org/security/" |
| >Security</a |
| > |
| </li> |
| <li> |
| <a |
| class="dropdown-item" |
| href="http://www.apache.org/foundation/sponsorship.html" |
| >Support</a |
| > |
| </li> |
| <li> |
| <a |
| class="dropdown-item" |
| href="http://www.apache.org/foundation/thanks.html" |
| >Thanks</a |
| > |
| </li> |
| </ul> |
| </li> |
| </ul> |
| </div> |
| </div> |
| </nav> |
| |
| <div class="container top-buffer" id="Gora_Gora Core Module"> |
| <h1 id="overview">Overview<a class="headerlink" href="#overview" title="Permalink">¶</a></h1> |
| <p>This is the main documentation for the DataStores contained within the |
| <code>gora-core</code> module which (as its name implies) |
| holds most of the core functionality for the Gora project.</p> |
| <p>Every module |
| in Gora depends on gora-core, so most of the generic documentation |
| about the project is gathered here, along with the documentation for <code>AvroStore</code>, |
| <code>DataFileAvroStore</code> and <code>MemStore</code>. In addition, gora-core holds all of the |
| core <strong>MapReduce</strong>, <strong>GoraSparkEngine</strong>, <strong>Persistency</strong>, <strong>Query</strong>, <strong>DataStoreBase</strong> and <strong>Utility</strong> functionality.</p> |
| <div id="toc"><ul><li><a class="toc-href" href="#avrostore" title="AvroStore">AvroStore</a><ul><li><a class="toc-href" href="#description" title="Description">Description</a></li><li><a class="toc-href" href="#goraproperties" title="gora.properties">gora.properties</a></li><li><a class="toc-href" href="#avrostore-xml-mappings" title="AvroStore XML mappings">AvroStore XML mappings</a></li></ul></li><li><a class="toc-href" href="#datafileavrostore" title="DataFileAvroStore">DataFileAvroStore</a><ul><li><a class="toc-href" href="#description_1" title="Description">Description</a></li><li><a class="toc-href" href="#goraproperties_1" title="gora.properties">gora.properties</a></li><li><a class="toc-href" href="#gora-core-mappings" title="Gora Core mappings">Gora Core mappings</a></li></ul></li><li><a class="toc-href" href="#memstore" title="MemStore">MemStore</a><ul><li><a class="toc-href" href="#description_2" title="Description">Description</a></li><li><a class="toc-href" href="#goraproperties_2" title="gora.properties">gora.properties</a></li><li><a class="toc-href" href="#memstore-xml-mappings" title="MemStore XML mappings">MemStore XML mappings</a></li></ul></li><li><a class="toc-href" href="#gorasparkengine" title="GoraSparkEngine">GoraSparkEngine</a><ul><li><a class="toc-href" href="#description_3" title="Description">Description</a></li></ul></li></ul></div> |
| <h1 id="avrostore">AvroStore<a class="headerlink" href="#avrostore" title="Permalink">¶</a></h1> |
| <h2 id="description">Description<a class="headerlink" href="#description" title="Permalink">¶</a></h2> |
| <p>AvroStore can be used for binary-compatible Avro serialization. It supports both binary and JSON encodings.</p> |
| <h2 id="goraproperties">gora.properties<a class="headerlink" href="#goraproperties" title="Permalink">¶</a></h2> |
| <table class="table"> |
| <thead> |
| <tr> |
| <th align="left">Property Key</th> |
| <th align="left">Property Value</th> |
| <th align="left">Required</th> |
| <th align="left">Description</th> |
| </tr> |
| </thead> |
| <tbody> |
| <tr> |
| <td>gora.datastore.default=</td> |
| <td>org.apache.gora.avro.store.AvroStore</td> |
| <td>Yes</td> |
| <td>Implementation of the persistent Java storage class</td> |
| </tr> |
| <tr> |
| <td>gora.avrostore.input.path=</td> |
| <td><em>hdfs://uri/path/to/hdfs/input/path</em> || <em>file:///uri/path/to/local/input/path</em></td> |
| <td>Yes</td> |
| <td>This value should point to the input directory on HDFS (if running Gora in a distributed Hadoop environment) or to an input directory on the local file system (if running Gora locally).</td> |
| </tr> |
| <tr> |
| <td>gora.avrostore.output.path=</td> |
| <td><em>hdfs://uri/path/to/hdfs/output/path</em> || <em>file:///uri/path/to/local/output/path</em></td> |
| <td>Yes</td> |
| <td>This value should point to the output directory on HDFS (if running Gora in a distributed Hadoop environment) or to an output directory on the local file system (if running Gora locally).</td> |
| </tr> |
| <tr> |
| <td>gora.avrostore.codec.type=</td> |
| <td>BINARY || JSON</td> |
| <td>No</td> |
| <td>The property key specifying the Avro encoder/decoder type to use. Can take the values <code>BINARY</code> or <code>JSON</code>, and defaults to <code>BINARY</code> if none is supplied.</td> |
| </tr> |
| |
| </tbody></table> |
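| <p>Taken together, the properties above might be combined as in the following sketch of a <code>gora.properties</code> file. The two <code>file:///tmp/gora/...</code> paths are hypothetical placeholders chosen for illustration, not values from the Gora distribution; substitute the HDFS or local directories used in your own deployment.</p> |
| <pre><code># Sketch only: the two paths below are assumed placeholders. |
| gora.datastore.default=org.apache.gora.avro.store.AvroStore |
| gora.avrostore.input.path=file:///tmp/gora/input |
| gora.avrostore.output.path=file:///tmp/gora/output |
| # Optional; BINARY is used when this key is omitted. |
| gora.avrostore.codec.type=JSON |
| </code></pre> |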
| <h2 id="avrostore-xml-mappings">AvroStore XML mappings<a class="headerlink" href="#avrostore-xml-mappings" title="Permalink">¶</a></h2> |
| <p>In the stores covered within the gora-core module, no physical mappings are required.</p> |
| <h1 id="datafileavrostore">DataFileAvroStore<a class="headerlink" href="#datafileavrostore" title="Permalink">¶</a></h1> |
| <h2 id="description_1">Description<a class="headerlink" href="#description_1" title="Permalink">¶</a></h2> |
| <p>DataFileAvroStore is a file-based store which extends <code>AvroStore</code> to use Avro's <code>DataFile{Writer,Reader}</code> classes as a backend. |
| This datastore supports MapReduce.</p> |
| <h2 id="goraproperties_1">gora.properties<a class="headerlink" href="#goraproperties_1" title="Permalink">¶</a></h2> |
| <p>DataFileAvroStore is configured exactly as AvroStore above, with the following exception:</p> |
| <table class="table"> |
| <thead> |
| <tr> |
| <th align="left">Property Key</th> |
| <th align="left">Property Value</th> |
| <th align="left">Required</th> |
| <th align="left">Description</th> |
| </tr> |
| </thead> |
| <tbody> |
| <tr> |
| <td>gora.datastore.default=</td> |
| <td>org.apache.gora.avro.store.DataFileAvroStore</td> |
| <td>Yes</td> |
| <td>Implementation of the persistent Java storage class</td> |
| </tr> |
| |
| </tbody></table> |
| <h2 id="gora-core-mappings">Gora Core mappings<a class="headerlink" href="#gora-core-mappings" title="Permalink">¶</a></h2> |
| <p>In the stores covered within the gora-core module, no physical mappings are required.</p> |
| <h1 id="memstore">MemStore<a class="headerlink" href="#memstore" title="Permalink">¶</a></h1> |
| <h2 id="description_2">Description<a class="headerlink" href="#description_2" title="Permalink">¶</a></h2> |
| <p>Essentially, this store is a <code>ConcurrentSkipListMap</code> in which operations perform as follows:</p> |
| <ul> |
| <li><code>put(K key, T obj)</code>: expected average log(n) time cost</li> |
| <li><code>get(K key, String[] fields)</code>: expected average log(n) time cost</li> |
| <li><code>delete(K key)</code>: expected average log(n) time cost</li> |
| </ul> |
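| <p>Because MemStore is described as essentially a <code>ConcurrentSkipListMap</code>, the three operations above can be illustrated with a small standalone sketch that uses only the JDK. The class <code>SkipListStoreSketch</code> and its methods are hypothetical stand-ins for illustration; this is not Gora's MemStore implementation, and the field selection performed by <code>get</code> is omitted.</p> |
| <pre><code>import java.util.concurrent.ConcurrentNavigableMap; |
| import java.util.concurrent.ConcurrentSkipListMap; |
|  |
| // Hypothetical stand-in for an in-memory store backed by a skip list. |
| public class SkipListStoreSketch { |
|     // Keys are kept in sorted order; the map is safe for concurrent access. |
|     private final ConcurrentNavigableMap&lt;Long, String&gt; map = new ConcurrentSkipListMap&lt;&gt;(); |
|  |
|     // Inserts or overwrites the value stored under the key. |
|     public void put(Long key, String value) { |
|         map.put(key, value); |
|     } |
|  |
|     // Returns the stored value, or null if the key is absent. |
|     public String get(Long key) { |
|         return map.get(key); |
|     } |
|  |
|     // Removes the key and reports whether a value was actually removed. |
|     public boolean delete(Long key) { |
|         return map.remove(key) != null; |
|     } |
|  |
|     public static void main(String[] args) { |
|         SkipListStoreSketch store = new SkipListStoreSketch(); |
|         store.put(2L, "two"); |
|         store.put(1L, "one"); |
|         System.out.println(store.get(1L));    // prints "one" |
|         System.out.println(store.delete(2L)); // prints "true" |
|         System.out.println(store.get(2L));    // prints "null" |
|     } |
| } |
| </code></pre> |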
| <h2 id="goraproperties_2">gora.properties<a class="headerlink" href="#goraproperties_2" title="Permalink">¶</a></h2> |
| <p>MemStore is configured exactly as AvroStore above, with the following exception:</p> |
| <table class="table"> |
| <thead> |
| <tr> |
| <th align="left">Property Key</th> |
| <th align="left">Property Value</th> |
| <th align="left">Required</th> |
| <th align="left">Description</th> |
| </tr> |
| </thead> |
| <tbody> |
| <tr> |
| <td>gora.datastore.default=</td> |
| <td>org.apache.gora.memory.store.MemStore</td> |
| <td>Yes</td> |
| <td>Implementation of the Java class used to hold data in memory</td> |
| </tr> |
| |
| </tbody></table> |
| <h2 id="memstore-xml-mappings">MemStore XML mappings<a class="headerlink" href="#memstore-xml-mappings" title="Permalink">¶</a></h2> |
| <p>In the stores covered within the gora-core module, no physical mappings are required.</p> |
| <h1 id="gorasparkengine">GoraSparkEngine<a class="headerlink" href="#gorasparkengine" title="Permalink">¶</a></h1> |
| <h2 id="description_3">Description<a class="headerlink" href="#description_3" title="Permalink">¶</a></h2> |
| <p>GoraSparkEngine is the Spark backend of Gora. Assume that the input and output data stores are:</p> |
| <pre><code>DataStore&lt;K1, V1&gt; inStore; |
| DataStore&lt;K2, V2&gt; outStore; |
| </code></pre> |
| <p>The first step in using GoraSparkEngine is to initialize it with the key and value classes of the input data store:</p> |
| <pre><code>GoraSparkEngine&lt;K1, V1&gt; goraSparkEngine = new GoraSparkEngine&lt;&gt;(K1.class, V1.class); |
| </code></pre> |
| <p>Next, construct a <code>JavaSparkContext</code>, registering the input data store’s value class with Kryo:</p> |
| <pre><code>SparkConf sparkConf = new SparkConf().setAppName("Gora Spark Integration Application").setMaster("local"); |
| // Register the persistent (value) class of the input store for Kryo serialization. |
| Class&lt;?&gt;[] kryoClasses = new Class&lt;?&gt;[] { inStore.getPersistentClass() }; |
| sparkConf.registerKryoClasses(kryoClasses); |
| JavaSparkContext sc = new JavaSparkContext(sparkConf); |
| </code></pre> |
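| <p>A <code>JavaPairRDD</code> can then be retrieved from the input data store (in this example, the key type is <code>Long</code> and the value type is <code>Pageview</code>):</p> |
| <pre><code>JavaPairRDD&lt;Long, Pageview&gt; goraRDD = goraSparkEngine.initialize(sc, inStore); |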
| </code></pre> |
| <p>After that, all Spark functionality can be applied. For example, the records can be counted as follows:</p> |
| <pre><code>long count = goraRDD.count(); |
| </code></pre> |
| <p>Map and reduce functions can be run on a <code>JavaPairRDD</code> as well. Assume that the following variable holds the result after map/reduce has been applied:</p> |
| <pre><code>JavaPairRDD&lt;String, MetricDatum&gt; mapReducedGoraRdd; |
| </code></pre> |
| <p>The result can be written to the output data store as follows:</p> |
| <pre><code>Configuration sparkHadoopConf = goraSparkEngine.generateOutputConf(outStore); |
| mapReducedGoraRdd.saveAsNewAPIHadoopDataset(sparkHadoopConf); |
| </code></pre> |
| |
| </div> |
| <!-- /container (main block) --> |
| |
| <hr /> |
| |
| <div class="container"> |
| <footer> |
| <p> |
| Copyright © 2010-2024 The Apache Software Foundation. |
| Licensed under |
| <a href="http://www.apache.org/licenses/LICENSE-2.0" |
| >Apache License 2.0</a |
| >. |
| </p> |
| <p> |
| Apache Gora, Gora, Apache, the Apache feather logo, and the Apache |
| Gora project logo are trademarks of The Apache Software Foundation. |
| </p> |
| </footer> |
| </div> |
| <!-- /container --> |
| |
| <script src="/resources/js/bootstrap.bundle.min.js"></script> |
| <script src="//cdn.jsdelivr.net/gh/highlightjs/cdn-release@11.0.1/build/highlight.min.js"></script> |
| <script> |
| hljs.highlightAll(); |
| </script> |
| </body> |
| </html> |