<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>Pegasus | Data Model</title>
<link rel="stylesheet" href="/assets/css/app.css">
<link rel="shortcut icon" href="/assets/images/favicon.ico">
<link rel="stylesheet" href="/assets/css/utilities.min.css">
<link rel="stylesheet" href="/assets/css/docsearch.v3.css">
<script src="/assets/js/jquery.min.js"></script>
<script src="/assets/js/all.min.js"></script>
<script src="/assets/js/docsearch.v3.js"></script>
<!-- Begin Jekyll SEO tag v2.8.0 -->
<title>Data Model | Pegasus</title>
<meta name="generator" content="Jekyll v4.3.3" />
<meta property="og:title" content="Data Model" />
<meta property="og:locale" content="en_US" />
<meta name="description" content="Introduction" />
<meta property="og:description" content="Introduction" />
<meta property="og:site_name" content="Pegasus" />
<meta property="og:type" content="article" />
<meta property="article:published_time" content="2024-04-22T13:02:52+00:00" />
<meta name="twitter:card" content="summary" />
<meta property="twitter:title" content="Data Model" />
<script type="application/ld+json">
{"@context":"https://schema.org","@type":"BlogPosting","dateModified":"2024-04-22T13:02:52+00:00","datePublished":"2024-04-22T13:02:52+00:00","description":"Introduction","headline":"Data Model","mainEntityOfPage":{"@type":"WebPage","@id":"/overview/data-model/"},"url":"/overview/data-model/"}</script>
<!-- End Jekyll SEO tag -->
</head>
<body>
<nav class="navbar is-primary">
<div class="container">
<!--container will be unwrapped when it's in docs-->
<div class="navbar-brand">
<a href="/" class="navbar-item ">
<!-- Pegasus Icon -->
<img src="/assets/images/pegasus.svg">
</a>
<div class="navbar-item">
<a href="/docs" class="button is-primary is-outlined is-inverted">
<span class="icon"><i class="fas fa-book"></i></span>
<span>Docs</span>
</a>
</div>
<div class="navbar-item is-hidden-desktop">
<!--A simple language switch button that only supports zh and en.-->
<!--IF its language is zh, then switches to en.-->
<a class="button is-primary is-outlined is-inverted" href="/zh/overview/data-model/"><strong></strong></a>
</div>
<a role="button" class="navbar-burger burger" aria-label="menu" aria-expanded="false" data-target="navMenu">
<!-- Appears in mobile mode only -->
<span aria-hidden="true"></span>
<span aria-hidden="true"></span>
<span aria-hidden="true"></span>
</a>
</div>
<div class="navbar-menu" id="navMenu">
<div class="navbar-end">
<!--dropdown-->
<div class="navbar-item has-dropdown is-hoverable ">
<a href=""
class="navbar-link ">
<span class="icon" style="margin-right: .25em">
<i class="fas fa-users"></i>
</span>
<span>
ASF
</span>
</a>
<div class="navbar-dropdown">
<a href="https://www.apache.org/"
class="navbar-item ">
Foundation
</a>
<a href="https://www.apache.org/licenses/"
class="navbar-item ">
License
</a>
<a href="https://www.apache.org/events/current-event.html"
class="navbar-item ">
Events
</a>
<a href="https://www.apache.org/foundation/sponsorship.html"
class="navbar-item ">
Sponsorship
</a>
<a href="https://www.apache.org/security/"
class="navbar-item ">
Security
</a>
<a href="https://privacy.apache.org/policies/privacy-policy-public.html"
class="navbar-item ">
Privacy
</a>
<a href="https://www.apache.org/foundation/thanks.html"
class="navbar-item ">
Thanks
</a>
</div>
</div>
<!--dropdown-->
<div class="navbar-item has-dropdown is-hoverable ">
<a href="/community"
class="navbar-link ">
<span class="icon" style="margin-right: .25em">
<i class="fas fa-user-plus"></i>
</span>
<span>
Community
</span>
</a>
<div class="navbar-dropdown">
<a href="/community/#contact-us"
class="navbar-item ">
Contact Us
</a>
<a href="/community/#contribution"
class="navbar-item ">
Contribution
</a>
<a href="https://cwiki.apache.org/confluence/display/PEGASUS/Coding+guides"
class="navbar-item ">
Coding Guides
</a>
<a href="https://github.com/apache/incubator-pegasus/issues?q=is%3Aissue+is%3Aopen+label%3Atype%2Fbug"
class="navbar-item ">
Bug Tracking
</a>
<a href="https://cwiki.apache.org/confluence/display/INCUBATOR/PegasusProposal"
class="navbar-item ">
Apache Proposal
</a>
</div>
</div>
<a href="/blogs"
class="navbar-item ">
<span class="icon" style="margin-right: .25em">
<i class="fas fa-rss"></i>
</span>
<span>Blog</span>
</a>
<a href="/docs/downloads"
class="navbar-item ">
<span class="icon" style="margin-right: .25em">
<i class="fas fa-fire"></i>
</span>
<span>Releases</span>
</a>
</div>
<div class="navbar-item is-hidden-mobile">
<!--A simple language switch button that only supports zh and en.-->
<!--IF its language is zh, then switches to en.-->
<a class="button is-primary is-outlined is-inverted" href="/zh/overview/data-model/"><strong></strong></a>
</div>
</div>
</div>
</nav>
<section class="section">
<div class="container">
<div class="columns is-multiline">
<div class="column is-one-fourth">
<aside class="menu">
<p class="menu-label"></p>
<ul class="menu-list">
<li>
<a href="/overview"
class="">
Overview
</a>
</li>
<li>
<a href="/overview/background"
class="">
Background
</a>
</li>
<li>
<a href="/overview/architecture"
class="">
Architecture
</a>
</li>
<li>
<a href="/overview/data-model"
class="">
Data Model
</a>
</li>
<li>
<a href="/overview/benchmark"
class="">
Benchmark
</a>
</li>
<li>
<a href="/docs/build/compile-by-docker"
class="">
Installation
</a>
</li>
<li>
<a href="/overview/onebox"
class="">
Onebox
</a>
</li>
</ul>
</aside>
</div>
<div class="column is-half">
<div class="content">
<h1 id="Data Model">Data Model</h1>
<h2 id="introduction">Introduction</h2>
<p>Pegasus uses a simple Key-Value data model and does not support complex schemas. To make the model more expressive, however, each key is split into a <strong>HashKey</strong> and a <strong>SortKey</strong>, forming a composite key (<code class="language-plaintext highlighter-rouge">[HashKey, SortKey] -&gt; Value</code>). This is similar to the <a href="http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/howitworks.corecomponents.html#howitworks.corecomponents.primarykey">composite primary key</a> in <a href="https://aws.amazon.com/dynamodb/">DynamoDB</a>.</p>
<h3 id="hashkey">HashKey</h3>
<p>Byte string. Similar to the partition key in DynamoDB, the HashKey determines which partition (a.k.a. shard) the data belongs to. Pegasus applies a specific hash function to the HashKey and takes the result modulo the number of partitions to obtain the <strong>Partition ID</strong> for the data. Therefore, data with the same HashKey is always stored in the same partition.</p>
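<p>The mapping from HashKey to Partition ID can be sketched as follows. This is only an illustration: <code class="language-plaintext highlighter-rouge">zlib.crc32</code> stands in for the hash function Pegasus actually uses, and <code class="language-plaintext highlighter-rouge">partition_id</code> is a hypothetical helper, not part of any Pegasus client.</p>

```python
import zlib


def partition_id(hash_key, partition_count):
    """Map a HashKey (bytes) to a partition index in [0, partition_count)."""
    if partition_count <= 0:
        raise ValueError("partition_count must be positive")
    # Assumption: zlib.crc32 is a stand-in for Pegasus's own hash function.
    return zlib.crc32(hash_key) % partition_count


if __name__ == "__main__":
    # The same HashKey always maps to the same partition.
    for key in (b"user_1", b"user_2", b"user_1"):
        print(key, partition_id(key, 8))
```

<p>Because the result depends only on the HashKey and the partition count, every SortKey under one HashKey lands in the same partition.</p>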
<blockquote>
<p>Note:
On the C++ client side, the HashKey length limit is 64KB.
On the Java client side, if <a href="https://github.com/apache/incubator-pegasus/blob/v2.5.0/java-client/src/main/java/org/apache/pegasus/client/ClientOptions.java#L360C12-L360C12">WriteLimiter</a> is enabled, then the limit is 1KB.
On the server side, since Pegasus 2.0.0, if <code class="language-plaintext highlighter-rouge">[replication]max_allowed_write_size</code> is set to a non-zero value, the size of the entire request packet is limited to that value (1MB by default).</p>
</blockquote>
<h3 id="sortkey">SortKey</h3>
<p>Byte string. Similar to the sort key in DynamoDB, the SortKey is used for sorting data within a partition. Internally, Pegasus stores data in RocksDB and concatenates HashKey and SortKey to form the RocksDB key.</p>
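<p>A minimal sketch of how a concatenated key can keep entries of one HashKey adjacent and ordered by SortKey. The exact byte layout is not specified on this page; the 2-byte big-endian length prefix used here is an assumption for illustration, and <code class="language-plaintext highlighter-rouge">encode_key</code> and <code class="language-plaintext highlighter-rouge">decode_key</code> are hypothetical helpers.</p>

```python
import struct


def encode_key(hash_key, sort_key):
    """Build one storage key from a (HashKey, SortKey) pair of byte strings."""
    # Assumed layout: [2-byte big-endian HashKey length][HashKey][SortKey].
    # The length prefix keeps (b"ab", b"c") distinct from (b"a", b"bc").
    return struct.pack(">H", len(hash_key)) + hash_key + sort_key


def decode_key(raw):
    """Split a storage key back into (HashKey, SortKey)."""
    (hash_len,) = struct.unpack(">H", raw[:2])
    return raw[2:2 + hash_len], raw[2 + hash_len:]


if __name__ == "__main__":
    pairs = [(b"user_1", b"b"), (b"user_2", b"a"), (b"user_1", b"a")]
    # Byte-wise ordering of the encoded keys groups entries by HashKey
    # and orders them by SortKey within each HashKey.
    for raw in sorted(encode_key(h, s) for h, s in pairs):
        print(decode_key(raw))
```

<p>With this layout, a range scan over a single HashKey reads a contiguous run of keys in SortKey order.</p>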
<blockquote>
<p>Note:
On the C++ client side, there is no limit to the length of SortKey.
On the Java client side, if <a href="https://github.com/apache/incubator-pegasus/blob/v2.5.0/java-client/src/main/java/org/apache/pegasus/client/ClientOptions.java#L360C12-L360C12">WriteLimiter</a> is enabled, then the limit is 1KB.
On the server side, since Pegasus 2.0.0, if <code class="language-plaintext highlighter-rouge">[replication]max_allowed_write_size</code> is set to a non-zero value, the size of the entire request packet is limited to that value (1MB by default).</p>
</blockquote>
<h3 id="value">Value</h3>
<p>Byte string.</p>
<blockquote>
<p>Note:
On the C++ client side, there is no limit to the length of the Value.
On the Java client side, if <a href="https://github.com/apache/incubator-pegasus/blob/v2.5.0/java-client/src/main/java/org/apache/pegasus/client/ClientOptions.java#L360C12-L360C12">WriteLimiter</a> is enabled, then the limit is 400KB.
On the server side, since Pegasus 2.0.0, if <code class="language-plaintext highlighter-rouge">[replication]max_allowed_write_size</code> is set to a non-zero value, the size of the entire request packet is limited to that value (1MB by default).</p>
</blockquote>
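<p>The Java client limits quoted in the notes above (1KB for HashKey, 1KB for SortKey, 400KB for Value when WriteLimiter is enabled) can be checked before a write is sent. The sketch below is a hypothetical pre-flight check written for illustration; it is not the WriteLimiter implementation, and the constant and function names are assumptions.</p>

```python
KB = 1024

# Limits taken from the notes above; they apply when WriteLimiter is enabled.
MAX_HASH_KEY_BYTES = 1 * KB
MAX_SORT_KEY_BYTES = 1 * KB
MAX_VALUE_BYTES = 400 * KB


def check_write(hash_key, sort_key, value):
    """Return a list of violated limits; an empty list means the write fits."""
    problems = []
    if len(hash_key) > MAX_HASH_KEY_BYTES:
        problems.append("HashKey exceeds 1KB")
    if len(sort_key) > MAX_SORT_KEY_BYTES:
        problems.append("SortKey exceeds 1KB")
    if len(value) > MAX_VALUE_BYTES:
        problems.append("Value exceeds 400KB")
    return problems


if __name__ == "__main__":
    print(check_write(b"user_1", b"profile", b"x" * (500 * KB)))
```

<p>The server-side <code class="language-plaintext highlighter-rouge">max_allowed_write_size</code> limit applies to the whole request packet, so it is not modeled by these per-field checks.</p>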
<p><img src="/assets/images/pegasus-data-model.png" alt="pegasus-data-model" class="img-responsive docs-image" /></p>
<h2 id="pegasus-vs-hbase">Pegasus vs. HBase</h2>
<p>Although the Pegasus data model is not as semantically rich as HBase’s tabular model, it can still meet the needs of most applications, thanks to its HashKey + SortKey composite key design.
For example, users can treat the HashKey as a row key and the SortKey as an attribute name or column name, so that all entries sharing a HashKey can be viewed as one row, which expresses the concept of a row in HBase.
With this in mind, Pegasus provides not only the <code class="language-plaintext highlighter-rouge">get</code>/<code class="language-plaintext highlighter-rouge">set</code>/<code class="language-plaintext highlighter-rouge">del</code> interfaces for accessing individual entries, but also the <code class="language-plaintext highlighter-rouge">multi_get</code>/<code class="language-plaintext highlighter-rouge">multi_set</code>/<code class="language-plaintext highlighter-rouge">multi_del</code> interfaces for accessing multiple entries under the same HashKey in one batch. These batch interfaces provide single-row atomic semantics.</p>
<p><img src="/assets/images/pegasus-data-model-sample.png" alt="pegasus-data-model" class="img-responsive docs-image" /></p>
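<p>The row-style usage described above can be modeled with a small in-memory sketch. <code class="language-plaintext highlighter-rouge">ToyTable</code> is a hypothetical class that borrows the interface names from the text; it does not use a Pegasus client, and it does not model distribution, persistence, or atomicity.</p>

```python
class ToyTable:
    """In-memory model: HashKey is the row key, SortKey is the column name."""

    def __init__(self):
        self._rows = {}  # hash_key -> {sort_key: value}

    def multi_set(self, hash_key, kvs):
        """Write several columns of one row."""
        self._rows.setdefault(hash_key, {}).update(kvs)

    def multi_get(self, hash_key, sort_keys=None):
        """Read selected columns of one row, or the whole row, in SortKey order."""
        row = self._rows.get(hash_key, {})
        wanted = row.keys() if sort_keys is None else sort_keys
        return {k: row[k] for k in sorted(wanted) if k in row}

    def multi_del(self, hash_key, sort_keys):
        """Delete selected columns of one row; return how many were removed."""
        row = self._rows.get(hash_key, {})
        removed = 0
        for k in sort_keys:
            if k in row:
                del row[k]
                removed += 1
        return removed


if __name__ == "__main__":
    table = ToyTable()
    table.multi_set(b"user_1", {b"name": b"Ann", b"age": b"30"})
    print(table.multi_get(b"user_1"))             # whole row
    print(table.multi_get(b"user_1", [b"name"]))  # one column
```

<p>Each call touches a single HashKey, which mirrors why the batch interfaces can offer atomicity per row.</p>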
<h2 id="pegasus-vs-redis">Pegasus vs. Redis</h2>
<p>Although Pegasus does not support rich data structures such as Redis’ <code class="language-plaintext highlighter-rouge">List</code>/<code class="language-plaintext highlighter-rouge">Set</code>/<code class="language-plaintext highlighter-rouge">Hash</code>, users can still implement similar semantics on top of Pegasus.
For example, users can map the HashKey to the Redis <code class="language-plaintext highlighter-rouge">key</code> and use the SortKey as the <code class="language-plaintext highlighter-rouge">field</code> of a Hash (or the <code class="language-plaintext highlighter-rouge">member</code> of a Set) to emulate a Redis Hash (or Set).</p>
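<p>A rough sketch of that mapping, using a plain dictionary keyed by <code class="language-plaintext highlighter-rouge">(HashKey, SortKey)</code> in place of a real table. The function names echo the Redis commands for readability; they are hypothetical helpers, not Pegasus or Redis APIs.</p>

```python
def hset(store, key, field, value):
    """Hash emulation: HashKey = Redis key, SortKey = field."""
    store[(key, field)] = value


def hget(store, key, field):
    """Return the value of one field, or None if it is absent."""
    return store.get((key, field))


def hgetall(store, key):
    """Return all fields of one Hash, ordered by SortKey."""
    return {f: v for (k, f), v in sorted(store.items()) if k == key}


def sadd(store, key, member):
    """Set emulation: SortKey = member, with an empty Value."""
    is_new = (key, member) not in store
    store[(key, member)] = b""
    return is_new


def smembers(store, key):
    """Return all members of one Set, ordered by SortKey."""
    return sorted(m for (k, m) in store if k == key)


if __name__ == "__main__":
    store = {}
    hset(store, b"user_1", b"name", b"Ann")
    hset(store, b"user_1", b"age", b"30")
    print(hgetall(store, b"user_1"))
    sadd(store, b"tags", b"red")
    sadd(store, b"tags", b"blue")
    print(smembers(store, b"tags"))
```

<p>In a real table the per-key scans would be reads restricted to one HashKey rather than a pass over the whole store.</p>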
</div>
</div>
<div class="column is-one-fourth is-hidden-mobile" style="padding-left: 3rem">
<p class="menu-label">
<span class="icon">
<i class="fa fa-bars" aria-hidden="true"></i>
</span>
Table of contents
</p>
<ul class="menu-list">
<li><a href="#Data Model">Data Model</a>
<ul>
<li><a href="#introduction">Introduction</a>
<ul>
<li><a href="#hashkey">HashKey</a></li>
<li><a href="#sortkey">SortKey</a></li>
<li><a href="#value">Value</a></li>
</ul>
</li>
<li><a href="#pegasus-vs-hbase">Pegasus vs. HBase</a></li>
<li><a href="#pegasus-vs-redis">Pegasus vs. Redis</a></li>
</ul>
</li>
</ul>
</div>
</div>
</div>
</section>
<footer class="footer">
<div class="container">
<div class="content is-small has-text-centered">
<div style="margin-bottom: 20px;">
<a href="http://incubator.apache.org">
<img src="/assets/images/egg-logo.png"
width="15%"
alt="Apache Incubator"/>
</a>
</div>
Copyright &copy; 2023 <a href="http://www.apache.org">The Apache Software Foundation</a>.
Licensed under the <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version
2.0</a>.
<br><br>
Apache Pegasus is an effort undergoing incubation at The Apache Software Foundation (ASF),
sponsored by the Apache Incubator. Incubation is required of all newly accepted projects
until a further review indicates that the infrastructure, communications, and decision making process
have stabilized in a manner consistent with other successful ASF projects. While incubation status is
not necessarily a reflection of the completeness or stability of the code, it does indicate that the
project has yet to be fully endorsed by the ASF.
<br><br>
Apache Pegasus, Pegasus, Apache, the Apache feather logo, and the Apache Pegasus project logo are either
registered trademarks or trademarks of The Apache Software Foundation in the United States and other
countries.
</div>
</div>
</footer>
<script src="/assets/js/app.js" type="text/javascript"></script>
</body>
</html>