<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>Pegasus | Data Model</title>
<link rel="stylesheet" href="/zh/assets/css/app.css">
<link rel="shortcut icon" href="/zh/assets/images/favicon.ico">
<link rel="stylesheet" href="/zh/assets/css/utilities.min.css">
<link rel="stylesheet" href="/zh/assets/css/docsearch.v3.css">
<script src="/assets/js/jquery.min.js"></script>
<script src="/assets/js/all.min.js"></script>
<script src="/assets/js/docsearch.v3.js"></script>
<!-- Begin Jekyll SEO tag v2.8.0 -->
<title>Data Model | Pegasus</title>
<meta name="generator" content="Jekyll v4.3.3" />
<meta property="og:title" content="Data Model" />
<meta property="og:locale" content="en_US" />
<meta name="description" content="Introduction" />
<meta property="og:description" content="Introduction" />
<meta property="og:site_name" content="Pegasus" />
<meta property="og:type" content="article" />
<meta property="article:published_time" content="2024-04-22T13:02:52+00:00" />
<meta name="twitter:card" content="summary" />
<meta property="twitter:title" content="Data Model" />
<script type="application/ld+json">
{"@context":"https://schema.org","@type":"BlogPosting","dateModified":"2024-04-22T13:02:52+00:00","datePublished":"2024-04-22T13:02:52+00:00","description":"Introduction","headline":"Data Model","mainEntityOfPage":{"@type":"WebPage","@id":"/overview/data-model/"},"url":"/overview/data-model/"}</script>
<!-- End Jekyll SEO tag -->
</head>
<body>
<nav class="navbar is-primary">
<div class="container">
<!--container will be unwrapped when it's in docs-->
<div class="navbar-brand">
<a href="/zh/" class="navbar-item ">
<!-- Pegasus Icon -->
<img src="/assets/images/pegasus.svg">
</a>
<div class="navbar-item">
<a href="/zh/docs" class="button is-primary is-outlined is-inverted">
<span class="icon"><i class="fas fa-book"></i></span>
<span>Docs</span>
</a>
</div>
<div class="navbar-item is-hidden-desktop">
<!--A simple language switch button that only supports zh and en.-->
<!--IF its language is zh, then switches to en.-->
<!--If you don't want a url to be relativized, you can add a space explicitly into the href to
prevents a url from being relativized by polyglot.-->
<a class="button is-primary is-outlined is-inverted" href=" /overview/data-model/"><strong>En</strong></a>
</div>
<a role="button" class="navbar-burger burger" aria-label="menu" aria-expanded="false" data-target="navMenu">
<!-- Appears in mobile mode only -->
<span aria-hidden="true"></span>
<span aria-hidden="true"></span>
<span aria-hidden="true"></span>
</a>
</div>
<div class="navbar-menu" id="navMenu">
<div class="navbar-end">
<!--dropdown-->
<div class="navbar-item has-dropdown is-hoverable ">
<a href=""
class="navbar-link ">
<span class="icon" style="margin-right: .25em">
<i class="fas fa-users"></i>
</span>
<span>
ASF
</span>
</a>
<div class="navbar-dropdown">
<a href="https://www.apache.org/"
class="navbar-item ">
Foundation
</a>
<a href="https://www.apache.org/licenses/"
class="navbar-item ">
License
</a>
<a href="https://www.apache.org/events/current-event.html"
class="navbar-item ">
Events
</a>
<a href="https://www.apache.org/foundation/sponsorship.html"
class="navbar-item ">
Sponsorship
</a>
<a href="https://www.apache.org/security/"
class="navbar-item ">
Security
</a>
<a href="https://privacy.apache.org/policies/privacy-policy-public.html"
class="navbar-item ">
Privacy
</a>
<a href="https://www.apache.org/foundation/thanks.html"
class="navbar-item ">
Thanks
</a>
</div>
</div>
<!--dropdown-->
<div class="navbar-item has-dropdown is-hoverable ">
<a href="/zh/community"
class="navbar-link ">
<span class="icon" style="margin-right: .25em">
<i class="fas fa-user-plus"></i>
</span>
<span>
Community
</span>
</a>
<div class="navbar-dropdown">
<a href="/zh/community/#contact-us"
class="navbar-item ">
Contact Us
</a>
<a href="/zh/community/#contribution"
class="navbar-item ">
Contributing
</a>
<a href="https://cwiki.apache.org/confluence/display/PEGASUS/Coding+guides"
class="navbar-item ">
Coding Guides
</a>
<a href="https://github.com/apache/incubator-pegasus/issues?q=is%3Aissue+is%3Aopen+label%3Atype%2Fbug"
class="navbar-item ">
Bug Tracking
</a>
<a href="https://cwiki.apache.org/confluence/display/INCUBATOR/PegasusProposal"
class="navbar-item ">
Apache Proposal
</a>
</div>
</div>
<a href="/zh/blogs"
class="navbar-item ">
<span class="icon" style="margin-right: .25em">
<i class="fas fa-rss"></i>
</span>
<span>Blog</span>
</a>
<a href="/zh/docs/downloads"
class="navbar-item ">
<span class="icon" style="margin-right: .25em">
<i class="fas fa-fire"></i>
</span>
<span>Releases</span>
</a>
</div>
<div class="navbar-item is-hidden-mobile">
<!--A simple language switch button that only supports zh and en.-->
<!--IF its language is zh, then switches to en.-->
<!--If you don't want a url to be relativized, you can add a space explicitly into the href to
prevents a url from being relativized by polyglot.-->
<a class="button is-primary is-outlined is-inverted" href=" /overview/data-model/"><strong>En</strong></a>
</div>
</div>
</div>
</nav>
<section class="section">
<div class="container">
<div class="columns is-multiline">
<div class="column is-one-fourth">
<aside class="menu">
<p class="menu-label"></p>
<ul class="menu-list">
<li>
<a href="/zh/overview"
class="">
Overview
</a>
</li>
<li>
<a href="/zh/overview/background"
class="">
Background
</a>
</li>
<li>
<a href="/zh/overview/architecture"
class="">
Architecture
</a>
</li>
<li>
<a href="/zh/overview/data-model"
class="">
Data Model
</a>
</li>
<li>
<a href="/zh/overview/benchmark"
class="">
Benchmark
</a>
</li>
<li>
<a href="/zh/docs/build/compile-by-docker"
class="">
Build and Install
</a>
</li>
<li>
<a href="/zh/overview/onebox"
class="">
Try the Onebox Cluster
</a>
</li>
</ul>
</aside>
</div>
<div class="column is-half">
<div class="content">
<h1 id="数据模型">Data Model</h1>
<h2 id="介绍">Introduction</h2>
<p>Pegasus uses a simple Key-Value data model and does not support complex schemas. To make the model more expressive, the key is split into a <strong>HashKey</strong> and a <strong>SortKey</strong>, forming a composite key (<code class="language-plaintext highlighter-rouge">[HashKey, SortKey] -&gt; Value</code>). This is similar to the <a href="http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.CoreComponents.html#HowItWorks.CoreComponents.PrimaryKey">composite primary key</a> (partition key and sort key) in <a href="https://aws.amazon.com/dynamodb/">DynamoDB</a>.</p>
<h3 id="hashkey">HashKey</h3>
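<p>A byte string. Like the partition key in DynamoDB, the HashKey determines which partition a record belongs to. Pegasus applies a specific hash function to the HashKey and takes the resulting hash value modulo the number of partitions to obtain the record's <strong>Partition ID</strong>. As a result, records with the same HashKey are always stored in the same partition.</p>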
<blockquote>
<p>Note:
On the C++ client side, the HashKey length is limited to 64KB.
On the Java client side, if <a href="https://github.com/apache/incubator-pegasus/blob/v2.5.0/java-client/src/main/java/org/apache/pegasus/client/ClientOptions.java#L360C12-L360C12">WriteLimiter</a> is enabled, the limit is 1KB.
On the server side, starting from Pegasus 2.0.0, if <code class="language-plaintext highlighter-rouge">[replication]max_allowed_write_size</code> is set to a non-zero value, the size of the entire request packet is limited to that value. The default is 1MB.</p>
</blockquote>
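<p>The hash-then-modulo mapping described above can be sketched in a few lines. This is a minimal illustration only: <code class="language-plaintext highlighter-rouge">partition_id</code> is a hypothetical helper, and <code class="language-plaintext highlighter-rouge">zlib.crc32</code> is an assumed stand-in for the hash function Pegasus actually uses, so the resulting IDs will not match a real cluster.</p>

```python
import zlib


def partition_id(hash_key: bytes, partition_count: int) -> int:
    """Map a HashKey to a partition index: hash the key, then take the modulo."""
    if not partition_count >= 1:
        raise ValueError("partition_count must be a positive integer")
    # ASSUMPTION: zlib.crc32 is only a stand-in; Pegasus uses its own hash function.
    return zlib.crc32(hash_key) % partition_count


if __name__ == "__main__":
    # The same HashKey always maps to the same partition.
    for key in (b"user_1", b"user_2", b"user_1"):
        print(key, "goes to partition", partition_id(key, 8))
```

<p>Because the partition is derived from the HashKey alone, the SortKey never influences placement, which is what keeps all records of one HashKey together.</p>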
<h3 id="sortkey">SortKey</h3>
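<p>A byte string. Like the sort key in DynamoDB, the SortKey determines the order of records within a partition. Internally, when a record is stored in RocksDB, the HashKey and SortKey are concatenated to form the RocksDB key.</p>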
<blockquote>
<p>Note: On the C++ client side, the SortKey length is unlimited. On the Java client side, if <a href="https://github.com/apache/incubator-pegasus/blob/v2.5.0/java-client/src/main/java/org/apache/pegasus/client/ClientOptions.java#L360C12-L360C12">WriteLimiter</a> is enabled, the limit is 1KB.
On the server side, starting from Pegasus 2.0.0, if <code class="language-plaintext highlighter-rouge">[replication]max_allowed_write_size</code> is set to a non-zero value, the size of the entire request packet is limited to that value. The default is 1MB.</p>
</blockquote>
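<p>To illustrate how a concatenated key keeps records of one HashKey adjacent and ordered by SortKey, here is a small sketch. The layout (a 2-byte big-endian HashKey length, followed by the HashKey and then the SortKey) is an assumption made for this example, and <code class="language-plaintext highlighter-rouge">encode_key</code> and <code class="language-plaintext highlighter-rouge">decode_key</code> are hypothetical helpers, not part of any Pegasus client.</p>

```python
import struct
from typing import Tuple


def encode_key(hash_key: bytes, sort_key: bytes) -> bytes:
    """Concatenate HashKey and SortKey into one storage key.

    ASSUMED layout: 2-byte big-endian HashKey length + HashKey + SortKey.
    The length prefix keeps (b"ab", b"c") distinct from (b"a", b"bc").
    """
    return struct.pack(">H", len(hash_key)) + hash_key + sort_key


def decode_key(raw: bytes) -> Tuple[bytes, bytes]:
    """Split a storage key back into (HashKey, SortKey)."""
    (hash_len,) = struct.unpack(">H", raw[:2])
    return raw[2:2 + hash_len], raw[2 + hash_len:]


if __name__ == "__main__":
    keys = [
        encode_key(b"row1", b"b"),
        encode_key(b"row2", b"a"),
        encode_key(b"row1", b"a"),
    ]
    # A byte-wise sort, as an ordered key-value store would do, groups the
    # records by HashKey and orders them by SortKey inside each group.
    for raw in sorted(keys):
        print(decode_key(raw))
```

<p>With this layout, all keys that share a HashKey also share a common prefix, so a byte-ordered store naturally returns them as one contiguous, SortKey-ordered range.</p>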
<h3 id="value">Value</h3>
<p>A byte string.</p>
<blockquote>
<p>Note: On the C++ client side, the Value length is unlimited. On the Java client side, if <a href="https://github.com/apache/incubator-pegasus/blob/v2.5.0/java-client/src/main/java/org/apache/pegasus/client/ClientOptions.java#L360C12-L360C12">WriteLimiter</a> is enabled, the limit is 400KB.
On the server side, starting from Pegasus 2.0.0, if <code class="language-plaintext highlighter-rouge">[replication]max_allowed_write_size</code> is set to a non-zero value, the size of the entire request packet is limited to that value. The default is 1MB.</p>
</blockquote>
<p><img src="/assets/images/pegasus-data-model.png" alt="pegasus-data-model" class="img-responsive docs-image" /></p>
<h2 id="pegasus-vs-hbase">Pegasus vs. HBase</h2>
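<p>Although its semantics are not as rich as HBase's table model, Pegasus can still satisfy most business requirements thanks to its HashKey + SortKey composite key design.</p>
<p>For example, users can treat the HashKey as a row key and the SortKey as an attribute name or column name. All records that share a HashKey can then be viewed as one row, which expresses the same concept as a row in HBase. With this in mind, Pegasus provides not only the <code class="language-plaintext highlighter-rouge">get</code>/<code class="language-plaintext highlighter-rouge">set</code>/<code class="language-plaintext highlighter-rouge">del</code> interfaces for accessing a single record, but also the <code class="language-plaintext highlighter-rouge">multi_get</code>/<code class="language-plaintext highlighter-rouge">multi_set</code>/<code class="language-plaintext highlighter-rouge">multi_del</code> interfaces for accessing records under the same HashKey. All of these interfaces are atomic single-row operations, which makes them simpler to use.</p>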
<p><img src="/assets/images/pegasus-data-model-sample.png" alt="pegasus-data-model" class="img-responsive docs-image" /></p>
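<p>The row-style mapping can be sketched with a small in-memory model. <code class="language-plaintext highlighter-rouge">ToyTable</code> is a hypothetical stand-in written for this example: its method names mirror the interfaces mentioned above, but the signatures are assumptions and do not reproduce the real client API, and the sketch does not model atomicity or replication.</p>

```python
from typing import Dict, Iterable, Optional


class ToyTable:
    """In-memory stand-in for a table: HashKey -> {SortKey -> Value}."""

    def __init__(self) -> None:
        self._rows: Dict[bytes, Dict[bytes, bytes]] = {}

    def multi_set(self, hash_key: bytes, kvs: Dict[bytes, bytes]) -> None:
        """Write several SortKey/Value pairs under one HashKey (one 'row')."""
        self._rows.setdefault(hash_key, {}).update(kvs)

    def multi_get(
        self, hash_key: bytes, sort_keys: Optional[Iterable[bytes]] = None
    ) -> Dict[bytes, bytes]:
        """Read the whole row, or only the requested SortKeys, in SortKey order."""
        row = self._rows.get(hash_key, {})
        if sort_keys is None:
            return dict(sorted(row.items()))
        return {k: row[k] for k in sorted(sort_keys) if k in row}

    def multi_del(self, hash_key: bytes, sort_keys: Iterable[bytes]) -> int:
        """Delete the given SortKeys from one row; return how many existed."""
        row = self._rows.get(hash_key, {})
        return sum(1 for k in list(sort_keys) if row.pop(k, None) is not None)


if __name__ == "__main__":
    table = ToyTable()
    # HashKey plays the role of the row key, SortKeys act as column names.
    table.multi_set(b"user_1", {b"name": b"Ann", b"age": b"30"})
    print(table.multi_get(b"user_1"))
    print(table.multi_get(b"user_1", [b"name"]))
```

<p>Each call touches exactly one HashKey, which mirrors why such operations can be handled within a single partition.</p>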
<h2 id="pegasus-vs-redis">Pegasus vs. Redis</h2>
<p>Although Pegasus does not support rich data structures such as <code class="language-plaintext highlighter-rouge">List</code>/<code class="language-plaintext highlighter-rouge">Set</code>/<code class="language-plaintext highlighter-rouge">Hash</code> the way Redis does, users can still implement similar semantics with Pegasus.</p>
<p>For example, users can treat the HashKey as the Redis <code class="language-plaintext highlighter-rouge">key</code> and the SortKey as the <code class="language-plaintext highlighter-rouge">field</code> of a Hash (or the <code class="language-plaintext highlighter-rouge">member</code> of a Set), thereby implementing a Redis Hash (or Set).</p>
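<p>As a rough sketch of this mapping, the example below layers Hash-like and Set-like helpers on a tiny in-memory store keyed by (HashKey, SortKey). Everything here is hypothetical: <code class="language-plaintext highlighter-rouge">ToyStore</code> and the helper functions only borrow Redis command names for readability, and storing an empty value for Set members is an assumption of this example.</p>

```python
from typing import Dict, List, Optional, Tuple


class ToyStore:
    """In-memory stand-in for a [HashKey, SortKey] -> Value store."""

    def __init__(self) -> None:
        self._data: Dict[Tuple[bytes, bytes], bytes] = {}

    def set(self, hash_key: bytes, sort_key: bytes, value: bytes) -> None:
        self._data[(hash_key, sort_key)] = value

    def get(self, hash_key: bytes, sort_key: bytes) -> Optional[bytes]:
        return self._data.get((hash_key, sort_key))

    def sort_keys(self, hash_key: bytes) -> List[bytes]:
        """All SortKeys stored under one HashKey, in sorted order."""
        return sorted(sk for (hk, sk) in self._data if hk == hash_key)


# Redis Hash: HashKey = key, SortKey = field, Value = field value.
def hset(store: ToyStore, key: bytes, field: bytes, value: bytes) -> None:
    store.set(key, field, value)


def hget(store: ToyStore, key: bytes, field: bytes) -> Optional[bytes]:
    return store.get(key, field)


# Redis Set: HashKey = key, SortKey = member, Value = empty placeholder.
def sadd(store: ToyStore, key: bytes, member: bytes) -> None:
    store.set(key, member, b"")


def sismember(store: ToyStore, key: bytes, member: bytes) -> bool:
    return store.get(key, member) is not None


def smembers(store: ToyStore, key: bytes) -> List[bytes]:
    return store.sort_keys(key)


if __name__ == "__main__":
    s = ToyStore()
    hset(s, b"profile:1", b"name", b"Ann")
    sadd(s, b"tags:1", b"blue")
    sadd(s, b"tags:1", b"amber")
    print(hget(s, b"profile:1", b"name"), smembers(s, b"tags:1"))
```

<p>Unlike a Redis Set, the members come back in sorted order here, because the SortKey ordering is part of the underlying model.</p>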
</div>
</div>
<div class="column is-one-fourth is-hidden-mobile" style="padding-left: 3rem">
<p class="menu-label">
<span class="icon">
<i class="fa fa-bars" aria-hidden="true"></i>
</span>
On This Page
</p>
<ul class="menu-list">
<li><a href="#数据模型">Data Model</a>
<ul>
<li><a href="#介绍">Introduction</a>
<ul>
<li><a href="#hashkey">HashKey</a></li>
<li><a href="#sortkey">SortKey</a></li>
<li><a href="#value">Value</a></li>
</ul>
</li>
<li><a href="#pegasus-vs-hbase">Pegasus vs. HBase</a></li>
<li><a href="#pegasus-vs-redis">Pegasus vs. Redis</a></li>
</ul>
</li>
</ul>
</div>
</div>
</div>
</section>
<footer class="footer">
<div class="container">
<div class="content is-small has-text-centered">
<div style="margin-bottom: 20px;">
<a href="http://incubator.apache.org">
<img src="/assets/images/egg-logo.png"
width="15%"
alt="Apache Incubator"/>
</a>
</div>
Copyright &copy; 2023 <a href="http://www.apache.org">The Apache Software Foundation</a>.
Licensed under the <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version
2.0</a>.
<br><br>
Apache Pegasus is an effort undergoing incubation at The Apache Software Foundation (ASF),
sponsored by the Apache Incubator. Incubation is required of all newly accepted projects
until a further review indicates that the infrastructure, communications, and decision making process
have stabilized in a manner consistent with other successful ASF projects. While incubation status is
not necessarily a reflection of the completeness or stability of the code, it does indicate that the
project has yet to be fully endorsed by the ASF.
<br><br>
Apache Pegasus, Pegasus, Apache, the Apache feather logo, and the Apache Pegasus project logo are either
registered trademarks or trademarks of The Apache Software Foundation in the United States and other
countries.
</div>
</div>
</footer>
<script src="/assets/js/app.js" type="text/javascript"></script>
</body>
</html>