<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>Pegasus | Overview</title>
<link rel="stylesheet" href="/assets/css/app.css">
<link rel="shortcut icon" href="/assets/images/favicon.ico">
<link rel="stylesheet" href="/assets/css/utilities.min.css">
<link rel="stylesheet" href="/assets/css/docsearch.v3.css">
<script src="/assets/js/jquery.min.js"></script>
<script src="/assets/js/all.min.js"></script>
<script src="/assets/js/docsearch.v3.js"></script>
<!-- Begin Jekyll SEO tag v2.8.0 -->
<meta name="generator" content="Jekyll v4.3.3" />
<meta property="og:title" content="Overview" />
<meta property="og:locale" content="en" />
<meta property="og:site_name" content="Pegasus" />
<meta property="og:type" content="article" />
<meta property="article:published_time" content="2024-04-22T13:02:52+00:00" />
<meta name="twitter:card" content="summary" />
<meta property="twitter:title" content="Overview" />
<script type="application/ld+json">
{"@context":"https://schema.org","@type":"BlogPosting","dateModified":"2024-04-22T13:02:52+00:00","datePublished":"2024-04-22T13:02:52+00:00","headline":"Overview","mainEntityOfPage":{"@type":"WebPage","@id":"/overview/"},"url":"/overview/"}</script>
<!-- End Jekyll SEO tag -->
</head>
<body>
<nav class="navbar is-primary">
<div class="container">
<!--container will be unwrapped when it's in docs-->
<div class="navbar-brand">
<a href="/" class="navbar-item ">
<!-- Pegasus Icon -->
<img src="/assets/images/pegasus.svg">
</a>
<div class="navbar-item">
<a href="/docs" class="button is-primary is-outlined is-inverted">
<span class="icon"><i class="fas fa-book"></i></span>
<span>Docs</span>
</a>
</div>
<div class="navbar-item is-hidden-desktop">
<!--A simple language switch button that only supports zh and en.-->
<!--IF its language is zh, then switches to en.-->
<a class="button is-primary is-outlined is-inverted" href="/zh/overview/index.html"><strong>中文</strong></a>
</div>
<a role="button" class="navbar-burger burger" aria-label="menu" aria-expanded="false" data-target="navMenu">
<!-- Appears in mobile mode only -->
<span aria-hidden="true"></span>
<span aria-hidden="true"></span>
<span aria-hidden="true"></span>
</a>
</div>
<div class="navbar-menu" id="navMenu">
<div class="navbar-end">
<!--dropdown-->
<div class="navbar-item has-dropdown is-hoverable ">
<a href=""
class="navbar-link ">
<span class="icon" style="margin-right: .25em">
<i class="fas fa-users"></i>
</span>
<span>
ASF
</span>
</a>
<div class="navbar-dropdown">
<a href="https://www.apache.org/"
class="navbar-item ">
Foundation
</a>
<a href="https://www.apache.org/licenses/"
class="navbar-item ">
License
</a>
<a href="https://www.apache.org/events/current-event.html"
class="navbar-item ">
Events
</a>
<a href="https://www.apache.org/foundation/sponsorship.html"
class="navbar-item ">
Sponsorship
</a>
<a href="https://www.apache.org/security/"
class="navbar-item ">
Security
</a>
<a href="https://privacy.apache.org/policies/privacy-policy-public.html"
class="navbar-item ">
Privacy
</a>
<a href="https://www.apache.org/foundation/thanks.html"
class="navbar-item ">
Thanks
</a>
</div>
</div>
<!--dropdown-->
<div class="navbar-item has-dropdown is-hoverable ">
<a href="/community"
class="navbar-link ">
<span class="icon" style="margin-right: .25em">
<i class="fas fa-user-plus"></i>
</span>
<span>
Community
</span>
</a>
<div class="navbar-dropdown">
<a href="/community/#contact-us"
class="navbar-item ">
Contact Us
</a>
<a href="/community/#contribution"
class="navbar-item ">
Contribution
</a>
<a href="https://cwiki.apache.org/confluence/display/PEGASUS/Coding+guides"
class="navbar-item ">
Coding Guides
</a>
<a href="https://github.com/apache/incubator-pegasus/issues?q=is%3Aissue+is%3Aopen+label%3Atype%2Fbug"
class="navbar-item ">
Bug Tracking
</a>
<a href="https://cwiki.apache.org/confluence/display/INCUBATOR/PegasusProposal"
class="navbar-item ">
Apache Proposal
</a>
</div>
</div>
<a href="/blogs"
class="navbar-item ">
<span class="icon" style="margin-right: .25em">
<i class="fas fa-rss"></i>
</span>
<span>Blog</span>
</a>
<a href="/docs/downloads"
class="navbar-item ">
<span class="icon" style="margin-right: .25em">
<i class="fas fa-fire"></i>
</span>
<span>Releases</span>
</a>
</div>
<div class="navbar-item is-hidden-mobile">
<!--A simple language switch button that only supports zh and en.-->
<!--IF its language is zh, then switches to en.-->
<a class="button is-primary is-outlined is-inverted" href="/zh/overview/index.html"><strong>中文</strong></a>
</div>
</div>
</div>
</nav>
<section class="section">
<div class="container">
<div class="columns is-multiline">
<div class="column is-one-fourth">
<aside class="menu">
<p class="menu-label"></p>
<ul class="menu-list">
<li>
<a href="/overview"
class="">
Overview
</a>
</li>
<li>
<a href="/overview/background"
class="">
Background
</a>
</li>
<li>
<a href="/overview/architecture"
class="">
Architecture
</a>
</li>
<li>
<a href="/overview/data-model"
class="">
Data Model
</a>
</li>
<li>
<a href="/overview/benchmark"
class="">
Benchmark
</a>
</li>
<li>
<a href="/docs/build/compile-by-docker"
class="">
Installation
</a>
</li>
<li>
<a href="/overview/onebox"
class="">
Onebox
</a>
</li>
</ul>
</aside>
</div>
<div class="column is-half">
<div class="content">
<h1 id="Overview">Overview</h1>
<p>Apache Pegasus is a distributed key-value storage system which is designed to be:</p>
<ul>
<li><strong>horizontally scalable</strong>: distributed using hash-based partitioning</li>
<li><strong>strongly consistent</strong>: ensured by the <a href="https://www.microsoft.com/en-us/research/publication/pacifica-replication-in-log-based-distributed-storage-systems/">PacificA</a> consensus protocol</li>
<li><strong>high-performance</strong>: using <a href="https://rocksdb.org/">RocksDB</a> as the underlying storage engine</li>
<li><strong>simple</strong>: well-defined, easy-to-use APIs</li>
</ul>
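<p>As a rough illustration of hash-based partitioning, the sketch below maps keys to partition indexes. It is not Pegasus's actual implementation: <code>zlib.crc32</code> stands in for the real hash function, and <code>PARTITION_COUNT</code>, <code>partition_index</code>, and <code>group_by_partition</code> are hypothetical names chosen for this example.</p>
<pre><code class="language-python">import zlib

# Hypothetical illustration only: zlib.crc32 stands in for whatever hash
# function Pegasus actually uses, and PARTITION_COUNT is an arbitrary value.
PARTITION_COUNT = 8


def partition_index(key, partition_count=PARTITION_COUNT):
    """Return the index of the partition that owns the given key (bytes)."""
    return zlib.crc32(key) % partition_count


def group_by_partition(keys, partition_count=PARTITION_COUNT):
    """Group keys by the partition index they hash to."""
    groups = {}
    for key in keys:
        groups.setdefault(partition_index(key, partition_count), []).append(key)
    return groups


if __name__ == "__main__":
    keys = [b"user:1", b"user:2", b"order:17", b"order:18"]
    for index, owned in sorted(group_by_partition(keys).items()):
        print(index, owned)
</code></pre>
<p>In this sketch the index depends only on the key and the partition count, so any client can work out which partition owns a key without a lookup. The trade-off of a plain modulo scheme is that changing the partition count remaps many keys.</p>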
<h2 id="background">Background</h2>
<p>Pegasus aims to fill the gap between Redis and <a href="https://hbase.apache.org/">HBase</a>. The former
is in-memory and offers low latency, but does not provide a strong-consistency guarantee.
Unlike the latter, Pegasus is written entirely in C++, and its write path
relies only on the local filesystem.</p>
<p>Beyond the performance requirements, we also needed a storage system
that ensures data safety at multiple levels and supports fast data migration
between data centers, automatic load balancing, and online partition splitting.</p>
<h2 id="features">Features</h2>
<ul>
<li>
<p><strong>Persistence of data</strong>: Each write is replicated three ways, to different ReplicaServers, before Pegasus responds to the client. Using the PacificA protocol, Pegasus supports strongly consistent replication and membership changes.</p>
</li>
<li>
<p><strong>Automatic load balancing over ReplicaServers</strong>: Load balancing is a built-in function of the MetaServer, which manages the distribution of replicas. When the cluster is imbalanced, the administrator can invoke a simple rebalance command that automatically schedules replica migration.</p>
</li>
<li>
<p><strong>Cold Backup</strong>: Pegasus supports an extensible backup and restore mechanism to ensure data safety. Snapshots can be stored on a distributed filesystem such as HDFS or on the local filesystem. Stored snapshots can also be used for analysis with <a href="https://github.com/pegasus-kv/pegasus-spark">pegasus-spark</a>.</p>
</li>
<li>
<p><strong>Eventually-consistent inter-datacenter replication</strong>: This feature is called <em>duplication</em>. It makes a change written to the local cluster accessible from the remote cluster after a short delay. It helps achieve higher availability for your service and better performance, because clients only need to access their local cluster.</p>
</li>
</ul>
<h2 id="presentations">Presentations</h2>
<p>(This list is incomplete. If you have given a Pegasus-related talk that is not listed here, please feel free to submit a <a href="https://github.com/apache/incubator-pegasus-website/pulls">PR</a>.)</p>
<ul>
<li>2023, Chengdu China, COSCon 2023, <em>How does Apache Pegasus used in SensorsData</em>, Guohao Li (<a href="https://kaiyuanshe.cn/activity/recVnSz8ru/agenda/recAg8mw7f">Intro</a>, <a href="https://www.slideshare.net/acelyc1112009/how-does-apache-pegasusused-in-sensorsdata">Slides</a>)</li>
<li>2023, Beijing China, DataFunSummit 2023, <em>The Implementation and Future Planning of Apache Pegasus Application</em>, Yuchen He</li>
<li>2022, Beijing China, DataFunSummit 2022, <em>The Design, Implementation, and Open Source Way of Pegasus</em>, Yuchen He (<a href="https://mp.weixin.qq.com/s/rLiwNdl2baCw6m1FoQT4jw">Intro</a>)</li>
<li>2022, Beijing China, Pegasus meetup, <em>How does the Apache Pegasus used in Advertising Data Stream in SensorsData</em>, Jiaoming Shi (<a href="https://www.slideshare.net/acelyc1112009/how-does-the-apache-pegasus-used-in-advertising-data-stream-in-sensorsdata">Slides</a>, <a href="https://www.bilibili.com/video/BV1q84y1h7xG/">video</a>)</li>
<li>2022, Beijing China, Pegasus meetup, <em>How to continuously improve Apache Pegasus in complex toB scenarios</em>, Hao Wang (<a href="https://www.slideshare.net/acelyc1112009/how-to-continuously-improve-apache-pegasus-in-complex-tob-scenarios">Slides</a>, <a href="https://www.bilibili.com/video/BV1M14y1g7yy/">video</a>)</li>
<li>2022, Beijing China, Pegasus meetup, <em>The Construction and Practice of Apache Pegasus in Offline and Online Scenarios Integration</em>, Wei Wang (<a href="https://www.slideshare.net/acelyc1112009/the-construction-and-practice-of-apache-pegasus-in-offline-and-online-scenarios-integration">Slides</a>, <a href="https://www.bilibili.com/video/BV1Ux4y137ib/">video</a>)</li>
<li>2022, Beijing China, Pegasus meetup, <em>How does Apache Pegasus used in Xiaomi’s Universal Recommendation Algorithm Framework</em>, Wei Liang (<a href="https://www.slideshare.net/acelyc1112009/how-does-apache-pegasus-used-in-xiaomis-universal-recommendation-algorithm-framework">Slides</a>, <a href="https://www.bilibili.com/video/BV16M411b7Pc/">video</a>)</li>
<li>2022, Beijing China, Pegasus meetup, <em>The Introduction of the Apache Pegasus 2.4.0 release</em>, Shuo Jia (<a href="https://www.slideshare.net/acelyc1112009/the-introduction-of-apache-pegasus-240">Slides</a>, <a href="https://www.bilibili.com/video/BV1C8411N7hp/">video</a>)</li>
<li>2022, Online, ApacheCon Asia 2022, <em>How does Apache Pegasus (incubating) community develop at SensorsData</em>, Dan Wang, Yingchun Lai (<a href="https://www.slideshare.net/acelyc1112009/how-does-apache-pegasus-incubating-community-develop-at-sensorsdata">Slides</a>, <a href="https://www.bilibili.com/video/BV18v4y1U7RG/">video</a>)</li>
<li>2021, Beijing China, System Software Tech Day, <em>Apache Pegasus: A high performance, strong consistent distributed key-value storage system</em>, Yuchen He (<a href="https://www.modb.pro/db/168862">Intro</a>, <a href="https://www.bilibili.com/video/BV1SP4y1p7cW/">video</a>)</li>
<li>2021, Beijing China, Pegasus meetup, <em>The Design, Implementation and Open Source Way of Apache Pegasus</em>, Yuchen He (<a href="https://www.slideshare.net/acelyc1112009/the-design-implementation-and-open-source-way-of-apache-pegasus">Slides</a>, <a href="https://www.bilibili.com/video/BV1YL411s7dP/">video</a>)</li>
<li>2021, Beijing China, Pegasus meetup, <em>Apache Pegasus’s Practice in Data Access Business of Xiaomi</em>, Fateng Xiao (<a href="https://www.slideshare.net/acelyc1112009/apache-pegasuss-practice-in-data-access-business-of-xiaomi">Slides</a>, <a href="https://www.bilibili.com/video/BV1K44y1t76C/">video</a>)</li>
<li>2021, Beijing China, Pegasus meetup, <em>The Advertising Algorithm Architecture in Xiaomi and How does Pegasus Practice in Feature Caching</em>, Gang Hao (<a href="https://www.slideshare.net/acelyc1112009/the-advertising-algorithm-architecture-in-xiaomi-and-how-does-pegasus-practice-in-feature-caching">Slides</a>, <a href="https://www.bilibili.com/video/BV1JR4y1n77B/">video</a>)</li>
<li>2021, Beijing China, Pegasus meetup, <em>How do we manage more than one thousand of Pegasus clusters - engine part</em>, Guohao Li (<a href="https://www.slideshare.net/acelyc1112009/how-do-we-manage-more-than-one-thousand-of-pegasus-clusters-engine-part">Slides</a>, <a href="https://www.bilibili.com/video/BV1y44y147U6/">video</a>)</li>
<li>2021, Beijing China, Pegasus meetup, <em>How do we manage more than one thousand of Pegasus clusters - backend part</em>, Dan Wang (<a href="https://www.slideshare.net/acelyc1112009/how-do-we-manage-more-than-one-thousand-of-pegasus-clusters-backend-part">Slides</a>, <a href="https://www.bilibili.com/video/BV1Lv411G7aW/">video</a>)</li>
<li>2021, Online, ApacheCon Asia 2021, <em>Apache Pegasus (incubating): A distributed key-value storage system</em>, Yuchen He, Shuo Jia (<a href="https://www.slideshare.net/acelyc1112009/apache-pegasus-incubating-a-distributed-keyvalue-storage-system">Slides</a>, <a href="https://www.bilibili.com/video/BV1b3411z7rR/">video</a>)</li>
<li>2020, Beijing China, MIDC 2020, <em>Pegasus: Make an open source Key-Value storage system</em>, Tao Wu (<a href="https://zhuanlan.zhihu.com/p/281519769">Intro</a>)</li>
<li>2018, Beijing China, MIDC 2018, <em>Pegasus: A distributed Key-Value storage system</em>, Zuoyan Qin</li>
<li>2018, Beijing China, <em>Pegasus In Depth</em>, Zuoyan Qin (<a href="https://www.slideshare.net/ssuser0a3cdd/pegasus-in-depth">Slides</a>)</li>
<li>2018, Beijing China, <em>Pegasus KV Storage, Let the Users focus on their work</em>, Zuoyan Qin (<a href="https://www.slideshare.net/ssuser0a3cdd/pegasus-kv-storage-let-the-users-focus-on-their-work-201807">Slides</a>)</li>
<li>2017, Shenzhen China, ArchSummit, <em>Behind Pegasus, What matters in a Distributed System</em>, Weijie Sun (<a href="https://sz2017.archsummit.com/presentation/969">Intro</a>, <a href="https://www.slideshare.net/ssuser0a3cdd/behind-pegasus-what-matters-in-a-distributed-system-arch-summit-shenzhen2017">Slides</a>)</li>
<li>2016, Beijing China, ArchSummit, <em>Pegasus: Designing a Distributed Key Value System</em>, Zuoyan Qin (<a href="http://bj2016.archsummit.com/presentation/3023">Intro</a>, <a href="https://www.slideshare.net/ssuser0a3cdd/pegasus-designing-a-distributed-key-value-system-arch-summit-beijing2016">Slides</a>)</li>
</ul>
</div>
</div>
<div class="column is-one-fourth is-hidden-mobile" style="padding-left: 3rem">
<p class="menu-label">
<span class="icon">
<i class="fa fa-bars" aria-hidden="true"></i>
</span>
Table of contents
</p>
<ul class="menu-list">
<li><a href="#Overview">Overview</a>
<ul>
<li><a href="#background">Background</a></li>
<li><a href="#features">Features</a></li>
<li><a href="#presentations">Presentations</a></li>
</ul>
</li>
</ul>
</div>
</div>
</div>
</section>
<footer class="footer">
<div class="container">
<div class="content is-small has-text-centered">
<div style="margin-bottom: 20px;">
<a href="http://incubator.apache.org">
<img src="/assets/images/egg-logo.png"
width="15%"
alt="Apache Incubator"/>
</a>
</div>
Copyright &copy; 2023 <a href="http://www.apache.org">The Apache Software Foundation</a>.
Licensed under the <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version
2.0</a>.
<br><br>
Apache Pegasus is an effort undergoing incubation at The Apache Software Foundation (ASF),
sponsored by the Apache Incubator. Incubation is required of all newly accepted projects
until a further review indicates that the infrastructure, communications, and decision making process
have stabilized in a manner consistent with other successful ASF projects. While incubation status is
not necessarily a reflection of the completeness or stability of the code, it does indicate that the
project has yet to be fully endorsed by the ASF.
<br><br>
Apache Pegasus, Pegasus, Apache, the Apache feather logo, and the Apache Pegasus project logo are either
registered trademarks or trademarks of The Apache Software Foundation in the United States and other
countries.
</div>
</div>
</footer>
<script src="/assets/js/app.js" type="text/javascript"></script>
</body>
</html>