blob: 53698a2cc59e62c4b902227122b13aed18ebf848 [file] [log] [blame]
<!DOCTYPE html>
<!--[if lt IE 7]> <html class="no-js lt-ie9 lt-ie8 lt-ie7"> <![endif]-->
<!--[if IE 7]> <html class="no-js lt-ie9 lt-ie8"> <![endif]-->
<!--[if IE 8]> <html class="no-js lt-ie9"> <![endif]-->
<!--[if gt IE 8]><!--> <html class="no-js"> <!--<![endif]-->
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<meta name="viewport" content="width=device-width,initial-scale=1,maximum-scale=1"/>
<title>Gearpump Basic Concepts - GearPump 0.6.2 Documentation</title>
<meta name="description" content="Gearpump Basic Concepts">
<link rel="stylesheet" href="css/bootstrap-3.3.5.min.css">
<style>
body {
padding-top: 60px;
padding-bottom: 40px;
}
</style>
<link rel="stylesheet" href="css/main.css">
<link rel="stylesheet" href="css/pygments-default.css">
<script src="js/vendor/modernizr-2.6.1-respond-1.1.0.min.js"></script>
</head>
<body>
<!--[if lt IE 7]>
<p class="chromeframe">You are using an outdated browser. <a href="http://browsehappy.com/">Upgrade your browser today</a> or <a href="http://www.google.com/chromeframe/?redirect=true">install Google Chrome Frame</a> to better experience this site.</p>
<![endif]-->
<div class="navbar navbar-inverse navbar-fixed-top" id="topbar">
<div class="container">
<div class="navbar-header">
<a class="navbar-brand" href="/">GearPump
<span class="label label-primary" style="font-size: .6em">0.6.2</span>
</a>
</div>
<div class="collapse navbar-collapse">
<ul class="nav navbar-nav">
<li><a href="index.html">Overview</a></li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">Introduction<b class="caret"></b></a>
<ul class="dropdown-menu">
<li><a href="submit-your-1st-application.html">Submit Your 1st Application</a></li>
<li><a href="commandline.html">Client Command Line</a></li>
<li class="divider"></li>
<li><a href="basic-concepts.html">Basic Concepts</a></li>
<li><a href="features.html">Technical Highlights</a></li>
<li><a href="message-delivery.html">Reliable Message Delivery</a></li>
<li><a href="performance-report.html">Performance</a></li>
<li><a href="gearpump-internals.html">Gearpump Internals</a></li>
</ul>
</li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">Deploying<b class="caret"></b></a>
<ul class="dropdown-menu">
<li class="dropdown-header">Deployment</li>
<li><a href="deployment-docker.html">Docker</a><li>
<li><a href="deployment-local.html">Local</a><li>
<li><a href="deployment-standalone.html">Standalone</a></li>
<li><a href="deployment-yarn.html">YARN</a></li>
<li class="divider"></li>
<li><a href="deployment-ha.html">High Availability</a></li>
<li><a href="deployment-msg-delivery.html">Reliable Message Delivery</a></li>
<li><a href="deployment-configuration.html">Configuration</a></li>
<li class="divider"></li>
<li><a href="deployment-security.html">Security</a></li>
</ul>
</li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">Programming Guide<b class="caret"></b></a>
<ul class="dropdown-menu">
<li><a href="dev-write-1st-app.html">Write Your 1st App</a></li>
<li><a href="dev-custom-serializer.html">Customized Message Passing</a></li>
<li class="divider"></li>
<li><a href="api/scala/index.html">Scala API</a></li>
<li><a href="api/java/index.html">Java API</a></li>
<li><a href="dev-rest-api.html">RESTful API</a></li>
<li class="divider"></li>
<li><a href="dev-connectors.html">Gearpump Connectors</a></li>
<li class="divider"></li>
<li><a href="dev-storm.html">Storm Compatibility</a></li>
<!--
<li><a href="dev-samoa.html">Samoa Compatibility</a></li>
<li class="divider"></li>
<li><a href="dev-iot.html">Gearpump with IoT</a></li>
-->
</ul>
</li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">More<b class="caret"></b></a>
<ul class="dropdown-menu">
<li><a href="how-to-contribute.html">How to Contribute</a></li>
<li><a href="coding-style.html">Coding Style</a></li>
<li class="divider"></li>
<li><a href="faq.html">FAQ</a><li>
<li><a href="about.html">About</a></li>
</ul>
</li>
</ul>
</div>
</div>
</div>
<div class="container" id="content">
<h1 class="title">Gearpump Basic Concepts</h1>
<h3 id="system-timestamp-and-application-timestamp">System timestamp and Application timestamp</h3>
<p>System timestamp denotes the time of backend cluster system. Application timestamp denotes the time at which message is generated. For example, for IOT edge device, the timestamp at which field sensor device creates a message is type of application timestamp, while the timestamp at which that message get received by the backend is type of system time.</p>
<h3 id="master-and-worker">Master, and Worker</h3>
<p>Gearpump follow master slave architecture. Every cluster contains one or more Master node, and several worker nodes. Worker node is responsible to manage local resources on single machine, and Master node is responsible to manage global resources of the whole cluster.</p>
<p><img src="img/actor_hierarchy.png" alt="Actor Hierarchy" /></p>
<h3 id="application">Application</h3>
<p>Application is what we want to parallel and run on the cluster. There are different application types, for example MapReduce application and streaming application are different application types. Gearpump natively supports Streaming Application types, it also contains several templates to help user to create custom application types, like distributedShell.</p>
<h3 id="appmaster-and-executor">AppMaster and Executor</h3>
<p>In runtime, every application instance is represented by single AppMaster and a list of Executors. AppMaster represent the command and control center of the Application instance, it communicate with user, master, worker, and executor to get the job done. Each executor is a parallel unit for distributed application. Typically AppMaster and Executor will be started as JVM processes on worker nodes.</p>
<h3 id="application-submission-flow">Application Submission Flow</h3>
<p>When user submits an application to Master, Master will first find an available worker to start the AppMaster. After AppMaster is started, AppMaster will request Master for more resources (worker) to start executors. The Executor now is only an empty container, after the executors are started, the AppMaster will then distribute real computation tasks to the executor and run them in parallel way.</p>
<p>To submit an application, a Gearpump client specifies a computation defined within a DAG and submits this to an active master. The SubmitApplication message is sent to the Master who then forwards this to an AppManager.</p>
<p><img src="img/submit.png" alt="Submit App" />
Figure: User Submit Application</p>
<p>The AppManager locates an available worker and launches an AppMaster in a sub-process JVM of the worker. The AppMaster will then negotiate with the Master for Resource allocation in order to distribute the DAG as defined within the Application. The allocated workers will then launch Executors (new JVMs).</p>
<p><img src="img/submit2.png" alt="Launch Executors and Tasks" />
Figure: Launch Executors and Tasks</p>
<h3 id="streaming-topology-processor-and-task">Streaming Topology, Processor, and Task</h3>
<p>For streaming application type, each application contains a topology, which is a DAG (directed acyclic graph) to describe the data flow. Each node in the DAG is a processor. For example, for word count it contains two processors, Split and Sum. The Split processor splits a line to a list of words, then the Sum processor summarize the frequency of each word.
An application is a DAG of processors. Each processor handles messages.</p>
<p><img src="img/dag.png" alt="DAG" />
Figure: Processor DAG</p>
<h3 id="streaming-task-and-partitioner">Streaming Task and Partitioner</h3>
<p>For streaming application type, Task is the minimum unit of parallelism. In runtime, each Processor is paralleled to a list of tasks, with different tasks running in different executor. You can define Partitioner to denote the data shuffling rule between upstream processor tasks and downstream processor tasks.</p>
<p><img src="img/shuffle.png" alt="Data Shuffle" />
Figure: Task Data Shuffling</p>
</div> <!-- /container -->
<script src="js/vendor/jquery-2.1.4.min.js"></script>
<script src="js/vendor/bootstrap-3.3.5.min.js"></script>
<script src="js/vendor/anchor-1.1.1.min.js"></script>
<script src="js/main.js"></script>
<!-- MathJax Section -->
<script type="text/x-mathjax-config">
MathJax.Hub.Config({
TeX: { equationNumbers: { autoNumber: "AMS" } }
});
</script>
<script>
// Note that we load MathJax this way to work with local file (file://), HTTP and HTTPS.
// We could use "//cdn.mathjax...", but that won't support "file://".
(function(d, script) {
script = d.createElement('script');
script.type = 'text/javascript';
script.async = true;
script.onload = function(){
MathJax.Hub.Config({
tex2jax: {
inlineMath: [ ["$", "$"], ["\\\\(","\\\\)"] ],
displayMath: [ ["$$","$$"], ["\\[", "\\]"] ],
processEscapes: true,
skipTags: ['script', 'noscript', 'style', 'textarea', 'pre']
}
});
};
script.src = ('https:' == document.location.protocol ? 'https://' : 'http://') +
'cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML';
d.getElementsByTagName('head')[0].appendChild(script);
}(document));
</script>
</body>
</html>