<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="description" content="">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>RHEEM: Enabling Cross-Platform Data Processing</title>
<meta name="generator" content="Hugo 0.46" />
<link rel="stylesheet" href="https://rheem-ecosystem.github.io/css/owl.carousel.css" />
<link rel="stylesheet" href="https://rheem-ecosystem.github.io/css/bootstrap.min.css" />
<link rel="stylesheet" href="https://rheem-ecosystem.github.io/css/font-awesome.min.css" />
<link rel="stylesheet" href="https://rheem-ecosystem.github.io/css/airspace-local-fonts.css" />
<link rel="stylesheet" href="https://rheem-ecosystem.github.io/css/airspace.css" />
<link rel="stylesheet" href="https://rheem-ecosystem.github.io/css/style.css" />
<link rel="stylesheet" href="https://rheem-ecosystem.github.io/css/ionicons.min.css" />
<link rel="stylesheet" href="https://rheem-ecosystem.github.io/css/animate.css" />
<link rel="stylesheet" href="https://rheem-ecosystem.github.io/css/responsive.css" />
<link rel="stylesheet" href="https://rheem-ecosystem.github.io/css/syntax.css" />
<link rel="stylesheet" href="https://use.fontawesome.com/releases/v5.2.0/css/all.css" integrity="sha384-hWVjflwFxL6sNzntih27bfxkr27PmbbK/iSvJ+a4+0owXq79v+lsFkW54bOGbiDQ" crossorigin="anonymous">
<script src="//ajax.googleapis.com/ajax/libs/jquery/1.10.2/jquery.min.js"></script>
<script>window.jQuery || document.write('<script src="js/vendor/jquery-1.10.2.min.js"><\/script>')</script>
<script src="https://rheem-ecosystem.github.io/js/bootstrap.min.js"></script>
<script src="https://rheem-ecosystem.github.io/js/owl.carousel.min.js"></script>
<script src="https://rheem-ecosystem.github.io/js/plugins.js"></script>
<script src="https://rheem-ecosystem.github.io/js/min/waypoints.min.js"></script>
<script src="https://rheem-ecosystem.github.io/js/jquery.counterup.js"></script>
<script src="https://rheem-ecosystem.github.io/js/main.js"></script>
<script>
var doNotTrack = false;
if (!doNotTrack) {
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');
ga('create', 'UA-124750019-1', 'auto');
ga('send', 'pageview');
}
</script>
<script>
var doNotTrack = false;
if (!doNotTrack) {
window.ga=window.ga||function(){(ga.q=ga.q||[]).push(arguments)};ga.l=+new Date;
ga('create', 'UA-124750019-1', 'auto');
ga('send', 'pageview');
}
</script>
<script async src='https://www.google-analytics.com/analytics.js'></script>
</head>
<body>
<header>
<a href="https://github.com/rheem-ecosystem/rheem"><img style="position: absolute; top: 0; left: 0; border: 0; z-index: 12; " src="https://camo.githubusercontent.com/121cd7cbdc3e4855075ea8b558508b91ac463ac2/68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f6769746875622f726962626f6e732f666f726b6d655f6c6566745f677265656e5f3030373230302e706e67" alt="Fork me on GitHub" data-canonical-src="https://s3.amazonaws.com/github/ribbons/forkme_left_green_007200.png"></a>
<div class="container-fluid">
<div class="row">
<div class="col-md-12">
<nav class="navbar navbar-default">
<div class="container-fluid">
<div class="navbar-header">
<button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#bs-example-navbar-collapse-1">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<a class="navbar-brand" href="https://rheem-ecosystem.github.io/">
<img src="https://rheem-ecosystem.github.io/img/logo.png" alt="Rheem X-Platform Logo" class="logo-rheem">
</a>
</div>
<div class="collapse navbar-collapse move" id="bs-example-navbar-collapse-1">
<ul class="nav navbar-nav navbar-right">
<li><a href="https://rheem-ecosystem.github.io/">Home</a></li>
<li><a href="/about/">About</a></li>
<li><a href="/documentation/">Documentation</a></li>
<li><a href="/publication/">Publications</a></li>
<li><a href="/people/">People</a></li>
<li><a href="/download/">Download</a></li>
<li><a href="/contact/">Contact</a></li>
</ul>
</div>
</div>
</nav>
</div>
</div>
</div>
</header>
<div class="post">
<section class="section" style="border: 1px dotted #ddd;">
<div class="container">
<div class="row">
<div>
<div class="block">
<h1>RHEEM: Enabling Cross-Platform Data Processing</h1>
<div class="post-info-wrapper">
<p class="italic">By <span class="bold">Divy Agrawal, Sanjay Chawla, Zoi Kaoudi, Sebastian Kruse, Jorge-Arnulfo Quiané-Ruiz, Bertty Contreras-Rojas, Ahmed Elmagarmid, Yasser Idris, Ji Lucas, Essam Mansour, Mourad Ouzzani, Paolo Papotti, Nan Tang, Saravanan Thirumuruganathan and Anis Troudi</span> in <span class="bold">2018</span></p>
</div>
<hr />
<p>Solving business problems increasingly requires going beyond the limits of a single data processing platform (platform for short), such as Hadoop or a DBMS. As a result, organizations typically perform tedious and costly tasks to juggle their code and data across different platforms. Addressing this pain and achieving automatic cross-platform data processing is challenging: finding the most efficient platform for a given task requires deep expertise in all the available platforms. We present Rheem, a general-purpose cross-platform data processing system that decouples applications from the underlying platforms. It not only determines the best platform to run an incoming task, but also splits the task into subtasks and assigns each subtask to a specific platform to minimize the overall cost (e.g., runtime or monetary cost). It features (i) an interface to easily compose data analytic tasks; (ii) a novel cost-based optimizer able to find the most efficient platform in almost all cases; and (iii) an executor to efficiently orchestrate tasks over different platforms. As a result, it allows users to focus on the business logic of their applications rather than on the mechanics of how to compose and execute them. Using different real-world applications with Rheem, we demonstrate how cross-platform data processing can accelerate performance by more than one order of magnitude compared to single-platform data processing.</p>
<hr />
<p class="center-text" style="width: 100%">
<a href="https://rheem-ecosystem.github.io/pdf/paper/rheem.pdf" class="btn btn-main"><i class="fa fa-file-pdf-o"></i> Download</a>
<br>
</p>
</div>
</div>
</div>
</div>
</section>
</div>
<p class="center-text" style="padding: 30px;">
<a class="btn btn-main" href="https://rheem-ecosystem.github.io/publication/">Back to Publications</a>
</p>
<section id="call-to-action">
<div class="container-fluid">
<div class="row">
<div class="col-md-12">
<div class="block">
<h2>Turning a Zoo into a Circus</h2>
<p><a class="nothing" href="https://rheem-ecosystem.github.io/about">Read more</a> on how <b>RHEEM</b> tames the zoo of existing data processing platforms so that they work together.</p>
</div>
</div>
</div>
</div>
</section>
<footer>
<div class="container-fluid">
<div class="row">
<div class="col-md-12">
<div class="footer-manu">
<ul>
<li><a href="https://www.qcri.org/">About QCRI</a></li>
<li><a href="/contact">Contact us</a></li>
<li><a href="https://github.com/rheem-ecosystem">Fork me</a></li>
<li><a href="https://github.com/rheem-ecosystem/blob/master/LICENSE.TXT">License</a></li>
</ul>
</div>
<p>Copyright &copy; Developed by the dRHEEMers @ QCRI. All rights reserved.</p>
</div>
</div>
</div>
</footer>
</body>
</html>