<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
<title>Apache Wayang</title>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/bootstrap@4.5.3/dist/css/bootstrap.min.css" integrity="sha384-TX8t27EcRE3e/ihU7zmQxVncDAy5uIKz4rEkgIXeMed4M0jlfIDPvg6uqKI2xXr2" crossorigin="anonymous">
<link rel="stylesheet" href="../../static/css/color.css">
<link rel="stylesheet" href="../../static/fa/css/all.min.css">
<link rel="icon" type="image/png" href="../../static/img/wayang-favicon.png" />
<style>
.service-item {
text-align: center;
}
.italic {
font-style: italic;
}
.bold {
font-weight: bold;
}
hr {
border: 1px dotted #555555;
width: 80%;
}
p {
text-align: justify;
}
</style>
</head>
<!-- TODO: the padding of the body needs to be responsive -->
<body style="padding: 4em; background: white">
<div class="container shadow-lg p-3 mb-5 bg-white rounded">
<nav class="sticky-top navbar navbar-expand-lg navbar-light d-flex bd-highlight mt-n3 mx-n3 shadow mb-4" style="background: #A6A6A6">
<a class="p-2 flex-grow-1 bd-highlight navbar-brand">
<img src="../../static/img/logo-plain.png">
</a>
<button class="navbar-toggler" type="button" data-toggle="collapse" data-target="#navbarText" aria-controls="navbarText" aria-expanded="false" aria-label="Toggle navigation">
<span class="navbar-toggler-icon"></span>
</button>
<div class="collapse navbar-collapse" id="navbarText">
<ul class="navbar-nav mr-auto" style="margin-left: 25%">
<li class="nav-item active">
<a class="nav-link" href="../../index.html">Home <span class="sr-only">(current)</span></a>
</li>
<li class="nav-item">
<a class="nav-link" href="../../about.html">About</a>
</li>
<li class="nav-item">
<a class="nav-link" href="../../documentation.html">Documentation</a>
</li>
<li class="nav-item">
<a class="nav-link" href="../../publications.html">Publications</a>
</li>
<li class="nav-item dropdown">
<a class="nav-link dropdown-toggle" data-toggle="dropdown" href="#" role="button" aria-haspopup="true" aria-expanded="false">Apache</a>
<div class="dropdown-menu">
<a class="dropdown-item" href="http://www.apache.org/foundation/how-it-works.html">Apache Software Foundation</a>
<a class="dropdown-item" href="http://www.apache.org/licenses/">Apache License</a>
<a class="dropdown-item" href="http://www.apache.org/foundation/sponsorship.html">Sponsorship</a>
<a class="dropdown-item" href="http://www.apache.org/foundation/thanks.html">Thanks</a>
</div>
</li>
</ul>
</div>
</nav>
<div class="row mb-3 mt-n5 d-flex align-items-center" style="background-image: url(../../static/img/background-1.png); height: 10em; background-position: 50%">
<div class="col" style="text-align: center">
<h1 style="color: white; font-size: 4em">Publication</h1>
<h2 style="color: white; font-size: 2em">"Interoperating a Zoo of Data Processing Platforms Using Rheem"</h2>
</div>
</div>
<div class="row justify-content-md-center mb-4">
<div class="col-10 ">
<div class="post-info-wrapper">
<p class="italic">By <span class="bold">Yasser Idris and Sebastian Kruse</span> in <span class="bold">2017</span></p>
</div>
<hr />
<p>
We are witnessing a proliferation of big data, which has led to a zoo of data processing systems, each providing a different set of features. For example, Spark provides scalability for analytic tasks, whereas Java 8 Streams provides low latency. Furthermore, complex applications, such as ETL and ML, now require a mixture of platforms to perform tasks efficiently. In such complex data analytics pipelines, multiple data processing systems are used not only for performance reasons, but also because of data diversity: datasets often natively reside in different data formats and storage engines. Unfortunately, developers are left alone with two challenging tasks: (1) choosing the right platform for their applications; and (2) performing tedious and costly data migration and integration tasks to obtain the results.
In this talk, we will present Rheem, an open source, scalable cross-platform system that frees developers from these burdens. Rheem provides an abstraction layer on top of Spark (and other processing platforms) with the aim of enabling cross-platform optimization and interoperability. It automatically selects the best data processing platforms for a given task and also handles the cross-platform execution. In particular, we will discuss how Rheem allows Spark to work in tandem with other platforms in order to achieve higher performance. We will also show how easily a developer can write complex applications on top of Rheem that seamlessly use multiple data processing platforms according to the task at hand. With Rheem, developers do not have to worry about integration or data migration between Spark and other platforms.
</p>
<hr />
</div>
<div class="col-10 text-center">
<a href="https://databricks.com/session/interoperating-a-zoo-of-data-processing-platforms-using-rheem" class="btn btn-success">
<i class="far fa-file-pdf"></i> Download
</a>
</div>
</div>
<nav class="navbar fixed-bottom navbar-light bg-light position-relative mb-n3 mx-n3 mb-4" style="background: #A6A6A6">
<div class="row justify-content-center">
<div class="col-10 text-center">
<p style="text-align: justify">
Apache Wayang is an effort undergoing Incubation at The Apache Software Foundation (ASF), sponsored by the Incubator. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.
</p>
<a href="http://incubator.apache.org/">
<img src="../../static/img/egg-logo.png">
</a>
<br />
<p>
Copyright &#169; 2021 The Apache Software Foundation.<br />
Licensed under the Apache License, Version 2.0.<br />
Apache, the Apache Feather logo, and the Apache Incubator project logo are trademarks of The Apache Software Foundation.
</p>
</div>
</div> </nav>
</div>
<script src="https://code.jquery.com/jquery-3.5.1.slim.min.js" integrity="sha384-DfXdz2htPH0lsSSs5nCTpuj/zy4C+OGpamoFVy38MVBnE+IbbVYUew+OrCXaRkfj" crossorigin="anonymous"></script>
<script src="https://cdn.jsdelivr.net/npm/popper.js@1.16.1/dist/umd/popper.min.js" integrity="sha384-9/reFTGAW83EW2RDu2S0VKaIzap3H66lZH81PoYlFhbGU+6BZp6G7niu735Sk7lN" crossorigin="anonymous"></script>
<script src="https://cdn.jsdelivr.net/npm/bootstrap@4.5.3/dist/js/bootstrap.min.js" integrity="sha384-w1Q4orYjBQndcko6MimVbzY0tgp4pWB4lZ7lr30WKz0vr/aWKhXdBNmNb5D92v7s" crossorigin="anonymous"></script>
</body>
</html>