| <!DOCTYPE html>
|
| <html lang="en"><head> |
| <meta charset="utf-8"> |
| <title>Orchestrating Scalable Data Pipelines with Apache Toree, YuniKorn, Spark, and Airflow | Community Over Code Europe</title> |
| |
| <meta name="viewport" content="width=device-width, initial-scale=1"> |
| <meta name="description" |
| content="Community Over Code Europe is the annual gathering in Europe of the Apache Software Foundation community."> |
| |
| |
| <meta name="generator" content="Hugo 0.119.0"><meta property="og:title" content="Orchestrating Scalable Data Pipelines with Apache Toree, YuniKorn, Spark, and Airflow" /> |
| <meta property="og:description" content="This session explores the integrated use of Apache Toree, YuniKorn, Spark, and Airflow to create efficient, scalable data pipelines. We will start by discussing how Apache Toree provides an interactive analysis environment with Spark via Jupyter Notebook. Then, we’ll discuss using Apache YuniKorn to manage and schedule these computational resources, ensuring system efficiency. Central to our talk, we’ll delve into the role of Apache Spark in large-scale data processing, highlighting its integration with Toree and YuniKorn." /> |
| <meta property="og:type" content="article" /> |
| <meta property="og:url" content="https://eu.communityovercode.org/sessions/2024/orchestrating-scalable-data-pipelines-with-apache-toree-yunikorn-spark-and-airflow/" /><meta property="og:image" content="https://eu.communityovercode.org/images/card.jpg"/><meta property="article:section" content="sessions" /> |
| |
| |
| <meta name="twitter:card" content="summary_large_image"/> |
| <meta name="twitter:image" content="https://eu.communityovercode.org/images/card.jpg"/> |
| |
| <meta name="twitter:title" content="Orchestrating Scalable Data Pipelines with Apache Toree, YuniKorn, Spark, and Airflow"/> |
| <meta name="twitter:description" content="This session explores the integrated use of Apache Toree, YuniKorn, Spark, and Airflow to create efficient, scalable data pipelines. We will start by discussing how Apache Toree provides an interactive analysis environment with Spark via Jupyter Notebook. Then, we’ll discuss using Apache YuniKorn to manage and schedule these computational resources, ensuring system efficiency. Central to our talk, we’ll delve into the role of Apache Spark in large-scale data processing, highlighting its integration with Toree and YuniKorn."/> |
| <!-- plugins --> |
| |
| <link rel="stylesheet" href="/plugins/bootstrap.min.css"> |
| |
| <link rel="stylesheet" href="/plugins/bootstrap-table.min.css"> |
| |
| <link rel="stylesheet" href="/plugins/fontawesome.css"> |
| |
| |
| <!-- Main Stylesheet --> |
| |
| <link rel="stylesheet" href='/scss/style.min.css?v=202406201555' media="screen"> |
| |
| <!--Favicon--> |
| <link rel="shortcut icon" href="/favicon.ico" type="image/x-icon"> |
| <link rel="icon" href="/favicon.ico" type="image/x-icon"> |
| |
| |
| </head><body class="interior">
|
| <header class="header-bar">
|
| <nav class="navbar navbar-expand-lg main-nav navbar-light fixed-top">
|
|
|
| <a class="navbar-brand ml-4 pb-2" href='/'>
|
| <img src="/images/coc-logo-color.svg" alt="Community Over Code Europe" class="img-fluid logo-b">
|
| </a>
|
| <button class="navbar-toggler" type="button" data-bs-toggle="collapse" data-bs-target="#navigation"
|
| aria-controls="navigation" aria-expanded="false" aria-label="Toggle navigation">
|
| <span class="navbar-toggler-icon"></span>
|
| </button>
|
|
|
|
|
| <div class="collapse navbar-collapse text-center my-auto" id="navigation">
|
| <ul class="navbar-nav me-auto align-items-center">
|
|
|
|
|
| <li class="nav-item dropdown">
|
| <a class="nav-link dropdown-toggle" href="#" role="button" data-toggle="dropdown" aria-haspopup="true"
|
| aria-expanded="false">
|
| About
|
| </a>
|
| <div class="dropdown-menu">
|
|
|
| <a class="dropdown-item" href="/about">Community Over Code</a>
|
|
|
| <a class="dropdown-item" href="/about-the-asf">About the ASF</a>
|
|
|
| <a class="dropdown-item" href="/diversity-and-inclusion">Diversity &amp; Inclusion</a>
|
|
|
| </div>
|
| </li>
|
|
|
|
|
|
|
| <li class="nav-item">
|
| <a class="nav-link" href="/program">Program</a>
|
| </li>
|
|
|
|
|
|
|
| <li class="nav-item">
|
| <a class="nav-link" href="/speakers">Speakers</a>
|
| </li>
|
|
|
|
|
|
|
| <li class="nav-item dropdown">
|
| <a class="nav-link dropdown-toggle" href="#" role="button" data-toggle="dropdown" aria-haspopup="true"
|
| aria-expanded="false">
|
| Venue
|
| </a>
|
| <div class="dropdown-menu">
|
|
|
| <a class="dropdown-item" href="/venue">About the venue</a>
|
|
|
| <a class="dropdown-item" href="/how-to-get-there">How to get there</a>
|
|
|
| </div>
|
| </li>
|
|
|
|
|
|
|
| <li class="nav-item">
|
| <a class="nav-link" href="/#latest-news">News</a>
|
| </li>
|
|
|
|
|
|
|
| <li class="nav-item">
|
| <a class="nav-link" href="/faq">FAQ</a>
|
| </li>
|
|
|
|
|
|
|
|
|
| <li class="nav-item">
|
| <a id="nav-button" href="/tickets" class="btn btn-orange text-white btn-rounded">Tickets</a>
|
| </li>
|
|
|
| </ul>
|
| </div>
|
| </nav>
|
| </header>
|
|
|
|
|
| <section class="page-header">
|
| <div class="container">
|
| <div class="row justify-content-center">
|
| <div class="col-lg-8">
|
| <div class="content text-center">
|
| <h1 class="mb-3">Orchestrating Scalable Data Pipelines with Apache Toree, YuniKorn, Spark, and Airflow</h1>
|
| <div class="divider mx-auto mb-4 bg-secondary"></div>
|
| </div>
|
| </div>
|
| </div>
|
| </div>
|
| </section>
|
| |
| |
| |
| <section class="speaker-detail"> |
| <div class="container"> |
| <div class="row mt-4"> |
| <div class="image-column col-lg-3 d-none d-lg-block"> |
| <div class="schedule-block col-lg-10 col-md-12 col-sm-12"> |
| <div class="sec-title text-center"> |
| <span class="title">Speaker(s):</span> |
| <div class="speaker-info" style="margin-bottom: 20px;"> |
| |
| <figure class="thumb my-3"> |
| <a href="/speakers/luciano-resende/"> |
| <div class="img-container"> |
| |
| |
| |
| <img src="/images/speakers/luciano-resende_hu28b7b3799f8ad686a0c16c9e4fda2d33_370258_400x0_resize_q75_h2_box.webp" alt="Photo of Luciano Resende" class="img-fluid rounded-circle"> |
| |
| </div> |
| <h5 class="name">Luciano Resende</h5> |
| </a> |
| </figure> |
| |
| <figure class="thumb my-3"> |
| <a href="/speakers/hongyue-zhang/"> |
| <div class="img-container"> |
| |
| |
| |
| <img src="/images/speakers/hongyue-zhang_hu5fa3aaf0f52c05feba607eda67517e41_126874_400x0_resize_q75_h2_box.webp" alt="Photo of Hongyue Zhang" class="img-fluid rounded-circle"> |
| |
| </div> |
| <h5 class="name">Hongyue Zhang</h5> |
| </a> |
| </figure> |
| |
| </div> |
| |
| </div> |
| </div> |
| </div> |
| <div class="info-column col-lg-9 col-md-12 col-sm-12"> |
| <div class="inner-column"> |
| <div class="text-box"> |
| <div class="session-meta" id="date"> |
| |
| <div> |
| <em>Jun-04 16:10-16:40 in Rhapsody</em> |
| </div> |
| |
| <div class="d-lg-none d-xl-none"> |
| By |
| |
| <a class="speaker-inline-item" href="https://eu.communityovercode.org/speakers/luciano-resende/">Luciano Resende</a> |
| |
| <a class="speaker-inline-item" href="https://eu.communityovercode.org/speakers/hongyue-zhang/">Hongyue Zhang</a> |
| |
| </div> |
| |
| |
| |
| <div class="content mt-4"><p>This session explores the integrated use of Apache Toree, YuniKorn, Spark, and Airflow to create efficient, scalable data pipelines. We will start by discussing how Apache Toree provides an interactive analysis environment with Spark via Jupyter Notebook. Then, we’ll discuss using Apache YuniKorn to manage and schedule these computational resources, ensuring system efficiency. Central to our talk, we’ll delve into the role of Apache Spark in large-scale data processing, highlighting its integration with Toree and YuniKorn. Finally, we’ll demonstrate how Apache Airflow orchestrates this complex workflow, managing dependencies and providing end-to-end processing. Attendees will learn to leverage these Apache projects for optimized data processing.</p> |
| </div> |
| </div> |
| </div> |
| </div> |
| </div> |
| |
| </div> |
| </div> |
| </section> |
| |
| |
| |
|
|
|
|
| <footer>
|
| <div class="container-fluid">
|
| <div class="container py-5">
|
| <div class="d-flex justify-content-between">
|
| <div class="col-6 col-md-4 col-lg-3">
|
| <div class="mb-3"> <img src="/images/logo-h.svg" class="img-fluid" alt="Community Over Code Europe"></div>
|
|
|
|
|
| <ul class="list-inline mb-0">
|
|
|
| <li class="list-inline-item mx-2 h3" data-toggle="tooltip" data-placement="top" title aria-label="Email us" data-original-title="Email us">
|
| <a title="Email us" target="_blank" href="mailto:coceu@sg.com.mx?subject=[EU]">
|
| <i class="fa fa-envelope" aria-hidden="true"></i>
|
| </a>
|
| </li>
|
|
|
| <li class="list-inline-item mx-2 h3" data-toggle="tooltip" data-placement="top" title aria-label="Slack" data-original-title="Slack">
|
| <a title="Slack" target="_blank" href="https://s.apache.org/apachecon-slack">
|
| <i class="fab fa-slack" aria-hidden="true"></i>
|
| </a>
|
| </li>
|
|
|
| <li class="list-inline-item mx-2 h3" data-toggle="tooltip" data-placement="top" title aria-label="Watch us on YouTube" data-original-title="Watch us on YouTube">
|
| <a title="Watch us on YouTube" target="_blank" href="https://www.youtube.com/@communityovercode">
|
| <i class="fab fa-youtube" aria-hidden="true"></i>
|
| </a>
|
| </li>
|
|
|
| </ul>
|
|
|
|
|
| </div>
|
| <div class="col-md-6 text-right">
|
| <div class="footer-links">
|
|
|
|
|
| <ul>
|
|
|
| <li><a href="/coc" >
|
| Code of Conduct
|
| </a></li>
|
|
|
| <li><a href="/accessibility" >
|
| Accessibility
|
| </a></li>
|
|
|
| <li><a href="/privacy" >
|
| Privacy Policy
|
| </a></li>
|
|
|
| <li><a href="/team" >
|
| Organizers
|
| </a></li>
|
|
|
| <li><a href="https://communityovercode.org/wp-content/uploads/2023/12/community-over-code-prospectus-2024.pdf" >
|
| Prospectus
|
| </a></li>
|
|
|
| </ul>
|
|
|
|
|
| </div>
|
| </div>
|
| </div>
|
| </div>
|
| </div>
|
| <div class="footer-section footer-section__policies-section bg-dark">
|
| <div class="container my-0 footer-section__policies-section--disclaimer">
|
| Community Over Code operates under the terms of <a href="https://apache.org/foundation/policies/conduct">The ASF Code of Conduct</a>.
|
|
|
| </div>
|
| </div>
|
| </footer>
|
|
|
|
|
| <!-- JS Plugins -->
|
|
|
| <script src="/plugins/jquery.min.js"></script>
|
|
|
| <script src="/plugins/bootstrap.bundle.min.js"></script>
|
|
|
| <script src="/plugins/bootstrap-table.min.js"></script>
|
|
|
| <script src="https://js.tito.io/v2"></script>
|
|
|
|
|
| <script>
|
| var _paq = window._paq = window._paq || [];
|
|
|
| _paq.push(["disableCookies"]);
|
| _paq.push(['trackPageView']);
|
| _paq.push(['enableLinkTracking']);
|
| (function() {
|
| var u="https://analytics.apache.org/";
|
| _paq.push(['setTrackerUrl', u+'matomo.php']);
|
| _paq.push(['setSiteId', '39']);
|
| var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0];
|
| g.async=true; g.src=u+'matomo.js'; s.parentNode.insertBefore(g,s);
|
| })();
|
| </script>
|
|
|
| </body>
|
|
|
| </html> |