| <!DOCTYPE html> |
| <html> |
| <head> |
| <meta charset="utf-8"> |
| <meta http-equiv="X-UA-Compatible" content="IE=edge"> |
| <meta name="viewport" content="width=device-width, initial-scale=1"> |
| |
| <link rel="shortcut icon" href="/favicon.ico" type="image/x-icon"> |
| <link rel="icon" href="/favicon.ico" type="image/x-icon"> |
| |
| <title>Resources</title> |
| |
| <!-- Bootstrap core CSS --> |
| <link href="/assets/css/bootstrap.min.css" rel="stylesheet"> |
| <!-- Bootstrap theme --> |
| <link href="/assets/css/bootstrap-theme.min.css" rel="stylesheet"> |
| |
| <!-- Custom styles for this template --> |
| <link rel="stylesheet" href="http://fortawesome.github.io/Font-Awesome/assets/font-awesome/css/font-awesome.css"> |
| <link href="/css/style.css" rel="stylesheet"> |
| <link href="/assets/css/owl.theme.css" rel="stylesheet"> |
| <link href="/assets/css/owl.carousel.css" rel="stylesheet"> |
| <script type="text/javascript" src="/assets/js/jquery.min.js"></script> |
| <script type="text/javascript" src="/assets/js/bootstrap.min.js"></script> |
| <script type="text/javascript" src="/assets/js/owl.carousel.min.js"></script> |
| <script type="text/javascript" src="/assets/js/storm.js"></script> |
| </head> |
| |
| |
| <body> |
| <header> |
| <div class="container-fluid"> |
| <div class="row"> |
| <div class="col-md-5"> |
| <a href="/index.html"><img src="/images/logo.png" class="logo"/></a> |
| </div> |
| <div class="col-md-5"> |
| |
| </div> |
| <div class="col-md-2"> |
| <a href="/downloads.html" class="btn-std btn-block btn-download">Download</a> |
| </div> |
| </div> |
| </div> |
| </header> |
| <!--Header End--> |
| <!--Navigation Begin--> |
| <div class="navbar" role="banner"> |
| <div class="container-fluid"> |
| <div class="navbar-header"> |
| <button class="navbar-toggle" type="button" data-toggle="collapse" data-target=".bs-navbar-collapse"> |
| <span class="icon-bar"></span> |
| <span class="icon-bar"></span> |
| <span class="icon-bar"></span> |
| </button> |
| </div> |
| <nav class="collapse navbar-collapse bs-navbar-collapse" role="navigation"> |
| <ul class="nav navbar-nav"> |
| <li><a href="/index.html" id="home">Home</a></li> |
| <li><a href="/getting-help.html" id="getting-help">Getting Help</a></li> |
| <li><a href="/about/integrates.html" id="project-info">Project Information</a></li> |
| <li class="dropdown"> |
| <a href="#" class="dropdown-toggle" data-toggle="dropdown" id="documentation">Documentation <b |
| class="caret"></b></a> |
| <ul class="dropdown-menu"> |
| |
| |
| <li><a href="/releases/2.4.0/index.html">2.4.0</a></li> |
| |
| |
| <li><a href="/releases/2.3.0/index.html">2.3.0</a></li> |
| |
| |
| <li><a href="/releases/2.2.1/index.html">2.2.1</a></li> |
| |
| |
| <li><a href="/releases/2.2.0/index.html">2.2.0</a></li> |
| |
| |
| <li><a href="/releases/2.1.1/index.html">2.1.1</a></li> |
| |
| |
| <li><a href="/releases/2.1.0/index.html">2.1.0</a></li> |
| |
| |
| <li><a href="/releases/2.0.0/index.html">2.0.0</a></li> |
| |
| |
| <li><a href="/releases/1.2.4/index.html">1.2.4</a></li> |
| |
| |
| <li><a href="/releases/1.2.3/index.html">1.2.3</a></li> |
| |
| |
| </ul> |
| </li> |
| <li><a href="/talksAndVideos.html">Talks and Slideshows</a></li> |
| <li class="dropdown"> |
| <a href="#" class="dropdown-toggle" data-toggle="dropdown" id="contribute">Community <b |
| class="caret"></b></a> |
| <ul class="dropdown-menu"> |
| <li><a href="/contribute/Contributing-to-Storm.html">Contributing</a></li> |
| <li><a href="/contribute/People.html">People</a></li> |
| <li><a href="/contribute/BYLAWS.html">ByLaws</a></li> |
| <li><a href="/Powered-By.html">PoweredBy</a></li> |
| </ul> |
| </li> |
| <li><a href="/2022/03/25/storm240-released.html" id="news">News</a></li> |
| </ul> |
| </nav> |
| </div> |
| </div> |
| |
| |
| <div class="container-fluid"> |
| <h1 class="page-title">Resources</h1> |
| <div class="row"> |
| <div class="col-md-12"> |
| <!-- Documentation --> |
| |
| <p class="post-meta"></p> |
| |
| <div class="documentation-content"> |
| <div class="row"> |
| <div class="col-md-12"> |
| <div class="resources"> |
| <ul class="nav nav-tabs" role="tablist"> |
| <li role="presentation" class="active"><a href="#talks" aria-controls="talks" role="tab" |
| data-toggle="tab">Talks</a></li> |
| <li role="presentation"><a href="#slideshows" aria-controls="slideshows" role="tab" |
| data-toggle="tab">Slideshows</a></li> |
| </ul> |
| |
| <div class="tab-content"> |
| <div role="tabpanel" class="tab-pane active" id="talks"> |
| |
| <!-- ################### --> |
| <div class="brickSS"> |
| <div class="row"> |
| |
| <div class="col-md-6"> |
| <h3>Roshan Naik - Apache Storm 2.0 : Rearchitecture and Performance</h3> |
| <div> |
| <p>Published on Oct 29, 2019</p> |
| <p> |
| The effort to rearchitect Apache Storm's core engine was born |
| from the observation that there exists a significant gap between |
| hardware capabilities and the performance of the best streaming |
| engines. This talk takes a look at the performance and |
| architecture of the new engine which features a leaner threading |
| model, a lock free messaging subsystem and a new |
| ultra-lightweight Back Pressure model. |
| </p> |
| </div> |
| </div> |
| <div class="col-md-6"> |
| <iframe width="560" height="315" |
| src="https://www.youtube.com/embed/QsPzAtZXIk4" frameborder="0" |
| allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" |
| allowfullscreen></iframe> |
| </div> |
| |
| </div> |
| </div> |
| |
| <!-- ################### --> |
| <!-- ################### --> |
| <div class="brickSS"> |
| <div class="row"> |
| <div class="col-md-6"> |
| <iframe width="560" height="315" |
| src="https://www.youtube.com/embed/xNsG26uZ7sw" frameborder="0" |
| allowfullscreen></iframe> |
| </div> |
| <div class="col-md-6"> |
| <h3>Kishore Patil (Verizon Media) - Apache Storm 2.0 Overview - Features |
| and Performance Enhancements</h3> |
| <div> |
| <p>Published on Oct 29, 2019</p> |
| <p> |
| Apache Storm 2.0 is the biggest release since becoming a top |
| level Apache project. This talk provides a feature summary for |
| the 2.0 release and overviews of the major enhancements to |
| scalability, scheduling, secruity, metrics, Streaming SQL, |
| Streams API etc. |
| </p> |
| </div> |
| </div> |
| |
| </div> |
| </div> |
| |
| <!-- ################### --> |
| <!-- ################### --> |
| <div class="brickSS"> |
| <div class="row"> |
| |
| <div class="col-md-6"> |
| <h3>Nitin Aggarwal, Ishan Chhabra - Fighting Fraud in Real Time by |
| Processing 1M+ TPS Using Storm on Slider (YARN)</h3> |
| <div> |
| <p>Published on Jul 2, 2016</p> |
| <p> |
| Rocket Fuel participates in 100+ billion real time advertising |
| auctions everyday to pick out the right opportunities for our |
| clients to advertise to internet consumers. However, some of |
| these opportunities are created by fraudulent entities, who try |
| to steal away money from various participants in the online |
| advertising ecosystem, including Rocket Fuel. In this talk, we |
| describe Helios, our in-house system built using Storm, Slider |
| and HBase to combat this fraud in real time. We start with the |
| core use case and topology design, and then detail how we run |
| Storm on YARN via Slider which, along with other optimizations, |
| enables Storm to scale to 1M+ TPS in this setting. We also |
| describe the design and implementation of our recently open |
| sourced, highly scalable custom spout to ingest rolling log |
| files from HDFS into Storm pipelines with minimal delay. |
| Finally, we cover how we use HBase (on YARN via Slider) to join |
| high velocity streams, despite out of order events and stream |
| delays, and to compute aggregates at large scale via |
| materialized cubes. |
| </p> |
| </div> |
| </div> |
| <div class="col-md-6"> |
| <iframe width="560" height="315" |
| src="https://www.youtube.com/embed/bZN3z1DjJss" frameborder="0" |
| allowfullscreen></iframe> |
| </div> |
| |
| </div> |
| </div> |
| |
| <!-- ################### --> |
| <!-- ################### --> |
| <div class="brickSS"> |
| <div class="row"> |
| <div class="col-md-6"> |
| <iframe width="560" height="315" |
| src="https://www.youtube.com/embed/_Q2uzRIkTd8" frameborder="0" |
| allowfullscreen></iframe> |
| </div> |
| <div class="col-md-6"> |
| <h3>P. Taylor Goetz - The Future of Apache Storm</h3> |
| <div> |
| <p>Published on June 30, 2016</p> |
| <p> |
| At over 3 years old as an open source project, and an Apache |
| project for more than 2 years, Apache Storm is one of the most |
| mature and widely adopted real-time data platforms available. |
| Apache Storm is used across a wide range of industries, from |
| Fortune 500 companies, to three-letter government agencies, to |
| big data startups, and everything in between. In this session |
| we'll explore how Storm has evolved over the years, and more |
| importantly how it will continue to evolve and innovate in both |
| the near and long term. We will discuss new features, |
| performance improvements, project roadmaps, and its relationship |
| with other open source streaming solutions. |
| </p> |
| </div> |
| </div> |
| |
| </div> |
| </div> |
| |
| <!-- ################### --> |
| <!-- ################### --> |
| <div class="brickSS"> |
| <div class="row"> |
| <div class="col-md-6"> |
| <h3>Savitha Ravikrishnan, Dheeraj Kapur - Managing Hadoop, HBase, and |
| Storm Clusters at Yahoo Scale</h3> |
| <div> |
| <p>Published on June 30, 2016</p> |
| <p> |
| Hadoop at Yahoo is a massive infrastructure and a challenging |
| platform to manage. On a day-to-day basis, it unfolds many |
| challenges in order to be able to run at this scale. We are here |
| to discuss our success in terms of effectively being able to |
| manage this infrastructure from an operations perspective. We |
| have come a long way from full downtime to now no longer |
| requiring any downtime for upgrades and cater to massive |
| workloads in our 40+ clusters in the ecosystem spread across |
| multiple data centers. We are using CI/CD with no downtime |
| upgrades for Hadoop, HBase, Storm, and Support services. Things |
| get even more complex with multi-tenancy, differing workload |
| characteristics, and strict SLAs on HBase and Storm. We will |
| talk about rolling upgrades, and automation & tools we have |
| built to manage a massive grid infrastructure with support for |
| multi-tenancy and full CI/CD. |
| </p> |
| </div> |
| </div> |
| <div class="col-md-6"> |
| <iframe width="560" height="315" |
| src="https://www.youtube.com/embed/Ib8ch-xVzbw" frameborder="0" |
| allowfullscreen></iframe> |
| </div> |
| |
| </div> |
| </div> |
| |
| <!-- ################### --> |
| <!-- ################### --> |
| <div class="brickSS"> |
| <div class="row"> |
| <div class="col-md-6"> |
| <iframe width="560" height="315" |
| src="https://www.youtube.com/embed/gk06oNbNkOg" frameborder="0" |
| allowfullscreen></iframe> |
| </div> |
| <div class="col-md-6"> |
| <h3>Casey Stella - Outlier Analysis and Anomaly Detection for Sensors |
| with Spark and Storm</h3> |
| <div> |
| <p>Published on Apr 14, 2016</p> |
| <p> |
| Detecting outliers and anomalies in data is one of the most |
| common tasks that the working data scientist is asked to do, |
| especially when dealing with volumes of sensor data. Despite |
| this, the library support for problems of this variety are |
| woefully unavailable. Often data scientists are forced to go to |
| research papers and implement their own solutions. This talk |
| will cover the various approaches that I have seen work well in |
| the field and provide reference implementations. We will cover |
| both batch and streaming approaches in Spark and Storm aimed at |
| analyzing sensor feeds. |
| </p> |
| </div> |
| </div> |
| |
| </div> |
| </div> |
| |
| <!-- ################### --> |
| <!-- ################### --> |
| <div class="brickSS"> |
| <div class="row"> |
| <div class="col-md-6"> |
| <h3>P. Taylor Goetz - Beyond the Tweeting Toaster</h3> |
| <div> |
| <p>Published on Sep 1, 2015</p> |
| <p> |
| In this session we will look at how streaming sensor data fits |
| into a variety of (I)IoT analytics use cases, and how Apache |
| Storm and Kafka fit into an overall architecture for large-scale |
| streaming analytics. You will also learn how to leverage the |
| highly accessible Arduino microcontroller platform to create |
| low-cost sensor networks and stream data to Apache Storm for |
| analysis in real time. Finally, we will give a live |
| demonstration of sensor analysis using Kafka, Storm, and an |
| out-of-the-box Arduino board (no soldering required!). |
| </p> |
| </div> |
| </div> |
| <div class="col-md-6"> |
| <iframe width="560" height="315" |
| src="https://www.youtube.com/embed/_h5VjQoAJq4" frameborder="0" |
| allowfullscreen></iframe> |
| </div> |
| </div> |
| </div> |
| |
| <!-- ################### --> |
| <!-- ################### --> |
| <div class="brickSS"> |
| <div class="row"> |
| <div class="col-md-6"> |
| <iframe width="560" height="315" |
| src="https://www.youtube.com/embed/h77pMHzFrSg" frameborder="0" |
| allowfullscreen></iframe> |
| </div> |
| <div class="col-md-6"> |
| <h3>David Wilcox and Bobby Evans - Yahoo Ad Manager Plus: A Case |
| Study</h3> |
| <div> |
| <p>Published on Jun 30, 2015</p> |
| <p> |
| The Yahoo Ad Manager Plus platform (YAM+) provides reports to |
| advertisers on impressions, clicks, and conversions for their ad |
| campaigns. Impressions and clicks are straightforward. |
| Conversions require joining "action beacons" from advertisers |
| with impressions and clicks from advertisements served by YAM+. |
| A conversion is recorded if a user clicked on or was shown an |
| advertisement associated with the campaign identified in the |
| beacon. This presentation describes a storm topology that uses |
| HBase and Druid to provide low-latency feedback to advertisers |
| on the performance of their campaigns. It covers storm and HBase |
| tuning that was needed to support this reporting at production |
| scale. |
| </p> |
| </div> |
| </div> |
| </div> |
| </div> |
| |
| <!-- ################### --> |
| <!-- ################### --> |
| <div class="brickSS"> |
| <div class="row"> |
| <div class="col-md-6"> |
| <h3>Bobby Evans - From Gust to Tempest: Scaling Apache Storm</h3> |
| <div> |
| <p>Published on Jun 30, 2015</p> |
| <p> |
| At Yahoo, we extensively use the Apache Storm distributed |
| real-time computation platform at medium scale deployment. In |
| this talk we overview a collection of recent developments at |
| Yahoo enabling massive Storm scaling to an order of magnitude |
| larger clusters. They include a resource aware scheduler, |
| load-aware shuffle grouping, a stand-alone heart-beat server |
| that reduces the load on ZooKeeper, compression of ZooKeeper |
| data and using timestamps instead of whole data through |
| ZooKeeper, and finally a new distributed cache mechanism to |
| distribute large files required by bolts. |
| </p> |
| </div> |
| </div> |
| <div class="col-md-6"> |
| <iframe width="560" height="315" |
| src="https://www.youtube.com/embed/pB9d3tMM__k" frameborder="0" |
| allowfullscreen></iframe> |
| </div> |
| </div> |
| </div> |
| |
| <!-- ################### --> |
| <!-- ################### --> |
| <div class="brickSS"> |
| <div class="row"> |
| <div class="col-md-6"> |
| <iframe width="560" height="315" |
| src="https://www.youtube.com/embed/hiBc6bHoe3o" frameborder="0" |
| allowfullscreen></iframe> |
| </div> |
| <div class="col-md-6"> |
| <h3>Sheetal Dolas - Design Patterns for real time data analytics</h3> |
| <div> |
| <p>Published on Jun 30, 2015</p> |
| <p> |
| Businesses are moving from large-scale batch data analysis to |
| large-scale real-time data analysis. Apache Storm has emerged as |
| one of the most popular platforms for the purpose.</p> |
| <p> |
| This talk covers proven design patterns for real time stream |
| processing. Patterns that have been vetted in large-scale |
| production deployments that process 10s of billions of |
| events/day and 10s of terabytes of data/day. |
| </p> |
| </div> |
| </div> |
| </div> |
| </div> |
| |
| <!-- ################### --> |
| |
| <!-- ################### --> |
| <div class="brickSS"> |
| <div class="row"> |
| <div class="col-md-6"> |
| <h3>Andrew Montalenti - streamparse: real-time streams with Python and |
| Apache Storm - PyCon 2015</h3> |
| <div> |
| <p>Published on Apr 12, 2015</p> |
| <p> |
| Real-time streams are everywhere, but does Python have a good |
| way of processing them? Until recently, there were no good |
| options. A new open source project, streamparse, makes working |
| with real-time data streams easy for Pythonistas. If you have |
| ever wondered how to process 10,000 data tuples per second with |
| Python -- while maintaining high availability and low latency -- |
| this talk is for you.</p> |
| </div> |
| </div> |
| <div class="col-md-6"> |
| <iframe width="560" height="315" |
| src="https://www.youtube.com/embed/ja4Qj9-l6WQ" frameborder="0" |
| allowfullscreen></iframe> |
| </div> |
| |
| </div> |
| </div> |
| |
| <!-- ################### --> |
| <!-- ################### --> |
| <div class="brickSS"> |
| <div class="row"> |
| <div class="col-md-6"> |
| <iframe width="560" height="315" |
| src="https://www.youtube.com/embed/cH8hKyf4Y40" frameborder="0" |
| allowfullscreen></iframe> |
| </div> |
| <div class="col-md-6"> |
| <h3>P. Taylor Goetz - Scaling Storm: Cluster Sizing and Performance |
| Optimization</h3> |
| <div> |
| <p>Published on Jun 23, 2014</p> |
| <p> |
| ne of the most commonly asked questions about Storm is how to |
| properly size and scale a cluster for a given use case. While |
| there is no magic bullet when it comes to capacity planning for |
| a Storm cluster, there are many operational and development |
| techniques that can be applied to eek out the maximum throughput |
| for a given application. In this session we’ll cover capacity |
| planning, performance tuning and optimization from both an |
| operational and development perspective. We will discuss the |
| basics of scaling, common mistakes and misconceptions, how |
| different technology decisions affect performance, and how to |
| identify and scale around the bottlenecks in a Storm |
| deployment.</p> |
| </div> |
| </div> |
| </div> |
| </div> |
| |
| <!-- ################### --> |
| |
| <div class="brickSS"> |
| <div class="row"> |
| <div class="col-md-6"> |
| <h3>Real-Time Big Data Analytics with Storm |
| </h3> |
| <div> |
| <p>Published on Oct 12, 2013</p> |
| <p> |
| This talk provides an overview of the open source Storm system |
| for processing Big Data in realtime. The talk starts with an |
| overview of the technology, including key components: Nimbus, |
| Zookeeper, Topology, Tuple, Trident. The presentation then dives |
| into the complex Big Data architecture in which Storm can be |
| integrated. The result is a compelling stack of technologies |
| including integrated Hadoop clusters, MPP, and NoSQL databases. |
| |
| The presentation then reviews real world use cases for realtime |
| Big Data analytics. Social updates, in particular real-time news |
| feeds on sites like Twitter and Facebook, benefit from Storm's |
| capacity to process benefits from distributed logic of |
| streaming. Another case study is financial compliance |
| monitoring, where Storm plays a primary role in reducing the |
| market data to a useable subset in realtime. In a final use |
| case, Storm is crucial to collect rich operational intelligence, |
| because it builds multidimensional stats and executes |
| distributed queries.</p> |
| </div> |
| </div> |
| <div class="col-md-6"> |
| <iframe width="560" height="315" |
| src="https://www.youtube.com/embed/hVO5nbxnBkU" frameborder="0" |
| allowfullscreen></iframe> |
| </div> |
| </div> |
| </div> |
| |
| <!-- ################### --> |
| |
| |
| <div class="brickSS"> |
| <div class="row"> |
| <div class="col-md-6"> |
| <iframe width="560" height="315" |
| src="https://www.youtube.com/embed/od8U-XijzlQ" frameborder="0" |
| allowfullscreen></iframe> |
| </div> |
| |
| <div class="col-md-6"> |
| <h3>Andrew Montalenti & Keith Bourgoin - Real-time streams and logs with |
| Storm and Kafka - PyData SV 2014 |
| </h3> |
| <div> |
| <p>Published on Jun 12, 2014</p> |
| <p> |
| |
| Some of the biggest issues at the center of analyzing large |
| amounts of data are query flexibility, latency, and fault |
| tolerance. Modern technologies that build upon the success of |
| "big data" platforms, such as Apache Hadoop, have made it |
| possible to spread the load of data analysis to commodity |
| machines, but these analyses can still take hours to run and do |
| not respond well to rapidly-changing data sets.</p> |
| |
| <p>A new generation of data processing platforms -- which we call |
| "stream architectures" -- have converted data sources into |
| streams of data that can be processed and analyzed in real-time. |
| This has led to the development of various distributed real-time |
| computation frameworks (e.g. Apache Storm) and multi-consumer |
| data integration technologies (e.g. Apache Kafka). Together, |
| they offer a way to do predictable computation on real-time data |
| streams.</p> |
| |
| <p>In this talk, we will give an overview of these technologies and |
| how they fit into the Python ecosystem. This will include a |
| discussion of current open source interoperability options with |
| Python, and how to combine real-time computation with batch |
| logic written for Hadoop. We will also discuss Kafka and Storm's |
| alternatives, current industry usage, and some real-world |
| examples of how these technologies are being used in production |
| by Parse.ly today.</p> |
| </div> |
| </div> |
| </div> |
| </div> |
| <!-- ################### --> |
| <div class="brickSS"> |
| <div class="row"> |
| <div class="col-md-6"> |
| <h3>Yahoo talks about Spark vs. Storm |
| </h3> |
| <div> |
| <p>Published on Sep 18, 2014</p> |
| <p> |
| Bobby Evans and Tom Graves, the engineering leads for Spark and |
| Storm development at Yahoo will talk about how these |
| technologies are used on Yahoo's grids and reasons why to use |
| one or the other.</p> |
| |
| <p>Bobby Evans is the low latency data processing architect at |
| Yahoo. He is a PMC member on many Apache projects including |
| Storm, Hadoop, Spark, and Tez. His team is responsible for |
| delivering Storm as a service to all of Yahoo and maintaining |
| Spark on Yarn for Yahoo (Although Tom really does most of that |
| work).</p> |
| |
| <p>Tom Graves a Senior Software Engineer on the Platform team at |
| Yahoo. He is an Apache PMC member on Hadoop, Spark, and Tez. His |
| team is responsible for delivering and maintaining Spark on Yarn |
| for Yahoo.</p> |
| </div> |
| </div> |
| <div class="col-md-6"> |
| <iframe width="560" height="315" |
| src="https://www.youtube.com/embed/uJ5rdAPHE1w" frameborder="0" |
| allowfullscreen></iframe> |
| </div> |
| </div> |
| </div> |
| <!-- ################### --> |
| <div class="brickSS"> |
| <div class="row"> |
| <div class="col-md-6"> |
| <iframe width="560" height="315" |
| src="https://www.youtube.com/embed/5F0eQ7mkpTU" frameborder="0" |
| allowfullscreen></iframe> |
| </div> |
| </div> |
| |
| <div class="col-md-6"> |
| <h3>Apache Storm Deployment and Use Cases by Spotify Developers |
| </h3> |
| <div> |
| <p>Published on Apr 3, 2014</p> |
| <p> |
| This talk was presented at the New York City Storm User Group hosted |
| by Spotify on March 25, 2014.</p> |
| <p> |
| This is the first time that a Spotify engineer has spoken publicly |
| about their deployment and use cases for Storm! In this talk, |
| Software Engineer Neville Li describes: |
| <ul> |
| <li> |
| Real-time features developed using Storm and Kafka including |
| recommendations, social features, data visualization and ad |
| targeting |
| </li> |
| |
| <li>Architecture</li> |
| |
| <li>Production integration</li> |
| |
| <li>Best practices for deployment</li> |
| </ul> |
| </p> |
| <p>Spotify is an exciting case study - users create 600 Gigabyte of data |
| per day and 150 Gigabyte of data per day via different services. |
| Every day 4 Terabyte of data is generated in Hadoop, a 700-node |
| cluster running over 2.000 jobs per day. They currently have 28 |
| Petabytes of storage, spread out over 4 data centres across the |
| world.</p> |
| </div> |
| </div> |
| </div> |
| |
| <!-- ################### --> |
| <!-- ################### --> |
| <div class="brickSS"> |
| <div class="row"> |
| <div class="col-md-6"> |
| <h3>Nathan Bijnens: A Real-Time Architecture Using Hadoop & |
| Storm</h3> |
| <div> |
| <p>Published on Dec 19, 2013</p> |
| <p>With the proliferation of data sources and growing user bases, |
| the amount of data generated requires new ways for storage and |
| processing. Hadoop opened new possibilities, yet it falls short |
| of instant delivery. Adding stream processing using Nathan |
| Marz's Storm, can overcome this delay and bridge the gap to |
| real-time aggregation and reporting. On the Batch layer all |
| master data is kept and is immutable. Once the base data is |
| stored a recurring process will index the data. This process |
| reads all master data, parses it and will create new views out |
| of it. The new views will replace all previously created views. |
| In the Speed layer data is stored not yet absorbed in the Batch |
| layer. Hours of data instead of years of data. Once the data is |
| indexed in the Batch layer the data can discarded in the Speed |
| layer. The Query Service merges the data from the Speed and |
| Batch layers. This talk focuses on the Lambda architecture, |
| which combines multiple technologies to be able to process vast |
| amounts of data, while still being able to react timely and |
| report near real-time statistics. Filmed at JAX London 2013.</p> |
| </div> |
| </div> |
| <div class="col-md-6"> |
| <iframe width="560" height="315" |
| src="https://www.youtube.com/embed/CrABmVi12_A" frameborder="0" |
| allowfullscreen></iframe> |
| </div> |
| |
| </div> |
| </div> |
| <!-- ################### --> |
| |
| <div class="brickSS"> |
| <div class="row"> |
| <div class="col-md-6"> |
| <iframe width="560" height="315" |
| src="https://www.youtube.com/embed/LpNbjXFPyZ0" frameborder="0" |
| allowfullscreen></iframe> |
| </div> |
| |
| <div class="col-md-6"> |
| <h3>Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic |
| Search</h3> |
| <div> |
| <p>Published on Nov 29, 2013</p> |
| <p> |
| This is a technical architect's case study of how Loggly has |
| employed the latest social-media-scale technologies as the |
| backbone ingestion processing for our multi-tenant, |
| geo-distributed, and real-time log management system. This |
| presentation describes design details of how we built a |
| second-generation system fully leveraging AWS services including |
| Amazon Route 53 DNS with heartbeat and latency-based routing, |
| multi-region VPCs, Elastic Load Balancing, Amazon Relational |
| Database Service, and a number of pro-active and re-active |
| approaches to scaling computational and indexing capacity.</p> |
| <p> |
| The talk includes lessons learned in our first generation |
| release, validated by thousands of customers; speed bumps and |
| the mistakes we made along the way; various data models and |
| architectures previously considered; and success at scale: |
| speeds, feeds, and an unmeltable log processing engine.</p> |
| </div> |
| </div> |
| </div> |
| </div> |
| <!-- ################### --> |
| |
| <!-- ################### --> |
| |
| <div class="brickSS"> |
| <div class="row"> |
| |
| <div class="col-md-6"> |
| <h3>Real-Time Big Data Analytics with Storm |
| </h3> |
| <div> |
| <p>Published on Oct 12, 2013</p> |
| <p> |
| This talk provides an overview of the open source Storm system |
| for processing Big Data in realtime. The talk starts with an |
| overview of the technology, including key components: Nimbus, |
| Zookeeper, Topology, Tuple, Trident. The presentation then dives |
| into the complex Big Data architecture in which Storm can be |
| integrated. The result is a compelling stack of technologies |
| including integrated Hadoop clusters, MPP, and NoSQL |
| databases.</p> |
| <p> |
| The presentation then reviews real world use cases for realtime |
| Big Data analytics. Social updates, in particular real-time news |
| feeds on sites like Twitter and Facebook, benefit from Storm's |
| capacity to process benefits from distributed logic of |
| streaming. Another case study is financial compliance |
| monitoring, where Storm plays a primary role in reducing the |
| market data to a useable subset in realtime. In a final use |
| case, Storm is crucial to collect rich operational intelligence, |
| because it builds multidimensional stats and executes |
| distributed queries.</p> |
| </div> |
| </div> |
| <div class="col-md-6"> |
| <iframe width="560" height="315" |
| src="https://www.youtube.com/embed/hVO5nbxnBkU" frameborder="0" |
| allowfullscreen></iframe> |
| </div> |
| </div> |
| </div> |
| |
| <!-- ################### --> |
| |
| |
| <!-- ################### --> |
| <div class="brickSS"> |
| <div class="row"> |
| <div class="col-md-6"> |
| <iframe width="560" height="315" |
| src="https://www.youtube.com/embed/bdps8tE0gYo" frameborder="0" |
| allowfullscreen></iframe> |
| </div> |
| <div class="col-md-6"> |
| <h3>ETE 2012: Nathan Marz on Storm |
| </h3> |
| <div> |
| <p>Published on May 15, 2012</p> |
| <p> |
| |
| Storm makes it easy to write and scale complex realtime |
| computations on a cluster of computers, doing for realtime |
| processing what Hadoop did for batch processing. Storm |
| guarantees that every message will be processed. And it's fast |
| -- you can process millions of messages per second with a small |
| cluster. Best of all, you can write Storm topologies using any |
| programming language. Storm was open-sourced by Twitter in |
| September of 2011 and has since been adopted by numerous |
| companies around the world.</p> |
| <p> |
| Storm provides a small set of simple, easy to understand |
| primitives. These primitives can be used to solve a stunning |
| number of realtime computation problems, from stream processing |
| to continuous computation to distributed RPC. In this talk |
| you'll learn: |
| <ul> |
| <li>The concepts of Storm: streams, spouts, bolts, and |
| topologies |
| </li> |
| <li>Developing and testing topologies using Storm's local mode |
| </li> |
| <li>Deploying topologies on Storm clusters</li> |
| <li>How Storm achieves fault-tolerance and guarantees data |
| processing |
| </li> |
| <li>Computing intense functions on the fly in parallel using |
| Distributed RPC |
| </li> |
| <li>Making realtime computations idempotent using transactional |
| topologies |
| </li> |
| <li>Examples of production usage of Storm</li> |
| </ul> |
| </p> |
| </div> |
| </div> |
| </div> |
| </div> |
| <!-- ################### --> |
| <!-- ########## END VIDEOS ######### --> |
| </div> |
| |
| <div role="tabpanel" class="tab-pane" id="slideshows"> |
| <div class="row" style="padding-left: 45px;"> |
| <div class="col-md-6 brick"> |
| <h2>Resource Aware Scheduling in Storm</h2> |
| |
| <iframe src="//www.slideshare.net/slideshow/embed_code/key/GDqAajROY4Q3ot" |
| width="425" height="355" frameborder="0" marginwidth="0" |
| marginheight="0" scrolling="no" |
| style="border:1px solid #CCC; border-width:1px; margin-bottom:5px; max-width: 100%;" |
| allowfullscreen></iframe> |
| <div style="margin-bottom:5px"><strong> <a |
| href="//www.slideshare.net/HadoopSummit/resource-aware-scheduling-in-apache-spark" |
| title="Resource Aware Scheduling in Apache Storm" target="_blank">Resource |
| Aware Scheduling in Apache Storm</a> </strong> from <strong><a |
| target="_blank" href="//www.slideshare.net/HadoopSummit">HadoopSummit</a></strong> |
| </div> |
| </div> |
| |
| <div class="col-md-6 brick"> |
| <h2>Apache Storm 0.9 basic training</h2> |
| <iframe src="//www.slideshare.net/slideshow/embed_code/key/GUD3Y58U1f973x" |
| width="425" height="355" frameborder="0" marginwidth="0" |
| marginheight="0" scrolling="no" |
| style="border:1px solid #CCC; border-width:1px; margin-bottom:5px; max-width: 100%;" |
| allowfullscreen></iframe> |
| <div style="margin-bottom:5px"><strong> <a |
| href="//www.slideshare.net/miguno/apache-storm-09-basic-training-verisign" |
| title="Apache Storm 0.9 basic training - Verisign" target="_blank">Apache |
| Storm 0.9 basic training - Verisign</a> </strong> from <strong><a |
| href="//www.slideshare.net/miguno" target="_blank">Michael Noll</a></strong> |
| </div> |
| </div> |
| |
| <div class="col-md-6 brick"> |
| <h2>Scaling Apache Storm - Strata + Hadoop World 2014</h2> |
| <iframe src="//www.slideshare.net/slideshow/embed_code/key/NRMYq1985xMCWv" |
| width="340" height="290" frameborder="0" marginwidth="0" |
| marginheight="0" scrolling="no" |
| style="border:1px solid #CCC; border-width:1px; margin-bottom:5px; max-width: 100%;" |
| allowfullscreen></iframe> |
| <div style="margin-bottom:5px"><strong> <a |
| href="//www.slideshare.net/ptgoetz/scaling-apache-storm-strata-hadoopworld-2014" |
| title="Scaling Apache Storm - Strata + Hadoop World 2014" |
| target="_blank">Scaling Apache Storm - Strata + Hadoop World |
| 2014</a> </strong> from <strong><a href="//www.slideshare.net/ptgoetz" |
| target="_blank">P. Taylor |
| Goetz</a></strong></div> |
| </div> |
| |
| <div class="col-md-6 brick"> |
| <h2>Yahoo compares Storm and Spark</h2> |
| <iframe src="//www.slideshare.net/slideshow/embed_code/key/BRgWgMTzazVSbG" |
| width="340" height="290" frameborder="0" marginwidth="0" |
| marginheight="0" scrolling="no" |
| style="border:1px solid #CCC; border-width:1px; margin-bottom:5px; max-width: 100%;" |
| allowfullscreen></iframe> |
| <div style="margin-bottom:5px"><strong> <a |
| href="//www.slideshare.net/ChicagoHUG/yahoo-compares-storm-and-spark" |
| title="Yahoo compares Storm and Spark" target="_blank">Yahoo |
| compares Storm and Spark</a> </strong> from <strong><a |
| href="//www.slideshare.net/ChicagoHUG" target="_blank">Chicago |
| Hadoop Users Group</a></strong></div> |
| </div> |
| </div> |
| |
| <div class="row" style="padding-left: 45px;"> |
| <div class="col-md-6 brick"> |
| <h2>Hadoop Summit Europe 2014: Apache Storm Architecture</h2> |
| <iframe src="//www.slideshare.net/slideshow/embed_code/key/m9vKPotXvQ8hb7" |
| width="340" height="290" frameborder="0" marginwidth="0" |
| marginheight="0" scrolling="no" |
| style="border:1px solid #CCC; border-width:1px; margin-bottom:5px; max-width: 100%;" |
| allowfullscreen></iframe> |
| <div style="margin-bottom:5px"><strong> <a |
| href="//www.slideshare.net/ptgoetz/storm-hadoop-summit2014" |
| title="Hadoop Summit Europe 2014: Apache Storm Architecture" |
| target="_blank">Hadoop Summit Europe 2014: Apache Storm |
| Architecture</a> </strong> from <strong><a |
| href="//www.slideshare.net/ptgoetz" target="_blank">P. Taylor |
| Goetz</a></strong></div> |
| </div> |
| |
| <div class="col-md-6 brick"> |
| <h2>Storm: distributed and fault-tolerant realtime computation</h2> |
| <iframe src="//www.slideshare.net/slideshow/embed_code/key/zF8J7y8oz4Qtbc" |
| width="340" height="290" frameborder="0" marginwidth="0" |
| marginheight="0" scrolling="no" |
| style="border:1px solid #CCC; border-width:1px; margin-bottom:5px; max-width: 100%;" |
| allowfullscreen></iframe> |
| <div style="margin-bottom:5px"><strong> <a |
| href="//www.slideshare.net/nathanmarz/storm-distributed-and-faulttolerant-realtime-computation" |
| title="Storm: distributed and fault-tolerant realtime computation" |
| target="_blank">Storm: distributed and fault-tolerant realtime |
| computation</a> </strong> from <strong><a |
| href="//www.slideshare.net/nathanmarz" target="_blank">Nathan |
| Marz</a></strong></div> |
| </div> |
| |
| <div class="col-md-6 brick"> |
| <h2>Realtime Analytics with Storm and Hadoop</h2> |
| <iframe src="//www.slideshare.net/slideshow/embed_code/key/wAvMj9LtK7OAwn" |
| width="340" height="290" frameborder="0" marginwidth="0" |
| marginheight="0" scrolling="no" |
| style="border:1px solid #CCC; border-width:1px; margin-bottom:5px; max-width: 100%;" |
| allowfullscreen></iframe> |
| <div style="margin-bottom:5px"><strong> <a |
| href="//www.slideshare.net/Hadoop_Summit/realtime-analytics-with-storm" |
| title="Realtime Analytics with Storm and Hadoop" target="_blank">Realtime |
| Analytics with Storm and Hadoop</a> </strong> from <strong><a |
| href="//www.slideshare.net/Hadoop_Summit" target="_blank">Hadoop |
| Summit</a></strong></div> |
| </div> |
| </div> |
| </div> |
| </div> |
| </div> |
| </div> |
| </div> |
| </div> |
| |
| |
| </div> |
| </div> |
| </div> |
| <footer> |
| <div class="container-fluid"> |
| <div class="row"> |
| <div class="col-md-3"> |
| <div class="footer-widget"> |
| <h5>Meetups</h5> |
| <ul class="latest-news"> |
| |
| <li><a href="https://www.meetup.com/Apache-Storm-Apache-Kafka/">Apache Storm & Apache Kafka</a> |
| <span class="small">(Sunnyvale, CA)</span></li> |
| |
| <li><a href="https://www.meetup.com/Apache-Storm-Kafka-Users/">Apache Storm & Kafka Users</a> |
| <span class="small">(Seattle, WA)</span></li> |
| |
| <li><a href="https://www.meetup.com/New-York-City-Storm-User-Group/">NYC Storm User Group</a> |
| <span class="small">(New York, NY)</span></li> |
| |
| <li><a href="https://www.meetup.com/Bay-Area-Stream-Processing">Bay Area Stream Processing</a> |
| <span class="small">(Emeryville, CA)</span></li> |
| |
| <li><a href="https://www.meetup.com/Boston-Storm-Users/">Boston Realtime Data</a> <span |
| class="small">(Boston, MA)</span></li> |
| |
| <li><a href="https://www.meetup.com/storm-london">London Storm User Group</a> <span |
| class="small">(London, UK)</span></li> |
| </ul> |
| </div> |
| </div> |
| <div class="col-md-3"> |
| <div class="footer-widget"> |
| <h5>About Apache Storm</h5> |
| <p>Apache Storm integrates with any queueing system and any database system. Apache Storm's spout |
| abstraction makes it easy to integrate a new queuing system. Likewise, integrating Apache Storm |
| with database systems is easy.</p> |
| </div> |
| </div> |
| <div class="col-md-3"> |
| <div class="footer-widget"> |
| <h5>First Look</h5> |
| <ul class="footer-list"> |
| <li><a href="/releases/current/Rationale.html">Rationale</a></li> |
| <li><a href="/releases/current/Tutorial.html">Tutorial</a></li> |
| <li><a href="/releases/current/Setting-up-development-environment.html">Setting up development |
| environment</a></li> |
| <li><a href="/releases/current/Creating-a-new-Storm-project.html">Creating a new Apache Storm |
| project</a></li> |
| </ul> |
| </div> |
| </div> |
| <div class="col-md-3"> |
| <div class="footer-widget"> |
| <h5>Documentation</h5> |
| <ul class="footer-list"> |
| <li><a href="/releases/current/index.html">Index</a></li> |
| <li><a href="/releases/current/javadocs/index.html">Javadoc</a></li> |
| <li><a href="/releases/current/FAQ.html">FAQ</a></li> |
| </ul> |
| </div> |
| </div> |
| </div> |
| <hr/> |
| <div class="row"> |
| <div class="col-md-12"> |
| <p align="center">Copyright © 2022 <a href="https://www.apache.org">Apache Software Foundation</a>. All |
| Rights Reserved. |
| <br>Apache Storm, Apache, the Apache feather logo, and the Apache Storm project logos are trademarks |
| of The Apache Software Foundation. |
| <br>All other marks mentioned may be trademarks or registered trademarks of their respective owners. |
| </p> |
| </div> |
| </div> |
| </div> |
| </footer> |
| <!--Footer End--> |
| <!-- Scroll to top --> |
| <span class="totop"><a href="#"><i class="fa fa-angle-up"></i></a></span> |
| |
| </body> |
| |
| </html> |
| |