
 <!--
 ▄▄▄       ██▓███   ▄▄▄       ▄████▄   ██░ ██ ▓█████     ██▓  ▄████  ███▄    █  ██▓▄▄▄█████▓▓█████
 ▒████▄    ▓██░  ██▒▒████▄    ▒██▀ ▀█  ▓██░ ██▒▓█   ▀    ▓██▒ ██▒ ▀█▒ ██ ▀█   █ ▓██▒▓  ██▒ ▓▒▓█   ▀
 ▒██  ▀█▄  ▓██░ ██▓▒▒██  ▀█▄  ▒▓█    ▄ ▒██▀▀██░▒███      ▒██▒▒██░▄▄▄░▓██  ▀█ ██▒▒██▒▒ ▓██░ ▒░▒███
 ░██▄▄▄▄██ ▒██▄█▓▒ ▒░██▄▄▄▄██ ▒▓▓▄ ▄██▒░▓█ ░██ ▒▓█  ▄    ░██░░▓█  ██▓▓██▒  ▐▌██▒░██░░ ▓██▓ ░ ▒▓█  ▄
 ▓█   ▓██▒▒██▒ ░  ░ ▓█   ▓██▒▒ ▓███▀ ░░▓█▒░██▓░▒████▒   ░██░░▒▓███▀▒▒██░   ▓██░░██░  ▒██▒ ░ ░▒████▒
 ▒▒   ▓▒█░▒▓▒░ ░  ░ ▒▒   ▓▒█░░ ░▒ ▒  ░ ▒ ░░▒░▒░░ ▒░ ░   ░▓   ░▒   ▒ ░ ▒░   ▒ ▒ ░▓    ▒ ░░   ░░ ▒░ ░
  ▒   ▒▒ ░░▒ ░       ▒   ▒▒ ░  ░  ▒    ▒ ░▒░ ░ ░ ░  ░    ▒ ░  ░   ░ ░ ░░   ░ ▒░ ▒ ░    ░     ░ ░  ░
  ░   ▒   ░░         ░   ▒   ░         ░  ░░ ░   ░       ▒ ░░ ░   ░    ░   ░ ░  ▒ ░  ░         ░
      ░  ░               ░  ░░ ░       ░  ░  ░   ░  ░    ░        ░          ░  ░              ░  ░
 -->

 <!--
 Licensed to the Apache Software Foundation (ASF) under one
 or more contributor license agreements.  See the NOTICE file
 distributed with this work for additional information
 regarding copyright ownership.  The ASF licenses this file
 to you under the Apache License, Version 2.0 (the
 "License"); you may not use this file except in compliance
 with the License.  You may obtain a copy of the License at

 http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing,
 software distributed under the License is distributed on an
 "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 KIND, either express or implied.  See the License for the
 specific language governing permissions and limitations
 under the License.
 -->

 <!DOCTYPE html>
 <html>
 <head>
     <link rel="canonical" href="https://ignite.apache.org/features/machinelearning.html" />
     <meta charset="utf-8">
     <meta name="viewport" content="width=device-width, initial-scale=1.0">
     <meta http-equiv="Cache-Control" content="no-cache, no-store, must-revalidate" />
     <meta http-equiv="Pragma" content="no-cache" />
     <meta http-equiv="Expires" content="0" />
     <title>Machine Learning - Apache Ignite</title>
     <link media="all" rel="stylesheet" href="/css/all.css?v=1538416900">
     <link href="https://netdna.bootstrapcdn.com/font-awesome/4.2.0/css/font-awesome.css" rel="stylesheet">
     <link href='https://fonts.googleapis.com/css?family=Open+Sans:400,300,300italic,400italic,600,600italic,700,700italic,800,800italic' rel='stylesheet' type='text/css'>

     <!--#include virtual="/includes/sh.html" -->
 </head>
 <body>
 <div id="wrapper">
     <!--#include virtual="/includes/header.html" -->

     <main id="main" role="main" class="container">
         <section id="machine-learning" class="page-section">
             <h1 class="first">Machine Learning</h1>
             <div class="col-sm-12 col-md-12 col-xs-12" style="padding-left:0; padding-right:0;">
                 <div class="col-sm-6 col-md-7 col-xs-12" style="padding-left:0; padding-right:0;">
                     <p>Apache Ignite Machine Learning (ML) is a set of simple, scalable, and efficient tools for
                         building predictive machine learning models without costly data transfers.
                     </p>
                     <p>
                         The rationale for adding machine and deep learning (DL) to Apache Ignite is quite simple.
                         Today's data scientists have to deal with two major factors that keep ML from mainstream adoption.

                     </p>
                     <div class="page-heading">Problem #1: Constant Data Movement (ETL)</div>

                     <p>
                         First, models are trained in one system and, once training is over, deployed in another.
                         Data scientists have to wait for ETL or some other data transfer process to move the data
                         into a system such as Apache Mahout or Apache Spark for training. Then they have to wait
                         for that process to complete before redeploying the models in a production environment. The whole
                         process can take hours, moving terabytes of data from one system to another. Moreover, the
                         training usually runs on a data set that is already out of date.
                     </p>
                 </div>
                 <div class="col-sm-6 col-md-5 col-xs-12" style="padding-right:0; top: -10px;">
                     <img class="img-responsive" src="/images/machine_learning.png" width="440px" style="float:right;"/>
                 </div>
             </div>

             <div class="page-heading">Problem #2: Lack of Horizontal Scalability</div>

             <p>
                 The second factor is related to scalability. The data sets that ML and DL algorithms have to
                 process keep growing and no longer fit within a single server.
                 This forces data scientists to come up with sophisticated solutions or turn to distributed
                 computing platforms such as Apache Spark and TensorFlow. However, those platforms mostly solve
                 only one part of the puzzle, model training, leaving developers with the burden of
                 deciding how to deploy the models in production later.
             </p>

             <div class="page-heading">Zero ETL and Massive Scalability</div>

             <p>
                 Ignite Machine Learning relies on Ignite's memory-centric storage, which brings massive scalability
                 to ML and DL tasks and eliminates the wait imposed by ETL between different systems.
                 First, it allows users to run ML/DL training and inference directly on data stored across
                 memory and disk in an Ignite cluster. Second, Ignite provides a host
                 of ML and DL algorithms that are optimized for Ignite's collocated distributed processing.
                 These implementations deliver in-memory speed and unlimited horizontal scalability when running
                 in place against massive data sets or incrementally against incoming data streams, without
                 requiring the data to be moved into another store. By eliminating the data movement and the
                 long processing wait times, Ignite Machine Learning enables continuous learning that can
                 improve decisions based on the latest data as it arrives in real time.
             </p>

             <div class="page-heading">Fault Tolerance and Continuous Learning</div>
             <p>
                 Apache Ignite Machine Learning is tolerant of node failures. If nodes fail
                 during the learning process, all recovery procedures are transparent to the user,
                 the learning process is not interrupted, and results arrive in about the same time as when
                 all nodes are healthy.
             </p>
             <p><a href="https://apacheignite.readme.io/docs/machine-learning" target="_blank" rel="noopener">Read more</a></p>
         </section>

         <section id="ga-grid" class="page-section">
             <div class="col-sm-12 col-md-12 col-xs-12">
                 <div class="col-sm-6 col-md-7 col-xs-12" style="padding-left:0; padding-right:15px;">
                     <h2 style="padding-bottom: 5px; margin-bottom: 20px;">Genetic Algorithms</h2>

                     <p>The machine learning component comes with a set of genetic algorithms (GAs), a method of
                         solving optimization problems by simulating the process of biological evolution.
                     </p>
                     <p>
                         GAs are well suited to searching large and complex data sets for an optimal solution.
                         Real-world applications of GAs include automotive design, computer gaming, robotics, investments,
                         traffic/shipment routing, and more.
                     </p>
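                     <p>
                         To make the idea of simulated evolution concrete, here is a minimal sketch of a genetic
                         algorithm in plain Python. It is not the Ignite GA API: the function name
                         <code>evolve</code>, its parameters, and the toy fitness function (count the 1 bits in a
                         bit string) are all assumptions chosen for illustration. It shows the three usual steps:
                         selection, crossover, and mutation.
                     </p>

```python
import random


def evolve(fitness, gene_count, population_size=30, generations=60,
           mutation_rate=0.02, seed=0):
    """Toy genetic algorithm over bit-string chromosomes (illustrative only)."""
    rng = random.Random(seed)
    # Start from a random population of bit strings.
    population = [[rng.randint(0, 1) for _ in range(gene_count)]
                  for _ in range(population_size)]
    for _ in range(generations):
        # Selection: keep the fitter half as parents (truncation selection).
        population.sort(key=fitness, reverse=True)
        parents = population[:population_size // 2]
        children = []
        while population_size - len(parents) - len(children) > 0:
            mother, father = rng.sample(parents, 2)
            # Crossover: splice the two parents at a random cut point.
            cut = rng.randrange(1, gene_count)
            child = mother[:cut] + father[cut:]
            # Mutation: flip each gene with a small probability.
            child = [1 - g if mutation_rate > rng.random() else g for g in child]
            children.append(child)
        # Parents survive unchanged, so the best fitness never decreases.
        population = parents + children
    return max(population, key=fitness)


# Hypothetical usage: maximize the number of 1 bits in a 20-gene chromosome.
best = evolve(fitness=sum, gene_count=20)
print(best, sum(best))
```

                     <p>
                         In this sketch the whole population lives in a single process. A distributed
                         implementation would presumably spread fitness evaluation across cluster nodes, which
                         the sketch does not attempt to show.
                     </p>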

                     <div class="page-links">
                         <a href="https://apacheignite.readme.io/docs/genetic-algorithms" target="_blank" rel="noopener">Genetic Algorithms<i class="fa fa-angle-double-right"></i></a>
                     </div>
                 </div>
                 <div class="col-sm-6 col-md-5 col-xs-12" style="padding-right:0;">
                     <a href="/images/GAGrid_Overview.png"><img class="img-responsive" src="/images/GAGrid_Overview.png"></a>&nbsp;
                     <p class="img-caption">Click on the image to view full size.</p>
                 </div>
             </div><p>&nbsp;</p>
         </section>
     </main>

     <!--#include virtual="/includes/footer.html" -->
 </div>
 <!--#include virtual="/includes/scripts.html" -->
 </body>
 </html>