<!--
▄▄▄       ██▓███   ▄▄▄       ▄████▄   ██░ ██ ▓█████     ██▓  ▄████  ███▄    █  ██▓▄▄▄█████▓▓█████
▒████▄    ▓██░  ██▒▒████▄    ▒██▀ ▀█  ▓██░ ██▒▓█   ▀    ▓██▒ ██▒ ▀█▒ ██ ▀█   █ ▓██▒▓  ██▒ ▓▒▓█   ▀
▒██  ▀█▄  ▓██░ ██▓▒▒██  ▀█▄  ▒▓█    ▄ ▒██▀▀██░▒███      ▒██▒▒██░▄▄▄░▓██  ▀█ ██▒▒██▒▒ ▓██░ ▒░▒███
░██▄▄▄▄██ ▒██▄█▓▒ ▒░██▄▄▄▄██ ▒▓▓▄ ▄██▒░▓█ ░██ ▒▓█  ▄    ░██░░▓█  ██▓▓██▒  ▐▌██▒░██░░ ▓██▓ ░ ▒▓█  ▄
▓█   ▓██▒▒██▒ ░  ░ ▓█   ▓██▒▒ ▓███▀ ░░▓█▒░██▓░▒████▒   ░██░░▒▓███▀▒▒██░   ▓██░░██░  ▒██▒ ░ ░▒████▒
▒▒   ▓▒█░▒▓▒░ ░  ░ ▒▒   ▓▒█░░ ░▒ ▒  ░ ▒ ░░▒░▒░░ ▒░ ░   ░▓   ░▒   ▒ ░ ▒░   ▒ ▒ ░▓    ▒ ░░   ░░ ▒░ ░
 ▒   ▒▒ ░░▒ ░       ▒   ▒▒ ░  ░  ▒    ▒ ░▒░ ░ ░ ░  ░    ▒ ░  ░   ░ ░ ░░   ░ ▒░ ▒ ░    ░     ░ ░  ░
 ░   ▒   ░░         ░   ▒   ░         ░  ░░ ░   ░       ▒ ░░ ░   ░    ░   ░ ░  ▒ ░  ░         ░
     ░  ░               ░  ░░ ░       ░  ░  ░   ░  ░    ░        ░          ░  ░              ░  ░
-->

<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements.  See the NOTICE file
distributed with this work for additional information
regarding copyright ownership.  The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License.  You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied.  See the License for the
specific language governing permissions and limitations
under the License.
-->

<!DOCTYPE html>
<html lang="en">
<head>

    <link rel="canonical" href="https://ignite.apache.org/features/machinelearning.html" />
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <meta http-equiv="Cache-Control" content="no-cache, no-store, must-revalidate" />
    <meta http-equiv="Pragma" content="no-cache" />
    <meta http-equiv="Expires" content="0" />

    <title>Machine Learning - Apache Ignite</title>

    <meta name="description"
          content="Apache Ignite Machine Learning is a set of simple, scalable, and efficient APIs that
                        let you build predictive machine learning models at scale and enable continuous learning."/>

    <!--#include virtual="/includes/styles.html" -->

    
</head>
<body>

    <!--#include virtual="/includes/header.html" -->
<article>
    <header>
        <div class="container">
            <h1>Apache Ignite <strong>Machine Learning</strong></h1>
        </div>
    </header>
    <div class="container">
            <p>
                Apache Ignite® Machine Learning (ML) is a set of simple, scalable, and efficient tools for
                building predictive machine learning models without costly data transfers. The rationale for
                adding machine learning and deep learning (DL) to Apache Ignite is simple:
                today's data scientists have to deal with two major factors that keep ML from mainstream adoption.
            </p>
            <h2>Problem #1: Constant Data Movement (ETL)</h2>

            <img class="diagram-right img-responsive" src="/images/svg-diagrams/machine_learning.svg" alt="Apache Ignite Machine Learning" />
            <p>
                First, models are trained in one system and, once training is over, deployed in another.
                Data scientists have to wait for ETL or some other data transfer process to move the data
                into a system such as Apache Mahout or Apache Spark for training. Then they have to wait
                while training completes and redeploy the models in a production environment. The whole
                process can take hours as terabytes of data move from one system to another. Moreover,
                training usually runs over a data set that is already out of date.
            </p>
                    

            <h2>Problem #2: Lack of Horizontal Scalability</h2>

            <p>
                The second factor relates to scalability. ML and DL algorithms have to process data sets that no
                longer fit within a single server unit and are continually growing. This requires data scientists to come
                up with sophisticated solutions or turn to distributed computing platforms such as Apache Spark and
                TensorFlow. However, those platforms mostly solve only one part of the puzzle, model training,
                leaving developers with the burden of deciding how to deploy the models in production later.
            </p>

            <h2>Zero ETL and Massive Scalability</h2>

            <p>
                Ignite Machine Learning relies on Ignite's multi-tier storage that brings massive scalability
                for ML and DL tasks and eliminates the wait imposed by ETL between the different systems.
                For instance, it allows users to run ML/DL training and inference directly on the data stored across
                memory and disk in an Ignite cluster. Next, Ignite provides a host
                of ML and DL algorithms that are optimized for Ignite's collocated distributed processing.
                These implementations deliver in-memory speed and unlimited horizontal scalability when running
                in place against massive data sets or incrementally against incoming data streams, without
                requiring the data to be moved into another store. By eliminating data movement and the
                lengthy processing wait times, Ignite Machine Learning enables continuous learning that can
                improve decisions based on the latest data as it arrives in real time.
            </p>

            <h2>Fault Tolerance and Continuous Learning</h2>
            <p>
                Ignite Machine Learning is tolerant of node failures. If nodes fail during the learning
                process, all recovery procedures are transparent to the user,
                learning processes are not interrupted, and you get results in a time similar to the case when
                all nodes are up and running.
            </p>

            <div class="jumbotron jumbotron-fluid">
                <div class="container">
                  <div class="title display-6">Learn More</div>
                  <hr class="my-4">
                  <div class="row">
                    <div class="col-sm-6">
                        <ul>
                            <li><a href="https://apacheignite.readme.io/docs/machine-learning" target="docs">Ignite Machine Learning Documentation <i class="fas fa-angle-double-right"></i></a></li>
                            <li><a href="https://apacheignite.readme.io/docs/ml-partition-based-dataset" target="docs">Partition-Based Data Sets <i class="fas fa-angle-double-right"></i></a></li>
                        </ul>
                    </div>
                    <div class="col-sm-6">
                        <ul>
                            <li><a href="/features/tensorflow.html">Apache Ignite integration for TensorFlow <i class="fas fa-angle-double-right"></i></a></li>
                        </ul>
                    </div>
                </div>
            </div>
        </div>

    </div>
    
</article>

    <!--#include virtual="/includes/footer.html" -->

<!--#include virtual="/includes/scripts.html" -->
</body>
</html>
