// Licensed to the Apache Software Foundation (ASF) under one or more
// contributor license agreements. See the NOTICE file distributed with
// this work for additional information regarding copyright ownership.
// The ASF licenses this file to You under the Apache License, Version 2.0
// (the "License"); you may not use this file except in compliance with
// the License. You may obtain a copy of the License at
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// See the License for the specific language governing permissions and
// limitations under the License.
= k-NN Classification
The Apache Ignite Machine Learning component provides two versions of the widely used k-NN (k-nearest neighbors) algorithm - one for classification tasks and the other for regression tasks.
This documentation reviews k-NN as a solution for classification tasks.
== Trainer and Model
The k-NN algorithm is a non-parametric method whose input consists of the k closest training examples in the feature space.
The output of k-NN classification is a class membership. An object is classified by a majority vote of its neighbors: it is assigned to the class that is most common among its k nearest neighbors. `k` is a positive integer, typically small. In the special case where `k` is `1`, the object is simply assigned to the class of its single nearest neighbor.
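
The majority vote described above can be sketched in plain Java. This is a minimal brute-force illustration and not the Ignite implementation: the class `KnnVoteSketch`, its methods, and the sample data are hypothetical names introduced only for this example, and Euclidean distance is assumed as the metric.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical plain-Java sketch of k-NN majority voting; not part of Apache Ignite. */
final class KnnVoteSketch {
    /** Euclidean distance between two feature vectors of equal length. */
    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Returns the label that is most common among the k training points closest to the query. */
    static double classify(double[][] features, double[] labels, double[] query, int k) {
        Integer[] order = new Integer[features.length];
        for (int i = 0; i < order.length; i++)
            order[i] = i;

        // Sort training indices by distance to the query (brute force).
        Arrays.sort(order, Comparator.comparingDouble(i -> euclidean(features[i], query)));

        // Count votes per label among the k nearest neighbors.
        Map<Double, Integer> votes = new HashMap<>();
        for (int i = 0; i < Math.min(k, order.length); i++)
            votes.merge(labels[order[i]], 1, Integer::sum);

        // Pick the label with the most votes (ties are broken arbitrarily in this sketch).
        double best = Double.NaN;
        int bestCnt = -1;
        for (Map.Entry<Double, Integer> e : votes.entrySet()) {
            if (e.getValue() > bestCnt) {
                bestCnt = e.getValue();
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up two-class data: three points near (1, 1) and two near (5, 5).
        double[][] features = {{1.0, 1.0}, {1.2, 0.8}, {5.0, 5.0}, {5.5, 4.5}, {0.9, 1.1}};
        double[] labels = {0.0, 0.0, 1.0, 1.0, 0.0};

        System.out.println(classify(features, labels, new double[] {1.1, 1.0}, 3));
        System.out.println(classify(features, labels, new double[] {5.2, 4.9}, 1));
    }
}
```

With `k` set to `1`, the loop collects a single vote, so the result is just the label of the nearest training point, which matches the special case described above.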
Ignite currently supports the following parameters for the k-NN classification algorithm:
* `k` - the number of nearest neighbors.
* `distanceMeasure` - one of the distance metrics provided by the ML framework, such as Euclidean, Hamming, or Manhattan.
* `isWeighted` - `false` by default; if `true`, it enables the weighted k-NN algorithm.
* `dataCache` - holds a training set of objects for which the class is already known.
* `indexType` - the type of distributed spatial index; one of `ARRAY`, `KD_TREE`, or `BALL_TREE`.
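
To make the `distanceMeasure` and `isWeighted` parameters more concrete, here is a small plain-Java sketch of the three named metrics and of a distance-weighted vote. It does not use or reproduce the Ignite classes: `KnnParamsSketch` and its methods are hypothetical, and inverse-distance weighting is an assumption chosen for illustration, since the text above does not specify which weighting scheme Ignite applies.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical plain-Java sketch of distance measures and a weighted vote; not part of Apache Ignite. */
final class KnnParamsSketch {
    /** Euclidean distance: square root of the sum of squared coordinate differences. */
    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++)
            sum += (a[i] - b[i]) * (a[i] - b[i]);
        return Math.sqrt(sum);
    }

    /** Manhattan distance: sum of absolute coordinate differences. */
    static double manhattan(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++)
            sum += Math.abs(a[i] - b[i]);
        return sum;
    }

    /** Hamming distance: number of positions at which the two vectors differ. */
    static double hamming(double[] a, double[] b) {
        double cnt = 0.0;
        for (int i = 0; i < a.length; i++)
            if (a[i] != b[i])
                cnt++;
        return cnt;
    }

    /**
     * Weighted vote over the k nearest neighbors (assumed scheme: inverse distance).
     * distances[i] and labels[i] describe the i-th neighbor; closer neighbors count more.
     */
    static double weightedVote(double[] distances, double[] labels) {
        final double eps = 1e-9; // Avoids division by zero when a neighbor matches the query exactly.
        Map<Double, Double> weights = new HashMap<>();
        for (int i = 0; i < distances.length; i++)
            weights.merge(labels[i], 1.0 / (distances[i] + eps), Double::sum);

        // Pick the label with the largest accumulated weight.
        double best = Double.NaN;
        double bestWeight = Double.NEGATIVE_INFINITY;
        for (Map.Entry<Double, Double> e : weights.entrySet()) {
            if (e.getValue() > bestWeight) {
                bestWeight = e.getValue();
                best = e.getKey();
            }
        }
        return best;
    }
}
```

For neighbors at distances `0.1`, `2.0`, and `2.0` with labels `1.0`, `0.0`, and `0.0`, an unweighted majority vote would choose `0.0`, while this weighted sketch chooses `1.0` because the single very close neighbor outweighs the two distant ones.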
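[source, java]
----
// Create the trainer and set the distance measure.
KNNClassificationTrainer trainer = new KNNClassificationTrainer()
    .withDistanceMeasure(new EuclideanDistance());

// Train the model. `ignite` and `vectorizer` are assumed to be defined elsewhere;
// `dataCache` holds the labeled training set.
KNNClassificationModel knnMdl = trainer.fit(ignite, dataCache, vectorizer);

// Make a prediction for a feature vector (`observation` is assumed to be defined elsewhere).
double prediction = knnMdl.predict(observation);
----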
== Example
To see how k-NN classification can be used in practice, try the example that is available on GitHub and delivered with every Apache Ignite distribution.
The training dataset is the Iris dataset, which can be loaded from the UCI Machine Learning Repository.