blob: f54488a07726c1008f151ab93621fa9aa7fc5d8d [file] [log] [blame]
<!DOCTYPE html>
<!--[if lt IE 7]> <html class="no-js lt-ie9 lt-ie8 lt-ie7"> <![endif]-->
<!--[if IE 7]> <html class="no-js lt-ie9 lt-ie8"> <![endif]-->
<!--[if IE 8]> <html class="no-js lt-ie9"> <![endif]-->
<!--[if gt IE 8]><!-->
<html class="no-js">
<!--<![endif]-->
<!--
-->
<head>
<title>Algorithms Reference Matrix Factorization - SystemDS 3.2.0</title>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<meta name="description" content="">
<meta name="viewport" content="width=device-width">
<link rel="stylesheet" href="./../css/bootstrap.min.css">
<link rel="stylesheet" href="./../css/main.css">
<link rel="stylesheet" href="./../css/pygments-default.css">
<link rel="shortcut icon" href="./../img/favicon.png">
<script src="./../js/vendor/jquery-1.12.0.min.js"></script>
<script src="./../js/vendor/bootstrap.min.js"></script>
<script src="./../js/vendor/anchor.min.js"></script>
<script src="./../js/main.js"></script>
</head>
<body>
<!--
-->
<header class="navbar navbar-default navbar-fixed-top" id="topbar">
<div class="container">
<div class="navbar-header">
<div class="navbar-brand brand projectlogo">
<a href="https://systemds.apache.org/"><img class="logo" src="./../img/systemds-logo.png" alt="Apache SystemDS" title="Apache SystemDS" /></a>
</div>
<div class="navbar-brand brand projecttitle">
<a href="https://systemds.apache.org/">Apache SystemDS<sup id="trademark"></sup></a><br />
<span class="version">3.2.0</span>
</div>
<button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target=".navbar-collapse">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
</div>
<nav class="navbar-collapse collapse">
<ul class="nav navbar-nav navbar-right">
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">Overview<b class="caret"></b></a>
<ul class="dropdown-menu" role="menu">
<li><b>Home:</b></li>
<li><a href="./../">Docs Home</a></li>
<li class="divider"></li>
<li><b>Running SystemDS:</b></li>
<li><a href="./../site/run">Standalone Guide</a></li>
<li><a href="./../site/gpu">GPU Guide</a></li>
<li><a href="./../site/native-backend">Native Backend (BLAS)</a></li>
<li><a href="./../site/docker">Run with Docker</a></li>
<li class="divider"></li>
<li><b>Language Guides:</b></li>
<li><a href="./../site/dml-language-reference.html">DML Language Reference</a></li>
<li><a href="./../site/builtins-reference.html">Built-in Functions Reference</a></li>
<li><a href="./../site/dml-vs-r-guide.html">DML vs R guide</a></li>
<li class="divider"></li>
<li><b>Algorithms:</b></li>
<li><a href="./../site/algorithms-reference.html">ML Algorithms Reference</a></li>
<li class="divider"></li>
<li><b>Other:</b></li>
<li><a href="https://github.com/apache/systemds/blob/main/CONTRIBUTING.md">Contributing to SystemDS 🡕</a></li>
</ul>
</li>
<li><a href="https://github.com/apache/systemds">GitHub 🡕</a></li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">API<b class="caret"></b></a>
<ul class="dropdown-menu" role="menu">
<li><a href="./../api/java/">Java</a></li>
<li><a href="./../api/python/">Python</a></li>
</ul>
</li>
<li><a href="https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12335852">Issues</a></li>
</ul>
</nav>
</div>
</header>
<div class="container" id="content">
<h1 class="title">Algorithms Reference Matrix Factorization</h1>
<!--
-->
<h2 id="principal-component-analysis">Principal Component Analysis</h2>
<h3 id="pca-description">PCA Description</h3>
<p>Principal Component Analysis (PCA) is a simple, non-parametric method to
transform the given data set with possibly correlated columns into a set
of linearly uncorrelated or orthogonal columns, called <em>principal
components</em>. The principal components are ordered in such a way
that the first component accounts for the largest possible variance,
followed by remaining principal components in the decreasing order of
the amount of variance captured from the data. PCA is often used as a
dimensionality reduction technique, where the original data is projected
or rotated onto a low-dimensional space with basis vectors defined by
top-$K$ (for a given value of $K$) principal components.</p>
<h4 id="pca-details">PCA Details</h4>
<p>Principal Component Analysis (PCA) is a non-parametric procedure for
orthogonal linear transformation of the input data to a new coordinate
system, such that the greatest variance by some projection of the data
comes to lie on the first coordinate (called the first principal
component), the second greatest variance on the second coordinate, and
so on. In other words, PCA first selects a normalized direction in
$m$-dimensional space ($m$ is the number of columns in the input data)
along which the variance in input data is maximized – this is referred
to as the first principal component. It then repeatedly finds other
directions (principal components) in which the variance is maximized. At
every step, PCA restricts the search for only those directions that are
perpendicular to all previously selected directions. By doing so, PCA
aims to reduce the redundancy among input variables. To understand the
notion of redundancy, consider an extreme scenario with a data set
comprising of two variables, where the first one denotes some quantity
expressed in meters, and the other variable represents the same quantity
but in inches. Both these variables evidently capture redundant
information, and hence one of them can be removed. In a general
scenario, keeping solely the linear combination of input variables would
both express the data more concisely and reduce the number of variables.
This is why PCA is often used as a dimensionality reduction technique.</p>
<p>The specific method to compute such a new coordinate system is as
follows – compute a covariance matrix $C$ that measures the strength of
correlation among all pairs of variables in the input data; factorize
$C$ according to eigen decomposition to calculate its eigenvalues and
eigenvectors; and finally, order eigenvectors in the decreasing order of
their corresponding eigenvalue. The computed eigenvectors (also known as
<em>loadings</em>) define the new coordinate system and the square
root of eigen values provide the amount of variance in the input data
explained by each coordinate or eigenvector.</p>
<h4 id="pca-returns">PCA Returns</h4>
<p>When MODEL is not provided, PCA procedure is
applied on INPUT data to generate MODEL as well as the rotated data
OUTPUT (if PROJDATA is set to $1$) in the new coordinate system. The
produced model consists of basis vectors MODEL$/dominant.eigen.vectors$
for the new coordinate system; eigen values
MODEL$/dominant.eigen.values$; and the standard deviation
MODEL$/dominant.eigen.standard.deviations$ of principal components. When
MODEL is provided, INPUT data is rotated according to the coordinate
system defined by MODEL$/dominant.eigen.vectors$. The resulting data is
stored at location OUTPUT.</p>
<hr />
<h2 id="matrix-completion-via-alternating-minimizations">Matrix Completion via Alternating Minimizations</h2>
<h3 id="mcam-description">MCAM Description</h3>
<p>Low-rank matrix completion is an effective technique for statistical
data analysis widely used in the data mining and machine learning
applications. Matrix completion is a variant of low-rank matrix
factorization with the goal of recovering a partially observed and
potentially noisy matrix from a subset of its revealed entries. Perhaps
the most popular applications in which matrix completion has been
successfully applied is in the context of collaborative filtering in
recommender systems. In this setting, the rows in the data matrix
correspond to users, the columns to items such as movies, and entries to
feedback provided by users for items. The goal is to predict missing
entries of the rating matrix. This implementation uses the alternating
least-squares (ALS) technique for solving large-scale matrix completion
problems.</p>
<h3 id="mcam-details">MCAM Details</h3>
<p>Given an $m \times n$ input matrix $V$ and a rank parameter
$r \ll \min{(m,n)}$, low-rank matrix factorization seeks to find an
$m \times r$ matrix $L$ and an $r \times n$ matrix $R$ such that
$V \approx LR$, i.e., we aim to approximate $V$ by the low-rank matrix
$LR$. The quality of the approximation is determined by an
application-dependent loss function $\mathcal{L}$. We aim at finding the
loss-minimizing factor matrices, i.e.,</p>
\[\begin{equation}
(L^*, R^*) = \textrm{argmin}_{L,R}{\mathcal{L}(V,L,R)}
\end{equation}\]
<p>In the
context of collaborative filtering in the recommender systems it is
often the case that the input matrix $V$ contains several missing
entries. Such entries are coded with the 0 value and the loss function
is computed only based on the nonzero entries in $V$, i.e.,</p>
\[%\label{eq:loss}
\mathcal{L}=\sum_{(i,j)\in\Omega}l(V_{ij},L_{i*},R_{*j})\]
<p>where $L_{i*}$ and $R_{*j}$, respectively, denote the $i$th row of $L$ and the
$j$th column of $R$, $\Omega={\omega_1,\dots,\omega_N}$ denotes the
training set containing the observed (nonzero) entries in $V$, and $l$
is some local loss function.</p>
<p>ALS is an optimization technique that can
be used to solve quadratic problems. For matrix completion, the
algorithm repeatedly keeps one of the unknown matrices ($L$ or $R$)
fixed and optimizes the other one. In particular, ALS alternates between
recomputing the rows of $L$ in one step and the columns of $R$ in the
subsequent step. Our implementation of the ALS algorithm supports the
loss functions summarized in <a href="#table16"><strong>Table 16</strong></a>
commonly used for matrix completion <a href="algorithms-bibliography.html">[ZhouWSP08]</a>.</p>
<h4 id="table-16">Table 16</h4>
<p>Popular loss functions supported by our ALS implementation; $N_{i*}$
and $N_{*j}$, respectively, denote the number of nonzero entries in
row $i$ and column $j$ of $V$.</p>
<table>
<thead>
<tr>
<th>Loss</th>
<th>Definition</th>
</tr>
</thead>
<tbody>
<tr>
<td>$\mathcal{L}_\text{Nzsl}$</td>
<td>$\sum_{i,j:V_{ij}\neq 0} (V_{ij} - [LR]_{ij})^2$</td>
</tr>
<tr>
<td>$\mathcal{L}_\text{Nzsl+L2}$</td>
<td>$\mathcal{L}<em>\text{Nzsl} + \lambda \Bigl( \sum</em>{ik} L_{ik}^2 + \sum_{kj} R_{kj}^2 \Bigr)$</td>
</tr>
<tr>
<td>$\mathcal{L}_\text{Nzsl+wL2}$</td>
<td>$\mathcal{L}<em>\text{Nzsl} + \lambda \Bigl(\sum</em>{ik}N_{i<em>} L_{ik}^2 + \sum_{kj}N_{</em>j} R_{kj}^2 \Bigr)$</td>
</tr>
</tbody>
</table>
<hr />
<p>Note that the matrix completion problem as defined in (1)
is a non-convex problem for all loss functions from
<a href="#table-16"><strong>Table 16</strong></a>. However, when fixing one of the matrices
$L$ or $R$, we get a least-squares problem with a globally optimal
solution. For example, for the case of \(\mathcal{L}_\text{Nzsl+wL2}\) we
have the following closed form solutions</p>
\[\begin{aligned}
L^\top_{n+1,i*} &amp;\leftarrow (R^{(i)}_n {[R^{(i)}_n]}^\top + \lambda N_2 I)^{-1} R_n V^\top_{i*}, \\
R_{n+1,*j} &amp;\leftarrow ({[L^{(j)}_{n+1}]}^\top L^{(j)}_{n+1} + \lambda N_1 I)^{-1} L^\top_{n+1} V_{*j},
\end{aligned}\]
<p>where $L_{n+1,i*}$ (resp. $R_{n+1,*j}$) denotes the
$i$th row of $L_{n+1}$ (resp. $j$th column of $R_{n+1}$), $\lambda$
denotes the regularization parameter, $I$ is the identity matrix of
appropriate dimensionality, $V_{i*}$ (resp. $V_{*j}$) denotes the
revealed entries in row $i$ (column $j$), $R^{(i)}<em>n$ ( resp.
$L^{(j)}</em>{n+1}$ ) refers to the corresponding columns of $R_n$ (rows of
$L_{n+1}$), and $N_1$ (resp. $N_2$) denotes a diagonal matrix that
contains the number of nonzero entries in row $i$ (column $j$) of $V$.</p>
<p><strong>Prediction.</strong> Based on the factor matrices computed by ALS we provide
two prediction scripts:</p>
<ol>
<li><code class="language-plaintext highlighter-rouge">ALS_predict.dml</code> computes the predicted ratings for a given
list of users and items.</li>
<li><code class="language-plaintext highlighter-rouge">ALS_topk_predict.dml</code> computes top-K item (where $K$ is
given as input) with highest predicted ratings together with their
corresponding ratings for a given list of users.</li>
</ol>
<h3 id="mcam-returns">MCAM Returns</h3>
<p>We output the factor matrices $L$ and $R$ after the algorithm has
converged. The algorithm is declared as converged if one of the two
criteria is meet: (1) the decrease in the value of loss function falls
below <code class="language-plaintext highlighter-rouge">thr</code> given as an input parameter (if parameter
<code class="language-plaintext highlighter-rouge">check=TRUE</code>), or (2) the maximum number of iterations
(defined as parameter <code class="language-plaintext highlighter-rouge">maxi</code>) is reached. Note that for a
given user $i$ prediction is possible only if user $i$ has rated at
least one item, i.e., row $i$ in matrix $V$ has at least one nonzero
entry. In case, some users have not rated any items the corresponding
factor in $L$ will be all 0s. Similarly if some items have not been
rated at all the corresponding factors in $R$ will contain only 0s. Our
prediction scripts output the predicted ratings for a given list of
users and items as well as the top-K items with highest predicted
ratings together with the predicted ratings for a given list of users.
Note that the predictions will only be provided for the users who have
rated at least one item, i.e., the corresponding rows contain at least
one nonzero entry.</p>
</div>
<!--
-->
<!-- MathJax Section -->
<script type="text/x-mathjax-config">
MathJax.Hub.Config({ TeX: { equationNumbers: { autoNumber: "AMS" } } });
</script>
<script>
// Note that we load MathJax this way to work with local file (file://), HTTP and HTTPS.
// We could use "//cdn.mathjax...", but that won't support "file://".
(function(d, script) {
script = d.createElement('script');
script.type = 'text/javascript';
script.async = true;
script.onload = function() {
MathJax.Hub.Config({
tex2jax: {
inlineMath: [
["$", "$"],
["\\\\(", "\\\\)"]
],
displayMath: [
["$$", "$$"],
["\\[", "\\]"]
],
processEscapes: true,
skipTags: ['script', 'noscript', 'style', 'textarea', 'pre']
}
});
};
script.src = ('https:' == document.location.protocol ? 'https://' : 'http://') +
'cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML';
d.getElementsByTagName('head')[0].appendChild(script);
}(document));
</script>
</body>
</html>