| <!DOCTYPE html> |
| <!--[if lt IE 7]> <html class="no-js lt-ie9 lt-ie8 lt-ie7"> <![endif]--> |
| <!--[if IE 7]> <html class="no-js lt-ie9 lt-ie8"> <![endif]--> |
| <!--[if IE 8]> <html class="no-js lt-ie9"> <![endif]--> |
| <!--[if gt IE 8]><!--> |
| <html class="no-js"> |
| <!--<![endif]--> |
| <!-- |
| |
| --> |
| |
| <head> |
| <title>Algorithms Reference Matrix Factorization - SystemDS 3.2.0</title> |
| <meta charset="utf-8"> |
| <meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1"> |
| <meta name="description" content=""> |
| <meta name="viewport" content="width=device-width"> |
| <link rel="stylesheet" href="./../css/bootstrap.min.css"> |
| <link rel="stylesheet" href="./../css/main.css"> |
| <link rel="stylesheet" href="./../css/pygments-default.css"> |
| <link rel="shortcut icon" href="./../img/favicon.png"> |
| <script src="./../js/vendor/jquery-1.12.0.min.js"></script> |
| <script src="./../js/vendor/bootstrap.min.js"></script> |
| <script src="./../js/vendor/anchor.min.js"></script> |
| <script src="./../js/main.js"></script> |
| </head> |
| |
| <body> |
| <!-- |
| |
| --> |
| <header class="navbar navbar-default navbar-fixed-top" id="topbar"> |
| <div class="container"> |
| <div class="navbar-header"> |
| <div class="navbar-brand brand projectlogo"> |
| <a href="https://systemds.apache.org/"><img class="logo" src="./../img/systemds-logo.png" alt="Apache SystemDS" title="Apache SystemDS" /></a> |
| </div> |
| <div class="navbar-brand brand projecttitle"> |
| <a href="https://systemds.apache.org/">Apache SystemDS<sup id="trademark">™</sup></a><br /> |
| <span class="version">3.2.0</span> |
| </div> |
| <button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target=".navbar-collapse"> |
| <span class="sr-only">Toggle navigation</span> |
| <span class="icon-bar"></span> |
| <span class="icon-bar"></span> |
| <span class="icon-bar"></span> |
| </button> |
| </div> |
| <nav class="navbar-collapse collapse"> |
| <ul class="nav navbar-nav navbar-right"> |
| <li class="dropdown"> |
| <a href="#" class="dropdown-toggle" data-toggle="dropdown">Overview<b class="caret"></b></a> |
| <ul class="dropdown-menu" role="menu"> |
| <li><b>Home:</b></li> |
| <li><a href="./../">Docs Home</a></li> |
| <li class="divider"></li> |
| <li><b>Running SystemDS:</b></li> |
| <li><a href="./../site/run">Standalone Guide</a></li> |
| <li><a href="./../site/gpu">GPU Guide</a></li> |
| <li><a href="./../site/native-backend">Native Backend (BLAS)</a></li> |
| <li><a href="./../site/docker">Run with Docker</a></li> |
| <li class="divider"></li> |
| <li><b>Language Guides:</b></li> |
| <li><a href="./../site/dml-language-reference.html">DML Language Reference</a></li> |
| <li><a href="./../site/builtins-reference.html">Built-in Functions Reference</a></li> |
| <li><a href="./../site/dml-vs-r-guide.html">DML vs R guide</a></li> |
| <li class="divider"></li> |
| <li><b>Algorithms:</b></li> |
| <li><a href="./../site/algorithms-reference.html">ML Algorithms Reference</a></li> |
| <li class="divider"></li> |
| <li><b>Other:</b></li> |
| <li><a href="https://github.com/apache/systemds/blob/main/CONTRIBUTING.md">Contributing to SystemDS 🡕</a></li> |
| </ul> |
| </li> |
| <li><a href="https://github.com/apache/systemds">GitHub 🡕</a></li> |
| <li class="dropdown"> |
| <a href="#" class="dropdown-toggle" data-toggle="dropdown">API<b class="caret"></b></a> |
| <ul class="dropdown-menu" role="menu"> |
| <li><a href="./../api/java/">Java</a></li> |
| <li><a href="./../api/python/">Python</a></li> |
| </ul> |
| </li> |
| <li><a href="https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12335852">Issues</a></li> |
| </ul> |
| </nav> |
| </div> |
| </header> |
| |
| <div class="container" id="content"> |
| <h1 class="title">Algorithms Reference Matrix Factorization</h1> |
| <!-- |
| |
| --> |
| |
| <h2 id="principal-component-analysis">Principal Component Analysis</h2> |
| |
| <h3 id="pca-description">PCA Description</h3> |
| |
| <p>Principal Component Analysis (PCA) is a simple, non-parametric method to |
| transform the given data set with possibly correlated columns into a set |
| of linearly uncorrelated or orthogonal columns, called <em>principal |
| components</em>. The principal components are ordered in such a way |
| that the first component accounts for the largest possible variance, |
| followed by remaining principal components in the decreasing order of |
| the amount of variance captured from the data. PCA is often used as a |
| dimensionality reduction technique, where the original data is projected |
| or rotated onto a low-dimensional space with basis vectors defined by |
| top-$K$ (for a given value of $K$) principal components.</p> |
| |
| <h4 id="pca-details">PCA Details</h4> |
| |
| <p>Principal Component Analysis (PCA) is a non-parametric procedure for |
| orthogonal linear transformation of the input data to a new coordinate |
| system, such that the greatest variance by some projection of the data |
| comes to lie on the first coordinate (called the first principal |
| component), the second greatest variance on the second coordinate, and |
| so on. In other words, PCA first selects a normalized direction in |
| $m$-dimensional space ($m$ is the number of columns in the input data) |
| along which the variance in input data is maximized – this is referred |
| to as the first principal component. It then repeatedly finds other |
| directions (principal components) in which the variance is maximized. At |
| every step, PCA restricts the search for only those directions that are |
| perpendicular to all previously selected directions. By doing so, PCA |
| aims to reduce the redundancy among input variables. To understand the |
| notion of redundancy, consider an extreme scenario with a data set |
| comprising of two variables, where the first one denotes some quantity |
| expressed in meters, and the other variable represents the same quantity |
| but in inches. Both these variables evidently capture redundant |
| information, and hence one of them can be removed. In a general |
| scenario, keeping solely the linear combination of input variables would |
| both express the data more concisely and reduce the number of variables. |
| This is why PCA is often used as a dimensionality reduction technique.</p> |
| |
| <p>The specific method to compute such a new coordinate system is as |
| follows – compute a covariance matrix $C$ that measures the strength of |
| correlation among all pairs of variables in the input data; factorize |
| $C$ according to eigen decomposition to calculate its eigenvalues and |
| eigenvectors; and finally, order eigenvectors in the decreasing order of |
| their corresponding eigenvalue. The computed eigenvectors (also known as |
| <em>loadings</em>) define the new coordinate system and the square |
| root of eigen values provide the amount of variance in the input data |
| explained by each coordinate or eigenvector.</p> |
| |
| <h4 id="pca-returns">PCA Returns</h4> |
| |
| <p>When MODEL is not provided, PCA procedure is |
| applied on INPUT data to generate MODEL as well as the rotated data |
| OUTPUT (if PROJDATA is set to $1$) in the new coordinate system. The |
| produced model consists of basis vectors MODEL$/dominant.eigen.vectors$ |
| for the new coordinate system; eigen values |
| MODEL$/dominant.eigen.values$; and the standard deviation |
| MODEL$/dominant.eigen.standard.deviations$ of principal components. When |
| MODEL is provided, INPUT data is rotated according to the coordinate |
| system defined by MODEL$/dominant.eigen.vectors$. The resulting data is |
| stored at location OUTPUT.</p> |
| |
| <hr /> |
| |
| <h2 id="matrix-completion-via-alternating-minimizations">Matrix Completion via Alternating Minimizations</h2> |
| |
| <h3 id="mcam-description">MCAM Description</h3> |
| |
| <p>Low-rank matrix completion is an effective technique for statistical |
| data analysis widely used in the data mining and machine learning |
| applications. Matrix completion is a variant of low-rank matrix |
| factorization with the goal of recovering a partially observed and |
| potentially noisy matrix from a subset of its revealed entries. Perhaps |
| the most popular applications in which matrix completion has been |
| successfully applied is in the context of collaborative filtering in |
| recommender systems. In this setting, the rows in the data matrix |
| correspond to users, the columns to items such as movies, and entries to |
| feedback provided by users for items. The goal is to predict missing |
| entries of the rating matrix. This implementation uses the alternating |
| least-squares (ALS) technique for solving large-scale matrix completion |
| problems.</p> |
| |
| <h3 id="mcam-details">MCAM Details</h3> |
| |
| <p>Given an $m \times n$ input matrix $V$ and a rank parameter |
| $r \ll \min{(m,n)}$, low-rank matrix factorization seeks to find an |
| $m \times r$ matrix $L$ and an $r \times n$ matrix $R$ such that |
| $V \approx LR$, i.e., we aim to approximate $V$ by the low-rank matrix |
| $LR$. The quality of the approximation is determined by an |
| application-dependent loss function $\mathcal{L}$. We aim at finding the |
| loss-minimizing factor matrices, i.e.,</p> |
| |
| \[\begin{equation} |
| (L^*, R^*) = \textrm{argmin}_{L,R}{\mathcal{L}(V,L,R)} |
| \end{equation}\] |
| |
| <p>In the |
| context of collaborative filtering in the recommender systems it is |
| often the case that the input matrix $V$ contains several missing |
| entries. Such entries are coded with the 0 value and the loss function |
| is computed only based on the nonzero entries in $V$, i.e.,</p> |
| |
| \[%\label{eq:loss} |
| \mathcal{L}=\sum_{(i,j)\in\Omega}l(V_{ij},L_{i*},R_{*j})\] |
| |
| <p>where $L_{i*}$ and $R_{*j}$, respectively, denote the $i$th row of $L$ and the |
| $j$th column of $R$, $\Omega={\omega_1,\dots,\omega_N}$ denotes the |
| training set containing the observed (nonzero) entries in $V$, and $l$ |
| is some local loss function.</p> |
| |
| <p>ALS is an optimization technique that can |
| be used to solve quadratic problems. For matrix completion, the |
| algorithm repeatedly keeps one of the unknown matrices ($L$ or $R$) |
| fixed and optimizes the other one. In particular, ALS alternates between |
| recomputing the rows of $L$ in one step and the columns of $R$ in the |
| subsequent step. Our implementation of the ALS algorithm supports the |
| loss functions summarized in <a href="#table16"><strong>Table 16</strong></a> |
| commonly used for matrix completion <a href="algorithms-bibliography.html">[ZhouWSP08]</a>.</p> |
| |
| <h4 id="table-16">Table 16</h4> |
| |
| <p>Popular loss functions supported by our ALS implementation; $N_{i*}$ |
| and $N_{*j}$, respectively, denote the number of nonzero entries in |
| row $i$ and column $j$ of $V$.</p> |
| |
| <table> |
| <thead> |
| <tr> |
| <th>Loss</th> |
| <th>Definition</th> |
| </tr> |
| </thead> |
| <tbody> |
| <tr> |
| <td>$\mathcal{L}_\text{Nzsl}$</td> |
| <td>$\sum_{i,j:V_{ij}\neq 0} (V_{ij} - [LR]_{ij})^2$</td> |
| </tr> |
| <tr> |
| <td>$\mathcal{L}_\text{Nzsl+L2}$</td> |
| <td>$\mathcal{L}<em>\text{Nzsl} + \lambda \Bigl( \sum</em>{ik} L_{ik}^2 + \sum_{kj} R_{kj}^2 \Bigr)$</td> |
| </tr> |
| <tr> |
| <td>$\mathcal{L}_\text{Nzsl+wL2}$</td> |
| <td>$\mathcal{L}<em>\text{Nzsl} + \lambda \Bigl(\sum</em>{ik}N_{i<em>} L_{ik}^2 + \sum_{kj}N_{</em>j} R_{kj}^2 \Bigr)$</td> |
| </tr> |
| </tbody> |
| </table> |
| |
| <hr /> |
| |
| <p>Note that the matrix completion problem as defined in (1) |
| is a non-convex problem for all loss functions from |
| <a href="#table-16"><strong>Table 16</strong></a>. However, when fixing one of the matrices |
| $L$ or $R$, we get a least-squares problem with a globally optimal |
| solution. For example, for the case of \(\mathcal{L}_\text{Nzsl+wL2}\) we |
| have the following closed form solutions</p> |
| |
| \[\begin{aligned} |
| L^\top_{n+1,i*} &\leftarrow (R^{(i)}_n {[R^{(i)}_n]}^\top + \lambda N_2 I)^{-1} R_n V^\top_{i*}, \\ |
| R_{n+1,*j} &\leftarrow ({[L^{(j)}_{n+1}]}^\top L^{(j)}_{n+1} + \lambda N_1 I)^{-1} L^\top_{n+1} V_{*j}, |
| \end{aligned}\] |
| |
| <p>where $L_{n+1,i*}$ (resp. $R_{n+1,*j}$) denotes the |
| $i$th row of $L_{n+1}$ (resp. $j$th column of $R_{n+1}$), $\lambda$ |
| denotes the regularization parameter, $I$ is the identity matrix of |
| appropriate dimensionality, $V_{i*}$ (resp. $V_{*j}$) denotes the |
| revealed entries in row $i$ (column $j$), $R^{(i)}<em>n$ ( resp. |
| $L^{(j)}</em>{n+1}$ ) refers to the corresponding columns of $R_n$ (rows of |
| $L_{n+1}$), and $N_1$ (resp. $N_2$) denotes a diagonal matrix that |
| contains the number of nonzero entries in row $i$ (column $j$) of $V$.</p> |
| |
| <p><strong>Prediction.</strong> Based on the factor matrices computed by ALS we provide |
| two prediction scripts:</p> |
| |
| <ol> |
| <li><code class="language-plaintext highlighter-rouge">ALS_predict.dml</code> computes the predicted ratings for a given |
| list of users and items.</li> |
| <li><code class="language-plaintext highlighter-rouge">ALS_topk_predict.dml</code> computes top-K item (where $K$ is |
| given as input) with highest predicted ratings together with their |
| corresponding ratings for a given list of users.</li> |
| </ol> |
| |
| <h3 id="mcam-returns">MCAM Returns</h3> |
| |
| <p>We output the factor matrices $L$ and $R$ after the algorithm has |
| converged. The algorithm is declared as converged if one of the two |
| criteria is meet: (1) the decrease in the value of loss function falls |
| below <code class="language-plaintext highlighter-rouge">thr</code> given as an input parameter (if parameter |
| <code class="language-plaintext highlighter-rouge">check=TRUE</code>), or (2) the maximum number of iterations |
| (defined as parameter <code class="language-plaintext highlighter-rouge">maxi</code>) is reached. Note that for a |
| given user $i$ prediction is possible only if user $i$ has rated at |
| least one item, i.e., row $i$ in matrix $V$ has at least one nonzero |
| entry. In case, some users have not rated any items the corresponding |
| factor in $L$ will be all 0s. Similarly if some items have not been |
| rated at all the corresponding factors in $R$ will contain only 0s. Our |
| prediction scripts output the predicted ratings for a given list of |
| users and items as well as the top-K items with highest predicted |
| ratings together with the predicted ratings for a given list of users. |
| Note that the predictions will only be provided for the users who have |
| rated at least one item, i.e., the corresponding rows contain at least |
| one nonzero entry.</p> |
| |
| </div> |
| <!-- |
| |
| --> |
| |
| |
| <!-- MathJax Section --> |
| <script type="text/x-mathjax-config"> |
| MathJax.Hub.Config({ TeX: { equationNumbers: { autoNumber: "AMS" } } }); |
| </script> |
| <script> |
| // Note that we load MathJax this way to work with local file (file://), HTTP and HTTPS. |
| // We could use "//cdn.mathjax...", but that won't support "file://". |
| (function(d, script) { |
| script = d.createElement('script'); |
| script.type = 'text/javascript'; |
| script.async = true; |
| script.onload = function() { |
| MathJax.Hub.Config({ |
| tex2jax: { |
| inlineMath: [ |
| ["$", "$"], |
| ["\\\\(", "\\\\)"] |
| ], |
| displayMath: [ |
| ["$$", "$$"], |
| ["\\[", "\\]"] |
| ], |
| processEscapes: true, |
| skipTags: ['script', 'noscript', 'style', 'textarea', 'pre'] |
| } |
| }); |
| }; |
| script.src = ('https:' == document.location.protocol ? 'https://' : 'http://') + |
| 'cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML'; |
| d.getElementsByTagName('head')[0].appendChild(script); |
| }(document)); |
| </script> |
| |
| </body> |
| |
| </html> |