Sparse inductive matrix completion

Overview

The goal of matrix completion is to recover all the elements of the matrix given only the fraction of them and assuming that the matrix has the low rank structure. This is the state of the art approach in recommender systems, where the matrix represents the user ranks for some objects like movies, songs or goods. The classical algorithms for matrix completion [1] treat the problem as nonparametric problem ||R−U V^T|| → min_{U,V}, where the estimation of the matrix elements is based solely on the observed entries of the matrix. However, in several matrix completion applications there exists a side information [2] which has been empirically shown to be useful in many cases. For example, in blog recommender systems there is a lot of extra information about both users X and blogs Y additionally to the information of the type “User A follows blog B” and one possible problem to solve is ||R − X (U V^T) Y^T|| → min_{U,V}. In the situations where the dimensionality of side information is high, there is a need to select the sparse subset of side information features in order to maintain quality of results and increase their interpretability. One of the approaches is to use the group LASSO type of penalty [3]. The goal of the project is to investigate the possibility of inductive matrix completion with feature selection by group LASSO.

Tasks

  1. Study the classical matrix completion algorithms.
  2. Get familiar with LASSO and group LASSO.
  3. Find the appropriate real world data for the problem and also think on generating the model data.
  4. Program the algorithm and compare with standard approaches.

References

[1] Emmanuel J Cand`es and Benjamin Recht. Exact matrix completion via convex optimization. Foundations of Computational mathematics, 9(6):717–772, 2009.
[2] Donghyuk Shin, Suleyman Cetintas, Kuang-Chih Lee, and Inderjit S. Dhillon. Tumblr blog recommendation with boosted inductive matrix completion. In Proceedings of the CIKM 2015, pages 203–212, 2015.
[3] Ming Yuan and Yi Lin. Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 68(1):49–67, 2006.