Improper prediction of node labels in graphs


The classical approaches to node labels prediction in a graph require the modelling of a graph structure. The classical example is Stochastic Block Model and its various extensions. However, these models are rarely close to the structure of real world graphs, which results in pure approximation of a graph structure and thus inaccurate label predictions. Recently, Rakhlin & Sridharan proposed the approach for improper prediction of node labels, i.e. the prediction in online fashion without modelling the structure of a graph. We aim to extend their work to improve the computational efficiency and to improve the quality by using side information in nodes.


  1. Study the paper [1] and get familiar with the approach of improper prediction.
  2. Find the appropriate real world data for the problem and also think on generating the model data.
  3. Program the algorithm and compare with standard approaches.
  4. Improve the algorithm to achieve better computational efficiency.
  5. Continue with multiclass setting [2] and develop the approach to include side information there.


[1]  A. Rakhlin and K. Sridharan. A tutorial on online supervised learning with applications to node classification in social networks. CoRR, abs/1608.09014, 2016.
[2] A. Rakhlin and K. Sridharan, K.. Efficient Online Multiclass Prediction on Graphs via Surrogate Losses. Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, in PMLR 54:1403-1411, 2017.