The development of new efficient Markov Chain Monte Carlo methods for solving high-dimensional integration and optimization problems in machine learning

The modern machine learning is characterized by high dimensional models and massive datasets. Despite recent advances in developing MCMC algorithms in high dimensions, it is still an arduous task to design efficient samplers in high dimensions. The design of efficient sampling methods depends on understanding of the basic building blocks of MCMC (mixtures of kernels, augmentation strategies and blocking) and on careful design of the proposal mechanisms. The latter requires domain specific knowledge and heuristics. There are great opportunities for combining existing sub-optimal algorithms with MCMC in many machine learning problems. The following areas of machine learning already significantly benefit from sampling methods: computer vision (tracking, stereo matching, color constancy, restoration of old movies, segmentation), web statistics (estimating coverage of search engines, proportions belonging to specific domains and the average size of web pages), speech and audio processing (signal enhancement), regression and classification (neural networks and kernel machines), computer graphics (light transport, sampling plausible solutions to multi-body constraint problems), data association (vehicle matching in highway systems, multitarget tracking). It should emphasized that in order to obtain the best results out of MCMC algorithms, it is important not to treat them as black boxes, but instead try to incorporate as much domain specific knowledge as possible into their design. Markov Chains Monte Carlo (MCMC) algorithms have played a significant role in statistics, econometrics, physics and computing science over the last two decades. This facility, together with the tremendous increase of computer power in recent years, makes MCMC perhaps the main reason for the widespread use of Bayesian statistical modeling and inference across the spectrum of quantitative scientific disciplines. There are several high-dimensional problems, such as computing the volume of a convex body in d dimensions, for which MCMC simulation is the only known general approach for providing a solution within a reasonable time (polynomial in d). Many papers on Monte Carlo simulation appeared in the physics literature after 1953. Despite the existence of a few MCMC publications in the statistics literature at this time, it is generally accepted that it was only in 1990 that MCMC made the first significant impact in statistics (Gelfand & Smith, 1990). In the neural networks literature, the publication of Neal (1996) was particularly influential. MCMC techniques are often applied to solve integration and optimisation problems in large dimensional spaces. These two types of problem play a fundamental role in machine learning, physics, statistics, econometrics and decision analysis. For example, in Bayesian inference and learning MCMC algorithms can help to solve such intractable integration problems as normalization, marginalization and the computation of posterior means. In optimization optimization the goal is to extract the solution that minimises some objective function from a large set of feasible solutions. In fact, this set can be continuous and unbounded. In general, it is too computationally expensive to compare all the solutions to find out which one is optimal. MCMC algorithms can be used to approximate hard combinatorial optimisation problems by much simpler random search problems. MCMC is a strategy for generating samples from a distribution p on a state space X by exploring the state space X using a Markov chain mechanism. This mechanism is constructed so that the chain spends more time in the most important regions. In particular, it is constructed so that the samples mimic samples drawn from the target distribution p. The main advantage of MCMC is that it can be used in the case where p is known only up to a normalising constant. The main directions of the current research in MCMC area include convergence analysis and perfect sampling (bounding the mixing time), adaptive MCMC, sequential Monte Carlo and particle filters. The topic of variance reduction in MCMC algorithms is rather hot and attracted a lot of interests recently (Dellaportas & Kontoyiannis).

Responsible persons: Denis Belomestny, Leonid Iosipoi