On Exploration, Exploitation and Learning in Adaptive Importance Sampling

Xiaoyu Lu, Tom Rainforth, Yuan Zhou, Yee Whye Teh, Frank Wood, Hongseok Yang, Jan-Willem van de Meent
arXiv preprint arXiv:1810.13296v1, October 2018.

(Accepted by Advances in Approximate Bayesian Inference Workshop, 2017)


We study adaptive importance sampling (AIS) as an online learning problem and argue for the importance of the trade-off between exploration and exploitation in this adaptation. Borrowing ideas from the bandits literature, we propose Daisee, a partition-based AIS algorithm. We further introduce a notion of regret for AIS and show that Daisee has $\mathcal{O}(\sqrt{T}(\log T)^{3/4})$ cumulative pseudo-regret, where $T$ is the number of iterations. We then extend Daisee to adaptively learn a hierarchical partitioning of the sample space for more efficient sampling and confirm the performance of both algorithms empirically.

Structured Variationally Auto-encoded Optimization

Xiaoyu Lu, Javier Gonzalez, Zhenwen Dai, Neil D. Lawrence

(Accepted by ICML, 2018)


We tackle the problem of optimizing a black-box objective function defined over a highly-structured input space. This problem is ubiquitous in science and engineering. In machine learning, inferring the structure of a neural network or the Automatic Statistician (AS), where the optimal kernel combination for a Gaussian process is selected, are two important examples. We use the AS as a case study to describe our approach, which can be easily generalized to other domains. We propose a Structure Generating Variational Auto-encoder (SG-VAE) to embed the original space of kernel combinations into some low-dimensional continuous manifold where Bayesian optimization (BO) ideas are used. This is possible when structural knowledge of the problem is available, which can be given via a simulator or any other form of generating potentially good solutions. The right exploration-exploitation balance is imposed by propagating into the search the uncertainty of the latent space of the SG-VAE, which is computed using variational inference. The key aspect of our approach is that the SG-VAE can be used to bias the search towards relevant regions, making it suitable for transfer learning tasks. Several experiments in various application domains are used to illustrate the utility and generality of the approach described in this work.

Relativistic Monte Carlo

Xiaoyu Lu, Valerio Perrone, Leonard Hasenclever, Yee Whye Teh, Sebastian J. Vollmer
arXiv preprint arXiv:1609.04388, October 2016.

(Accepted by Artificial Intelligence and Statistics Conference (AISTATS), 2017)


Hamiltonian Monte Carlo (HMC) is a popular Markov chain Monte Carlo (MCMC) algorithm that generates proposals for a Metropolis-Hastings algorithm by simulating the dynamics of a Hamiltonian system. However, HMC is sensitive to large time discretizations and performs poorly if there is a mismatch between the spatial geometry of the target distribution and the scales of the momentum distribution. In particular the mass matrix of HMC is hard to tune well. In order to alleviate these problems we propose relativistic Hamiltonian Monte Carlo, a version of HMC based on relativistic dynamics that introduce a maximum velocity on particles. We also derive stochastic gradient versions of the algorithm and show that the resulting algorithms bear interesting relationships to gradient clipping, RMSprop, Adagrad and Adam, popular optimisation methods in deep learning. Based on this, we develop relativistic stochastic gradient descent by taking the zero-temperature limit of relativistic stochastic gradient Hamiltonian Monte Carlo. In experiments we show that the relativistic algorithms perform better than classical Newtonian variants and Adam.

Tucker Gaussian Process for Regression and Collaborative Filtering

Hyunjik Kim, Xiaoyu Lu, Seth Flaxman, Yee Whye Teh
arXiv preprint arXiv:1605.07025v2, October 2016.


We introduce the Tucker Gaussian Process (TGP), a model for regression that regularises a Gaussian Process (GP) towards simpler regression functions for enhanced generalisation performance. We derive it using a novel approach to scalable GP learning, and show that our model is particularly well-suited to grid-structured data and problems where the dependence on covariates is close to being separable. A prime example is collaborative filtering, for which our model provides an effective GP based method that has a low-rank matrix factorisation at its core. We show that TGP generalises classical Bayesian matrix factorisation models, and goes beyond them to give a natural and elegant method for incorporating side information.

Inference Trees: Adaptive Inference with Exploration

Tom Rainforth, Yuan Zhou, Xiaoyu Lu, Yee Whye Teh, Frank Wood, Hongseok Yang, Jan-Willem van de Meent
arXiv preprint arXiv:1605.07025v2, October 2016.


We introduce inference trees (ITs), a new class of inference methods that build on ideas from Monte Carlo tree search to perform adaptive sampling in a manner that balances exploration with exploitation, ensures consistency, and alleviates pathologies in existing adaptive methods. ITs adaptively sample from hierarchical partitions of the parameter space, while simultaneously learning these partitions in an online manner. This enables ITs to not only identify regions of high posterior mass, but also maintain uncertainty estimates to track regions where significant posterior mass may have been missed. ITs can be based on any inference method that provides a consistent estimate of the marginal likelihood. They are particularly effective when combined with sequential Monte Carlo, where they capture long-range dependencies and yield improvements beyond proposal adaptation alone.

  • Machine Learning Scientist, Amazon, 04/2019 - Present
  • Research Intern, Microsoft, 06/2018 - 09/2018

    • Research project in imitation learning with latent variable modelling.

    • Delivered a research paper.

  • Quantitative Research Intern, JP Morgan Chase, 09/2017 - 11/2017

    Worked in the Model Governance Group on projects capturing risks not in VaR (Value at Risk) for CDS.

  • Applied Scientist Intern, Amazon, 06/2017 - 08/2017

    • Research project in Bayesian Optimization when the input space is non-Euclidean, with an application in automated model selection. Successfully implemented the model in Python and presented the work to the group.

    • Delivered a paper that was accepted at ICML 2018. The paper was also submitted to the Bayesian Optimization for Science and Engineering workshop at NIPS 2017, where it is under review.

    • Implemented the VAE (Variational Autoencoder) module in a deep learning framework (MXNet) and contributed to the public repository.

  • Non-stipendiary Lecturer, Oxford University, 10/2016 - 06/2017

    Lecturer in Probability and Statistics

  • Business Technical Intern, Google, Ireland, 07/2015 - 09/2015

    • Collect big data from database using query languages such as SQL.

    • Create competitive analysis and benchmarking study for account hijacking, recommend strategy adjustments based on findings.

    • Analyse hijacking trends within a specific set of products and develop an action plan based on trends and patterns.

    • Analyse preventable abuse related issues which impact users, and identify core and common prevention focus areas across Product Quality Operation.

    • Partner with engineering teams to improve our hijacking prevention, detection and recovery systems.

    • Build statistical models to select relevant features and predict goodness/badness of clusters of accounts.

    • Delivered excellent presentation and documentation.

  • Quantitative Strategies Summer Analyst, Credit Suisse, London, 06/2014-08/2014

    • Build pricing models for Calendar Spread Options using Excel and VBA.

    • Perform model calibration and validation, as well as hedging simulation for historical data.

    • Delivered excellent results and received exceptional feedback from managers and colleagues.

  • Power Trading Intern, Gazprom Marketing and Trading, London, 07/2013-09/2013

    • Build predictive models for bid-offer curves for forecasting in the European power market.

    • Data sampling and manipulation using statistical and programming tools including R and Python.

    • Received excellent feedback and successfully implemented the pricing model, which was used in production.

  • Summer Intern, Guotai Junan Securities, Qingdao, China, 06/2012-08/2012

    • In charge of assisting with daily business and organising files in a group.

    • Provide customer service and maintain relationships with clients in a fast-paced environment.

    • Improved customers’ satisfaction by 10%.

  • PhD, 2014 - 2019
    Statistical Machine Learning

    University of Oxford, New College

  • M.Math., 2010 - 2014
    Mathematics and Statistics

    University of Oxford, Lady Margaret Hall

  • 2014
    Clarendon Fund Scholarship, PAG Oxford Scholarship
    • Awarded to the top 3% of accepted graduate students across all disciplines at the University of Oxford
    • Full scholarship for PhD studies
  • 2014
    Royal Statistical Society Prize
    • Ranked 1st at the University of Oxford.
  • 2014
    Gibbs Prize
    • Awarded to the best undergraduate student in the Department of Statistics, University of Oxford.
  • 2013
    Department of Statistics Prize 2013
    • Awarded due to excellent academic performance.
  • 2013
    Top ten finalist, TARGETjobs
    • Top ten finalist for the Mathematics, Economics and Finance Undergraduate Of The Year Award.
  • 2013
    Certificate of Appreciation
    • Awarded due to outstanding volunteer service in Chinglai, Thailand.
  • Language
    • Chinese: Mother tongue
    • English: Fluent
  • IT Skills
    • Intermediate Knowledge: Julia, Python, R, Matlab, Linux, LaTeX.
    • Basic Knowledge: Microsoft Office, VBA.
  • Sports and Arts
    • Current member of Incognito Salsa Team, performed at various shows across London and at Scottish Salsa Congress 2018.
    • Previous member of Oxford University Dancesports Club Beginners Team 2017. Won 2nd place in Cha Cha Cha and Jive at Varsity competition.
    • Achieved Grade 9 (Chinese Musician Association Certificate) in Piano.
  • Volunteering
    • I volunteered at Camillian Social Centre in Chinglai, Thailand in 2013, teaching disabled children Maths, yoga, and social science, as well as organising tailored activities for individuals, which was an invaluable and enjoyable experience for me.
  • Societies
    • Talent Management Team Leader, AIESEC, 10/2012-06/2012
    Responsible for organising recruitment events, socials and running training sessions.
    • LMH (Lady Margaret Hall) Ambassador, 10/2012-06/2012
    Responsible for college tours, Q&As and school visits.
  • Amazon
  • 1 Station Road, Station Square
  • Cambridge, CB1 2GA
  • United Kingdom
  • E-Mail: raindrop2bird@gmail.com
  • luxiaoyu@amazon.co.uk