I am a third-year PhD student in the OxWaSP CDT programme at the University of Oxford, supervised by Professor Yee Whye Teh. My research interests lie in machine learning, reinforcement learning, deep learning, Gaussian processes and Bayesian inference. Currently, I am working on a project on adaptive importance sampling and its connection to the exploration-exploitation trade-off.
(Accepted at the International Conference on Artificial Intelligence and Statistics (AISTATS), 2017)
Hamiltonian Monte Carlo (HMC) is a popular Markov chain Monte Carlo (MCMC) algorithm that generates proposals for a Metropolis-Hastings algorithm by simulating the dynamics of a Hamiltonian system. However, HMC is sensitive to large time discretisations and performs poorly if there is a mismatch between the spatial geometry of the target distribution and the scales of the momentum distribution. In particular, the mass matrix of HMC is hard to tune well. To alleviate these problems, we propose relativistic Hamiltonian Monte Carlo, a version of HMC based on relativistic dynamics, which introduce a maximum velocity on particles. We also derive stochastic gradient versions of the algorithm and show that the resulting algorithms bear interesting relationships to gradient clipping, RMSprop, Adagrad and Adam, popular optimisation methods in deep learning. Based on this, we develop relativistic stochastic gradient descent by taking the zero-temperature limit of relativistic stochastic gradient Hamiltonian Monte Carlo. In experiments we show that the relativistic algorithms perform better than classical Newtonian variants and Adam.
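To give a flavour of the idea, the sketch below shows a speed-limited momentum update in the relativistic style: the position step uses the velocity p / (m · sqrt(p·p/(m²c²) + 1)), whose norm is bounded by c for any momentum p. This is only a minimal illustration of the mechanism, not the paper's exact algorithm; the learning rate, mass m, speed limit c and momentum decay beta are illustrative choices.

```python
import numpy as np

def relativistic_sgd(grad, x0, lr=0.1, m=1.0, c=1.0, beta=0.9, steps=200):
    """Sketch of a relativistic-style momentum method.

    The velocity v = p / (m * sqrt(p.p / (m^2 c^2) + 1)) satisfies
    ||v|| < c no matter how large the momentum p grows, which acts
    like a smooth form of gradient clipping.
    """
    x = np.asarray(x0, dtype=float)
    p = np.zeros_like(x)
    for _ in range(steps):
        p = beta * p - lr * grad(x)                        # accumulate momentum
        v = p / (m * np.sqrt(p @ p / (m**2 * c**2) + 1.0)) # bounded velocity, ||v|| < c
        x = x + lr * v                                     # speed-limited position step
    return x

# usage: minimise f(x) = 0.5 * ||x||^2, whose gradient is x
x_min = relativistic_sgd(lambda x: x, x0=[5.0, -3.0])
```

Because the velocity norm saturates at c, a single noisy gradient cannot produce an arbitrarily large step, which is the connection to gradient clipping mentioned above.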
(Under review at the Conference on Neural Information Processing Systems (NIPS), 2017)
We introduce the Tucker Gaussian Process (TGP), a model for regression that regularises a Gaussian Process (GP) towards simpler regression functions for enhanced generalisation performance. We derive it using a novel approach to scalable GP learning, and show that our model is particularly well-suited to grid-structured data and problems where the dependence on covariates is close to being separable. A prime example is collaborative filtering, for which our model provides an effective GP-based method that has a low-rank matrix factorisation at its core. We show that TGP generalises classical Bayesian matrix factorisation models, and goes beyond them to give a natural and elegant method for incorporating side information.
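The abstract does not spell out the TGP construction, but the benefit of separability on grid-structured data can be illustrated generically: a product (separable) kernel evaluated on a Cartesian grid yields a Gram matrix that is a Kronecker product of small per-dimension Gram matrices, so only the small factors ever need to be stored or factorised. The kernel and grid below are illustrative assumptions, not the model from the paper.

```python
import numpy as np

def rbf(a, b, ell=1.0):
    """Squared-exponential kernel matrix between 1-D input vectors a and b."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ell) ** 2)

# Inputs on a grid: every pair (x1[i], x2[j]). A separable kernel
# k((x1, x2), (x1', x2')) = k1(x1, x1') * k2(x2, x2') gives a
# Kronecker-structured Gram matrix K = K1 (x) K2.
x1 = np.linspace(0.0, 1.0, 4)
x2 = np.linspace(0.0, 1.0, 3)
K1, K2 = rbf(x1, x1), rbf(x2, x2)

# The full 12x12 Gram matrix, recoverable from the 4x4 and 3x3 factors.
K_full = np.kron(K1, K2)
```

In practice one would work with K1 and K2 directly (their eigendecompositions give the eigendecomposition of the Kronecker product), which is what makes GP inference on grids scale.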
Lecturer in Probability and Statistics
• Collected large-scale data from databases using query languages such as SQL.
• Created competitive analyses and benchmarking studies for account hijacking, and recommended strategy adjustments based on the findings.
• Analysed hijacking trends within a specific set of products and developed action plans based on the observed patterns.
• Analysed preventable abuse-related issues affecting users, and identified core and common prevention focus areas across Product Quality Operation.
• Partnered with engineering teams to improve hijacking prevention, detection and recovery systems.
• Built statistical models to select relevant features and predict whether clusters of accounts were benign or abusive.
• Delivered clear presentations and documentation of findings.
• Built pricing models for calendar spread options using Excel and VBA.
• Performed model calibration and validation, as well as hedging simulations on historical data.
• Delivered strong results and received exceptional feedback from managers and colleagues.
• Built predictive models of bid-offer curves for forecasting in the European power market.
• Performed data sampling and manipulation using statistical and programming tools, including R and Python.
• Received excellent feedback and successfully implemented the pricing model, which went into production.
• Assisted with daily business operations and organised files for the group.
• Provided customer service and maintained client relationships in a fast-paced environment.
• Improved customer satisfaction by 10%.