Postgraduate research opportunities
Novel time series machine learning methodology for high-dimensional data
Key facts
- Opens: Monday 6 April 2026
- Deadline: Friday 5 June 2026
- Number of places: 1
- Duration: 3 years
- Funding: Home fee, Stipend
Overview
High-dimensional time series modelling is an active research topic in statistics and data science. We aim to propose new machine learning methodology for modelling high-dimensional time series data from health care, finance and environmental science. The project explores statistical machine learning inference and probabilistic models for imputing missing data and forecasting high-dimensional data, and studies applications to real data.
Eligibility
Applicants should hold a first-class or upper-second-class honours degree in Statistics, Applied Mathematics or Econometrics, or an MSc with distinction in Mathematics and Statistics or Econometrics. Outstanding overseas applicants with an equivalent degree will also be considered.
Project Details
Developing AI-based methodology and architectures that improve prediction accuracy in finance, environmental science and health care, enhance data reliability, and support evidence-based policy has become a priority for many countries, including the UK.
This project aims to provide novel time series machine learning (TSML) methodology for imputing missing values and forecasting high-dimensional data. The research is particularly innovative for the discrete-valued case because, to the best of our knowledge, such research is lacking in the literature. We will work on the following two aspects of modelling high-dimensional time series.
Imputation of missing data in high-dimensional time series
High-dimensional data are common in fields such as finance, healthcare, and environmental science. They are normally recorded in time order and so form high-dimensional time series datasets, for example air pollution data across many locations. They inevitably contain missing values, which hinder the application of many analytical and statistical methods. Handling these gaps effectively is therefore essential before model development. In ultra-high-dimensional settings, filling in missing entries presents significant challenges for machine learning and statistical approaches. Existing techniques (e.g. Obata et al. (2024)) suit only the low-dimensional, continuous-valued case, so research on the high-dimensional and/or discrete-valued cases is now in demand. In this project, relationships between components (the so-called network structure) and temporal dependency are used jointly to obtain accurate imputation. We will introduce a novel framework that models evolving inter-correlations through a Markov regime-switching network with a large number of nodes, temporal dynamics through a state-space formulation (e.g. Fan et al. (2020)), and dimension reduction via factor models. We will also develop self-exciting spatio-temporal models for imputation, under the assumption that the imputed data follow a nested family of continuous and discrete distributions rather than only normal distributions.
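To make the role of factor models in imputation concrete, the following is a minimal illustrative sketch, not the project's proposed method. It fills missing entries of a time-by-series panel by repeatedly fitting a low-rank (factor) approximation with a truncated SVD. The function name `impute_low_rank`, its parameters and the simulated panel are all assumptions made for illustration. The sketch ignores temporal dynamics and regime switching, treats the data as continuous-valued, and assumes every series has at least one observed value.

```python
import numpy as np


def impute_low_rank(X, n_factors=2, n_iter=100, tol=1e-8):
    """Fill NaN entries of a (time x series) panel with a rank-`n_factors` fit.

    Hypothetical helper for illustration: iterative truncated-SVD imputation.
    Assumes each column has at least one observed (non-NaN) value.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Initialise missing entries with the mean of the observed values per series.
    filled = np.where(missing, np.nanmean(X, axis=0), X)
    for _ in range(n_iter):
        mu = filled.mean(axis=0)
        # Truncated SVD of the centred panel: U * s act as factors, Vt as loadings.
        U, s, Vt = np.linalg.svd(filled - mu, full_matrices=False)
        low_rank = (U[:, :n_factors] * s[:n_factors]) @ Vt[:n_factors] + mu
        # Observed entries are kept; only missing entries are overwritten.
        updated = np.where(missing, low_rank, X)
        change = np.max(np.abs(updated - filled))
        filled = updated
        if change < tol:
            break
    return filled


# Simulated example (assumed data): 30 series driven by 2 common factors.
rng = np.random.default_rng(0)
T, p, k = 200, 30, 2
panel = rng.normal(size=(T, k)) @ rng.normal(size=(k, p))
panel += 0.05 * rng.normal(size=(T, p))

mask = rng.random((T, p)) < 0.1          # hide roughly 10% of entries
observed = panel.copy()
observed[mask] = np.nan

completed = impute_low_rank(observed, n_factors=k)
baseline = np.where(mask, np.nanmean(observed, axis=0), observed)

rmse = np.sqrt(np.mean((completed[mask] - panel[mask]) ** 2))
baseline_rmse = np.sqrt(np.mean((baseline[mask] - panel[mask]) ** 2))
print(f"low-rank RMSE: {rmse:.3f}, column-mean RMSE: {baseline_rmse:.3f}")
```

On data that really do have a few common factors, borrowing strength across series in this way should beat filling each series with its own mean. The framework described above goes further by adding temporal dynamics, network structure and discrete-valued distributions.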
Machine learning architectures for accurate and robust forecasting of high-dimensional time series
Using the imputed data, we will develop new machine learning and statistical models for forecasting high-dimensional time series. We will develop deep learning models related to temporal convolutional networks and transformers, improve existing methods (e.g. Fan et al. (2020), Obata et al. (2024)), and extend them to the high-dimensional and discrete-valued cases using transformer-based architectures, factor models (see Liu et al. (2025), Pan and Yao (2008)) and recent advances in hybrid probabilistic and statistical approaches. We also propose dynamic uncertainty quantification that combines Bayesian inference and quantile regression to enhance robustness and achieve probabilistic forecasting (see Dvijotham et al. (2023)).
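As background on the quantile regression ingredient mentioned above, the sketch below shows the standard quantile (pinball) loss, which is commonly used to train and score quantile forecasts. It is a generic illustration under assumed names (`pinball_loss`, the simulated sample `y`, the candidate grid), not the project's uncertainty quantification method. The example checks numerically that the constant forecast minimising the loss at level 0.9 sits at the empirical 90% quantile.

```python
import numpy as np


def pinball_loss(y_true, y_pred, tau):
    """Average quantile (pinball) loss at level `tau` in (0, 1).

    Under-prediction is weighted by tau and over-prediction by (1 - tau).
    """
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.maximum(tau * diff, (tau - 1.0) * diff)))


# Simulated example (assumed data): find the best constant 90% quantile forecast.
rng = np.random.default_rng(1)
y = rng.normal(size=5000)
tau = 0.9

candidates = np.linspace(-3.0, 3.0, 601)            # grid step of 0.01
losses = [pinball_loss(y, c, tau) for c in candidates]
best = float(candidates[int(np.argmin(losses))])
empirical = float(np.quantile(y, tau))

print(f"loss-minimising constant: {best:.2f}, empirical quantile: {empirical:.2f}")
```

Fitting a forecasting model at several values of `tau` with this loss gives a set of predictive quantiles, which is one common route to probabilistic forecasts.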
Objectives
The objectives are:
- to develop machine learning architectures for high-dimensional time series modelling to improve accuracy and robustness in forecasting
- to detect anomalies and impute missing data in high-dimensional time series with minimal error
We will validate the proposed models through real-world case studies in forecasting, anomaly detection, and decision support.
Application areas include:
- financial forecasting and risk modelling across interconnected markets
- public health monitoring across multiple regions and large healthcare systems
- trend analysis of air and water pollution across districts
Relevant datasets will be drawn from public sources, industrial partners, and healthcare collaborators to ensure sufficient high-dimensional data for model evaluation.
We will show that deploying the proposed techniques makes imputation and forecasting feasible and accurate in each of these applications.
The following outcomes are expected:
- publications in top-tier journals and conferences
- open-source time series AI models for imputation and forecasting
- deployment-ready prototypes for selected applications
Funding details
The funding is worth £89,300. The successful applicant will receive funding from their registration date, 1 October 2026.
For overseas applicants, the difference between the home fee and the international fee must be funded from other sources.
While there is no funding in place for opportunities marked "unfunded", there are many options to help you fund postgraduate research. Visit funding your postgraduate research for links to government grants, research council funding and other sources that could be available.
Apply
There is a shortlist/interview process for this opportunity.
To read how we process personal data, applicants can review our 'Privacy Notice for Student Applicants and Potential Applicants' on our 'Privacy notices' web page.
Programme: Mathematics and Statistics - Statistics