Researchers and statistics teachers are often asked to write an article or paper based on a statistics project idea. One of the most important steps in writing a strong, well-organized statistics research project, paper, or essay is choosing an interesting topic that will captivate your readers and provoke thought.
Students often find it difficult to come up with well-formed statistical research project topics that, like argumentative essay topics, get their message across. In this essay, we will look at some of the most interesting statistics research topics to focus your research on.
Here are some of the best statistical research topics worth writing on:
Whether you are in high school or college, here are some of the best statistical projects to focus your research on, especially if you need social media research topics.
Are you a student tasked with writing a project but unable to come up with suitable stats research topics? Here are the best ideas for statistical projects worth considering:
If the number of available statistics project topics has left you confused, here are some of the best thesis statements about social media to choose from:
Are you a student tasked with writing an essay on social issues research topics but struggling to settle on a topic? Here are some statistical experiment ideas you can center your research on.
The best AP Statistics project ideas that every student, especially those interested in research topics for STEM students, will want to write on include:
If you need some of the best economics research paper topics, here are the best statistics experiment ideas you can research:
If you need fresh ideas for a statistics project, here are crucial survey topics for statistics that will interest you.
Do you want to write on unique statistical experiment ideas? Here are some topics you do not want to miss out on:
Here are some of the most carefully selected stat research topics you can focus on.
The statistics final project examples above will stimulate your curiosity and test your abilities, and they can even be linked to some biochemistry topics and anatomy research paper topics. Writing about these statistics project ideas gives you a deeper grasp of the natural and social phenomena that affect our lives and the environment.
Title | Author | Supervisor |
---|---|---|
Statistical Methods for the Analysis and Prediction of Hierarchical Time Series Data with Applications to Demography | ||
Exponential Family Models for Rich Preference Ranking Data | ||
Bayesian methods for variable selection | , | |
Statistical methods for genomic sequencing data | ||
Addressing double dipping through selective inference and data thinning | ||
Methods for the Statistical Analysis of Preferences, with Applications to Social Science Data | ||
Estimating subnational health and demographic indicators using complex survey data | ||
Inference and Estimation for Network Data | ||
Mixture models to fit heavy-tailed, heterogeneous or sparse data | , | |
Interpretation and Validation for unsupervised learning |
Title | Author | Supervisor |
---|---|---|
Likelihood-based haplotype frequency modeling using variable-order Markov chains | ||
Statistical Divergences for Learning and Inference: Limit Laws and Non-Asymptotic Bounds | , | |
Methods, Models, and Interpretations for Spatial-Temporal Public Health Applications | ||
Statistical Methods for Clustering and High Dimensional Time Series Analysis | ||
Causal Structure Learning in High Dimensions | , | |
Missing Data Methods for Observational Health Dataset | ||
Geometric algorithms for interpretable manifold learning |
Title | Author | Supervisor |
---|---|---|
Statistical modeling of long memory and uncontrolled effects in neural recordings | ||
Improving Uncertainty Quantification and Visualization for Spatiotemporal Earthquake Rate Models for the Pacific Northwest | , | |
Distribution-free consistent tests of independence via marginal and multivariate ranks | ||
Causality, Fairness, and Information in Peer Review | , | |
Subnational Estimation of Period Child Mortality in a Low and Middle Income Countries Context | ||
Progress in nonparametric minimax estimation and high dimensional hypothesis testing | , | |
Likelihood Analysis of Causal Models | ||
Bayesian Models in Population Projections and Climate Change Forecast |
Title | Author | Supervisor |
---|---|---|
Statistical Methods for Adaptive Immune Receptor Repertoire Analysis and Comparison | ||
Statistical Methods for Geospatial Modeling with Stratified Cluster Survey Data | ||
Representation Learning for Partitioning Problems | ||
Estimation and Inference in Changepoint Models | ||
Space-Time Contour Models for Sea Ice Forecasting | , | |
Non-Gaussian Graphical Models: Estimation with Score Matching and Causal Discovery under Zero-Inflation | , | |
Scalable Learning in Latent State Sequence Models |
Title | Author | Supervisor |
---|---|---|
Latent Variable Models for Prediction & Inference with Proxy Network Measures | ||
Bayesian Hierarchical Models and Moment Bounds for High-Dimensional Time Series | , | |
Inferring network structure from partially observed graphs | ||
Fitting Stochastic Epidemic Models to Multiple Data Types | ||
Realized genome sharing in random effects models for quantitative genetic traits | ||
Estimation and testing under shape constraints | , | |
Large-Scale B Cell Receptor Sequence Analysis Using Phylogenetics and Machine Learning | ||
Statistical Methods for Manifold Recovery and $C^{1,1}$ Regression on Manifolds |
Title | Author | Supervisor |
---|---|---|
Topics in Statistics and Convex Geometry: Rounding, Sampling, and Interpolation | ||
Topics on Least Squares Estimation | ||
Discovering Interaction in Multivariate Time Series | ||
Nonparametric inference on monotone functions, with applications to observational studies | ||
Estimation and Testing Following Model Selection | ||
Model-Based Penalized Regression | ||
Bayesian Methods for Graphical Models with Limited Data | ||
Parameter Identification and Assessment of Independence in Multivariate Statistical Modeling | ||
Preferential sampling and model checking in phylodynamic inference | ||
Linear Structural Equation Models with Non-Gaussian Errors: Estimation and Discovery | ||
Coevolution Regression and Composite Likelihood Estimation for Social Networks |
Title | Author | Supervisor |
---|---|---|
"Scalable Manifold Learning and Related Topics" | ||
"Topics in Graph Clustering" | ||
"Methods for Estimation and Inference for High-Dimensional Models" | , | |
"Scalable Methods for the Inference of Identity by Descent" | ||
"Applications of Robust Statistical Methods in Quantitative Finance" |
Title | Author | Supervisor |
---|---|---|
"Testing Independence in High Dimensions & Identifiability of Graphical Models" | ||
"Likelihood-Based Inference for Partially Observed Multi-Type Markov Branching Processes" | ||
"Bayesian Methods for Inferring Gene Regulatory Networks" | , | |
"Finite Sampling Exponential Bounds" | ||
"Finite Population Inference for Causal Parameters" | ||
"Projection and Estimation of International Migration" | ||
"Statistical Hurdle Models for Single Cell Gene Expression: Differential Expression and Graphical Modeling" | , | |
"Space-Time Smoothing Models for Surveillance and Complex Survey Data" |
Title | Author | Supervisor |
---|---|---|
"Discrete-Time Threshold Regression for Survival Data with Time-Dependent Covariates" | ||
"Degeneracy, Duration, and Co-Evolution: Extending Exponential Random Graph Models (ERGM) for Social Network Analysis" | ||
"The Likelihood Pivot: Performing Inference with Confidence" | ||
"Lord's Paradox and Targeted Interventions: The Case of Special Education" | , | |
"Bayesian Modeling of a High Resolution Housing Price Index" | ||
"Phylogenetic Stochastic Mapping" | ||
"Theory and Methods for Tensor Data" |
Title | Author | Supervisor |
---|---|---|
"Monte Carlo Estimation of Identity by Descent in Populations" | ||
"Bayesian Spatial and Temporal Methods for Public Health Data" | , | |
"Functional Quantitative Genetics and the Missing Heritability Problem" | ||
"Predictive Modeling of Cholera Outbreaks in Bangladesh" | , | |
"Gravimetric Anomaly Detection Using Compressed Sensing" | ||
"R-Squared Inference Under Non-Normal Error" |
Title | Author | Supervisor |
---|---|---|
"An Algorithmic Framework for High Dimensional Regression with Dependent Variables" | ||
"Bayesian Population Reconstruction: A Method for Estimating Age- and Sex-Specific Vital Rates and Population Counts with Uncertainty from Fragmentary Data" | ||
"Bayesian Nonparametric Inference of Effective Population Size Trajectories from Genomic Data" | ||
"Modeling Heterogeneity Within and Between Matrices and Arrays" | ||
"Shape-Constrained Inference for Concave-Transformed Densities and their Modes" | ||
"Statistical Inference Using Kronecker Structured Covariance" | ||
"Learning and Manifolds: Leveraging the Intrinsic Geometry" |
Title | Author | Supervisor |
---|---|---|
"Tests for Differences between Least Squares and Robust Regression Parameter Estimates and Related Topics" | ||
"Bayesian Modeling of Health Data in Space and Time" | ||
"Coordinate-Free Exponential Families on Contingency Tables" | , | |
"Bayesian Modeling For Multivariate Mixed Outcomes With Applications To Cognitive Testing Data" |
Title | Author | Supervisor |
---|---|---|
"Bayesian Inference of Exponential-family Random Graph Models for Social Networks" | ||
"Statistical Models for Estimating and Predicting HIV/AIDS Epidemics" | ||
"Modeling the Game of Soccer Using Potential Functions" | ||
"Parametrizations of Discrete Graphical Models" | ||
"A Bayesian Surveillance System for Detecting Clusters of Non-Infectious Diseases" | ||
"Statistical Approaches to Analyze Mass Spectrometry Data" | , | |
"Seeing the trees through the forest; a competition model for growth and mortality" |
Title | Author | Supervisor |
---|---|---|
"Covariance estimation in the Presence of Diverse Types of Data" | ||
"Portfolio Optimization with Tail Risk Measures and Non-Normal Returns" | ||
"Convex analysis methods in shape constrained estimation." | ||
"Estimating social contact networks to improve epidemic simulation models" | ||
"Multivariate Geostatistics and Geostatistical Model Averaging" |
Title | Author | Supervisor |
---|---|---|
"A comparison of alternative methodologies for estimation of HIV incidence" | ||
"Bayesian Model Averaging and Multivariate Conditional Independence Structures" | ||
"Conditional tests for localizing trait genes" | ||
"Combining and Evaluating Probabilistic Forecasts" | ||
"Probabilistic weather forecasting using Bayesian model averaging" | ||
"Statistical Analysis of Portfolio Risk and Performance Measures: the Influence Function Approach" | ||
"Factor Model Monte Carlo Methods for General Fund-of-Funds Portfolio Management" | ||
"Statistical Models for Social Network Data and Processes" | ||
"Models for Heterogeneity in Heterosexual Partnership Networks" |
Title | Author | Supervisor |
---|---|---|
"Models and Inference of Transmission of DNA Methylation Patterns in Mammalian Somatic Cells" | ||
"Estimates and projections of the total fertility rate" | ||
"Nonparametric estimation of multivariate monotone densities" | ||
"Learning transcriptional regulatory networks from the integration of heterogeneous high-throughput data" | ||
"Extensions of Latent Class Transition Models with Application to Chronic Disability Survey Data" | ||
"Statistical Solutions to Some Problems in Medical Imaging" | , | |
"Statistical methods for peptide and protein identification using mass spectrometry" | ||
"Inference from partially-observed network data" |
Title | Author | Supervisor |
---|---|---|
"Probabilistic weather forecasting with spatial dependence" | ||
"Wavelet variance analysis for time series and random fields" | , | |
"Bayesian hierarchical curve registration" | ||
"'Up-and-Down' and the Percentile-Finding Problem" | ||
"Statistical Methodology for Longitudinal Social Network Data" |
Title | Author | Supervisor |
---|---|---|
"Learning in Spectral Clustering" | ||
"Variable selection and other extensions of the mixture model clustering framework" | ||
"Algorithms for Estimating the Cluster Tree of a Density" | ||
"Likelihood inference for population structure, using the coalescent" | ||
"Exploring rates and patterns of variability in gene conversion and crossover in the human genome" | ||
"Alleviating ecological bias in generalized linear models and optimal design with subsample data" | , | |
"Nonparametric estimation for current status data with competing risks" | , | |
"Goodness-of-fit statistics based on phi-divergences" | ||
"An efficient and flexible model for patterns of population genetic variation" |
Title | Author | Supervisor |
---|---|---|
"Alternative models for estimating genetic maps from pedigree data" | ||
"Allele-sharing methods for linkage detection using extended pedigrees" | ||
"Robust estimation of factor models in finance" | ||
"Using the structure of d-connecting paths as a qualitative measure of the strength of dependence" | , | |
"Alternative estimators of wavelet variance" | , , | |
"Bayesian robust analysis of gene expression microarray data" |
Title | Author | Supervisor |
---|---|---|
"Nonparametric estimation of a k-monotone density: A new asymptotic distribution theory" | ||
"Maximum likelihood estimation in Gaussian AMP chain graph models and Gaussian ancestral graph models" | , |
Title | Author | Supervisor |
---|---|---|
"The genetic structure of related recombinant lines" | ||
"Joint relationship inference from three or more individuals in the presence of genotyping error" | ||
"Personal characteristics and covariate measurement error in disease risk estimation" | , | |
"Model based and hybrid clustering of large datasets" | , |
Title | Author | Supervisor |
---|---|---|
"Applying graphical models to partially observed data-generating processes" | , | |
"Generalized linear mixed models: development and comparison of different estimation methods" | ||
"Practical importance sampling methods for finite mixture models and multiple imputation" |
Title | Author | Supervisor |
---|---|---|
"Bayesian inference for deterministic simulation models for environmental assessment" | ||
"Modeling recessive lethals: An explanation for excess sharing in siblings" | ||
"Estimation with bivariate interval censored data" | ||
"Latent models for cross-covariance" | , |
Title | Author | Supervisor |
---|---|---|
"Global covariance modeling: A deformation approach to anisotropy" | ||
"Likelihood inference for parametric models of dispersal" | ||
"Bayesian inference in hidden stochastic population processes" | ||
"Logic regression and statistical issues related to the protein folding problem" | , | |
"Likelihood ratio inference in regular and non-regular problems" | ||
"Estimating the association between airborne particulate matter and elderly mortality in Seattle, Washington using Bayesian Model Averaging" | , | |
"Nonhomogeneous hidden Markov models for downscaling synoptic atmospheric patterns to precipitation amounts" | , | |
"Detecting and extracting complex patterns from images and realizations of spatial point processes" | ||
"A model selection approach to partially linear regression" | ||
"Wavelet-based estimation for trend contaminated long memory processes" | , |
Title | Author | Supervisor |
---|---|---|
"Bayesian inference for noninvertible deterministic simulation models, with application to bowhead whale assessment" | ||
"Monte Carlo likelihood calculation for identity by descent data" | ||
"Fast automatic unsupervised image segmentation and curve detection in spatial point processes" | ||
"Semiparametric inference based on estimating equations in regressions models for two phase outcome dependent sampling" | , | |
"Capture-recapture estimation of bowhead whale population size using photo-identification data" | , | |
"Lifetime and disease onset distributions from incomplete observations" | ||
"Statistical approaches to distinct value estimation" | , | |
"Generalization of boosting algorithms and applications of Bayesian inference for massive datasets" | , |
Title | Author | Supervisor |
---|---|---|
"Bayesian modeling of highly structured systems using Markov chain Monte Carlo" | ||
"Assessing nonstationary time series using wavelets" | , | |
"Lattice conditional independence models for incomplete multivariate data and for seemingly unrelated regressions" | , | |
"Estimation for counting processes with incomplete data" | ||
"Regularization techniques for linear regression with a large set of carriers" | ||
"Large sample theory for pseudo maximum likelihood estimates in semiparametric models" | ||
"Additive mixture models for multichannel image data" | ||
"Application of ridge regression for improved estimation of parameters in compartmental models" |
Title | Author | Supervisor |
---|---|---|
"Bayesian model averaging in censored survival models" | ||
"Bayesian information retrieval" | ||
"Statistical inference for partially observed markov population processes" | ||
"Tools for the advancement of undergraduate statistics education" | ||
"A new learning procedure in acyclic directed graphs" | ||
"Phylogenies via conditional independence modeling" |
Title | Author | Supervisor |
---|---|---|
"Variability estimation in linear inverse problems" | ||
"Inference in a discrete parameter space" | ||
"Bootstrapping functional m-estimators" |
Title | Author | Supervisor |
---|---|---|
"Semiparametric estimation of major gene and random environmental effects for age of onset" | ||
"Statistical analysis of biological monitoring data: State-space models for species compositions" | ||
"Estimation of heterogeneous space-time covariance" |
Title | Author | Supervisor |
---|---|---|
"Spatial applications of Markov chain Monte Carlo for bayesian inference" | ||
"Accounting for model uncertainty in linear regression" | ||
"Robust estimation in point processes" | ||
"Multilevel modeling of discrete event history data using Markov chain Monte Carlo methods" | ||
"Estimation in regression models with interval censoring" |
Title | Author | Supervisor |
---|---|---|
"State-space modeling of salmon migration and Monte Carlo Alternatives to the Kalman filter" | ||
"The Poisson clumping heuristic and the survival of genome in small pedigrees" | ||
"Markov chain Monte Carlo estimates of probabilities on complex structures" | ||
"A class of stochastic models for relating synoptic atmospheric patterns to local hydrologic phenomena" | ||
"A Bayesian framework and importance sampling methods for synthesizing multiple sources of evidence and uncertainty linked by a complex mechanistic model" |
Title | Author | Supervisor |
---|---|---|
"Auxiliary and missing covariate problems in failure time regression analysis" | ||
"A high order hidden markov model" | ||
"Bayesian methods for the analysis of misclassified or incomplete multivariate discrete data" |
Title | Author | Supervisor |
---|---|---|
"The weighted likelihood bootstrap and an algorithm for prepivoting" | ||
"General-weights bootstrap of the empirical process" |
Title | Author | Supervisor |
---|---|---|
"Modelling agricultural field trials in the presence of outliers and fertility jumps" | ||
"Modeling and bootstrapping for non-gaussian time series" | ||
"Genetic restoration on complex pedigrees" | ||
"Incorporating covariates into a beta-binomial model with applications to medicare policy: A Bayes/empirical Bayes approach" | ||
"Likelihood and exponential families" |
Title | Author | Supervisor |
---|---|---|
"Estimation of mixing and mixed distributions" | ||
"Classical inference in spatial statistics" |
Title | Author | Supervisor |
---|---|---|
"Exploratory methods for censored data" | ||
"Aspects of robust analysis in designed experiments" | ||
"Diagnostics for time series models" | ||
"Constrained cluster analysis and image understanding" |
Title | Author | Supervisor |
---|---|---|
"The data viewer: A program for graphical data analysis" | ||
"Additive principal components: A method for estimating additive constraints with small variance from multivariate data" | ||
"Kullback-Leibler estimation of probability measures with an application to clustering" | ||
"Time series models for continuous proportions" |
Title | Author | Supervisor |
---|---|---|
"Estimation for infinite variance autoregressive processes" | ||
"A computer system for Monte Carlo experimentation" |
Title | Author | Supervisor |
---|---|---|
"Robust estimation for the errors-in-variables model" | ||
"Robust statistics on compact metric spaces" | ||
"Weak convergence and a law of the iterated logarithm for processes indexed by points in a metric space" |
Title | Author | Supervisor |
---|---|---|
"The statistics of long memory processes" |
Dissertation Advisor: Andrew Wilson
Initial job placement: AI Scientist - AWS AI Labs
Dissertation Advisor: Sumanta Basu and Madeleine Udell
Initial job placement: Applied Scientist - Amazon
Dissertation Advisor: Giles Hooker
Initial job placement: Data & Applied Scientist - Microsoft
Dissertation Advisor: Giles Hooker and Martin Wells
Initial job placement: Microsoft
Dissertation Advisor: Martin Wells
Dissertation Advisor: Giles Hooker
Initial job placement: Data Scientist - Google
This is a brief overview of thesis writing; for more information, please see our website. Senior theses in Statistics cover a wide range of topics, from applied to theoretical. Typically, a senior thesis is expected to have one of the following three flavors:
1. Novel statistical theory or methodology, supported by extensive mathematical and/or simulation results, along with a clear account of how the research extends or relates to previous related work.
2. An analysis of a complex data set that advances understanding in a related field, such as public health, economics, government, or genetics. Such a thesis may rely entirely on existing methods, but should give useful results and insights into an interesting applied problem.
3. An analysis of a complex data set in which new methods or modifications of published methods are required. While the thesis does not necessarily contain an extensive mathematical study of the new methods, it should contain strong plausibility arguments or simulations supporting the use of the new methods.
A good thesis is clear, readable, and well-motivated, justifying the applicability of the methods used rather than, for example, mechanically running regressions without discussing the assumptions (and whether they are plausible), performing diagnostics, and checking whether the conclusions make sense.
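To make the contrast with "mechanically running regressions" concrete, here is a minimal sketch of fitting a straight line and then checking one assumption (linearity) on the residuals. The data are synthetic, and the helper names `fit_ols` and `curvature_check` are hypothetical, introduced only for this illustration. The check itself is a crude heuristic, not a formal diagnostic test.

```python
import numpy as np


def fit_ols(x, y):
    """Fit y = a + b*x by least squares; return (coefficients, fitted, residuals)."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    return beta, fitted, y - fitted


def curvature_check(x, residuals):
    """Correlate residuals with the squared, centered predictor.

    A value far from 0 hints that a straight line misses curvature in the data.
    This is a crude illustrative heuristic, not a formal test.
    """
    z = (x - x.mean()) ** 2
    return float(np.corrcoef(z, residuals)[0, 1])


# Synthetic data, made up purely for illustration.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 200)
y_linear = 2.0 + 0.5 * x + rng.normal(0.0, 1.0, x.size)     # truly linear
y_curved = 2.0 + 0.5 * x**2 + rng.normal(0.0, 1.0, x.size)  # quadratic trend

beta_lin, _, resid_lin = fit_ols(x, y_linear)
beta_cur, _, resid_cur = fit_ols(x, y_curved)

print("linear data: slope", beta_lin[1], "curvature check", curvature_check(x, resid_lin))
print("curved data: slope", beta_cur[1], "curvature check", curvature_check(x, resid_cur))
```

Both fits return a slope without complaint, but only the residual check reveals that the straight-line model is a poor description of the curved data. A thesis would normally go further, for example with residual plots and a discussion of whether the error assumptions are plausible.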
This collection contains theses and dissertations from the Department of Statistics and Actuarial Sciences, collected from the Scholarship@Western Electronic Thesis and Dissertation Repository
Studies of compound risk models with dependence and parameter uncertainty , Dechen Gao
Parameter Estimation for Normally Distributed Grouped Data and Clustering Single-Cell RNA Sequencing Data via the Expectation-Maximization Algorithm , Zahra Aghahosseinalishirazi
Statistical modelling and applications for sustainable-development goals , Yiyang Chen
Multivariate Regression Analysis for Data with Measurement Error, Missing Values, and/or Sparsity Structures , Jingyu Cui
Addressing the Impact of Time-Dependent Social Groupings on Animal Survival and Recapture Rates in Mark-Recapture Studies , Alexandru M. Draghici
Generalized Poisson random variables: Their distributional properties and actuarial applications , Pouya Faroughi
Optimizing Dynamic Treatment Regimes with Q-Learning: Complications due to Error-Prone Data and Applications to COVID-19 Data , Yasin Khadem Charvadeh
Estimating the spatial correlation structure of measurement error in functional magnetic resonance imaging (fMRI) to improve multivariate inference , Lingling Lin
Cyber risk valuation via a hidden Markov-modulated modelling approach , Yuying Li
Advances in Copula Estimation and Distribution Theory , Yishan Zang
Modelling long-term security returns , Xinghan Zhu
Efficiency Improvements in the Least-Squares Monte Carlo Algorithm , François-Michel Boire
Portfolio Optimization Analysis in the Family of 4/2 Stochastic Volatility Models , Yuyang Cheng
Early-Warning Alert Systems for Financial-Instability Detection: An HMM-Driven Approach , Xing Gu
The Analysis of Mark-recapture Data with Individual Heterogeneity via the H-likelihood , Han-na Kim
Statistical Applications to the Management of Intensive Care and Step-down Units , Yawo Mamoua Kobara
Regression-based Methods for Dynamic Treatment Regimes with Mismeasured Covariates or Misclassified Response , Dan Liu
Statistical Roles of the G-expectation Framework in Model Uncertainty: the Semi-G-structure as a Stepping Stone , Yifan Li
Risk theory: data-driven models , Yang Miao
New Developments on the Estimability and the Estimation of Phase-Type Actuarial Models , Cong Nie
Copulas, maximal dependence, and anomaly detection in bi-variate time series , Ning Sun
Interdisciplinary Knowledge Exchange in Statistics with Applications in Fire Science and Statistical Education , Chelsea Uggenti
On the Geometry of Multi-Affine Polynomials , Junquan Xiao
Understanding Deep Learning with Noisy Labels , Li Yi
An Analysis of Weighted Least Squares Monte Carlo , Xiaotian Zhu
Application Of A Polynomial Affine Method In Dynamic Portfolio Choice , Yichen Zhu
A class of phase-type ageing models and their lifetime distributions , Boquan Cheng
Application of Stochastic Control to Portfolio Optimization and Energy Finance , Junhe Chen
Making Sense of Noisy Data: Theory and Applications , Lingzhi Chen
The Mean-Reverting 4/2 Stochastic Volatility Model: Properties And Financial Applications , Zhenxian Gong
Compound Sums, Their Distributions, and Actuarial Pricing , Ang Li
On the Estimation of Heston-Nandi GARCH Using Returns and/or Options: A Simulation-based Approach , Xize Ye
A Treatise of PD-LGD Correlation Modelling , Wisdom S. Avusuglo WSA
Visualization and Joint Analysis of Monitored Multivariate Spatio-Temporal Data with Applications to Forest Fire Modelling and Sports Analytics , Devan Becker
Generalized 4/2 Factor Model , Yuyang Cheng
Renewable-energy resources, economic growth and their causal link , Yiyang Chen
Some Insurance Options on Stochastic Drawdowns , Filip Dikic
Extensions of Classification Method Based on Quantiles , Yuanhao Lai
Point Process Modelling of Objects in the Star Formation Complexes of the M33 Galaxy , Dayi Li
Classification-based method for estimating dynamic treatment regimes , Junwei Shen
Statistical Methods with a Focus on Joint Outcome Modeling and on Methods for Fire Science , Da Zhong Xi
Ranking comments: An Entropy-based Method with Word Embedding Clustering , Yuyang Zhang
A computationally efficient methodology in pricing a guaranteed minimum accumulation benefit , Yiming Huang
Some Recent Developments on Pareto-optimal Reinsurance , Wenjun Jiang
Classification with Measurement Error in Covariates Or Response, with Application to Prostate Cancer Imaging Study , Kexin Luo
Exploring the Estimability of Mark-Recapture Models with Individual, Time-Varying Covariates using the Scaled Logit Link Function , Jiaqi Mu
Split credibility: A two-dimensional semi-linear credibility model , Jingbing Qiu
Advances in Moment-Based Distributional Methodologies , Yishan Zang
How to Rank Answers in Text Mining , Guandong Zhang
On the Sparre-Andersen Risk Models , Ruixi Zhang
Valuation and Risk Management of Some Longevity and P&C Insurance Products , Yixing Zhao
Modelling the Common Risk among Equities Using a New Time Series Model , Jingjia Chu
Stochastic modelling of implied correlation index and herd behavior index. Evidence, properties and pricing. , Lin Fang
Optimal Trading of a Storable Commodity via Forward Markets , Behzad Ghafouri
Statistical Modeling of CO2 Flux Data , Fang He
Advances in the Modeling of Heavy-tailed Distributions , Sang Jin Kang
The Statistical Exploration in the $G$-expectation Framework: The Pseudo Simulation and Estimation of Variance Uncertainty , Yifan Li
Statistical tools for assessment of spatial properties of mutations observed under the microarray platform , Bin Luo
Valuation of Multiple Exercise Option Using a Modified Longstaff and Schwartz Approach , Rahim Mohammadhasani Khorasany
Statistical Applications in Healthcare Systems , Maryam Mojalal
Exact Box-Cox Analysis , Samira Soleymani
Anisotropic kernel smoothing for change-point data with an analysis of fire spread rate variability , John Ronald James Thompson
Some applications of higher-order hidden Markov models in the exotic commodity markets , Heng Xiong
Advances in Semi-Nonparametric Density Estimation and Shrinkage Regression , Hossein Zareamoghaddam
Analysis Challenges for High Dimensional Data , Bangxin Zhao
Properties of k-isotropic functions , Tianpei Jiang
Data-Adaptive Kernel Support Vector Machine , Xin Liu
Annuity Product Valuation and Risk Measurement under Correlated Financial and Longevity Risks , Soohong Park
Statistical Modelling, Optimal Strategies and Decisions in Two-Period Economies , Jiang Wu
Joint Models for Spatial and Spatio-Temporal Point Processes , Alisha Albert-Green
Applications of Credit Scoring Models , Mimi Mei Ling Chong
Joint Analysis of Zero-heavy Longitudinal Outcomes: Models and Comparison of Study Designs , Erin R. Lundy
Data Smoothing Techniques: Historical and Modern , Lori L. Murray
Joint Modelling in Liver Transplantation , Elizabeth M. Renouf
Probability Models for Health Care Operations with Application to Emergency Medicine , Azaz Bin Sharif
Advances in Portmanteau Diagnostic Tests , Jinkun Xiao
Actuarial Modelling with Mixtures of Markov Chains , Yuzhou Zhang
Healthy And Unhealthy Statistics: Examining The Impact Of Erroneous Statistical Analyses In Health-Related Research , Britney Allen
Recent Advances in Accumulating Priority Queues , Na Li
Quantitative Techniques for Spread Trading in Commodity Markets , Mir Hashem Moosavi Avonleghi
A Novel Method for Assessing Co-monotonicity: an Interplay between Mathematics and Statistics with Applications , Danang T. Qoyyimi
Completely monotone and Bernstein functions with convexity properties on their measures , Shen Shan
Online Nonparametric Estimation of Stochastic Differential Equations , Xin Wang
On the Dual Risk Models , Chen Yang
Statistical methods for the analysis of RNA sequencing data , Man-Kee Maggie Chu
Valuation and Risk Measurement of Guaranteed Annuity Options under Stochastic Environment , Huan Gao
Statistical Applications in Wildfire Management and Prediction , Lengyi Han
Computing and Approximation Methods for the Distribution of Multivariate Aggregate Claims , Tao Jin
The Doubly Adaptive LASSO Methods for Time Series Analysis , Zi Zhen Liu
Risk models with dependence and perturbation , Zhong Li
Censored Time Series Analysis , Nagham Muslim Mohammad
A Spatial Analysis of Forest Fire Survival and a Marked Cluster Process for Simulating Fire Load , Amy A. Morin
Estimation of Hidden Markov Models and Their Applications in Finance , Anton Tenyakov
Perfect and Nearly Perfect Sampling of Work-conserving Queues , Yaofei Xiong
Decision Theory Based Models in Insurance and Beyond , Raymond Ye Zhang
Seasonal Decomposition for Geographical Time Series using Nonparametric Regression , Hyukjun Gweon
Stochastic simulation and spatial statistics of large datasets using parallel computing , Jonathan SW Lee
Flexible Partially Linear Single Index Regression Models for Multivariate Survival Data , Na Lei
Joint outcome modeling using shared frailties with application to temporal streamflow data , Lihua Li
Asymptotic Theory for GARCH-in-mean Models , Weiwei Liu
©1878 - 2016 Western University
50 Topic Ideas To Kickstart Your Research Project
If you’re just starting out exploring data science-related topics for your dissertation, thesis or research project, you’ve come to the right place. In this post, we’ll help kickstart your research by providing a hearty list of data science and analytics-related research ideas , including examples from recent studies.
PS – This is just the start…
We know it’s exciting to run through a list of research topics, but please keep in mind that this list is just a starting point . The topic ideas provided here are intentionally broad and generic , so you will need to develop them further. Nevertheless, they should inspire some ideas for your project.
To develop a suitable research topic, you’ll need to identify a clear and convincing research gap , and a viable plan to fill that gap. If this sounds foreign to you, check out our free research topic webinar that explores how to find and refine a high-quality research topic, from scratch. Alternatively, consider our 1-on-1 coaching service .
While the ideas we’ve presented above are a decent starting point for finding a research topic, they are fairly generic and non-specific. So, it helps to look at actual studies in the data science and analytics space to see how this all comes together in practice.
Below, we’ve included a selection of recent studies to help refine your thinking. These are actual studies, so they can provide some useful insight as to what a research topic looks like in practice.
As you can see, these research topics are a lot more focused than the generic topic ideas we presented earlier. So, for you to develop a high-quality research topic, you’ll need to get specific and laser-focused on a specific context with specific variables of interest.
If you’re still unsure about how to find a quality research topic, check out our Research Topic Kickstarter service, which is the perfect starting point for developing a unique, well-justified research topic.
Department of Statistics: Dissertations, Theses, and Student Work
Examining the Effect of Word Embeddings and Preprocessing Methods on Fake News Detection , Jessica Hauschild
Exploring Experimental Design and Multivariate Analysis Techniques for Evaluating Community Structure of Bacteria in Microbiome Data , Kelsey Karnik
Human Perception of Exponentially Increasing Data Displayed on a Log Scale Evaluated Through Experimental Graphics Tasks , Emily Robinson
Factors Influencing Student Outcomes in a Large, Online Simulation-Based Introductory Statistics Course , Ella M. Burnham
Comparing Machine Learning Techniques with State-of-the-Art Parametric Prediction Models for Predicting Soybean Traits , Susweta Ray
Using Stability to Select a Shrinkage Method , Dean Dustin
Statistical Methodology to Establish a Benchmark for Evaluating Antimicrobial Resistance Genes through Real Time PCR assay , Enakshy Dutta
Group Testing Identification: Objective Functions, Implementation, and Multiplex Assays , Brianna D. Hitt
Community Impact on the Home Advantage within NCAA Men's Basketball , Erin O'Donnell
Optimal Design for a Causal Structure , Zaher Kmail
Role of Misclassification Estimates in Estimating Disease Prevalence and a Non-Linear Approach to Study Synchrony Using Heart Rate Variability in Chickens , Dola Pathak
A Characterization of a Value Added Model and a New Multi-Stage Model For Estimating Teacher Effects Within Small School Systems , Julie M. Garai
Methods to Account for Breed Composition in a Bayesian GWAS Method which Utilizes Haplotype Clusters , Danielle F. Wilson-Wells
Beta-Binomial Kriging: A New Approach to Modeling Spatially Correlated Proportions , Aimee Schwab
Simulations of a New Response-Adaptive Biased Coin Design , Aleksandra Stein
Modeling the Dynamic Processes of Challenge and Recovery (Stress and Strain) over Time , Fan Yang
A New Approach to Modeling Multivariate Time Series on Multiple Temporal Scales , Tucker Zeleny
A Reduced Bias Method of Estimating Variance Components in Generalized Linear Mixed Models , Elizabeth A. Claassen
New Statistical Methods for Analysis of Historical Data from Wildlife Populations , Trevor Hefley
Informative Retesting for Hierarchical Group Testing , Michael S. Black
A Test for Detecting Changes in Closed Networks Based on the Number of Communications Between Nodes , Christopher S. Wichman
Group Testing Regression Models , Boan Zhang
A Comparison of Spatial Prediction Techniques Using Both Hard and Soft Data , Megan L. Liedtke Tesar
Studying the Handling of Heat Stressed Cattle Using the Additive Bi-Logistic Model to Fit Body Temperature , Fan Yang
Estimating Teacher Effects Using Value-Added Models , Jennifer L. Green
Sequence Comparison and Stochastic Model Based on Multi-Order Markov Models , Xiang Fang
Detecting Differentially Expressed Genes While Controlling the False Discovery Rate for Microarray Data , Shuo Jiao
Spatial Clustering Using the Likelihood Function , April Kerby
Fully Exponential Laplace Approximation EM Algorithm for Nonlinear Mixed Effects Models , Meijian Zhou
Published on November 11, 2022 by Shona McCombes and Tegan George. Revised on November 20, 2023.
Choosing your dissertation topic is the first step in making sure your research goes as smoothly as possible. When choosing a topic, it’s important to consider:
You can follow these steps to begin narrowing down your ideas.
Step 1: Check the requirements
Step 2: Choose a broad field of research
Step 3: Look for books and articles
Step 4: Find a niche
Step 5: Consider the type of research
Step 6: Determine the relevance
Step 7: Make sure it’s plausible
Step 8: Get your topic approved
Other interesting articles
Frequently asked questions about dissertation topics
The very first step is to check your program’s requirements. This determines the scope of what it is possible for you to research.
Some programs have stricter requirements than others. You might be given nothing more than a word count and a deadline, or you might have a restricted list of topics and approaches to choose from. If in doubt about what is expected of you, always ask your supervisor or department coordinator.
Start by thinking about your areas of interest within the subject you’re studying. Examples of broad ideas include:
To get a more specific sense of the current state of research on your potential topic, skim through a few recent issues of the top journals in your field. Be sure to check out their most-cited articles in particular. For inspiration, you can also search Google Scholar , subject-specific databases , and your university library’s resources.
As you read, note down any specific ideas that interest you and make a shortlist of possible topics. If you’ve written other papers, such as a 3rd-year paper or a conference paper, consider how those topics can be broadened into a dissertation.
After doing some initial reading, it’s time to start narrowing down options for your potential topic. This can be a gradual process, and should get more and more specific as you go. For example, from the ideas above, you might narrow it down like this:
All of these topics are still broad enough that you’ll find a huge amount of books and articles about them. Try to find a specific niche where you can make your mark, such as: something not many people have researched yet, a question that’s still being debated, or a very current practical issue.
At this stage, make sure you have a few backup ideas — there’s still time to change your focus. If your topic doesn’t make it through the next few steps, you can try a different one. Later, you will narrow your focus down even more in your problem statement and research questions .
There are many different types of research , so at this stage, it’s a good idea to start thinking about what kind of approach you’ll take to your topic. Will you mainly focus on:
Many dissertations will combine more than one of these. Sometimes the type of research is obvious: if your topic is post-war Irish poetry, you will probably mainly be interpreting poems. But in other cases, there are several possible approaches. If your topic is reproductive rights in South America, you could analyze public policy documents and media coverage, or you could gather original data through interviews and surveys .
You don’t have to finalize your research design and methods yet, but the type of research will influence which aspects of the topic it’s possible to address, so it’s wise to consider this as you narrow down your ideas.
It’s important that your topic is interesting to you, but you’ll also have to make sure it’s academically, socially or practically relevant to your field.
The easiest way to make sure your research is relevant is to choose a topic that is clearly connected to current issues or debates, either in society at large or in your academic discipline. The relevance must be clearly stated when you define your research problem .
Before you make a final decision on your topic, consider again the length of your dissertation, the timeframe in which you have to complete it, and the practicalities of conducting the research.
Will you have enough time to read all the most important academic literature on this topic? If there’s too much information to tackle, consider narrowing your focus even more.
Will you be able to find enough sources or gather enough data to fulfil the requirements of the dissertation? If you think you might struggle to find information, consider broadening or shifting your focus.
Do you have to go to a specific location to gather data on the topic? Make sure that you have enough funding and practical access.
Last but not least, will the topic hold your interest for the length of the research process? To stay motivated, it’s important to choose something you’re enthusiastic about!
Most programmes will require you to submit a brief description of your topic, called a research prospectus or proposal .
Remember, if you discover that your topic is not as strong as you thought it was, it’s usually acceptable to change your mind and switch focus early in the dissertation process. Just make sure you have enough time to start on a new topic, and always check with your supervisor or department.
Formulating a main research question can be a difficult task. Overall, your question should contribute to solving the problem that you have defined in your problem statement .
However, it should also fulfill criteria in three main areas:
All research questions should be:
You can assess information and arguments critically by asking certain questions about the source. You can use the CRAAP test , focusing on the currency , relevance , authority , accuracy , and purpose of a source of information.
Ask questions such as:
A dissertation prospectus or proposal describes what or who you plan to research for your dissertation. It delves into why, when, where, and how you will do your research, as well as helps you choose a type of research to pursue. You should also determine whether you plan to pursue qualitative or quantitative methods and what your research design will look like.
It should outline all of the decisions you have taken about your project, from your dissertation topic to your hypotheses and research objectives , ready to be approved by your supervisor or committee.
Note that some departments require a defense component, where you present your prospectus to your committee orally.
The best way to remember the difference between a research plan and a research proposal is that they have fundamentally different audiences. A research plan helps you, the researcher, organize your thoughts. On the other hand, a dissertation proposal or research proposal aims to convince others (e.g., a supervisor, a funding body, or a dissertation committee) that your research topic is relevant and worthy of being conducted.
McCombes, S. & George, T. (2023, November 20). How to Choose a Dissertation Topic | 8 Steps to Follow. Scribbr. Retrieved June 28, 2024, from https://www.scribbr.com/research-process/dissertation-topic/
Math/Stats Thesis and Colloquium Topics
Updated: April 2024
The degree with honors in Mathematics or Statistics is awarded to the student who has demonstrated outstanding intellectual achievement in a program of study which extends beyond the requirements of the major. The principal considerations for recommending a student for the degree with honors will be: Mastery of core material and skills, breadth and, particularly, depth of knowledge beyond the core material, ability to pursue independent study of mathematics or statistics, originality in methods of investigation, and, where appropriate, creativity in research.
An honors program normally consists of two semesters (MATH/STAT 493 and 494) and a winter study (WSP 031) of independent research, culminating in a thesis and a presentation. Under certain circumstances, the honors work can consist of coordinated study involving one semester (MATH/STAT 493 or 494) and a winter study (WSP 030) of independent research, culminating in a “mini-thesis” and a presentation. At least one semester should be in addition to the major requirements, and thesis courses do not count as 400-level senior seminars.
An honors program in actuarial studies requires significant achievement on four appropriate examinations of the Society of Actuaries.
Highest honors will be reserved for the rare student who has displayed exceptional ability, achievement or originality. Such a student usually will have written a thesis, or pursued actuarial honors and written a mini-thesis. An outstanding student who writes a mini-thesis, or pursues actuarial honors and writes a paper, might also be considered. In all cases, the award of honors and highest honors is the decision of the Department.
Here is a list of possible colloquium topics that different faculty are willing and eager to advise. You can talk to several faculty about any colloquium topic, the sooner the better, at least a month or two before your talk. For various reasons faculty may or may not be willing or able to advise your colloquium, which is another reason to start early.
RESEARCH INTERESTS OF MATHEMATICS AND STATISTICS FACULTY
Here is a list of faculty interests and possible thesis topics. You may use this list to select a thesis topic or you can use the list below to get a general idea of the mathematical interests of our faculty.
Colin Adams (On Leave 2024 – 2025)
Research interests: Topology and tiling theory. I work in low-dimensional topology. Specifically, I work in the two fields of knot theory and hyperbolic 3-manifold theory and develop the connections between the two. Knot theory is the study of knotted circles in 3-space, and it has applications to chemistry, biology and physics. I am also interested in tiling theory and have been working with students in this area as well.
Hyperbolic 3-manifold theory utilizes hyperbolic geometry to understand 3-manifolds, which can be thought of as possible models of the spatial universe.
Possible thesis topics:
Possible colloquium topics : Particularly interested in topology, knot theory, graph theory, tiling theory and geometry but will consider other topics.
Christina Athanasouli
Research Interests: Differential equations, dynamical systems (both smooth and non-smooth), mathematical modeling with applications in biological and mechanical systems
My research focuses on analyzing mathematical models that describe various phenomena in Mathematical Neuroscience and Engineering. In particular, I work on understanding 1) the underlying mechanisms of human sleep (e.g. how sleep patterns change with development or due to perturbations), and 2) potential design or physical factors that may influence the dynamics in vibro-impact mechanical systems for the purpose of harvesting energy. Mathematically, I use various techniques from dynamical systems and incorporate both numerical and analytical tools in my work.
Possible colloquium topics: Topics in applied mathematics, such as:
Julie Blackwood
Research Interests: Mathematical modeling, theoretical ecology, population biology, differential equations, dynamical systems.
My research uses mathematical models to uncover the complex mechanisms generating ecological dynamics, and when applicable, emphasis is placed on evaluating intervention programs. My research is in various ecological areas including (I) invasive species management by using mathematical and economic models to evaluate the costs and benefits of control strategies, and (II) disease ecology by evaluating competing mathematical models of the transmission dynamics for both human and wildlife diseases.
Each topic (1-3) can focus on a case study of a particular invasive species or disease, and/or can investigate the effects of ecological properties (spatial structure, resource availability, contact structure, etc.) of the system.
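As a concrete picture of what a mathematical model of transmission dynamics can look like, here is a minimal sketch of the standard SIR (susceptible, infected, recovered) compartment model, stepped forward with a simple Euler scheme. The SIR model is chosen only as a familiar illustration and is not taken from the research described above; the function name `simulate_sir` and all parameter values are hypothetical.

```python
def simulate_sir(beta, gamma, s0, i0, days, dt=0.1):
    """Euler-step the SIR equations on population fractions.

    dS/dt = -beta*S*I,  dI/dt = beta*S*I - gamma*I,  dR/dt = gamma*I
    Returns a list of (S, I, R) tuples, one per time step.
    """
    s, i, r = s0, i0, 1.0 - s0 - i0
    history = [(s, i, r)]
    for _ in range(round(days / dt)):
        new_infections = beta * s * i * dt   # flow from S to I
        new_recoveries = gamma * i * dt      # flow from I to R
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


if __name__ == "__main__":
    # Illustrative parameters only: beta/gamma = 3, so an outbreak takes off.
    run = simulate_sir(beta=0.3, gamma=0.1, s0=0.999, i0=0.001, days=200)
    peak = max(i for _, i, _ in run)
    print(f"peak infected fraction: {peak:.3f}")
    print(f"final susceptible fraction: {run[-1][0]:.3f}")
```

An intervention can be mimicked in this sketch by lowering `beta` (fewer contacts) or raising `gamma` (faster recovery) and comparing the resulting peaks.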
Possible colloquium topics: Any topics in applied mathematics, such as:
Research Interest : Statistical methodology and applications. One of my research topics is variable selection for high-dimensional data. I am interested in traditional and modern approaches for selecting variables from a large candidate set in different settings and studying the corresponding theoretical properties. The settings include linear model, partial linear model, survival analysis, dynamic networks, etc. Another part of my research studies the mediation model, which examines the underlying mechanism of how variables relate to each other. My research also involves applying existing methods and developing new procedures to model the correlated observations and capture the time-varying effect. I am also interested in applications of data mining and statistical learning methods, e.g., their applications in analyzing the rhetorical styles in English text data.
Possible colloquium topics: I am open to any problems in statistical methodology and applications, not limited to my research interests and the possible thesis topics above.
Richard De Veaux
Research interests: Statistics.
My research interests are in both statistical methodology and in statistical applications. For the first, I look at different methods and try to understand why some methods work well in particular settings, or more creatively, to try to come up with new methods. For the second, I work in collaboration with an investigator (e.g. scientist, doctor, marketing analyst) on a particular statistical application. I have been especially interested in problems dealing with large data sets and the associated modeling tools that work for these problems.
Possible colloquium topics:
Thomas Garrity (On Leave 2024 – 2025)
Research interest: Number Theory and Dynamics.
My area of research is officially called “multi-dimensional continued fraction algorithms,” an area that touches many different branches of mathematics (which is one reason it is both interesting and rich). In recent years, students writing theses with me have used serious tools from geometry, dynamics, ergodic theory, functional analysis, linear algebra, differentiability conditions, and combinatorics. (No single person has used all of these tools.) It is an area to see how mathematics is truly interrelated, forming one coherent whole.
While my original interest in this area stemmed from trying to find interesting methods for expressing real numbers as sequences of integers (the Hermite problem), over the years this has led to me interacting with many different mathematicians, and to me learning a whole lot of math. My thesis students have had much the same experiences, including the emotional rush of discovery and the occasional despair of frustration. The whole experience of writing a thesis should be intense, and ultimately rewarding. Also, since this area of math has so many facets and has so many entrance points, I have had thesis students from wildly different mathematical backgrounds do wonderful work; hence all are welcome.
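The idea of expressing a real number as a sequence of integers can be seen in the classical one-dimensional continued fraction, which the multi-dimensional algorithms generalize. The sketch below is only an illustration of that classical case, not one of the algorithms studied in these theses. It uses the standard integer recurrence for the continued fraction of a square root; the function name `sqrt_continued_fraction` is hypothetical.

```python
from math import isqrt


def sqrt_continued_fraction(n, terms):
    """Return the first `terms` partial quotients of sqrt(n), n not a perfect square.

    Uses exact integer arithmetic: at each step the current remainder has the
    form (sqrt(n) + m) / d, and a is its integer part.
    """
    a0 = isqrt(n)
    if a0 * a0 == n:
        raise ValueError("n must not be a perfect square")
    m, d, a = 0, 1, a0
    quotients = [a0]
    while len(quotients) < terms:
        m = d * a - m
        d = (n - m * m) // d
        a = (a0 + m) // d
        quotients.append(a)
    return quotients


if __name__ == "__main__":
    print(sqrt_continued_fraction(2, 8))
    print(sqrt_continued_fraction(7, 9))
```

For quadratic irrationals such as sqrt(2) and sqrt(7), the resulting integer sequence is eventually periodic, which is the property the Hermite problem asks to extend to other algebraic numbers.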
Possible colloquium topics: Any interesting topic in mathematics.
Leo Goldmakher
Research interests: Number theory and arithmetic combinatorics.
I’m interested in quantifying structure and randomness within naturally occurring sets or sequences, such as the prime numbers, or the sequence of coefficients of a continued fraction, or a subset of a vector space. Doing so typically involves using ideas from analysis, probability, algebra, and combinatorics.
Possible thesis topics:
Anything in number theory or arithmetic combinatorics.
Possible colloquium topics: I’m happy to advise a colloquium in any area of math.
Susan Loepp
Research interests: Commutative Algebra. I study algebraic structures called commutative rings. Specifically, I have been investigating the relationship between local rings and their completion. One defines the completion of a ring by first defining a metric on the ring and then completing the ring with respect to that metric. I am interested in what kinds of algebraic properties a ring and its completion share. This relationship has proven to be intricate and quite surprising. I am also interested in the theory of tight closure, and Homological Algebra.
Topics in Commutative Algebra including:
Possible colloquium topics: Any topics in mathematics and especially commutative algebra/ring theory.
Steven Miller
For more information and references, see http://www.williams.edu/Mathematics/sjmiller/public_html/index.htm
Research interests : Analytic number theory, random matrix theory, probability and statistics, graph theory.
My main research interest is in the distribution of zeros of L-functions. The most studied of these is the Riemann zeta function, $\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}$. The importance of this function becomes apparent when we notice that it can also be written as $\prod_{p \text{ prime}} \frac{1}{1 - 1/p^s}$; this function relates properties of the primes to those of the integers (and we know where the integers are!). It turns out that the properties of zeros of L-functions are extremely useful in attacking questions in number theory. Interestingly, a terrific model for these zeros is given by random matrix theory: choose a large matrix at random and study its eigenvalues. This model also does a terrific job describing behavior ranging from heavy nuclei like Uranium to bus routes in Mexico! I’m studying several problems in random matrix theory, which also have applications to graph theory (building efficient networks). I am also working on several problems in probability and statistics, especially (but not limited to) sabermetrics (applying mathematical statistics to baseball) and Benford’s law of digit bias (which is often connected to fascinating questions about equidistribution). Many data sets have a preponderance of first digits equal to 1 (look at the first million Fibonacci numbers, and you’ll see a leading digit of 1 about 30% of the time). In addition to being of theoretical interest, applications range from the IRS (which uses it to detect tax fraud) to computer science (building more efficient computers). I’m exploring the subject with several colleagues in fields ranging from accounting to engineering to the social sciences.
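Two of the claims above are easy to check numerically. The sketch below compares a partial sum of the zeta series with a partial Euler product at s = 2 (both should approach pi^2/6), and measures how often the leading digit 1 appears among the first Fibonacci numbers, which Benford's law predicts to be about log10(2), roughly 30%. All function names are hypothetical, and the cutoffs are arbitrary illustrative choices; this is a numerical check, not a proof.

```python
from math import log10, pi


def zeta_partial_sum(s, n_max):
    """Sum of 1/n^s for n = 1..n_max."""
    return sum(1 / n**s for n in range(1, n_max + 1))


def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(limit**0.5) + 1):
        if sieve[p]:
            for multiple in range(p * p, limit + 1, p):
                sieve[multiple] = False
    return [n for n, is_prime in enumerate(sieve) if is_prime]


def euler_partial_product(s, p_max):
    """Product of 1 / (1 - 1/p^s) over primes p <= p_max."""
    product = 1.0
    for p in primes_up_to(p_max):
        product *= 1 / (1 - p ** (-s))
    return product


def fibonacci_leading_digit_share(count, digit=1):
    """Fraction of the first `count` Fibonacci numbers starting with `digit`.

    Keep `count` modest: converting very large integers to strings is
    limited by default in recent Python versions.
    """
    a, b = 1, 1
    hits = 0
    for _ in range(count):
        if str(a)[0] == str(digit):
            hits += 1
        a, b = b, a + b
    return hits / count


if __name__ == "__main__":
    print("pi^2/6          :", pi**2 / 6)
    print("partial sum     :", zeta_partial_sum(2, 10_000))
    print("partial product :", euler_partial_product(2, 10_000))
    print("log10(2)        :", log10(2))
    print("Fibonacci share :", fibonacci_leading_digit_share(1000))
```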
Possible thesis topics:
Possible colloquium topics:
Plus anything you find interesting. I’m also interested in applications, and have worked on subjects ranging from accounting to computer science to geology to marketing….
Ralph Morrison
Research interests: I work in algebraic geometry, tropical geometry, graph theory (especially chip-firing games on graphs), and discrete geometry, as well as computer implementations that study these topics. Algebraic geometry is the study of solution sets to polynomial equations. Such a solution set is called a variety. Tropical geometry is a “skeletonized” version of algebraic geometry. We can take a classical variety and “tropicalize” it, giving us a tropical variety, which is a piecewise-linear subset of Euclidean space. Tropical geometry combines combinatorics, discrete geometry, and graph theory with classical algebraic geometry, and allows for developing theory and computations that tell us about the classical varieties. One flavor of this area of math is to study chip-firing games on graphs, which are motivated by (and applied to) questions about algebraic curves.
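For readers who have not met chip-firing before, the basic move is simple: each vertex of a graph holds an integer number of chips, and firing a vertex sends one chip along each of its edges to the neighboring vertices. The sketch below is a minimal illustration of that single move on a small hypothetical graph; the representation (adjacency lists in a dictionary) and the function name `fire` are assumptions made for the example.

```python
def fire(graph, chips, vertex):
    """Return the chip configuration after `vertex` fires once.

    `graph` maps each vertex to a list of its neighbors (repeat a neighbor
    to model a multi-edge). The firing vertex loses one chip per incident
    edge and each neighbor gains one chip per shared edge.
    """
    new_chips = dict(chips)
    for neighbor in graph[vertex]:
        new_chips[vertex] -= 1
        new_chips[neighbor] += 1
    return new_chips


if __name__ == "__main__":
    # Hypothetical example: a triangle with all chips starting on vertex "a".
    triangle = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}
    start = {"a": 2, "b": 0, "c": 0}
    after = fire(triangle, start, "a")
    print(after)
    print("total chips unchanged:", sum(after.values()) == sum(start.values()))
```

Firing never changes the total number of chips, so the interesting questions concern which configurations can be reached from which others.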
Possible thesis topics : Anything related to tropical geometry, algebraic geometry, chip-firing games (or other graph theory topics), and discrete geometry. Here are a few specific topics/questions:
Possible Colloquium topics: I’m happy to advise a talk in any area of math, but would be especially excited about talks related to algebra, geometry, graph theory, or discrete mathematics.
Shaoyang Ning (On Leave 2024 – 2025)
Research Interest : Statistical methodologies and applications. My research focuses on the study and design of statistical methods for integrative data analysis, in particular, to address the challenges of increasing complexity and connectivity arising from “Big Data”. I’m interested in innovating statistical methods that efficiently integrate multi-source, multi-resolution information to solve real-life problems. Instances include tracking localized influenza with Google search data and predicting cancer-targeting drugs with high-throughput genetic profiling data. Other interests include Bayesian methods, copula modeling, and nonparametric methods.
Possible colloquium topics: Any topics in statistical methodology and application, including but not limited to: topics in applied statistics, Bayesian methods, computational biology, statistical learning, “Big Data” mining, and other cross-disciplinary projects.
Anna Neufeld
Research interests: My research is motivated by the gap between classical statistical tools and practical data analysis. Classic statistical tools are designed for testing a single hypothesis about a single, pre-specified model. However, modern data analysis is an adaptive process that involves exploring the data, fitting several models, evaluating these models, and then testing a potentially large number of hypotheses about one or more selected models. With this in mind, I am interested in topics such as (1) methods for model validation and selection, (2) methods for testing data-driven hypotheses (post-selection inference), and (3) methods for testing a large number of hypotheses. I am also interested in any applied project where I can help a scientist rigorously answer an important question using data.
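To make the third topic concrete, here is a minimal sketch of one widely used method for testing many hypotheses at once, the Benjamini-Hochberg procedure for controlling the false discovery rate. It is offered only as a standard textbook example and is not claimed to be the method used in this research; the function name and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Sort the p-values, find the largest rank k with p_(k) <= (k / m) * q,
    and reject the k hypotheses with the smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda idx: p_values[idx])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical p-values from ten tests.
    p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(p, q=0.05))
```

On these illustrative p-values the procedure rejects the two smallest, whereas a Bonferroni threshold of 0.05/10 would reject only the first, which shows the usual trade-off between the two kinds of error control.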
Allison Pacelli
Research interests: Math Education, Math & Politics, and Algebraic Number Theory.
Math Education. Math education is the study of the practice of teaching and learning mathematics, at all levels. For example, do high school calculus students learn best from lecture or inquiry-based learning? What mathematical content knowledge is critical for elementary school math teachers? Is a flipped classroom more effective than a traditional learning format? Many fascinating questions remain, at all levels of education. We can talk further to narrow down project ideas.
Math & Politics. The mathematics of voting and the mathematics of fair division are two fascinating topics in the field of mathematics and politics. Research questions look at types of voting systems, and the properties that we would want a voting system to satisfy, as well as the idea of fairness when splitting up a single object, like cake, or a collection of objects, such as after a divorce or a death.
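A small example shows why the choice of voting system matters. The sketch below applies two common systems, plurality and the Borda count, to the same set of ranked ballots and gets different winners. These two systems are chosen only for illustration, the ballots are invented, and the function names are hypothetical; ties are broken arbitrarily in this sketch.

```python
from collections import Counter


def plurality_winner(ballots):
    """Winner is the candidate with the most first-place votes."""
    tally = Counter(ballot[0] for ballot in ballots)
    return max(tally, key=tally.get)


def borda_winner(ballots):
    """Winner has the highest Borda score.

    With n candidates, a ballot gives n-1 points to its first choice,
    n-2 to its second, and so on down to 0 for the last.
    """
    scores = Counter()
    for ballot in ballots:
        n = len(ballot)
        for position, candidate in enumerate(ballot):
            scores[candidate] += n - 1 - position
    return max(scores, key=scores.get)


if __name__ == "__main__":
    # Invented profile: 3 voters rank A>B>C, 2 rank B>C>A, 2 rank C>B>A.
    ballots = [("A", "B", "C")] * 3 + [("B", "C", "A")] * 2 + [("C", "B", "A")] * 2
    print("plurality:", plurality_winner(ballots))
    print("Borda    :", borda_winner(ballots))
```

Here A has the most first-place votes, but B is ranked first or second by every voter and wins under the Borda count, so the two systems disagree on the same ballots.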
Algebraic Number Theory. The Fundamental Theorem of Arithmetic states that the ring of integers is a unique factorization domain, that is, every integer can be uniquely factored into a product of primes. In other rings, there are analogues of prime numbers, but factorization into primes is not necessarily unique!
In order to determine whether factorization into primes is unique in the ring of integers of a number field or function field, it is useful to study the associated class group – the group of equivalence classes of ideals. The class group is trivial if and only if the ring is a unique factorization domain. Although the study of class groups dates back to Gauss and played a key role in the history of Fermat’s Last Theorem, many basic questions remain open.
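A standard example of non-unique factorization, added here for illustration, is the ring of numbers a + b√−5 with a and b integers, where 6 = 2 · 3 = (1 + √−5)(1 − √−5). The sketch below represents a + b√−5 as the pair (a, b) and uses the multiplicative norm a² + 5b² to check that none of the four factors can be broken down further: a proper factor of 2, 3, or 1 ± √−5 would need norm 2 or 3, and no element has such a norm. The function names are hypothetical.

```python
from math import isqrt


def multiply(x, y):
    """Product of a + b*sqrt(-5) and c + d*sqrt(-5), as integer pairs."""
    a, b = x
    c, d = y
    return (a * c - 5 * b * d, a * d + b * c)


def norm(x):
    """Multiplicative norm N(a + b*sqrt(-5)) = a^2 + 5*b^2."""
    a, b = x
    return a * a + 5 * b * b


def norm_is_attained(target):
    """True if some element a + b*sqrt(-5) has norm equal to `target`."""
    bound = isqrt(target) + 1
    return any(
        norm((a, b)) == target
        for a in range(-bound, bound + 1)
        for b in range(-bound, bound + 1)
    )


if __name__ == "__main__":
    print(multiply((2, 0), (3, 0)))    # 2 * 3
    print(multiply((1, 1), (1, -1)))   # (1 + sqrt(-5)) * (1 - sqrt(-5))
    print("norms of the factors:", norm((2, 0)), norm((3, 0)), norm((1, 1)))
    print("norm 2 attained:", norm_is_attained(2))
    print("norm 3 attained:", norm_is_attained(3))
```

Strictly speaking, the four factors are irreducible rather than prime in the ring-theoretic sense, and the failure of unique factorization here corresponds to this ring having a nontrivial class group.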
Possible thesis topics:
Possible colloquium topics: Anything in number theory, algebra, or math & politics.
Anna Plantinga
Research interests: I am interested in both applied and methodological statistics. My research primarily involves problems related to statistical analysis within genetics, genomics, and in particular the human microbiome (the set of bacteria that live in and on a person). Current areas of interest include longitudinal data, distance-based analysis methods such as kernel machine regression, high-dimensional data, and structured data.
Any topics in statistical application, education, or methodology, including but not restricted to:
Cesar Silva
Research interests : Ergodic theory and measurable dynamics; in particular mixing properties and rank one examples, and infinite measure-preserving and nonsingular transformations and group actions. Measurable dynamics of transformations defined on the p-adic field. Measurable sensitivity. Fractals. Fractal Geometry.
Possible thesis topics: Ergodic Theory. Ergodic theory studies the probabilistic behavior of abstract dynamical systems. Dynamical systems are systems that change with time, such as the motion of the planets or of a pendulum. Abstract dynamical systems represent the state of a dynamical system by a point in a mathematical space (phase space). In many cases this space is assumed to be the unit interval [0,1) with Lebesgue measure. One usually assumes that time is measured at discrete intervals and so the law of motion of the system is represented by a single map (or transformation) of the phase space [0,1). In this case one studies various dynamical behaviors of these maps, such as ergodicity, weak mixing, and mixing. I am also interested in studying the measurable dynamics of systems defined on the p-adic numbers. The prerequisite is a first course in real analysis. Topological Dynamics. Dynamics on compact or locally compact spaces.
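One way to get a feel for ergodicity is to simulate a simple map of [0,1) and compare a time average with a space average. The sketch below uses rotation by an irrational number, a standard example of an ergodic transformation for Lebesgue measure, and counts how often an orbit lands in a chosen subinterval; for an ergodic map this fraction should approach the length of the interval. The choice of map, the constant `ALPHA`, and the function names are illustrative assumptions, and a floating-point simulation is suggestive rather than a proof.

```python
from math import sqrt

# Irrational rotation number (fractional part of the golden ratio).
ALPHA = (sqrt(5) - 1) / 2


def rotation(x, alpha=ALPHA):
    """One step of the map T(x) = x + alpha (mod 1) on [0, 1)."""
    return (x + alpha) % 1.0


def time_average_in_interval(x0, steps, left, right):
    """Fraction of the first `steps` orbit points of x0 lying in [left, right)."""
    x, visits = x0, 0
    for _ in range(steps):
        if left <= x < right:
            visits += 1
        x = rotation(x)
    return visits / steps


if __name__ == "__main__":
    share = time_average_in_interval(x0=0.1, steps=100_000, left=0.0, right=0.25)
    print("time spent in [0, 0.25):", share)
    print("length of the interval :", 0.25)
```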
Topics in mathematics and in particular:
Mihai Stoiciu
Research interests: Mathematical Physics and Functional Analysis. I am interested in the study of the spectral properties of various operators arising from mathematical physics, especially the Schrödinger operator. In particular, I am investigating the distribution of the eigenvalues for special classes of self-adjoint and unitary random matrices.
Topics in mathematical physics, functional analysis and probability including:
Possible colloquium topics:
Any topics in mathematics, mathematical physics, functional analysis, or probability, such as:
Elizabeth Upton
Research Interests: My research interests center around network science, with a focus on regression methods for network-indexed data. Networks are used to capture the relationships between elements within a system. Examples include social networks, transportation networks, and biological networks. I also enjoy tackling problems with pragmatic applications and am therefore interested in applied interdisciplinary research.
Date published October 7 2021 by Jacob Miller
Statistics is a demanding subject that deals with the collection, analysis, interpretation, evaluation, and management of numeric data. The topic selection of the statistics dissertation can involve the subfields of statistics, i.e. Probability Theory, Mathematical Statistics, Design of Experiments, Sampling, Classification, and Time Series.
This subject is complicated, and the application of complex theory to large quantities of data adds to its difficulty. That is why it is hard to find substantial statistics dissertation topics. Moreover, the many dimensions of the subject make it harder still to come up with a focused and comprehensive topic.
While selecting a topic for a statistics dissertation, you must consider the fundamental idea of statistics, i.e. variation and uncertainty. Certain statistical frameworks and methods are applied to get the results.
The topic of the statistics dissertation should be close enough to the subject that you will be able to apply statistical methods in the dissertation and in the presentation of findings.
There are several reasons which together make it a difficult task for the students to select a worthwhile topic for their statistics dissertation.
Students often struggle to generate ideas across the different areas and aspects of the subject. That is why they have difficulty listing suitable statistics topics for the dissertation.
Statistics has a wide scope. It relates to scientific, industrial, and social problems, so a dissertation topic in this subject can never stand alone. For this reason, students find it difficult to determine their direction and fail to select a promising topic.
Even if students manage to come up with some workable topics for their dissertation, uncertainty about the context or background leads to confusion. They are unable to find a purpose and a background on which to base their research.
If all this seems a tough task, you may take inspiration from our free dissertation topics, and, even better, you can get professional help on each of those topics.
We have skilled and professional subject experts, who bring the best ideas for your statistics dissertation selection. They are well aware of how to meet your subject requirements and professors’ expectations. Through their expertise, they help you select the most significant topics for your dissertation.
By selecting one of the strong statistics research topics we propose, you may contribute to the subject through your intellectual capabilities and unique ideas. While preparing a list of topic suggestions for you, we focus on the following points.
Our statistics dissertation experts have deep knowledge of the subject. They know which topics are worth choosing for your dissertation. According to our experts, your topic must involve data collection, data analysis, and data synthesis.
You must also go through several previous dissertations and research papers on the subject so that you can come up with a topic of suitable scope, context, relevance, and accuracy. Further, it should be concise and manageable so that you can complete a dissertation on it within the deadline.
You can avoid all these complexities by hiring our statistics dissertation topic selection services. Our experts have produced hundreds of successful works for the satisfaction of the customers. With vast experience in the world of academics and command of statistics dissertations, they have prepared the list of most suitable statistics dissertation topics.
Kernel regression using the fast Fourier transform
Assessing and accounting for correlation in RNA-seq data analysis
A guide to doing statistics in second language research using SPSS
Prediction interval methods for reliability data
Relevance of tests of significance: uses and limitations
Interaction forward selection in ultra-high-dimensional functional linear models
To learn the details of the above-mentioned topics and get an idea of their aims and objectives, you can consult our team, which is available 24/7. You can also obtain more potential topics for your statistics dissertation by hiring our services.
Objectives:
To explore the methods of kernel regression
To demonstrate a method for speeding up the computation of kernel regression.
To analyse how the FFT can improve the computation of kernel regression.
To explore the importance of statistics and probability.
To examine the different methods of statistics and probability used in education system.
To establish the need for collaborative and cross-disciplinary research.
To explore the concepts behind the usage of statistics in different domains.
To examine the concept of statistics in Second Language.
To study and implement the SPSS software in statistics.
To study the importance of Prediction in statistics.
To analyse the statistical Prediction methods in statistics theory.
To examine the different methods of Prediction interval under the parametric framework.
To study the importance of statistical tools and significance tests in both parametric and nonparametric settings.
To examine the significance of statistical tools in decision making.
To evaluate statistical significance tests in information retrieval.
To study the statistical methods for the variable selection in ultra-high dimensional functional linear models.
To propose two forward selection procedures on the basis of coefficients approximation.
To demonstrate the application of the proposed methodologies.
To explore different Bayesian methods and their applications.
To examine Bayesian methods for biclustering and inference for mixture models.
To demonstrate the performance of the model through simulation and applications to real datasets.
To study the concepts behind RNA-seq data analysis and its procedure.
To examine the papers on RNA-seq data analysis.
To perform a simulation and validate the proposed methods on the basis of results.
To explore the techniques used in data analytics to produce visual charts for various purposes.
To demonstrate the use of the Python language as a main tool in data analytics.
500+ Statistics Research Topics
Statistics is a branch of mathematics that deals with the collection, analysis, interpretation, presentation, and organization of data. It is a fundamental tool used in various fields such as business, social sciences, engineering, healthcare, and many more. As a research topic, statistics can be a fascinating subject to explore, as it allows researchers to investigate patterns, trends, and relationships within data. With the help of statistical methods, researchers can make informed decisions and draw valid conclusions based on empirical evidence. In this post, we will explore some interesting statistics research topics that can be pursued by researchers to further expand our understanding of this field.
Statistics Research Topics are as follows:
Below are sample topics available for prospective postgraduate research students. These sample topics do not contain every possible project; they are aimed at giving an impression of the breadth of different topics available. Most prospective supervisors would be more than happy to discuss projects not listed below.
Funded projects are projects with project-specific funding. Funding for other projects is usually available on a competitive basis.
Information about postgraduate research opportunities and how to apply can be found on the Postgraduate Research Study page. Below is a selection of projects that could be undertaken with our group.
Supervisors: Jethro Browell
Relevant research groups: Modelling in Space and Time, Computational Statistics, Applied Probability and Stochastic Processes
Many decisions are informed by forecasts, and almost all forecasts are uncertain to some degree. Probabilistic forecasts quantify uncertainty to help improve decision-making and are playing an important role in fields including weather forecasting, economics, energy, and public policy. Evaluating the quality of past forecasts is essential to give forecasters and forecast users confidence in their current predictions, and to compare the performance of forecasting systems.
While the principles of probabilistic forecast evaluation have been established over the past 15 years, most notably that of “sharpness subject to calibration/reliability”, we lack a complete toolkit for applying these principles in many situations, especially those that arise in high-dimensional settings. Furthermore, forecast evaluation must be interpretable by forecast users as well as expert forecasters, and assigning value to marginal improvements in forecast quality remains a challenge in many sectors.
This PhD will develop new statistical methods for probabilistic forecast evaluation considering some of the following issues:
Good knowledge of multivariate statistics is essential; prior knowledge of probabilistic forecasting and forecast evaluation would be an advantage.
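As a rough illustration of what evaluating a probabilistic forecast can look like in the simplest univariate case, the sketch below scores Gaussian forecasts with the continuous ranked probability score (CRPS) and computes probability integral transform (PIT) values as a calibration check. The choice of CRPS, the Gaussian forecasts, and the synthetic observations are all assumptions made for this example; the project description does not specify any particular score or data.

```python
import math
import random

def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def crps_gaussian(mu, sigma, y):
    """Closed-form CRPS of a N(mu, sigma^2) forecast at observation y.

    Lower is better; the score rewards calibration and sharpness jointly.
    """
    z = (y - mu) / sigma
    return sigma * (z * (2.0 * normal_cdf(z) - 1.0)
                    + 2.0 * normal_pdf(z)
                    - 1.0 / math.sqrt(math.pi))

def pit_gaussian(mu, sigma, y):
    """PIT value: roughly uniform on [0, 1] if the forecast is calibrated."""
    return normal_cdf((y - mu) / sigma)

# Synthetic observations from N(0, 1), for illustration only.
rng = random.Random(1)
observations = [rng.gauss(0.0, 1.0) for _ in range(5000)]
n = len(observations)

# Forecast A matches the data-generating distribution; forecast B is too wide.
mean_crps_matched = sum(crps_gaussian(0.0, 1.0, y) for y in observations) / n
mean_crps_wide = sum(crps_gaussian(0.0, 2.0, y) for y in observations) / n

# The mean PIT of a calibrated forecast should be near 0.5.
mean_pit = sum(pit_gaussian(0.0, 1.0, y) for y in observations) / n
```

In this toy setting the matched forecast should receive the lower average CRPS. The high-dimensional settings mentioned above are exactly where such simple univariate tools stop being sufficient.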
Supervisors: Jethro Browell
Relevant research groups: Modelling in Space and Time, Computational Statistics, Applied Probability and Stochastic Processes
Data-driven predictive models depend on the representativeness of data used in model selection and estimation. However, many processes change over time, meaning that recent data is more representative than old data. In this situation, predictive models should track these changes, which is the aim of “online” or “adaptive” algorithms. Furthermore, many users of forecasts require probabilistic forecasts, which quantify uncertainty, to inform their decision-making. Existing adaptive methods such as Recursive Least Squares and the Kalman Filter have been very successful for adaptive point forecasting, but adaptive probabilistic forecasting has received little attention. This PhD will develop methods for adaptive probabilistic forecasting from a theoretical perspective and with a view to applying these methods to problems in at least one application area to be determined.
In the context of adaptive probabilistic forecasting, this PhD may consider:
A good knowledge of methods for time series analysis and regression is essential; familiarity with flexible regression (GAMs) and distributional regression (GAMLSS/quantile regression) would be an advantage.
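For readers unfamiliar with adaptive point forecasting, the following is a minimal sketch of Recursive Least Squares with a forgetting factor for a single regression coefficient. It illustrates the existing point-forecasting technique named above, not the probabilistic methods the PhD would develop. The scalar model, the forgetting factor of 0.95, and the noiseless synthetic data are assumptions for the example.

```python
def rls_scalar(xs, ys, forgetting=0.95, p0=1000.0):
    """Track theta in y = theta * x with exponentially discounted data.

    forgetting < 1 down-weights old observations so the estimate can follow
    a changing coefficient; p0 is a large initial variance (weak prior).
    Returns the estimate after each observation.
    """
    theta = 0.0
    p = p0
    history = []
    for x, y in zip(xs, ys):
        gain = p * x / (forgetting + x * p * x)
        theta += gain * (y - theta * x)      # correct by the prediction error
        p = (p - gain * x * p) / forgetting  # update and inflate the variance
        history.append(theta)
    return history

# Synthetic example: the true coefficient jumps from 1.0 to 3.0 at t = 200.
xs = [1.0 + (t % 5) for t in range(400)]
ys = [(1.0 if t < 200 else 3.0) * x for t, x in enumerate(xs)]

estimates = rls_scalar(xs, ys)
before_change = estimates[199]   # should be close to 1.0
after_change = estimates[-1]     # should have adapted to about 3.0
```

A smaller forgetting factor adapts faster but gives noisier estimates when the data are noisy; choosing it is one of the practical trade-offs in adaptive methods.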
Supervisors: Vincent Macaulay
Relevant research groups: Bayesian Modelling and Inference, Modelling in Space and Time, Statistical Modelling for Biology, Genetics and *omics
Shapes of objects change in time. Organisms evolve and in the process change form: humans and chimpanzees derive from some common ancestor presumably different from either in shape. Designed objects are no different: an Art Deco tea pot from the 1920s might share some features with one from Ikea in 2010, but they are different. Mathematical models of evolution for certain data types, like the strings of As, Gs, Cs and Ts in our evolving DNA, are quite mature and allow us to learn about the relationships of the objects (their phylogeny or family tree), about the changes that happen to them in time (the evolutionary process) and about the ways objects were configured in the past (the ancestral states), by statistical techniques like phylogenetic analysis. Such techniques for shape data are still in their infancy. This project will develop novel statistical inference approaches (in a Bayesian context) for complex data objects, like functions, surfaces and shapes, using Gaussian-process models, with potential application in fields as diverse as language evolution, morphometrics and industrial design.
Supervisors: Janine Illian
Relevant research groups: Modelling in Space and Time, Bayesian Modelling and Inference, Computational Statistics, Environmental, Ecological Sciences and Sustainability
Joint project with Dr Urška Demšar (University of St Andrews)
Migratory birds travel annually across vast expanses of oceans and continents to reach their destination with incredible accuracy. How they are able to do this using only locally available cues is still not fully understood. Migratory navigation consists of two processes: birds either identify the direction in which to fly (compass orientation) or the location where they are at a specific moment in time (geographic positioning). One of the possible ways they do this is to use information from the Earth’s magnetic field in so-called geomagnetic navigation (Mouritsen 2018). While there is substantial evidence (both physiological and behavioural) that they do sense the magnetic field (Deutschlander and Beason 2014), we still do not know exactly which components of the field they use for orientation or positioning. We also do not understand how rapid changes in the field affect movement behaviour.
There is a possibility that birds can sense these rapid large changes and that this may affect their navigational process. To study this, we need to link accurate data on Earth’s magnetic field with animal tracking data. This has only become possible very recently through new spatial data science advances: we developed the MagGeo tool, which links contemporaneous geomagnetic data from Swarm satellites of the European Space Agency with animal tracking data (Benitez Paez et al. 2021).
Linking geomagnetic data to animal tracking data, however, creates a high-dimensional data set, which is difficult to explore. Typical analyses of contextual environmental information in ecology include representing contextual variables as covariates in relatively simple statistical models (Brum Bastos et al. 2021), but this is not sufficient for studying detailed navigational behaviour. This project will analyse complex spatio-temporal data using computationally efficient statistical model fitting approaches in a Bayesian context.
This project is fully based on open data to support reproducibility and open science. We will test our new methods by annotating publicly available bird tracking data (e.g. from repositories such as Movebank.org), using the open MagGeo tool and implementing our new methods as Free and Open Source Software (R/Python).
Benitez Paez F, Brum Bastos VdS, Beggan CD, Long JA and Demšar U, 2021. Fusion of wildlife tracking and satellite geomagnetic data for the study of animal migration. Movement Ecology, 9:31. https://doi.org/10.1186/s40462-021-00268-4
Brum Bastos VdS, Łos M, Long JA, Nelson T and Demšar U, 2021. Context-aware movement analysis in ecology: a systematic review. International Journal of Geographical Information Science. https://doi.org/10.1080/13658816.2021.1962528
Deutschlander ME and Beason RC, 2014. Avian navigation and geographic positioning. Journal of Field Ornithology, 85(2):111–133. https://doi.org/10.1111/jofo.12055
Supervisors: Janine Illian
Relevant research groups: Modelling in Space and Time, Bayesian Modelling and Inference, Computational Statistics, Environmental, Ecological Sciences and Sustainability
(Jointly supervised by Peter Henrys, CEH)
The last decade has seen a proliferation of environmental data with vast quantities of information available from various sources. This has been due to a number of different factors including: the advent of sensor technologies; the provision of remotely sensed data from both drones and satellites; and the explosion in citizen science initiatives. These data represent a step change in the resolution of available data across space and time - sensors can be streaming data at a resolution of seconds whereas citizen science observations can be in the hundreds of thousands.
Over the same period, the resources available for traditional field surveys have decreased dramatically whilst logistical issues (such as access to sites) have increased. This has severely impacted the ability of field survey campaigns to collect data at high spatial and temporal resolutions. It is exactly this sort of information that is required to fit models that can quantify and predict the spread of invasive species, for example.
Whilst we have seen an explosion of data across various sources, there is no single source that provides both the spatial and temporal intensity that may be required when fitting complex spatio-temporal models (cf. the invasive species example); each has its own advantages and benefits in terms of information content. There is therefore potentially huge benefit in bringing together data from these different sources within a consistent framework to exploit the benefits each offers and to understand processes at unprecedented resolutions and scales that would otherwise be impossible to monitor.
Current approaches to combining data in this way are typically very bespoke and involve complex model structures that are not reusable outside of the particular application area. What is needed is an overarching generic methodological framework and associated software solutions to implement such analyses. Not only would such a framework provide the methodological basis to enable researchers to benefit from this big data revolution, but also the capability to change such analyses from being stand alone research projects in their own right, to more operational, standard analytical routines.
Finally, such dynamic, integrated analyses could feed back into data collection initiatives to ensure optimal allocation of effort for traditional surveys or optimal power management for sensor networks. The major step change is that this optimal allocation of effort is conditional on the other data that are available. So, for example, given the coverage and intensity of the citizen science data, where should we optimally send our paid surveyors? The idea is that information is collected at times and locations that provide the greatest benefit in understanding the underpinning stochastic processes. These two major issues, integrated analyses and adaptive sampling, ensure that environmental monitoring is fit for purpose and that scientists, policy and industry can benefit from the big data revolution.
This project will develop an integrated statistical modelling strategy that provides a single modelling framework for enabling quantification of ecosystem goods and services while accounting for the fundamental differences in different data streams. Data collected at different spatial resolutions can be used within the same model through projecting it into continuous space and projecting it back into the landscape level of interest. As a result, decisions can be made at the relevant spatial scale and uncertainty is propagated through, facilitating appropriate decision making.
(jointly supervised by Esther Jones and Adam Butler, BIOSS)
Assessing the impacts of offshore renewable developments on marine wildlife is a critical component of the consenting process. A NERC-funded project, ECOWINGS, will provide a step-change in analysing predator-prey dynamics in the marine environment, collecting data across trophic levels against a backdrop of developing wind farms and climate change. Aerial survey and GPS data from multiple species of seabirds will be collected contemporaneously alongside prey data available over the whole water column from an automated surface vehicle and underwater drone.
These methods of data collection will generate 3D space and time profiles of predators and prey, creating a rich source of information and enormous potential for modelling and interrogation. The data present a unique opportunity for experimental design across a dynamic and changing marine ecosystem, which is heavily influenced by local and global anthropogenic activities. However, these data have complex intrinsic spatio-temporal properties, which are challenging to analyse. Significant statistical methods development could be achieved using this system as a case study, contributing to the scientific knowledge base not only in offshore renewables but more generally in the many circumstances where patchy ecological spatio-temporal data are available.
This PhD project will develop spatio-temporal modelling methodology that will allow users to analyse these exciting - and complex - data sets and help inform our knowledge of the impact of offshore renewables on wildlife.
Supervisors: Surajit Ray
Relevant research groups: Modelling in Space and Time, Computational Statistics, Nonparametric and Semi-parametric Statistics, Imaging, Image Processing and Image Analysis
Historically, functional data analysis techniques have widely been used to analyze traditional time series data, albeit from a different perspective. Of late, FDA techniques are increasingly being used in domains such as environmental science, where the data are spatio-temporal in nature and hence it is typical to consider such data as functional data where the functions are correlated in time or space. An example where modeling the dependencies is crucial is in analyzing remotely sensed data observed over a number of years across the surface of the earth, where each year forms a single functional data object. One might be interested in decomposing the overall variation across space and time and attributing it to covariates of interest. Another interesting class of data with dependence structure consists of weather data on several variables collected from balloons, where the domain of the functions is a vertical strip in the atmosphere and the data are spatially correlated. One of the challenges with this type of data is the problem of missingness, to address which one needs to develop appropriate spatial smoothing techniques for spatially dependent functional data. There are also interesting design of experiment issues, as well as questions of data calibration to account for the variability in sensing instruments. In spite of the research initiative in analyzing dependent functional data, there are several unresolved problems, which the student will work on:
Supervisors: Duncan Lee
Relevant research groups: Modelling in Space and Time, Biostatistics, Epidemiology and Health Applications
Exposure to air pollution is thought to reduce average life expectancy by six months, with an estimated equivalent health cost of 19 billion each year (from DEFRA). These effects have been estimated using statistical models, which quantify the impact on human health of exposure in both the short and the long term. However, the estimation of such effects is challenging, because individual level measures of health and pollution exposure are not available. Therefore, the majority of studies are conducted at the population level, and the resulting inference can only be made about the effects of pollution on overall population health. However, the data used in such studies are spatially misaligned, as the health data relate to extended areas such as cities or electoral wards, while the pollution concentrations are measured at individual locations. Furthermore, pollution monitors are typically located where concentrations are thought to be highest, known as preferential sampling, which is likely to result in overly high measurements being recorded. This project aims to develop statistical methodology to address these problems, and thus provide a less biased estimate of the effects of pollution on health than is currently produced.
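The effect of preferential sampling can be seen in a toy simulation. The sketch below is an illustration under invented assumptions, not the project's methodology: concentrations at hypothetical sites are drawn from a log-normal distribution, and "preferential" placement is modelled by choosing sites with probability proportional to their concentration. All numbers are synthetic.

```python
import random

def simulate_monitoring(n_sites=2000, n_monitors=200, seed=0):
    """Compare random and preferential placement of pollution monitors.

    Concentrations are synthetic log-normal draws, for illustration only.
    """
    rng = random.Random(seed)
    concentrations = [rng.lognormvariate(0.0, 0.5) for _ in range(n_sites)]
    true_mean = sum(concentrations) / n_sites

    # Random placement: every site is equally likely to host a monitor.
    random_sample = rng.sample(concentrations, n_monitors)

    # Preferential placement (assumed mechanism): a site is chosen with
    # probability proportional to its concentration, with replacement.
    preferential_sample = rng.choices(
        concentrations, weights=concentrations, k=n_monitors
    )

    return {
        "true_mean": true_mean,
        "random_mean": sum(random_sample) / n_monitors,
        "preferential_mean": sum(preferential_sample) / n_monitors,
    }

result = simulate_monitoring()
```

Under these assumptions the preferentially placed monitors should report a mean well above the true average over all sites, while the randomly placed monitors should land close to it. This is the upward bias the paragraph describes.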
Disease risk varies over space and time, due to similar variation in environmental exposures such as air pollution and risk inducing behaviours such as smoking. Modelling the spatio-temporal pattern in disease risk is known as disease mapping, and the aims are to: quantify the spatial pattern in disease risk to determine the extent of health inequalities, determine whether there has been any increase or reduction in the risk over time, identify the locations of clusters of areas at elevated risk, and quantify the impact of exposures, such as air pollution, on disease risk. I am working on all these related problems at present, and I have PhD projects in all these areas.
Supervisors: Craig Anderson
Relevant research groups: Modelling in Space and Time, Bayesian Modelling and Inference, Biostatistics, Epidemiology and Health Applications
The prevalence of disease is typically not constant across space; instead, the risk tends to vary from one region to another. Some of this variability may be down to environmental conditions, but much of it is driven by socio-economic differences between regions, with poorer regions tending to have worse health than wealthier regions. For example, within the Greater Glasgow and Clyde region, the World Health Organisation noted that life expectancy ranges from 54 in Calton to 82 in Lenzie, despite these areas being less than 10 miles apart. There is substantial value to health professionals and policymakers in identifying some of the causes behind these localised health inequalities.
Disease mapping is a field of statistical epidemiology which focuses on estimating the patterns of disease risk across a geographical region. The main goal of such mapping is typically to identify regions of high disease risk so that relevant public health interventions can be made. This project involves the development of statistical models which will enhance our understanding of regional differences in the risk of suffering from major diseases by focusing on these localised health inequalities.
Standard Bayesian hierarchical models with a conditional autoregressive prior are frequently used for risk estimation in this context, but these models assume a smooth risk surface which is often not appropriate in practice. In reality, it will often be the case that different regions have vastly different risk profiles and require different data generating functions as a result.
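To see why a conditional autoregressive (CAR) prior implies a smooth risk surface, the sketch below builds the precision matrix of one common parameterisation, the proper CAR prior with Q = tau (D - rho W), where W is the neighbourhood matrix and D holds the number of neighbours of each region. The project description does not say which CAR variant is used, so this form, the four-region map, and the parameter values are assumptions.

```python
def car_precision(neighbours, tau=1.0, rho=0.9):
    """Precision matrix Q = tau * (D - rho * W) of a proper CAR prior.

    neighbours[i] lists the regions that border region i.
    """
    n = len(neighbours)
    q = [[0.0] * n for _ in range(n)]
    for i, nbrs in enumerate(neighbours):
        q[i][i] = tau * len(nbrs)        # D: number of neighbours
        for j in nbrs:
            q[i][j] = -tau * rho         # -rho * W for each shared border
    return q

def conditional_mean(i, phi, neighbours, rho=0.9):
    """Prior mean of region i's effect given all the other regions:
    rho times the average of its neighbours' effects."""
    nbrs = neighbours[i]
    return rho * sum(phi[j] for j in nbrs) / len(nbrs)

# Hypothetical map: four regions in a row, 0 - 1 - 2 - 3.
neighbours = [[1], [0, 2], [1, 3], [2]]
Q = car_precision(neighbours)

# Regions 0 and 1 have low risk, regions 2 and 3 high risk (a step change).
phi = [0.2, 0.1, 1.5, 1.4]
pulled_towards = conditional_mean(1, phi, neighbours)
```

Region 1 sits on the low side of the step, yet the prior centres it on a value between its low and high neighbours. That pull towards the neighbourhood average is the smoothing that becomes inappropriate when adjacent regions have very different risk profiles.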
In this work we propose a mixture model based approach which allows different sub-populations to be represented by different underlying statistical distributions within a single modelling framework. By integrating CAR models into mixture models, researchers can simultaneously account for spatial dependencies and identify distinct disease patterns within subpopulations.
Modelling genetic variation (MSc/PhD)
Supervisors: Vincent Macaulay
Relevant research groups: Bayesian Modelling and Inference, Statistical Modelling for Biology, Genetics and *omics
Variation in the distribution of different DNA sequences across individuals has been shaped by many processes which can be modelled probabilistically, processes such as demographic factors like prehistoric population movements, or natural selection. This project involves developing new techniques for teasing out information on those processes from the wealth of raw data that is now being generated by high-throughput genetic assays, and is likely to involve computationally-intensive sampling techniques to approximate the posterior distribution of parameters of interest. The characterization of the amount of population structure on different geographical scales will influence the design of experiments to identify the genetic variants that increase risk of complex diseases, such as diabetes or heart disease.
Supervisors: Mayetri Gupta
Relevant research groups: Bayesian Modelling and Inference, Computational Statistics, Statistical Modelling for Biology, Genetics and *omics
An important issue in high-dimensional regression problems is the accurate and efficient estimation of models when, compared to the number of data points, a substantially larger number of potential predictors are present. Further complications arise with correlated predictors, leading to the breakdown of standard statistical models for inference; and the uncertain definition of the outcome variable, which is often a varying composition of several different observable traits. Examples of such problems arise in many scenarios in genomics: in determining expression patterns of genes that may be responsible for a type of cancer, and in determining which genetic mutations lead to higher risks for occurrence of a disease. This project involves developing broad and improved Bayesian methodologies for efficient inference in high-dimensional regression-type problems with complex multivariate outcomes, with a focus on genetic data applications.
The successful candidate should have a strong background in methodological and applied Statistics, expert skills in relevant statistical software or programming languages (such as R, C/C++/Python), and also have a deep interest in developing knowledge in cross-disciplinary topics in genomics. The candidate will be expected to consolidate and master an extensive range of topics in modern Statistical theory and applications during their PhD, including advanced Bayesian modelling and computation, latent variable models, machine learning, and methods for Big Data. The successful candidate will be considered for funding to cover domestic tuition fees, as well as paying a stipend at the Research Council rate for four years.
Supervisors: Mayetri Gupta Relevant research groups: Bayesian Modelling and Inference , Computational Statistics , Statistical Modelling for Biology, Genetics and *omics , Biostatistics, Epidemiology and Health Applications
In recent years, many different computational methods to analyse biological data have been established, covering DNA (genomics), RNA (transcriptomics), proteins (proteomics) and metabolomics, which captures more dynamic events. These methods were refined by the advent of single cell technology, where it is now possible to capture the transcriptomics profile of single cells, spatial arrangements of cells from flow methods or imaging methods like functional magnetic resonance imaging. At the same time, these omics data can be complemented with clinical data: measurements of patients, like age, smoking status, phenotype of disease or drug treatment. It is an interesting and important open statistical question how to combine data from different “modalities” (like transcriptome with clinical data or imaging data) in a statistically valid way, to compare different datasets and make justifiable statistical inferences. In this PhD project, jointly supervised with Dr. Thomas Otto and Prof. Stefan Siebert from the Institute of Infection, Immunity & Inflammation, you will explore how to combine different datasets using Bayesian latent variable modelling, focusing on clinical datasets from Rheumatoid Arthritis.
Funding Notes
The successful candidate will be considered for funding to cover domestic tuition fees, as well as paying a stipend at the Research Council rate for four years.
Supervisors: Vinny Davies , Richard Reeve Relevant research groups: Bayesian Modelling and Inference , Computational Statistics , Environmental, Ecological Sciences and Sustainability , Statistical Modelling for Biology, Genetics and *omics
The functional traits and environmental preferences of plant species determine how they will react to changes resulting from global warming. The main global biodiversity repositories, such as the Global Biodiversity Information Facility (GBIF), contain hundreds of millions of records from hundreds of thousands of species in the plant kingdom alone, and the spatiotemporal data in these records can be associated with soil, climate or other environmental data from other databases. Combining these records allows us to identify environmental preferences, especially for common species where many records exist. Furthermore, in a previous PhD studentship we showed that these traits are highly evolutionarily conserved (Harris et al., 2022), so it is possible to impute the preferences for rare species where little data exists using phylogenetic inference techniques.
The aim of this PhD project is to investigate the application of Bayesian variable selection methods to identify these evolutionarily conserved traits more effectively, and to quantify these traits and their associated uncertainty for all plant species for use in a plant ecosystem digital twin that we are developing separately to forecast the impact of climate change on biodiversity. In another PhD studentship, we previously developed similar methods for trait inference in viral evolution ( Davies et al., 2017 ; Davies et al., 2019 ), but due to the scale of the data here, these methods will need to be significantly enhanced. We therefore propose a project to investigate extensions to methods for phylogenetic trait inference to handle datasets involving hundreds of millions of records in phylogenies with hundreds of thousands of tips, potentially through either sub-sampling ( Quiroz et al, 2018 ) or modelling splitting and recombination ( Nemeth & Sherlock, 2018 ).
Supervisors: Jethro Browell Relevant research groups: Modelling in Space and Time , Computational Statistics , Applied Probability and Stochastic Processes
This project will develop an integrated statistical modelling strategy that provides a single modelling framework for enabling quantification of ecosystem goods and services while accounting for the fundamental differences in different data streams. Data collected at different spatial resolutions can be used within the same model through projecting it into continuous space and projecting it back into the landscape level of interest. As a result, decisions can be made at the relevant spatial scale and uncertainty is propagated through, facilitating appropriate decision making.
Bayesian variable selection for genetic and genomic studies (PhD). Bayesian statistical data integration of single-cell and bulk “omics” datasets with clinical parameters for accurate prediction of treatment outcomes in rheumatoid arthritis (PhD). Scalable Bayesian models for inferring evolutionary traits of plants (PhD).
Supervisors: Vinny Davies , Craig Alexander Relevant research groups: Computational Statistics , Machine Learning and AI , Emulation and Uncertainty Quantification , Statistical Modelling for Biology, Genetics and *omics , Statistics in Chemistry/Physics
Untargeted metabolomics experiments aim to identify the small molecules that make up a particular sample (e.g., blood), allowing us to identify biomarkers, discover new chemicals, or understand the metabolism (Smith et al., 2014). Data Dependent Acquisition (DDA) methods are used to collect the information needed to identify the metabolites, and various more advanced DDA methods have recently been designed to improve this process (Davies et al., 2021; McBride et al., 2023). Each of these methods, however, has parameters that must be chosen in order to maximise the amount of relevant data (metabolite spectra) that is collected. Our recent work led to the design of a Virtual Metabolomics Mass Spectrometer (ViMMS) in which we can run computer simulations of experiments and test different parameter settings (Wandy et al., 2019, 2022). Previously this has involved running a pre-determined set of parameters as part of a grid search in ViMMS, and then choosing the best parameter settings based on a single measure of performance. The proposed M.Res. (or Ph.D.) will extend this approach by using multi-objective Bayesian Optimisation to adapt simulations and optimise over multiple different measurements of quality. By optimising parameters in this manner, we can help improve real experiments currently underway at the University of Glasgow and beyond.
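To illustrate the step from a single performance measure to several, the following minimal Python sketch reduces a grid of simulated parameter settings to its Pareto front, i.e. the settings that no other setting beats on every measure. The parameter names and scores are invented for illustration and are not ViMMS outputs; a full multi-objective Bayesian optimisation would additionally fit a surrogate model to decide which settings to simulate next.

```python
def dominates(a, b):
    """True if score tuple a is at least as good as b on every objective
    and strictly better on at least one (all objectives are maximised)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(results):
    """Keep only the settings whose scores are not dominated by another setting."""
    return {
        setting: scores
        for setting, scores in results.items()
        if not any(
            dominates(other_scores, scores)
            for other, other_scores in results.items()
            if other != setting
        )
    }


# Hypothetical grid-search results: setting -> (objective 1, objective 2).
# Both the setting labels and the numbers are made up for this sketch.
grid_results = {
    ("top_n=5", "rt_tol=15"): (0.62, 0.80),
    ("top_n=10", "rt_tol=15"): (0.70, 0.71),
    ("top_n=10", "rt_tol=30"): (0.66, 0.69),  # worse than the previous row on both objectives
    ("top_n=20", "rt_tol=30"): (0.74, 0.55),
}

front = pareto_front(grid_results)
for setting, scores in front.items():
    print(setting, scores)
```

With a single measure the grid search returns one best setting; with two or more, the natural output is this set of trade-offs, from which an experimenter chooses.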
Nonparametric and semi-parametric statistics - example research projects. Modality of mixtures of distributions (PhD).
Supervisors: Surajit Ray Relevant research groups: Nonparametric and Semi-parametric Statistics , Applied Probability and Stochastic Processes , Statistical Modelling for Biology, Genetics and *omics , Biostatistics, Epidemiology and Health Applications
Finite mixtures provide a flexible and powerful tool for fitting univariate and multivariate distributions that cannot be captured by standard statistical distributions. In particular, multivariate mixtures have been widely used to perform modeling and cluster analysis of high-dimensional data in a wide range of applications. Modes of mixture densities have been used with great success for organizing mixture components into homogeneous groups, but the results are limited to normal mixtures. Beyond the clustering application, existing research in this area has provided fundamental results regarding the upper bound of the number of modes, but they too are limited to normal mixtures. In this project, we wish to explore the modality of non-normal distributions and their application to real life problems.
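As a small illustration of what "modality" means here, the Python sketch below counts the modes of a univariate normal mixture by evaluating its density on a grid and counting interior local maxima. It is a numerical illustration only, not the project's methodology; the grid range, grid size and example components are arbitrary choices. For two equal-weight normal components with a common standard deviation, the mixture is known to be bimodal only when the means are more than two standard deviations apart, which the two examples reflect.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, components):
    """Density of a finite mixture; components is a list of (weight, mean, sd)."""
    return sum(w * normal_pdf(x, mu, sd) for w, mu, sd in components)


def count_modes(components, lo=-10.0, hi=10.0, n=4001):
    """Count strict local maxima of the mixture density on an evenly spaced grid."""
    step = (hi - lo) / (n - 1)
    dens = [mixture_pdf(lo + i * step, components) for i in range(n)]
    return sum(1 for i in range(1, n - 1) if dens[i - 1] < dens[i] > dens[i + 1])


# Means 4 sd apart: two clearly separated modes.
well_separated = [(0.5, -2.0, 1.0), (0.5, 2.0, 1.0)]
# Means 1 sd apart: the components merge into a single mode.
overlapping = [(0.5, -0.5, 1.0), (0.5, 0.5, 1.0)]

print(count_modes(well_separated), count_modes(overlapping))
```

The same grid-based count works for any density that can be evaluated pointwise, so `normal_pdf` could be swapped for a non-normal component density to experiment with the non-normal case.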
Estimating false discovery rates in metabolite identification using generative AI (PhD).
Supervisors: Vinny Davies , Andrew Elliott , Justin J.J. van der Hooft (Wageningen University) Relevant research groups: Machine Learning and AI , Emulation and Uncertainty Quantification , Statistical Modelling for Biology, Genetics and *omics , Statistics in Chemistry/Physics
Metabolomics is the study field that aims to map all molecules that are part of an organism, which can help us understand its metabolism and how it can be affected by disease, stress, age, or other factors. During metabolomics experiments, mass spectra of the metabolites are collected and then annotated by comparison against spectral databases such as METLIN ( Smith et al., 2005 ) or GNPS ( Wang et al., 2016 ). Generally, however, these spectral databases do not contain the mass spectra of a large proportion of metabolites, so the best matching spectrum from the database is not always the correct identification. Matches can be scored using cosine similarity, or more advanced methods such as Spec2Vec ( Huber et al., 2021 ), but these scores do not provide any statement about the statistical accuracy of the match. Creating decoy spectral libraries, specifically a large database of fake spectra, is one potential way of estimating False Discovery Rates (FDRs), allowing us to quantify the probability of a spectrum match being correct ( Scheubert et al., 2017 ). However, these methods are not widely used, suggesting there is significant scope to improve their performance and ease of use. In this project, we will use the code framework from our recently developed Virtual Metabolomics Mass Spectrometer (ViMMS) ( Wandy et al., 2019 , 2022 ) to systematically evaluate existing methods and identify possible improvements. We will then explore how we can use generative AI, e.g., Generative Adversarial Networks or Variational Autoencoders, to train a deep neural network that can create more realistic decoy spectra, and thus improve our estimation of FDRs.
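The two ingredients described above, a similarity score between spectra and a decoy-based FDR estimate, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: spectra are represented as dictionaries mapping a binned m/z value to an intensity, the decoy library is assumed to be the same size as the real (target) library, and all scores below are invented. It is not the ViMMS code or any published decoy method.

```python
import math


def cosine_similarity(spec_a, spec_b):
    """Cosine similarity between two spectra given as {binned m/z: intensity}."""
    shared = set(spec_a) & set(spec_b)
    dot = sum(spec_a[mz] * spec_b[mz] for mz in shared)
    norm_a = math.sqrt(sum(v * v for v in spec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in spec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def estimate_fdr(target_scores, decoy_scores, threshold):
    """Target-decoy estimate: decoy matches at or above the threshold divided by
    target matches at or above it (assumes equally sized libraries)."""
    n_target = sum(1 for s in target_scores if s >= threshold)
    n_decoy = sum(1 for s in decoy_scores if s >= threshold)
    if n_target == 0:
        return 0.0
    return min(1.0, n_decoy / n_target)


# Hypothetical query and library spectra.
query = {100: 1.0, 150: 0.5}
library_hit = {100: 1.0, 150: 0.5}
unrelated = {200: 1.0}

# Hypothetical best-match scores against the real library and the decoy library.
target_scores = [0.95, 0.90, 0.85, 0.60, 0.40]
decoy_scores = [0.70, 0.50, 0.30, 0.20, 0.10]

print(cosine_similarity(query, library_hit))
print(estimate_fdr(target_scores, decoy_scores, threshold=0.65))
```

The quality of the estimate depends on how closely decoy spectra resemble real ones, which is why more realistic decoys would be expected to give better FDR estimates.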
Supervisors: Surajit Ray Relevant research groups: Machine Learning and AI , Imaging, Image Processing and Image Analysis
This project focuses on the application of medical imaging and uncertainty quantification for the detection of tumours. The project aims to provide clinicians with accurate, non-invasive methods for detecting and classifying the presence of malignant and benign tumours. It seeks to combine advanced medical imaging technologies such as ultrasound, computed tomography (CT) and magnetic resonance imaging (MRI) with the latest artificial intelligence algorithms. These methods will automate the detection process and may be used for determining malignancy with a high degree of accuracy. Uncertainty quantification (UQ) techniques will help generate a more precise prediction for tumour malignancy by providing a characterisation of the degree of uncertainty associated with the diagnosis. The combination of medical imaging and UQ will significantly decrease the requirement for performing invasive medical procedures such as biopsies. This will improve the accuracy of the tumour detection process and reduce the duration of diagnosis. The project will also benefit from the development of novel image processing algorithms (e.g. deep learning) and machine learning models. These algorithms and models will help improve the accuracy of the tumour detection process and assist clinicians in making the best treatment decisions.
Supervisors: Andrew Elliott , Vinny Davies , Hao Gao Relevant research groups: Machine Learning and AI , Emulation and Uncertainty Quantification , Biostatistics, Epidemiology and Health Applications , Imaging, Image Processing and Image Analysis
Personalised medicine is an exciting avenue in the field of cardiac healthcare where an understanding of patient-specific mechanisms can lead to improved treatments (Gao et al., 2017). The use of mathematical models to link the underlying properties of the heart with cardiac imaging offers the possibility of obtaining important parameters of heart function non-invasively (Gao et al., 2015). Unfortunately, current estimation methods rely on complex mathematical forward simulations, resulting in a solution taking hours, a time frame not suitable for real-time treatment decisions. To increase the applicability of these methods, statistical emulation methods have been proposed as an efficient way of estimating the parameters (Davies et al., 2019; Noè et al., 2019). In this approach, simulations of the mathematical model are run in advance and then machine learning based methods are used to estimate the relationship between the cardiac imaging and the parameters of interest. These methods are, however, limited by our ability to understand how cardiac geometry varies across patients, which is in turn limited by the amount of data available (Romaszko et al., 2019). In this project we will look at AI based methods for generating fake cardiac geometries which can be used to increase the amount of data (Qiao et al., 2023). We will explore different types of AI generation, including Generative Adversarial Networks or Variational Autoencoders, to understand how we can generate better 3D and 4D models of the fake left ventricles and create an improved emulation strategy that can make use of them.
Statistical methodology for assessing the impacts of offshore renewable developments on marine wildlife (PhD). Statistical modelling for biology, genetics and *omics - example research projects. Modelling genetic variation (MSc/PhD).
Supervisors: Vincent Macaulay Relevant research groups: Bayesian Modelling and Inference , Modelling in Space and Time , Statistical Modelling for Biology, Genetics and *omics
Supervisors: Vinny Davies , Richard Reeve , Claire Harris (BIOSS) Relevant research groups: Bayesian Modelling and Inference , Computational Statistics , Environmental, Ecological Sciences and Sustainability , Statistical Modelling for Biology, Genetics and *omics
Supervisors: Vincent Macaulay , Luísa Pereira (Geneticist, i3s ) Relevant research groups: Statistical Modelling for Biology, Genetics and *omics , Biostatistics, Epidemiology and Health Applications
Traditional genome-wide association studies (GWAS), which aim to detect candidate genetic risk factors for complex diseases/phenotypes, rely largely on microarray technology, genotyping at once thousands or millions of variants regularly spaced across the genome. These microarrays include mostly common variants (minor allele frequency, MAF>5%), missing candidate rare variants, which are more likely to be deleterious [1]. Currently, the best strategy to genotype low-frequency (1%<MAF<5%) and rare (MAF<1%) variants is through next generation sequencing, and the increasing availability of whole genome sequences (WGS) places us on the brink of detecting rare variants associated with complex diseases [2]. Statistically, this detection constitutes a challenge, as the massive number of rare variants in genomes (for example, 64.7M in 150 Iberian WGSs) would imply genotyping millions/billions of individuals to attain statistical power. In the last couple of years, several statistical methods have been tested in the context of association of rare variants with complex traits [2, 3, 4], largely testing strategies to aggregate the rare variants. These works have not yet tested the statistical power that can be gained by incorporating reliable biological evidence on the aggregation of rare variants in the most probable functional regions, such as non-coding regulatory regions that control the expression of genes [4]. In fact, it has been demonstrated that even for common candidate variants, most of these variants (around 88%; [5]) are located in non-coding regions. If this is true for the common variants detected by traditional GWAS, it is highly probable that it is also true for rare variants.
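The frequency classes used above can be made concrete with a short Python sketch that computes the minor allele frequency of a variant from diploid genotype counts and assigns it to a class. The thresholds (5% and 1%) are those quoted in the text; how a variant sitting exactly on a threshold is classified is not stated there, so the boundary handling below is an assumption, as are the example genotypes.

```python
def minor_allele_frequency(genotypes):
    """MAF from a list of alternate-allele counts (0, 1 or 2) per diploid individual."""
    alt_freq = sum(genotypes) / (2 * len(genotypes))
    # The minor allele is whichever of the two alleles is less frequent.
    return min(alt_freq, 1.0 - alt_freq)


def frequency_class(maf):
    """Classify a variant by MAF; boundary values are assigned to the rarer class
    (an assumption, since the text only gives strict inequalities)."""
    if maf > 0.05:
        return "common"
    if maf > 0.01:
        return "low-frequency"
    return "rare"


# Hypothetical cohorts of 100 individuals with 1, 6 and 30 heterozygous carriers.
rare_variant = [1] * 1 + [0] * 99
low_freq_variant = [1] * 6 + [0] * 94
common_variant = [1] * 30 + [0] * 70

for g in (rare_variant, low_freq_variant, common_variant):
    maf = minor_allele_frequency(g)
    print(f"MAF={maf:.3f} -> {frequency_class(maf)}")
```

The first example shows why power is a problem: a variant carried by one person in a hundred contributes a single observation, which is the motivation for aggregating rare variants across a region before testing.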
In this work, we will implement a biology-empowered statistical framework to detect rare variant risk factors for complex diseases in WGS cohorts. We will draw on the 200,000 WGSs from the UK Biobank database [6], which will be available to scientists before the end of 2023. Access to clinical information on these UK residents, all over 40 years old, is also provided. We will build our framework around type-2 diabetes (T2D), a common complex disease for which thousands of common variant candidates have been found [7]. Also, the mapping of regulatory elements is well known for the pancreatic beta cells that play a leading role in T2D [8]. We will use this mapping to guide the aggregation of rare variants and test it against a random aggregation across the genome. Of course, the framework rationale will be applicable to any other complex disease. We will browse the literature for aggregation methods available at the beginning of this work, but we have already selected the method SKAT (sequence kernel association test; [3]) to be tested. SKAT fits a random-effects model to the set of variants within a genomic interval or biologically-meaningful region (such as a coding or regulatory region) and computes variant-set level p-values, while permitting correction for covariates (such as principal components that can account for population stratification between cases and controls).
Bayesian statistical data integration of single-cell and bulk “omics” datasets with clinical parameters for accurate prediction of treatment outcomes in rheumatoid arthritis (PhD).
Supervisors: Mayetri Gupta Relevant research groups: Bayesian Modelling and Inference , Computational Statistics , Vincent Macaulay , Biostatistics, Epidemiology and Health Applications
Supervisors: Andrew Elliott , Vinny Davies , Hao Gao Relevant research groups: Machine Learning and AI , Emulation and Uncertainty Quantification , Biostatistics, Epidemiology and Health Applications , Statistical Modelling for Biology, Genetics and *omics
Supervisors: Craig Anderson Relevant research groups: Modelling in Space and Time , Bayesian Modelling and Inference , Biostatistics, Epidemiology and Health Applications
Our group has an active PhD student community, and every year we admit new PhD students. We welcome applications from across the world. Further information can be found here .
Statistics and data analytics education - example research projects.
Program summary.
Students are required to
The PhD requires a minimum of 135 units. Students are required to take a minimum of nine units of advanced topics courses (for depth) offered by the department (not including literature, research, consulting or Year 1 coursework), and a minimum of nine units outside of the Statistics Department (for breadth). Courses for the depth and breadth requirements must equal a combined minimum of 24 units. In addition, students must enroll in STATS 390 Statistical Consulting, taking it at least twice.
All students who have passed the qualifying exams but have not yet passed the Thesis Proposal Meeting must take STATS 319 at least once each year. For example, a student taking the qualifying exams in the summer after Year 1 and having the dissertation proposal meeting in Year 3, would take 319 in Years 2 and 3. Students in their second year are strongly encouraged to take STATS 399 with at least one faculty member. All details of program requirements can be found in our PhD handbook (available to Stanford affiliates only, using Stanford authentication. Requests for access from non-affiliates will not be approved).
Statistics Department PhD Handbook
All students are expected to abide by the Honor Code and the Fundamental Standard .
During the first two years of the program, students' academic progress is monitored by the department's Graduate Director. Each student should meet at least once a quarter with the Graduate Director to discuss their academic plans and their progress towards choosing a thesis advisor (before the final study list deadline of spring of the second year). From the third year onward students are advised by their selected advisor.
Qualifying examinations are part of most PhD programs in the United States. At Stanford these exams are intended to test the student's level of knowledge when the first-year program, common to all students, has been completed. There are separate examinations in the three core subjects of statistical theory and methods, applied statistics, and probability theory, which are typically taken during the summer at the end of the student's first year. Students are expected to attempt all three examinations and show acceptable performance in at least two of them. Letter grades are not given. Qualifying exams may be taken only once. After passing the qualifying exams, students must file for Ph.D. Candidacy, a university milestone, by the end of spring quarter of their second year.
While nearly all students pass the qualifying examinations, those who do not can arrange to have their financial support continued for up to three quarters while alternative plans are made. Usually students are able to complete the requirements for the M.S. degree in Statistics in two years or less, whether or not they have passed the PhD qualifying exams.
The thesis proposal meeting is intended to demonstrate a student's depth in some areas of statistics, and to examine the general plan for their research. In the meeting the student gives a 60-minute presentation involving ideas developed to date and plans for completing a PhD dissertation, and for another 60 minutes answers questions posed by the committee, which consists of their advisor and two other members. The meeting must be successfully completed by the end of winter quarter of the third year. If a student does not pass, the exam must be repeated. Repeated failure can lead to a loss of financial support.
The Dissertation Reading Committee consists of the student’s advisor plus two faculty readers, all of whom are responsible for reading the full dissertation. Of these three, at least two must be members of the Statistics Department (faculty with a full or joint appointment in Statistics but excluding for this purpose those with only a courtesy or adjunct appointment). Normally, all committee members are members of the Stanford University Academic Council or are emeritus Academic Council members; the principal dissertation advisor must be an Academic Council member.
The Doctoral Dissertation Reading Committee form should be completed and signed at the Dissertation Proposal Meeting. The form must be submitted before approval of TGR status or before scheduling a University Oral Examination.
For further information on the Dissertation Reading Committee, please see the Graduate Academic Policies and Procedures (GAP) Handbook section 4.8.
The oral examination consists of a public presentation of approximately 60 minutes on the thesis topic, followed by a 60-minute question and answer period attended only by members of the examining committee. The questions relate to the student's presentation and also explore the student's familiarity with broader statistical topics related to the thesis research. The oral examination is normally completed during the last few months of the student's PhD period. The examining committee typically consists of four faculty members from the Statistics Department and a fifth faculty member from outside the department serving as the committee chair. Four out of five passing votes are required and no grades are given. Nearly all students can expect to pass this examination, although it is common for specific recommendations to be made regarding completion of the thesis.
The Dissertation Reading Committee must also read and approve the thesis.
For further information on university oral examinations and committees, please see the Graduate Academic Policies and Procedures (GAP) Handbook section 4.7 .
The dissertation is the capstone of the PhD degree. It is expected to be an original piece of work of publishable quality. The research advisor and two additional faculty members constitute the student's dissertation reading committee.
2015 onwards.
Abdulrafiu Babatunde Odunuga | |
Philip Maybank | |
Natalie Dimier | |
Chintu Desai | Statistical study designs for phase III pharmacogenetic clinical trials |
Frank Owusu-Ansah | Methodology for joint modelling of spatial variation and competition effects in the analysis of varietal selection trials |
Supada Charoensawat | A likelihood approach based upon the proportional hazards model for SROC modelling in meta-analysis of diagnostic studies |
Pianpool Kirdwichai | A nonparametric regression approach to the analysis of genomewide association studies |
Reynaldo Martina | DStat thesis: Challenges in modelling pharmacogenetic data: Investigating biomarker and clinical response simultaneously for optimal dose prediction |
Rungruttikarn Moungmai | Family-based genetic association studies in a likelihood framework |
Michael Dunbar | Multiple hydro-ecological stressor interactions assessed using statistical models |
Osama Abdulhey | Alcohol consumption and mortality from all and specific causes: the J-hypothesis. A systematic review and meta-analysis of current and historical evidence |
Rattana Lerdsuwansri | Generalisation of the Lincoln-Peterson approach to non-binary source variables |
Krisana Lanumteang | Estimation of the size of a target population using Capture-Recapture methods based upon multiple sources and continuous time experiments |
Rainer-Georg Göldner | Investigation of new single locus and multivariate methods for the analysis of genetic association studies |
Isak Neema | Survey and monitoring crimes in Namibia through the likelihood based cluster analysis |
Mercedes Andrade Bejarano | Monthly average temperature modelling for Valle del Cauca (Colombia) |
Robert Mastrodomenico | Statistical analysis of genetic association studies |
Ruth Butler | DStat thesis: An exploration of the statistical consequences of sub-sampling for species identification |
Carmen Ybarra Moncada | Multivariate methods with application to spectroscopy |
Alun Bedding | The Bayesian analysis of dose titration to effect in Phase II clinical trials in order to design Phase III |
Timothy Montague | Adaptive designs for bioequivalence trials |
Magnus Kjaer | Clinical trials of cytostatic agents with repeated measurements: using the regression coefficients as response |
Kamziah Abd Kudus | Survival analysis models for interval censored data with application to an plantation spacing trial |
Isobel Barnes | Point estimation after a sequential clinical trial |
Ben Carter | Statistical methodology for the analysis of microarray data |
Joanna Burke | Regularised regression in QTL mapping |
Alexandre M F G da Silva | Methods for the analysis of multivariate lifetime data with frailty |
Harsukhjit Deo | Analysis of a Quantitative Trait Locus for twin data using univariate and multivariate linear mixed effects models |
Kim Bolland | The design and analysis of neurological trials yielding repeated ordinal data |
Fazil Baksh | Sequential tests of association with applications in genetic epidemiology |
Martyn Byng | A statistical model for locating regulatory regions in novel DNA sequences |
Rob Deardon | Representation bias in field trials for airborne plant pathogens |
Marian Hamshere | Statistical aspects of objects generated by dynamic processes at sea, detected by remote sensing techniques |
Mike Branson | The analysis of survival data in which patients switch treatments |
Christoph Lang | Generalised estimating equation methods in statistical genetics |
V R P Putcha | Random effects in survival analysis |
Robin Fletcher | Statistical inversion of surface parameters from ATSR-2 satellite observations |
Seth Ohemeng-Dapaah | Methods for analysis and interpretation of genotype by environment interaction |
Emmanuelle Vincent | Sequential designs for clinical trials involving multiple treatments |
Pi Wen Tsai | Three-level designs robust to model uncertainty |
Jo Farebrother | Statistical design and analysis of factorial combination drug trials |
Mark Lennon | Design and analysis of multiple site large plot field experiments |
Norberto Lavorenti | Fitting models in a bivariate analysis of intercrops |
Bernard North | Contributions to survival analysis |
Karen Ayres | Measuring genetic correlations within and between loci, with implications for disequilibrium mapping and forensic identification |
Andrew Morris | Transmission tests of linkage and association using samples of nuclear families with at least one affected child |
Julian Higgins | Exploiting information in random effects meta-analysis |
Mohammed Inayat Khan | Improving precision of agricultural field experiments in Pakistan |
Luzia Trinca | Blocking response surface designs |
Phil Bowtell | Non-linear functional relationships |
Louise Burt | Statistical modelling of volcanic hazards |
Helen Millns | The application of statistical methods to the analysis of diet and coronary heart disease in Scotland |
Dominic Neary | Methods of analysis for ordinal repeated measures data |
Graham Pursey | Shape location and classification with reference to fungal spores |
Nigel Stallard | Increasing efficiency in the design and analysis of animal toxicology studies |
Katarzyna Stepniewska | Some variable selection problems in medical research |
Current/past master's theses.
© Universität Zürich | Jun 26, 2024
Research topics in probability and statistics
Problem solving in mathematics and statistics is inspiring and enjoyable. But are achievements in mathematics and statistics of any use in the so-called real world? Researchers in the Department of Statistics at Warwick are developing and utilising modern statistics, mathematics, and computing to solve practical problems.
Examples of themes for undergraduate research projects:
Probability of containment for multitype branching process models for emerging epidemics
Non-stationary statistical modeling and inference for circadian oscillations for research in cancer chronotherapy
Bayesian Models of Category-Specific Emotional Brain Responses
Decision focused inference on Networked Probabilistic Systems: with applications to food security
Rotationally invariant statistics for examining the evidence from the pores in fingerprints
Dynamic Uncertainty Handling for Coherent Decision Making in Nuclear Emergency Response
Study of Key Interventions into Terrorism using Bayesian Networks
Assessing the risk of subsequent tonic-clonic seizures in patients with a history of simple or complex partial seizures
Multidimensional Markov-functional Interest Rate Models
Prospect Theory, Liquidation and the Disposition Effect
Dynamic Bradley-Terry modelling of sports tournaments
Further information on the wide range of research opportunities open to you as an Undergraduate or Postgraduate Taught student in the Department of Statistics can be found on our Student Research Opportunities webpage.
More information about research in the Department of Statistics, both applied and theoretical, can be found at the departmental research pages.
The work of mathematicians and statisticians often turns out to be useful and essential, but typically in a less concrete manner than, say, the work of a scientist or a physician. David Hilbert, in his now historical address to scientists and physicians, put it this way:
"The instrument that mediates between theory and practice, between thought and observation, is mathematics; it builds the connecting bridge and makes it stronger and stronger. Thus it happens that our entire present-day culture, insofar as it rests on intellectual insight into and harnessing of nature, is founded on mathematics"
Almost a century after Hilbert's words, the mathematical foundations of the sciences and social sciences, and the evidence-based approach in medicine, are often taken for granted. In the 21st century we are facing complex big data sets with unknown structures from manifold aspects of the 'real world', as well as fascinating discourses about objective and subjective notions of risk and uncertainty.
Probability and statistics are mathematical disciplines for modelling and analysing theoretical and practical aspects of these burning questions.
2015. 2014. 2013. 2012. 2011. 2010. 2009. 2008. This list of recent dissertation topics shows the range of research areas that our students are working on.
Theses/Dissertations from 2016 PDF. A Statistical Analysis of Hurricanes in the Atlantic Basin and Sinkholes in Florida, Joy Marie D'andrea. PDF. Statistical Analysis of a Risk Factor in Finance and Environmental Models for Belize, Sherlene Enriquez-Savery. PDF. Putnam's Inequality and Analytic Content in the Bergman Space, Matthew Fleeman. PDF
Dissertation Advisor: Jim Dai. Initial job placement: Applied Scientist - Amazon. Seth Strimas-Mackey - "Latent structure in linear prediction and corpora comparison" Dissertation Advisor: Marten Wegkamp and Florentina Bunea. Initial job placement: Data Scientist at Google. Tao Zhang - "Topics in modern regression modeling"
The most interesting research topics in statistics vary from student to student, but here are some key topics that interest almost every student: Literacy rate in a city. Abortion and pregnancy rate in the USA. Eating disorders in the citizens.
2022 Ph.D. Dissertations. Andrew Davison. Statistical Perspectives on Modern Network Embedding Methods. Sponsor: Tian Zheng. Nabarun Deb. Blessing of Dependence and Distribution-Freeness in Statistical Hypothesis Testing. Sponsor: Bodhisattva Sen / Co-Sponsor: Sumit Mukherjee. Elliot Gordon Rodriguez.
Here are some of the best statistical research topics worth writing on: Predictive Healthcare Modeling with Machine Learning. Analyzing Online Education During COVID-19 Epidemic. Modeling How Climate Change Affects Natural Disasters. Essential Elements Influencing Personnel Productivity. Social Media Influence on Customer Choices and Behavior.
PhD Theses. 2023. Title. Author. Supervisor. Statistical Methods for the Analysis and Prediction of Hierarchical Time Series Data with Applications to Demography. Daphne Liu. Adrian E Raftery. Statistical methods for genomic sequencing data.
This dissertation will introduce novel methodology and review state-of-the-art existing methods in three different areas of applied statistics. Chapter 2 focuses on modelling subcommunity dynamics in gut microbiome data. Existing methods ignore cross-sample heterogeneity in subcommunity composition; we propose a novel mixed-membership model ...
Praphruetpong (Ben) Athiwaratkun - "Density representations for words and hierarchical data" Dissertation Advisor: Andrew Wilson Initial job placement: AI Scientist - AWS AI Labs Yiming Sun - "High dimensional data analysis with dependency and under limited memory" Dissertation Advisor: Sumanta Basu and Madeleine Udell Initial job placement: Applied Scientist - Amazon Zi Ye - "Functional ...
Senior theses in Statistics cover a wide range of topics, across the spectrum from applied to theoretical. Typically, senior theses are expected to have one of the following three flavors: 1. Novel statistical theory or methodology, supported by extensive mathematical and/or simulation results, along with a clear account of how the research ...
Theses/Dissertations from 2015 PDF. Healthy And Unhealthy Statistics: Examining The Impact Of Erroneous Statistical Analyses In Health-Related Research, Britney Allen. PDF. Recent Advances in Accumulating Priority Queues, Na Li. PDF. Quantitative Techniques for Spread Trading in Commodity Markets, Mir Hashem Moosavi Avonleghi. PDF
If you're just starting out exploring data science-related topics for your dissertation, thesis or research project, you've come to the right place. ... Survey on Statistics and ML in Data Science and Effect in Businesses (Reddy et al., 2022) Visualization in Data Science VDS @ KDD 2022 (Plant et al., 2022) ...
AUTHOR: In each respective box, enter your names (and/or initials) as they appear on the title page of your dissertation or thesis. You are the sole author; your advisor is not considered a co-author. Institution is University of Nebraska-Lincoln (not "at Lincoln" or ", Lincoln"). Do not leave this field blank.
Step 1: Check the requirements. Step 2: Choose a broad field of research. Step 3: Look for books and articles. Step 4: Find a niche. Step 5: Consider the type of research. Step 6: Determine the relevance. Step 7: Make sure it's plausible. Step 8: Get your topic approved.
Updated: April 2024. Math/Stats Thesis and Colloquium Topics 2024-2025. The degree with honors in Mathematics or Statistics is awarded to the student who has demonstrated outstanding intellectual achievement in a program of study which extends beyond the requirements of the major. The principal considerations for recommending a student for the degree with honors will be: Mastery of core ...
With vast experience in the world of academics and command of statistics dissertations, they have prepared the list of most suitable statistics dissertation topics. Bayesian Methods for Functional and Time Series. Kernel Regression Using the Four Fourier Transform. Assessing and Accounting for Correlation in RNA-Seq Data Analysis.
500+ Statistics Research Topics. March 25, 2024. by Muhammad Hassan. Statistics is a branch of mathematics that deals with the collection, analysis, interpretation, presentation, and organization of data. It is a fundamental tool used in various fields such as business, social sciences, engineering, healthcare, and many more.
Statistics thesis topics. Below are sample topics available for prospective postgraduate research students. These sample topics do not contain every possible project; they are aim
The thesis proposal meeting is intended to demonstrate a student's depth in some areas of statistics, and to examine the general plan for their research. In the meeting the student gives a 60-minute presentation involving ideas developed to date and plans for completing a PhD dissertation, and for another 60 minutes answers questions posed by ...
MSc thesis (Biostatistics, University of Zurich, 2013): Disease mapping with the Besag-York-Mollié model applied to a cancer and a worm infections dataset. 2013, Master's thesis in Biostatistics. Stefan Purtschert. Construction of bathymetric charts using spatial statistics. 2012, Master's thesis in Mathematics.
Examples of themes for undergraduate research projects: Discovering which genes can discriminate between diseased and healthy patients. Modelling and detecting asset price bubbles while they are happening and before they burst. Modelling infectious diseases and identifying localized outbreaks. Developing a fast algorithm through probabilistic ...