Probability and Statistics Research Seminars
2nd Semester 2013/2014

29 January 2014
Pearson diffusions
Nikolai Leonenko (Cardiff University)
2.15pm, Frank Adams 2
Abstract: Pearson diffusions have stationary distributions of Pearson type. They include the Ornstein-Uhlenbeck, Cox-Ingersoll-Ross, and several other well-known processes. Their stationary distributions solve the Pearson equation, developed by Pearson in 1914 to unify some important classes of distributions (e.g., normal, gamma, beta, inverse gamma, Student and Fisher-Snedecor). The eigenfunction expansions of their transition densities involve the classical orthogonal polynomials (e.g., Hermite, Laguerre, Jacobi).
The self-adjointness of the semigroup generator of one-dimensional diffusions implies a spectral representation which has found many useful applications, for example in the prediction of second-order stationary sequences and in mathematical finance. However, on non-compact state spaces the spectrum of the generator will typically include both a discrete and a continuous part, with the latter starting at a spectral cutoff point related to the non-existence of stationary moments. The significance of this fact for statistical estimation is not yet fully understood.
We consider here the problem of spectral representation of the transition density for an interesting class of examples: the hypergeometric diffusions with heavy-tailed Pearson-type invariant distribution of a) reciprocal (inverse) gamma, b) Fisher-Snedecor, or c) skew-Student type. As opposed to the "classic" hypergeometric diffusions (Ornstein-Uhlenbeck, Gamma/CIR, Beta/Jacobi), these diffusions have a continuous spectrum, whose spectral cutoff and transition density we present in this lecture.

5 February 2014
Designs for generalised linear models with random block effects
Dave Woods (University of Southampton)
2.15pm, Frank Adams 2
Abstract: For an experiment measuring independent discrete responses, a generalised linear model, such as the logistic or log-linear, is typically used to analyse the data. In blocked experiments, where observations from the same block are potentially dependent, it may be appropriate to include random effects in the predictor, thus producing a generalised linear mixed model. Selecting optimal designs for such models is complicated by the fact that the Fisher information matrix, on which most optimality criteria are based, is computationally expensive to evaluate. In addition, the dependence of the information matrix on the unknown values of the parameters must be overcome by, for example, use of a pseudo-Bayesian approach. We use a variety of closed-form approximations, some derived from marginal quasi-likelihood, to develop computationally inexpensive surrogates for the information matrix to obtain D-optimal designs. This approach reduces the computational burden substantially, enabling straightforward selection of multi-factor designs. The accuracy of the closed-form approximations is explored for the first time using a novel computational approximation. It is found that correcting for the marginal attenuation of parameters in binary-response models yields much improved designs.

19 February 2014
Multivariate Extreme Value Methods for Univariate and Spatial Flood Risk Assessment
Jonathan Tawn (University of Lancaster)
2.15pm, Frank Adams 2
Abstract: The talk will cover two distinct problems in flood risk assessment: the estimation of the distribution of flood peaks at a site and the estimation of the distribution of "financial loss" over a region from flooding. Approaches based on univariate extreme value theory exist for each of these, with the one for flood peaks being very widely used. Both of these problems are essentially multivariate problems. In this talk I will present a multivariate extreme value approach to each of the two problems that offers substantial improvements over the existing methods.

5 March 2014
Filtering, Drift Homotopy and Target Tracking
Vasileios Maroulas (University of Tennessee, USA)
2.15pm, Frank Adams 2
Abstract: Target tracking is a problem of paramount importance arising in Biology, Defense, Ecology and other scientific fields. We attack this problem by employing particle filtering. Particle filtering is an importance sampling method which may fail on several occasions, e.g. with high-dimensional data. In this talk, we present a novel approach for improving particle filters suited to target tracking with a nonlinear observation model. The suggested approach is based on what I will call drift homotopy for stochastic differential equations which describe the dynamics of the moving target. Based on drift homotopy, we design a Markov Chain Monte Carlo step which is appended to the particle filter and aims to bring the particles closer to the observations while at the same time respecting the dynamics. The talk is based on joint works with Kai Kang and Panos Stinis.

12 March 2014
On the stochastic Cahn-Hilliard equation
Yiming Jiang (Nankai University)
2.15pm, Frank Adams 2

19 March 2014
Asymptotics of forward implied volatility
Antoine Jacquier (Imperial College London)
2.15pm, G.114
Abstract: We study the asymptotic behaviour of the forward implied volatility (namely the implied volatility corresponding to forward-start European options). Our tools rely on (finite-dimensional) large deviations and saddlepoint analysis, albeit not necessarily relying on standard convexity arguments. We shall also relate this to the Freidlin-Wentzell approach for sample paths. As a corollary, we obtain non-trivial weak convergence results for diffusion processes with random initial data.

2 April 2014
Population and individual level trends in BMI using cross-sectional and longitudinal data
Matthew Sperrin (University of Manchester)
2.15pm, Frank Adams 2
Abstract: BMI is a commonly used measure of body fat, and is used to characterise whether an individual is underweight, normal weight, overweight or obese. In 2007 a UK government report predicted that over half of the UK population would be obese by 2050. While this was based on a very crude extrapolation, it has generated substantial interest in the area. In this talk I will discuss: 1) work in which we use Health Surveys for England to describe trends in BMI, using mixture models to characterise the changing shape of the BMI distribution over time; 2) longitudinal analysis looking at BMI trajectories at the individual level, using cohort studies and routinely collected data, in which we use linear latent and mixed models to capture subgroups of people with different trajectories; and 3) a brief consideration of the merits of BMI as opposed to other measures of body fat.

9 April 2014
Time-dependency in multi-state models: specification and estimation
Ardo van den Hout (University College London)
2.15pm, Frank Adams 2
Abstract: Continuous-time multi-state survival models can be used to describe health-related processes over time. In the presence of interval-censored times for transitions between the living states, the likelihood is constructed using transition probabilities. Model specification and maximum likelihood estimation using the scoring algorithm are extended to allow for flexible parametric modelling of the time-dependency of the process. Within one multi-state model, transition-specific hazards can be defined using Gompertz and Weibull formulations. Data on cognitive function in the older population are analysed to illustrate the methods, where the data are taken from the Cognitive Function and Ageing Studies (CFAS) and from the English Longitudinal Study of Ageing (ELSA).

7 May 2014
Regularity properties of SDEs with discontinuous drift
Torstein Kastberg Nilssen (University of Oslo)
2.15pm, Frank Adams 2
Abstract: In this talk I will present a new method for constructing strong solutions to SDEs with discontinuous drift coefficients. This method reveals regularity properties of the solution that are not easy to see using the standard method, i.e. the Yamada-Watanabe theorem.

21 May 2014
Joint modelling of longitudinal and survival data in biomedical research
Michael Crowther (University of Leicester)
2.00pm, Frank Adams 2
Abstract: The joint modelling of longitudinal and survival data has been a rapidly expanding area of methodological research in the past 20 years, but it is only now making its way into applied research, particularly in the fields of cancer and AIDS. Data linkage and personalised medicine are moving to the forefront of healthcare strategy, leading to many opportunities to utilise the joint model framework in the development of dynamic, patient-tailored prognostic models. I will introduce the concepts and motivation for joint modelling through clinical datasets in liver cirrhosis and cardiovascular disease, describe available software, and finally describe the extension to joint modelling of survival and multiple, possibly correlated, longitudinal outcomes within the generalised multivariate mixed effects framework.

21 May 2014
Projections of random covering sets
Henna Koivusalo (University of York)
3.00pm, Frank Adams 2

4 June 2014
Stochastic systems with non-normal drift
Conall Kelly (The University of the West Indies, Jamaica)
2.15pm, Frank Adams 2
Abstract: The equilibrium of a linearised ODE system with non-normal coefficient matrix may display large transient responses to initial value perturbations, even when the equilibrium is asymptotically stable. Systems of this kind arise in models of, for example, population dynamics (Neubert et al. 1997, 2004) and magnetohydrodynamics (Fedotov et al. 2004, 2006).
Such transients render the stability of the equilibrium vulnerable to stochastic perturbation, and even perturbations that are of insufficient intensity to destabilise such an equilibrium can magnify the transient response. This has implications for stochastic modelling as well as for the analysis of stochastic numerical methods. In this talk, we consider the effects of various kinds of stochastic perturbation on the almost sure and mean-square asymptotic stability of such systems, discuss methods of analysis, and provide some relevant applied examples.

1st Semester 2013/2014

25 September 2013
Exit problems for spectrally negative Markov additive processes
Zbigniew Palmowski (University of Wroclaw)
2.15pm, Frank Adams 2

9 October 2013
A Joint Modelling Approach for Longitudinal Studies
Chenlei Leng (University of Warwick)
2.15pm, Frank Adams 2
Abstract: In longitudinal studies, it is of fundamental importance to understand the dynamics in the mean function, variance function, and correlations of the repeated or clustered measurements. For modelling the covariance structure, Cholesky-type decomposition-based approaches have been demonstrated to be effective. However, parsimonious approaches for directly revealing the correlation structure among longitudinal measurements remain less explored, and existing joint modelling approaches may encounter difficulty in interpreting the covariation structure. In this paper, we propose a novel joint mean-variance-correlation modelling approach for longitudinal studies. By applying hyperspherical coordinates, we obtain an unconstrained parametrization for the correlation matrix that automatically guarantees its positive definiteness, and develop a regression approach to model the correlation matrix of the longitudinal measurements by exploiting the parametrization. The proposed modelling framework is parsimonious, interpretable, and flexible for analysing longitudinal data. Extensive data examples and simulations support the effectiveness of the proposed approach. This is joint work with Weiping Zhang and Cheng Yong Tang.
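
The hyperspherical parametrization mentioned in this abstract can be illustrated with a small sketch. This is only the standard textbook construction (a lower-triangular factor whose rows lie on the unit sphere), not the speakers' implementation; the function name `correlation_from_angles`, the logistic map from real numbers to angles, and the random inputs are all assumptions made for illustration.

```python
import numpy as np

def correlation_from_angles(angles):
    """Build a correlation matrix R = T T^T from angles in (0, pi).

    Only the strictly lower-triangular entries of `angles` are used.
    Each row of T is a unit vector written in hyperspherical coordinates,
    so R has unit diagonal; the diagonal of T is a product of sines,
    which is positive, so R is positive definite.
    """
    d = angles.shape[0]
    T = np.zeros((d, d))
    T[0, 0] = 1.0
    for j in range(1, d):
        sin_prod = 1.0  # running product sin(phi_j1) * ... * sin(phi_j,k-1)
        for k in range(j):
            T[j, k] = np.cos(angles[j, k]) * sin_prod
            sin_prod *= np.sin(angles[j, k])
        T[j, j] = sin_prod
    return T @ T.T

# Hypothetical unconstrained parameters, mapped into (0, pi) by a logistic transform.
rng = np.random.default_rng(0)
raw = rng.normal(size=(4, 4))
angles = np.pi / (1.0 + np.exp(-raw))
R = correlation_from_angles(angles)
```

Any real-valued `raw` array yields a valid correlation matrix here, which is the sense in which the parametrization is unconstrained.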

23 October 2013
Random Fluid Limit in an Overloaded Polling Model and Related Questions
Sergey Foss (Heriot-Watt University)
2.15pm, Frank Adams 2
Abstract: The Fluid Approximation Approach (Rybko-Stolyar 1992, Dai 1995, Stolyar 1995, etc.) has become a powerful tool to analyse (in)stability and related problems in complex stochastic models. To obtain a fluid limit, one scales a random process under consideration linearly in time and in space (as in the SLLN) and lets the scaling parameter tend to infinity. Any weak limit (if it exists) is called a fluid limit. In queueing and communications models, fluid limits are typically deterministic, but not always. In the talk, I discuss reasons for fluid limits to stay random and consider examples of an overloaded polling model and of another queueing system. My talk is based on joint papers with A Kovalevskii and V Topchii (2005), with M Frolkova and B Zwart (2014) and, possibly, with A Kovalevskii (1999); see http://www.math.nsc.ru/LBRT/v1/foss/#bib for possible downloading.

4 November 2013
Spacial martingales / Occupation problems in high-dimensional spaces, problems of diversity and connection with theory of infinite divisibility / Duration of reign of Chinese emperors
Estate Khmaladze (Victoria University of Wellington)
1:15-3pm, FA2

6 November 2013
Modelling Covariance Structure for Incomplete Multivariate Longitudinal Data
Jing Xu (Birkbeck College, University of London)
2.15pm, Frank Adams 2
Abstract: Missing response data in multivariate longitudinal studies has the potential to create major challenges for statistical analysis. The covariance modelling method proposed in Xu and MacKenzie (2012) handles balanced multivariate longitudinal data, but it may not be applicable to unbalanced data subject to missingness. This paper overcomes such problems by using an EM algorithm. It also focuses on the gain attained when modelling covariance structures resulting from studying correlated multiple response variables simultaneously rather than by modelling the responses separately. Simulations conducted show two points: firstly, the EM algorithm proposed in this paper can yield valid mean and covariance estimates for unbalanced data due to missingness; secondly, when correlation and cross-correlation between multiple variables over time are present, simultaneous modelling noticeably improves the efficiency of mean estimates.

13 November 2013
Undiscounted infinite horizon optimal stopping problems
Jan Palczewski (University of Leeds)
2.15pm, Frank Adams 2

27 November 2013
Large-scale regression with sparse data
Rajen Shah (University of Cambridge)
2.15pm, G1.10
Abstract: The information age in which we are living has brought with it a combination of statistical and computational challenges that often must be met with approaches that draw on developments from both the fields of statistics and computer science. Here I will present a method for performing regression where the $n$ by $p$ design matrix may have both $n$ and $p$ in the millions, but where the design matrix is sparse, i.e. most of its entries are zero; such sparsity is common in many large-scale applications such as text analysis. In this setting, performing regression using the original data can be computationally infeasible. Instead, we first map the design matrix to an $n \times L$ matrix with $L \ll p$, using a scheme based on a technique from computer science known as minwise hashing. From a statistical perspective, we study the performance of regression using this compressed data, and give finite sample bounds on the prediction error. Interestingly, despite the loss of information through the compression scheme, we will see that least squares or ridge regression applied to the reduced data can actually allow us to fit a model containing interactions in the original data.

11 December 2013
Some nonlinear stochastic partial differential equations of second order in time: existence of solutions and convergence of a full discretization
David Siska (University of Liverpool)
2.15pm, Frank Adams 2

Further information
For further information contact Dr Christiana Charalambous (Statistics) or Dr Ronnie Loeffen (Probability).