Tim Sullivan

Junior Professor in Applied Mathematics:
Risk and Uncertainty Quantification

Outlines of Some Research Topics

My general area of research is uncertainty quantification, at the intersection of applied mathematics, probability, statistics, and computational science. The brief descriptions below outline some of my research interests, past work, and ideas for the future. If you are interested in any of these areas, then please get in touch!

A common theme underlying many of my current research interests is to add ‘the right amount of uncertainty’ to an otherwise certain or near-certain prediction, in order to correctly compensate for over-confidence in the modelling process. In the words of Julian Baggini:

The mark of a mature, psychologically healthy mind is indeed the ability to live with uncertainty and ambiguity, but only as much as there really is.

Probabilistic Numerics

Beginning with the seminal papers of Diaconis (1988) and O'Hagan (1992), there has been interest in giving probabilistic solutions to deterministic problems such as quadrature, linear algebra, optimisation, etc. For example, a Bayesian approach to computing the integral of some function \(f \colon [a, b] \to \mathbb{R}\) would be to specify a prior probability measure on a suitable space of integrands, treat the evaluations \(y_{i} = f(x_{i})\) of \(f\) at \(n\) quadrature nodes \(x_{i} \in [a, b]\) as observed data, and then form the posterior distribution

\( \mathbb{P} \left( \int_{a}^{b} f(x) \, \mathrm{d} x \,\middle|\, f(x_{i}) = y_{i} \text{ for } i = 1, \dots, n \right) . \)

The expected value of this distribution gives an estimate of the value of the integral, and the variance can be used as a measure of accuracy.
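As a minimal illustration of this construction, the following sketch performs Bayesian quadrature with a zero-mean Gaussian-process prior and a squared-exponential kernel, conditioning on a handful of point evaluations and reading off the posterior mean and variance of the integral. The kernel, its length scale, and the grid-based approximation of the kernel integrals are illustrative choices, not part of the cited work.

```python
# A minimal sketch of Bayesian quadrature: a zero-mean Gaussian-process prior
# with a squared-exponential kernel is conditioned on n point evaluations of f,
# and the posterior over the integral is Gaussian with explicit mean and
# variance.  The kernel, length scale ell, and the trapezoidal approximation
# of the kernel integrals are illustrative assumptions.
import numpy as np

def bayesian_quadrature(f, a, b, nodes, ell=0.3, sigma=1.0, n_grid=2000):
    k = lambda s, t: sigma**2 * np.exp(-0.5 * ((s - t) / ell)**2)
    x = np.asarray(nodes, dtype=float)
    y = f(x)                                                 # observations y_i = f(x_i)
    K = k(x[:, None], x[None, :]) + 1e-10 * np.eye(len(x))   # Gram matrix + jitter

    # Approximate the kernel integrals on a fine trapezoidal grid.
    grid = np.linspace(a, b, n_grid)
    w = np.full(n_grid, (b - a) / (n_grid - 1))
    w[[0, -1]] *= 0.5
    z = (k(grid[:, None], x[None, :]) * w[:, None]).sum(axis=0)  # z_i = int k(s, x_i) ds
    kk = w @ k(grid[:, None], grid[None, :]) @ w                 # int int k(s, t) ds dt

    mean = z @ np.linalg.solve(K, y)                   # posterior mean of the integral
    var = max(kk - z @ np.linalg.solve(K, z), 0.0)     # posterior variance (clipped at 0)
    return mean, var

# Example: int_0^1 sin(3x) dx = (1 - cos 3) / 3, approximately 0.6633.
m, v = bayesian_quadrature(lambda x: np.sin(3 * x), 0.0, 1.0, np.linspace(0.0, 1.0, 6))
print("posterior mean:", m, "posterior std:", np.sqrt(v))
```

The posterior standard deviation reported at the end plays exactly the role described above: it quantifies how uncertain the estimate of the integral remains given only the \(n\) point evaluations.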

For forward ODE and elliptic PDE solves arising in Bayesian inverse problems, this area was recently stimulated by Conrad et al. (2016), who pointed out the problems associated with over-confident inferences arising from under-resolved models, and established an optimal scaling for Gaussian process perturbations of standard numerical schemes (e.g. implicit Euler or RK4) that introduces maximal uncertainty without changing the deterministic convergence rate for the mean. Thus, for an ODE

\( u'(t) = f(u(t)) \)

and a time step \(\tau > 0\), one replaces the deterministic numerical integrator (discrete time flow map \(\Psi^{\tau}\))

\( U_{k + 1} = \Psi^{\tau}(U_{k}), \)

which generates deterministic approximations \(U_{k} \approx u(k \tau)\), by a randomised integrator

\( U_{k + 1} = \Psi^{\tau}(U_{k}) + \xi_{k}(\tau), \)

where the \(\xi_{k}(t)\) are suitably-chosen stochastic processes: one desirable property is that if \(\Psi^{\tau}\) has truncation error of order \(q\), then so too should \(\mathbb{E}[U]\) — though, of course, the randomised trajectories will exhibit some ‘spread’ about the mean. This approach provides a statistical characterisation of the impact of numerical errors in the integration scheme, which is highly desirable in Bayesian inverse problems, in which the numerical solution \(U\) may be used to help infer important physical parameters.

This video (available in mp4, ogg, and webm formats) shows numerical solutions of the chaotic Lorenz-63 system using deterministic fourth-order Runge–Kutta (in red) with time step \(\tau = 10^{-3}\), together with 2048 independent realisations of a probabilistic RK4 integrator (in blue) with the same time step, which at each step adds a random perturbation of the same order as the known local truncation error (i.e. of order \(\tau^{5}\)). For times up to approximately \(t = 38\), the deterministic and randomised solutions remain very close; after this time, the probabilistic ensemble quickly becomes profoundly non-Gaussian, and the classical deterministic solution is effectively just one of many “reasonable” draws from this distribution. Such randomised solvers can be used to incorporate solver error and uncertainty into inference problems, e.g. trying to infer the initial conditions or other parameters of the system.
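The sketch below reproduces this experiment at a much smaller scale (a handful of realisations rather than 2048). The initial condition, the ensemble size, and the constant multiplying \(\tau^{5}\) in the noise amplitude are illustrative choices rather than the calibrated values used in the video.

```python
# A small-scale sketch of the experiment described above: classical RK4 for the
# Lorenz-63 system, and a randomised RK4 that adds a Gaussian perturbation of
# size O(tau**5) at every step.  The initial condition, ensemble size, and the
# constant `noise_scale` multiplying tau**5 are illustrative choices.
import numpy as np

def lorenz63(u, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = u
    return np.array([sigma * (y - x), x * (rho - z), x * y - beta * z])

def rk4_step(f, u, tau):
    k1 = f(u)
    k2 = f(u + 0.5 * tau * k1)
    k3 = f(u + 0.5 * tau * k2)
    k4 = f(u + tau * k3)
    return u + (tau / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

def integrate(f, u0, tau, n_steps, noise_scale=0.0, seed=0):
    """noise_scale = 0 gives deterministic RK4; noise_scale > 0 adds a per-step
    perturbation xi_k with standard deviation noise_scale * tau**5."""
    rng = np.random.default_rng(seed)
    u = np.array(u0, dtype=float)
    traj = [u.copy()]
    for _ in range(n_steps):
        u = rk4_step(f, u, tau)
        if noise_scale > 0.0:
            u = u + rng.normal(0.0, noise_scale * tau**5, size=u.shape)
        traj.append(u.copy())
    return np.array(traj)

tau, T = 1e-3, 40.0
n = int(T / tau)
u0 = [-10.0, -5.0, 30.0]
deterministic = integrate(lorenz63, u0, tau, n)
ensemble = [integrate(lorenz63, u0, tau, n, noise_scale=10.0, seed=s) for s in range(8)]

# By the end of the run the realisations have decorrelated from one another.
final_states = np.array([e[-1] for e in ensemble])
print("ensemble spread at t = 40:", final_states.std(axis=0))
print("deterministic state at t = 40:", deterministic[-1])
```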

I am interested in furthering this line of work, both in developing the theory of probabilistic integrators for more general ordinary and partial differential equations, and seeking applications, e.g. to molecular dynamics, atomistics, and other multiscale modelling problems.

Optimal Distributionally-Robust Uncertainty Quantification

In its most general form, uncertainty quantification can be cast as a constrained optimisation problem over a large space of parameters, functions, and probability measures, with information acting as constraints on the feasible set (Owhadi et al., 2013; Sullivan et al., 2013). Typically, one is interested in the expected value (and other moments) of some quantity of interest \(q\), as the probability distribution \(\mu\) of the inputs \(X\) and any associated response function \(f\) range over a general admissible set \(\mathcal{A}\); that is, we need to calculate and minimise/maximise

\( \mathbb{E}_{X \sim \mu}[q(X, f(X))], \quad (\mu, f) \in \mathcal{A}. \)

The admissible set \(\mathcal{A}\) corresponds to the available information about the system of interest: for example, one might know that the mean of \(X\) must lie between \(a\) and \(b\), and so the constraint \(a \leq \mathbb{E}_{X \sim \mu}[X] \leq b\) would form part of the definition of \(\mathcal{A}\). Similarly, one could consider higher-order statistics of \(\mu\), as well as information about the response function \(f\), e.g. numerical models and approximations, archives of legacy data, and so on (Sullivan et al., 2013; Kamga et al., 2014).

In principle, the optimisation problem over \((\mu, f) \in \mathcal{A}\) is infinite-dimensional, and therefore computationally intractable. However, extensions of classical Choquet theory (Owhadi et al., 2013) can be used to show that, under generalised independence and moment constraints on the probability measures \(\mu\), this optimisation problem reduces to a search over a finite-dimensional effective admissible set \(\mathcal{A}_{\Delta} \subset \mathcal{A}\), and this reduced problem can be solved numerically — in some cases, it is even a convex problem (Han et al., 2015). In the reduced setting \(\mathcal{A}_{\Delta}\), the measures \(\mu\) are discrete measures with finite support, and the values of \(f\) are only needed on the finitely many points where \(\mu\) assigns positive probability mass:

\( \mu \to \displaystyle \sum_{i = 0}^{N} w_{i} \delta_{x_{i}}, \quad f \to (f(x_{i}))_{i = 0}^{N} . \)
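As a small worked example of such a reduced problem, consider maximising \(\mathbb{P}[X \geq a]\) over probability measures on \([0, 1]\) satisfying \(\mathbb{E}[X] \leq m\): the reduction says that it suffices to search over measures supported on at most two points, and the optimal value recovers the sharp Markov bound \(\min(1, m/a)\). The brute-force grid search below is purely illustrative (in practice one would use a numerical optimiser), and the specific values of \(a\) and \(m\) are arbitrary.

```python
# Illustrative reduced OUQ problem: maximise P[X >= a] over probability
# measures on [0, 1] with E[X] <= m.  After reduction it suffices to search
# over two-point measures mu = w*delta_{x0} + (1 - w)*delta_{x1}; the optimum
# recovers the tight Markov bound min(1, m/a).
import numpy as np

a, m = 0.5, 0.2                        # threshold and mean bound (illustrative values)
grid = np.linspace(0.0, 1.0, 101)      # candidate weights and support points

w = grid[:, None, None]                # weight on the first support point
x0 = grid[None, :, None]               # first support point
x1 = grid[None, None, :]               # second support point

mean = w * x0 + (1 - w) * x1                  # E[X] under the two-point measure
prob = w * (x0 >= a) + (1 - w) * (x1 >= a)    # P[X >= a] under the two-point measure
best = np.max(np.where(mean <= m, prob, 0.0)) # maximise over the feasible set

print("numerical optimum :", best)                # approximately 0.4
print("Markov bound m/a  :", min(1.0, m / a))     # = 0.4
```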

For the forward uncertainty propagation problem, this approach allows for the optimal propagation of distributional uncertainty (e.g. histogram envelopes) through general forward models. In practice, this opens up new possibilities for probabilistic reliability forecasting and quality control. For inverse problems, this approach allows one to study the robustness of Bayesian inversion to perturbations of the prior and likelihood model.

Bayesian Inverse Problems

Bayesian procedures are popular methods for the quantification of uncertainty, especially in inverse problems. In many modern applications, the underlying models are infinite-dimensional, and discretisation only enters as a computational consideration. The general form of an inverse problem is, given spaces \(U\) and \(Y\) and a known operator \(G \colon U \to Y\), to recover \(u \in U\) from \(y \in Y\), where \(y\) is a (perhaps noisily corrupted) version of \(G(u)\). A simple example is the additive noise model

\( y = G(u) + \eta , \)

where \(Y\) is a vector space and \(\eta\) is a \(Y\)-valued random variable. In modern applications, the forward model \(G\) may correspond to solving an ordinary or partial differential equation, and the parameter \(u\) may be a scalar, vector, or tensor field, so that \(U\) is an infinite-dimensional space. Inverse problems are typically ill-posed in the strict sense (i.e. no such \(u \in U\) exists, or there are multiple solutions, or the solutions depend sensitively upon the problem setup). In the Bayesian paradigm, which is closely related to the tradition of regularisation, the inverse problem is rendered well-posed by seeking a conditional posterior probability distribution for \(u\) given \(y\), not just a point estimate.
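In the linear-Gaussian case the posterior is available in closed form, which makes for a compact illustration of how the prior regularises an otherwise underdetermined problem. In the sketch below, the forward map, the dimensions, and the prior and noise scales are illustrative choices only.

```python
# A minimal linear-Gaussian sketch of the additive-noise model y = G u + eta:
# with Gaussian prior N(0, C0) and Gaussian noise N(0, Gamma), the posterior
# for u given y is Gaussian in closed form.  The random forward map G, the
# dimensions, and the prior/noise scales are illustrative; here d > n, so the
# prior is what regularises the underdetermined problem.
import numpy as np

rng = np.random.default_rng(1)
d, n = 20, 10                              # dim(u) = 20 unknowns, dim(y) = 10 data
G = rng.normal(size=(n, d))                # a generic linear forward map
C0 = np.eye(d)                             # prior covariance
Gamma = 0.1**2 * np.eye(n)                 # observational noise covariance

u_true = rng.normal(size=d)                # 'truth' used only to synthesise data
y = G @ u_true + rng.multivariate_normal(np.zeros(n), Gamma)

# Posterior N(m_post, C_post):
#   C_post = (G^T Gamma^{-1} G + C0^{-1})^{-1},   m_post = C_post G^T Gamma^{-1} y
C_post = np.linalg.inv(G.T @ np.linalg.solve(Gamma, G) + np.linalg.inv(C0))
m_post = C_post @ (G.T @ np.linalg.solve(Gamma, y))

print("relative error of posterior mean:",
      np.linalg.norm(m_post - u_true) / np.linalg.norm(u_true))
print("posterior marginal std of u_1   :", np.sqrt(C_post[0, 0]))
```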

From both frequentist and numerical analytic points of view, it is natural to ask whether Bayesian methods are accurate and stable: if exposed to enough high-quality data, will they converge to the ‘truth’, and in a way that is robust to perturbations of the problem setup (the Bayesian prior, likelihood, and observed data)? What kinds of priors and likelihoods make for good modelling choices, and also lead to well-posed inverse problems? This is a subtle topic with many positive and negative results. I am interested in both theoretical and numerical approaches (Owhadi, Scovel & Sullivan, 2015a, 2015b; Sullivan, 2016).

Accelerated Exploration of Intractable Bayesian Posterior Distributions

In principle, methods such as Markov chain Monte Carlo (MCMC) can be used to explore and sample an arbitrary probability distribution \(\mu\) on a space \(U\), such as a Bayesian posterior distribution. However, these methods often converge poorly in practice, if at all, when \(U\) has high dimension, or when \(\mu\) is concentrated on a low-dimensional subset of \(U\). In such situations, it is desirable to approximate \(\mu\) by any reasonable means in order to obtain an informative approximate solution in finite time, and I am interested in methods that accelerate this exploration.
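For reference, here is a sketch of the baseline random-walk Metropolis algorithm whose behaviour in high dimension motivates this line of work: to keep a reasonable acceptance rate the proposal step size must shrink as the dimension grows, and mixing slows down accordingly. The Gaussian target and the tuning parameters below are illustrative only, not one of the accelerated methods alluded to above.

```python
# A minimal random-walk Metropolis sampler for a target density on R^d.
# The standard-Gaussian target, dimension, step size, and chain length are
# illustrative choices; the step size must shrink roughly like d^{-1/2} to
# keep the acceptance rate away from zero as d grows.
import numpy as np

def rw_metropolis(log_target, u0, step, n_samples, seed=0):
    rng = np.random.default_rng(seed)
    u = np.array(u0, dtype=float)
    lp = log_target(u)
    samples, accepted = [], 0
    for _ in range(n_samples):
        v = u + step * rng.normal(size=u.shape)       # random-walk proposal
        lp_v = log_target(v)
        if np.log(rng.uniform()) < lp_v - lp:         # Metropolis accept/reject
            u, lp = v, lp_v
            accepted += 1
        samples.append(u.copy())
    return np.array(samples), accepted / n_samples

# Example target: standard Gaussian in d = 100 dimensions.
d = 100
log_target = lambda u: -0.5 * np.sum(u**2)
samples, acc_rate = rw_metropolis(log_target, np.zeros(d), step=0.25, n_samples=5000)
print("acceptance rate:", acc_rate)
```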

Concentration of Measure

A powerful heuristic, first observed by Lévy (1951) and Milman (1971), is that a ‘nice’ function of many independent or weakly correlated random variables takes values near its mean or median (or, more generally, near a low-dimensional subset of its range) with overwhelming probability. This has implications for the uncertainty quantification of complex systems: in situations where concentration-of-measure results can be proved, a large stochastic system becomes effectively deterministic. In the past, I have worked on applications of concentration-of-measure methods to the certification of terminal ballistics (Sullivan et al., 2011; Adams et al., 2012; Kidane et al., 2012).
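As a minimal numerical illustration of the phenomenon (not of the certification methodology itself): the empirical mean of \(n\) bounded independent variables deviates from its expectation by more than \(t\) with probability at most the Hoeffding/McDiarmid bound \(2 \exp(-2 n t^{2})\), and in practice the observed deviation probability is far smaller still. The choice of function and the values of \(n\) and \(t\) below are arbitrary.

```python
# Illustration of concentration of measure: the mean of n bounded independent
# variables deviates from its expectation by more than t with probability at
# most the McDiarmid/Hoeffding bound 2*exp(-2*n*t**2).  The function and the
# values of n, t, and the number of trials are illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
n, t, trials = 200, 0.1, 20_000

X = rng.uniform(0.0, 1.0, size=(trials, n))   # n independent bounded inputs per trial
f = X.mean(axis=1)                            # a 'nice' function: 1/n-Lipschitz in each coordinate
empirical = np.mean(np.abs(f - 0.5) > t)      # observed deviation probability
bound = 2 * np.exp(-2 * n * t**2)             # McDiarmid/Hoeffding bound

print("empirical P[|f - E f| > t]:", empirical)
print("concentration bound       :", bound)
```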