Tim Sullivan | #klebanov

A periodic table of modes and MAP estimators

Ilja Klebanov and I have just uploaded a preprint of our paper “A ‘periodic table’ of modes and maximum a posteriori estimators” to the arXiv.

This paper forms part of the growing body of work on the ‘small balls’ theory of modes for probability measures on metric spaces, which is needed e.g. for the treatment of MAP estimation for Bayesian inverse problems with functional unknowns. Several notions of mode already exist in the literature: the strong mode, the weak mode, and the generalised strong mode. We take an axiomatic approach to the problem and identify a system of 17 essentially distinct notions of mode, proving implications between them and providing explicit counterexamples to distinguish them. From an axiomatic point of view, all 17 of these seem to be ‘equally good’, suggesting that further research is needed in this area.
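
For orientation, two of the established small-ball definitions can be sketched as follows. The notation is an assumption made here for illustration, not taken from the paper: \(B_{r}(u)\) denotes the open ball of radius \(r\) centred at \(u\), and \(M_{r}\) the largest mass of any ball of radius \(r\). Precise formulations vary between papers. A point \(u^{\ast} \in X\) is a strong mode of \(\mu\) if

\[ \lim_{r \to 0} \frac{\mu(B_{r}(u^{\ast}))}{M_{r}} = 1, \qquad M_{r} := \sup_{v \in X} \mu(B_{r}(v)), \]

and a weak mode (in its global form) if

\[ \limsup_{r \to 0} \frac{\mu(B_{r}(v))}{\mu(B_{r}(u^{\ast}))} \leq 1 \quad \text{for all } v \in X. \]

The strong mode compares \(u^{\ast}\) against the supremum over all centres at each radius, whereas the weak mode compares it against each fixed competitor \(v\) separately. Differences of this kind, in the order of limits and quantifiers, are what a systematic taxonomy has to keep track of.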

Abstract. The last decade has seen many attempts to generalise the definition of modes, or MAP estimators, of a probability distribution \(\mu\) on a space \(X\) to the case that \(\mu\) has no continuous Lebesgue density, and in particular to infinite-dimensional Banach and Hilbert spaces \(X\). This paper examines the properties of and connections among these definitions. We construct a systematic taxonomy – or ‘periodic table’ – of modes that includes the established notions as well as large hitherto-unexplored classes. We establish implications between these definitions and provide counterexamples to distinguish them. We also distinguish those definitions that are merely ‘grammatically correct’ from those that are ‘meaningful’ in the sense of satisfying certain ‘common-sense’ axioms for a mode, among them the correct handling of discrete measures and those with continuous Lebesgue densities. However, despite there being 17 such ‘meaningful’ definitions of mode, we show that none of them satisfy the ‘merging property’, under which the modes of \(\mu|_{A}\), \(\mu|_{B}\), and \(\mu|_{A \cup B}\) enjoy a straightforward relationship for well-separated positive-mass events \( A, B \subseteq X\).

Published on Monday 17 July 2023 at 09:00 UTC #preprint #modes #map-estimators #klebanov

Γ-convergence of Onsager–Machlup functionals in Inverse Problems

The articles “Γ-convergence of Onsager–Machlup functionals” (“I. With applications to maximum a posteriori estimation in Bayesian inverse problems” and “II. Infinite product measures on Banach spaces”) by Birzhan Ayanbayev, Ilja Klebanov, Han Cheng Lie, and myself have just appeared in their final form in the journal Inverse Problems.

The purpose of this work is to address a long-standing issue in the Bayesian approach to inverse problems, namely the joint stability of a Bayesian posterior and its modes (MAP estimators) when the prior, likelihood, and data are perturbed or approximated. We show that the correct way to approach this problem is to interpret MAP estimators as global weak modes in the sense of Helin and Burger (2015), which can be identified as the global minimisers of the Onsager–Machlup functional of the posterior distribution, and hence to provide a convergence theory for MAP estimators in terms of Γ-convergence of these Onsager–Machlup functionals. It turns out that posterior Γ-convergence can be assessed in a relatively straightforward manner in terms of prior Γ-convergence and continuous convergence of potentials (negative log-likelihoods). Over the two parts of the paper, we carry out this programme both in generality and for specific priors that are commonly used in Bayesian inverse problems, namely Gaussian and Besov priors (Lassas et al., 2009; Dashti et al., 2012).
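
As a rough sketch of the objects involved (the symbols \(E\), \(I\), \(B_{r}\), \(\Phi\), and \(\|\cdot\|_{\mathrm{CM}}\) are notation assumed here for illustration), an Onsager–Machlup functional for a measure \(\mu\) on \(X\) is a function \(I \colon E \to \mathbb{R}\) on a subset \(E \subseteq X\) such that

\[ \lim_{r \to 0} \frac{\mu(B_{r}(u_{1}))}{\mu(B_{r}(u_{2}))} = \exp\bigl( I(u_{2}) - I(u_{1}) \bigr) \quad \text{for all } u_{1}, u_{2} \in E, \]

where \(B_{r}(u)\) is the open ball of radius \(r\) centred at \(u\). In the prototypical case of a centred Gaussian prior with Cameron–Martin norm \(\|\cdot\|_{\mathrm{CM}}\) and a sufficiently regular potential \(\Phi\), the posterior has the Tikhonov-type Onsager–Machlup functional

\[ I(u) = \Phi(u) + \tfrac{1}{2} \|u\|_{\mathrm{CM}}^{2}. \]

The relevance of Γ-convergence is the standard fact that, if \(I_{n}\) Γ-converges to \(I\) and the sequence \((I_{n})_{n \in \mathbb{N}}\) is equicoercive, then minimisers of \(I_{n}\) have cluster points, and every such cluster point is a minimiser of \(I\).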

B. Ayanbayev, I. Klebanov, H. C. Lie, and T. J. Sullivan. “Γ-convergence of Onsager–Machlup functionals: I. With applications to maximum a posteriori estimation in Bayesian inverse problems.” Inverse Problems 38(2):025005, 32pp., 2022. doi:10.1088/1361-6420/ac3f81

Abstract (Part I). The Bayesian solution to a statistical inverse problem can be summarised by a mode of the posterior distribution, i.e. a MAP estimator. The MAP estimator essentially coincides with the (regularised) variational solution to the inverse problem, seen as minimisation of the Onsager–Machlup functional of the posterior measure. An open problem in the stability analysis of inverse problems is to establish a relationship between the convergence properties of solutions obtained by the variational approach and by the Bayesian approach. To address this problem, we propose a general convergence theory for modes that is based on the Γ-convergence of Onsager–Machlup functionals, and apply this theory to Bayesian inverse problems with Gaussian and edge-preserving Besov priors. Part II of this paper considers more general prior distributions.
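
To illustrate the coincidence of the MAP estimator with the regularised variational solution, here is a minimal finite-dimensional sketch. It assumes a linear forward map, Gaussian noise, and a centred Gaussian prior, which is a toy setting and not the infinite-dimensional one treated in the paper. The names `om_functional`, `map_estimate`, `G`, `noise_cov`, and `prior_cov` are hypothetical and chosen only for this example.

```python
import numpy as np


def om_functional(u, G, y, noise_cov, prior_cov):
    """Misfit plus prior penalty: the Tikhonov-type functional of the toy posterior."""
    misfit = y - G @ u
    data_term = 0.5 * misfit @ np.linalg.solve(noise_cov, misfit)
    prior_term = 0.5 * u @ np.linalg.solve(prior_cov, u)
    return data_term + prior_term


def map_estimate(G, y, noise_cov, prior_cov):
    """Closed-form minimiser of om_functional, obtained from the normal equations."""
    hessian = G.T @ np.linalg.solve(noise_cov, G) + np.linalg.inv(prior_cov)
    rhs = G.T @ np.linalg.solve(noise_cov, y)
    return np.linalg.solve(hessian, rhs)


# Toy problem: 4 noisy observations of a 3-dimensional unknown.
rng = np.random.default_rng(1)
G = rng.normal(size=(4, 3))
y = rng.normal(size=4)
noise_cov = 0.1 * np.eye(4)
prior_cov = np.diag([1.0, 0.5, 0.25])

u_map = map_estimate(G, y, noise_cov, prior_cov)
```

In this linear Gaussian setting the functional is a strictly convex quadratic, so `u_map` is its unique minimiser and also the posterior mean. The paper's contribution concerns what happens to such minimisers when the prior, likelihood, and data are perturbed, which this sketch does not attempt to show.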

B. Ayanbayev, I. Klebanov, H. C. Lie, and T. J. Sullivan. “Γ-convergence of Onsager–Machlup functionals: II. Infinite product measures on Banach spaces.” Inverse Problems 38(2):025006, 35pp., 2022. doi:10.1088/1361-6420/ac3f82

Abstract (Part II). We derive Onsager–Machlup functionals for countable product measures on weighted \(\ell^{p}\) subspaces of the sequence space \(\mathbb{R}^\mathbb{N}\). Each measure in the product is a shifted and scaled copy of a reference probability measure on \(\mathbb{R}\) that admits a sufficiently regular Lebesgue density. We study the equicoercivity and Γ-convergence of sequences of Onsager–Machlup functionals associated to convergent sequences of measures within this class. We use these results to establish analogous results for probability measures on separable Banach or Hilbert spaces, including Gaussian, Cauchy, and Besov measures with summability parameter \( 1 \leq p \leq 2 \). Together with Part I of this paper, this provides a basis for analysis of the convergence of maximum a posteriori estimators in Bayesian inverse problems and most likely paths in transition path theory.

Published on Wednesday 5 January 2022 at 12:00 UTC #publication #inverse-problems #modes #map-estimators #ayanbayev #klebanov #lie

Linear conditional expectation in Hilbert space in Bernoulli

The article “The linear conditional expectation in Hilbert space” by Ilja Klebanov, Björn Sprungk, and myself has just appeared in its final form in the journal Bernoulli. In this paper, we study the best approximation \(\mathbb{E}^{\mathrm{A}}[U|V]\) of the conditional expectation \(\mathbb{E}[U|V]\) of a \(\mathcal{G}\)-valued random variable \(U\) conditional upon an \(\mathcal{H}\)-valued random variable \(V\), where “best” means \(L^{2}\)-optimality within the class \(\mathrm{A}(\mathcal{H}; \mathcal{G})\) of affine functions of the conditioning variable \(V\). This approximation is a powerful one and lies at the heart of the Bayes linear approach to statistical inference, but its analytical properties, especially for \(U\) and \(V\) taking values in infinite-dimensional spaces \(\mathcal{G}\) and \(\mathcal{H}\), are only partially understood, a gap that this article aims to fill.

I. Klebanov, B. Sprungk, and T. J. Sullivan. “The linear conditional expectation in Hilbert space.” Bernoulli 27(4):2267–2299, 2021. doi:10.3150/20-BEJ1308

Abstract. The linear conditional expectation (LCE) provides a best linear (or rather, affine) estimate of the conditional expectation and hence plays an important rôle in approximate Bayesian inference, especially the Bayes linear approach. This article establishes the analytical properties of the LCE in an infinite-dimensional Hilbert space context. In addition, working in the space of affine Hilbert–Schmidt operators, we establish a regularisation procedure for this LCE. As an important application, we obtain a simple alternative derivation and intuitive justification of the conditional mean embedding formula, a concept widely used in machine learning to perform the conditioning of random variables by embedding them into reproducing kernel Hilbert spaces.
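
In finite dimensions the best affine predictor has the familiar form \(\mathbb{E}[U] + C_{UV} C_{V}^{\dagger} (V - \mathbb{E}[V])\), where \(C_{UV}\) is the cross-covariance of \(U\) and \(V\), \(C_{V}\) the covariance of \(V\), and \(\dagger\) the Moore–Penrose pseudo-inverse. The following is a minimal empirical sketch of that formula, assuming finite-dimensional data stored as NumPy arrays. The function name `linear_conditional_expectation` and the toy data are hypothetical, and the infinite-dimensional subtleties studied in the paper are not represented.

```python
import numpy as np


def linear_conditional_expectation(U, V):
    """Return (A, b) such that v -> A @ v + b is the empirical L2-best affine
    predictor of U from V.  U has shape (n, dG), V has shape (n, dH)."""
    mean_U = U.mean(axis=0)
    mean_V = V.mean(axis=0)
    Uc = U - mean_U
    Vc = V - mean_V
    n = U.shape[0]
    C_UV = Uc.T @ Vc / n  # empirical cross-covariance, shape (dG, dH)
    C_V = Vc.T @ Vc / n   # empirical covariance of V, shape (dH, dH)
    # The pseudo-inverse handles a singular covariance of V.
    A = C_UV @ np.linalg.pinv(C_V)
    b = mean_U - A @ mean_V
    return A, b


# Toy check: if U is exactly affine in V, the LCE recovers that affine map.
rng = np.random.default_rng(0)
V = rng.normal(size=(500, 3))
A_true = np.array([[1.0, -2.0, 0.5], [0.0, 3.0, 1.0]])
b_true = np.array([0.5, -1.0])
U = V @ A_true.T + b_true

A_hat, b_hat = linear_conditional_expectation(U, V)
```

When \(U\) depends on \(V\) nonlinearly, the same formula still returns the best affine fit in the \(L^{2}\) sense, but it no longer agrees with the full conditional expectation.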

Published on Wednesday 25 August 2021 at 08:00 UTC #publication #tru2 #bayesian #rkhs #mean-embedding #klebanov #sprungk

Γ-convergence of Onsager–Machlup functionals

Birzhan Ayanbayev, Ilja Klebanov, Han Cheng Lie, and I have just uploaded preprints of our work “Γ-convergence of Onsager–Machlup functionals” to the arXiv; this work consists of two parts, “Part I: With applications to maximum a posteriori estimation in Bayesian inverse problems” and “Part II: Infinite product measures on Banach spaces”.

The purpose of this work is to address a long-standing issue in the Bayesian approach to inverse problems, namely the joint stability of a Bayesian posterior and its modes (MAP estimators) when the prior, likelihood, and data are perturbed or approximated. We show that the correct way to approach this problem is to interpret MAP estimators as global weak modes in the sense of Helin and Burger (2015), which can be identified as the global minimisers of the Onsager–Machlup functional of the posterior distribution, and hence to provide a convergence theory for MAP estimators in terms of Γ-convergence of these Onsager–Machlup functionals. It turns out that posterior Γ-convergence can be assessed in a relatively straightforward manner in terms of prior Γ-convergence and continuous convergence of potentials (negative log-likelihoods). Over the two parts of the paper, we carry out this programme both in generality and for specific priors that are commonly used in Bayesian inverse problems, namely Gaussian and Besov priors (Lassas et al., 2009; Dashti et al., 2012).

Abstract (Part I). The Bayesian solution to a statistical inverse problem can be summarised by a mode of the posterior distribution, i.e. a MAP estimator. The MAP estimator essentially coincides with the (regularised) variational solution to the inverse problem, seen as minimisation of the Onsager–Machlup functional of the posterior measure. An open problem in the stability analysis of inverse problems is to establish a relationship between the convergence properties of solutions obtained by the variational approach and by the Bayesian approach. To address this problem, we propose a general convergence theory for modes that is based on the Γ-convergence of Onsager–Machlup functionals, and apply this theory to Bayesian inverse problems with Gaussian and edge-preserving Besov priors. Part II of this paper considers more general prior distributions.

Abstract (Part II). We derive Onsager–Machlup functionals for countable product measures on weighted \(\ell^{p}\) subspaces of the sequence space \(\mathbb{R}^\mathbb{N}\). Each measure in the product is a shifted and scaled copy of a reference probability measure on \(\mathbb{R}\) that admits a sufficiently regular Lebesgue density. We study the equicoercivity and Γ-convergence of sequences of Onsager–Machlup functionals associated to convergent sequences of measures within this class. We use these results to establish analogous results for probability measures on separable Banach or Hilbert spaces, including Gaussian, Cauchy, and Besov measures with summability parameter \( 1 \leq p \leq 2 \). Together with Part I of this paper, this provides a basis for analysis of the convergence of maximum a posteriori estimators in Bayesian inverse problems and most likely paths in transition path theory.

Published on Wednesday 11 August 2021 at 12:00 UTC #preprint #inverse-problems #modes #map-estimators #ayanbayev #klebanov #lie

Linear conditional expectation in Hilbert space

Ilja Klebanov, Björn Sprungk, and I have just uploaded a preprint of our recent work “The linear conditional expectation in Hilbert space” to the arXiv. In this paper, we study the best approximation \(\mathbb{E}^{\mathrm{A}}[U|V]\) of the conditional expectation \(\mathbb{E}[U|V]\) of a \(\mathcal{G}\)-valued random variable \(U\) conditional upon an \(\mathcal{H}\)-valued random variable \(V\), where “best” means \(L^{2}\)-optimality within the class \(\mathrm{A}(\mathcal{H}; \mathcal{G})\) of affine functions of the conditioning variable \(V\). This approximation is a powerful one and lies at the heart of the Bayes linear approach to statistical inference, but its analytical properties, especially for \(U\) and \(V\) taking values in infinite-dimensional spaces \(\mathcal{G}\) and \(\mathcal{H}\), are only partially understood, a gap that this article aims to fill.

Abstract. The linear conditional expectation (LCE) provides a best linear (or rather, affine) estimate of the conditional expectation and hence plays an important rôle in approximate Bayesian inference, especially the Bayes linear approach. This article establishes the analytical properties of the LCE in an infinite-dimensional Hilbert space context. In addition, working in the space of affine Hilbert–Schmidt operators, we establish a regularisation procedure for this LCE. As an important application, we obtain a simple alternative derivation and intuitive justification of the conditional mean embedding formula, a concept widely used in machine learning to perform the conditioning of random variables by embedding them into reproducing kernel Hilbert spaces.

Published on Friday 28 August 2020 at 09:00 UTC #preprint #tru2 #bayesian #rkhs #mean-embedding #klebanov #sprungk