### #schuster

### A rigorous theory of conditional mean embeddings in SIMODS

The article “A rigorous theory of conditional mean embeddings” by Ilja Klebanov, Ingmar Schuster, and myself has just appeared online in the *SIAM Journal on Mathematics of Data Science*.
In this work we take a close mathematical look at the method of conditional mean embedding.
In this approach to non-parametric inference, a random variable \(Y \sim \mathbb{P}_{Y}\) in a set \(\mathcal{Y}\) is represented by its *kernel mean embedding*, the reproducing kernel Hilbert space element

\( \displaystyle \mu_{Y} = \int_{\mathcal{Y}} \psi(y) \, \mathrm{d} \mathbb{P}_{Y} (y) \in \mathcal{G}, \)

and conditioning with respect to an observation \(x\) of a related random variable \(X \sim \mathbb{P}_{X}\) in a set \(\mathcal{X}\) with RKHS \(\mathcal{H}\) is performed using the Woodbury formula\( \displaystyle \mu_{Y|X = x} = \mu_Y + (C_{XX}^{\dagger} C_{XY})^\ast \, (\varphi(x) - \mu_X) . \)

Here \(\psi \colon \mathcal{Y} \to \mathcal{G}\) and \(\varphi \colon \mathcal{X} \to \mathcal{H}\) are the canonical feature maps and the \(C\)'s denote the appropriate centred (cross-)covariance operators of the embedded random variables \(\psi(Y)\) in \(\mathcal{G}\) and \(\varphi(X)\) in \(\mathcal{H}\).

Our article aims to provide rigorous mathematical foundations for this attractive but apparently naïve approach to conditional probability, and hence to Bayesian inference.

I. Klebanov, I. Schuster, and T. J. Sullivan. “A rigorous theory of conditional mean embeddings.” *SIAM Journal on Mathematics of Data Science* 2(3):583–606, 2020.

**Abstract.**
Conditional mean embeddings (CMEs) have proven themselves to be a powerful tool in many machine learning applications. They allow the efficient conditioning of probability distributions within the corresponding reproducing kernel Hilbert spaces by providing a linear-algebraic relation for the kernel mean embeddings of the respective joint and conditional probability distributions. Both centered and uncentered covariance operators have been used to define CMEs in the existing literature. In this paper, we develop a mathematically rigorous theory for both variants, discuss the merits and problems of each, and significantly weaken the conditions for applicability of CMEs. In the course of this, we demonstrate a beautiful connection to Gaussian conditioning in Hilbert spaces.

Published on Wednesday 15 July 2020 at 08:00 UTC #publication #simods #mathplus #tru2 #rkhs #mean-embedding #klebanov #schuster

### A rigorous theory of conditional mean embeddings

Ilja Klebanov, Ingmar Schuster, and I have just uploaded a preprint of our recent work “A rigorous theory of conditional mean embeddings” to the arXiv.
In this work we take a close mathematical look at the method of conditional mean embedding.
In this approach to non-parametric inference, a random variable \(Y \sim \mathbb{P}_{Y}\) in a set \(\mathcal{Y}\) is represented by its *kernel mean embedding*, the reproducing kernel Hilbert space element

\( \displaystyle \mu_{Y} = \int_{\mathcal{Y}} \psi(y) \, \mathrm{d} \mathbb{P}_{Y} (y) \in \mathcal{G}, \)

and conditioning with respect to an observation \(x\) of a related random variable \(X \sim \mathbb{P}_{X}\) in a set \(\mathcal{X}\) with RKHS \(\mathcal{H}\) is performed using the Woodbury formula\( \displaystyle \mu_{Y|X = x} = \mu_Y + (C_{XX}^{\dagger} C_{XY})^\ast \, (\varphi(x) - \mu_X) . \)

Here \(\psi \colon \mathcal{Y} \to \mathcal{G}\) and \(\varphi \colon \mathcal{X} \to \mathcal{H}\) are the canonical feature maps and the \(C\)'s denote the appropriate centred (cross-)covariance operators of the embedded random variables \(\psi(Y)\) in \(\mathcal{G}\) and \(\varphi(X)\) in \(\mathcal{H}\).

Our article aims to provide rigorous mathematical foundations for this attractive but apparently naïve approach to conditional probability, and hence to Bayesian inference.

**Abstract.**
Conditional mean embeddings (CME) have proven themselves to be a powerful tool in many machine learning applications. They allow the efficient conditioning of probability distributions within the corresponding reproducing kernel Hilbert spaces (RKHSs) by providing a linear-algebraic relation for the kernel mean embeddings of the respective probability distributions. Both centered and uncentered covariance operators have been used to define CMEs in the existing literature. In this paper, we develop a mathematically rigorous theory for both variants, discuss the merits and problems of either, and significantly weaken the conditions for applicability of CMEs. In the course of this, we demonstrate a beautiful connection to Gaussian conditioning in Hilbert spaces.

Published on Tuesday 3 December 2019 at 07:00 UTC #preprint #mathplus #tru2 #rkhs #mean-embedding #klebanov #schuster

### Active subspace Metropolis-Hastings

Ingmar Schuster, Paul Constantine, and I have just uploaded a preprint of our latest article, “Exact active subspace Metropolis–Hastings, with applications to the Lorenz-96 system”, to the arXiv. This paper reports on our first investigations into the acceleration of Markov chain Monte Carlo methods using active subspaces as compared to other adaptivity techniques, and is supported by the DFG through SFB 1114 *Scaling Cascades in Complex Systems*.

**Abstract.** We consider the application of active subspaces to inform a Metropolis–Hastings algorithm, thereby aggressively reducing the computational dimension of the sampling problem. We show that the original formulation, as proposed by Constantine, Kent, and Bui-Thanh (*SIAM J. Sci. Comput.*, 38(5):A2779–A2805, 2016), possesses asymptotic bias. Using pseudo-marginal arguments, we develop an asymptotically unbiased variant. Our algorithm is applied to a synthetic multimodal target distribution as well as a Bayesian formulation of a parameter inference problem for a Lorenz-96 system.

Published on Friday 8 December 2017 at 08:00 UTC #preprint #mcmc #sfb1114 #schuster #constantine

### Ingmar Schuster Joins the UQ Group

It is a pleasure to announce that Ingmar Schuster will join the UQ research group as a postdoctoral researcher with effect from 15 June 2017. He will be working on project A06 “Enabling Bayesian uncertainty quantification for multiscale systems and network models via mutual likelihood-informed dimension reduction” as part of SFB 1114 Scaling Cascades in Complex Systems.

Published on Thursday 15 June 2017 at 08:00 UTC #group #sfb1114 #schuster

### UQ Talks: Ingmar Schuster

Ingmar Schuster (Université Paris-Dauphine) “Gradient Importance Sampling”

**Time and Place.** Friday 11 March 2016, 11:15–12:45, Room 126 of Arnimallee 6 (Pi-Gebäude), 14195 Berlin

**Abstract.** Adaptive Monte Carlo schemes developed over the last years usually seek to ensure ergodicity of the sampling process in line with MCMC tradition. This poses constraints on what is possible in terms of adaptation. In the general case ergodicity can only be guaranteed if adaptation is diminished at a certain rate. Importance Sampling approaches offer a way to circumvent this limitation and design sampling algorithms that keep adapting. Here I present an adaptive variant of the discretized Langevin algorithm for estimating integrals with respect to some target density that uses an Importance Sampling instead of the usual Metropolis–Hastings correction.

Published on Monday 7 March 2016 at 12:00 UTC #event #uq-talk #schuster