## Virtual Kick-off Meeting (ZOOM)

### 14-15 December, 16pm CEST

**Organizers**: Ezio Bartocci, Efstathia Bura, Laura Kovacs

**Speakers** (in alphabetical order):

- Nimar Arora, Independent AI Researcher, USA
- Gilles Barthe, MPI, Germany
- Mathias Beiglboeck, UniWien, Austria
- Jürgen Cito, TU Wien, Austria
- Joost Pieter Katoen, RWTH Aachen University, Germany
- Manuel Kauers, JKU, Austria
- Marcel Moosbrugger, TU Wien, Austria
- Sriram Sankaranarayanan, University of Colorado Boulder, USA
- Miroslav Stankovic, TU Wien, Austria

Tentative Agenda:

### Talks:

**Automated Termination Analysis of Polynomial Probabilistic Programs**

Marcel Moosbrugger (Speaker), Ezio Bartocci, Joost-Pieter Katoen, Laura Kovács

The termination behavior of probabilistic programs depends on the
outcomes of random assignments. Almost sure termination (AST) is
concerned with the question whether a program terminates with
probability one on all possible inputs. Positive almost sure termination
(PAST) focuses on termination in a finite expected number of steps. This
paper presents a fully automated approach to the termination analysis of
probabilistic while-programs whose guards and expressions are polynomial
expressions. As proving (positive) AST is undecidable in general,
existing proof rules typically provide sufficient conditions. These
conditions mostly involve constraints on supermartingales. We consider
four proof rules from the literature and extend these with
generalizations of existing proof rules for (P)AST. We automate the
resulting set of proof rules by effectively computing asymptotic bounds
on polynomials over the program variables. These bounds are used to
decide the sufficient conditions - including the constraints on
supermartingales - of a proof rule. Our software tool Amber can thus
check AST, PAST, as well as their negations for a large class of
polynomial probabilistic programs, while carrying out the termination
reasoning fully with polynomial witnesses. Experimental results show the
merits of our generalized proof rules and demonstrate that Amber can
handle probabilistic programs that are out of reach for other
state-of-the-art tools.

**Analysis of Bayesian Networks via Prob-Solvable Loops**

Ezio Bartocci, Laura Kovács, Miroslav Stankovic (Speaker)

Prob-solvable loops are probabilistic programs with polynomial assignments over random
variables and parameterized distributions, for which the full automation of moment-based
invariant generation is decidable. In this talk addresses how Prob-solvable loops can be used
for encoding Bayesian networks (BNs). We show that various BNs, such as discrete, Gaussian,
conditional linear Gaussian and dynamic BNs, can naturally be encoded as Prob-solvable loops.
With these encodings, we automatically solve several BN related problems, including exact
inference, sensitivity analysis, prediction and computing the expected number of rejecting
samples in sampling-based procedures.

**An Introduction to Symbolic Summation**

Manuel Kauers

Symbolic summation concerns algorithms that find closed form
evaluations of expressions involving a summation sign. Clearly, there can be no
algorithm which is able to handle any given sum expression, so in order
to get a meaningful problem, some restrictions must be imposed on the
input. There is also no universally accepted
definition of what a ``closed form'' is, so the output specification
must be clarified as well. In the talk, we
will discuss some classical algorithms for symbolic summation and
discuss the mathematical and computational
ideas behind them.

**The role of Probabilistic Programming Languages in Artificial Intelligence**

Nimar Arora

AI has gone back and forth from explicit knowledge-based
approaches to purely data-driven approaches. From the glory days of Prolog in 1980s to
Deep Learning in 2010s. The pendulum keeps shifting as inadequacies in
each approach is highlighted. Most recently, a lot of attention is being
paid to issues around fairness as well as robustness to adversarial
attacks. The hypothesis of this talk is that these recent issues are
primarily because popular discriminative approaches such as Deep
Learning or decision trees do not explicitly encode any human knowledge
and are thus not constrained by the bounds of rationality. In this
regard Probabilistic Programming Languages (PPLs) provide an alternate
approach where a human can specify a generative model of the world. Such
models were conjectured by Helmholtz (1925) as the basis of rational
thinking in humans. We motivate this claim with real examples taken from
a generative model that was used to very effectively solve a hundred
year old problem in seismology. Our central thesis is that the role of
PPLs is to bring rationality back to AI. Finally, we discuss issues of
scalability in PPLs and in particular we describe recent trends to
combine Deep Learning with PPLs.

**Software Engineering Research Opportunities for Probabilistic Programming**

Jürgen Cito

Modern probabilistic programming is often facilitated through embedding
domain-specific constructs (for sampling, conditioning, and inference) in general-purpose languages.
While this provides some level of familiarity of using existing control structures and
building blocks for abstractions, understanding and debugging of probabilistic programs
brings its own set of unique challenges that are currently understudied in SE/PL research.
In this talk, I want to discuss early ideas and opportunities on how probabilistic
programming can benefit from methods in software engineering research that bring together
empirical studies, theories of program comprehension, and interactive visualization.

**The Adaptive Topology**

Mathias Beiglboeck

A recurring question in different scientific fields is what it means
for two stochastic processes to be close. The usual weak topology / Wasserstein distance
is not well-suited: it fails to provide stability for sequential decision problems, standard
operations such as the Doob-decomposition are not continuous, a process that is arbitrarily
close to a martingale can be heavily biased, etc.
In the last decades, researchers coming from rather unrelated fields (economics, logics,
sequential optimization, mass transport, machine learning, among others) have independently
suggested distances/topologies to deal with this challenge.
Our main new result is that all of these seemingly different definitions determine the
same topology on the space of stochastic processes.

**Towards the Automated Synthesis of Probabilistic Programs**

Joost Pieter Katoen (Speaker), Milan Ceska, Nils Jansen and Sebastian Junges

Probabilistic programs are typically rather small, but intricate to get
correct. Whereas the main control flow is often in place, completing the
bits and pieces to a full-fledged program is non-trivial. This talk
addresses program sketching: given a partial probabilistic program,
i.e., a program with "holes", fill these "holes" such that the resulting
program satisfies the specification.
I will outline two base approaches: counterexample-guided abstraction
refinement (CEGAR) and counterexample-guided inductive synthesis
(CEGIS). The CEGAR approach iteratively partitions the design space
starting from an abstraction of this space and refines this by a
light-weight analysis of verification results. The CEGIS technique
exploits critical subsystems as counterexamples to prune all programs
behaving incorrectly on that input.
We show how these techniques can be used to sketch probabilistic
programs, to synthesise randomised distributed algorithms and
controllers for POMDPs, and to select ``good'' products in software
product lines.

**Certified Convergence and Stability Bounds for Machine Learning Algorithms**

Gilles Barthe (Speaker), Toby Murray, Jiajia Song, Pierre-Yves Strub

Convergence and stability are quantitative properties which guarantee that supervised
machine learning algorithms output well-behaved classifiers that do not under-fit or
over-fit. We develop criteria for convergence and stability bounds; our criteria provide
easy-to-prove conditions for both the convex and non-convex settings. We instantiate our
criteria to prominent algorithms such as Stochastic Gradient Descent, FedAvg, and
Asynchronous distributed Stochastic Gradient Descent. The criteria have been formally
verified in the EasyCrypt proof assistant and interfaced with its program verification
mechanisms. Moreover we have used EasyCrypt for proving selected examples.

**Polynomial Forms for Representing and Reasoning about Uncertainties**

Sriram Sankaranarayanan (Speaker), Chou Yi, Eric Goubault and Sylvie Putot.

We will describe polynomial forms that consist of polynomials plus intervals
involving noise symbols with known distributions in order to represent complex probability
distributions of state variables of dynamical systems over time.
Using a calculus of polynomial forms, we describe how standard operations can be implemented
with rigorous tracking of uncertainties. Furthermore, we show how various concentration
inequalities can be applied over these forms to reason about probabilities of events and
moments of key functions. Finally, we describe conditioning on polynomial
forms and show how they can be used to perform approximate Bayesian inference for dynamical
systems along with numerous applications.