All topics listed on this page are still relevant. However, three very recent ones are part of ongoing research projects and would allow the direct embedding into a team. Please contact me for details!

The famous game 2048 exhibits probabilistic behavior. The task is to use machine learning and verification methods to compute an optimal AI for the game, or to prevent humans from making disastrous moves. See here for instance. *Topic can be rather applied or very theoretical.*

POMDPs are extremely important in many robotics and AI applications. As in the standard AI textbook by Stuart Russell and Peter Norvig: Partially observable MDPs ... are usually viewed as much more difficult than ordinary MDPs. We cannot avoid POMDPs, however, because the real world is one.

The task is to work on methods to compute strategies with finite or no memory, which is (contrary to infinite memory) still feasible but not investigated very much. Approaches can range from dedicated branch and bound methods (algorithmic), mixed-integer linear programming (optimization), or SMT solving (satisfiability). All approaches that lead to convincing results are expected to be able to lead to research publications. *Topic is rather theoretical with some practical aspects.*

We seek methods that help a machine learning algorithm to solve PAC-MAN. See for instance the videos uploaded here for a new result we recently obtained. The paper is available here. Approaches range from model checking, machine learning to SMT solving. *Topic can be rather applied or very theoretical.*

If you are interested in any of the topics listed below or in the general area, please contact me (Nils Jansen). My webpage can be found here.

With rapidly growing application of artificial intelligence and machine learning in social life, considering for instance autonomous systems such as self-driving cars, the need for **verified guarantees** against potentially fatal accidents is self-evident. An important question is then, how confidence in system behaviors obtained from machine learning can be transferred to **formal verification**.

Vice versa, industrial usage of formal verification so far suffers from the fact that scalability is still an issue for large real-world applications. Leveraging the capabilities of machine learning to assess large data sets will help to enable verification for more realistic systems.

A very nice article discussing challenges and open problems can be found here.
If one of the open problems is of interest, let me know.
On top of that, please consider the following concrete thesis proposals.
We focus on two fundamental formalisms/methods, namely **partially observable Markov decision processes** (POMDPs) and **reinforcement learning** (RL).
If interest in **deep neural networks** (DNNs) is there, consider reading this article. I'm happy to discuss potential projects.

In machine learning algorithms, a system model is learned based on observation of the real world.
Upon further observation, this model may be subject to change.
The problem of applying formal verification to such a model is that it is not a well-defined and fixed model of the system at hand.
This project proposes to **robustify** a current learned model against further changes based on future observations.
If it can be verified, that given system specifications are satisfied for this robust model, all future observations that are **consistent** with the intervals will not change this fact.

The project will involve the development of underlying formalisms and a small prototype implementation using Python libraries for reinforcement learning and model checking using the probabilistic model checker Storm.

Take a problem that is sufficiently modeled by a POMDP, for instance an agent navigating in a partially unknown environment.
Intuitively, the agent does not really know, in which state the system is, in fact, there is a probability distribution over possible states, called the **belief state**.
This yields an infinite number of possibilities.

The task is to make use of human's capabilities in inferring from partial observation to resolve or reduce the belief space of the POMDP.
We propose to immerse a human either **actively** or as an **observer** into a verification scenario.
Possibilities range from a sophisticated virtual reality immersion to a plain 2-D representation of the scenario. Check this article on gathering human feedback from the OpenAI blog.

Reinforcement learning largely suffers from the problem, that it is basically uninformed exploration of the state space of a system. We want to study the exploitation of **counterexamples** to identify corner cases and critical parts of state spaces to guide learning. For applications in Deep Neural Networks, check this article.

Verification for POMDPs is not feasible in general.
We propose to use so-called **permissive schedulers** computed on the observable part of a POMDP to allow for as much freedom in decision making as possible.

In a POMDP exploration scenario such as typical motion planning for robots, this will reduce the number of situations, where the exact system state needs to be computed: Consider a situation where the belief state is a distribution over a certain number of states. In other words, there are many possibilities of states, the system can be in. Usually, in order to make a decision, this needs to be based on all possibilities. If now each state has a precomputed permissive choice that agrees with all states, no further computation is necessary.

This very broad and practical project aims at finding and modeling typical robotics case studies that can be formally verified. Here, we are not fixed on probabilistic systems but also care for hybrid systems or even plain transitions systems. Many case studies may for instance be found in this book about probabilistic robotics by Sebastian Thrun.

Many applications are structured in the sense that the underlying systems exploit a certain **structured** in the sense that there exist multiple but finitely many pre-specified options for the system configuration.

Imagine a system incorporating surveillance of the environment using a number of sensors. Each of these sensors has a fixed probability for false alarms or misclassification, depending on the quality of the sensor. Increased quality may come at an increased acquisition or maintenance cost. We propose to model such a given structured scenario as a parametric Markov decision process, where the several fixed options influence both the cost and the system dynamics.

The goal is to synthesize a system configuration that adheres to given cost and safety specifications. The thesis topic involves work on an existing and practical case study and exploitation of current theoretical achievements.

Traditional schedulers (also referred to as strategies or adversaries) resolve all nondeterminism in a system such as Markov decision process (MDP). When probabilities are not fixed but given by an interval range of probabilities, synthesis methods require **permissiveness** for a scheduler in the sense, that only critical behavior (that is, nondeterministic choices) is excluded.

The goal is to work on a case study focused on autonomous systems such as self-driving cars, where schedulers need to be synthesized that guarantee safe behavior while allowing for as much freedom as possible. Prior work can be found here.

Many verification problems for parametric probabilistic systems can be casted as a nonlinear program. Solving such programs is in general infeasible. However, for a certain subclass, so-called **convex programs**, efficient methods exist. In many applications such as networks, planning problems and control theory in general, it is even stated that only convex problems are well stated, consider this great book by Stephen Boyd.

Building on prior work, this project concerns the application of several concepts from **convex optimization** to well-known verification problems for parametric probabilistic systems that cannot be solved efficiently.