The Healthy AI Lab at Chalmers University of Technology conducts academic research into machine learning and artificial intelligence motivated by challenges in healthcare: causality, decision-making and clinical applications. The lab is led by Fredrik Johansson.

We are hiring a PhD student! Read more here

» Read more about our research and our group

  • Fast Treatment Personalization with Latent Bandits in Fixed-Confidence Pure Exploration

    Newton Mwai, Emil Carlsson, Fredrik Johansson
    TMLR (To appear)
    Personalizing treatments for patients often involves a period of trial-and-error search until an optimal choice is found. To minimize suffering and other costs, it is critical to make this process as short as possible. When treatments have primarily short-term effects, search can be performed with multi-armed bandits (MAB), but these typically require long exploration periods to guarantee optimality. In this work, we design MAB algorithms which provably identify optimal treatments quickly by leveraging prior knowledge of the types of decision processes (patients) we can encounter, in the form of a latent variable model. We present two algorithms, the Latent LP-based Track and Stop (LLPT) explorer and the Divergence Explorer for this setting: fixed-confidence pure-exploration latent bandits. We give a lower bound on the stopping time of any algorithm which is correct at a given certainty level, and prove that the expected stopping time of the LLPT Explorer matches the lower bound in the high-certainty limit. Finally, we present results from an experimental study based on realistic simulation data for Alzheimer's disease, demonstrating that our formulation and algorithms lead to a significantly reduced stopping time.

    May 3, 2023

  • Time series of satellite imagery improve deep learning estimates of neighborhood-level poverty in Africa

    Markus Pettersson, Fredrik Johansson, Adel Daoud
    IJCAI 2023 - AI for Social Good (To appear)
    To combat poor health and living conditions, policymakers in Africa require temporally and geographically granular data measuring economic well-being. Machine learning (ML) offers a promising alternative to expensive and time-consuming survey measurements by training models to predict economic conditions from freely available satellite imagery. However, previous efforts have failed to utilize the temporal information available in earth observation (EO) data, which may capture developments important to standards of living. In this work, we develop an EO-ML method for inferring neighborhood-level material-asset wealth using multi-temporal imagery and recurrent convolutional neural networks. Our model outperforms state-of-the-art models in several aspects of generalization, explaining 72% of the variance in wealth across held-out countries and 75% held-out time spans. Using our geographically and temporally aware models, we created spatio-temporal material-asset data maps covering the entire continent of Africa from 1990 to 2019, making our data product the largest dataset of its kind. We showcase these results by analyzing which neighborhoods are likely to escape poverty by the year 2030, which is the deadline for when the Sustainable Development Goals (SDG) are evaluated.

    Apr 25, 2023

  • Sharing pattern submodels for prediction with missing values

    Lena Stempfle, Ashkan Panahi, Fredrik Johansson
    AAAI 2023 (To appear) [Paper URL]
    Missing values are unavoidable in many applications of machine learning and present a challenge both during training and at test time. When variables are missing in recurring patterns, fitting separate pattern submodels have been proposed as a solution. However, independent models do not make efficient use of all available data. Conversely, fitting a shared model to the full data set typically relies on imputation which may be suboptimal when missingness depends on unobserved factors. We propose an alternative approach, called sharing pattern submodels, which make predictions that are a) robust to missing values at test time, b) maintains or improves the predictive power of pattern submodels, and c) has a short description enabling improved interpretability. We identify cases where sharing is provably optimal, even when missingness itself is predictive and when the prediction target depends on unobserved variables. Classification and regression experiments on synthetic data and two healthcare data sets demonstrate that our models achieve a favorable trade-off between pattern specialization and information sharing.

    Oct 20, 2022

  • NeurIPS 2022: Efficient learning of nonlinear prediction models with time-series privileged information

    Bastian Jung, Fredrik Johansson
    NeurIPS 2022 [Paper URL]
    In domains where sample sizes are limited, efficient learning algorithms are critical. Learning using privileged information (LuPI) offers increased sample efficiency by allowing prediction models access to types of information at training time which is unavailable when the models are used. In recent work, it was shown that for prediction in linear-Gaussian dynamical systems, a LuPI learner with access to intermediate time series data is never worse and often better in expectation than any unbiased classical learner. We provide new insights into this analysis and generalize it to nonlinear prediction tasks in latent dynamical systems, extending theoretical guarantees to the case where the map connecting latent variables and observations is known up to a linear transform. In addition, we propose algorithms based on random features and representation learning for the case when this map is unknown. A suite of empirical results confirm theoretical findings and show the potential of using privileged time-series information in nonlinear prediction.

    Sep 15, 2022

  • Seminar: Efficient learning of nonlinear prediction models with time-series privileged information

    Fredrik Johansson
    Chalmers machine learning seminars
    In domains where sample sizes are limited, efficient learning is critical. Yet, there are machine learning problems where standard practice routinely leaves substantial information unused. One example is prediction of an outcome at the end of a time series based on variables collected at a baseline time point, for example, the 30-day risk of mortality for a patient upon admission to a hospital. In applications, it is common that intermediate samples, collected between baseline and end points, are discarded, as they are not available as input for prediction when the learned model is used. We say that this information is privileged, as it is available only at training time. In this talk, we show that making use of privileged information from intermediate time series can lead to much more efficient learning. We give conditions under which it is provably preferable to classical learning, and a suite of empirical results to support these findings.

    Sep 12, 2022

  • Practicality of generalization guarantees for unsupervised domain adaptation with neural networks

    Adam Breitholtz, Fredrik Johansson
    TMLR [Paper URL]
    Understanding generalization is crucial to confidently engineer and deploy machine learning models, especially when deployment implies a shift in the data domain. For such domain adaptation problems, we seek generalization bounds which are tractably computable and tight. If these desiderata can be reached, the bounds can serve as guarantees for adequate performance in deployment. However, in applications where deep neural networks are the models of choice, deriving results which fulfill these remains an unresolved challenge; most existing bounds are either vacuous or has non-estimable terms, even in favorable conditions. In this work, we evaluate existing bounds from the literature with potential to satisfy our desiderata on domain adaptation image classification tasks, where deep neural networks are preferred. We find that all bounds are vacuous and that sample generalization terms account for much of the observed looseness, especially when these terms interact with measures of domain shift. To overcome this and arrive at the tightest possible results, we combine each bound with recent data-dependent PAC-Bayes analysis, greatly improving the guarantees. We find that, when domain overlap can be assumed, a simple importance weighting extension of previous work provides the tightest estimable bound. Finally, we study which terms dominate the bounds and identify possible directions for further improvement.

    Sep 2, 2022

  • Case-Based Off-Policy Evaluation Using Prototype Learning

    Anton Matsson, Fredrik Johansson
    UAI 2022 [Paper URL]
    Importance sampling (IS) is often used to perform off-policy evaluation but it is prone to several issues---especially when the behavior policy is unknown and must be estimated from data. Significant differences between target and behavior policies can result in uncertain value estimates due to, for example, high variance. Standard practices such as inspecting IS weights may be insufficient to diagnose such problems and determine for which type of inputs the policies differ in suggested actions and resulting values. To address this, we propose estimating the behavior policy for IS using prototype learning. The learned prototypes provide a condensed summary of the input-action space, which allows for describing differences between policies and assessing the support for evaluating a certain target policy. In addition, we can describe a value estimate in terms of prototypes to understand which parts of the target policy have the most impact on the estimate. We find that this provides new insights in the examination of a learned policy for sepsis management. Moreover, we study the bias resulting from restricting models to use prototypes, how bias propagates to IS weights and estimated values and how this varies with history length.

    Jul 27, 2022

  • ADCB: An Alzheimer’s disease simulator for benchmarking observational estimators of causal effects

    Newton Mwai Kinyanjui, Fredrik D. Johansson
    CHIL 2022 [Paper URL]
    Simulators make unique benchmarks for causal effect estimation as they do not rely on unverifiable assumptions or the ability to intervene on real-world systems. This is especially important for estimators targeting healthcare applications as possibilities for experimentation are limited with good reason. We develop a simulator of clinical variables associated with Alzheimer’s disease, aimed to serve as a benchmark for causal effect estimation while modeling intricacies of healthcare data. We fit the system to the Alzheimer’s Disease Neuroimaging Initiative (ADNI) dataset and ground hand-crafted components in results from comparative treatment trials and observational treatment patterns. The simulator includes parameters which alter the nature and difficulty of the causal inference tasks, such as latent variables, effect heterogeneity, length of observed subject history, behavior policy and sample size. We use the simulator to compare standard estimators of average and conditional treatment effects.

    Apr 7, 2022

  • Learning using privileged time series information

    Rickard Karlsson, Martin Willbo, Zeshan Hussain, Rahul G. Krishnan, David Sontag, Fredrik Johansson
    AISTATS 2022 [Paper URL]
    We study prediction of future outcomes with supervised models that use privileged information during learning. The privileged information comprises samples of time series observed between the baseline time of prediction and the future outcome; this information is only available at training time which differs from the traditional supervised learning. Our question is when using this privileged data leads to more sample-efficient learning of models that use only baseline data for predictions at test time. We give an algorithm for this setting and prove that when the time series are drawn from a non-stationary Gaussian-linear dynamical system of fixed horizon, learning with privileged information is more efficient than learning without it. On synthetic data, we test the limits of our algorithm and theory, both when our assumptions hold and when they are violated. On three diverse real-world datasets, we show that our approach is generally preferable to classical learning, particularly when data is scarce. Finally, we relate our estimator to a distillation approach both theoretically and empirically.

    Feb 1, 2022

  • Predicting progression & cognitive decline in amyloid-positive patients with Alzheimer's disease

    Hákon Valur Dansson, Lena Stempfle, Hildur Egilsdóttir, Alexander Schliep, Erik Portelius, Kaj Blennow, Henrik Zetterberg, Fredrik D Johansson
    Alzheimer's Research & Therapy [Paper URL]
    In Alzheimer’s disease, amyloid-β (Aβ) peptides aggregate in the brain forming CSF amyloid levels, which are a key pathological hallmark of the disease. However, CSF amyloid levels may also be present in cognitively unimpaired elderly individuals. Therefore, it is of great value to explain the variance in disease progression among patients with Aβ pathology. We studied the problem of predicting disease progression and cognitive decline of potential AD patients with established Aβ pathology using the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database.

    Mar 15, 2021

  • Tocilizumab treatment for rheumatoid arthritis

    Fredrik Johansson
    As of September 2020, the division of Data Science & AI have four brand new PhD students with Fredrik Johansson as main advisor. This marks the start of the Healthy AI lab at Chalmers University of Technology. Adam Breitholtz, Newton Mwai, Lena Stempfle and Anton Matsson begin their doctoral studies, funded by the Wallenberg AI, Autonomous Systems and Software programme.

    Feb 1, 2021

  • Learning to search efficiently for causally near-optimal treatments

    Fredrik Johansson
    NeurIPS 2020 [Paper URL]
    Finding an effective medical treatment often requires a search by trial and error. Making this search more efficient by minimizing the number of unnecessary trials could lower both costs and patient suffering. We formalize this problem as learning a policy for finding a near-optimal treatment in a minimum number of trials using a causal inference framework. We give a model-based dynamic programming algorithm which learns from observational data while being robust to unmeasured confounding. To reduce time complexity, we suggest a greedy algorithm which bounds the near-optimality constraint. The methods are evaluated on synthetic and real-world healthcare data and compared to model-free reinforcement learning. We find that our methods compare favorably to the model-free baseline while offering a more transparent trade-off between search time and treatment efficacy.

    Dec 31, 2020

  • A new group: The Healthy AI lab!

    Fredrik Johansson
    As of September 2020, the division of Data Science & AI have four brand new PhD students with Fredrik Johansson as main advisor. This marks the start of the Healthy AI lab at Chalmers University of Technology. Adam Breitholtz, Newton Mwai, Lena Stempfle and Anton Matsson begin their doctoral studies, funded by the Wallenberg AI, Autonomous Systems and Software programme.

    Sep 1, 2020