Always keep a human in the loop

BLOG: Heidelberg Laureate Forum

Laureates of mathematics and computer science meet the next generation

This year’s Hot Topic discussed the opportunities and challenges around technological solutions for global health. As a by-product, a veritable treasure trove of tips has been created for innovators to ensure that future technologies are sustainable, do not discriminate against anyone and lead to a healthy future.

When Andreas Reuter asked me whether I could organize and moderate the Hot Topic of this year’s virtual HLF, I was enthusiastic. Not only because it revolved around a pressing topic in digital health, but also because I had already had so many great discussions at the Forum. Moreover, it is always a pleasure to work with the people behind the HLF. As a science and tech journalist, I have frequently written about technical solutions and innovations in the area of public health, and I have often found that there is not enough communication between experts in computer science, medicine and global health. I think an interdisciplinary approach would make innovations in that area much better and much more sustainable.

The idea for the topic came from Shwetak Patel, recipient of the 2018 ACM Prize in Computing for contributions to creative and practical sensing systems for sustainability and health and someone who already had a lot of experience in this field. In our first preparatory meeting as organizers of the Hot Topic, we immediately agreed that we wanted to focus one part of the session on experts in the field of global health and that it is crucial that their knowledge be better incorporated into the worlds of computer science, tech and innovation.

What came out of the Hot Topic session exceeded my expectations. I already knew that bringing the perspectives of different disciplines together and freely talking about challenges would have a great effect. That said, we had an impressively open debate in which many solutions emerged and a lot of guidance was offered for all those working on technical innovations in the health sector or planning to do so.

Details make the difference

Even though the topic for 2020’s Hot Topic was chosen before anyone could have known about the pandemic, recent developments around Covid-19 made it very clear that we are on the right path. Future health care supported by technology will become increasingly important, and we need to discuss the details in order to have an effective outcome for society. On the one hand, sensors, data and artificial intelligence/machine learning seem promising when it comes to our health and creating a healthcare system of the future. On the other hand, despite the promise many innovations and new methods exhibit, the details make the difference. Those details are often an underestimated challenge.

The Hot Topic session highlighted this challenge and asked: What would be possible if we could utilize all conceivable data and react accordingly to the findings – simultaneously protecting people’s privacy and civil liberties? How can we find the best way to take care of our health if we could predict everything one can imagine? What are the most important steps toward a promising future?

The first half of the Hot Topic was devoted to the question of what the actual conditions are in different parts of the world and what poses the largest challenges – from the major questions around the pandemic to seemingly insignificant aspects that make a huge difference. What have we learned so far from the Corona crisis? What would the world look like if all collectible data had been available at the earliest possible time? Which aspects of this knowledge can we use in the future, for future pandemics or viruses? Beyond Covid-19, what are the biggest challenges in global health?

On the first panel, Stefanie Friedhoff from the Harvard Global Health Institute made clear that access to clean water and nutrition is still a big issue in global health. Other fundamental concerns are diseases and a shortage of health care workers, which is where technology could play a special role. “AI is a seven billion dollar market, there is a lot of interest”, she stated. In her experience, it is fairly easy to create an algorithm but really hard to create data infrastructures and the cultural awareness to use them.

Augmenting community healthcare workers with technology could have a great impact, a point echoed by Arunan Skandarajah, a former program manager on the Innovative Technology Solutions team of the Bill and Melinda Gates Foundation who is now a Presidential Innovation Fellow at the FDA. However, Skandarajah warned that the first challenge already lies in the details here: “How exactly can we make the interaction with them richer?” He has seen some promising ideas around measuring malnutrition with mobile devices, but is this something that could replace doctors? “No,” said Ziad Obermayer from the Berkeley Public Health Institute. Although it might be very tempting to view tech and healthcare through the classical lens of automation, this would lead to the wrong association: “Automation is the wrong frame because it is way too ambitious.” Obermayer explained that one reason for this is, “We don’t have the ground truth in medicine.”

Built-in bias

And that led directly to an essential topic in AI: bias in the data. Health data in particular reflect huge inequality in the health system: some people receive more resources, others less. But just because some subpopulations are diagnosed less frequently does not mean that they are less ill. This is precisely the problem that Obermayer uncovered in a machine learning system that was supposed to make the health system more efficient and sustainable. The idea seemed obvious: first, the system calculated which patients currently incur the most costs, and then it examined which patients should be given more resources today so that they incur fewer costs in the future. “But poorer people generate fewer costs because they have less access to health care; non-white people as well”, Obermayer pointed out. Following its own logic, the system suggested that fewer resources should be invested in these populations, as they seemed to be less ill. Obermayer continued, “Some are over-diagnosed – others don’t show up in our data. We are building in all these biases.”
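The mechanism is easy to reproduce with a toy example. The following is a minimal sketch with invented numbers, not the data or the model from the system Obermayer examined: the `Patient` class, the `access` factor and the cost formula are all assumptions made for illustration. It ranks the same four patients once by observed cost and once by their (normally unobservable) need.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    name: str
    illness: float  # hypothetical "true" health need on a 0..10 scale
    access: float   # assumed share of needed care actually received (0..1)

    @property
    def cost(self) -> float:
        # Observed cost depends on need AND on access to care,
        # so low access hides part of the need.
        return 1000 * self.illness * self.access


def top_k(patients, key, k):
    """Return the names of the k patients ranked highest by `key`."""
    ranked = sorted(patients, key=key, reverse=True)
    return [p.name for p in ranked[:k]]


patients = [
    Patient("A", illness=8, access=0.5),  # very ill, poor access
    Patient("B", illness=6, access=1.0),
    Patient("C", illness=5, access=1.0),
    Patient("D", illness=7, access=0.5),  # very ill, poor access
]

# Label choice 1: "who costs the most?" (what the data records)
by_cost = top_k(patients, key=lambda p: p.cost, k=2)
# Label choice 2: "who is the most ill?" (what we actually care about)
by_need = top_k(patients, key=lambda p: p.illness, k=2)

print("selected by cost:", by_cost)
print("selected by need:", by_need)
```

In this toy setting, the cost ranking picks the two patients with full access to care, while the two most ill patients are passed over. The algorithm is doing exactly what it was asked to do; the question it was asked is the problem.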

Skandarajah observed similar biases in training data in dermatology where, again, non-white patients are underrepresented. “What you train on makes a huge difference. That seems obvious, but there is a lot of risk.” He has some hope that AI can help with that by simulating missing data in training data.

Friedhoff raised another central issue: we need to make sure that innovations are sustainable. That starts with knowing the cultural context. “There is a large graveyard of well-meaning projects because people did not understand the people and cultural background”. In Rwanda, for example, a contact-tracing app asked users for their ethnicity, ignoring that the genocide had traumatized people. How can inventors, computer scientists and developers avoid the typical mistakes in technological solutions for the health care system and global health? On the panel, we started a list of things innovators should take care of. The first one came from Friedhoff:

If you develop apps or technology you should make sure that you have somebody on the team from the context or culture.

The second pitfall is connected to funding cycles. Friedhoff reported from projects in Sierra Leone during the Ebola outbreak, where even back then they had really good contact-tracing apps. “But now in times of Covid all the work is done again on paper,” she said. Why? Because after Ebola all the NGOs left – with their apps. This led to the second point:

Developers must thoroughly cooperate with local institutions to ensure that their work is sustainable and does not disappear once they are no longer on site.

They have to make sure that technical developments, documentation and necessary equipment can remain on-site and that someone is dedicated to maintaining and updating apps and programs. Our list continued to grow and concluded with the following questions, mainly asked by Friedhoff:

Ask yourself before you start:

  • Is the technical solution the best solution for the problem?
  • Do you understand the context you are working in?
  • Do you actually have the data you need for your solution and do you understand it?
  • Did you make sure that your idea is shared with local authorities?

Bias in training data and machine learning systems can also be included on this list, as Obermayer stated next: “Algorithms are very mechanical, we set out the goals for them. But they can’t tell us whether the goal is the right one.” He pointed out that the central challenge is not accuracy but asking the right questions. Asking the wrong questions without realizing it is called “label choice bias”. According to Obermayer, we will never get rid of these biases totally, “but I think it is not hopeless – we have to look at long-term follow-up data.” These data can tell us whether we have asked the right questions and, ideally, reveal whether the intervention works. So the advice is:

Make sure to ask the right questions. Discuss your goals with an interdisciplinary team.

Finally, maybe one of the biggest challenges when it comes to healthcare and technology is privacy and data protection. While there are cultural differences to consider, the computer science community is very committed to privacy, which itself could be a bias. Of course, there is always a privacy risk connected to using health data. “But we tend to be less attuned to the other side of risks which is not using the data,” Obermayer stressed. Those risks are harder to see because those are things that never happen. We can imagine what a data breach looks like, but how can we imagine an algorithm that was never developed because nobody had access to those data? “But what if this algorithm could have saved 1000s of lives?” Obermayer challenged. “We are not doing a very good job in trading off those two types of risks.”

Our investigation continued during the second panel, dedicated to recent and possible solutions. How can machine learning help us to fight the pandemic for example? Which apps and tools currently help to improve health? Where are the challenges and what are the limitations of statistical power?

Aisha Walcott from IBM Research Africa demonstrated their project on non-pharmaceutical interventions for Covid-19: a database using natural language processing (NLP) to extract all kinds of interventions in the pandemic like social distancing, school closures, travel restrictions and many more out of Wikipedia articles. The idea behind this project is to help researchers find correlations and ideally causalities between measures and their effects, thus helping society to fight the pandemic by putting the right measures into place.

Katherine Chou, director of Healthcare & Life Sciences at Google, is working with machine learning as well. She presented Google search trends as a means to predict the next outbreaks. The idea is similar to the one behind Google Flu Trends: if people in one area are searching intensively for keywords connected to the flu or, in this case, Covid-19, such as fever or cough, there might be an outbreak underway. But Google Flu Trends, launched in 2008, failed to predict major outbreaks and predicted outbreaks that never happened – which can be a problem if the healthcare system uses that tool to allocate resources, because they might not be available where they are needed most. Chou admitted that Search Trends for Covid-19 “was interesting but not that useful for epidemiology”. It was much more accurate than Flu Trends and included 400 different search terms. “It is maybe more useful for hypothesis testing,” she said.

Chou believes that a promising solution for the prediction of outbreaks is agent-based simulations. In these simulations, agents act like humans, and researchers are trying to map real-world behaviors to these agents on a scale where they are also present in the real world.
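For readers who have never seen one, here is a minimal sketch of what such an agent-based simulation can look like. It is not the model Chou described: the function name `simulate`, the random-mixing assumption and every parameter value (contacts per day, transmission probability, infectious period) are invented for illustration, and real models map far richer behavior onto their agents.

```python
import random


def simulate(n_agents=200, n_initial=3, contacts_per_day=4,
             p_transmit=0.05, days_infectious=5, n_days=60, seed=0):
    """Toy epidemic: every agent is susceptible (S), infectious (I) or recovered (R)."""
    rng = random.Random(seed)
    state = ["S"] * n_agents
    days_left = [0] * n_agents  # remaining infectious days per agent

    for i in rng.sample(range(n_agents), n_initial):
        state[i] = "I"
        days_left[i] = days_infectious

    history = []
    for day in range(n_days):
        newly_infected = set()
        # Each infectious agent meets a few randomly chosen agents per day.
        for i in range(n_agents):
            if state[i] != "I":
                continue
            for _ in range(contacts_per_day):
                j = rng.randrange(n_agents)
                if state[j] == "S" and rng.random() < p_transmit:
                    newly_infected.add(j)
        # Infectious agents recover once their infectious period is over.
        for i in range(n_agents):
            if state[i] == "I":
                days_left[i] -= 1
                if days_left[i] == 0:
                    state[i] = "R"
        # Agents infected today become infectious from tomorrow on.
        for j in newly_infected:
            state[j] = "I"
            days_left[j] = days_infectious
        history.append((state.count("S"), state.count("I"), state.count("R")))
    return history


history = simulate()
print("day 1 (S, I, R):", history[0])
print("last day (S, I, R):", history[-1])
```

Changing a single assumption, for example halving `contacts_per_day` to mimic social distancing, and comparing the resulting curves is the kind of "what if" question such simulations are built for.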

Mobile devices are great vessels for objective measures

Shwetak Patel’s talk – “Learning from Global Health Research to Address the Current Pandemic” at the Virtual HLF.

Google also offered its mobility reports at the very beginning of the pandemic. Thanks to the data from mobile devices, it was possible to accurately track the extent to which people adhered to the guidelines of staying at home and traveling as little as possible.

Shwetak Patel finds mobile devices promising beyond being tools for data collection. “They are a way to get objective measures out there.” After all, mobile phones are the most ubiquitous computing platform. Combined with low-cost sensors, they could help a lot when it comes to global health.

Patel used the current example of cough monitoring. Could it be possible to detect super spreaders just by the way they are coughing? Is cough an indicator for spread? Systems using machine learning can “hear” more than humans or doctors – they can detect patterns we are not aware of.

Centralized or decentralized?

Privacy was also central to this panel’s discussion, but this time from a very practical point of view. Health data are sensitive data after all: How can we make sure that those data do not end up in the wrong hands? Walcott described an interesting technical solution deployed in Kenya. IBM is experimenting with a system that moves consent management into a blockchain system, and patients can also share their data via the blockchain. “Trust was a big thing from policymakers. Blockchain is great for trust because it is completely transparent and traceable.”

Chou feels that an informed user should have the choice of which data they want to share: “Some data is only existing on my device; some data I am totally fine if they are aggregated.” The Community Mobility Reports did not offer data for some regions because not enough users had opted in. So, there were not enough data to make sure that differential privacy works and that users are kept anonymous.
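Why too few users can mean no data at all becomes clearer with a small sketch. This is not Google's implementation: the function `release_counts`, the threshold `min_users` and the privacy parameter `epsilon` are assumptions chosen for illustration. The sketch adds Laplace noise to each regional count, a standard building block of differential privacy, and withholds any region whose noisy count is too small to hide an individual.

```python
import random


def release_counts(counts, min_users=100, epsilon=1.0, seed=None):
    """Return noisy per-region counts; regions below the threshold are withheld (None)."""
    rng = random.Random(seed)
    released = {}
    for region, n in counts.items():
        # The difference of two exponential draws with rate epsilon is
        # Laplace noise with scale 1/epsilon (one user changes a count by at most 1).
        noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
        noisy = n + noise
        # Thresholding the already noisy value keeps the decision itself private.
        released[region] = round(noisy) if noisy >= min_users else None
    return released


released = release_counts({"big_city": 5000, "small_town": 5}, seed=1)
print(released)
```

For the large region the noise is negligible compared with the count, so the published number stays useful. For the small region, publishing anything would either reveal too much or be mostly noise, so the region simply does not appear in the report.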

Accountability is another big challenge when it comes to practical technical solutions for problems in health care. Everyone involved must be confident that the data behind a machine learning system are representative and suitable for answering the questions. That made our list of advice: find out whether your data can really answer your questions. Chou detailed a project in which they predicted breast cancer but the models had higher false-positive rates than doctors. “So the combination between the system and a doctor was much more accurate,” she explained, “a human should always be in the loop.”

Walcott echoed the sentiment and said that in her project, the researchers make sure that there is always a human in the loop, controlling that there are no misunderstandings between the algorithm and the data (text written by humans).
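One simple way to picture such a setup is to treat the model as a second reader rather than as a decision maker. The sketch below is a hypothetical illustration, not the workflow of either project: the function `review`, the score threshold and the routing labels are invented. The doctor's judgment always stands when both agree, and any disagreement is sent to another human instead of being settled by the algorithm.

```python
def review(case_id, model_score, doctor_flag, threshold=0.5):
    """Combine a model score with a doctor's judgment; the model never decides alone."""
    model_flag = model_score >= threshold
    if model_flag == doctor_flag:
        # Model and doctor agree: the doctor's decision is recorded.
        return {"case": case_id, "decision": doctor_flag, "route": "agreed"}
    # Disagreement: no automatic decision, a second human takes a look.
    return {"case": case_id, "decision": None, "route": "second_opinion"}


results = [
    review("scan-1", model_score=0.91, doctor_flag=True),
    review("scan-2", model_score=0.80, doctor_flag=False),  # possible false positive
    review("scan-3", model_score=0.10, doctor_flag=False),
]
for r in results:
    print(r)
```

In this arrangement a false positive from the model costs a second look rather than a wrong diagnosis.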

We concluded with our final entry for the list, which can hopefully help future developers and innovators to find timely answers to important questions. In this case:

Always keep a human in the loop.

Posted by

Eva Wolfangel is an award-winning science and reportage journalist as well as a speaker and moderator, focusing on future technologies such as artificial intelligence and virtual reality, computer science, data journalism, interaction between digital and real worlds, and space travel. She writes for major magazines and newspapers in Germany and Switzerland, including ZEIT, Geo, Spiegel, and NZZ, and produces radio features. As a VR reporter, she reports from virtual worlds as part of the journalistic cooperative RiffReporter. Twitter: @evawolfangel. Photo by Helena Ebel


  1. Eva Wolfangel wrote (09. Dec 2020):
    > […] In the community, mobility reports did not offer data for some regions because not enough users hadn’t opted-in. […]

    May I suggest: “… because not enough users had opted-in.”

    > [… Katherine Chou, director of Healthcare & Life Sciences at Google, ] feels that an informed user should have the choice of which data they wanted to share: “Some data is only existing on my device; some data I am totally fine if they are aggregated.”

    Echoing this feeling, I’d hereby like to share (in translation) data which I had submitted already on 31. Nov 2020 for SciLogs comment aggregation (

    Lars Fischer wrote (29. Nov 2020):
    > […] The space of the game of Chess […] consists of 64 fields interrelated to each other by certain rules [… I’d like to know more specifically] which type of space it presents [in a mathematical sense.]

    As far as the rules of Chess determine foremost how the different Chess pieces may move in case each is on the Chess board alone by itself (i.e. before these “principal rules of motion” are further modified in certain ways by other figures being present), the space of the game of Chess can be considered the aggregate of several metric spaces (or certain generalizations of metric spaces), which are all defined “on” the same set of 64 fields;
    where for each individual Chess piece, by itself, following its applicable “principal rules of motion”, there is a distance defined between (distinct) fields by how many moves the piece requires at a minimum in order to reach the destination field from the field where it starts.

    Accordingly we may distinguish (and consider jointly making up “The Space of the Game of Chess”):

    – “the king’s distance between any two distinct fields” (values 1 “step” to 7 “steps”),

    – “the queen’s distance between any two distinct fields” (values 1 “move” to 2 “moves”),

    – “the rook’s distance between any two distinct fields” (values 1 “move” to 2 “moves”),

    – “the knight’s distance between any two distinct fields” (values 1 “move” to 6 “moves”),

    – “the bishop’s distance between any two distinct fields” (values 1 “move” to 2 “moves”; separately either “only on the white fields” or “only on the black fields”), and

    – “the pawn’s distance between any two distinct fields” (values 1 “move” to 5 “moves”; by the “principal rules” only “straight forward”, and separately for each individual pawn. Therefore, to be more correct, we may distinguish sixteen “pawn quasi-distances”, each defined on a certain subset of the 64 fields).

    Why, then, would the (usual) Chess board be considered 2-dimensional nevertheless? —
    Most obviously perhaps, because there are two armies of chess pieces (“the white” and “the black”) facing each other (and they are both “advancing towards each other”; in a strict sense at least as far as the pawns may move), and their initial positions both have a certain “breadth” (of 8 fields, or 7 king steps).

    In order to represent and to distinguish whole Chess matches, 3 dimensions are sufficient; schematically:

    { 64 fields } ⊗ { 32 pieces } ⊗ { Number of (half) moves } .

      • Herr Senf wrote (09.12.2020, 19:02 o’clock):
        > … and what does that have to do with the topic?

        As far as “the topic” can be discerned and as far as SciLogs comments cannot be without having to do with “the topic”, surely “the topic” is (inevitably) “CoViD-19”, and my commenting on Lars Fischer’s apparent pause from administering his SciLog cannot stand without wishing that he’s going to get through this pandemic alright, along with all of “us old farts”.

        • Frank Wappler wrote (10.12.2020, 21:19 o’clock):
          > […] my commenting on Lars Fischer’s apparent pause from administering his SciLog […]

          It bears pointing out that the comment I had submitted to Lars Fischer’s SciLog has meanwhile emerged from “moderation” (along with two comments by other authors, which had been submitted on 01.12.2020 as well; and no others).

          »Attributing to malice those failures for which incompetence need not necessarily be blamed has a wonderful way of sharpening attention.«
