Hot Topic: Deep Learning – Applications and Implications
BLOG: Heidelberg Laureate Forum
As a communicator working in the world of artificial intelligence (AI), I am often frustrated by how much hype tools that use a deep learning approach receive. I think this is a major barrier to a better public understanding of the reality of the research, which is going to be crucial if we want to gain the most benefit from these tools. The hype is not without consequences for the field itself. The discrepancy between what these tools are claimed to achieve and what they can actually do has greatly affected the amount of interest in the field in the past. Among practitioners, I think we also see these kinds of trends surrounding new tools and approaches. There is no arguing that deep learning is experiencing this kind of attention. Over the past several years, the number of papers with a focus on deep learning that have been submitted to the Conference on Neural Information Processing Systems (NeurIPS) – one of the largest academic conferences in the AI space – has grown at a tremendous rate. I think there is only one way to avoid an “attention overload”: talk about the reality of the research.
Therefore, when the 9th Heidelberg Laureate Forum announced that its Hot Topic would focus on deep learning, I was very excited to be asked to moderate the panel. Deep learning is a quintessentially ‘hot topic’. It is surrounded by hype, often referenced but not discussed as thoroughly as it should be. While its successes have been impressive, deploying a deep learning approach is no guarantee of a solution to the problem or question at hand. Through a panel featuring some of the world’s foremost experts in deep learning and machine learning, we explored the applications and implications of this technology both within the field of artificial intelligence and for society in general.
Joining us on the panel were some of the biggest names in artificial intelligence: Yann LeCun and Yoshua Bengio (co-recipients of the 2018 ACM A.M. Turing Award), Shakir Mohamed (DeepMind), Sanjeev Arora (2011 ACM Prize in Computing), Raj Reddy (1994 ACM A.M. Turing Award), Been Kim (Google Brain), Dina Machuve (DevData Analytics) and Shannon Vallor (Centre for Technomoral Futures, Edinburgh Futures Institute).
Foundations
We began with a foundational definition of deep learning provided by Yoshua Bengio. Very simply, these tools are defined by neural networks built from many layers stacked on top of each other. Bengio said the approach that he and his fellow Turing laureates utilized was unique at that time: “What gave them their ‘deep’ label was the idea of composing many or at least several layers of non-linearities in these neural networks.”
Yann LeCun elaborated on Bengio’s answer:
“A very general definition is one in which you assemble a machine by constructing assembling blocks whose function is not completely defined. And then you adjust those functions by minimizing some sort of objective function using gradient descent … but basically you write a program where the function calls are not completely defined and they’re adjusted by training … deep learning is really sort of a general framework. It is what you bring to it, really.”
Bengio also pointed to the tradeoff between data needs and the number of layers being used:
“We – in theory – we don’t need so many layers. We could just have a single hidden layer, but it might need much more data in order to learn the same things.”
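LeCun's description of blocks whose behavior is left open and then adjusted by gradient descent can be made concrete with a toy sketch. The following is only an illustration under stated assumptions, not anything shown by the panel: the function names (`forward`, `loss_and_grads`, `train`), the two-block model, the sine target, the learning rate and the step count are all hypothetical choices made for this example, and real systems compute gradients automatically rather than by hand.

```python
import math

def forward(params, x):
    """Two composed blocks: affine map + tanh, then a second affine map."""
    w1, b1, w2, b2 = params
    h = math.tanh(w1 * x + b1)    # block 1: behavior depends on w1, b1
    return w2 * h + b2, h         # block 2: behavior depends on w2, b2

def loss_and_grads(params, data):
    """Mean squared error (the objective) and its gradient via the chain rule."""
    w2 = params[2]
    n = len(data)
    loss = 0.0
    grads = [0.0, 0.0, 0.0, 0.0]  # order: w1, b1, w2, b2
    for x, y in data:
        pred, h = forward(params, x)
        err = pred - y
        loss += err * err / n
        d_pred = 2.0 * err / n                # d loss / d pred
        d_pre = d_pred * w2 * (1.0 - h * h)   # back through block 2 and tanh
        grads[0] += d_pre * x
        grads[1] += d_pre
        grads[2] += d_pred * h
        grads[3] += d_pred
    return loss, grads

def train(data, steps=2000, lr=0.05):
    """Plain gradient descent from an arbitrary (assumed) starting point."""
    params = [0.5, 0.0, 0.5, 0.0]
    initial_loss, _ = loss_and_grads(params, data)
    for _ in range(steps):
        _, grads = loss_and_grads(params, data)
        params = [p - lr * g for p, g in zip(params, grads)]
    final_loss, _ = loss_and_grads(params, data)
    return params, initial_loss, final_loss

# Assumed toy task: approximate sin(x) on the interval [-2, 2].
data = [(x / 10.0, math.sin(x / 10.0)) for x in range(-20, 21)]
params, initial_loss, final_loss = train(data)
print(f"loss before training: {initial_loss:.4f}, after: {final_loss:.4f}")
```

Nothing in `forward` says what the blocks should compute; the data and the objective determine that during training, which is the sense in which the framework "is what you bring to it".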
While these definitions helped us build the foundational layer of our conversation, we needed to explore more deeply why deep learning tools have had such success in the field and garnered so much attention. Our panel also included a number of researchers working directly with these tools in the field. Shakir Mohamed, Director for research at DeepMind in London, works on technical and sociotechnical questions in machine learning research, exploring problems in machine learning principles, applied problems in healthcare and environment as well as ethics and diversity. Mohamed added some important context and insight around not only deep learning but also the larger field of machine learning in general:
“The magic of deep learning is a particular choice of specific kinds of models that are based on composition, recursiveness, depth of architecture, particular kind of loss functions and then the algorithm based on gradient descent. So I think part of the analytical tools, especially for people who are interested in probability and thinking through reasoning, which is the kind of tool that I use, is to do this decomposition of what is the model that you are building and why is it that you are building this model? What is the inferential process that turns data into insight, and what is the algorithm that you use to actually implement it? And for each of those you can look at it as an object, you can pick it up, you can do analysis, you can do some theory, you can do some empirical testing, you can put it down, and then you can also study them together. And I think that’s a very useful model to understand machine learning as a broad field generally. It’s not just take data, run scikit-learn, do some predictions, don’t care about what it is that we’re actually doing.”
Applications
These tools have shown incredible results in the last several years. Dina Machuve, Co-Founder and CTO of the data science consulting start-up DevData Analytics, provided some real-world examples. Her previous research focused on developing data-driven solutions in agriculture and health, such as utilizing image data and deployment on cell phones in poultry disease prediction. For her, using image data allows her to cross language barriers, a crucial capability, particularly in countries with a great number of languages and often low literacy levels:
“We really find out that image data is the most suitable source of data in such settings, and that’s where deep learning comes in (…) In Africa, we have multiple languages being spoken in the continent – an estimate of about 2,000 languages. So really images form the universal data format, and that is the motivation [for why] we’ve used deep learning approaches in developing these tools for early detection of poultry diseases.”
While these tools have many immediate on-the-ground applications, it is important to keep (realistic) sight of their potential to create great change. Raj Reddy brought that into focus for us. He is a Professor of Computer Science and Robotics at Carnegie Mellon University and was awarded the 1994 ACM A.M. Turing Award “for pioneering the design and construction of large scale artificial intelligence systems, demonstrating the practical importance and potential commercial impact of artificial intelligence technology.” Reddy’s vision for the global impact of these tools is a grand one, surpassing language barriers at their root:
“I think one important application of deep learning is helping the people at the bottom of the pyramid. There are about 2 billion people in the world who cannot read or write, or even if they can read, they cannot understand. … Everyone on the planet can speak. Therefore, if they can speak to the computer, and then have it interact with them, it has a major potential benefit to them. The slogan I use is: ‘in ten years from now, even an illiterate person can read any book, watch any movie, and have a conversation with anyone, anywhere in the world in their native language.’”
Implications
Even with the tremendous impact that these tools are having already and the immense promise they hold, there are major concerns about them. One of the most prevalent concerns about deep learning tools that is often raised in open conversations like this one is about the “black box” nature of these tools. The fact that deep learning uses many layers of neural networks means that sometimes the conclusions reached cannot be traced all the way through the model. There is no way for the designers to track the information all the way from raw data to final output.
Sanjeev Arora brought this point to the fore. He is Professor of Computer Science at Princeton University and received the 2011 ACM Prize in Computing “for contributions to computational complexity, algorithms, and optimization that have helped reshape our understanding of computation.” Arora feels that the connection between this black box nature and the cost function is one that most practitioners do not take into account:
“I think one thing that maybe many practitioners don’t realize is … that the paradigm … is you have a cost function and you’re adjusting it and that’s the learning. Now, the issue is that the cost function doesn’t really determine what the network is doing. There are lots of behaviors you can get out of the same cost function. This is not always appreciated but now in theory it’s been shown. That’s called the implicit bias of the algorithm. It’s not really clear what’s going on underneath. So when people reason about deep nets using just the cost function, I think maybe that’s off and we need more understanding of what’s going on under the covers inside the black box.”
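Arora's point that the cost function alone does not pin down the result can be seen even in a setting far simpler than a deep network. The sketch below is a toy linear illustration, not the deep-learning setting he describes: it assumes a single data point and two weights, so infinitely many weight vectors reach zero cost. The helper names (`gradient_descent`, `loss`) and all numbers are hypothetical. In this linear case, gradient descent started from zero is known to settle on the minimum-norm solution, while a different starting point leads to a different, equally "perfect" answer.

```python
def loss(w, x, y):
    """Squared error of a linear model w.x on one data point."""
    return (sum(wi * xi for wi, xi in zip(w, x)) - y) ** 2

def gradient_descent(w, x, y, lr=0.05, steps=500):
    """Plain gradient descent on the squared error above."""
    for _ in range(steps):
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        w = [wi - lr * 2.0 * err * xi for wi, xi in zip(w, x)]
    return w

# One equation, two unknowns: every w with w[0] + 2*w[1] == 1 has zero cost.
x, y = [1.0, 2.0], 1.0

w_zero = gradient_descent([0.0, 0.0], x, y)    # start at the origin
w_other = gradient_descent([2.0, 1.0], x, y)   # start somewhere else (assumed)

print("from zero init: ", w_zero, "loss:", loss(w_zero, x, y))
print("from other init:", w_other, "loss:", loss(w_other, x, y))
```

Both runs minimize the same cost to essentially zero, yet they return different weights (approximately `[0.2, 0.4]` and `[1.4, -0.2]`). Which solution you get is decided by the algorithm and its starting point, not by the cost function, which is the intuition behind the term implicit bias.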
Been Kim specializes in issues like these. A staff research scientist at Google Brain, her work focuses on helping humans communicate with complex machine learning models – not only by building tools, but also by studying their nature compared to humans. During our conversation, she said she likes to think of this black box problem as one of translation:
“I actually look at it as a conversation. It has to be a two-way thing. If we humans only get to ask questions, we’re getting just half of [the] goodness that’s been learned and solved by the machine. So … the ultimate goal is to create language so that humans and machine can have a conversation. So what does that look like? Well, I speak two languages. I speak Korean and English. And when I translate, what I’m implicitly doing is aligning Korean concepts … to English. Now, this alignment isn’t perfect. It will never be. But it’s useful. You can get the gist across. And I think what we need to do is really to create that language so that we can align these different concepts.”
But it is not enough to consider the technological difficulties of these tools. We also have to examine what our use of them, or reliance on them, means for society. Tools that use artificial intelligence approaches require data to draw their conclusions. Whom that data represents is often shaped by who organized and collected the data set. Oftentimes, these data sets reflect the assumptions and biases of the curators. This is a very natural occurrence, but one with big consequences. Often, we think of our own experience as normative – as it is the only one we truly have complete understanding of – but if these tools are asked to examine data where one experience or outlook is represented as normative and others are not, the conclusions that tool draws from the data will reflect that. These tools are an excellent magnifying glass: they bring into sharper focus patterns that might be too detailed to see from a distance. But when we cannot truly understand or interrogate how the lenses in our magnifying glass are arranged – as is the case with deep learning tools – the dangers of bias are magnified. These are some of the ethical implications of the complexity of these tools. The wider societal implications of the speed of adoption, however, may be far greater.
Shannon Vallor is the Baillie Gifford Chair in the Ethics of Data and Artificial Intelligence at the University of Edinburgh. Professor Vallor’s research explores how emerging technologies reshape human moral and intellectual character and maps the ethical challenges and opportunities posed by new uses of data and artificial intelligence. She brought into focus the importance of keeping in mind that society is not a static thing that will wait while these tools are refined:
“I think that’s misleading when we talk about the impact on society. That imagines that society stays still and these systems just kind of bounce off of it and maybe do a little damage here and there. That’s not what’s happening, and that’s not what will happen as these systems become more and more integrated in our institutions. These systems are not impacting society. They’re reshaping it. And in democratic cultures, the sort of central core value and principle is the idea that people – all people – should have a say in how their lives go, how their lives are governed and the kind of society that’s around them. And I think we really need to ask ourselves: Do computer scientists, large technology companies, software developers have the social license to decide those things for us? Because that’s what’s happening. Not because there’s an intentional desire to take over these political functions of governance. But we have to recognize that that’s what’s happening because of the incentives in the system now.”
Conclusions
One of the key points about current deep learning approaches is the enormous amount of data that they require, to say nothing of the computing power needed to process that data. This has all sorts of implications, from the ecological to the societal. Several of our panelists named work that would allow these models to function with less data as an area where they are excited to see advancement in the field. These “sparse” models might be the next direction for the field.
It is interesting to note that the nature of the topic meant our conversation necessarily stretched across several issues. While deep learning may have been the central focus, this necessitated a conversation on the implications of artificial intelligence in general. I think this speaks to a need for broader conversation on the topic, not only in the public sphere but – beyond merely the development and deployment phase – in research institutions and industrial settings as well.
You can find the full video of the panel on the HLF YouTube Channel.
Current deep learning apps do not yet have much impact on society. But language models, for example, already show that such a broad impact, even a transformation of society, could be possible in the near future. This is because language models such as GPT-3 and multimodal models such as DALL·E 2 have a much larger scope and a much larger audience than smarter, but also more specialized programs such as AlphaGo/AlphaZero.
The development step that will change everything will be the introduction of autonomous systems. Autonomy means the ability to cope with a small world alone without outside help: For example, an autonomous car must know the world of roads as well as a human driver and should be able to drive wherever a person can drive.
An autonomous digital communicator (the same area of application as e.g. Katherine Gorman) for a company should recognize which events should be communicated in which style and which people should be addressed.
These two examples show what autonomy requires: a fairly deep understanding of a particular area. A combination of gut feeling, logical reasoning, experience and the ability to set priorities is part of this ability. Today’s deep learning systems lack these skills because they operate too close to the data used for their training. They lack deeper understanding and the ability to deal with unforeseen events.
Truly autonomous systems will also be able to explain their decisions. They will even understand why it is important to communicate and what advantages a particular communication style has.
We are not there yet. And it seems that deep learning alone is not enough to get there.
Supplement to the role of language models in the broad impact of artificial intelligence: The article “ChatGPT proves AI is finally mainstream — and things are only going to get weirder” reports that ChatGPT is now used by ordinary people for their leisure and also for professional purposes like Excel sheet programming.
Until now, AI has mainly been used by large companies such as Google, Apple, Amazon and Facebook to get a better grip on their customers, but now customers are starting to use AI for their own purposes.