The deep learning revolution with John Hopcroft

I was first introduced to John Hopcroft at the opening dinner of the HLF by Jennifer Chayes from Microsoft, who is an ardent supporter of young researchers and eager to help them make useful contacts. John won the Turing Award in 1986 for fundamental achievements in the design and analysis of algorithms and data structures. With a career spanning almost five decades, John is both an inspirational researcher and teacher. I spent some time chatting with John about his current research interest in deep learning, tried to gather some advice on how to be a better researcher, and attended his popular talk. Here are snippets of what I’ve learnt.

© Heidelberg Laureate Forum Foundation / Mueck – 2017

“We are undergoing an information revolution and it’s going to change our entire society. Machine learning is a major part of this and one of the drivers is deep learning,’’ so begins John.

This is a bold statement, but one with solid backing. Machine learning is already transforming almost every aspect of our lives, from image and voice recognition, natural language processing, medical diagnosis and self-driving cars to effective spam filters. Deep learning is an important family of machine learning algorithms, used in DeepMind’s AlphaGo program this year to beat the world’s highest-ranked Go player. These algorithms also dominate computer vision applications, having surpassed human performance on some visual recognition tasks on the ImageNet database in 2015. They have even been used to train computers to paint like van Gogh, Turner or Munch.

While most researchers in deep learning concentrate on its applications, which have been enormously successful, John is more interested in finding out why deep learning is so successful. The theory behind deep learning was already developed in the 1980s and 1990s by Geoffrey Hinton and others, but the machine capabilities at the time were insufficient to handle the large amount of data processing required for the algorithms to work well. However, even with well-developed architectures that are known to be extremely effective, no one actually understands why this is so. Most machine learning researchers I ask simply answer with “it just works’’. To a theoretical physicist, or any kind of theory-oriented scientist, this is very unsettling. So are you not going to find out why, I ask? Many answer that it’s too complicated (“all those layers that you can’t track!’’) or are simply more interested in applications, which, to be fair, are really very interesting.

John takes a different route and wants to find out the why. One way to approach a complicated problem is to dissect it into manageable parts. So I ask John whether any non-trivial insight can be gained at the level of small deep learning circuits. John was clearly very excited. Yes! John has some very bright students in China and one of them recently discovered something unexpected. The student took a very simple network (two layers in a deep learning circuit) and simple images (10 by 10 pixels, black and white) and set it the task of recognising letters of the alphabet. As he was training the network, he found that three gates started to learn the same thing. What is really amazing is that after a few iterations, two of those gates somehow discovered they were redundant and began to learn something different. This raises a compelling question: how did the two gates realise it didn’t make sense for them to learn what the other gate was learning, and how did they decide what new thing to learn? There are also other open research questions: what is it exactly that individual gates learn? How does the learning happening in the second-level gates differ from that in the first level? And more generally, does what a gate learns evolve over time, and if so, how and why?
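The setup John describes is small enough to sketch in a few lines. Below is a minimal, hypothetical reconstruction in Python/NumPy: the network size, the synthetic “letter’’ data and the training details are my own assumptions, not the student’s actual experiment. It trains a two-layer network on 10 by 10 black-and-white images and tracks the largest cosine similarity between the weight vectors of any two hidden units (“gates’’), which is one crude way to watch whether gates that start out learning the same thing later diverge.

```python
# A minimal, hypothetical sketch (not the student's actual code): a two-layer
# network on synthetic 10x10 black-and-white "letters", where we watch whether
# hidden units ("gates") that start out learning the same feature later diverge.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 5 letter "classes", each a random 10x10 binary prototype
# (100 pixels) plus pixel-flip noise. Real letter bitmaps would work the same way.
n_classes, img_dim, n_samples = 5, 100, 500
prototypes = rng.integers(0, 2, size=(n_classes, img_dim)).astype(float)
labels = rng.integers(0, n_classes, size=n_samples)
X = prototypes[labels].copy()
flip = rng.random(X.shape) < 0.05          # 5% pixel noise
X[flip] = 1.0 - X[flip]
Y = np.eye(n_classes)[labels]              # one-hot targets

# Two-layer network: 100 inputs -> 8 hidden sigmoid units -> 5 softmax outputs.
n_hidden = 8
W1 = rng.normal(scale=0.1, size=(img_dim, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_hidden, n_classes))
b2 = np.zeros(n_classes)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def max_hidden_similarity(W):
    """Largest cosine similarity between any two hidden-unit weight vectors:
    a crude redundancy measure for what the 'gates' have learned."""
    Wn = W / np.linalg.norm(W, axis=0, keepdims=True)
    sims = Wn.T @ Wn
    np.fill_diagonal(sims, -1.0)
    return sims.max()

lr = 0.5
for epoch in range(200):
    # Forward pass
    H = sigmoid(X @ W1 + b1)
    P = softmax(H @ W2 + b2)

    # Backward pass (mean cross-entropy loss, plain gradient descent)
    dZ2 = (P - Y) / n_samples
    dW2, db2 = H.T @ dZ2, dZ2.sum(axis=0)
    dH = dZ2 @ W2.T
    dZ1 = dH * H * (1.0 - H)
    dW1, db1 = X.T @ dZ1, dZ1.sum(axis=0)

    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

    if epoch % 20 == 0:
        loss = -np.mean(np.sum(Y * np.log(P + 1e-12), axis=1))
        print(f"epoch {epoch:3d}  loss {loss:.3f}  "
              f"max gate similarity {max_hidden_similarity(W1):.3f}")
```

If the effect John describes shows up, the reported similarity climbs early in training (several gates learning the same feature) and then falls as redundant gates move on to something different; whether and when that happens here depends on the random seed, the data and the hyperparameters.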

While deep learning has been very fruitful and has been hailed as one of the drivers of artificial intelligence, it’s important to understand that there is one major difference between what deep learning currently does and what human brains do. As John says, “deep learning is pattern recognition…it deals with the shape of an image and does not abstract the function or another property of an object in the image.’’ So even though deep learning can recognise bicycles in photographs, it cannot infer from images of bikes that the seat is for a person to sit on. Surely, John, I ask, more data will help? Don’t we humans also learn functionality through more exposure to bikes? What about people from remote tribes who have never seen bikes? Surely they will have similar problems? John was fairly adamant: “No, even if you feed videos of people on bikes to these algorithms, they can’t figure out the function of the bike seat. But humans will be able to, even if they have never seen a bike before. There is something extra the human brain can do that these algorithms cannot do yet.’’

But John is very optimistic. He says, “at the current state artificial intelligence is pattern recognition in high dimensional space and for artificial intelligence programs to extract the essence of an object and understanding its function or other important aspects…another revolution in 40 years may accomplish that.’’ Great! Looking forward to it!

John advises that, at the same time as performing research on deep learning, it’s important to consider its impact on society. These fundamental changes to our society are believed to be comparable in scope to the agricultural and industrial revolutions and will change our notion of human work itself. As John says, “Some countries are already asking what is the percentage of the population needed to produce the goods and services that we need. They figure it’s very small. They are thinking about universal guaranteed income, and how to engage the population in engaging and meaningful ways so that they can occupy themselves.’’

It is not only the expected qualitative structural changes to society, but also the speed of development that is challenging the limits of our adaptability. Even in academic research itself, the pace is extraordinarily fast: breakthroughs in machine learning can occur every few months. So I ask John for some final words of advice for a young scientist trying to navigate this rapidly changing world. He kindly replied,

“No one has experience with living in this new kind of world before. While it’s important to respect the advice of those who came before you, that’s different from taking their advice. Their advice may simply not be valid because they lived in a very different world. Do what excites you the most. You only have one life and you should enjoy it.’’

 


Posted by

Nana Liu is currently a Postdoctoral Research Fellow at the Centre for Quantum Technologies at the National University of Singapore and at the Singapore University of Technology and Design. She recently completed her PhD in Atomic and Laser Physics at the University of Oxford, where she specialized in finding quantum resources responsible for quantum advantages in both quantum computation and quantum sensing. At the moment, her research focus lies at the interface between quantum computation and security, which relies on cross-pollination between physics and computer science.

4 comments

  1. Yes, only humans are curious about “how it works’’. And to be honest: many things just work and we do not know why.
    Because John Hopcroft is not the only curious being, other human beings have also asked “How does deep learning work?’’:
    Naftali Tishby’s answer is the following: Tishby argues that deep neural networks learn according to a procedure called the “information bottleneck’’ (the network rids noisy input data of extraneous details as if by squeezing the information through a bottleneck, retaining only the features most relevant to general concepts; see the sketch after this comment).
    – Theoretical physicists found another answer: “An exact mapping between the Variational Renormalization Group and Deep Learning’’.
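For readers curious what the information-bottleneck picture means concretely, here is a small illustrative sketch in Python/NumPy. It is not Tishby’s code, and the synthetic “input’’, “label’’ and “hidden activation’’ below are stand-ins I made up. It shows the basic measurement used in that line of work: discretise a hidden representation T into bins and estimate the mutual information I(T;X) with the input and I(T;Y) with the label from the empirical joint distributions; the claim in that line of work is that, during training, I(T;Y) grows while I(T;X) is eventually compressed.

```python
# Illustrative only (not Tishby's code): estimate the mutual information
# I(T;X) and I(T;Y) for a hidden representation T by discretising activations
# into bins, as is typically done in information-bottleneck analyses of networks.
import numpy as np

rng = np.random.default_rng(1)

def mutual_information(a, b):
    """Estimate I(A;B) in bits from two 1-D arrays of discrete symbols."""
    a, b = np.asarray(a), np.asarray(b)
    mi = 0.0
    for x in np.unique(a):
        for y in np.unique(b):
            pxy = np.mean((a == x) & (b == y))   # empirical joint probability
            if pxy > 0:
                mi += pxy * np.log2(pxy / (np.mean(a == x) * np.mean(b == y)))
    return mi

# Synthetic stand-in for a trained network: X is a discrete input, Y a label
# that depends on X, and T a noisy "hidden activation" discretised into bins.
n = 5000
X = rng.integers(0, 16, size=n)                      # 16 possible inputs
Y = (X % 2).astype(int)                              # label: parity of the input
T_cont = np.tanh(0.3 * X + rng.normal(0.0, 0.5, n))  # a noisy hidden activation
T = np.digitize(T_cont, np.linspace(-1.0, 1.0, 8))   # discretise into bins

print(f"I(T;X) = {mutual_information(T, X):.2f} bits, "
      f"I(T;Y) = {mutual_information(T, Y):.2f} bits")
```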
