The Imitation Game: Turing and Beyond

BLOG: Heidelberg Laureate Forum

Thanks to the eponymous film, “the imitation game” is a phrase known by many people. I must confess, though, that before the talk by Avi Wigderson (Nevanlinna and Abel Prize recipient) on day two of the Heidelberg Laureate Forum, I wouldn’t have been able to tell you what an imitation game actually was. I might have mumbled something about computers, or even blurted out the words “Turing test”, but beyond that I would have been clueless. I learned over the course of that hour that imitation games are a tool with uses far beyond computing.

I shall begin as Wigderson did, and indeed as Turing himself did: with the Turing test.

I’m not a robot

The Turing test was posed in Alan Turing’s 1950 paper “Computing Machinery and Intelligence”. Perhaps unsurprisingly, the name “Turing test” came about later. The name Turing gave his test was “the imitation game”, providing a small indication of what the test involves.

The purpose of the test is to determine whether or not a machine is able to exhibit intelligence. Put another way: can a robot think?

In the Turing test, a human referee interacts with an entity. The referee does not know whether the entity is itself human or not. At the end of the interaction, the referee must make a judgement as to the entity’s humanity. The robot entity is defined to be intelligent if referees are about as likely to judge it to be human as they are to judge a genuine human entity to be human.

Put more formally, consider a robot 🤖. Let \( p_0 \) be the probability that a referee judges a human entity to be human, and let \( p_1 \) be the probability that a referee judges the robot entity 🤖 to be human. The robot 🤖 is intelligent if for every human referee, \( p_0 \approx p_1 \).

[Image: a dotted line splits the picture down the middle, with a graphic of a woman on the left and a robot on the right. Arrows go from both to a graphic of a man acting as judge, with a question mark above it and a speech bubble below reading “it’s random”; \(p_0\) is written to the left of the speech bubble and \(p_1\) to the right.]
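To see how \(p_0\) and \(p_1\) might be estimated in practice, here is a minimal sketch in Python. Everything in it, from the referee’s guessing rule to the “humanlikeness” parameter, is a toy stand-in of my own invention, not something from Turing’s paper or Wigderson’s talk.

```python
import random

# A toy estimate of p_0 and p_1 by playing the game many times. The
# "humanlikeness" parameter is a made-up stand-in for how convincing an
# entity's conversation is to this particular referee.

def referee_judges_human(humanlikeness: float) -> bool:
    """One round of the game: True means the referee judges 'human'."""
    return random.random() < humanlikeness

def estimate(humanlikeness: float, trials: int = 100_000) -> float:
    """Estimate the probability of a 'human' judgement over many rounds."""
    return sum(referee_judges_human(humanlikeness) for _ in range(trials)) / trials

p0 = estimate(0.90)  # a human entity: even humans are sometimes judged non-human
p1 = estimate(0.88)  # a convincing robot entity
print(f"p0 ≈ {p0:.3f}, p1 ≈ {p1:.3f}")  # close together: this referee is fooled
```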

At this point, I should say that if “intelligence” meant the same thing as it does in everyday parlance, then we would have to have a serious debate over whether or not humans are the right benchmark to be using. Intelligence here is defined as the ability to pass this test. That is precisely the beauty of the test: it circumvents the philosophical debate over what it means to think, and focuses instead on the behavioural viewpoint.

The use of imitation games doesn’t stop at the Turing test though, and neither did Avi Wigderson’s talk. The imitation game can be treated as a framework and applied in many diverse scenarios.

In this paradigm, two things are considered to be the same if they cannot be distinguished by a reasonable test. It’s easy to see that this has many different applications, and Wigderson introduced four of these in his lecture, from fields across mathematics and computer science.

Lock and Key

It is not uncommon to hear conversations about encryption at the Heidelberg Laureate Forum. Indeed, we were treated to a panel on post-quantum cryptography on the Friday of the forum.

In discussing how we protect our messages, one facet that is often overlooked, though, is what it actually means for an encryption algorithm to be “secure”, and how we can measure that.

The initial instinct is often to say that an algorithm is secure if it is impossible for anyone but the intended recipient to decrypt a message that was encrypted using that algorithm. But this raises the question: how do we know that it’s impossible? What if a decryption method exists but just hasn’t been found yet?

A good measure of how secure an encryption algorithm (let’s call it Enc) is, is whether or not two messages encrypted with it can be distinguished. If an encrypted message can’t be distinguished from random noise without the decryption key, then the encryption leaves no information about the initial message readable to an outside observer.

Now it’s time for another imitation game.

Instead of comparing a human and a robot, this imitation game compares two messages \( m_0 \) and \( m_1 \). We encrypt both of our messages using the encryption algorithm, Enc. Instead of a human judge, we have an “efficient algorithm” referee, which is tasked with determining whether the message it receives was originally \( m_0 \) or \( m_1 \).

Let \(p_0\) be the probability that the referee judges Enc(\( m_0 \)) to be \( m_0 \), and let \( p_1 \) be the probability that the referee judges Enc(\( m_1 \)) to be \(m_0\). The encryption is secure if for every efficient-algorithm referee \(p_0 \approx p_1\).

[Image: a dotted line splits the picture down the middle, with a blue padlock labelled Enc(\( m_0 \)) on the left and a purple padlock labelled Enc(\( m_1 \)) on the right. Arrows go from both to a graphic of a computer screen with code on it, with a question mark above it and a speech bubble below reading “it’s random”; \(p_0\) is written to the left of the speech bubble and \(p_1\) to the right.]
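Here is a minimal sketch of this game in Python, using the one-time pad as Enc; the one-time pad, the messages, and the referee’s guessing rule are all illustrative choices of mine, not from the talk. Because a fresh uniformly random key is used each round, every ciphertext is itself uniformly random, so whatever rule the referee follows, its two probabilities coincide.

```python
import secrets

# A toy version of the distinguishing game, with the one-time pad as Enc.
# The referee can be ANY function from ciphertexts to a guess in {0, 1};
# the one below is an arbitrary example.

def enc(message: bytes) -> bytes:
    """One-time pad: XOR the message with a fresh, uniformly random key."""
    key = secrets.token_bytes(len(message))
    return bytes(m ^ k for m, k in zip(message, key))

def referee(ciphertext: bytes) -> int:
    """A toy referee: guess m_0 if the first ciphertext byte is even."""
    return 0 if ciphertext[0] % 2 == 0 else 1

m0, m1 = b"attack at dawn", b"retreat at ten"  # equal lengths, as the OTP needs
trials = 100_000
p0 = sum(referee(enc(m0)) == 0 for _ in range(trials)) / trials  # judged m_0, was m_0
p1 = sum(referee(enc(m1)) == 0 for _ in range(trials)) / trials  # judged m_0, was m_1
print(f"p0 ≈ {p0:.3f}, p1 ≈ {p1:.3f}")  # both ≈ 0.5: this referee learns nothing
```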

Does this sound familiar? It should! This is analogous to the Turing test. We haven’t even scratched the surface of what imitation games can be used for though.

Security and privacy go hand in hand, so it’s perhaps unsurprising that Wigderson also discussed how imitation games can be used to assess privacy.

The specific situation Wigderson considered was the dilemma posed by databases. From data we’ve voluntarily given in surveys, to data submitted in censuses, to data that has been collected without us knowing, our data is stored in databases worldwide. In order for the data to be useful it needs to be detailed, but the more detailed it is, the more identifiable it becomes. A natural question arises: how can we verify that individuals can’t be identified from the data?

In their 2006 paper, Dwork, McSherry, Nissim and Smith proposed using an imitation game. We compare two “adjacent” databases \(D_0\) and \(D_1\), meaning that they differ in precisely one entry: in this case, \(D_0\) includes Jane Doe, but \(D_1\) does not. The referee in this case is any database query \(Q\). A filter is applied to the databases before they are queried. Let \(p_0\) be the probability that \(Q\) returns a given answer on the filtered \(D_0\), and let \(p_1\) be the probability that it returns the same answer on the filtered \(D_1\). The filter is said to be differentially private if for every adjacent \(D_0\) and \(D_1\), and every legal query \(Q\), \(p_0 \approx p_1\).
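As a rough sketch of what such a filter can look like in code, here is the Laplace mechanism, the noise-adding technique proposed in that same paper, applied to a counting query. The databases, the query, and the choice of privacy parameter are all invented for illustration.

```python
import random

# A sketch of a differentially private "filter": instead of answering a
# counting query exactly, we add Laplace noise scaled to the query's
# sensitivity (removing one person changes a count by at most 1).

def laplace_noise(scale: float) -> float:
    # The difference of two independent exponential samples is Laplace-distributed.
    return random.expovariate(1 / scale) - random.expovariate(1 / scale)

def private_count(database, predicate, epsilon: float = 0.5) -> float:
    """Noisy answer to 'how many rows satisfy the predicate?'"""
    true_count = sum(predicate(row) for row in database)
    return true_count + laplace_noise(1 / epsilon)

d0 = [("Jane Doe", 34), ("John Smith", 51), ("Ada King", 28)]  # includes Jane Doe
d1 = [("John Smith", 51), ("Ada King", 28)]                    # adjacent: Jane removed

over_30 = lambda row: row[1] > 30  # a legal query Q
print(private_count(d0, over_30), private_count(d1, over_30))
# The two noisy answers are drawn from nearly identical distributions,
# so no query can confidently tell whether Jane Doe is in the database.
```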

Once again, the imitation game gives us a definition, based very much in the real world. In this instance, we get a definition of the word “private” and it’s a sensible one. If no query can detect that Jane Doe is in the database, then Jane has no need to worry. In a practical sense at least, the filter is private.

“I’m so random”

You may have started to see the pattern by now, and I have one final example to share. This concerns randomness.

Randomness is much like artificial intelligence in that there are many philosophical debates as to whether it truly exists. Some argue that humans cannot simulate randomness (in fact, this was a discussion I had with some other members of the press at the Heidelberg Laureate Forum!). Putting human ability aside, many debate whether, say, a coin flip truly is random because the result is entirely determined by the initial conditions and the toss.

But beyond this debate, it’s important to consider how randomness can be simulated. Regardless of whether or not a coin toss is genuinely random, it’s just impractical to flip a coin every time you need to generate a random number. Suppose, then, that we want a pseudo-random source. What would this look like? How do we define “pseudo-random”? Is the stock market, or rainfall, sufficiently random?

Once again, imitation games come to the rescue in answering these questions, in a method first suggested by Blum and Micali in 1981, and Yao in 1982.

We set up this imitation game with a genuinely random input and our potential pseudo-random input. We give one of these inputs to an algorithm \(A\) (our referee), which is tasked with determining which of the two inputs it has received. As before, \(p_0\) is the probability that the algorithm \(A\) correctly judges the random input to be the random input, and \(p_1\) is the probability that it judges the potentially pseudo-random input to be the random input. The source is pseudo-random if for every efficient-algorithm referee, \(p_0 \approx p_1\).

[Image: a dotted line splits the picture down the middle, with some dice on the left and a rain cloud on the right. Arrows go from both to a graphic of a computer screen with code on it, with a question mark above it and a speech bubble below reading “it’s random”; \(p_0\) is written to the left of the speech bubble and \(p_1\) to the right.]
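Here is one last sketch of the game, with a deliberately simple candidate source (a linear congruential generator) and a single, crude statistical referee; both are my own illustrative choices, and passing one such test is far weaker than the “every efficient referee” requirement in the definition.

```python
import secrets

# A toy distinguishing game for pseudorandomness. The referee sees a bit
# string and must judge whether it came from the genuinely random source.

def random_bits(n: int) -> list[int]:
    """Stand-in for a genuinely random source."""
    return [secrets.randbits(1) for _ in range(n)]

def lcg_bits(n: int, seed: int) -> list[int]:
    """Candidate pseudo-random source: a middle bit of a linear congruential generator."""
    bits, x = [], seed
    for _ in range(n):
        x = (1103515245 * x + 12345) % 2**31
        bits.append((x >> 16) & 1)
    return bits

def referee(bits: list[int]) -> bool:
    """A weak referee: says 'random' if the fraction of ones is close to 1/2."""
    return abs(sum(bits) / len(bits) - 0.5) < 0.02

trials, n = 500, 2_500
p0 = sum(referee(random_bits(n)) for _ in range(trials)) / trials
p1 = sum(referee(lcg_bits(n, secrets.randbelow(2**31))) for _ in range(trials)) / trials
print(f"p0 ≈ {p0:.2f}, p1 ≈ {p1:.2f}")  # similar values: this one referee is fooled
```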

The examples are endless…

These STILL aren’t all the examples of imitation games that Avi Wigderson gave, but I shall stop here.

Throughout the course of his talk, Wigderson gave a whistle-stop tour of mathematics and computer science through imitation games. Turing was famously a polymath, and it’s fitting that his idea is proving important across so much of science.

Avi Wigderson’s talk, as well as the entire scientific program of the 9th HLF, can be viewed on the HLFF YouTube channel.


Posted by

Sophie Maclean is a mathematician and maths communicator, currently studying for a PhD in Analytic Number Theory at King’s College London. She has previously worked as a quantitative trader and a software engineer, and now gives mathematics talks all over the UK (and Europe!). She is also a member of the team behind Chalkdust Magazine. You can follow her on Twitter at @sophietmacmaths

4 comments

  1. Mathematically, the imitation game could be seen as a test of whether two unknown functions are the same, which means that the function value for any input is the same for both functions.

    There is no explicit function description of the two functions and not even a table with function values. As a tester, you must take samples yourself and conclude that the functions are the same or not the same.

    Problem: You can only perform a finite number of tests in a certain time, and if you do not find a difference between the two functions, it only means that you have not found a difference and not necessarily that there is no difference at all.

    But in the end, this situation is very similar to our real life. For example, you can live with a partner and always think that the partner is a nice, lovable person – until the opposite is proven. And this realization that your partner is not a nice, lovable person can even come about after years of living together. Admittedly, maybe your partner has changed in the meantime or he has always been like that and you only found out very late.

    • Roughly so, Mr “Holzherr” (the double quotation marks only because a pseudonym that is not recognisable as such is being used; that is not your real name!); in this respect, from this side’s point of view, equating in the social sphere, which is what this so-called Turing test attempts, should never be attempted.

      The well-known online encyclopedia puts it like this:

      With what was later called the Turing test, Alan Turing in 1950 formulated an idea of how one could determine whether a computer, that is, a machine, possessed a capacity for thought equivalent to a human’s. [source]

      The approach is wrong; to use a filler word for once, completely wrong.

      The correct approach would be to meet an unknown entity halfway with regard to its intelligibility, on the basis of a catalogue of criteria that would have to be developed separately.
      In particular, also to probe, in advance or in retrospect, the interests, needs and anticipated demands of this entity.
      And gladly, where “AI” is concerned, to follow Peter Lustig’s advice and switch off.

      With kind regards
      Dr. Webbaer (who can certainly, indeed fully so to speak, imagine the appearance on this planet of an artificially created “intelligence” which, much like the so-called grey goo, would primarily have a tendency to devour; which might perhaps make pronouncements, perhaps even satisfy a so-called Turing test, and yet arrive with an unseemly air of spectacle)

  2. Sophie Maclean wrote (21. Oct 2022):
    > The Turing test was posed in Alan Turing’s 1950 paper “Computing Machinery and Intelligence”. Perhaps unsurprisingly, the name “Turing test” came about later. The name Turing gave his test was “the imitation game” […]

    Indeed.
    (Incidentally, Turing’s paper doesn’t seem to mention “probability” in relation to those tests…)

    > In the Turing test, a human referee interacts with an entity. The referee does not know whether the entity is itself human or not. At the end of the interaction, the referee must make a judgement as to the entity’s humanity.

    Apparently this requires the referee to make a discrete (binary, Boolean) judgement:
    Either the entity under test is judged “human”, or judged “not human”.
    (Accordingly the referee is asked to “render a judgement”, rather than to “determine a score”.)

    > Put more formally, consider a [one particular, fixed] robot […]
    > Let \( p_0 \) be the probability that a referee judges a human entity to be human

    Does this probability \( p_0 \) derive (as some average) from all (many) relevant referees having played out a suitably large number of tests with some (more or less representative) sample of distinct humans ?, or:

    Does this probability derive (as some average) from one particular (fixed) referee (say referee \(R^k\)) having played out a suitably large number of tests with some (more or less representative) sample of distinct humans ?
    (If so, this probability might be denoted more descriptively as \( p_0^k \).)

    > and let \( p_1 \) be the probability that a referee judges the robot entity to be human.

    Does this probability \( p_1 \) derive (as some average) from all (many) relevant referees having played out one test each with the robot ?, or:

    Does this probability \( p_1 \) derive (as some average) from all (many) relevant referees having played out a suitably large number of tests with this same one robot,
    allowing for the possibility that the same one referee came to different judgements in distinct iterations of the game ?, or:

    Does this probability \( p_1 \) derive (as some average) from one particular (fixed) referee \(R^k\) having played out a suitably large number of tests with the robot,
    allowing for the possibility that referee \(R^k\) came to different judgements in distinct iterations of the game ?
    (If so, this probability might be denoted more descriptively as \( p_1^k \).) Or:

    Does the value \( p_1 \) represent the judgement of one particular (fixed) referee \(R^k\) from one single test with the robot ?
    (If so, this value might be either exactly \(1\), or else exactly \(0\), for instance.)

    Now, depending on the answers to the above questions, it should be possible to consider the subsequent conclusion …

    > […] if for every […] referee, \( p_0 \approx p_1\) [then …]

    … in more detail. Perhaps especially relevant is the following case:

    “If each referee \(R^k\) finds \( p_0^k \approx p_1^k \) then …”

    It may then be asked and sorted out further whether some individual referee \(R^k\) “is even qualified” (if, say, \(p_0^k \lt 0.5\)),
    or whether some individual referee \(R^k\) “can be trusted” (if, say, \(p_0^k \lt p_1^k\) such that \(p_0^k \not\approx p_1^k \)) …

  3. Dr. Webbaer found the so-called Turing test, which reduces substantive work to social acceptance (no other attempt was available), not fitting.
    Perhaps it was only proposed so that something could be proposed.
    There are several examples of this among great thinkers; Immanuel Kant, for instance, developed the so-called categorical imperative, and it does not fit either.
    And anyone who takes a look at contemporary German thinkers, for example Peter Sloterdijk, will not always come away happy either, saying “That is how I must have it, here and now, and immediately!”; Precht and Jordan Peterson, to name just a few names (“namedropping”, lol), likewise remain, from this side’s point of view, not free of the suspicion of having to deliver, so to speak, on social demand.

    With kind regards
    Dr. Webbaer (who is of the opinion that such “tests” may, indeed must, be dispensed with)
