Proposing an Alternative to the Turing Test for A.I.

By Wesley Fenlon

Passing the Turing Test, says one computer scientist, doesn't actually make a computer intelligent.

"The science of AI is concerned with the study of intelligent forms of behaviour in computational terms," writes computer scientist Hector Levesque. "But what does it tell us when a good semblance of a behaviour can be achieved using cheap tricks that seem to have little to do with what we intuitively imagine intelligence to be?"

The introduction to Levesque's paper about artificial intelligence, quoted above, offers a different take on A.I. than we usually see from the computer world's smartest minds. Namely, that A.I. sucks. And not just that it sucks--that A.I., as we're developing it, is basically chasing the wrong kind of intelligence.

Levesque recently presented his paper at an international conference on artificial intelligence. He starts by breaking down what he considers important in artificial intelligence: answering questions. Levesque takes issue with how modern A.I. systems, like Apple's Siri, answer questions--if you're not asking something that can be discovered with a Google search, they're probably not going to give you an answer.

While it's still impressive how far natural language recognition has come in the last few years, Levesque argues that applications like Siri aren't where we should be taking artificial intelligence, because they're not very intelligent.

Levesque poses a sample question. "'Could a crocodile run a steeplechase?' Even if you know what crocodiles and steeplechases are, you have never really thought about this question before... And yet, an answer does occur to you almost immediately."

This is the kind of question a computer is bad at, because it can only draw on a knowledge base like the Internet. And since this is an unusual question, it's likely not going to bring up any results, even though it's a question most humans could answer. Levesque writes that the Turing Test, commonly used to measure computer intelligence, "has a serious problem." The Turing Test "relies too much on deception. A computer program passes the test if it can fool an interrogator into thinking she is dealing with a person not a computer."


Levesque goes into more detail about the Turing Test's flaws, but what's important is his proposed alternative, the Winograd schema test, which uses questions that people can answer easily but that are impervious to a Google search. These questions rely on common sense and an understanding of English, which can't be gamed the way the Turing Test can. Computers that pass the Turing Test often fake intelligence to appear as though they can hold up a conversation.
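
For readers who want something concrete, here is a minimal Python sketch of how a Winograd schema question might be represented and scored. The trophy-and-suitcase pair is the well-known example associated with Levesque's proposal; the class name, field names, and the `score` and `first_candidate_solver` helpers are our own illustrative assumptions, not anything defined in the paper.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class WinogradQuestion:
    """One half of a Winograd schema (hypothetical structure for illustration)."""
    sentence: str
    pronoun: str
    candidates: Tuple[str, str]  # the two possible referents of the pronoun
    answer: str                  # the referent a human reader would pick


# The two sentences differ by a single word ("big" vs. "small"),
# and that one word flips which noun "it" refers to.
SCHEMA = [
    WinogradQuestion(
        "The trophy doesn't fit in the brown suitcase because it is too big.",
        "it", ("the trophy", "the suitcase"), "the trophy"),
    WinogradQuestion(
        "The trophy doesn't fit in the brown suitcase because it is too small.",
        "it", ("the trophy", "the suitcase"), "the suitcase"),
]


def score(solver: Callable[[WinogradQuestion], str],
          questions: List[WinogradQuestion]) -> float:
    """Return the fraction of questions the solver answers correctly."""
    correct = sum(1 for q in questions if solver(q) == q.answer)
    return correct / len(questions)


def first_candidate_solver(q: WinogradQuestion) -> str:
    """A deliberately naive baseline: always pick the first candidate."""
    return q.candidates[0]


print(score(first_candidate_solver, SCHEMA))
```

Because the paired sentences are nearly identical, a solver that ignores meaning, like the baseline here, gets only one of the two right, which is no better than guessing.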

It's worth checking out the paper to read more about how the Winograd schema test could be a better judge of artificial intelligence; at the end of the paper, Levesque offers an interesting, broader criticism of the computer science community. "As a field, I believe that we tend to suffer from what might be called serial silver bulletism, defined as follows: the tendency to believe in a silver bullet for AI, coupled with the belief that previous beliefs about silver bullets were hopelessly naïve. We see this in the fads and fashions of AI research over the years: first, automated theorem proving is going to solve it all; then, the methods appear too weak, and we favour expert systems; then the programs are not situated enough, and we move to behaviour-based robotics; then we come to believe that learning from big data is the answer; and on it goes."

We like the ability to search the entire history of human knowledge, but Levesque might be onto something--it's not, exactly, intelligence. "We should not treat English text as a monolithic source of information," he writes. "Instead, we should carefully study how simple knowledge bases might be used to make sense of the simple language needed to build slightly more complex knowledge bases, and so on."