Monday 4 June 2012

Artificial Intelligence - Tic-Tac-Toe

I have discussed two AI programs in recent posts. Eliza seems to "respond" intelligently to conversation. Animals can "guess" an animal by asking a series of questions and can "learn" about new animals.

Aside: It is difficult to discuss AI without personifying either the computer program or the computer itself. I have done so, and will continue to do so, in this posting!

The program I am discussing in this posting can also "learn". But it does not need the user to "teach" it. It learns by doing.

The program appeared in the September 1979 issue of "Practical Computing". The issue had a particular focus on "The Intelligent Computer", as you can see from the cover below.

The article in question was "Noughts and Crosses" (Tic-Tac-Toe) and was written by Trevor L. Lusty.

It featured a program written in BASIC which could "learn" how to play the simple game.

The clever thing about the game was that the code told it how to recognise a win, lose, or draw but not how to attack or defend. It learned this by playing.
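The recognition part is simple enough to sketch. Assuming a board stored as nine cells (the original BASIC will have used its own representation; this Python version is purely my illustration):

```python
# The only game knowledge the program was given: how to recognise
# that a game has finished. Board is a list of 9 cells: 'X', 'O', or ' '.

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def result(board):
    """Return 'X' or 'O' for a win, 'draw', or None if play continues."""
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return 'draw' if ' ' not in board else None
```

Everything beyond this — attack, defence, strategy — had to be discovered by the program itself.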

Here is an example of a game which I could win in 5 turns (the human player was always given X, although the two players took turns to start):

As you can see, the program had no idea how to defend against my attack. In fact it just plays in the first available square each time. It would immediately spot that it had lost, however, and would display the following message:
"I concede --- You win --- I'll try harder next time"

Behind the scenes, it would recognise that the move it made in turn 4 was a mistake and record that it should not make that mistake again.
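The article's BASIC had its own data layout, so this is only a hedged illustration of the idea: keep a blacklist of (position, move) pairs, and consult it before falling back to the first free square.

```python
# My sketch of the learning mechanism (names and data structure are mine,
# not from the article). After losing, the program blacklists the move it
# made in that position so it never repeats it.

bad_moves = set()                 # pairs of (board-as-string, move-index)

def record_loss(state_before_move, move):
    """Remember that 'move' from this position led to a loss."""
    bad_moves.add((''.join(state_before_move), move))

def choose_move(board):
    """Play the first available square that is not a known mistake."""
    key = ''.join(board)
    for i, cell in enumerate(board):
        if cell == ' ' and (key, i) not in bad_moves:
            return i
    return None                   # every move is known to lose
```

The "first available square" default is exactly the naive behaviour seen in the game above; the blacklist is what gradually overrides it.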

So starting a new game and repeating the same moves would result in it making the following move in turn 4:
Obviously this move is no more effective than its previous attempt, so turn 5 sees me winning again.

But again the program records its mistake.

Repeating the game over and over sees the program modifying its strategy until it finally blocks the attack and the game moves on:


In this way the program would get better and better, and harder and harder to beat!

But the really clever part of the program was how it would learn not to make moves that would lead it into a no-win situation. My next move shows such a situation (with my next winning moves highlighted):
The program would try blocking those two moves (and then try all other blank spaces) before "realising" that this was a no-win situation. It would then deprecate its previous move (the block it made in turn 4). At this point it would recognise the situation illustrated in turn 3 as a no-win situation. This means that the next time this arose, it would flag the move it took in turn 2 as being a mistake and it would start to try alternative responses to my move in turn 1.
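That cascading step can be sketched too (again, the function and names are mine, not Lusty's): after a loss, walk back through the program's own moves, blacklisting each one, and stop as soon as a position still has an untried alternative.

```python
# Sketch of the "no-win" cascade: when every square in a position is a
# known mistake, the position itself is hopeless, so the move that led
# into it gets blacklisted as well.

def learn_from_loss(history, bad_moves):
    """history: list of (state_string, move) pairs for the program's own
    moves, most recent last. Walk backwards, blacklisting moves, until a
    position with an untried alternative is found."""
    for state, move in reversed(history):
        bad_moves.add((state, move))
        free = [i for i, c in enumerate(state) if c == ' ']
        if any((state, i) not in bad_moves for i in free):
            break   # an alternative remains here; no need to go further back
```

This is why the program needs several losses before the cascade reaches back to turn 2: each loss blacklists only as far as the first position that still has an unexplored reply.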

Genius!

Aside: when I first read the program back in 1979 (age 17) I could not understand the piece I have just tried to explain. So the happy message I take from this is that it shows that we DO become smarter with age and experience!

The program was interesting to play as you could see how it adapted to your strategy (how it became "smarter").

The author himself observed in the article that you could modify the program so that it would play itself a number of times before engaging with the user. In this way it could do the "learning" on its own and be ready to give the player a good game from the outset.
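That suggestion is easy to sketch by combining the pieces above into a self-play loop. Everything here is my own hedged reconstruction in Python, not the article's BASIC:

```python
# Self-play training sketch: the program plays both sides, alternating
# who starts, and the loser of each game blacklists its mistakes.

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def winner(b):
    for x, y, z in LINES:
        if b[x] != ' ' and b[x] == b[y] == b[z]:
            return b[x]
    return 'draw' if ' ' not in b else None

def self_play(games=2000):
    bad = set()                       # (state, move) pairs known to lose
    for g in range(games):
        board = [' '] * 9
        history = {'X': [], 'O': []}
        player = 'XO'[g % 2]          # alternate who starts
        loser = None
        while True:
            key = ''.join(board)
            move = next((i for i in range(9)
                         if board[i] == ' ' and (key, i) not in bad), None)
            if move is None:          # no-win position: concede
                loser = player
                break
            history[player].append((key, move))
            board[move] = player
            w = winner(board)
            if w == 'draw':
                break
            if w:
                loser = 'O' if player == 'X' else 'X'
                break
            player = 'O' if player == 'X' else 'X'
        if loser:                     # cascade blacklist for the loser
            for state, m in reversed(history[loser]):
                bad.add((state, m))
                free = [i for i, c in enumerate(state) if c == ' ']
                if any((state, i) not in bad for i in free):
                    break
    return bad
```

On a 1979 home micro this pre-training would have taken a while in interpreted BASIC, which may be why the article left it as a suggestion for the reader.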