With AI systems largely receiving feedback in a binary yes/no format, Monash University professor Tom Drummond says rich feedback is needed so that systems can understand why their answers are incorrect.
In much the same way children have to be told not only that what they are saying is wrong, but also why it is wrong, artificial intelligence (AI) systems need to be able to receive and act on similar feedback.
“Rich feedback is important in human education. I think probably we’re going to see the rise of machine teaching as an important field: how do we design systems so that they can take rich feedback and we can have a dialogue about what the system has learnt?”
“We need to be able to give it rich feedback and say, ‘No, that’s unacceptable as an answer because …’ We don’t want to simply say ‘No’, because that’s the same as saying it is grammatically incorrect, and it’s a very, very blunt hammer,” Drummond said.
The flaw of the objective function
According to Drummond, one problematic feature of AI systems is the objective function that sits at the heart of a system’s design.
The professor pointed to the match between Google DeepMind’s AlphaGo and South Korean Go champion Lee Se-dol in March, which saw the artificial intelligence beat human intelligence by 4 games to 1.
In the fourth match, the only one in which Se-dol picked up a victory, the machine, after clearly falling behind, played a number of moves that Drummond said would have been considered insulting had a human played them, given the position AlphaGo was in. “Here’s the thing: the objective function was the highest probability of victory. It didn’t really understand the social niceties of the game.
“At that point AlphaGo knew it had lost but it still tried to maximise its probability of victory, so it played all these moves … a move that threatens a large group of stones, but has a really obvious counter and if somehow the human misses the counter move, then it’s won — but of course you would never play this, it’s not appropriate.”
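The behaviour Drummond describes falls directly out of the objective. A minimal sketch, with entirely hypothetical win probabilities and move names (this is not AlphaGo's actual code), shows why an agent maximising only the probability of victory will prefer a long-shot trap over a conventionally respectful losing move:

```python
# Hypothetical candidate moves in a lost position, mapped to the agent's
# estimated probability of victory (all numbers invented for illustration).
candidate_moves = {
    "solid_endgame_move": 0.01,  # a "dignified" move a human would play
    "obvious_trap_move": 0.05,   # wins only if the opponent blunders
}

def choose(moves):
    # The only criterion is the estimated win probability:
    # social convention simply does not appear in the objective.
    return max(moves, key=moves.get)

print(choose(candidate_moves))  # prints "obvious_trap_move"
```

Under this objective a 5% long shot always beats a 1% graceful loss, which is exactly why the moves looked insulting to human observers: nothing in the objective function penalises them.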