Discover how the Ivan Pavlov-inspired algorithm works

The psychology teacher couldn’t quite figure out why she didn’t like teaching the class on Ivan Pavlov.

Maybe it was because the students were not very interested in the findings of a Russian theology student who lost his faith when he learned about Darwin’s evolutionary theory.

Or perhaps it was her personal aversion to dogs, which irritated her even as a topic of conversation.

And to talk about Ivan Pavlov she had to talk about dogs. There was no way out.

But this Introduction to Psychology class was particularly heavy because of the section she had been assigned.

They were not very numerous, some twenty souls lost in the labyrinth of their telephones with no sparkle in their eyes to give away the spirit of youth.

The Ivan Pavlov of dogs

No way, despite the strong presence of the matrix in the classroom, the work had to be done.

I knew from experience that there is no worse sin for a teacher than to be bored by his own words.

But bored and all, the teacher resorted to the old trick of getting others to talk.

“What do you know about Ivan Pavlov?” she asked, trying to look interested.

The question floated for a few moments like a cloud over the heads of the students and then dissolved into a small shower of nothing.

Dogs and bells

After a few seconds that are not seconds but compact segments of eternity, a girl from the top dared to say as if expressing a trivial comment: “Teach, isn’t that the lord of the dog and the little bell?”

Sure, that’s all right. That’s the way it used to be in these cases. From the classic image of the – for her – disgusting salivating dog, she had to develop her class with a script that she knew by heart:

The salivation of the dogs associated with the sound of the bell and the presence of food would occur even if the food was not present.
This result led Ivan Pavlov to establish the theoretical basis of classical conditioning.
That is, an organism responds to an environmental stimulus, originally neutral, with an automatic or reflex response.

Now it only remained to mention how this derived into a learning theory and to leave behaviorism for the next class.

There you go. No more Ivan Pavlov of the dogs. There was only the usual question to close the class and be a free woman again.

One arm up

“Any questions?” she said this time with more encouragement because almost always everyone is silent and then leaves.

Then, there in the last positions, someone raised an arm.

It was one of those students with such a common face that if she stopped seeing it for a moment, he would be erased from his consciousness almost immediately.

By allowing him to intervene, she realized that his eyes were already saying a lot of things in advance.

“May I add a comment to your class,” he asked politely but firmly.

“Of course!” He wanted to ask her name, but decided to leave it anonymous and away.

“First of all I would like to say that beyond the classical conditioning always associated with psychology, Ivan Pavlov is the great precursor of artificial intelligence and an important architect of the conformation of our contemporaneity.”

Those words hit her teacher’s heart and bounced to an unknown area of her other heart that she was not teaching anyone.

The machine that learns like animals

“In 1951,” continued the student with the common face, “Marvin Minsky, a Harvard student, built an intelligent machine based on the ideas of Ivan Pavlov.

The machine in question learned with animal-like reinforcements.

At that time, neuroscientists had not yet discovered the brain mechanisms that make animals learn this way.

But Minsky was still able to freely imitate behavior, thus advancing artificial intelligence.

At a high level, reinforcement learning follows the intuition derived from Ivan Pavlov’s dogs.

Algorithms don’t salivate but they learn

That is, it is possible to teach an agent to master complex tasks through positive and negative feedback.

An algorithm begins to learn an assigned task by randomly predicting what action might earn it a reward.

It then takes the action, observes the actual reward and adjusts its prediction based on the margin of error.

Over millions of tests, the algorithm’s prediction errors converge to zero.

That’s when you know precisely what actions to take to maximize your reward and complete your task. Fascinating, isn’t it?”

The teacher nodded and suddenly wished she was a different person.

She would have liked for example to be lighter and more spontaneous.

And to love dogs and have one at home, and that by petting it she would make her think of algorithms and not of saliva from hunger and bells.

Translated with www.DeepL.com/Translator (free version)

This post is also available in: Español (Spanish)