The Controversy of Operant Conditioning

Motivation

My interest in operant conditioning was sparked when I started exploring Artificial Intelligence (AI). In contemporary AI, reinforcement learning techniques are frequently used to train models to acquire new skills. Realizing that reinforcement learning is rooted in psychological principles made me want to dig deeper into this field.

Reinforcement Learning

Reinforcement learning essentially involves training an agent to perform desired actions: the agent learns to select the actions that maximize the reward it receives, as scored by a predefined reward function.

The reward function is mathematically represented as

\[R(s, a, s')\]
  • $ s $: Denotes the current state of the system.
  • $ a $: Represents the action selected by the agent.
  • $ s' $: Signifies the subsequent state following the execution of action $ a $.

Note that the reward function is problem-specific: each task needs its own definition of what counts as a good outcome.
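To make the notation concrete, here is a minimal Python sketch of a reward function for a made-up problem: an agent walking along a corridor of five cells that gets rewarded for reaching the last one. The corridor, the `GOAL` cell, the `step` helper, and the reward values are all assumptions I picked for illustration, not part of any standard definition.

```python
# Hypothetical toy problem: a corridor with cells 0..4; the goal is cell 4.
GOAL = 4


def step(state: int, action: int) -> int:
    """Move left (-1) or right (+1), staying inside the corridor."""
    return max(0, min(GOAL, state + action))


def reward(state: int, action: int, next_state: int) -> float:
    """R(s, a, s'): +1 for arriving at the goal, a small cost otherwise."""
    if next_state == GOAL and state != GOAL:
        return 1.0
    return -0.1  # assumed penalty that nudges the agent to finish quickly


s = 3                 # current state
a = +1                # action: move right
s_next = step(s, a)   # subsequent state
print(reward(s, a, s_next))  # 1.0
```

A different problem (say, a game or a robot arm) would need a completely different `reward` function, which is what "problem-specific" means here.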

Does this explanation feel like a bit much? Maybe. Let me explain with an example. Think about when you want your little sibling to learn good habits, like cleaning up toys after playing. To encourage this, you might give them a reward each time they tidy up. Over time, they start wanting to clean up on their own, even without you telling them. This is similar to what we do in machine learning and AI: we improve an agent’s behavior by rewarding good actions, just like you encourage your sibling to clean up their toys!

For a more rigorous explanation with heavier mathematical notation, feel free to visit the Wikipedia page.

Operant Conditioning

Reinforcement learning is basically a subset of operant conditioning. Not sure what “subset” means? See this image.

As you have seen, reinforcement learning is a way to encourage kids (or agents, in the context of machine learning or AI) to do something. Operant conditioning makes use of two big ideas: encouragement (reinforcement learning) and punishment. When you want to encourage kids or machines to do something, you give them a reward. Conversely, if you don’t want kids to do something, you punish them, right? The punishment discourages them from doing what you don’t want them to do. That is the big idea of operant conditioning: using encouraging or discouraging techniques to train your kid to behave as you want them to.
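Here is a minimal sketch of how encouragement and punishment might look in code. One common way to model punishment is as a negative reward, so the learner ends up preferring the rewarded action. The action names, the feedback values, and the learning rate are hypothetical, chosen only to mirror the toy-cleaning example.

```python
# Hypothetical feedback from the trainer: a reward for the wanted behaviour,
# a punishment (negative reward) for the unwanted one.
FEEDBACK = {"tidy_up": 1.0, "leave_mess": -1.0}
LEARNING_RATE = 0.5  # assumed value; controls how fast preferences change


def train(rounds: int) -> dict:
    """Try every action each round and nudge its value toward the feedback."""
    values = {action: 0.0 for action in FEEDBACK}
    for _ in range(rounds):
        for action, feedback in FEEDBACK.items():
            values[action] += LEARNING_RATE * (feedback - values[action])
    return values


values = train(rounds=3)
preferred = max(values, key=values.get)
print(values)     # {'tidy_up': 0.875, 'leave_mess': -0.875}
print(preferred)  # tidy_up
```

After a few rounds the learner values `tidy_up` more than `leave_mess`, which is the code version of a kid who tidies up because it has paid off before.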

For further reading, see this blog.

The Controversy

Great! Now that we’ve wrapped our heads around what Operant Conditioning and Reinforcement Learning entail, let’s dive into the juicy stuff—the controversy! Picture this: it all hit me one lazy afternoon as I pondered the curious ways we educate our kiddos.

Here’s the twist—I find it a tad sketchy that parents often use operant conditioning on their children. Why, you ask? Well, turns out, zookeepers use the exact same technique to train animals! Crazy, right? And here’s the kicker: we’re applying reinforcement learning techniques to teach machines to do what we want them to do. Now, that’s a hot topic up for discussion!

Now, my take on this is a bit of a rollercoaster. On one hand, I reckon we might need a sprinkle of operant conditioning to educate our little humans. But (and it’s a big BUT) we’ve got to make sure they don’t end up behaving well only to dodge punishment or only to earn a reward. After all, what sets us apart as humans? It’s our knack for thinking, our conscious minds, and the free will to decide what we do or don’t do. Let the pondering begin!

This post is licensed under CC BY 4.0 by the author.