In 1898, a man named Edward Thorndike put a hungry, meowing cat into a box with a mechanical latch and then deviously placed some food just outside the box – out of the cat’s reach. The cat meowed, paced back and forth and rubbed against the walls of the box, just as cats do. In doing so, it happened to trip the latch. Immediately, the door to the box opened and the cat gained access to the food. Thorndike stroked his moustache with pride and repeated the experiment, and with continued repetitions, the cat became more adept at tripping the latch. Eventually, it was able to escape from its boxy confines as soon as food appeared outside of the box.
Thorndike proposed a law of learning (these people enjoyed proposing laws) to account for this phenomenon, which he called the law of effect: an animal’s tendency to reproduce a behaviour depends on that behaviour’s effect on the environment and the consequent effect on the animal (humans are animals too, you know). More simply put, the law of effect states that behaviour is controlled by its consequences… does this remind you of your school days? It should.
Thorndike’s cat ushered in a second form of conditioning, known as instrumental or operant conditioning. Thorndike used the term instrumental conditioning because the behaviour is instrumental to achieving a more satisfying state of affairs. B.F. Skinner, who spent years experimenting with the ways in which behaviour is controlled by the environment, called it operant conditioning, which means learning to operate on the environment to produce a consequence.
In classical conditioning, an environmental stimulus produces a response, while in operant conditioning, a behaviour (or operant) produces an environmental response. Operants are behaviours that are spontaneously produced rather than elicited by the environment. Thorndike’s cat spontaneously emitted the behaviour of brushing up against the latch, which resulted in an effect that conditioned future behaviour. The same goes for Thorndike himself: had he failed, he probably wouldn’t have persisted in these experiments, which others no doubt looked upon as silly (‘so you’re putting cats in boxes now?’). But because he succeeded and gained the respect of his colleagues, he continued his experiments, made a name for himself and placed it permanently in the psychology books.
In this post we will explore two types of environmental consequence that produce operant conditioning: reinforcement, which increases the probability that a response will occur, and punishment, which decreases it.
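If you like to see ideas as code, here is a minimal sketch of that distinction. It is a toy model, not anything Thorndike or Skinner wrote down: the linear update rule, the `step` size and the function names `update_probability` and `run_trials` are all assumptions made up for illustration. The only thing taken from the text is the direction of the change: reinforcement pushes the probability of a response up, punishment pushes it down.

```python
def update_probability(p, consequence, step=0.2):
    """Return the new probability of a response after one consequence.

    The linear rule and the default step size are illustrative assumptions,
    not an established model of learning.
    """
    if consequence == "reinforcement":
        # Move the probability part of the way towards 1.
        return p + step * (1.0 - p)
    if consequence == "punishment":
        # Move the probability part of the way towards 0.
        return p - step * p
    raise ValueError(f"unknown consequence: {consequence}")


def run_trials(p, consequences, step=0.2):
    """Apply a sequence of consequences and return the probability history."""
    history = [p]
    for consequence in consequences:
        p = update_probability(p, consequence, step)
        history.append(p)
    return history
```

Calling `run_trials(0.1, ["reinforcement"] * 5)` gives a list of probabilities that climbs with every trial, a bit like the cat getting quicker at the latch, while a run of `"punishment"` makes the list shrink towards zero.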
Reinforcement – Reinforcement means exactly what the name implies: something in the environment that fortifies, or reinforces, a behaviour. A reinforcer is an environmental consequence that occurs after an organism has produced a response and makes the response more likely to occur again. There are two types of reinforcement: positive reinforcement and negative reinforcement. Are you still reading? Good boy!
Positive reinforcement is the process in which the presentation of a stimulus (usually a reward) after a behaviour makes the behaviour more likely to occur again. The psychological term for the reward in this case is a positive reinforcer. A couple of examples of positive reinforcement would be students showing more effort in class when their behaviour is praised by teachers, and adults going to work every day even though they don’t want to, because they get a paycheck at the end of every week.
Negative reinforcement is the process in which the removal of an aversive stimulus makes a behaviour more likely to occur. Don’t confuse this with punishment because of the word negative; in this context, negative refers to the taking away, or subtraction, of something. An example of negative reinforcement would be using an umbrella in the rain: because the umbrella removes the unwanted stimulus of rain making you wet, the behaviour of using an umbrella in the rain is strengthened. Other examples would be using an air conditioner more often in the summer to get rid of the heat, wearing shorts in the summer to avoid being too hot, or putting on sunscreen to avoid getting sunburnt in the hot summer sun. Yeah, I know I just pulled a whole lot of weather examples out of my hat, but they all reinforce the point I am making…
Negative reinforcers are unpleasant stimuli that strengthen a behaviour by their removal; in the examples above, they are rain, heat and sunburn. Another example: hitting the snooze button on an alarm clock is negatively reinforced by the disappearance of the alarm sound. Sometimes I even manage to hit the snooze button in my sleep; that is how conditioned I am to hate my alarm clock.
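Since positive reinforcement, negative reinforcement and punishment are so easy to mix up, here is a small sketch that sorts them using only the definitions given so far. The function name `classify_consequence` and the string labels it accepts are hypothetical, chosen just for this example. The logic is simply: if the behaviour becomes less likely, call it punishment; if it becomes more likely, check whether a stimulus was presented or removed.

```python
def classify_consequence(stimulus_change, behaviour_change):
    """Label a consequence using the definitions in this post.

    stimulus_change:  "presented" or "removed" (hypothetical labels)
    behaviour_change: "more likely" or "less likely" (hypothetical labels)
    """
    if behaviour_change == "less likely":
        # Anything that makes the behaviour less likely is punishment.
        return "punishment"
    if behaviour_change != "more likely":
        raise ValueError(f"unknown behaviour change: {behaviour_change}")
    if stimulus_change == "presented":
        # A stimulus (usually a reward) is added after the behaviour.
        return "positive reinforcement"
    if stimulus_change == "removed":
        # An aversive stimulus is taken away after the behaviour.
        return "negative reinforcement"
    raise ValueError(f"unknown stimulus change: {stimulus_change}")
```

The snooze button is a case of `classify_consequence("removed", "more likely")`, and the weekly paycheck is a case of `classify_consequence("presented", "more likely")`. Notice that the word "negative" never leads to the punishment branch; only the direction of the behaviour change does.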
Negative reinforcement occurs in both escape learning and avoidance learning. In escape learning, a behaviour is reinforced by the elimination of an aversive state of affairs that already exists; that is, the organism escapes an unpleasant situation, like a cat escaping the rain or a person escaping a really bad party. Avoidance learning occurs as an organism learns to prevent an expected unpleasant event from happening. In this case, avoiding the unpleasant situation reinforces the behaviour of avoidance. Both escaping and avoiding an unpleasant situation negatively reinforce the behaviours of escaping and avoiding. It’s a cycle.