Operant Behavior

As we saw, Pavlovian conditioning is the process by which inborn reflexes, including those reactions that we usually consider emotions, spread to new situations and settings.

Pavlovian conditioning is largely responsible for our motivation to respond in any situation. Operant conditioning, on the other hand, is what we learn to do to satisfy these motivational states.

Pavlovian conditioning is often considered to be involuntary. The organism has little control over the response, in that the environment elicits the behavior from the organism. Operant behavior involves voluntary behavior that is emitted by the organism and is controlled by its consequences (hence the term consequated). Another way of looking at this difference is to look at the relationship between behavior and the environment. In Pavlovian conditioning, the relationship is stimulus-response (S produces R), but in operant conditioning it is response-stimulus (R produces S). Whether responses occur in the future depends upon the nature of the contingency. If the consequence makes life better for the organism, the response will likely occur again in the future (reinforcement), and if it makes life worse, the response will likely not occur again in the future (punishment).

In the early stages of twentieth century behaviorism, no differentiation was made between what we now refer to as Pavlovian (or, in Skinner's terminology, respondent) conditioning, and instrumental (or, operant) conditioning. While the differences at first blush seem profound, we will eventually see that Pavlovian and operant are two halves of an integrated view of learned behavior--Pavlovian by itself isn't of much value--it sets us up (i.e.--provides the motivation) to act, and the action is the operant part of our behavior. By the same token, operant doesn't operate in a vacuum--we need to be motivated to act.

The 2x2 contingency table below summarizes, in a very simplistic way, the basic relationships that our behavior can have with our environment. When it's all boiled down to its simplest elements, there are two types of events (stimuli) in the world--those that we like (appetitive) and those we don't (aversive). Our actions (at least those that have an effect) can lead to one of these events either being added to our environment (positive) or withdrawn (negative). The terms positive and negative do not connote value judgments on whether the behavior is good or bad--simply whether a stimulus is being added to the environment (positive) or removed from the environment (negative).

As we'll see throughout the rest of the semester, life is really much more complex than this, but this is still a good starting place for understanding operant behavior.

Contingencies of Reinforcement

                                  Stimulus
Response          Appetitive                Aversive

Presentation      Positive Reinforcement    Positive Punishment

Withdrawal        Negative Punishment       Negative Reinforcement

Reinforcement and Punishment. The terms reinforcement and punishment refer to a relationship between behavior and a resulting environmental change, very much like the term reflex referred to a relationship between a stimulus and a resulting change in behavior. In this case, the term reinforcement refers to a relationship between behavior and environment that results in an improvement of conditions for the individual, and punishment refers to a relationship that results in a worsening of conditions. If animals are hedonistic, then they should increase the frequency of acts that are reinforced, and decrease the frequency of acts that are punished.
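
The four cells of the contingency table can be written out as a small lookup. The following is only a minimal Python sketch for checking your own examples against the table; the names Stimulus, Operation, and classify_contingency are made up for this illustration and are not standard terminology.

```python
from enum import Enum


class Stimulus(Enum):
    APPETITIVE = "appetitive"  # an event the organism likes
    AVERSIVE = "aversive"      # an event the organism does not like


class Operation(Enum):
    PRESENTATION = "positive"  # the response adds the stimulus
    WITHDRAWAL = "negative"    # the response removes the stimulus


def classify_contingency(operation, stimulus):
    """Return (contingency name, predicted change in response frequency)."""
    # Conditions improve when something liked is added or something
    # disliked is removed; in the other two cells they worsen.
    improves = (operation is Operation.PRESENTATION) == (stimulus is Stimulus.APPETITIVE)
    kind = "reinforcement" if improves else "punishment"
    effect = "increases" if improves else "decreases"
    return f"{operation.value} {kind}", effect


if __name__ == "__main__":
    # Print all four cells of the 2x2 table.
    for operation in Operation:
        for stimulus in Stimulus:
            name, effect = classify_contingency(operation, stimulus)
            print(f"{operation.name:<12} + {stimulus.name:<10} -> {name} ({effect})")
```

Note that the sketch takes the nature of the stimulus as a given. Deciding whether a real event is appetitive or aversive for a particular individual is the hard part, and the code does not help with that.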

Blinky

OK, that's quite a bit of terminology/concepts in a short period of time. Click on Blinky and take a short quiz to try to attach these concepts to real-life experiences.

Positive Reinforcement. The individual's behavior adds something desirable to the environment, thereby increasing the probability, under similar circumstances, of that behavior occurring again in the future. Positive reinforcement is what most people think of first when they think of operant conditioning, and it is what most applications of operant conditioning attempt to promote. In everyday terms, positive reinforcement means that behavior will be rewarded. For example, a student studies hard because she wants to get a good grade or, in more precise scientific terms, because studying has produced high grades in the past.

Some examples of positive reinforcement submitted by students in previous semesters.

Negative Reinforcement. The individual's behavior removes something undesirable from the environment, thereby increasing the probability, under similar circumstances, of that behavior occurring again in the future. This contingency differs from the other three in that it is usually further divided into two components, escape and avoidance.

The difference between the two lies in which state is in effect when the behavior occurs--did the behavior lessen the existing aversiveness (escape), or did it prevent the impending onset of aversiveness (avoidance)? Since escape involves an observable change in the environment, it is usually easier to see by casual observation. Avoidance is more problematic, since it typically involves no change in the environment (an impending increase cannot be seen). Avoidance is, however, the proverbial exception that proves the rule--since it involves behavior but no resulting change in the environment, yet is highly resistant to extinction, the only way to understand it is to look at the past history of the organism, where we will no doubt find a history of lack of responding being associated with increased aversiveness and responding being associated with no corresponding increase in aversiveness.
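
The escape/avoidance distinction can be stated as a simple decision rule. This is a rough Python sketch under stated assumptions: the function name and the idea of scoring "aversiveness" as a number are hypothetical conveniences for illustration, and the value for what would have happened without the response has to come from the organism's past history, not from anything observable in the moment.

```python
def negative_reinforcement_kind(before, after, expected_without_response):
    """Classify a response as 'escape', 'avoidance', or None.

    All three arguments are hypothetical aversiveness scores (higher = worse):
      before                    -- level just before the response
      after                     -- level just after the response
      expected_without_response -- level that past history says would have
                                   followed if the organism had not responded
    """
    if after < before:
        # The response lessened aversiveness that already existed.
        return "escape"
    if after == before and expected_without_response > before:
        # Nothing observable changed, but an impending increase was prevented.
        return "avoidance"
    # Neither pattern: this response was not negatively reinforced.
    return None


# Existing aversiveness drops from 5 to 1: escape.
print(negative_reinforcement_kind(before=5, after=1, expected_without_response=5))
# Nothing changes, but history predicts a rise to 4 without the response: avoidance.
print(negative_reinforcement_kind(before=0, after=0, expected_without_response=4))
```

The second call shows why avoidance is hard to spot by casual observation: the before and after values are identical, and only the history-based third argument distinguishes it from doing nothing at all.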

See some examples of negative reinforcement, and decide which of these is escape and which is avoidance!

A couple more examples of negative reinforcement submitted by students in previous semesters.

Positive Punishment. The individual's behavior adds something undesirable to the environment, thereby decreasing the probability, under similar circumstances, of that behavior occurring again in the future. Sometimes we do things that cause us to experience pain, reducing the probability of us doing that again. You put your hand on a hot stove, and you get burned. You are less likely to put your hand near the stove again. As I related in class, when I was in sixth grade, we were learning about electricity, and Mr. Carlson, my sixth grade teacher, had set up a phonograph such that the phonograph would play when a circuit was completed by touching the bare ends of an insulated wire to two exposed poles. I picked up two bolts, one in each hand, and touched the two poles of the phonograph, completing the electrical circuit. The phonograph started to play (I remember quite vividly that it was a 78 rpm record of Ricky Nelson singing Be Bop Baby). Unfortunately, my body was acting as the "wire." (Duh!) The violent electric shock I received was very painful, and I never did that again (One trial learning!)

Some examples of positive punishment submitted by students in previous semesters.

Negative Punishment. The individual's behavior removes something desirable from the environment, thereby decreasing the probability, under similar circumstances, of that behavior occurring again in the future. Not all of what we do that is punished is physically painful. Often, the consequences of our behavior are that we lose something of value to us. This type of punishment is especially common when we look at the sanctions that society places on behavior that it deems inappropriate. For example, the wise and all-knowing city fathers of St. Joseph, MN, have decreed that, in their infinite wisdom, it is a Sin to make a U-turn in their fair city, a Sin that is punishable by the Sinner being fined $49.00 (or at least that's what it was when I was caught Sinning). Notice that nothing physically painful happened to me, but this bright boy will probably not make a U-turn in St. Joseph again. Also note that one of the points that B.F. Skinner makes about the effects of punishment is that not only is the punished behavior suppressed, but the organism tends to keep away from anything associated with the punishment. I tend to stay away from (not avoid!) St. Joe as much as possible. To think about: Why didn't I want to say that I avoid St. Joe?

Another point to keep in mind about punished behavior is that it occurs because, presumably, at some time in the past it was reinforced. I made a U-turn because I wanted to go the other way. Making U-turns, at least in cities with less anal retentive city councilpersons, had been reinforced in the past by my getting to where I wanted to go more quickly. I still make U-turns if I can get away with it--I'm just more vigilant about where and when. So, sometimes the outcome of punishment is a more discriminating organism, at least in terms of when the punished behavior will be emitted.

Some examples of negative punishment submitted by students in previous semesters.

If you would like to have a little more practice figuring out what's what with operant conditioning, try this exercise.

Premack Principle. We usually think about operant conditioning in stimulus-response terms--an organism does some act (behavior, or response) which leads to some change in the environment (a reinforcing or punishing stimulus). There are a couple of problems with this--the term response seems out of order--a response and then the stimulus? This is another one of those unfortunate products of history--we've tried to stretch a set of terms beyond their usefulness. There is another problem as well, though--talking about reinforcers and punishers brings up the philosophical question of "What is a reinforcer?" Many productive years have been wasted trying to answer that question. David Premack supplied a way out of the dilemma--just look at what an organism does as what is important, and do not concern yourself with determining what is a reinforcer. An example may help. We normally think of a rat pressing a lever to get food, and diagram the operation as:
[Diagram: Premack--bar press]
However, Premack would first look at what a hungry rat might do if given the opportunity to engage in all possible behaviors (or at least those available at the time), and construct a hierarchy from most probable to least, as diagrammed below:
[Diagram: Premack hierarchy without contingencies]
As shown, a hungry rat would spend most of its time eating, a fair amount of time drinking, some grooming, exploring, etc., but very little lever pressing. However, once eating, a high probability behavior, becomes contingent upon lever pressing, then lever pressing becomes much more likely (its rate of response increases), as diagrammed below:
[Diagram: Premack hierarchy with contingencies]
In other words, in Premack's view, the rat is engaging in a less desired behavior, bar pressing, to get the opportunity to engage in a more desired activity, eating. By looking at it this way, a reinforcing event is simply one that is more probable than the one upon which it is contingent. Much of our behavior can be viewed in this way--we work (less desired activity) in order to do the things that we enjoy.
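
Premack's rule of thumb can be sketched in a few lines of Python. The baseline figures below are invented purely for illustration (they are not data from any experiment), and the function names are hypothetical; the only idea taken from the discussion above is that one behavior can reinforce another when it is the more probable of the two.

```python
# Hypothetical proportions of time a hungry rat spends on each behavior
# when all of them are freely available (illustrative numbers only).
hungry_rat_baseline = {
    "eating": 0.50,
    "drinking": 0.25,
    "grooming": 0.12,
    "exploring": 0.10,
    "lever pressing": 0.03,
}


def hierarchy(baseline):
    """Return the behaviors ordered from most probable to least probable."""
    return sorted(baseline, key=baseline.get, reverse=True)


def can_reinforce(contingent, instrumental, baseline):
    """True if access to `contingent` should reinforce `instrumental`.

    In Premack's view, a behavior works as a reinforcer when it is more
    probable at baseline than the behavior it is made contingent upon.
    """
    return baseline[contingent] > baseline[instrumental]


print(hierarchy(hungry_rat_baseline))
# Eating is made contingent upon lever pressing.
print(can_reinforce("eating", "lever pressing", hungry_rat_baseline))
# The reverse arrangement should not reinforce eating.
print(can_reinforce("lever pressing", "eating", hungry_rat_baseline))
```

The point of writing it this way is that nothing in the code has to say what a reinforcer "is." The answer falls out of the ordering of behaviors, and it would change if the baseline changed (for example, for a rat that is thirsty and not hungry).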

If you keep the following suggestions in mind, you'll do fine:

  1. Keep in mind that, to understand the behavior you are observing now, you must look to the past history of the individual--looking for an immediate reinforcer in any given situation is the sure route to confusion, mentalism, and eternal damnation.
  2. Look for observable behavior and environmental consequences--what you are not doing and what is not happening to you is difficult for others to see! At any given time, there is an infinite number of things that you are not doing, and an infinite number of things that are not happening to you. For an example of this point, see Muffy walks the dog.
  3. In many examples of behavior, there are several component parts, each controlled by different contingencies, so you need to be as precise as possible in describing behavior, consequences, and how they relate to each other. Also, if the example concerns an interaction between two or more people, each is acting under a different contingency.
  4. A great deal of what we do, we do because we have learned to manipulate our environment to produce certain consequences (this is pretty everyday--I could make it more scientific if I wanted to!). You probably also discovered that, as was the case with Pavlovian conditioning, the real world is much more complex than the laboratory! You may also now have the unsettling feeling that a lot of the terminology is arbitrary!

Before going on, be sure to Make Up Your Own Question!

Tom Creed, CSB/SJU

Last modified on September 25, 1998.