As we saw, Pavlovian conditioning is the process by which inborn reflexes, including those reactions that we usually consider emotions, spread to new situations and settings.
Pavlovian conditioning is largely responsible for our motivation to respond in any situation. Operant conditioning, on the other hand, is what we learn to do to satisfy these motivational states.
Pavlovian conditioning is often considered to be involuntary. The organism has little control over the response, in that the environment elicits the behavior from the organism. Operant conditioning involves voluntary behavior that is emitted by the organism and controlled by its consequences (hence the term consequated). Another way of looking at this difference is to look at the relationship between behavior and the environment. In Pavlovian conditioning, the relationship is stimulus-response (S → R), but in operant conditioning it is response-stimulus (R → S). Whether a response occurs in the future depends upon the nature of the contingency. If it makes life better for the organism, the response will likely occur again in the future (reinforcement), and if it makes life worse, it will likely not occur again in the future (punishment).
In the early stages of twentieth-century behaviorism, no differentiation was made between what we now refer to as Pavlovian (or, in Skinner's terminology, respondent) conditioning and instrumental (or operant) conditioning. While the differences at first blush seem profound, we will eventually see that Pavlovian and operant conditioning are two halves of an integrated view of learned behavior. Pavlovian conditioning by itself isn't of much value: it sets us up to act (i.e., it provides the motivation), and the action is the operant part of our behavior. By the same token, operant behavior doesn't operate in a vacuum; we need to be motivated to act.
The 2x2 contingency table below summarizes, in a very simplistic way, the basic relationships that our behavior can have with our environment. When it's all boiled down to its simplest elements, there are two types of events (stimuli) in the world--those that we like (appetitive) and those we don't (aversive). Our actions (at least those that have an effect) can lead to one of these events either being added to our environment (positive) or being withdrawn from it (negative). The terms positive and negative do not connote value judgments on whether the behavior is good or bad; they indicate simply whether a stimulus is being added to the environment (positive) or removed from it (negative).
As we'll see throughout the rest of the semester, life is really much more complex than this, but this is still a good starting place for understanding operant behavior.
Contingencies of Reinforcement

| Response produces | Appetitive stimulus | Aversive stimulus |
| --- | --- | --- |
| Presentation | Positive Reinforcement | Positive Punishment |
| Withdrawal | Negative Punishment | Negative Reinforcement |
Reinforcement and Punishment. The terms reinforcement and punishment refer to a relationship between behavior and a resulting environmental change, very much like the term reflex referred to a relationship between a stimulus and a resulting change in behavior. In this case, the term reinforcement refers to a relationship between behavior and environment that results in an improvement of conditions for the individual, and punishment refers to a relationship that results in a worsening of conditions. If animals are hedonistic, then they should increase the frequency of acts that are reinforced, and decrease the frequency of acts that are punished.
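The contingency table can also be read as a two-question lookup: was the stimulus appetitive or aversive, and did the response present it or withdraw it? The following is a minimal sketch of that reading, not a standard implementation; the names `Stimulus`, `Change`, and `classify_contingency` are hypothetical and chosen only for illustration.

```python
from enum import Enum


class Stimulus(Enum):
    APPETITIVE = "appetitive"  # something the organism likes
    AVERSIVE = "aversive"      # something the organism dislikes


class Change(Enum):
    PRESENTATION = "presentation"  # stimulus is added ("positive")
    WITHDRAWAL = "withdrawal"      # stimulus is removed ("negative")


def classify_contingency(stimulus: Stimulus, change: Change) -> str:
    """Return the cell of the 2x2 table for a stimulus/change pair."""
    # "Positive" and "negative" describe only whether something was added
    # or removed, not whether the outcome was good or bad.
    sign = "Positive" if change is Change.PRESENTATION else "Negative"
    # Conditions improve when a liked stimulus is added or a disliked one
    # is removed; otherwise they worsen.
    improves = (stimulus is Stimulus.APPETITIVE) == (change is Change.PRESENTATION)
    effect = "Reinforcement" if improves else "Punishment"
    return f"{sign} {effect}"


if __name__ == "__main__":
    for stimulus in Stimulus:
        for change in Change:
            print(stimulus.value, change.value, "->",
                  classify_contingency(stimulus, change))
```

The sketch keeps the two questions separate, which mirrors the point that the sign (positive or negative) and the effect (reinforcement or punishment) are independent of each other.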
OK, that's quite a bit of terminology/concepts in a short period of time. Click on Blinky and take a short quiz to try to attach these concepts to real-life experiences.
Positive Reinforcement. The individual's behavior adds something desirable to the environment, thereby increasing the probability, under similar circumstances, of that behavior occurring again in the future. Positive reinforcement is what most people think of first when they think of operant conditioning, and it is what most applications of operant conditioning attempt to promote. In everyday terms, positive reinforcement means that behavior will be rewarded. For example, a student studies hard because she wants to get a good grade, or, in more precise scientific terms, because studying has produced high grades in the past.
Some examples of positive reinforcement submitted by students in previous semesters.
Negative Reinforcement. The individual's behavior removes something undesirable from the environment, thereby increasing the probability, under similar circumstances, of that behavior occurring again in the future. This contingency differs from the other three in that it is usually further divided into two components, escape and avoidance.
The difference between the two lies in what the behavior accomplished: did it lessen existing aversiveness (escape), or did it prevent the impending onset of aversiveness (avoidance)? Since escape involves an observable change in the environment, it is usually easier to see by casual observation. Avoidance, since it typically involves no change in the environment (an impending increase cannot be seen), is more problematic. It is nonetheless the proverbial exception that proves the rule: since it involves behavior but no resulting change in the environment, yet is highly resistant to extinction, the only way to understand it is to look at the past history of the organism, where we will no doubt find a history of not responding being associated with increased aversiveness and of responding being associated with no corresponding increase in aversiveness.
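The escape/avoidance distinction can be sketched as a comparison of three levels of aversiveness: the level before the response, the level after it, and the level that would have followed had the organism not responded. This is only an illustrative sketch; the function name `classify_negative_reinforcement`, the numeric scale, and the example values are all assumptions, not part of the original material.

```python
def classify_negative_reinforcement(before: float, after: float,
                                    without_response: float) -> str:
    """Classify a response by its effect on aversiveness.

    before           -- aversiveness just before the response (hypothetical scale)
    after            -- aversiveness just after the response
    without_response -- aversiveness that would have followed with no response
    """
    if after < before:
        # The response reduced aversiveness that was already present.
        return "escape"
    if after == before and after < without_response:
        # Nothing observable changed, but an impending increase was prevented.
        return "avoidance"
    return "neither"


if __name__ == "__main__":
    # Hypothetical numbers: shutting off a blaring alarm (8 drops to 0).
    print(classify_negative_reinforcement(before=8, after=0, without_response=8))
    # Hypothetical numbers: leaving early so a traffic jam never materializes.
    print(classify_negative_reinforcement(before=0, after=0, without_response=5))
```

Note that the avoidance case can only be identified through `without_response`, a value an observer never sees directly. That is the sense in which avoidance has to be understood from the organism's past history.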
See some examples of negative reinforcement, and decide which of these is escape, and which avoidance!
A couple more examples of negative reinforcement submitted by students in previous semesters.
Positive Punishment. The individual's behavior adds something undesirable to the environment, thereby decreasing the probability, under similar circumstances, of that behavior occurring again in the future. Sometimes we do things that cause us to experience pain, reducing the probability of our doing them again. You put your hand on a hot stove, and you get burned. You are less likely to put your hand near the stove again. As I related in class, when I was in sixth grade, we were learning about electricity, and Mr. Carlson, my sixth-grade teacher, had set up a phonograph so that it would play when a circuit was completed by touching the bare ends of an insulated wire to two exposed poles. I picked up two bolts, one in each hand, and touched the two poles of the phonograph, completing the electrical circuit. The phonograph started to play (I remember quite vividly that it was a 78 rpm record of Ricky Nelson singing Be Bop Baby). Unfortunately, my body was acting as the "wire." (Duh!) The violent electric shock I received was very painful, and I never did that again (one-trial learning!).
Some examples of positive punishment submitted by students in previous semesters.
Negative Punishment. The individual's behavior removes something desirable from the environment, thereby decreasing the probability, under similar circumstances, of that behavior occurring again in the future. Not all of what we do that is punished is physically painful. Often, the consequence of our behavior is that we lose something of value to us. This type of punishment is especially common when we look at the sanctions that society places on behavior that it deems inappropriate. For example, the wise and all-knowing city fathers of St. Joseph, MN, have decreed, in their infinite wisdom, that it is a Sin to make a U-turn in their fair city, a Sin that is punishable by the Sinner being fined $49.00 (or at least that's what it was when I was caught Sinning). Notice that nothing physically painful happened to me, but this bright boy will probably not make a U-turn in St. Joseph again. Also note that one of the points that B.F. Skinner makes about the effects of punishment is that not only is the punished behavior suppressed, but the organism tends to keep away from anything associated with the punishment. I tend to stay away from (not avoid!) St. Joe as much as possible. To think about: Why didn't I want to say that I avoid St. Joe?
Another point to keep in mind about punished behavior is that it occurs because, presumably, at some time in the past it was reinforced. I made a U-turn because I wanted to go the other way. Making U-turns, at least in cities with less anal-retentive city councilpersons, had been reinforced in the past by my getting to where I wanted to go more quickly. I still make U-turns if I can get away with it--I'm just more vigilant about where and when. So, sometimes the outcome of punishment is a more discriminating organism, at least in terms of when the punished behavior will be emitted.
Some examples of negative punishment submitted by students in previous semesters.
If you would like to have a little more practice figuring out what's what with operant conditioning, try this exercise.
Premack Principle. We usually think about operant conditioning in stimulus-response terms--an organism does some act (behavior, or response) which leads to some change in the environment (a reinforcing or punishing stimulus). There are a couple of problems with this. First, the term response seems out of order--a response and then the stimulus? This is another one of those unfortunate products of history--we've tried to stretch a set of terms beyond their usefulness. Second, talking about reinforcers and punishers brings up the philosophical question of "What is a reinforcer?" Many productive years have been wasted trying to answer that question. David Premack supplied a way out of the dilemma--treat what an organism does as what is important, and do not concern yourself with determining what is a reinforcer. An example may help. We normally think of a rat pressing a lever to get food, and diagram the operation as:

However, Premack would first look at what a hungry rat might do if given the opportunity to engage in all possible behaviors (or at least those available at the time), and construct a hierarchy from most probable to least, as diagrammed below:

As shown, a hungry rat would spend most of its time eating, a fair amount of time drinking, some grooming, exploring, etc., but very little lever pressing. However, once eating, a high probability behavior, becomes contingent upon lever pressing, then lever pressing becomes much more likely (its rate of response increases), as diagrammed below:

In other words, in Premack's view, the rat is engaging in a less desired behavior, bar pressing, to get the opportunity to engage in a more desired activity, eating. By looking at it this way, a reinforcing event is simply one that is more probable than the one upon which it is contingent. Much of our behavior can be viewed in this way--we work (less desired activity) in order to do the things that we enjoy.
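Premack's view can be sketched as a simple comparison of baseline probabilities. In the sketch below, the proportions in `baseline` are made-up illustrative values (the text gives only the ordering, not numbers), and the function names `hierarchy` and `can_reinforce` are hypothetical.

```python
# Hypothetical proportions of time a hungry rat spends on each behavior
# when all of them are freely available. Only the ordering comes from the
# text; the numbers themselves are assumptions.
baseline = {
    "eating": 0.50,
    "drinking": 0.25,
    "grooming": 0.12,
    "exploring": 0.10,
    "lever pressing": 0.03,
}


def hierarchy(probabilities: dict) -> list:
    """Order behaviors from most probable to least probable."""
    return sorted(probabilities, key=probabilities.get, reverse=True)


def can_reinforce(probabilities: dict, contingent: str, instrumental: str) -> bool:
    """Premack's criterion: access to the contingent behavior reinforces the
    instrumental behavior only if the contingent behavior is more probable."""
    return probabilities[contingent] > probabilities[instrumental]


if __name__ == "__main__":
    print(hierarchy(baseline))
    # Making eating contingent on lever pressing should raise lever pressing.
    print(can_reinforce(baseline, contingent="eating", instrumental="lever pressing"))
    # The reverse arrangement should not reinforce eating.
    print(can_reinforce(baseline, contingent="lever pressing", instrumental="eating"))
```

Under this reading there is no need to decide in advance what counts as a reinforcer: the same behavior can reinforce a less probable activity and fail to reinforce a more probable one.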
If you keep the following suggestions in mind, you'll do fine:
Before going on, be sure to Make Up Your Own Question!
Principles of Learning and Behavior | Tom Creed | CSB/SJU