Schedules of Reinforcement

So far, we've been looking at the development of new behavior. However, much of our life consists of well-established behavior. The study of schedules of reinforcement is, in effect, the study of the characteristics of well-established behavior.

There are three overridingly important principles to keep in mind when looking at schedules of reinforcement:

  1. Maintenance of behavior requires little reinforcement. Once a response is well-established, it can be maintained with fairly infrequent reinforcement, even though it may have taken fairly frequent reinforcement to develop it. The natural environment is frequently unpredictable. Sometimes our behavior pays off handsomely, and at other times the same behavior will go unreinforced for long periods of time. Those behaviors that have a fairly strong history of reinforcement will better survive periods of relative non-reinforcement. That well-established behavior persists even during periods of low reinforcement is clearly adaptive-- environments tend to be cyclic, in that even though conditions change for periods of time, they do tend to revert to previous conditions. And what once worked is liable to work again in the future when conditions revert.

    Therefore, organisms who inherited a tendency towards persistence would be favored by natural selection. Of course, even behaviors which are generally adaptive can be maladaptive on occasion. We can all think of times when we persisted in a behavior that we should have abandoned much earlier. But it is important to recognize that the general propensity to persist with well-established behavior is adaptive, and so has been selected for. Spontaneous recovery is adaptive for similar reasons.

  2. How reinforcement becomes available (the patterning of reinforcement) is a crucial determinant of our behavior. What the study of schedules of reinforcement is really all about is looking at how the various ways that the environment makes reinforcement available determine how we behave. Furthermore, the study of schedules is primarily a study of the development of the history of the individual in a particular environment.

  3. The context ("economy") of reinforcement availability is crucial in determining the control that it exerts over us. Contingencies do not occur in a vacuum-- other things, more or less, may be going on besides the one we happen to be attending to at the time. Two principles from microeconomics are important determinants of the extent to which a contingency will control our behavior: the availability of alternative means of obtaining reinforcement (the openness of the economy), and the extent to which we require a specific reinforcer (the elasticity of the reinforcer).

Let's look at the basic schedules of reinforcement to get a better understanding of how the patterning of reinforcement affects behavior. While many schedules of reinforcement have been studied in the laboratory, the most commonly studied are the simpler Ratio and Interval schedules. These two basic types are further broken down into Fixed and Variable. A simple 2x2 table shows the four basic schedules:

                          Consistency of Contingency
Contingency based on      Fixed                  Variable
Ratio                     Fixed Ratio (FR)       Variable Ratio (VR)
Interval                  Fixed Interval (FI)    Variable Interval (VI)

Ratio schedules. As the name implies, there is a ratio, or direct relationship, between the amount of behavior put out (usually measured by number of responses) and the amount of reinforcement obtained. In other words, the harder (faster) you go, the more strongly (frequently) you get reinforced. Because of this direct relationship between behavior and reinforcement, ratio schedules tend to generate hard work (high rates of response) in organisms. If the ratio requirements get too high, though, the organism will eventually stop responding (ratio strain).

Interval schedules. The opportunity for reinforcement is independent of the organism's effort-- forces beyond the control of the individual determine when reinforcement is available, and only after it becomes available will behavior be reinforced. The term interval is in some ways misleading, since it implies that waiting is what is important. The concept of the interval comes from the fact that laboratory experiments did indeed use timers of various sorts to schedule the availability of reinforcement. While interval schedules in a sense do imply the passage of time, the important characteristic of these schedules, especially as contrasted with ratio schedules, is that the behavior of the individual does not influence the availability of reinforcement. Because of this lack of a relationship between the individual's behavior and the availability of reinforcement, interval schedules produce much lower amounts of behavior per reinforcement when compared to ratio schedules.
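The contrast between the two types can be sketched with a little arithmetic. What follows is only an illustration, not a model of real behavior: it assumes an idealized organism that responds at a perfectly steady pace, and the function names and the ten-minute session are made up for the example.

```python
def fr_reinforcers(seconds_between_responses, session_s, ratio):
    """Reinforcers earned on a fixed-ratio schedule: one per `ratio` responses."""
    responses = session_s // seconds_between_responses
    return responses // ratio


def fi_reinforcers(seconds_between_responses, session_s, interval_s):
    """Reinforcers earned on a fixed-interval schedule.

    A response is reinforced only if at least `interval_s` seconds have
    passed since the last reinforcer (or since the start of the session).
    """
    earned = 0
    last_reinforcer_s = 0
    first = seconds_between_responses
    for now_s in range(first, session_s + 1, first):
        if now_s - last_reinforcer_s >= interval_s:
            earned += 1
            last_reinforcer_s = now_s
    return earned


# Ten-minute session, FR 10 versus FI 1' (60 seconds).
for gap_s in (6, 1):  # one response every 6 s, then one every second
    print(gap_s, fr_reinforcers(gap_s, 600, 10), fi_reinforcers(gap_s, 600, 60))
# prints "6 10 10" then "1 60 10"
```

Working six times as fast earns six times as many reinforcers on the ratio schedule, but not a single extra reinforcer on the interval schedule.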

Fixed. The availability of reinforcement is predictable. Whether ratio or interval, the parameters of availability are, relatively speaking, unchanging, or fixed. Once the organism has had considerable experience with the contingency, there will be a pause following reinforcement, followed by a period of responding. For example, in an FI 1' schedule, reinforcement becomes available 1' after the last reinforcement. An organism which has come under the control of the contingencies will not respond until reinforcement is about due, and will then respond at an increasing rate. This makes sense, since responses after short durations have never been reinforced, yet waiting longer than 1' reduces the density of reinforcement. Organisms tend to optimize their performance so that they neither waste opportunities for reinforcement nor exert more effort than is necessary.

Variable. The availability of reinforcement is unpredictable, relatively speaking. From one reinforcement opportunity to the next, the organism has no way of knowing if reinforcement will be forthcoming on the next response. For example, in a VI 1' schedule, reinforcement may become available immediately after the last reinforcement, or it could be a few minutes before it again becomes available, but over a period of time, it will average out to once a minute. An organism which has come under the control of the contingencies will begin responding soon after the reinforcement, since responses after short durations have occasionally been reinforced. The rate of response will be steady, without post-reinforcement pauses. It turns out that even relatively infrequent short reinforcement intervals have a profound effect on the organism's rate of response. One of the primary findings of conditioning studies is that organisms are inordinately influenced by occasional immediate reinforcement. (Question-- can you see any parallels in your life?)

[Figure: cumulative records of typical steady-state behavior under the four basic schedules of reinforcement.]

Fixed Ratio (FR). Every xth response produces reinforcement. This schedule is expressed as FRx, where x is a specific value (FR10 means every tenth response is reinforced). Since the density of reinforcement is determined by rate of response, FRs produce the high rate of response typical of ratio schedules, with the characteristic post-reinforcement pause (PRP) following reinforcement resulting from the fact that responses right after reinforcement are never reinforced. Studies have shown that the period immediately following reinforcement in FRs is aversive to organisms, and they will escape from ratio schedules at this point of the schedule. The size of the PRP increases with the size of the ratio-- bigger ratios produce longer PRPs.
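If it helps to see the FR rule written out as a procedure, here is a minimal sketch. The class name and the respond() method are invented for this illustration; the only thing it encodes is "every xth response produces reinforcement."

```python
class FixedRatio:
    """Hypothetical FRx schedule: every `ratio`-th response is reinforced."""

    def __init__(self, ratio):
        self.ratio = ratio
        self.count = 0  # responses made since the last reinforcer

    def respond(self):
        """Register one response; return True if it is reinforced."""
        self.count += 1
        if self.count == self.ratio:
            self.count = 0  # the next ratio starts from zero
            return True
        return False


schedule = FixedRatio(10)  # FR10
outcomes = [schedule.respond() for _ in range(30)]
reinforced_at = [i + 1 for i, hit in enumerate(outcomes) if hit]
print(reinforced_at)  # [10, 20, 30]
```

Notice that the counter is at zero right after each reinforcer, which is the point in the schedule where the next payoff is farthest away.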

Some ratio schedules, especially those in the real world, include a reset function. What happens with a reset function is that, when the individual makes an error, any accumulated responses towards the completion of the ratio are taken away, and the individual must start over. In other words, errors are negatively punished. Reset functions are typical when some skill is involved in making a correct response. For example, when I was a kid I liked to build card houses. If I made an "error," the house would fall down and I'd have to start over. Reset functions have at least two effects on behavior-- you become more careful as the ratio builds, and the farther you are into the ratio, the more cranked off you are when an error occurs. For example, the higher I would stack the playing cards, the more careful I would become, because I would have more to lose (my previous work), and the more disappointed I'd be that I had lost my wonderful creation.
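A reset function can be sketched as a small change to a ratio counter. This is only an illustration under assumptions: the class name, the `correct` flag, and the idea of reporting how many responses were lost are all made up here to show why an error costs more the farther along you are.

```python
class FixedRatioWithReset:
    """Hypothetical FR schedule in which an error wipes out progress."""

    def __init__(self, ratio):
        self.ratio = ratio
        self.count = 0  # correct responses accumulated toward the ratio

    def respond(self, correct=True):
        """Return (reinforced, responses_lost) for one response."""
        if not correct:
            lost = self.count
            self.count = 0  # the reset: accumulated work is taken away
            return False, lost
        self.count += 1
        if self.count == self.ratio:
            self.count = 0
            return True, 0
        return False, 0


house = FixedRatioWithReset(ratio=5)  # a five-card house
house.respond()                       # 1 card placed
print(house.respond(correct=False))   # (False, 1): one card lost
for _ in range(4):
    house.respond()                   # 4 cards placed
print(house.respond(correct=False))   # (False, 4): four cards lost
```

The same error costs one response early in the ratio and four responses late in it, which fits with becoming more careful as the stack gets higher.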

You can try a little quiz that has a reset function. (nb- The quiz, which will appear in the top frame, is for a logic course at St. Thomas taught by Mike Winter. The reference to Our Course is to the logic course, not Principles of Learning and Behavior.) See if you become more careful as you get along on the quiz, and your reaction to making "errors" the farther you are in the quiz. When done, post your reactions to our electronic conference, then click Return to Schedules to come back to this place in your reading.

Variable Ratio (VR). On average, every xth response produces reinforcement, but from one reinforcement to the next, the number required is unpredictable. This schedule is expressed as VRx, where x is a specific value. VR10 means, on average, every tenth response is reinforced, but on any given occasion, reinforcement may be given on the next response after the last reinforcement, or it could be dozens of responses before reinforcement is again given. But over a period of time, it will average out to once every ten responses. VRs produce very high rates of response with little pausing after reinforcement.
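Here is one way the "unpredictable, but averages out" idea might be sketched. The text above does not say how the individual requirements are chosen, so this sketch simply assumes each one is drawn evenly from 1 to 2x-1 (which averages x); the class and method names are likewise invented for the example.

```python
import random


class VariableRatio:
    """Hypothetical VRx schedule: the requirement varies but averages x."""

    def __init__(self, mean_ratio, rng):
        self.mean_ratio = mean_ratio
        self.rng = rng
        self.count = 0
        self.required = self._draw()

    def _draw(self):
        # Assumption: uniform on 1 .. 2x-1, whose mean is x.
        return self.rng.randint(1, 2 * self.mean_ratio - 1)

    def respond(self):
        """Register one response; return True if it is reinforced."""
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()  # new, unpredictable requirement
            return True
        return False


rng = random.Random(0)  # seeded only so that runs are repeatable
schedule = VariableRatio(10, rng)  # VR10
responses = 100_000
reinforcers = sum(schedule.respond() for _ in range(responses))
print(responses / reinforcers)  # close to 10 responses per reinforcer
```

Because a requirement of 1 is possible, the very next response after a reinforcer is sometimes reinforced, which is one reason there is little to gain from pausing.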

Fixed Interval (FI). The first response after a specified, unchanging period of time has elapsed since the last reinforcement produces reinforcement. This schedule is expressed as FIx, where x is a specific period of time. For example, an FI 1' means that the first response after one minute has passed since the last reinforcement is reinforced. Fixed interval responding is characterized by the FI scallop, which is a result of the gradually accelerating rate of response as the interval comes due.
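The FI rule can be sketched with a simple clock check. This is a minimal illustration, assuming time is measured in seconds from the start of the session and that a response made exactly when the interval elapses counts; the class and method names are invented for the example.

```python
class FixedInterval:
    """Hypothetical FI schedule, timed from the last reinforcer."""

    def __init__(self, interval_s):
        self.interval_s = interval_s
        self.last_reinforcer_s = 0  # the session start counts as time zero

    def respond(self, now_s):
        """Register a response at time `now_s`; return True if reinforced."""
        if now_s - self.last_reinforcer_s >= self.interval_s:
            self.last_reinforcer_s = now_s  # the next interval starts now
            return True
        return False


schedule = FixedInterval(60)  # FI 1'
for now_s in (10, 30, 59, 61, 100, 121, 125):
    print(now_s, schedule.respond(now_s))
# Only the responses at 61 s and 121 s are reinforced.
```

The early responses at 10, 30, and 59 seconds accomplish nothing, which is the sense in which responding right after reinforcement never pays on this schedule.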

Variable Interval (VI). The first response after a period of time has elapsed since the last reinforcement produces reinforcement, but from one reinforcement to the next, the required amount of time is unpredictable; only its average is specified. This schedule is expressed as VIx, where x is a specific period of time. In a VI 1', for example, on average, the first response after one minute has passed since the last reinforcement is reinforced. On any given occasion, reinforcement may become available immediately after the last reinforcement, or it could be a few minutes before it again becomes available, but over a period of time, it will average out to once a minute. VIs produce consistent, moderate rates of response, which make them ideal for sustained responding over long periods of time.
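A VI schedule might be sketched the same way as an FI, with the wait redrawn after every reinforcer. The text above does not say how the individual waits are chosen, so this sketch assumes they come from an exponential distribution with the stated mean (which allows both near-immediate availability and waits of several minutes); the names are invented, and the steady once-a-second responder is an idealization.

```python
import random


class VariableInterval:
    """Hypothetical VI schedule: waits vary but average `mean_interval_s`."""

    def __init__(self, mean_interval_s, rng):
        self.mean_interval_s = mean_interval_s
        self.rng = rng
        self.available_at_s = self._draw()  # first wait, from session start

    def _draw(self):
        # Assumption: exponentially distributed waits with the given mean.
        return self.rng.expovariate(1.0 / self.mean_interval_s)

    def respond(self, now_s):
        """Register a response at time `now_s`; return True if reinforced."""
        if now_s >= self.available_at_s:
            self.available_at_s = now_s + self._draw()
            return True
        return False


rng = random.Random(0)  # seeded only so that runs are repeatable
schedule = VariableInterval(60, rng)  # VI 1'
session_s = 100_000
earned = sum(schedule.respond(now_s) for now_s in range(1, session_s + 1))
print(session_s / earned)  # roughly 60 seconds per reinforcer
```

A steady responder collects nearly every reinforcer soon after it becomes available, so a consistent, moderate pace is all this schedule asks for.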

Before going on, be sure to Make Up Your Own Question!

Tom Creed, CSB/SJU

Last modified on October 11, 1997.