Reinforcement Theory
For Me?

The one theory of influence almost everyone is taught is this one.  And if you are only taught one approach, this can be a good candidate.  It works in a variety of situations, it can be simply applied, and it has just a few basic ideas.  In fact, reinforcement theory boils down to a Main Point:  Consequences influence behavior.

Think about that for a moment.  Consequences influence behavior.  It means that people do things because they know other things will follow.  Thus, depending upon the type of consequence that follows, people will produce some behaviors and avoid others.  Pretty simple.  Pretty realistic, too.  Reinforcement theory (consequences influence behavior) makes sense.

To understand this theory, I want you to go back in time to the 4th grade.  Think about school and teachers and rows of desks.  Remember the smells.  How about art class?  Think about your teachers.  Yeah, that one, too, the mean one.  While Reinforcement Theory can work anywhere with anyone, grade school is the perfect environment.

Principles Of Reinforcement

There are three basic principles of this theory.  These are the Rules of Consequences.  The three Rules describe the logical outcomes which typically occur after consequences.

1.  Consequences which give Rewards increase a behavior.

2.  Consequences which give Punishments decrease a behavior.

3.  Consequences which give neither Rewards nor Punishments extinguish a behavior.

These Rules provide an excellent blueprint for influence.  If you want to increase a behavior (make it more frequent, more intense, more likely), then when the behavior is shown, provide a Consequence of Reward.  If you want to decrease a behavior (make it less frequent, less intense, less likely), then when the behavior is shown, provide a Consequence of Punishment.  Finally, if you want a behavior to extinguish (disappear, fall out of the behavioral repertoire), then when the behavior is shown, provide no Consequence (ignore the behavior).

Now, the Big Question becomes, “What is a reward?” or “What is a punishment?”

The answer is easy.

What is a reward?  Anything that increases the behavior.

What is a punisher?  Anything that decreases the behavior.

Yipes, is this circular reasoning or what?  Rewards increase a behavior and anything that increases a behavior is a reward.  What is going on here?

What’s going on is this:  Reinforcement theory is a functional theory.  That means all of its components are defined by their function (how they work) rather than by their structure (how they look).  Thus, there is no Consequences Cookbook where a teacher can look in the chapter, “Rewards for Fifth Grade Boys,” and find a long list of things to use as rewarding consequences.  Think about this a minute.

Many kids find candy to be rewarding.  If they sit quietly in their chairs for five minutes and you give them each a sweet, those kids will learn to sit quietly.  The candy (Consequence of Reward) is used to increase the behavior of sitting quietly.  So, we have discovered a Reward and can put it in the Consequences Cookbook, right?

And then the next time your spouse spends the afternoon cleaning up some grubby corner of the basement all you have to do is give them a candy bar and next week you’ll find ‘em in the bathroom scrubbing out the tub, right?  Of course not.

Candy functions as a reward in some circumstances, but candy has no effect in others.  (If there were a Consequences Cookbook, don’t you think the Board of Education would pay teachers with Smiley Face stickers instead of money?)

The functional nature of reinforcement theory is important to understand.  It explains why the theory sometimes appears to be incorrect.  An example:  when Sally Goodchild interrupts the class, Mrs. Reinforcer stops the class, tells Sally she’s a naughty girl who broke Rule 24 and now must leave the classroom and go to the principal’s office.  Ouch!  That really hurt Sally Goodchild.  And Mrs. Reinforcer knows that when Sally returns, she will not interrupt.  Mrs. Reinforcer then goes to the teacher’s lounge and sings the praises of this really great theory.

Well, don’t you know that the other kids in the class watched this event with great interest.  And when Bad Bill interrupts the class, Mrs. Reinforcer stops the class, tells Bad Bill he’s a naughty boy who broke Rule 24 and now must leave the classroom and go to the principal’s office.  Ouch!  That really hurt Bad Bill.  And Mrs. Reinforcer knows when Bad Bill comes back to class, he will not interrupt, because he will want to avoid that wicked punisher.

We all know what happens next.  Bad Bill comes back to class, immediately interrupts the lesson, Mrs. Reinforcer whacks him with the Consequence of Punishment, and Bad Bill keeps on interrupting, so he gets out of class.  Mrs. Reinforcer is totally confused at this point and she goes back to the teacher’s lounge complaining about this stupid reinforcement theory.

To understand if you have a Reward, you must observe its effect.  If the Consequence increases the behavior you want to increase, voila, you have a Reward.  If the Consequence decreases the behavior you want to decrease, then you have a Punishment.  Most teachers (and everyone else) have had the unfortunate experience of Mrs. Reinforcer.  They have persisted in giving a Consequence of Punishment and lo and behold, the receiver keeps doing the bad thing.  If the behavior does not increase or decrease the way you want it to, then you need to rethink your rewards and punishments.
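Here is one way to picture "observe its effect" as a small Python sketch.  It is only an illustration, not a procedure from the research:  the function name, the behavior counts, and the 10% tolerance are all made up for the example.  The idea is simply that the label comes from what the behavior does after the Consequence arrives, not from what the Consequence looks like.

```python
from statistics import mean

def classify_consequence(before, after, tolerance=0.1):
    """Label a consequence by its observed effect on a behavior.

    before / after: counts of the behavior per observation period,
    recorded before and after the consequence was introduced.
    tolerance: hypothetical cutoff (as a fraction of the baseline)
    below which a change is treated as no change at all.
    """
    baseline = mean(before)
    change = mean(after) - baseline
    if abs(change) <= tolerance * baseline:
        return "no effect"
    return "reward" if change > 0 else "punishment"

# Same consequence (a trip to the principal's office), two students.
# The interruption counts are invented for illustration.
sally = classify_consequence(before=[3, 4, 3], after=[0, 1, 0])
bill = classify_consequence(before=[3, 4, 3], after=[5, 6, 6])
print(sally)  # punishment
print(bill)   # reward
```

With these made-up numbers the identical trip down the hall functions as a Punishment for Sally and as a Reward for Bad Bill, which is exactly the trap Mrs. Reinforcer fell into.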

In summary, the main point of this theory is that consequences influence behavior.  Rewarding consequences increase behavior.  Punishing consequences decrease behavior.  No consequences extinguish a behavior.  Finally, a consequence is known by its function (how it operates).

In the next section we consider how to put the Rules into effect.  Here we learn how to apply the Rules.

The Process Of Reinforcement

The Rules of Consequence are used in a three-step sequence that defines the process of reinforcement.  We can call these steps When-Do-Get.

Step 1:  When in some situation,

Step 2:  Do some behavior,

Step 3:  Get some consequence.

According to Reinforcement Theory, people learn several things during the process of reinforcement.  First, they learn that certain behaviors (Step 2: Do) lead to consequences (Step 3: Get).  This is the most obvious application of the Rules of Consequence.  A student realizes that if she does well on an assignment (Do), then she will get a Rewarding Consequence of a pretty sticker (Get).  Another student discovers that if he speaks out inappropriately (Do), then he will receive the Punishing Consequence of reduced recess time (Get).

But second, and as important, people learn that the Do-Get only works in certain situations (Step 1: When).  For example, a child may discover that when she is with her parents (When) and she throws a temper tantrum (Do), she embarrasses them and they give her Rewards such as attention, toys, candy, or whatever (Get).  Now when this child hits school and tries this trick, she is cruelly disappointed when the teacher provides a Punishing Consequence rather than a Rewarding Consequence.  She soon learns that Tantrum —> Reward only works When she is with Mom and Dad.

This is simple.  When in some situation-Do some behavior-Get a consequence.  And there are only three consequences, Rewarding, Punishing, and Ignoring.  With these basics firmly in hand, let’s now appreciate nuance!
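If it helps, When-Do-Get can be sketched as a lookup table in Python.  This is just a toy model under my own assumptions:  the table entries and the function name are hypothetical, and real contingencies are learned from experience rather than read off a list.  The sketch shows only that the same Do can lead to a different Get depending on the When.

```python
# Hypothetical contingency table: (when, do) -> get.
CONTINGENCIES = {
    ("with parents", "tantrum"): "reward",
    ("at school", "tantrum"): "punishment",
    ("at school", "good assignment"): "reward",
}

def get_consequence(when, do):
    """Return the consequence for a behavior in a given situation.

    Pairs that are not listed get "nothing", the third consequence
    (ignoring), which the Rules say extinguishes the behavior.
    """
    return CONTINGENCIES.get((when, do), "nothing")

print(get_consequence("with parents", "tantrum"))  # reward
print(get_consequence("at school", "tantrum"))     # punishment
print(get_consequence("at the store", "tantrum"))  # nothing
```

Keying the table on the pair (When, Do) rather than on the behavior alone is the whole point:  the tantrum has no single Get of its own.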

Good Taste in Terminology

Many people are idiots about Reinforcement Theory and say more than they know:  FauxItAlls.  The mark of the beast occurs with two words, “positive” and “negative.”  If you wish to remain a person with class, sophistication, and darn it, just plain good taste, let me show you to the Cool Table.  This information will probably have no impact on how Those People misuse Reinforcement lingo, but at least you and I will have the smug pleasure of knowing the difference between those heathens and us sophisticates.  Let me hold your cape.  On to the opera!

You can easily find confident writers scribbling the terms “positive reinforcement” and “negative reinforcement” as synonyms for “reward” and “punishment,” respectively.  Thus, a “positive reinforcer” is a rewarding consequence while a “negative reinforcer” is a punishing consequence.  These heathen FauxItAlls have confounded two different elements of the theory into a hopeless mashup of mismeaning.  Heathens are free to say and do what they please – that’s what makes them heathens after all – but they cannot call themselves knowledgeable, competent, learned, I-passed-the-true-false-test purveyors of Reinforcement Theory.  Anyone who uses this lingo is post hoc, ergo ipso facto, dipsy-doodle dumb, but unfortunately neither speechless nor dysgraphic.

People properly punished, oops, I mean educated, in Reinforcement Theory lingo know that the modifiers “positive” and “negative” are closer in meaning to the street parlance meaning of “on” and “off.”  Thus, when a Proper Persuader says or writes “positive reinforcer” it can mean either a rewarding or a punishing consequence was “turned on” or “made available” or “activated” or just simply, “there.”  By contrast, with a “negative reinforcer” it means either a rewarding or a punishing consequence was “turned off” or “made unavailable” or “deactivated” or just simply, “not there.”  The key point to discipline here is the on-off usage of the positive-negative terms and not the reward-punishment connection wired by discombobulated heathens.

The Proper usage of positive-negative as on-off makes certain reinforcement situations more understandable than the uncouth reward-punishment misprision.  Consider this situation.  In a known situation (When), you perform a behavior (Do), and receive a reward (Get).  Later, I change the contingency by taking away the rewarding Get.  This is now punishment.  And I created a punishment by taking away the previously rewarding consequence.  I didn’t add anything new, I just took away something rewarding from the old.  This is negative reinforcement.

Kids growing up learn this by the street name of Grounded rather than the hoity-toity Negative Reinforcement.  This rose is all thorn and arises from what you lose rather than what you gain.  The FauxItAll nomenclature clearly cannot cope with an example even juvenile delinquents understand.

Thus Spake The Maestro:  Positive means On; Negative means Off.  Let us now scoff at the FauxItAlls who drop their unLearned Drawers in public with their positive-reward and negative-punishment miswiring.  Tah!

Now, we’re off to Tosca.  Scarpia knows the difference between the positive and the negative and how to apply them with rewards and punishments . . . until the end, of course, when Tosca proves that love is greater than operant conditioning!

More Good Taste:  Taking a Beating for a Rosy Glow

While we sit entr’acte, consider now another conceptual and linguistic faux pas from the FauxItAlls:  Killing a behavior.  We’re talking about, my good man, ending it, it never happens again, it falls out of the behavioral repertoire.  She stops nagging.  He stops drinking.  They live happily ever after.

To end, stop, finito, quito, void, nullify, endeth any behavior the FauxItAll goes to the whip early and often, believing like Mrs. Reinforcer in a crazed conception that never has, was, or will be found in Righteous Reinforcement.  Mrs. Reinforcer believes in a Constant Consequence that conditions for all faces and places.  FauxItAlls believe Punishment terminates a behavior.  Both are wrong.  There are no Constant Consequences and Punishment does not terminate a behavior.

What, then, sir, you say, what then terminates a behavior if not punishment?

Nothing, my good man.  Why, nothing at all extinguishes a behavior.  Now, pass me the mustache wax, if you please.  As you splutter, let me wax on . . .

. . . recall that Consequences of Reward INcrease a behavior, that Consequences of Punishment DEcrease a behavior, and that the Consequence of Nothing stops a behavior.  Nagging does not stop because the Nag fears Punishment, but because Nothing desirable follows the Nagging.  Silence while the Nag waits for the Punisher to leave the Scene is not the same thing as the Silence that arises from a new point of view.  FauxItAlls and Mrs. Reinforcer (in the Lecture Hall with the Whip) fail to distinguish these different states of inaction.  Sometimes, inaction is a ploy.  Other times, it is the sign of an extinguished action.

Of course, FauxItAlls know that the Consequence of Nothing is an idiocy because everyone knows that nothing comes from nothing, so Nothing cannot possibly work!  And the Fauxs point to numerous examples wherein someone like Mrs. Reinforcer clearly did Nothing with Bad Bill, yet Bad Bill’s behavior persisted.

The missing trick is that there are a googol of Consequences and it is a conceit of the foolish to believe that only their Consequences count.  When Mrs. Reinforcer does Nothing with Bad Bill, she believes that the only source of Consequence for Bill is her.  Sigh.  It is difficult to extinguish a behavior because it is impossible or at least illegal for any one source to control all the Consequences for another human being.  Thus, even while you are properly doing Nothing, Bad Bill finds Consequences from other sources in the room, not infrequently from Sally Goodchild who finds herself blushing when Bill gets Bad with Mrs. Reinforcer.  Bill fancies a girl with a pinkish hue, a rosy glow, and will gladly pull Mrs. Reinforcer’s chain to put a blush on Sally’s cheeks.

Do not confuse the petty details of reality with the eternal truth of theory:  The Consequence of Nothing ends it all.

Common Examples Of Reinforcement

One of the best examples of reinforcement I’ve ever heard came from an assistant football coach at a college.  A little background:  Some football players have trouble getting to team meetings.  When this happens the coaches want to Punish the players so they will be on time.  What to do?

The standard answer is extra exercise.  When the team is in a workout, at the end of the session the coaches identify the tardy players and make them run extra laps or do more pushups, right?  (When on this team, Do miss a team meeting, Get extra laps).

Well, this coach had a better idea.  At the end of the workout he called everyone together and identified the tardy players who had missed the team meeting.  Then he made the rest of the team run extra laps while the tardy ones sat and watched.  The coach claimed that this application had to be given only once a year.

And, of course, many “group” attitudes and values are created through Reinforcement.  Children perform a desired say or do and the family provides rewarding consequences.  You join a new work organization and perform an undesired say or do and you get punishing consequences.

A Great Story

One teacher developed an excellent and memorable system of reinforcement.  During tests in her mathematics class, she would quietly patrol the room, carefully observing the children.  If she saw that one was in trouble, she would ease over to the child and scan the test, looking for mistakes.  When she found an error, she would quietly take her pencil, tap it beside the mistake, so that the child knew there was an error on the test and where the error was.  Then the teacher would take the pencil and whack it on the kid’s nose.  (When you are taking a test, Do make a mistake, Get a rap on the nose).

Certainly an excellent application of the reinforcement paradigm and I would have to give it an “A” for correctness and an “F” for effectiveness.

For Me That’s Proven

Hmm, Good. Let’s get a bit more systematic here and look at proven For Me? plays.  Consider this interesting little study by Professor Insko.  He contacted university students by telephone and solicited their opinion about an upcoming local social event.  All participants were given a self-report survey with a series of questions that essentially asked whether they thought the upcoming event was positive or negative.  Half of the students were randomly reinforced by the caller with the word, “Good,” when the student expressed a positive opinion on a question while the other half were reinforced when they expressed a negative opinion on a question.  Thus, all participants are getting the same reinforcer, “Good,” but for different positions on the same topic.  Later Insko had all of these students complete a multi-topic attitude survey that included their opinion of the social event.  We’re interested to see if this little verbal reinforcer, “Good,” could shift opinions and that’s exactly what Insko found.  People who got “Good” for expressing positive opinions on the telephone survey had more favorable attitudes (28.2) compared to those who received “Good,” for expressing negative opinions (20.7).  Again, these means are meaningless on their own, but the effect size is not.  And it was large, a Windowpane of 30/70.
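For readers who like to see the arithmetic, here is a small Python sketch of how a split like 30/70 can be turned into a correlation.  It rests on an assumption that this section does not state:  that the Windowpane is a binomial effect size display, in which the two percentages are 50 - 50r and 50 + 50r.  The function name is mine.  If the Windowpane is defined differently elsewhere, treat the numbers below as illustration only.

```python
def windowpane_to_r(low, high):
    """Convert a windowpane split such as 30/70 to a correlation.

    Assumes the split is a binomial effect size display (BESD),
    where the two percentages are 50 - 50r and 50 + 50r, so that
    r is the gap between them divided by 100.
    """
    if low + high != 100:
        raise ValueError("split must sum to 100")
    return (high - low) / 100

print(windowpane_to_r(30, 70))  # 0.4
print(windowpane_to_r(40, 60))  # 0.2
```

Under that reading, the 30/70 split here corresponds to a correlation of about .40, and a 40/60 split would correspond to about .20.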

Praise. Perhaps the simplest form of persuasive reinforcement is praise.  Praise is communication that provides a positive evaluation of another person’s actions.  Praise is only communication – it is not a concrete reward like money or coupons or points that has a material benefit.  It is only a favorable opinion of your performance.

While the definition is simple, the effectiveness of praise is not.  Merely offering exhortations of “Wow, you did great!” may not produce desired outcomes.  If the receiver has reason to doubt the authenticity or accuracy of the comment (“She always says that no matter what I do” or “He’s just trying to sweet talk me.”), the message will sound like praise, but it will not function like a reward.  Further, as we observed with Attribution Theory, praise could function like one of those awards that destroys intrinsic motivation for a task and brings it under external control.  Finally, if the praise lacks feedback on standards or expectations (the goal you’re trying to hit with the performance), it may also fail.

Let me quote a great summary line from a Henderlong and Lepper 2002 review of the research literature on praise.  If you want praise to function as a reward then:

“Provided that praise is perceived as sincere, it is particularly beneficial to motivation when it encourages performance attributions to controllable causes, promotes autonomy, enhances competence without an overreliance on social comparisons, and conveys attainable standards and expectations.”

Here, Have a Potato Chip While You Think It Over. Consider this simple little experiment with food as a reward.  Just two conditions . . .

“One was a condition in which a substantial quantity of food was offered to the subjects during the time they were engaged in reading a series of four persuasive communications.  Upon entering the experimental room, the subjects found the experimenter imbibing some refreshments (peanuts and Pepsi-Cola) and they were offered the same refreshments with the simple explanation that there was plenty on hand because “I brought some along for you too.” The contrasting ‘no-food’ condition was identical in every respect except that no refreshments were in the room at any time during the session.”

People read a series of editorials on four different topics and either got to eat and drink while doing it or not (each randomly assigned to only one condition).  Of course, nowadays you probably wouldn’t use peanuts because of allergy concerns and you’d probably offer Coke Zero or maybe bottled water, but you get the point, don’t you?

The research team of Janis, Kaye, and Kirschner ran this study twice and combined the results.  On average, across the four topics and the two experiments, eating produced more favorable attitudes compared to the no-food condition.  It was a small plus windowpane (about a 40/60 effect).

As a For Me? play, this study is not simple to interpret.  Rewards follow the behavior in the typical reinforcement setting, but here the reward of tasty food arrived before anyone could even read the persuasive messages.  Thus, the timing here is off and makes the results open to a Ding-Dong play, too.  The food could elicit favorable feelings which then are associated with the persuasive message.  Thus, we don’t have a simple, clear, and unambiguous interpretation of whether it is For Me? or Ding-Dong, but we do know this:  Food influences people.

The Limitations Of Reinforcement

While Reinforcement Theory is a powerful influence tool, it does have several serious limitations.  To use it effectively, you must be aware of these difficulties in application.

1.  It is difficult to identify rewards and punishments.  As noted earlier in this chapter, reinforcers are identified by their function.  Thus, there is no cookbook list of Rewards and Punishments.  Candy increases student cooperation, but has no value as payment to a factory worker.  Thus, you have to observe your receivers very carefully to discover the things they find most rewarding or punishing.  (See the coach example above.)

And once you do find things that function effectively, you can be seriously disappointed to discover that they lose their value over time.  As people become accustomed to receiving some Reward, they may grow bored over time.  This is perhaps the greatest challenge for any persuader.  Finding good Rewards and Punishments requires a great deal of experience and insight.

2.  You must control all sources of reinforcement.  Often the receiver you want to reinforce also works and lives within a peer group that also provides reinforcers.  Peers provide an extremely important source of reinforcement, sometimes greater than any Reward or Punishment you can give.  Persuaders sometimes think their reinforcement applications are failing because they are not using the “right” Reward or Punishment.  Instead the problem may be that the receiver wants or needs the reinforcers the peer group offers more than the ones the persuader gives.

3.  Internal changes can be difficult to create.  One side effect of reinforcement theory is that people learn to perform behaviors we want them to show only when the Get is available.  If the Reward is not present, then the person will not show cooperation or good effort or attention or friendliness.  The target becomes little more than a well-trained monkey who does a trick, then holds out a hand waiting for the banana.  The person has not internalized the behavior but instead requires the full process (When-Do-Get).  This means that you must always be running around providing the correct consequences for the desired behaviors at the right time.  In such an instance one wonders who is being trained, you or the receiver.

You should also realize that reinforcement works best with the Low WATT thinker (“If I get a Reward, then the thing is good.  If I get a Punishment, then the thing is bad.”).  It does not require High WATT.  As we discovered in the Dual Process chapter, influence with Low WATT thinkers is often short lived and usually situation dependent.  The influence lasts only as long as the Cue (in this case the Reward or the Punishment) is available.  This simply means you need to maintain a steady diet of reinforcement cues to maintain the actions you desire.

4.  Punishing is difficult to do well.  Punishment is an extremely powerful consequence for all living things.  Whether it is a monkey, a pigeon, a child, or an adult, punishing consequences can produce extremely rapid, strong, and memorable changes.  The problem is that effective punishment demands certain requirements.  The research clearly shows that effective punishment must be:  1) immediate (right now!), 2) intense (the biggest possible stick), 3) unavoidable (there is no escape), and 4) consistent (every time).  If you cannot deliver punishment under these conditions, then the punishment is likely to fail.

Thus, the best punishment would be something like this.  A receiver does the Bad Thing, then:  the receiver is instantly placed in a dark room filled with snakes and bugs and jungly vines while weird and frightening voices shriek, “Don’t do the Bad Thing, Don’t do the Bad Thing.”  And as soon as the receiver stops doing the Bad Thing, bang, the receiver is back in your reinforcement economy, safe and sound.

While this example is an exaggeration, you get the point.  It would be extremely difficult to get anyone to accept this.  And, yet, people will criticize For Me? as a weak theory that doesn’t work in the real world.  Hey, Brave New World is not just a novel.  Let a merciless source run a Real World Reinforcement Economy that uses punishment effectively and you’d quickly understand the problem with Big Brother.

5.  Receivers may come to hate sources who use punishment.  Punishment is, by definition, an aversive, painful consequence.  People experience very negative emotional states when they get punished.  And, as we learned in the Classical Conditioning chapter, it is very easy to condition emotions.  Thus, when a source uses punishment, the receivers will probably feel angry or fearful or hopeless and they will then connect or associate these negative feelings with the source of the punishment, you.

6.  It is easy to reinforce one pigeon, but a whole flock?  Reinforcement theory has been most strongly tested with animals, particularly pigeons and rats.  And that research with pigeons and rats has yielded outstanding results.  The problem for us is this:  The research used reinforcement principles on one pigeon at a time.  We often work with a whole flock.  The sheer number of receivers brings a very difficult dimension into the proper application of reinforcement theory.

Using Reinforcement To Best Effect

This model is simple and widely applicable.  It is also probably the one influence tool that almost everyone knows.  Given the discussion of the limitations of reinforcement theory, you should realize that it is not the Swiss Army Knife of persuasion that can be ingeniously applied anytime anywhere with anyone.  In fact, I believe that it is used too often by people and typically under the wrong conditions.  Please understand that reinforcement theory will work marvelously when it is properly employed.  Under the correct conditions, monkeys and pigeons, boys and girls, and men and women will be strongly influenced through the skillful use of reinforcement principles.

What are those correct conditions?  Here’s the list:

1.  The source is well-trained in the theory and practice of reinforcement.

2.  The source has control of all significant reinforcers for all receivers.

3.  The source has control of each receiver (i.e. what the receiver does, when the receiver does it, what other receivers are in the situation).

4.  The source has a detailed and consistent plan of reinforcement.

5.  The reinforcers are always delivered under the same conditions to each different receiver.

To the extent that you deviate from these general rules, the application of reinforcement will be ineffective.  It is also important to realize that these inefficiencies do not make the theory a failure, but rather these inefficiencies simply show it is difficult to implement the theory in the real world.  Or why the real world won’t permit a Reinforcement Economy.

References And Recommended Readings

Henderlong, J., & Lepper, M. R. (2002). The effects of praise on children’s intrinsic motivation: A review and synthesis. Psychological Bulletin, 128, 774-795.

Hill, W. (1985). Learning: A survey of psychological interpretations (4th ed.). New York: Harper and Row.

Janis, I., Kaye, D., & Kirschner, P. (1965). Facilitating effects of “eating-while-reading” on responsiveness to persuasive communications. Journal of Personality and Social Psychology, 1, 181-186.

Skinner, B. (1953). Science and human behavior. New York: Macmillan.

Skinner, B. (1968). The technology of teaching. New York: Appleton-Century-Crofts.

As a bonus, here’s a great website on Educational Psychology and an excellent chapter on Reinforcement.  I also highly recommend the other chapters.