There’s more to operant conditioning than simple reward and punishment. B. F. Skinner devoted much of his life to figuring out what influences rates of learning, and his research showed that different schedules of reinforcement affect how quickly you can teach an animal to reliably perform a specific behaviour. It also showed how quickly the animal stops performing the behaviour once reinforcement ceases.
What are schedules of reinforcement?
A schedule of reinforcement is simply a set of rules (chosen by you, the teacher) used when teaching your student – in this instance your dog! These rules dictate how often you reward instances of the correct behaviour. You can reinforce continuously or intermittently.
Continuous schedules of reinforcement
With continuous schedules of reinforcement, as the name suggests, you reward behaviour continuously. To put it simply, you reward your dog every time she performs the behaviour you want to reinforce. Continuous schedules of reinforcement can require a lot of resources, depending on what you’re using as a reward. In addition, you need to watch your dog carefully to ensure you’re rewarding her each time she performs the behaviour – i.e. you need to be consistent! As you can imagine, continuous schedules of reinforcement can be dull and hard to sustain.
When should you use a continuous schedule of reinforcement?
It’s best to use a continuous schedule of reinforcement when teaching your dog something new. The reason for this is that it’s easier for her to make the connection between her behaviour and its consequence when she’s consistently rewarded for it. This only needs to happen initially, however. Once she’s reliably performing the behaviour when she should, you can put it on an intermittent schedule of reinforcement to maintain it.
Continuous reinforcement schedule examples
- Every time you call your dog to her bowl, she gets a meal.
- Every time you flick the switch, the light goes on.
Intermittent, or partial, schedules of reinforcement
Intermittent schedules of reinforcement (aka partial schedules of reinforcement) reward the dog sometimes, in contrast to continuous schedules which reward the dog all the time. Intermittent schedules are great for maintaining learned behaviours. Behaviours on an intermittent schedule of reinforcement are more resistant to extinction.
Fixed vs variable
Reinforcement is said to be on a fixed schedule when the dog is rewarded after a set number of repetitions of a behaviour, or after a set amount of time has passed. For example, every second sit gets a treat.
Reinforcement is said to be on a variable schedule when the dog is rewarded after a random number of repetitions of a behaviour, or after a random amount of time has passed. You should have an idea of how often you want to reward your dog on average, however. For example, if you want to reward your dog on a variable schedule for every second sit on average, your reinforcement might go: reward, reward, no reward, no reward, reward, no reward. In this instance, 3 out of the 6 sits got a reward, which equates to every second sit on average (6 sits ÷ 3 reinforcements = 2 sits per reinforcement on average).
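The averaging above can be checked with a few lines of Python. This is only an illustrative sketch: the function name `sits_per_reward` and the True/False encoding of a session are assumptions made for the example.

```python
def sits_per_reward(outcomes):
    """Average number of sits per reward; `outcomes` holds True for each rewarded sit."""
    rewards = sum(outcomes)
    if rewards == 0:
        raise ValueError("no rewards given, so the average is undefined")
    return len(outcomes) / rewards


# The sequence from the text: reward, reward, no reward, no reward, reward, no reward
session = [True, True, False, False, True, False]
print(sits_per_reward(session))  # 6 sits / 3 rewards = 2.0
```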
Interval vs ratio
Intervals refer to the amount of time that must pass, after which your dog must perform the behaviour at least once to get her reward.
Ratios refer to the number of times your dog must perform the behaviour itself to get her reward.
There are many different kinds of intermittent schedules; the four main ones are listed below:
Schedule 1: Fixed interval (FI)
A dog’s behaviour is said to be on a fixed interval schedule when she is reinforced for performing the behaviour after a set amount of time has passed. It doesn’t matter if she performs the behaviour one time or thirty times on a fixed interval schedule. As you can imagine, rates of response can be poor on fixed interval schedules, as there’s no incentive to do extra work – she always earns the same reward regardless. Once your dog figures out the schedule, her rate of response should go up nearer to reward time.
Fixed interval schedule examples
- Let’s say you’re teaching down-stay. She’s still learning, so you decide to reward her for every five seconds that she stays in position. If five seconds have passed, and she’s still lying down, you reward her with a treat. Every time you reinforce her behaviour, her time starts again.
- For many of us, earning a salary is an example of a fixed interval reinforcement schedule. Regardless of how much you’ve accomplished in that month (as long as you’ve done enough to avoid being fired), your earnings stay the same and the money (your reinforcement) routinely goes into your account.
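The fixed interval rule can be sketched in Python as follows. This is a simplified model rather than a training tool: the function name `fixed_interval_rewards`, the list of response times, and the assumption that the clock restarts at the moment of each reward are all choices made for illustration.

```python
def fixed_interval_rewards(response_times, interval):
    """Return the response times (in seconds) that earn a reward on a fixed interval schedule.

    The first response made once `interval` seconds have passed since the last
    reward (or since the start) is rewarded, and the clock then restarts.
    """
    rewarded = []
    last_reward = 0.0
    for t in sorted(response_times):
        if t - last_reward >= interval:
            rewarded.append(t)
            last_reward = t
    return rewarded


# Hypothetical session: responses at these seconds, on a five-second schedule
print(fixed_interval_rewards([1, 2, 5, 6, 7, 11, 12, 18], interval=5))
# [5, 11, 18]
```

Note that the extra responses at 1, 2, 6, 7 and 12 seconds earn nothing, which is why there is little incentive to respond more often on this schedule.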
When should you use a fixed interval schedule of reinforcement?
Fixed interval schedules can be useful when you’re trying to build duration in behaviours that you’re teaching, such as in our down-stay example. Variable-interval schedules, however, can be even better for this purpose – see below.
Schedule 2: Variable interval (VI)
A dog’s behaviour is said to be on a variable interval schedule when she is reinforced for performing the behaviour after a random (to her) amount of time has passed. As seen in fixed interval schedules, it doesn’t matter if she performs the behaviour once or fifteen times on a variable interval schedule. Rates of response tend to be moderate and steady on variable interval schedules, as there’s no telling when the period of reinforcement has started.
Variable interval schedule examples
- Using our down-stay example from above, if you wanted to use a variable interval schedule with reinforcement every five seconds on average, it could look something like this:
wait for 6s → reward, wait for 3s → reward, wait for 7s → reward, wait for 4s → reward, wait for 5s → reward. Twenty-five seconds have passed, with five instances of reinforcement. Therefore, reinforcement has occurred every five seconds on average.
- Waiting for an elevator is an example of a variable interval schedule. You can press the button once, or ten times in succession, but the lift still appears randomly.
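If you would rather not make up the waiting times yourself, a short Python sketch can draw them for you. The function name `variable_intervals` and the choice of a uniform spread between 1 and twice the mean minus 1 second are assumptions for this example; any spread whose average matches your target would serve.

```python
import random


def variable_intervals(mean_seconds, count, seed=None):
    """Draw `count` waiting times (whole seconds) that average `mean_seconds` in the long run.

    Each wait is picked uniformly from 1 to 2 * mean_seconds - 1, a range
    whose midpoint is mean_seconds. This is one simple choice among many.
    """
    rng = random.Random(seed)
    return [rng.randint(1, 2 * mean_seconds - 1) for _ in range(count)]


waits = variable_intervals(mean_seconds=5, count=5, seed=1)
print(waits, "average:", sum(waits) / len(waits))
```

Over a handful of repetitions the average will wander around five seconds rather than hit it exactly; it only settles near the target over a longer session.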
When should you use a variable interval schedule of reinforcement?
Variable interval schedules can be instrumental in dealing with separation anxiety, or with dogs who are generally noisy. If they’re barking for attention, you can reward them with your presence, or with a treat, once they’ve been calm and quiet for long enough (this might be just one second initially!). Gradually build up the amount of time you’re away, but return early every now and then (your return is their reinforcement for being quiet), so that they don’t come to expect you’ll be gone for longer and longer periods each time you leave. This schedule is also great when you want to reinforce behaviours that are time dependent, or need duration, like ‘stay’, ‘heel’ and ‘wait’.
Schedule 3: Fixed ratio (FR)
A dog’s behaviour is said to be on a fixed ratio schedule when she is reinforced after performing the behaviour itself a set number of times, regardless of how much time has passed. It doesn’t matter if the behaviours occur in a ten-second window or a ten-minute one. Rates of response tend to be higher on fixed ratio schedules, as the more she performs, the more she can earn. Once your dog figures out the schedule, however, her rate of response may drop immediately after reinforcement – this is known as the post-reinforcement pause. She knows that the next reward is a while away, so behaviours performed at the start of a new block tend to be lower effort.
Fixed ratio schedule examples
- Rewarding your dog for every three ‘sits’, or every seven ‘roll-overs’
- Some occupations pay for tasks completed rather than time worked, for example, fruit picking – being paid per full basket.
- Some parents pay their children for each chore done, as opposed to time spent doing it.
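A fixed ratio schedule is simple enough to express as a counting rule. The Python sketch below is illustrative only; the function name `fixed_ratio_rewards` and the True/False output format are assumptions for the example.

```python
def fixed_ratio_rewards(total_responses, ratio):
    """Return one True/False per response; every `ratio`-th response is rewarded."""
    return [(n % ratio == 0) for n in range(1, total_responses + 1)]


# Seven sits on a "reward every third sit" schedule
print(fixed_ratio_rewards(7, ratio=3))
# [False, False, True, False, False, True, False]
```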
When should you use a fixed ratio schedule of reinforcement?
Fixed ratio schedules are great when your dog is learning something new, but you don’t want to reinforce every single performance. Depending on your ratio, she should still be able to make the connection between performing the behaviour and getting her reward, and you don’t have to spend too much time or too many treats.
Schedule 4: Variable ratio (VR)
A dog’s behaviour is said to be on a variable ratio schedule when she is reinforced after performing the behaviour itself a random number of times, regardless of how much time has passed. As there is no predictability, there’s minimal post-reinforcement pause and your dog maintains a high, steady rate of response.
Variable ratio schedule examples
- Some ‘sits’ may get no reward, or a mere “good girl”, while others get a piece of steak.
- Gambling is one of the most addictive behaviours in the human world, and slot machines perfectly illustrate how powerful a variable ratio schedule of reinforcement can be. There’s no telling which pull will be the winner, and it always feels so close, so you keep going and going.
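To see how unpredictable a variable ratio schedule looks from the dog’s side, here is a rough Python sketch. The function name `variable_ratio_rewards` and the uniform draw of the next requirement (between 1 and twice the mean minus 1 responses) are assumptions chosen to keep the example short.

```python
import random


def variable_ratio_rewards(total_responses, mean_ratio, seed=None):
    """Return one True/False per response on a variable ratio schedule.

    After each reward, the number of responses needed for the next one is
    drawn uniformly from 1 to 2 * mean_ratio - 1, so it averages mean_ratio.
    """
    rng = random.Random(seed)
    outcomes = []
    needed = rng.randint(1, 2 * mean_ratio - 1)
    for _ in range(total_responses):
        needed -= 1
        rewarded = needed == 0
        outcomes.append(rewarded)
        if rewarded:
            # Redraw the requirement so the next reward is unpredictable too
            needed = rng.randint(1, 2 * mean_ratio - 1)
    return outcomes


sits = variable_ratio_rewards(total_responses=12, mean_ratio=3, seed=4)
print(["reward" if r else "-" for r in sits])
```

Unlike a fixed ratio schedule, the gaps between rewards differ from one reward to the next, so there is no "start of a new block" for the dog to slack off in.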
When should you use a variable ratio schedule of reinforcement?
Once a behaviour is already learned, putting it on a variable ratio schedule is one of the best ways to maintain it. Variable schedules also help refine behaviours, allowing you to reinforce particularly good attempts, like faster sits or deeper bows, with better rewards. As a result, behaviours are not only strengthened, they’re perfected.
As we touched on in Dog Training Basics – Operant Conditioning, a behaviour can become ‘extinct’ when it is no longer reinforced at all (either internally or externally). Extinction schedules work pretty much inversely to their reinforcement schedule counterparts. This means that if a behaviour is on a predictable or continuous schedule of reinforcement, it generally reaches extinction quickly when all reinforcement is taken away. Using the light switch example above, if you flick the switch and the light doesn’t go on, you don’t keep hitting the switch in the hope that the light will turn on eventually. In the past it has always worked with one flick, so you know not to waste your time on the switch again until the bulb is replaced.
In contrast, if a behaviour is on a very unpredictable schedule of reinforcement, as seen in variable ratio, the behaviour can be very resistant to extinction. This is due to the fact that reinforcement for that behaviour was random anyway, and sometimes far apart – the dog may just think she needs to keep trying to get that reward.
References and recommended reading:
- Reid, P. (1996). Excel-Erated Learning: Explaining in Plain English How Dogs Learn and How Best to Teach Them. James & Kenneth.
- Yin, S. (2010). How To Behave So Your Dog Behaves. Neptune City, NJ: TFH.
- Pryor, K. (1999). Don’t Shoot The Dog. Bantam.
- Skinner, B.F. (1951). How to teach animals. Scientific American.