Schedules for training with reinforcers
What are the available reinforcement schedules for positive reinforcement training?
The reinforcement schedules are the Continuous Reinforcement Schedule (CRF) and the intermittent reinforcement schedules, which subdivide into Fixed Ratio Schedules (FR), Variable Ratio Schedules (VR), Fixed Interval Schedules (FI), Variable Interval Schedules (VI), Fixed Duration Schedules (FD) and Variable Duration Schedules (VD).
You do, I pay! Continuous reinforcement schedule
Under the CRF schedule, every correct response earns a reinforcer. This makes it perfect for the initial training of a new behaviour or skill. It allows the trainer to keep the dog engaged in the exercise, creating a chance for fast progression while at the same time allowing the dog to understand the required behaviour response.
For example, take teaching a puppy the lay-down behaviour. If a reward arrives every time the puppy's belly touches the floor, the puppy will (mostly) very quickly learn that belly on the floor gets results. However, once the behaviour is linked to the cue and is known to the dog, changing to an intermittent reinforcement schedule is vital for progression. Otherwise, you will end up with a dog that is only interested in performing when a reinforcer is in use.
You do five then I pay! Fixed Ratio reinforcement schedule
The FR schedule works on the principle that the same number of behaviours is performed between each reinforcer. For example, if the ratio is five and the dog is performing a ‘Spin’, it will receive the reinforcer each time it has completed the ‘Spin’ behaviour five times in a row.
The number of repetitions between reinforcer deliveries can be any number but remains constant during the training session. So, using the example above, there will be five behaviours between each reinforcer for the whole session. The benefit of this type of schedule is that the trainer finds the fixed program easy to remember and to deliver.
It can produce a high work drive in the dog, except just after the delivery of the reinforcer, when the dog tends to pause. This pause is one of the negatives of this type of schedule and is sometimes known as the Post-Reinforcement Pause.
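The fixed ratio rule above can be written out as a few lines of Python. This is only an illustrative sketch: the function name `fixed_ratio` and its parameters are made up for this example, and it simply counts which repetitions in a session would earn the reinforcer.

```python
def fixed_ratio(total_responses, ratio=5):
    """Return the 1-based response numbers that earn a reinforcer.

    Hypothetical helper for illustration: on a fixed ratio schedule
    every `ratio`-th completed behaviour is reinforced.
    """
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    return [n for n in range(1, total_responses + 1) if n % ratio == 0]


# Twelve 'Spins' on a ratio of five: the 5th and 10th are reinforced.
print(fixed_ratio(12))          # [5, 10]

# A ratio of one reinforces every response, which is continuous reinforcement.
print(fixed_ratio(3, ratio=1))  # [1, 2, 3]
```

Seen this way, continuous reinforcement is just the special case where the ratio is one.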
Let’s get flexible! Variable ratio reinforcement schedule
The VR schedule works on the principle that the number of behaviours required between each reinforcer is random. For the previously given example of the ‘Spin’, the reinforcer could come after 3 completed ‘Spins’, then after 5 ‘Spins’, then 2 ‘Spins’, then 4, 6, 3, 5 and so on. Due to the variable rate of delivery of the reinforcer, the dog never knows when the next one will arrive.
This type of reinforcement schedule is known to keep the dog working for long periods at a consistently high rate. When the reinforcer stops appearing, the behaviour also continues to be offered for longer than with other schedule types. It is, however, more demanding on the trainer, and some less experienced trainers can fall into the trap of thinning the reinforcement so much that the dog switches off. Practice makes perfect!
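To make the ‘Spin’ example concrete, the short Python sketch below takes the sequence of requirements from the text (3, 5, 2, 4, 6, 3, 5) and works out at which spin each reinforcer arrives, plus the average number of spins per reinforcer. The function names are invented for this illustration, and watching the average is just one possible way for a trainer to notice when reinforcement is being thinned too quickly.

```python
from itertools import accumulate

# Spins required before each reinforcer, taken from the example in the text.
requirements = [3, 5, 2, 4, 6, 3, 5]


def reinforced_at(requirements):
    """Return the running spin count at which each reinforcer arrives."""
    return list(accumulate(requirements))


def average_ratio(requirements):
    """Return the mean number of behaviours asked for per reinforcer."""
    return sum(requirements) / len(requirements)


print(reinforced_at(requirements))  # [3, 8, 10, 14, 20, 23, 28]
print(average_ratio(requirements))  # 4.0
```

Here the dog performs 28 spins for 7 reinforcers, an average of four spins each, even though no two neighbouring requirements are the same.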
You continue, and then I will pay! Fixed interval reinforcement schedule
The FI schedule works on the principle that a specified amount of time must pass before the next offered behaviour is reinforced. If the interval were 2 minutes, the dog would be reinforced for the required behaviour, then again for the first repeat of the desired behaviour after the 2 minutes had elapsed. This schedule type is not of much use in a regular dog training classroom or session.
It is more often used in a laboratory setting than in real-life situations. However, ‘life rewards’ often come on an FI schedule: things like regular food times or walks, if done at the same time every day, run on the principle of FI. The FI schedule may be of use for duration behaviours such as staying calm in a crate but, like the FR schedule, it is prone to Post-Reinforcement Pauses.
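The fixed interval rule described above can also be sketched in Python. The function `fixed_interval` and the list of response times are hypothetical, chosen only to show that behaviours offered before the interval has elapsed go unreinforced, and that the clock restarts from each reinforcer.

```python
def fixed_interval(response_times, interval=120):
    """Return the times (in seconds) of the responses that are reinforced.

    Hypothetical helper: the first response is reinforced, then the first
    response offered once `interval` seconds have passed since the last
    reinforcer.
    """
    reinforced = []
    last_reinforcer = None
    for t in response_times:
        if last_reinforcer is None or t - last_reinforcer >= interval:
            reinforced.append(t)
            last_reinforcer = t
    return reinforced


# Made-up session: the dog offers the behaviour at these times (seconds).
offers = [0, 30, 90, 125, 130, 200, 250]
print(fixed_interval(offers))  # [0, 125, 250]
```

The offers at 30, 90, 130 and 200 seconds earn nothing, because fewer than 2 minutes have passed since the previous reinforcer.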
Mixing up the time! Variable interval reinforcement schedule
The VI schedule works on the same principle as the FI schedule but allows a variable amount of time to pass before the behaviour receives reinforcement again. As with the FI schedule, there are only a limited number of responses for which this type of program is useful.
Crate training will be one of these if the dog does not like extended periods inside the crate. The principle is that the dog receives a reinforcer for calm and relaxed behaviour inside the crate, with a variable amount of time between each reinforcer.
So the dog could be given the first one after two minutes of being calm. If it remains calm for a further five minutes, it gets another; after another three minutes it gets another, and so on. It is important to include both longer and shorter periods between reinforcers during the session, but the overall length of time between deliveries can increase. As a result, the dog's self-control grows.
This schedule type produces more consistent performance results when compared to the FI schedule. As with the variable ratio reinforcement schedule, it is more demanding on the trainer, and some less experienced trainers can find it harder to use.
Hold that position! Fixed duration reinforcement schedule
The FD schedule is often confused with the FI schedule. The principle of this schedule is that the reinforcer arrives after the fixed period has elapsed. It could be of use for teaching timed behaviour for obedience competitions.
Often these competitions will have a down or sit stay of a set period. The difference between the FI and FD schedules is that in the FI schedule the period must pass after one reinforcer has arrived before the next can be earned, whereas with the FD schedule the behaviour itself has to be performed continuously for the period, for example, a 3 minute ‘Sit’.
This means that if the dog in the example above breaks the sit after 2 minutes, it will not receive the reinforcer, because the desired duration of the behaviour was not achieved. The downside to this fixed period is that dogs are great anticipators, so you may end up with a dog that consistently breaks the ‘Sit’ after 3 minutes if this is the duration it first learnt. As with other fixed rate schedules, the Post-Reinforcement Pause is evident.
Keep their focus on staying! Variable duration reinforcement schedule
The VD schedule works on the same principle as the FD schedule, but the period for which the behaviour must be performed before the reinforcer is delivered changes randomly within the training session.
With the above example of the held ‘Sit’, the reinforcer could be delivered at three minutes, before releasing the dog from the position. On the next repetition the reinforcer arrives after one minute, then after four minutes. Due to the varying periods between the reinforcers, the dog’s tendency to anticipate and break the behaviour is reduced. The result is long periods of continuous behaviour performance without any Post-Reinforcement Pauses. As with other variable rate schedules, it is more demanding on the trainer.
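The duration rule shared by the fixed and variable duration schedules is simple enough to sketch in Python. The required durations below come from the ‘Sit’ example (three minutes, one minute, four minutes); the held durations and the function name `earns_reinforcer` are assumptions made up for this illustration.

```python
def earns_reinforcer(held_seconds, required_seconds):
    """Return True if the behaviour was held for the whole required period."""
    return held_seconds >= required_seconds


# Fixed duration: a 3 minute 'Sit' broken after 2 minutes earns nothing.
print(earns_reinforcer(120, 180))  # False

# Variable duration: the required time changes on every repetition.
required = [180, 60, 240]  # three minutes, one minute, four minutes
held = [180, 60, 150]      # hypothetical: the dog breaks the last 'Sit' early
outcomes = [earns_reinforcer(h, r) for h, r in zip(held, required)]
print(outcomes)            # [True, True, False]
```

The only difference between the two schedules in this sketch is whether the required time stays the same or changes from one repetition to the next.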
Conclusion: so which is the 'best' reinforcement schedule for positive reinforcement dog training?
There is not one perfect reinforcement schedule. The trainer needs to decide which program suits the specific behaviour being trained to achieve the best results. It also depends, as with all training, on the dog itself and the person teaching.
The best schedule to start
For starting any basic position behaviour, like sits, downs, stands and hand touches, the reinforcement schedule to start on is continuous reinforcement. A constant and predictable reinforcer program allows the trainer and dog to get into a fluid rhythm of Cue-Behaviour-Reinforcer-Cue-Behaviour-Reinforcer and quickly connects the behaviour to the cue.
However, move on when you're both ready
Once the behaviour repetition rate is stable, transfer onto a variable ratio reinforcement schedule. This will increase the dog's focus on continuing to perform the behaviour, reducing the risk of it only giving the behaviour when the reinforcer is present. It also reduces any problems with anticipation or Post-Reinforcement Pauses.
When training self-control behaviours, such as relaxing or being calm, a variable interval reinforcement schedule would be a good choice. It enables the average interval between the reinforcers to increase while, at the same time, reducing the problems of anticipation and Post-Reinforcement Pauses without compromising performance.
For teaching higher demand behaviours, such as duration for sit and down stays, the variable duration reinforcement schedule is an excellent choice. Like the other variable rate schedules, it can be tailored to the dog’s current ability level and increased as the training progresses.
Whichever the schedule, the main criterion is that it should be set up for the dog to succeed, with difficulty expanding from that point. Managing training this way builds the dog's confidence both in itself and in its ability to continue the training.
While on the subject of delivering reinforcers, let’s not forget about trainer’s greed! Read more: The Rule of Five