How to Calculate Scored-Interval IOA for Low-Rate Behaviors
When evaluating the reliability of interval-based data, choosing the wrong Interobserver Agreement (IOA) metric can artificially inflate your scores and mask poor observer calibration. This psychometric danger is highest when tracking low-rate behaviors—behaviors that occur in a very small percentage of the observed intervals (e.g., occasional self-injurious bursts or brief elopement episodes).
On the 6th Edition BCBA exam, you must demonstrate how to bypass artifactually inflated agreement metrics by calculating Scored-Interval IOA step-by-step.
The Flaw of Interval-by-Interval IOA for Low-Rate Behaviors
A common clinical mistake is defaulting to Interval-by-Interval IOA, which scores agreement on every single interval where observers match—whether they both recorded a behavior happening (+) or both recorded a behavior not happening (-).
⚠️ The Low-Rate Trap: If a behavior occurs in only 2 intervals of a 20-interval session, the observers will almost certainly match on most of the 18 intervals where the behavior was absent. Under Interval-by-Interval IOA, those matching “empty” intervals dominate the calculation and yield an inflated agreement score (for example, 90% when one observer records both occurrences and the other records none), even though the observers never agreed on a single occurrence.
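To make the trap concrete, here is a minimal Python sketch of the Interval-by-Interval calculation. The function name `interval_by_interval_ioa`, the `"+"`/`"-"` encoding, and the 20-interval session data are illustrative assumptions, not taken from any exam material.

```python
def interval_by_interval_ioa(obs1, obs2):
    """Percentage of intervals on which both observers made the same record."""
    if len(obs1) != len(obs2):
        raise ValueError("Observers must score the same number of intervals")
    # An agreement is any interval where the records match, "+" or "-".
    agreements = sum(a == b for a, b in zip(obs1, obs2))
    return 100 * agreements / len(obs1)


# Hypothetical 20-interval session: Observer 1 scores the behavior in
# intervals 5 and 14; Observer 2 misses both and scores nothing.
observer_1 = ["-"] * 20
observer_1[4] = "+"
observer_1[13] = "+"
observer_2 = ["-"] * 20

trap_score = interval_by_interval_ioa(observer_1, observer_2)
print(f"Interval-by-Interval IOA: {trap_score:.0f}%")  # prints 90%
```

The observers disagree on every interval in which the behavior was scored, yet the 18 shared empty intervals carry the score to 90%.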
Unmasking the Math: Scored-Interval IOA Step-by-Step
Scored-Interval (or Occurrence) IOA removes the distortion of empty intervals by excluding every interval in which neither observer recorded an occurrence. Agreement is calculated only over the intervals where at least one observer scored the behavior.
Let’s break down a 6th Edition applied mock scenario using an interval grid layout:
The Raw Interval Log
Two behavior technicians record aggression across 10 intervals.
| Interval | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Observer 1 | - | - | + | - | - | - | - | - | + | - |
| Observer 2 | - | - | + | - | - | + | - | - | - | - |
Step 1: Identify the Critical Evaluation Set
Scan the grid and isolate every single interval where at least one observer marked a +.
- Interval 3: Both marked + (Include)
- Interval 6: Observer 2 marked + (Include)
- Interval 9: Observer 1 marked + (Include)
All other intervals (1, 2, 4, 5, 7, 8, 10) are completely thrown out of the math because neither observer recorded a behavior. Our total number of evaluation intervals is 3.
Step 2: Count the Absolute Agreement Intervals
Out of those 3 isolated intervals, how many show perfect agreement where both recorded a +?
- Only Interval 3 qualifies. Our total agreement count is 1.
Step 3: Run the Final Calculation
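Divide the agreement count from Step 2 by the size of the evaluation set from Step 1, then multiply by 100. Written out, assuming the standard scored-interval formula and the counts above:

$$
\text{Scored-Interval IOA} = \frac{\text{intervals where both observers scored } +}{\text{intervals where at least one observer scored } +} \times 100 = \frac{1}{3} \times 100 \approx 33.33\%
$$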
📊 The Clinical Revelation: A standard Interval-by-Interval calculation on this grid yields a deceptive 80% agreement score (8 matching intervals out of 10), while Scored-Interval IOA yields only 33.33% (1 agreement out of 3 scored intervals). The lower score tells the supervisor that the observers rarely agree on when the behavior actually occurs, which points to poor calibration and a need for retraining.
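The comparison can be checked with a short Python sketch that runs both metrics on the 10-interval aggression log. The function names and the `"+"`/`"-"` string encoding are assumptions made for illustration; returning `None` when no interval is scored is also an assumed design choice, since the ratio is undefined in that case.

```python
def interval_by_interval_ioa(obs1, obs2):
    """Agreement over every interval, including shared empty ones."""
    agreements = sum(a == b for a, b in zip(obs1, obs2))
    return 100 * agreements / len(obs1)


def scored_interval_ioa(obs1, obs2):
    """Agreement over only the intervals at least one observer scored."""
    # Step 1: keep intervals where at least one observer marked "+".
    scored = [(a, b) for a, b in zip(obs1, obs2) if "+" in (a, b)]
    if not scored:
        return None  # undefined: neither observer scored any interval
    # Step 2: count intervals where both observers marked "+".
    agreements = sum(a == b == "+" for a, b in scored)
    # Step 3: convert the ratio to a percentage.
    return 100 * agreements / len(scored)


# The raw interval log from the scenario, intervals 1 through 10.
observer_1 = list("--+-----+-")
observer_2 = list("--+--+----")

ibi = interval_by_interval_ioa(observer_1, observer_2)
scored = scored_interval_ioa(observer_1, observer_2)
print(f"Interval-by-Interval IOA: {ibi:.2f}%")   # prints 80.00%
print(f"Scored-Interval IOA: {scored:.2f}%")     # prints 33.33%
```

Only the denominator changes between the two functions, and that alone accounts for the drop from 80% to 33.33%.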