Core Metric
ELO Rating System
The ELO rating system is a method for calculating the relative skill levels of teams in zero-sum games.
It was originally developed for chess; we have adapted it for FTC alliances.
The Concept
Every team starts with a baseline rating (typically 800). After every match, the winning alliance takes
points from the losing alliance.
The number of points exchanged depends on how the actual outcome compares with the expected outcome.
If a high-rated alliance beats a low-rated alliance, its rating increases only slightly, because that result was expected.
However, if a low-rated alliance pulls off an upset, its rating increases significantly.
Mathematical Foundation
The expected score \( E_A \) for an alliance \( A \) against an opposing alliance \( B \) is calculated
using a logistic curve:
$$ E_A = \frac{1}{1 + 10^{(R_B - R_A) / 400}} $$
Where \( R_A \) and \( R_B \) are the combined ratings of the alliances. The new rating \( R'_A \) is
updated as:
$$ R'_A = R_A + K \cdot (S_A - E_A) $$
Where \( K \) is the K-factor (volatility) and \( S_A \) is the actual score (1 for win, 0 for loss).
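As a rough illustration of the two formulas above, the following Python sketch computes the expected score and the rating update for one alliance. The function names and the K-factor of 32 are assumptions made for this example; the source does not state the K value actually used.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score E_A of alliance A against alliance B (logistic curve)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def updated_rating(rating_a: float, rating_b: float, actual: float,
                   k: float = 32.0) -> float:
    """New rating R'_A = R_A + K * (S_A - E_A). k=32 is an assumed value."""
    return rating_a + k * (actual - expected_score(rating_a, rating_b))


# Evenly matched alliances: the winner gains half of K.
even_gain = updated_rating(1600, 1600, actual=1.0) - 1600

# A 400-point favourite gains little for winning,
# while the underdog gains a lot for an upset.
favourite_gain = updated_rating(2000, 1600, actual=1.0) - 2000
upset_gain = updated_rating(1600, 2000, actual=1.0) - 1600
```

Because the two expected scores always sum to 1, the points gained by one alliance equal the points lost by the other when both use the same K.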
Core Metric
Normalized cELO (Normalized Cumulative ELO)
Normalized cELO adjusts a team's cumulative ELO rating to account for regional strength differences,
enabling fair comparisons between teams from different competitive circuits.
Understanding the Metrics
Our system tracks three levels of ELO ratings:
- Event ELO: Team's rating calculated from matches at a specific event
- cELO (Cumulative ELO): Running total across all matches, with newer matches
weighted more heavily using exponential decay
- Normalized cELO: cELO adjusted for regional strength variations
Recency Weighting
cELO uses an exponential decay function to weight recent performance more heavily than older matches.
This ensures ratings reflect a team's current skill level rather than historical performance.
$$ w(t) = e^{-\lambda \cdot \Delta t} $$
Where \( \Delta t \) is the number of days since the match and \( \lambda \) is the decay parameter.
Matches from last week contribute significantly more to a team's rating than matches from months ago.
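The weight function can be sketched in a few lines of Python. The decay parameter is not given in the text, so this example assumes a hypothetical 30-day half-life purely to make the numbers concrete.

```python
import math


def recency_weight(days_ago: float, decay: float) -> float:
    """Weight w(t) = exp(-lambda * delta_t) for a match played days_ago days ago."""
    return math.exp(-decay * days_ago)


# Assumption for illustration only: a match loses half its weight every 30 days.
HALF_LIFE_DAYS = 30.0
DECAY = math.log(2) / HALF_LIFE_DAYS

# Weights for matches played today, a week ago, a month ago, and three months ago.
weights = {days: recency_weight(days, DECAY) for days in (0, 7, 30, 90)}
```

Under this assumed half-life, a match from three months ago carries one eighth of the weight of a match played today.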
The Regional Normalization Problem
Teams in highly competitive regions face tougher opponents on average, which can suppress their
cELO compared to teams in weaker regions with similar skill levels. Without normalization,
comparing teams across regions would be inherently unfair.
Normalization Process
The normalization adjusts each team's rating based on the strength of competition in their region:
$$ \text{Normalized cELO} = f(\text{cELO}, \text{Regional Metrics}, \text{Normalization Factor}) $$
The normalization factor is derived from comparing average event scores in the team's region
to top-performing regions worldwide, accounting for game difficulty variations.
Example
A team with cELO of 1200 competing in a high-strength region may normalize to 1250,
while a team with the same raw 1200 cELO in a lower-strength region might normalize to 1150.
This adjustment enables meaningful cross-region comparisons.
Use Cases
Normalized cELO is essential for:
- Cross-regional team comparisons and global rankings
- World Championship seeding and advancement predictions
- Identifying underrated teams from highly competitive regions
- Match outcome predictions between teams from different circuits
Performance
Cumulative Offensive Power Rating (cOPR)
While ELO measures the ability to win, cOPR measures the ability to score points.
It attempts to isolate an individual team's contribution to the alliance's total score, with
more weight given to recent events.
The Concept
In FTC, matches are played 2v2. We only see the total alliance score, not individual contributions.
cOPR solves a system of linear equations to estimate the average points a team contributes,
while giving exponentially more importance to recent events.
Derivation
We model the match scores as a linear system where the sum of the cOPRs of the two teams in an alliance
equals the alliance's score:
$$ \text{cOPR}_{Team1} + \text{cOPR}_{Team2} \approx \text{AllianceScore} $$
Over many matches, this creates an overdetermined system \( Ax = b \), which we solve using
Weighted Least Squares Regression. The most recent event receives weight 1.0, and each earlier
event's weight is multiplied by a further factor of 0.75, so current performance is emphasized.
Why Time-Weighted?
Teams improve throughout the season by optimizing autonomous routines, refining strategies, and
repairing hardware. cOPR's exponential decay ensures:
- Latest event: Full weight (1.0)
- Previous event: 75% weight (0.75)
- Two events ago: 56% weight (0.75²)
- And so on...
This approach makes cOPR more predictive of current performance than a simple average.
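The weighted least squares step from the derivation can be sketched as follows. This is a minimal pure-Python illustration, not the production implementation: the function names, the input tuple layout, and the team labels are hypothetical, and the small Gauss-Jordan solver stands in for whatever linear algebra routine is actually used.

```python
def solve_linear(matrix, vector):
    """Solve matrix * x = vector by Gauss-Jordan elimination with partial pivoting."""
    n = len(vector)
    aug = [row[:] + [vector[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: not enough independent matches")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                for c in range(col, n + 1):
                    aug[r][c] -= factor * aug[col][c]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def weighted_opr(results, decay=0.75):
    """Estimate per-team contributions from alliance scores.

    results: list of (events_ago, team_1, team_2, alliance_score) tuples,
    where events_ago is 0 for the latest event, 1 for the previous one, etc.
    """
    teams = sorted({t for _, a, b, _ in results for t in (a, b)})
    index = {team: i for i, team in enumerate(teams)}
    n = len(teams)
    # Accumulate the weighted normal equations (A^T W A) x = A^T W b.
    ata = [[0.0] * n for _ in range(n)]
    atb = [0.0] * n
    for events_ago, a, b, score in results:
        weight = decay ** events_ago
        pair = (index[a], index[b])
        for p in pair:
            atb[p] += weight * score
            for q in pair:
                ata[p][q] += weight
    return dict(zip(teams, solve_linear(ata, atb)))


# Hypothetical teams "A", "B", "C" that truly contribute 10, 20 and 30 points.
sample = [(0, "A", "B", 30), (0, "A", "C", 40), (1, "B", "C", 50)]
estimates = weighted_opr(sample)
```

With perfectly consistent scores like these, the estimates recover each team's contribution regardless of the weights; the weighting only matters once real, noisy scores disagree across events.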
Trend
Momentum
Momentum measures the rate of improvement of a team over time.
It answers the question: "Is this team getting better, getting worse, or staying the same?"
Calculation Methodology
We perform a Weighted Least Squares (WLS) regression on the team's match scores over
time.
Unlike a standard average, this method assigns higher weights to more recent matches.
The slope of this regression line represents the points-per-match improvement. This slope is then
normalized onto a 0-100 scale:
- > 50: Improving performance (positive slope)
- 50: Stable performance (flat slope)
- < 50: Declining performance (negative slope)
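The slope calculation and the 0-100 mapping can be sketched as below. The text does not specify the weights or the normalization, so the per-match decay of 0.9, the `tanh` squashing, and the `scale` parameter are all assumptions chosen only so that a flat slope lands on 50.

```python
import math


def weighted_slope(scores, decay=0.9):
    """Weighted least squares slope of score versus match index (oldest first)."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two matches to fit a slope")
    xs = range(n)
    # Assumed weighting: the newest match has weight 1, older ones decay geometrically.
    weights = [decay ** (n - 1 - i) for i in xs]
    total = sum(weights)
    x_mean = sum(w * x for w, x in zip(weights, xs)) / total
    y_mean = sum(w * y for w, y in zip(weights, scores)) / total
    sxy = sum(w * (x - x_mean) * (y - y_mean)
              for w, x, y in zip(weights, xs, scores))
    sxx = sum(w * (x - x_mean) ** 2 for w, x in zip(weights, xs))
    return sxy / sxx


def momentum(scores, scale=10.0, decay=0.9):
    """Map the slope onto 0-100 with 50 meaning flat (assumed tanh mapping)."""
    return 50.0 + 50.0 * math.tanh(weighted_slope(scores, decay) / scale)


improving = momentum([10, 20, 30, 40])
declining = momentum([40, 30, 20, 10])
stable = momentum([30, 30, 30])
```

Any monotonic mapping that sends a zero slope to 50 would satisfy the scale described above; `tanh` is used here only because it is bounded.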
Reliability
Consistency Index
The Consistency Index quantifies how reliably a team performs near their average.
A high consistency score means the team rarely has "bad matches," while a low score indicates
volatility.
Derivation
This metric is derived from the Coefficient of Variation (CV), which is the ratio of
the standard deviation \( \sigma \) to the mean \( \mu \):
$$ CV = \frac{\sigma}{\mu} $$
We invert and scale this value to a 0-100 range. A CV of 0 (perfect consistency) maps to 100.
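One way to invert and scale the CV is sketched here. The exact mapping is not given in the text, so the linear form `100 * (1 - CV)`, clamped at 0, is an assumption; it only guarantees the stated property that a CV of 0 maps to 100.

```python
import statistics


def consistency_index(scores):
    """Consistency on a 0-100 scale from the coefficient of variation."""
    mean = statistics.fmean(scores)
    if mean <= 0:
        # CV is undefined for a non-positive mean; treated as no consistency here.
        return 0.0
    cv = statistics.pstdev(scores) / mean
    # Assumed mapping: linear inversion, clamped so the index never goes below 0.
    return 100.0 * max(0.0, 1.0 - cv)


steady = consistency_index([50, 50, 50])
mixed = consistency_index([40, 60])
```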
Penalties
Foul cOPR
Foul cOPR estimates the average number of penalty points a team gives to the opposing
alliance per match.
Why cOPR?
Just like scoring, penalties are reported for the whole alliance, not individuals.
By applying the same cOPR logic to penalty points, we can isolate which team is likely committing the
fouls.
Note: A lower Foul cOPR is better.
Ranking Points
RP Reliability
Ranking Points (RP) determine a team's rank in the tournament.
RP Reliability estimates the probability of a team earning specific RPs (Movement, Goal, Pattern) in
their next match.
Bayesian Inference with Recency
We use a probabilistic approach that blends:
- Historical Performance: The team's long-term success rate.
- Recency Bias: Recent matches are weighted significantly higher using exponential
decay.
- Bayesian Smoothing: We add "dummy trials" to prevent 0% or 100% probabilities from
small sample sizes (e.g., 1 match played).
This results in a robust probability score that adapts quickly to new strategies (like a new auto path)
without overreacting to a single failure.
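The blend of recency weighting and smoothing can be sketched as a weighted success rate with pseudo-counts. The decay of 0.8 per match and the single dummy success and dummy failure are assumed values for illustration, not the parameters used by the system.

```python
def rp_probability(outcomes, decay=0.8, prior_successes=1.0, prior_failures=1.0):
    """Estimate the chance of earning a given RP in the next match.

    outcomes: list of booleans, oldest match first, True if the RP was earned.
    """
    n = len(outcomes)
    # The newest match has weight 1; each older match is discounted by `decay`.
    weights = [decay ** (n - 1 - i) for i in range(n)]
    weighted_successes = sum(w for w, earned in zip(weights, outcomes) if earned)
    weighted_total = sum(weights)
    # The dummy trials keep the estimate away from exactly 0% or 100%.
    return (weighted_successes + prior_successes) / (
        weighted_total + prior_successes + prior_failures
    )


no_history = rp_probability([])
one_success = rp_probability([True])
recent_success = rp_probability([False, True])
recent_failure = rp_probability([True, False])
```

A team with one successful match is rated at about 67% rather than 100%, and a recent success counts for more than an older one.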
Event Analysis
Event Difficulty
Event Difficulty is a composite metric that quantifies how challenging an event is based on the
strength of the competing teams. This helps contextualize team performance and predict advancement
outcomes.
Calculation Methodology
The Event Difficulty score is calculated using a weighted combination of several factors:
- Average Team ELO: The mean cELO of all teams competing at the event.
Higher average ELO indicates stronger overall competition.
- Top-End Strength: The ELO ratings of the top-performing teams (typically top 25%).
Events with multiple elite teams are rated as more difficult.
- Competitive Depth: The distribution and variance of team ratings. Events with
consistent high-level competition throughout the field score higher than those with a few
standout teams and many weak teams.
Difficulty Scale
The difficulty score is normalized to a 0-10 scale and labeled accordingly:
- 9.0-10.0: High (Championship-level competition)
- 7.0-8.9: Medium (Strong regional events)
- Below 7.0: Low (Standard qualifier events)
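The labeling step can be sketched as a simple threshold lookup. The function name is hypothetical, and the sketch assumes scores between 8.9 and 9.0 fall into the Medium band.

```python
def difficulty_label(score: float) -> str:
    """Map a 0-10 Event Difficulty score to its label."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("difficulty score must be between 0 and 10")
    if score >= 9.0:
        return "High"
    if score >= 7.0:
        return "Medium"
    return "Low"


labels = [difficulty_label(s) for s in (9.4, 7.5, 5.0)]
```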
Applications
Event Difficulty is used to:
- Normalize team performance across different events
- Predict advancement probabilities for Championship events
- Adjust ELO K-factors for more accurate rating updates
- Help teams strategize for event selection and preparation