Trying to beat 538's NFL predictions

I’ve always loved looking through 538’s sports predictions, so I was thrilled to see they’ve set up a forecasting challenge for this year’s NFL season.

The competition is super easy to join - just log in with Google or Facebook and use the sliders to set your predictions for each game.

Starting simple with Week 1, I decided to go solely off of last year’s Elo ratings. I used nflscrapR to pull data and the elo package to calculate Elo ratings. In short, Elo ratings rank teams based on schedule and game outcomes; 538’s version also accounts for point margin, which the simple win/loss version below does not. To learn more about the Elo methodology, check out this post on 538’s own NFL predictions.
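For a feel of the mechanics, here is a minimal base-R sketch of a single Elo update. It uses the standard Elo formulas rather than the elo package's internals; the function names `elo_expected` and `elo_update` and the 1500 starting ratings are my own illustrative choices, and k = 20 simply mirrors the value passed to elo.run() further down.

```r
# Expected score for team A against team B (standard Elo logistic curve)
elo_expected <- function(rating_a, rating_b) {
    1 / (1 + 10^((rating_b - rating_a) / 400))
}

# One rating update after a game.
# `score_a` is 1 for a win, 0.5 for a tie, and 0 for a loss.
elo_update <- function(rating_a, rating_b, score_a, k = 20) {
    expected_a <- elo_expected(rating_a, rating_b)
    shift <- k * (score_a - expected_a)
    # Elo is zero-sum: whatever team A gains, team B gives up
    c(a = rating_a + shift, b = rating_b - shift)
}

# Two hypothetical, evenly rated teams; team A wins
elo_update(1500, 1500, score_a = 1)
# a = 1510, b = 1490
```

Beating a stronger opponent moves the ratings more than beating a weaker one, which is how strength of schedule ends up baked into the final numbers.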

# devtools::install_github(repo = 'maksimhorowitz/nflscrapR')
library(elo)
library(knitr)
library(nflscrapR)
library(tidyverse)
reg_games <- scrape_game_ids(2018, type = 'reg')
postseason_games <- scrape_game_ids(2018, type = 'post')
week_1 <- scrape_game_ids(2019, type = 'reg', weeks = 1)

season_data <- rbind(reg_games, postseason_games) %>%
    filter(home_team != 'NPR') %>% # remove pro bowl
    select(home_team, away_team, week, home_score, away_score)

head(season_data)
##   home_team away_team week home_score away_score
## 1       PHI       ATL    1         18         12
## 2       BAL       BUF    1         47          3
## 3       NYG       JAX    1         15         20
## 4        NO        TB    1         40         48
## 5        NE       HOU    1         27         20
## 6       MIN        SF    1         24         16

Armed with data on each 2018 regular season and postseason game, I calculated each team’s final Elo rating.

elos <- elo.run(score(home_score, away_score) ~ home_team + away_team, data = season_data, k = 20) %>%
    final.elos() %>%
    tibble(team = names(.),
           elo = .) %>%
    arrange(-elo)

head(elos)
## # A tibble: 6 x 2
##   team    elo
##   <chr> <dbl>
## 1 LA    1584.
## 2 NO    1579.
## 3 NE    1578.
## 4 LAC   1564.
## 5 KC    1560.
## 6 CHI   1557.

Oddly enough, even though New England beat LA in the Super Bowl, the Rams still finished with a higher end-of-season Elo rating. This highlights a limitation of the Elo methodology: the Rams had both a tougher schedule and one more win than the Patriots, and a single head-to-head result was not enough to offset that.

Once I had the Elo ratings, using them to make predictions was really straightforward. I created a wrapper around the elo.prob() function and ran it against the 2019 Week 1 schedule.

nfl_elo <- function(team1, team2) {
    elo.prob(elos$elo[elos$team == team1], elos$elo[elos$team == team2])
}
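
One caveat with a lookup-based wrapper like this: if a team code is missing from the ratings table, a subset such as `elos$elo[elos$team == team1]` quietly returns a zero-length vector instead of raising an error. Below is a more defensive sketch. To keep it runnable without any packages, it uses a hypothetical `ratings` table filled with the rounded values printed earlier and a hand-rolled standard Elo formula in place of elo.prob(); `get_elo` and `nfl_elo_safe` are illustrative names, not part of the elo package.

```r
# Hypothetical stand-in for `elos`, using the rounded ratings printed earlier
ratings <- data.frame(
    team = c('LA', 'NO', 'NE'),
    elo = c(1584, 1579, 1578),
    stringsAsFactors = FALSE
)

# Look up one team's rating, failing loudly on unknown or duplicated codes
get_elo <- function(team, ratings) {
    rating <- ratings$elo[ratings$team == team]
    if (length(rating) != 1) {
        stop('Expected exactly one rating for team: ', team)
    }
    rating
}

# Win probability for team1 using the standard Elo formula
# (assumed here to match what elo.prob() computes)
nfl_elo_safe <- function(team1, team2, ratings) {
    diff <- get_elo(team2, ratings) - get_elo(team1, ratings)
    1 / (1 + 10^(diff / 400))
}

nfl_elo_safe('LA', 'NO', ratings)     # just over 0.5, since LA is rated 5 points higher
# nfl_elo_safe('LA', 'LAR', ratings)  # stops with an error: 'LAR' is not in the table
```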

predictions_538 <- c(
    'CHI' = 64, 
    'LA' = 52,
    'PHI' = 77, 
    'NYJ' = 55, 
    'MIN' = 59, 
    'BAL' = 61, 
    'KC' = 58, 
    'CLE' = 60, 
    'LAC' = 72, 
    'SEA' = 75,
    'TB' = 55,
    'DAL' = 74,
    'DET' = 51,
    'NE' = 68,
    'NO' = 68,
    'OAK' = 51
)

predictions <- week_1 %>%
    mutate_at(c('home_team', 'away_team'), as.character) %>%
    rowwise() %>%
    mutate(home_prob = nfl_elo(home_team, away_team),
           away_prob = 1 - home_prob,
           prediction_me = ifelse(home_prob > .5, home_team, away_team),
           probability_me = scales::percent(max(home_prob, away_prob), accuracy = 1)) %>%
    ungroup() %>%
    mutate(matchup = row_number()) %>% 
    mutate(prediction_538 = names(predictions_538),
           probability_538 = scales::percent(predictions_538 / 100, accuracy = 1)) %>% 
    select(home_team, away_team, prediction_me, prediction_538, probability_me, probability_538) %>% 
    mutate(outcome = 'TBD')

kable(predictions)
|home_team |away_team |prediction_me |prediction_538 |probability_me |probability_538 |outcome |
|:---------|:---------|:-------------|:--------------|:--------------|:---------------|:-------|
|CHI       |GB        |CHI           |CHI            |62%            |64%             |TBD     |
|CAR       |LA        |LA            |LA             |65%            |52%             |TBD     |
|PHI       |WAS       |PHI           |PHI            |57%            |77%             |TBD     |
|NYJ       |BUF       |BUF           |NYJ            |55%            |55%             |TBD     |
|MIN       |ATL       |MIN           |MIN            |54%            |59%             |TBD     |
|MIA       |BAL       |BAL           |BAL            |58%            |61%             |TBD     |
|JAX       |KC        |KC            |KC             |65%            |58%             |TBD     |
|CLE       |TEN       |TEN           |CLE            |53%            |60%             |TBD     |
|LAC       |IND       |LAC           |LAC            |53%            |72%             |TBD     |
|SEA       |CIN       |SEA           |SEA            |59%            |75%             |TBD     |
|TB        |SF        |TB            |TB             |51%            |55%             |TBD     |
|DAL       |NYG       |DAL           |DAL            |62%            |74%             |TBD     |
|ARI       |DET       |DET           |DET            |58%            |51%             |TBD     |
|NE        |PIT       |NE            |NE             |58%            |68%             |TBD     |
|NO        |HOU       |NO            |NO             |56%            |68%             |TBD     |
|OAK       |DEN       |DEN           |OAK            |54%            |51%             |TBD     |

Using these probabilities, I filled in my predictions on the 538 website. Since 538 also uses Elo ratings (though a more complicated version), our predictions weren’t drastically different in terms of outcome: we picked the same winner in 13 of the 16 games, though theirs were typically more confident. Since their competition penalizes overconfidence, this could go either way. Check back next week to see how I fared!
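
To make the overconfidence penalty concrete, here is a small sketch. It assumes a Brier-style rule of 25 - 100 * (p - outcome)^2 points per game, which is how I understand 538's game to be scored. Treat the constants as an assumption rather than the official rules; `game_points` is just an illustrative name.

```r
# Points for one pick under an assumed Brier-style rule.
# `prob` is the probability given to the team you picked,
# `won` is TRUE if that team won.
game_points <- function(prob, won) {
    outcome <- as.numeric(won)
    25 - 100 * (prob - outcome)^2
}

# The same pick at two confidence levels (65% vs 52%)
confidence <- c(0.65, 0.52)
data.frame(
    confidence = confidence,
    if_right = game_points(confidence, TRUE),
    if_wrong = game_points(confidence, FALSE)
)
```

Under this rule a 65% pick earns 12.75 points if it hits and loses 17.25 if it misses, while a 52% pick only moves the score by about 2 points either way. A more confident forecast pays off only if it is right often enough to cover the larger losses.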

RCharlie
Data Scientist
