FPI ratings courtesy of ESPN.
All other statistics are courtesy of Football Outsiders, home of the F/+ Combined Ratings for college football.
The Fremeau Efficiency Index (FEI) was created by Brian Fremeau; check out his website BCFToys for other goodies.
The S&P+ rating was created by Bill Connelly; check out his college football analytics blog, Football Study Hall.
We are just under a week out now, which means all your favorite football-related columns at RBR will be returning over the next couple of weeks. In my case, that means Processing the Numbers is back this Thursday for the Wisconsin preview, and I know you are all as excited about that as I am. I will also note here that, per your feedback, there will be a version of Advanced Stats Rundown this season for football, although I think that will consist more of previews of the other SEC games and national games of interest with a lesser emphasis on the inter/intraconference standings that tend to dominate the basketball version of that series.
But for now, there have been developments in the college football advanced metrics arena that we need to discuss. I alluded to this at the start of the scheduling series over the summer, but my “mentor” Bill C. has been busy this offseason, due in part to a full redesign/overhaul of the S&P+ metric. This is important as the majority of metrics discussed in this space are either built on or are components of S&P+, and the underlying methodology in the new version is significantly different from before. Bill has written about this numerous times over at Football Study Hall, and those articles are all linked below. What I’m doing today is boiling it down into something a bit more manageable than the multi-month sprawl at FSH and elsewhere, and looking at how the redesign recast last year’s results. Also, based on my experience during basketball season, there’s another metric I’ll be discussing in this space going forward that will be introduced at the end of the article.
How we got here — The Five Factors
Just after the 2013 season, Bill developed a concept he’s called The Five Factors. Inspired by work Dean Oliver did for basketball a decade ago, the Five Factors seek to identify the essence of winning football games — what do you have to do to win? Turns out those factors are explosiveness, efficiency, winning field position, finishing drives, and limiting turnovers. Put another way:
You make more big plays than your opponent, you stay on schedule, you tilt the field, you finish drives, and you fall on the ball.
Through a tremendous amount of effort I won’t get into here (but is available below if you’re interested), Bill boiled down each factor into a set of measurable statistics, with the eventual intent to overhaul the ratings metric. The breakdown is as follows:
- Efficiency — Success Rate
- Explosiveness — IsoPPP
- Field Position — Turnover Margin, Success Rate, Kick Margin, Punt Margin
- Finishing Drives — IsoPPP, Red Zone Success Rate, Success Rate, FG Efficiency
- Turnovers — Turnover Margin, Sack Rate
Success Rate should be familiar to everyone reading this, as should Turnover Margin. Kick Margin, Punt Margin, and FG Efficiency all derive from components of Brian Fremeau’s FEI Special Teams Ratings that made an appearance during last year’s bowl previews; Sack Rate was discussed at the same time. Finally, IsoPPP is related to Equivalent Net Points per Play (PPP), which was a primary component of the old S&P+. PPP is based on value accrued over all plays, whereas IsoPPP is calculated only over successful plays. That distinction lets IsoPPP isolate explosiveness from efficiency, which were muddled together within PPP — a separation that is highly beneficial for this type of analysis.
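For concreteness, Success Rate can be computed straight from play-by-play data. Here’s a minimal sketch in Python using the standard published thresholds (a play “succeeds” if it gains 50% of needed yards on first down, 70% on second, or 100% on third and fourth); the play data below is made up purely for illustration:

```python
def is_success(down, distance, yards_gained):
    """A play is successful if it gains 50% of needed yards on 1st down,
    70% on 2nd down, or 100% on 3rd/4th down."""
    thresholds = {1: 0.5, 2: 0.7, 3: 1.0, 4: 1.0}
    return yards_gained >= thresholds[down] * distance

def success_rate(plays):
    """Fraction of plays that qualify as successful.
    Each play is a (down, distance, yards_gained) tuple."""
    hits = sum(is_success(d, dist, yds) for d, dist, yds in plays)
    return hits / len(plays)

# Hypothetical four-play sequence:
plays = [(1, 10, 6), (2, 4, 2), (3, 2, 3), (1, 10, 2)]
print(success_rate(plays))  # 0.5 -- two of the four plays succeed
```

The opponent adjustments and garbage-time filters that go into the real metric are omitted here; this is just the raw definition.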
Another major thrust of this redesign was to place a heavier emphasis on standard deviations and the normal distribution, which many college football statistics follow, including points scored and the percentage of points scored over a season. The old S&P+ used standard deviations in places, but they weren’t directly tied into the framework — now they are.
1 | i.e., if a team wins by an average of 28-14, they scored 67% of the points over the course of the season.
There are numerous other little adjustments present now — including the all-important opponent adjustments — but I’ve hit the highlights. The real questions, of course, are whether it’s accurate and what can be done with the numbers.
When Bill simulated the 2013 and 2014 seasons and made picks based on the new S&P+, the results were pretty encouraging — 54.5% against the spread and 76% straight up. Going farther back through 2011 drops the percentage against the spread to 53.2%, but that’s just about as good as anything else you’re going to find out there.
The New S&P+ Ratings
Below are the end-of-season top ten S&P+ ratings from 2014, incorporating the redesign we just discussed:
| Team | 2nd Ord. Wins | S&P+ Percentile | S&P+ Margin | SOS | Weighted S&P+ |
| --- | --- | --- | --- | --- | --- |
| OHIO STATE | 12.8 (-1.2) | 99.5% | 30.2 (1) | 1.46 (5) | 26.1 (1) |
| ALABAMA | 12.3 (0.3) | 99.2% | 28.3 (2) | 1.15 (14) | 21.9 (6) |
| OREGON | 12.9 (-0.1) | 97.9% | 23.7 (3) | 0.49 (43) | 23.3 (3) |
| API | 8.8 (-0.2) | 97.8% | 23.6 (4) | 2.22 (1) | 5.2 (45) |
| ARKANSAS | 8.1 (1.1) | 97.6% | 23.1 (5) | 1.71 (4) | 22.1 (5) |
| OLE MISS | 9.1 (0.1) | 97.6% | 23 (6) | 1.75 (3) | -0.2 (77) |
| GEORGIA | 10.4 (0.4) | 97.3% | 22.6 (7) | 0.49 (42) | 14.4 (14) |
| TCU | 11.4 (-0.6) | 95.2% | 19.4 (8) | -0.48 (88) | 14.4 (13) |
| UCLA | 8.2 (-1.8) | 94.8% | 19 (9) | 1.9 (2) | 9.9 (27) |
| GEORGIA TECH | 10.1 (-0.9) | 94.5% | 18.7 (10) | 1.36 (9) | 22.6 (4) |

Numbers in parentheses indicate national ranking, except in the 2nd Ord. Wins column, where they are win differentials (explained below).
So all of this is new and exciting! Let’s discuss the columns in order:
Second Order Wins
Second order wins are related to Pythagorean expectation, a concept those of you who read the basketball version of this series are already familiar with. Pythagorean expectation shows up frequently in advanced sports statistics, as it is a quick method to estimate what a team’s winning percentage should be based on offensive and defensive performance. In baseball, this is done with runs scored and allowed. In basketball, it’s points scored and allowed; for PTN Basketball we use Ken Pomeroy’s adjusted scoring efficiencies. You can apply it to any sport, really; you just have to change the exponents involved as appropriate. Taking the difference between your Pythagorean record and your actual record gives you a handy measure of luck.
2 | If you aren’t, here’s a refresher.
Second order wins takes this a step further — instead of looking at expected winning percentage based on actual scoring, second order wins looks at expected winning percentage according to what you should have scored based on the qualities of your team. This teases out more of the impact of luck, so you get a more accurate picture of how fortunate a particular team has been to that point in the season.
3 | I should note that this isn’t new to football analytics, it’s just new to S&P+. Something like this based on FEI has been available for some time.
4 | Undoubtedly the blogosphere will utilize and interpret this information in a sane and civilized manner.
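The Pythagorean piece is simple enough to sketch in Python. One caveat: the exponent is sport-specific and tuned empirically, and 2.37 is just a commonly cited football value used here as an assumption, not necessarily what S&P+ uses. Second order wins then apply the same formula, but with expected points swapped in for actual points:

```python
def pythagorean_win_pct(points_for, points_against, exponent=2.37):
    """Pythagorean expectation: estimated winning percentage from
    scoring alone. The exponent (2.37 here) is an illustrative,
    commonly cited football value, not the official S&P+ parameter."""
    pf = points_for ** exponent
    pa = points_against ** exponent
    return pf / (pf + pa)

# A team that averages a 28-14 margin over a 13-game season:
pct = pythagorean_win_pct(28, 14)
expected_wins = pct * 13
luck = 11 - expected_wins  # actual wins minus expected wins
```

A positive `luck` value here would mean the team won more games than its scoring suggested it should have.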
Looking at last year’s top ten, we see a couple of interesting results. The number in parentheses in that column is the win differential from the team’s final record. For example, Alabama ended up 12-2 with a second order win total of 12.3, so their differential was 0.3. A positive differential indicates underachievement or “bad” luck, whereas a negative differential indicates overachievement or “good” luck. Note that Ohio State’s 14-1 record was over a win better than expected, which probably doesn’t surprise anyone here. Conversely, Arkansas was over a full win worse than expected, which isn’t surprising either. In case you were curious, the fourth playoff participant conspicuously absent from the chart above? FSU posted 9.5 second order wins, a full 3.5 short of the 13 games they actually won — the largest negative differential in the country last year.
Shocking, I know.
S&P+ Percentile

This is the fun part of that bit we discussed earlier about fitting everything to the normal distribution. This should also be a familiar concept — this is the exact same stuff you saw on your standardized test results back in the day. Note that both Ohio State and Alabama were in the 99th percentile last season, which is about as high as you can go in a metric like this. No disrespect to Oregon and the nearly-as-outstanding season they had, but last year’s playoff semifinal between the Tide and Buckeyes, much like a handful of SEC title games in years past, appears to have been the actual national championship. Also, yes, that is four SEC West members in the top six. SEC West was totally overrated last year, you guys.
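Mechanically, a percentile like this falls straight out of the normal-distribution framing: express a team’s rating as a z-score against the field, then run it through the normal CDF. A minimal sketch (the exact inputs the real metric uses aren’t public, so treat this as illustrative):

```python
from statistics import NormalDist

def rating_percentile(team_rating, all_ratings):
    """Percentile of one team's rating, assuming ratings are roughly
    normally distributed across FBS (the article's stated premise)."""
    n = len(all_ratings)
    mean = sum(all_ratings) / n
    var = sum((r - mean) ** 2 for r in all_ratings) / n
    z = (team_rating - mean) / var ** 0.5   # standard deviations above average
    return NormalDist().cdf(z) * 100        # convert z-score to a percentile
```

A team exactly at the FBS average would land at the 50th percentile; a team two standard deviations above would be around the 97.7th.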
S&P+ Margin

This column is the “actual” S&P+ rating, as one intent of all this effort was to move away from the previous basis of the metric — normalizing everything to an average of 100 — to presentation as an adjusted scoring margin based on a particular season’s scoring curve. Interpreted another way, the above chart indicates Alabama would have been expected to outscore the average team by 28.3 points on a neutral field last season. This rating is directly tied to the offensive and defensive S&P+ components; it is simply the difference between the two. Those columns were omitted from the chart above for ease of presentation, but, for example, Alabama rated 43.1 in the offensive metric and 14.8 in the defensive metric, the difference of which is 28.3. Simplicity is wonderful.
5 | Texas Tech, cooooome on down!
SOS (Strength of Schedule)
Nothing crazy here. From the Football Outsiders page:
Strength of Schedule rating (SOS): A simple schedule measure based on average S&P+ ratings and normal distributions.
Basically, take the average S&P+ rating of everyone’s schedule, take the resulting distribution and find the z-score of a given team’s schedule, and you have the number of standard deviations above or below the average schedule for that schedule. API was tops as you can see, with Appalachian State bringing up the rear. There are some issues with that approach as we’ve discussed at various times in the past, but it’s good enough, and anything better requires a little too much touch time to automate for all the FBS teams on a week-to-week basis.
6 | Northwestern, cooooome on down!
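That quoted definition is simple enough to sketch: average each team’s opponent ratings, then express each schedule average as a z-score against the distribution of all schedule averages. A minimal illustration with made-up ratings:

```python
def strength_of_schedule(schedules):
    """schedules: dict mapping team -> list of opponents' S&P+ ratings.
    Returns each team's SOS as a z-score: standard deviations above or
    below the average schedule, per the quoted definition."""
    avgs = {team: sum(r) / len(r) for team, r in schedules.items()}
    mean = sum(avgs.values()) / len(avgs)
    var = sum((a - mean) ** 2 for a in avgs.values()) / len(avgs)
    std = var ** 0.5
    return {team: (a - mean) / std for team, a in avgs.items()}

# Hypothetical three-team example:
sos = strength_of_schedule({
    "A": [10.0, 10.0],   # faced strong opponents
    "B": [0.0, 0.0],     # faced average opponents
    "C": [-10.0, -10.0], # faced weak opponents
})
```

Team B lands at exactly 0 (a perfectly average schedule), with A positive and C negative, mirroring how API tops the real list while Appalachian State brings up the rear.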
Weighted S&P+

This is a metric that weights games at the end of the season more heavily than those at the beginning, with the intent of providing a measure of how “hot” a team was to close a season. This is handy, as it allows you to put the regular S&P+ rating in the proper perspective. Ohio State was far and away the hottest team in the country to close the season, which is how they almost flew under the radar heading into the playoff only to end up winning the whole thing. Conversely, after Ole Miss was atop these sorts of lists prior to their game against API, the Laquon Treadwell injury unraveled the most promising Rebel season in over 50 years, one that culminated in an embarrassing drubbing at the hands of TCU in the Peach Bowl — note how Ole Miss is the only team in the top ten with a negative Weighted S&P+. I’m not sure if this one will be tracked throughout the season or will only appear at the end of the season, but either way this is extremely useful information.
7 | Based on their weak pre-playoff schedule and uneven early season results — the advanced metrics foretold their eventual success.
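Bill hasn’t published the exact weighting scheme, so treat the following as a purely hypothetical sketch of the general idea: discount each game’s single-game margin by how long ago it was played, so late-season performance dominates the rating. The `decay` parameter and geometric weighting are my own illustrative assumptions:

```python
def weighted_rating(game_margins, decay=0.9):
    """Illustrative recency weighting only -- NOT the actual Weighted
    S&P+ formula, which isn't public. game_margins is a list of
    single-game adjusted margins in chronological order; each game is
    weighted by decay**(games_ago), so recent games count the most."""
    n = len(game_margins)
    weights = [decay ** (n - 1 - i) for i in range(n)]
    total = sum(w * m for w, m in zip(weights, game_margins))
    return total / sum(weights)

# A team that started cold but finished hot rates higher than its
# simple average margin would suggest:
hot_finish = weighted_rating([-3, 0, 7, 14, 21], decay=0.8)
```

With this kind of weighting, a team like 2014 Ohio State, whose best football came in November and the playoff, rates well above its season-long average margin.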
Another New Metric — ESPN’s Football Power Index (FPI)
Well, new to this column, anyway — FPI has been around for a couple of years now, and even shows up in Bama Bobblehead’s work from time to time. The college basketball equivalent to this metric is BPI, and that featured rather prominently in the basketball version of this series. I liked the perspective it gave there, so I decided to fold in the football version here.
Unfortunately, unlike Bill and (to a lesser extent) Brian Fremeau, ESPN is not as forthcoming with respect to the technical details of their proprietary metrics, which is understandable. What I know about it is essentially what can be read here, which is not much. Like the new S&P+, it’s presented as a scoring margin, and weights factors such as offensive, defensive, and special teams efficiencies, as well as turnovers and big plays, and also includes opponent adjustments. The preseason ratings factor in information on returning starters, recruiting, and coaching tenure, components which are presumably phased out over the course of the season as with the other advanced metrics.
There’s certainly no shortage of metrics out there which seek to evaluate and quantify team strength in college football, but in the interest of keeping this column reasonable, FPI will probably be the last addition for some time. We could discuss Sagarin, Massey, etc. every week, but it wouldn’t provide a better viewpoint than FEI, S&P+, and FPI already do. As noted, look for the Wisconsin preview on Thursday, along with a short piece on some changes coming to Charting the Tide between now and then.
8 | Hopefully. This was supposed to be short too.