Should I use this projection system or that one? Why mess around with the second-best system if you can easily determine the best, right?
If you search the web, you can locate previous studies that review the accuracy of baseball’s many projection models.
- Monster 2013 Projection Review, Rotovalue.com
- Evaluating 2013 Projections, Fangraphs Community, Will Larson
- Fantasy Baseball Accuracy, FantasyPros.com
- Fantasy Baseball Projections Review – 2012, Razzball.com
I Don’t Have Time To Read All That. Just Tell Me What They Say.
Understood. Here’s my summary:
- There are a lot of different approaches to projecting stats (Marcel, Steamer, ZiPS, Oliver, PECOTA, etc.)
  - Basic three-year weighted average with regression to league average
  - Weighted averages over more than three years that incorporate more advanced component metrics
  - Crowd sourcing
  - Aging curves
  - Similar player modelling
- No single projection system is consistently better than the others in all the stat categories we care about for fantasy baseball
- The most accurate projection model changes from year to year
- But there are some that consistently perform well
- Some systems do well in projecting offensive statistics
- Some are better at pitching
What Is Also True
A lot of research has been done on the effectiveness of combining or “aggregating” different projections or forecasts into one. This research was not done with only fantasy baseball in mind, but we can take advantage of it. Here’s one very interesting article on the topic (it’s from a website named “forecastingprinciples.com” and is a PDF of a study from the Wharton School of Business at Penn, so it has to be legit, right?).
The thinking behind aggregating projections is that the wisdom of many intelligent people looking over a lot of information can lead to better results than just one isolated model for projecting future results. When you average several projections together, you naturally dampen the outliers from the individual models, and hopefully you also improve the accuracy as a whole.
The Actual Results
It may not be appropriate to boil a 15-page research paper down to a couple of sentences. But I’m going to do it anyway! Here’s what the PDF linked above concludes about the value of combining forecasts:
Combined forecasts are more accurate than the typical component forecast in almost all situations studied to date. Sometimes the combined forecast will surpass the best method.
So there you have it.
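If you want a little intuition for why this works, here is a quick sketch in my own notation (not the paper’s): call the individual projections f_1 through f_n and the actual result y. For absolute error, the triangle inequality guarantees that an equal-weight average can never do worse than the average of the individual errors:

$$\left|\frac{1}{n}\sum_{i=1}^{n} f_i - y\right| \le \frac{1}{n}\sum_{i=1}^{n}\left|f_i - y\right|$$

A made-up example: two systems project 22 and 30 home runs and the player hits 25. The individual misses are 3 and 5, but the average projection of 26 misses by only 1, which beats both. That happens whenever the systems miss in opposite directions. When they all miss in the same direction, the average simply lands at the typical miss.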
Suggestions on How To Combine
The article suggests the following:
- Combine forecasts using different methods
- Combine forecasts using different input data
- Use at least five methods, when possible
- Use formal procedures for combining (a mechanical, structured method)
- Use equal weights in the combination unless you have strong evidence to support unequal weighting
And these are the situations when combining is the most beneficial:
- Uncertainty about which forecasting method is the most accurate
- Uncertainty in the forecasting environment
- There is a high cost for large forecast errors
Seems to me that baseball statistics are ripe for this.
We have a lot of different forecasting methods available to us. They use a variety of different inputs. We can easily combine them using a simple mechanical approach. We are uncertain which method is the most accurate. There is great uncertainty in projecting baseball stats. And bragging rights (and maybe a few dollars) in a fantasy baseball league are surely a “high cost”.
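To show how simple the “mechanical approach” can be, here is a minimal Python sketch of an equal-weight average across five systems. Everything in it is an assumption for illustration: the system names, the player names, and the home run numbers are made up, and `combine_equal_weight` is just a name I picked. It is not the output of any real projection system.

```python
from statistics import mean

# Made-up home run projections, for illustration only.
# Keys are hypothetical system names; values map player -> projected HR.
projections = {
    "system_a": {"Player X": 28, "Player Y": 15},
    "system_b": {"Player X": 32, "Player Y": 11},
    "system_c": {"Player X": 30, "Player Y": 16},
    "system_d": {"Player X": 27, "Player Y": 14},
    "system_e": {"Player X": 33},  # this system has no projection for Player Y
}


def combine_equal_weight(projections):
    """Average each player's projection across every system that lists him."""
    players = set()
    for system in projections.values():
        players.update(system)

    combined = {}
    for player in sorted(players):
        # Equal weights: every system that projects the player counts once.
        values = [s[player] for s in projections.values() if player in s]
        combined[player] = mean(values)
    return combined


combined = combine_equal_weight(projections)
for player, hr in combined.items():
    print(f"{player}: {hr:.1f} HR")
```

Note the one design choice here: a player missing from a system is averaged over the systems that do project him, rather than being dropped or counted as zero. Unequal weights would be an easy change, but per the suggestions above, you would want strong evidence before going there.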
Coming soon, I’ll give you a sneak peek at an Excel tool I’m developing that will simplify the process of averaging multiple projections.