Excel - Smart Fantasy Baseball

How to Use Excel to Determine Replacement Level

Who is the replacement level shortstop in a 12-team league that starts one shortstop and one middle infield position?

We know there will be at least 12 shortstops drafted in this scenario. But will there be 15, 16, 17, 18, or more drafted? And where does that put the replacement level shortstop at?

This concept of replacement level has always been a little bit of a problem for me. In my original series about ranking players, I mentioned that in this 12-team scenario, that we would have 36 combined 2B and SS drafted, and to simplify things we could assume that would be 18 second basemen and 18 shorststops.

But that’s not a precise enough answer.

If we’re trying to squeeze every drop of value from our drafts, we should determine precisely who the replacement level player is at each position. After all, replacement level is a huge driver in the calculation of a player’s value.

So we need to get it right.

What You Can Expect

I’m going to show you a system I’ve started using that will help you identify:

The starters at each position (e.g. top 12 1B, top 60 OF, etc.)
The corner and middle infielders (the next 12 best 1B/3B and 2B/SS)
The 12 utility players (the next 12 best players at any position)
The replacement level player at each position

The system is very easy to do. I was forced to come up with it out of necessity when I was working on my recent analysis of the past five years of draft results. For that post I had to calculate projected and actual dollar values for each of the last five seasons. So I needed a fool-proof method for determining replacement level 10 times in a short period of time and I also wanted to be able to come back to each set of data and easily be able to tell what group each player fell into. Thus the color coding.

Replacement_Level — Here’s a little taste of what we’re going to do. I’ll show you a process that will take you only a few minutes to color code and clearly document players into groups of Starters, CI/MI, UTIL, and Replacement Level.

Excel Features You Should Know

There are three pretty neat features of Excel that I used during this process that you may not be familiar with, and they might be able to save you a lot of time: Continue reading “How to Use Excel to Determine Replacement Level”

How Does a Player’s Age Affect Draft Return?

A few weeks back I took a closer look and analyzed the last five years of preseason Steamer projections (what I’m using as my best approximation of the “draft value” of each player heading into the season) and compared them to the actual end of season dollar values earned by those same players.

One of the glaring omissions in that article was some kind of analysis by age. Are there certain age groups that might be undervalued? Better yet, are there certain age groups of hitters we can take advantage of and a separate age group of pitchers we can jump on?

If we are trying to decide between a $20 pitcher who’s 23 years old or a $20 pitcher who’s 33 years old, who should we choose?

Quick Reminders

I’d highly recommend reading the first article that started me down this road. There’s a greater explanation of the approach used. But for a quick reminder… the dollar values are based on a standard 12-team league using traditional rosters (2 catchers, 14 hitters, 9 pitchers) and the standings gain points approach.

I also calculate return “including losses” and “without losses”. The best way to think about this is with a pitcher suffering a terrible injury in the first month of the season. Being injured that early, regardless of how good the pitcher is, will result in negative earnings. But the “benefit” of an injured pitcher is that you can immediately drop them and not suffer any of those negative earnings.

The flip side of that coin is with a struggling pitcher. You may decide to stick with a struggling pitcher for weeks or months, hoping for them to turn it around. In this scenario you are saddled with many of the negative earnings for that player. So the actual “return” on players lies somewhere between the “including losses” and “without losses” results.

Draft Results By Player Age

Take a look at the “Including Losses” and “Without Losses” charts below. Does anything jump out at you?

Continue reading “How Does a Player’s Age Affect Draft Return?”

Analyzing the Last Five Years of Rotisserie Baseball Drafts

How many of the top hitters and pitchers at the end of the year were actually drafted? How many of the top hitters and pitchers were not drafted and were picked up during the season? Were hitters or pitchers drafted more accurately? What is the dollar value earned by the players that were picked up during the season? Is there a position of hitter that’s more reliable than other positions?

Have you ever asked yourself draft analysis questions like these?

What follows is a five year analysis (with colorful graphs and an enormous Excel file!) of how accurately our projections in the preseason depict what has actually happened at the end of the season. How well we drafted. What positions yield the best returns. What positions offer the most free loot. And more.

Assumptions You Should Know

A number of the graphs depend on dollar value earnings for the “top 168” projected hitters or “top 108” projected pitchers. The dollar values are calculated using the approach documented in “Using Standings Gain Points to Rank and Value Fantasy Baseball Players” assuming a 12-team league, $260 team budget, 14 hitters (C, C, 1B, 2B, SS, 3B, CI, MI, OF, OF, OF, OF, OF, UTIL), 9 pitchers, and a 70%-30% hitter-to-pitcher allocation. That’s a total of 168 hitters and 108 pitchers.

These top projected players in the preseason were determined using Steamer’s preseason projections for that season (I downloaded the historical projections here).

I suppose using ADP results or expert rankings from the given year might give a better picture of the players that were actually drafted, but then you get into the question of what’s good ADP data, where to get it, what experts to use, league differences, lineup differences, etc.

To Be Clear… The Goal of this Study

The goal of this is not to measure the accuracy of particular experts. It is to determine which positions can we draft and get the most return on our investment. To some extent this is a review of Steamer’s accuracy, but that’s also not my intent. It’s my understanding (tell me if I’m wrong) that there are not significant differences between the top projections systems. So whether we were looking at PECOTA, Steamer, or Marcel projections, we would see similar results.

How Much of a Return Do We Get For Drafting HItters vs. Pitchers?

People have long been telling us to, “Load up on hitters early in the draft”.

“Don’t overspend on pitching.”

“Wait on pitching until most teams already have one.”

I’ve always heard these things. They sounded right. But I can’t say I’ve ever seen the data to support it.

In looking at the chart below it is very clear that we are much better at identifying the top hitters than the top pitchers. The top 168 hitters in the preseason provide about 70% of the dollars earned at the end of the season. For pitchers, it’s more in the neighborhood of 40%.

With results like that it’s very easy to see why the hitter-pitcher split is not 50-50.

Hitters are safer investments than pitchers. We’ve always been told this, but now you can see it. And things have not changed in the new era of pitching that we’ve been seeing the last few years. If anything, the gap seems to have widened.

Hitter-Pitcher-Draft-Returns-With-Losses — In a draft and hold environment, the return on investment for drafting hitters fluctuates between 65% and 80%. The return on pitchers is much lower, falling roughly between 30% and 50%.

Continue reading “Analyzing the Last Five Years of Rotisserie Baseball Drafts”

Now in Amazon – Using Standings Gain Points To Rank and Value Players

The Kindle edition of “Using Standings Gain Points to Rank and Value Fantasy Baseball Players: A Step-by-Step Guide Using Microsoft Excel” is now available at Amazon.com.

Want an edge in your league? Click here to get started on creating customized rankings and dollar values tailored specifically to your league.

The book contains 150+ pages of detailed instructions, over 200 screenshots of how to build this powerful Excel spreadsheet, and links to download 10 example files.

How Do You Account For and Value Players with Multiple Position Eligibility?

Zobrist_Prado_Santana_Carpenter How do we handle the multi-position players like Ben Zobrist and Martin Prado? When a player is eligible to be slotted at 2B, SS, and OF, how do we value that player?

This question came up in the comments of my last post on “How to Add Positional Ranking to Your Spreadsheet“, from a reader named Michael (I welcome your questions too). In this post I’ll take a look at how I handle this and look at a more inticate approach you could take to get the information.

I must warn you that it takes a lot of new formulas and manipulation of your existing rankings spreadsheet to accommodate multiple positions. To make sure you have something to reference, at the end of this post I’ll provide a download link to an example Excel file you can download.

Assign the Player to the Weakest Position They’re Eligible For

This is what I currently do. For example, let’s take a player like Ben Zobrist, who in the 2014 season currently qualifies at SS, 2B, and OF.

In the Player ID Map, I attempt to classify each player at the weakest position they’re eligible for. I do that by determining what replacement level is for a standard 12-team mixed rotisserie league.

Is the “weakest position” going to be the same for every league? No it’s not. It’s probably close in most leagues. Catcher will almost surely be the weakest in any format. Then Shortstop will generally be the next weakest, followed by 2B, 3B, 1B, and OF. But positions might change a little in an 8-team league or a 15-team league, in an AL-only league versus a standard league.

That’s a big reason why I started this site. It’s not always safe to give blanket advice, and I think the best approach is to calculate all of these things for your own league. You’ll be better off for it.

How Do I Determine the Weakest Positions?

Assuming you’ve done some kind of work to create your own rankings (if not, start here), the weakest position can easily be determined by looking at the replacement level information you’ve calculated (if you want to refresh your memory on replacement level, read this).

Looking back to one of my preseason files for the 2014 season, this is the replacement level information for one of my leagues.
REPL_LEVEL_TABLE

The weaker positions are those with the lowest replacement level. So in this league it’s C, SS, 2B, and then 1B, 3B, and OF are essentially the same.

Back to Zobrist

Going back to our example of Ben Zobrist who is eligible at SS, 2B, and OF, if we’re trying to assign him to the weakest position he’s eligible for, he would be assigned to SS.

Thinking of Martin Prado who is eligible at 2B, 3B, and OF, he would be assigned to 2B.

Carlos Santana who is eligible at 3B and C would be assigned to C.

Why Do I Only Assign a Player to One Position?

I have two reasons for this.

The main reason is because assigning players to the weakest position they are eligible for gives the player his greatest value. I’ll demonstrate more on this in a minute. But if you’re calculating Zobrist’s dollar value, it comes out highest when he’s classified as a SS.

Going along with this, fantasy baseball leagues are becoming more efficient market places. As we all get better and smarter about playing fake baseball, people generally realize they’re best off putting Buster Posey at catcher and not at 1B.

This won’t always be the case, but for the most part the obvious situations like your Buster Poseys and Carlos Santanas are going to be assigned where they belong. Because of this, it’s somewhat of a wasted effort to try calculating values for them at 1B or 3B.

The second reason is a technological one. You run into a lot of trouble having the same player appear multiple times in one spreadsheet (on multiple rows). Not only does it become confusing to have to remember that Zobrist’s name appears three times in your draft list, but it also greatly complicates (or eliminates) your ability to calculate dollar values.

How Much Does a Player’s Position Affect Their Value?

Is there really a big difference between a SS and an OF? Let’s take a look.

I am running the exercise below using Steamer’s 2014 preseason projections. The dollar values assume a 12-team standard mixed rotisserie league with 14 hitters (C, C, 1B, 2B, 3B, SS, CI, MI, OF, OF, OF, OF, OF, UTIL) and 9 pitchers. The dollar values are calculated using standings gain points and my approach to calculating player values.

Here’s the replacement level information again: REPL_LEVEL_TABLE

Let’s start out with Zobrist. You can see below that he was projected for 8.73 SGP before adjusting for replacement level. When you then account for replacement level and figure out his “SGP Over Replacement Level” you see that he becomes much more valuable as a shortstop. Over $4 more valuable than when he’s classified as an OF!

BEN_ZOBRIST_REPL_LEVEL Continue reading “How Do You Account For and Value Players with Multiple Position Eligibility?”

An Even Better Positional Ranking Formula

In my last post about how to calculate positional rankings in your player spreadsheet (link), I realized a weakness in the approach I outlined. I’m here to fix that with a slightly more complex formula.

The Weakness In The First Approach

The problem with the first approach is that it is entirely dependent upon how the file is sorted. If the file is sorted by dollar value or standings gain points in descending order, then the rankings work.

But if you sort the file by home runs, stolen bases, or player name, then the rankings fall apart. The player with the most home runs (or the most stolen bases or the first name alphabetically) becomes the top ranked player.

The COUNTIFS Excel Function

Instead of the COUNTIF function, we’ll use the COUNTIFS function this time.

You might remember from the last post that the COUNTIF function will count the number of cells in a specific range that meet a specified condition (e.g. “Hey Excel, count all the players in this column that have a ‘POS’ of ‘OF’.”).

Well the COUNTIFS function counts the number of cells in a specific range that meet all of the conditions you provide (you can give multiple conditions).

In plain English, our goal is to have Excel count all the players in our list of hitters that have a “POS” of “OF” AND that have a higher projected SGP than our player being evaluated.

We’ll get into the specifics of our formula in a bit, but here’s a screenshot of the Excel formula wizard for our COUNTIFS function.

This formula allows for an open-ended number of arguments, but you do need to pieces of information for each set of criteria you want to specify. Each criteria requires:

Range – This is the block, column, or area of cells we want to count from. “Excel, look in Column E and count the cells that meet this condition I’m about to tell you about in bullet #2.”
Criteria – This is what we are evaluating the cells for. “Count the cells in Column E that show ’1B’ as the position.”

If we want to specify three conditions that must be met, then we would need six arguments (three ranges and three criteria: range1, criteria1, range2, criteria2, range3, criteria3).

Step By Step Instructions

Continue reading “An Even Better Positional Ranking Formula”

How To Add Positional Ranking To Your Spreadsheet

I just came across an interesting use of an Excel formula that will let you easily determine what a player’s ranking is at their given position (1B-1, 1B-2, 1B-3, etc.).

Here’s an explanation that might show what I’m talking about. You’re scrolling through your huge list of ranked hitters (see image below). You have them sorted by Total Standings Gain Points (column V) in descending order.

You see Edwin Encarnacion’s name pop up in row #38. You know he’s a first basemen, and you can pretty easily determine that he must be the #37th ranked player (by showing up on row 38). But now you want to know where he ranks amongst only other first basemen.

In this post I’ll show you a formula we can use to get our spreadsheet to look like this (look at column W):

I’m a Moron

I got nearly to the end of this post when I started to think it was weird that Encarnacion was ranked #37, Starlin Castro #32, and Paul Goldschmidt #21. Turns out I used a rankings file from the 2013 preseason for all the screenshots…

I decided against starting over because it’s not the player names that are important, we’re mostly looking at a new formula. And I found it pretty interesting and thought provoking to look at these old lists and see names like Nori Aoki and B.J. Upton so high.

Excel Formulas Used In This Post

Using the “&” to Build Text

As you can see from the image above, we’re trying to take each player’s position (e.g. “1B”) and then add a dash and then the player’s positional ranking (e.g. “1B-8” for Encarnacion).

You can use the ampersand (would you know what that was called without “Wheel of Fortune”?), in an Excel formula to add text from different columns.

A real practical application of this is to build a player’s full name (e.g Billy Hamilton) from their first name and last names being in separate columns. Here’s an example:

Continue reading “How To Add Positional Ranking To Your Spreadsheet”

Designing Reusable Fantasy Baseball Spreadsheets

Why do your spreadsheets have so many tabs?

When you’re building a rankings spreadsheet, why don’t you just build the calculations right on the projection tab?

Why do you even use this ridiculous Player ID map? There’s so much useless information on that thing.

Have You Ever Wondered These Things?

I have to say that I haven’t had anyone ask these questions, but I feel the need to address the questions nonetheless. I don’t mean for them to be overly complicated, but there’s a good chance that the spreadsheets I design can be overwhelming. So let’s take a closer look at the reasoning behind things.

There Is More Than One Way To Skin a Cat

I don’t think it’s much of a stretch to say that spreadsheet design is a form or art. There are many different ways to get to the same end goal. I’m always looking for new ideas and methods to add to my Excel work, and I’m certain there are more efficient and better ways to design things I have done.

With that said, I believe the principles I’m about to talk about are universal. Whether you like the way I’ve designed things or if you prefer to do things a different way, using these concepts should help you out in the long run.

Why Do Your Spreadsheets Have So Many Tabs?

There are two main reasons for this:

I want all of my spreadsheets to be reusable.
I try to set things up so you only have to enter a piece of information one time.

Reusable Spreadsheets

We’ll take a close look at reusable design in a moment, but think about the different tabs/worksheets I use. Raw projections, player information, replacement level information, and calculations all on separate worksheets.

By setting these up on distinct areas of a spreadsheet it allows you to easily update one of those components without screwing up the whole model. Have new projection info? Just replace the data on the projection tab. Have a new Player ID Map? Just copy it over top of the existing one.

This setup makes it much easier for your work to be updated during the season and into future seasons.

Entering Information Only Once

If you’ve read my book on calculating dollar values and in-draft inflation, you’ll recall that we added a “Settings” tab to the rankings spreadsheet. On that tab we color-coded blue cells to indicate input cells.

These are just basic settings about the league, but by separating them out onto the “Settings” tab you only have to type the information one time. If we didn’t do this, we would have to embed this information into our calculations for both the hitting and pitching rankings.

Everything is also formula-driven, so if a 13th team is added next year, or if team salary caps increase, or if you want to try a 65% hitting allocation, you can easily change those inputs and all calculations for hitters and pitchers update automatically.

If we didn’t have things set up this way you would have to search deep inside complex formulas on the hitter tab and edit these inputs. And then you’d have to go do the same thing on the pitcher rankings.

And if you’re in more than one league you can set your rankings up for one league, just change the few settings different between the leagues, and you’ll easily have rankings tailored to each league you play in.

Why Don’t You Just Build Your Calculations Right on the Projections Tab?

Whether you’ve downloaded projections from another website or you’ve created your own, it might seem like I’m over-complicating things by having the projections on one tab and then using formulas to pull the projections to a whole other worksheet where I then calculate rankings, SGP, and dollar values.

If I were to use this spreadsheet one time, I would agree with this line of thinking. But take a look at this image.

This is a very primitive flow-charting model of a rankings spreadsheet. You have a few sets of raw data as inputs (the blue rhombuses, or is it rhombi?) – the projections, the Player ID Map, and replacement level data. Continue reading “Designing Reusable Fantasy Baseball Spreadsheets”

How To Pull a Player ID From a Hyperlink in Excel

Maybe you’re trying to build a the next great daily fantasy baseball spreadsheet. Or maybe you’re attempting to figure out who to start next week.

You’ve got plans to use a powerful projection tool like Razzball’s Hitter-Tron… but then you see it. There’s not a player ID to be found! How can you take data from a site like Razzball and drop it into your existing rankings, dollar value, or trade evaluator spreadsheets?

You Know Me. I’m Obsessed With PLayer IDs.

They’re the best way to build a reusable spreadsheet that you can just drop projection data in to. You can have all sorts of valuation formulas, trade evaluators, or daily roster lineup tools ready to go. You just drop in some projections with player IDs and everything seamlessly links up and calculates.

Look Closely At the Links

Even if you don’t see a player ID at first glance, on many sites you can still get what you need if you look carefully.

Hover over a player’s hyperlink. Depending on your browser you should somewhere see the web address for the hyperlink (in Google Chrome the destination of the link shows in the bottom left on the status bar).

There’s a player ID in there! 9368 for Evan Longoria. A quick look at the PlayerID Map tells us the 9368 happens to be Evan Longoria’s Fangraphs ID.

Unfortunately, if you paste the Hitter-Tron information into Excel, you don’t get text about the web address. You’re left with just the hyperlink itself and no reference to player ID #9368.

There’s not even an existing Excel formula we can use to just pull out the web address. We have to get a little advanced and create our own.

Step-By-Step Instructions

Continue reading “How To Pull a Player ID From a Hyperlink in Excel”

The Difficulty In Aggregating Projections

Now that we’ve established that we can benefit from combining multiple projection models into one, let’s take a look at the challenges this presents.

I’ll also give brief explanations of how you can work around these challenges in Excel. At the end I’ll discuss an Excel template I’m working on that will do these calculations for you automatically and how you can get your hands on it.

I love Your Feedback

If you’re a SFBB Insider you might recall that after you sign up, the very first e-mail I send you asks you to reply with any fantasy baseball topics you’d like to know more about or difficulties you’re having (if you’re not, you can register here. I like to think it’s worth your while).

Insider

I Read All Of Those Responses

I’ve been fortunate enough to have nearly 500 people register, and I read every single response that comes in from that question. One of the most frequent areas of interest is how to average, or aggregate, multiple sets of projections into one usable set of information.

More Difficult Than I Originally Thought

These requests started to roll in during the off-season, and I even replied to several people saying that I thought this was going to be easy and that I’d have guidance coming out soon on how to do this.

… And here I sit months later having never written on the topic yet.

In theory, averaging a set of three numbers in Excel is easy. If one system says 25 HR, one says 30 HR, and another says 35 HR, Excel’s AVERAGE formula can easily respond with the average of 30.

But I quickly ran into some big problems that greatly complicated things.

Problem 1 – Lining Projections Up To Do The Averages

In order to aggregate multiple projection systems into one, we need a method of “lining up” the projections from one system with those of another system. Perhaps Giancarlo Stanton is projected to hit 20 HR the rest of the season by Steamer and 22 HR by PECOTA.

Giancarlo_Stanton_ROS — I made this information up just to illustrate the concept of “lining up” different projections.

We can use formulas in Excel (e.g. VLOOKUP) to pull Stanton’s Steamer projection and place it next to his PECOTA projection. But you can run into some complications in doing this. What if one projection system lists him as “Stanton, Giancarlo” and the other as “Giancarlo Stanton”.

Using names to pull data also opens you up to inconsistencies in the name being used. Is it Jonathan Singleton or Jon Singleton? AJ Burnett or A.J. Burnett?

If you have taken on the challenge of creating your own rankings, you know that we’ve dealt with this problem before, but on a smaller scale. In my rankings spreadsheets I use a consistent playerID to pull information between the different tabs. I prefer to use the Baseball-Reference playerIDs because you can tell who a player is (Stanton is “stantmi03” because there were two other Mike Stanton’s before him).

But seemingly every major baseball site has their own player ID system. Fangraphs says Stanton is “4949”, Baseball Prospectus uses “57556”, ESPN says “30583”, etc.

This is why I maintain the SFBB player ID map Excel file. The map allows for this translation or “lining up” to happen. It’s the bridge that can easily help you take Stanton’s projection from one system and place it next to his projection from another. Giancarlo_Stanton_PlayerID

Problem 2 – Players Not Projected In All Systems

Continue reading “The Difficulty In Aggregating Projections”

What You Can Expect

Excel Features You Should Know

Share this article:

Quick Reminders

Draft Results By Player Age

Share this article:

Assumptions You Should Know

To Be Clear… The Goal of this Study

How Much of a Return Do We Get For Drafting HItters vs. Pitchers?

Share this article:

Assign the Player to the Weakest Position They’re Eligible For

How Do I Determine the Weakest Positions?

Back to Zobrist

Why Do I Only Assign a Player to One Position?

How Much Does a Player’s Position Affect Their Value?

Share this article:

The Weakness In The First Approach

The COUNTIFS Excel Function

Step By Step Instructions

Share this article:

I’m a Moron

Excel Formulas Used In This Post

Using the “&” to Build Text

Share this article:

Have You Ever Wondered These Things?

There Is More Than One Way To Skin a Cat

Why Do Your Spreadsheets Have So Many Tabs?

Reusable Spreadsheets

Entering Information Only Once

Why Don’t You Just Build Your Calculations Right on the Projections Tab?

Share this article:

You Know Me. I’m Obsessed With PLayer IDs.

Look Closely At the Links

Step-By-Step Instructions

Share this article:

I love Your Feedback

I Read All Of Those Responses

More Difficult Than I Originally Thought

Problem 1 – Lining Projections Up To Do The Averages

Problem 2 – Players Not Projected In All Systems

Share this article: