How to Use Excel Text Manipulation Formulas with Fantasy Baseball Data

In our last post we took a close look at a handful of different web query options available in Microsoft Excel.

By the end of that post we had an Excel file that was able to automatically go out and pull in a raw listing of player names and salaries. All you had to do was figure out the ID for the slate of games you’re entering on FanDuel and type in that five digit number.

FanDuelContestID

Then voila… Current salary information (or you can even run it for tomorrow to start planning the night before)!

FanDuel_Salaries

The Next Step

So now we have player names and salaries. And we know from doing research at how to succeed playing DFS that we also need to bring in other information like batting and pitching splits, Vegas over/unders, and weather data.

We also know that using Player IDs is a more reliable way of matching players up with all of this data.

New Column On The Player ID Map

To bring all this data together, I recognized that we need a way to match a player’s name according to FanDuel to the player ID systems shown on the Player ID Map. So I made an update to the map over the weekend and added “FANDUELNAME” in column AF (I also added over 30 new players that have been called up during this season and/or are starting to have a “fantasy impact”. Guys like Carson Smith, Nate Karns, Carlos Frias, Lance McCullers).

FanDuelName

If you’re using the Player ID Map and you want instructions on how to drop in an updated version, check this out.

But We Had a Problem

Look closely at the image below.ClaytonKershawFanDuel

Who the heck is “Clayton KershawP”?

If you look through the list a little more you will see “Jose FernandezDL”, “Yordano VenturaDTD”, and “Ervin SantanaNA”.

You can probably see that some player names are reflecting health or availability information. This is great, but it poses a challenge for our ability to match to the new column in the Player ID Map. Kershaw is going to be listed as “Clayton Kershaw” on the list. Not “KershawP”.

Enter Excel Formulas “RIGHT”, “LEFT”, “FIND”, “LEN”, and “IF”

We are going to use a series of formulas to do the following things:

  1. Identify if a player name has “DL”, “DTD”, “NA”, or “P”. I’ve scanned the list of player names and those are the only injury/availability classifications I see.
  2. If a player has one of those health indicators on their name we will strip it off
  3. If a player does not have health indicators, we’ll just use their name as it is shown

RIGHT

The RIGHT function will give you the rightmost characters in a text string. For example, we could use the RIGHT function to look at the string “Yu DarvishDL” to determine if the two rightmost characters are “DL”.

This formula uses two inputs:

  1. Text – The main string of text you want to pull the rightmost characters from (this can be a cell reference)
  2. Num_chars – The number of characters to pull out

RIGHT_FORMULA

We will use the RIGHT function to look at only the last three characters of each player’s name. We only need the last three because our longest health indicator is “DTD” (three characters in length). And we don’t want to look at full names, because one of our health indicators is just “P”. We can’t just tell Excel to look for a capital P in an entire player name because player’s like Michael Pineda or David Price might cause us trouble. So by limiting our search to just the last three letters in a player name, we should be able to located “DL”, “DTD”, “NA”, and “P” without issue.

FIND

The FIND function searches for a specific string of text within another string of text. If the string you are searching for is located, it returns a number that indicates the location where the string starts. FIND is also case-sensitive (thank goodness, otherwise looking for “NA” at the end of player names could be a problem with guys named “Santana”).

This formula requires two inputs and has one optional input:

  1. Find_text – The string of text you are searching for. You can enter your text in double quotes or use a cell reference.
  2. Within_text – The string of text you want to search within. This can be a cell reference.
  3. Start_num (optional) – What character within the string you want to start searching at. For example, if you wanted to start by searching only after the fifth character in a name, you could enter a five for this argument.

FIND_FORMULA

We will use the FIND function in conjunction with the RIGHT formula mentioned above. We will use RIGHT to first pull out only the last three characters from a player’s name. We will then run a FIND on those three characters to look for “P”, “DTD”, “DL”, or “N/A”.

LEFT

Similar to the RIGHT function, LEFT will give you the leftmost characters in a text string.

This formula uses two inputs:

  1. Text – The main string of text you want to pull the leftmost characters from (this can be a cell reference)
  2. Num_chars – The number of characters to pull out

LEFT_FORMULA

We will use LEFT on players that have health information in their name. For example, for “Clayton KershawP”, we will want to pull out the first 15 characters (out of the total 16 characters in the full string of text).

LEN

The LEN function will tell you the number of characters in a string of text.

This formula has just one input, Text, which represents the text to want to count the characters from. This can be a cell reference.

LEN_FORMULA

We will use LEN with the LEFT function mentioned above. Remember the “Clayton KershawP” example where I mentioned that string of text has 16 characters and we want to take the leftmost 15 characters? We will use LEN to easily get that 16. We’ll use LEN to evaluate how many characters are in every player’s name.

IF

The IF function allows you to evaluate a cell to see if a condition is true or false. If the condition is true, we can give one result or a specific formula. If the condition is false, we can give a second result or an alternative formula.

This formula uses three inputs:

  1. Logical_test – The condition you want to evaluate for being true or false.
  2. Value_if_true – The value or formula you want to run if the Logical_test is true.
  3. Value_if_false – The value or formula you want to run if the Logical_test is false.

IF_FORMULA

Nesting Formulas

By default, the IF function only allows you two options. One if true. One if false.

This might be a problem for us. Here are some of the tests we need to run…

If we find “P” in the last three characters of the player name, cut off the last character from the name (or if the player’s name is 16 characters in length, give me the leftmost 15). If we find a “DL” in the last three characters of the player name, cut off the last two characters from the name (or give me the leftmost 14 from a 16 character name). If we find a “DTD” in the last three characters, cut off those three letters from the name (or give me the leftmost 13 from a 16 character name). If we find an “NA”, cut off the last two characters (14 from 16). And if none of those are true, then just give me the full length of the player’s name and remove nothing.

Thankfully you can also “nest” IF functions to give you more options. In a weird combination of Excel language and plain English, our formula will look something like this:

=IF(player is status "P", give me LEFT(player_name, LEN(player_name)-1), if their status is not "P" then IF(player status is "DL", give me LEFT(player_name, LEN(player_name)-2),if their status is not "DL" then IF(player status is "DTD", give me LEFT(player_name, LEN(player_name)-3, if their status is not "DL" then IF(player is "NA", give me LEFT(player_name, LEN(player_name)-2), if their status is not "NA" just give me player_name))))

This is basically saying, if a player is marked as “P”, cut off one character from their name (that’s what the “-1” is doing in the LEFT(player_name, LEN(player_name)-1). If the player is not marked as “P”, go to the next IF statement. If the player is marked as “DL”, cut off two characters from their name LEFT(player_name, LEN(player_name)-2). If the player is not marked as “DL, go to the next IF statement.

And so on. Until we get to the very end of all the IF statements. The final player_name just indicates that if all the previous IF conditions are false, this is what will show.

One More Web Query Option I Didn’t Mention

We looked at quite a few web query options in the last post, but I neglected to mention one that will now be helpful to use. We are going to be placing the formulas from above next to the FanDuel web query results (you can see my vision of how this will look below, they’re shaded a yellowish color).

FormulasNextToData

The potential problem with doing this is that the web query results and FanDuel’s list of player names will be changing each day and for each contest. The length of the player list is going to be a lot longer for a full slate of games on a Wednesday night than it will be for the afternoon slate on a Thursday.

Fortunately, Excel offers a setting that will attempt to fill the formulas down as far as the data in the web query reaches. To activate this setting, first select a cell in the middle of the web query results (for example, select the cell with Clayton Kerhsaw’s name).

Clayton_Kershaw_FanDuel

Then click on the “Data” tab. And then click the “Properties” button in the “Connection” section of the Data tab.

Data_Connection_Properties

This will bring up the “External Data Range Properties” dialog. Check the bottom check box for “Fill down formulas in columns adjacent to data”. This will attempt to fill our formulas to the exact size of the web query results. I also like to uncheck the “Adjust column width” box because each time the web query runs, it resizes my columns and I find it annoying.

External_Data_Range_Properties

Step-By-Step Instructions

If you want to make sure you’re at the same starting point as me, here’s an Excel file Continue reading “How to Use Excel Text Manipulation Formulas with Fantasy Baseball Data”

How To Calculate Custom Rankings for a Points League: Part 5 – Calculating Projected Points

Welcome to the first part of a series in which we’ll go step-by-step through the process of using Microsoft Excel to calculate your own rankings for a fantasy baseball points league (as opposed to rotisserie or head-to-head rotisserie).

Whether you’re in a standard points league at a major site like ESPN or a more advanced Ottoneu league at Fangraphs, this process will help you develop customized rankings for your league.  These instructions can be used for a season-long points league or a weekly head-to-head points league.

If you’re looking for info on how to rank players for a roto league, look here.

I recommend going through all the parts of the series in order. If you missed an earlier part of this series, you can find it here:

ABOUT THESE INSTRUCTIONS

  • The projections used in this series are the Steamer 2015 preseason projections from Fangraphs.  If you see projections that you disagree with or that appear unusual, it’s likely because I began writing this series in December 2014, still early in the off-season.
  • For optimal results, you will want to be on Excel 2007 or higher.  Some of the features used were not in existence in older versions.
  • I use Excel 2013 for the screenshots included in the instructions.  There may be some subtle differences between Excel 2007, 2010, and 2013.
  • I can’t guarantee that all of formulas used in this series will work in Excel for Mac computers.  I apologize for this.  I don’t understand why Excel operates differently and has different features on different platforms.

IN PART 5

In this part of the series we will use the named cells created in Part 2 along with our projection information on the “Hitter Ranks” and “Pitcher Ranks” sheets to calculate total projected points for each hitter and pitcher.

Please note that this series has been adapted into a nine-part book that also shows you how to convert points over replacement into dollar values and how to calculate in-draft inflation. Click here if you’re interested in reading more about the conversion to dollar values.

EXCEL FUNCTIONS AND FORMULAS IN THIS POST

We’ll just be doing some basic addition and multiplication.  We won’t be adding in any new features, but we will be doing this basic math using the named cells for your league’s scoring settings that we created in earlier parts of the series.  All_Point_ValuesTo refresh your memory and to see the complete list of named cells, access the “Formulas” tab of the Ribbon and then click the “Name Manager” button.Name_Manager

The list will display all named cells/ranges and named tables.  To view only named cells, click on the “Filter” drop down menu and choose “Defined Names”.Defined_Names

STEP-BY-STEP INSTRUCTIONS

Continue reading “How To Calculate Custom Rankings for a Points League: Part 5 – Calculating Projected Points”