How to Pull Projected Starting Pitcher Information into Your Daily Fantasy Baseball Spreadsheet

In this post I’m going to address two common questions I get about creating daily fantasy baseball spreadsheets:

  1. Where and how can I download today’s AND tomorrow’s projected starting pitchers?
  2. Why I don’t see the yellow arrow when trying to web query a site in Excel?

And in addressing those two questions, we’ll also take a look at a powerful tactic of using Google Sheets and Excel together to get baseball data off the web. We’ll be focusing closely on obtaining a list of projected starters, but the concepts behind using Google Sheets and tying that back into Excel is one that can be applied in many other areas (like creating spreadsheets for your season long leagues).

Where Can I Find a Reliable and User-Friendly List of Probable Starting Pitchers?

To view the actual HTML code of a site, right click on a web page and choose the option to "Inspect Element". If you don't see < TABLE > references, you may not be able to import the page with Power Query.
To view the actual HTML code of a site, right click on a web page and choose the option to “Inspect Element”. If you don’t see “<TABLE>” references, you may not be able to import the page into Excel.

We all know DFS is exploding and there are countless sites out there providing lineup information, alerts, weather data, and more. But unless I’m looking in the wrong spot, most of that information is intended for that day’s games. And as a father of two with a day job, I can’t practically create a lineup the day of a contest. I need to prepare a day in advance for the next day’s games.

The other challenge in finding this information is that it will be a lot easier to deal with in Excel if we can find the data in a table format (see image to the right, I won’t bore everyone with technical details, but just because data looks to be in columns and rows on a site, doesn’t mean it’s in the format Excel can handle easily).

I have struggled and struggled to find a good resource for tomorrow’s projected pitchers. AND IT HAS BEEN RIGHT IN FRONT OF MY FACE ALL SEASON! Take a look at the Fangraphs home page:

Fangraphs probable pitchers link.

If you visit the “Probables Leaderboard” (here’s an example link), it looks perfect. A table of all the projected starters, and even some friendly advanced metrics we could use in evaluating each player.

Fangraphs probable starting pitchers.

Now take a look at the URL for the page:

Fangraphs probable starting pitchers URL.
The URL for the probable pitchers list is dynamic and changes based upon the day you’re accessing it.

I started to write this post on September 4th. And when I clicked the “Probables Leaderboard” link, it took me to the “p2015-09-04” web address. You can see that last part simply reflects the current date.

Anytime you see a URL like that, with all the different arguments and parameters (like “pos”, “stats”, “lg”, “season”, etc.), you should get excited. It likely means you can manually type in values for those parameters and create your own “query” of the site. Here’s an example I wrote awhile back using Brooks Baseball to illustrate these concepts.

So instead of just using the “p2015-09-04” address, I tried “p2015-9-5”. This was to test two different things. First, to see if I could get tomorrow’s probables in the same table format. Second, to see if the zeros before the month and day numbers were important… And it worked!

Fangraphs dynamic URL
You can edit the URL to see tomorrow’s probable pitchers.

So not only do we have a reliable list of probable starters, we can also get the projected starters for days in advance!

We Need a Dynamic Web Query

While it’s great that we now know where to get tomorrow’s probable starters, the fact that the URL changes each day is a challenge. We’ll need to create a dynamic web query that can determine tomorrow’s date and download the data from the appropriate web address.

With this in mind, I brushed up my memory on how to create a dynamic web query (look for the section titled “Step-by-Step Instructions, Dynamic and Updating Web Query”) and started the process of building it in Excel.

Why Don’t I See the Yellow Arrow in My Excel Web Query Window?

Everything was going so well until I hit a common stumbling block that occurs when web querying in Excel. No yellow arrow displays on the table of data I want to capture in my web query. No yellow arrow displaying in Excel's web query preview window.

Why does this happen? One definite cause is if the information isn’t really in HTML table format (remember that image above?). But the Fangraphs table is in fact a table. I checked. I don’t have a great explanation as to why you don’t always see the yellow arrow, but I imagine it has something to do with how the table is coded or just Excel’s ability to properly process it.

But if you do in fact see that the data is stored in an HTML table, Google Sheets offers a very simple method of doing a web query. One that works even when the yellow arrow box is missing!

Why Don’t You Use Power Query?

Power Query was able to import the table, even when a traditional web query could not... But I can't yet figure out how to create a dynamic query in Power Query.
Power Query was able to import the table, even when a traditional web query could not… But I can’t yet figure out how to create a dynamic query in Power Query.

That’s a really good question. I just spent thousands of words spouting the virtues of Power Query, and in my next post I turn my back on it?

I would like to. But the dynamic web address tripped me up. I spent three days trying to figure out how to get it to work and was unsuccessful.

I ultimately realized that I knew a much easier way to do this with Google Sheets, and this is something I’ve been meaning to demonstrate for a long time. So rather than continue to waste time trying to get Power Query to do the job, why not go with something I already know?

The ultimate irony of the situation is that Power Query didn’t have a problem importing the probables! If I could only have gotten a dynamic query to work…

Enter Google Sheets

If you’re not familiar with Google Sheets, it is a very strong spreadsheet alternative to Microsoft Excel. And it’s free.

So why don’t I write more about using Sheets? Quite frankly, Excel is the better product. It is much more powerful and responsive, largely because it’s an application that runs on my own computer. Google Sheets is web-based and suffers from performance limitations and access issues because of it (if you have a slow internet connection or a lot of calculations in your spreadsheets, you’ll drive yourself crazy using Google Sheets).

With that said, there are some really interesting benefits to Google Sheets. Being free is hard to beat. It’s very easy to share a workbook and work on the spreadsheet at the same time as others. And as I mentioned, importing HTML table data is a snap!

Another really neat feature is that you can publish (or share) the results of a spreadsheet online in CSV format.

And a file in CSV format is easily importable into Excel!

So we can create a Google Sheet to web query troublesome table data. Publish that data as a CSV. And then use Excel (and even Power Query) to import the data into our master spreadsheet.

IMPORTHTML within Google Sheets, publish as CSV, the import the CSV into Excel.
Google Sheets makes the process of querying data from a web page very easy. You can then “publish” you Google Sheet as a CSV file on the web and then use Excel to import the CSV.

Let’s get started!

Prerequisites

To use Google Sheets, you need to have a Google account (if you use Gmail, Google Drive, or any other Google service beyond searching the web, you already have one). If you don’t have a Google account you can create one from the Google Sheets sign up page here.

Google Sheets Functions Used in This Post

IMPORTHMTL

Google IMPORTHTML formula guidance

In Excel, we set up a special connection to pull information from a website. Things are much simpler in Google Sheets. You enter a very simple formula and the data gets pulled into the document.

The specific function we’ll use is “IMPORTHTML”. The function has three inputs:

  1. URL – Enter the web address of the page to be queried in quotation marks. In our example, it will be the address of the Fangraphs Probables page.
  2. Query Type – This is the data type you wish to pull from the web page. You can enter either “table” or “list”. Similar to what we look for when doing an Excel web query, we most likely will be using the “table” option.
  3. Index – This is the instance number of the table (or list) on the web page. Google’s documentation says the index begins at 1, meaning if you want to query the first table on a page you would simply type a 1. If you want the fourth table on a page, you’d enter a 4. But for some reason using a 0 is what works for the Fangraphs page we’ll be using.

MONTH, DAY, and YEAR

These are three separate functions. Each is looking for one input, a date.

The MONTH function will return the numeric representation of the month in the date. DAY returns the numbers from the date string corresponding to the days. And YEAR returns the numbers of the year in the date.

Going back to our example date string from earlier, a formula of =MONTH("09/04/2015") will return “9”.

TODAY

The TODAY function requires no inputs. And when used it simply returns today’s date.

For example, if you enter the formula =TODAY() and look at your spreadsheet on September 5th, 2015, your spreadsheet will display “9/5/2015”.

The formula updates when your spreadsheet recalculates. So if you opened the spreadsheet the next day, the formula would display “9/6/2015”.

You can perform addition with the TODAY function. So if you wanted to display tomorrow’s date, the formula would be =TODAY()+1. Or a week from now would be =TODAY()+7. Knowing that we can add one to the TODAY function will be important to finding tomorrow’s probable starters.

CONCATENATING or BUILDING TEXT STRINGS

By now you probably realize that we’re going to take the beginning of that long Fangraphs URL and then attach the date, as calculated by the TODAY function, to that. Every day these formulas will update and automatically create the new URL to determine tomorrow’s pitchers.

To attach two strings of text together in Google Sheets (or in Excel), you can use the ampersand (“&”). For example, we could put tomorrow’s date in cell A1 of a spreadsheet and then use this formula to build the Fangraphs web address:

="http://www.fangraphs.com/leaders.aspx?pos=all&stats=pit&lg=all&qual=0&
type=8&season=2015&month=0&season1=2015&ind=0&team=0&rost=0&age=0&filter=&players=p"&A1

Step-by-Step Instructions – Create a Google Sheet and Use the IMPORTHTML Function

Step Description
1. Go to the Google Sheets home page and click the button to start a new blank spreadsheet.Start a new Google Sheet
Click on the “Untitled spreadsheet” title and give the file a better name. Maybe something like “Tomorrow’s Probables”.UNTITLED_SPREADSHEET
2. Next, we’ll use the date formulas previously discussed to build the date string to attach to the Fangraphs probable starters URL. Enter the following formula in cell A1:

=YEAR(TODAY())

This should result in just the year of today’s date. As I write this post in September of 2015, the formula returns “2015”.

Now we’ll continue to build on this formula. Add the following to the existing formula in cell A1:

=YEAR(TODAY())&"-"&MONTH(TODAY())

Hit ENTER to accept your changes. See how the ampersand is used to add the hyphen and then another ampersand is used to add the month? As I write this post, that last formula results in “2015-9”. We’ll continue to use the ampersand to add new pieces of text to this string.

Now add the following to the end of the formula:

=YEAR(TODAY())&"-"&MONTH(TODAY())&"-"&DAY(TODAY())

This last piece puts in one more hyphen and then the current day of the month. In my example file it’s showing “2015-9-5”, which is the exact format we need for the Fangraphs page.

But remember, we want to show tomorrow’s date. Not today’s. So make these last final adjustments:

=YEAR(TODAY()+1)&"-"&MONTH(TODAY()+1)&"-"&DAY(TODAY()+1)

The reason we have to add one to all three pieces of the date is to account for when you reach the last day of a month. If you don’t add one to the month component, your day would reset to “1” but your month would still be lagging one behind (e.g. If it’s August 31st and I don’t add one to all of the today formulas, my formula would results as “2015-8-1”, not “2015-9-1”).

3. We’ve completed the last date piece of the Fangraphs web address, so let’s create the full address to the page so that it will update dynamically. Visit the Fangraphs probables page (here’s a link you can use that will lead to an old date). Fangraphs probable pitchers link.
Use your mouse to select all but the end of the URL that contains the date (get the “p” though!).
Copy the entire Fangraphs web address but exclude the date piece of the Fangraphs URL.
Exclude the date piece of the Fangraphs URL.

Copy that URL. Then return to your Google Sheet. In cell A2 type an equal sign then a quotation mark:

="

Then paste the Fangraphs URL and close it with another quotation mark:

="http://www.fangraphs.com/leaders.aspx?pos=all&stats=pit&lg=all&qual=0&
type=8&season=2015&month=0&season1=2015&ind=0&team=0&rost=0&
age=0&filter=&players=p"

Now use an ampersand to append in cell A1 (the missing date component to the end of this formula):

="http://www.fangraphs.com/leaders.aspx?pos=all&stats=pit&lg=all&qual=0&
type=8&season=2015&month=0&season1=2015&ind=0&team=0&rost=0&
age=0&filter=&players=p"&A1

Hit ENTER to complete the formula and you should see a fully usable hyperlink that will take you to tomorrow’s probable starters. IMPORTHTML_URL
To test the hyperlink, hover your mouse over it and then click on the popup that appears.Pop up URL link.

4. There are just two more inputs needed for the IMPORTHTML function. Type TABLE into cell A3 and a zero into cell A4.IMPORTHTML_ARGUMENTS
Now click the downward pointing arrow on the sheet name at the bottom of the screen and then choose the menu option to “Rename…”.Rename sheet
Give this tab or the spreadsheet a meaningful name, like “IMPORTHTML Inputs”.IMPORTHTML_INPUTS
5. Now click the “+” sign, to the left of this newly renamed tab, in order to start a new sheet.Add a new sheet.
Click the downward pointing triangle on this new sheet and rename it to something meaningful, like “Probable Starters”.RENAME_PROBABLE_STARTERS
6. Click your mouse into cell A1 and enter the following formula:

=IMPORTHTML('IMPORTHTML Inputs'!A2,'IMPORTHTML Inputs'!A3,'IMPORTHTML Inputs'!A4)

If you named your first tab exactly the same as I did, you can copy and paste the formula above into cell A1. Or instead of typing out the formula, you can click to your “IMPORTHTML Inputs” tab and select the applicable cells.

Hit enter to accept the formula. After several seconds (depending on the speed of your internet connection), you should see the probable pitchers load!Fangraphs probable starting pitchers list

You can see that it’s very easy to pull data from the web into Google Sheets. Much easier and with fewer steps than in Excel.

7. Before we go on, think for a moment about how an Excel spreadsheet runs its calculations. Similar to Google Sheets, Excel has a TODAY calculation. But if the Excel file containing the TODAY formula was closed for an entire week, we wouldn’t expect that the TODAY formula was updating each day in that closed spreadsheet.

We face a similar problem with this Google Sheet. We don’t want to have to open this list of probable starters each day just so it can update the list. It would be great if there were a property we could turn on so that the spreadsheet would refresh itself every so often… And fortunately Google offers this feature!

Within the Google spreadsheet, go to the “File>Spreadsheet Settings…” menu.File menu, spreadsheet settings

In the ensuing menu, adjust the “Recalculation” setting to the “On change and every hour” setting. This means the spreadsheet will reevaluate the TODAY formula each hour and update the list of probable starters accordingly. Recalculation settings, recalculate on change and every hour.

Click the “Save settings” button to accept this change.

8. The last task we need to complete in the Google Sheet file is to publish the list of starters as an online CSV file.

To start this process, click on the “File>Publish to the web…” menu. Publish Google Sheet as online CSV file.
Click the drop down that currently says “Entire Document”.PUBLISH_TO_WEB_SCOPE
Then choose to only publish the “Probable Starters” tab.PUBLISH_PROBABLES
Now click the drop down that says “Web page” and change it to the “Comma-separated values (.csv)” option.Publish Google sheet as CSV file.
Click the “Publish” button to complete your changes.PUBLISH

9. After you click the publish button, the menu will change to display a link to the published CSV file. Copy this link for now. We’ll need it in the next section. In fact, you may want to copy and paste it into a Word file or some other place for easy access. We will use it again a couple of times.COPY_CSV_LINK
You can always return to review or change these settings under the “Publish to the web…” menu. Just click the “Stop publishing” button, reconfigure the settings to your liking, then republish the document.STOP_PUBLISHING

Google Sheets Wrap Up

Now you see how much more simple the “web query” is in Google Sheets. Especially the creation of a dynamic query that can grab the results of a different page each day with no need for us to update or even open the file! When a new day rolls around, the probable starters list will automatically update in the Google Sheet and in the published CSV file.

The ease of importing data is a huge benefit of Google Sheets, but on the whole I don’t find it to be up to par with Excel. So now let’s take a look at how to get this CSV file into our daily fantasy baseball spreadsheets.

Step-by-Step Instructions – Import a Published CSV File Into Excel

Continue reading “How to Pull Projected Starting Pitcher Information into Your Daily Fantasy Baseball Spreadsheet”

How to Import Draft Kings Salaries into Excel

I’ve been exclusively focusing on web querying Fan Duel to this point. But what about Draft Kings? Is there a way to web query their salary information? Here’s an e-mail I recently got from a reader of the site…

Hey Tanner, I just got back from vacation and saw some of your recent posts. That looks great… but I play on Draft Kings. When I try to do those steps on that site, it doesn’t work. No player names. No salaries. Any ideas?

Draft Kings salary info.
There is no <Table> element on the Draft Kings salary info.

I followed the same web querying steps we went through in this post, “How to Use Excel to Web Query DFS and Other Fantasy Baseball Data”, using FanDuel’s site this time. And sure enough, no matter what web querying option you use… No dice. If you do an investigation using the “Inspect element” option for your browser, I believe the technical reason the web query is unsuccessful is because the salary information is not coded as an HTML table.

So things aren’t going to be as easy as the web query. But due to a nice feature on Draft Kings, we can get very close. It’ll take just a few extra mouse clicks to get to the same place.

You see, even running a FanDuel web query isn’t seamless. You still have to log into your FanDuel account in your browser, locate a contest ID (URL), and type it into your Excel file to kick off the web query.

One Extra Step for Draft Kings

While Draft Kings salary info can’t be web queried, they do offer the ability to export all player information from a given contest into a nicely packaged CSV file. The link for a contest appears just below the player list you sort through to select players for your lineup.Export DraftKings salary information to CSV.

“CSV” stands for “comma separated values”. Many computer generated data export files have some kind of delimiter, or character, that breaks the data into columns. The CSV format specifically uses a comma to separate the values. Here’s what a CSV file looks like when opened outside of Excel:

Draft Kings comma separated values export file.

That Looks Awful, How Do I Work With That File?

The CSV format does not look user-friendly when viewed outside of Excel. But fortunately, once you install Excel, it becomes the default program to open CSV files… And this format is so standardized, Excel is trained to clean up the data and put it in our friendly Excel format:

Draft Kings CSV export opened in Excel.

This Doesn’t Look as Efficient as a Web Query

Right about now you’re probably thinking something like this…

You mean to tell me each time I want to update my DFS baseball spreadsheet, I need to…

  1. go Draft Kings,
  2. find a contest to enter,
  3. export the salary information,
  4. open that CSV file,
  5. copy the information,
  6. and then paste it into my spreadsheet?

Not quite. Some of those things we have to do regardless. We can’t really automate the selection of the contest you want to enter. But we can have Excel automate steps four, five, and six (the process of getting the CSV data into a preexisting DFS spreadsheet.

Excel Data From External Sources

DATA_FROM_TEXTYou might remember in the previous web querying article that we used the “From Web” option to “Get External Data” into our Excel files.

Fortunately, Excel also has an external data option for “From Text” (and a CSV file is a text file) that works on the exact same principles of a web query.

Just like how the web query is set up to pull information from a very specific web URL, this text file link can be set up to pull information from a very specific file path.

So as long as we save and name our Draft Kings CSV file in the same place each time, the data can automatically pull in to our DFS Excel files each time we open it.

Step-By-Step Instructions

Step Description
1. Log into your Draft Kings account (unlike for a web query, you can use any browser you wish for this).

Once you’ve logged in, locate a contest you’d like to participate in and click the “ENTER>>” button.ENTER_DK_CONTEST

2. After the contest loads, click the “EXPORT TO CSV” button below the player salaries (located on the left half of the contest screen).

NOTE: It does not matter if you check or uncheck the “ONLY PROBABLE PITCHERS” box. The export always contains all pitchers.EXPORT_TO_CSV_DRAFT_KINGS

3. Depending on the browser you’re using and its settings, one of three things will likely happen at this point:

  1. The CSV file will automatically download to your pre-specified download location
  2. You’ll be prompted to name the CSV file and choose where to save the file
  3. Or Excel may open and display the CSV file immediately

Regardless of what scenario you find yourself in, the main goal here is to save the CSV in a location where you want it to stay going forward. Maybe that’s your desktop or maybe you have a “Fantasy Baseball” or “DFS” folder on your computer. Save the CSV file in that long-term location.

So if it automatically downloaded to your “Downloads” folder, move it. If you’re being asked where to save the file, point it to the desired location. If Excel launched, perform a “Save As” to the desired folder. If you have the CSV file open, close it now.

4. Open your DFS spreadsheet (you may want to make a backup before proceeding) or start a blank Excel file if you’re just beginning (don’t start in the CSV file!).
5. Go to the “Data” tab and click the “From Text” button.
DATA_FROM_TEXT
6. At this point Excel will prompt you to browse for the text file you want to import. Browse to and locate your CSV file.
DKSalaries.csv

Once you’ve located the file, select it and click the “Import” button.
Import DKSalaries.csv

7. Excel’s “Text Import Wizard” will launch and ask which type of text file you’re importing. Remember that a CSV file is a special type of delimited text file that uses a comma to separate the values (a fixed width file doesn’t have a character (like a comma or semi-colon) separating the data, it would just have clearly visible spaces separating the data into columns). So choose the “Delimited” option.

Also check the “My data has headers” box and click “Next”.

IMPORT DELIMITED

On the next screen, choose the “Comma” delimiter option and uncheck any other options. You’ll get a preview of how the columns will be identified. Click “Next”.
COMMA_DELIMITED

On the final screen of the Text Import Wizard you can likely just hit “Finish”. For future imports, if you have a column containing something you only want treated as text or a date, you can choose those options. But “General” is fine for our purposes.

FINISH_TEXT_IMPORT_WIZARD

8. Once you click “Finish”, you’ll be prompted about where you want to import the CSV data to in the Excel file. If you are working in your pre-existing DFS spreadsheet, click on the tab you want the text to import to and use the “Existing Worksheet” option, or just choose the “New worksheet” radio button.

Don’t click OK yet! Click the “Properties” button.

IMPORT_DATA_OPTIONS

9. There are many available properties to adjust, but I’ll call out a few I feel the most strongly about changing:

Prompt for file name on refresh – If you plan to save the CSV file in the same location and with the same name each time, you should uncheck this option. If left checked, each time you refresh the connection to the CSV file you will need to browse to and select the file. Conversely, if you don’t want the restriction of having to name the file consistently and in the same spot, you may want this checked. But you’ll then need to browse and point to the new file each time you want to refresh the link.

Refresh data when opening the file – As you can probably guess, if you check this box, each time you open your DFS spreadsheet it will link to and import the text file information.

Fill down formulas in columns adjacent to data – We may not need this, but if we end up needing formulas next to the data being imported, I would recommend checking this box.

IMPORT_PROPERTIES

Click “Ok” to accept the settings changes.

Then click the next “OK” button on the “Import Data” window.

10. That’s it! The next day, when you go to prepare for a new contest, simply repeat steps 1 through 3.

In doing so, be careful to name or save the CSV with the exact same name and in the exact same location as your Excel file is looking to.

For instance, if you saved your first Draft Kings export as C:\Users\tanner.bell\Downloads\DKSalaries.csv, then make sure you save it there again the next day. If you have an older CSV already sitting in that location, the export will likely save as “DKSalaries (1).csv”. Not a problem. Just delete the older one and rename the new download to be just “DKSalaries.csv” (right-click on the file and choose “Rename”).
Draft Kings Salary CSV Export

Web Query vs. CSV Import

I’m not sure which is superior or if there is even a clear winner. There is something elegant about the web query. And I really like how you can create a dynamic web query (I wish there were a way to do a dynamic text file link, but I can’t see a way to do that). But the web query is slow.

The text file import does require a couple of extra steps, in terms of naming the CSV file, but once that part is complete it’s essentially the same. The actual import of the data into Excel is much faster for the text file option (at times my web queries can take 30 seconds or more to complete).

For both options, Excel will seamlessly import the data and the rest of your Excel file can be built around this to automatically update based on the salary list.

Second Look at FanDuel

Having noticed this for Draft Kings, I took another look at FanDuel… Lo and behold, FanDuel has the same export option.

Export FanDuel player list and salaries.

One key difference in the FanDuel CSV is that it has a unique file name associated with the contest ID. Unique FanDuel CSV file name.

So if you decide to go the CSV route for FanDuel, instead of using this unique name, rename/save the file with a generic name like “FDSalaries.csv”. If you also save the file in the same location each time, you would be able to link an Excel file to it repeatedly. At that point, just follow the setup instructions above and there is no difference in the setup of a FanDuel export and a Draft Kings export.

But Wait… What is that on the FanDuel Salaries List???

Oh my… This might be enough to convert me off the web query and to an exported CSV approach even for FanDuel (damn you Chris Youngs and Jose Ramirezes!).

FanDuel Player ID Numbers

Player IDs (don’t know what a Player ID is?)… FanDuel player IDs, coming soon to a Player ID Map near you.

Doing This in Excel For Mac 2011

If you happen to be using Excel on a Mac, the menu location to start the import process is slightly different. You can find it on the “Data” tab. Excel for Mac Import

Or you can also find it under the Data menu.Data_Get_External_Data_Import_Text_Mac

Once you find these menu options, the settings are the same as above (for the most part). Make sure to click on the “Properties” button to get options about when and how to refresh the import upon opening the Excel file. Text file import properties.

How Is Your DFS Spreadsheet Coming Along

Do you have any neat features you’ve added? Struggling to get some kind of information added? Feel free to post a message below to share what you’ve done or let me know your struggles and maybe I can help.

Stay smart.