NHL.com Play by Play data | HFBoards - NHL Message Board and Forum for National Hockey League

NHL.com Play by Play data

SaskRinkRat

Registered User
Apr 1, 2010
Hey everyone,

I'm curious if there is a season rollup of the NHL.com play by play data available online anywhere. They make it available game by game (i.e., http://www.nhl.com/scores/htmlreports/20122013/PL030147.HTM ), but it would be great to have entire season(s) worth of data accessible.

If not, I was thinking I could create a macro in Excel that would allow it to be easily dumped into a single database, but I'd probably need to crowd-source a few people to help scrape the roughly 1200 games per season into Excel files.

Would there be any interest in this?
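
For anyone wanting to automate the game-by-game part, here is a minimal sketch of how the report URLs might be generated rather than collected by hand. It assumes the pattern visible in the link above: an eight-digit season, then `PL`, then a six-digit code that looks like a two-digit game type followed by a four-digit game number. The function names and the meaning of the game type digits are assumptions, not anything documented by NHL.com.

```python
BASE = "http://www.nhl.com/scores/htmlreports"


def report_url(start_year, game_type, game_number):
    """Build one play-by-play report URL.

    Assumption: the six digits after "PL" are a two-digit game type
    followed by a four-digit game number, as the linked example suggests.
    """
    season = f"{start_year}{start_year + 1}"      # e.g. 20122013
    code = f"{game_type:02d}{game_number:04d}"    # e.g. 030147
    return f"{BASE}/{season}/PL{code}.HTM"


def season_urls(start_year, n_games, game_type=2):
    """List URLs for games 1..n_games of one season (hypothetical helper)."""
    return [report_url(start_year, game_type, n) for n in range(1, n_games + 1)]


if __name__ == "__main__":
    # Reproduces the link from the post above.
    print(report_url(2012, 3, 147))
    # Roughly 1200 games per season, per the estimate above.
    print(len(season_urls(2012, 1200)))
```

Some generated URLs may not exist, so whatever downloads them would need to tolerate missing pages.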
 
Well, there are some great power play stats on stats.hockeyanalysis.com that I have already converted to Excel. I have advanced power play stats from 2007-2013 if you want me to send them to you.
 
Are the stats on hockeyanalysis.com available on an individual play by play basis?

What I would like to get access to, eventually, is a database that lists literally every single play that happened all season. Think of it like the game-by-game play by play I linked above, only in an easily manipulable Excel / CSV file that contains every game of the season(s).
 
Sorry, I think I'm missing something. What step do you need people for?

I've done this in the past. My basic data flow was: scrape the HTML from nhl.com; extract the data into a data structure (I recommend the Beautiful Soup Python library); output it in a standardized format.
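
The three steps above can be sketched roughly as follows. To keep it dependency-free, this uses Python's standard `html.parser` instead of the Beautiful Soup library recommended above, and it treats every `<tr>` as one event row. The class and function names and the sample HTML are made up for illustration; the real reports have a more complicated layout (including nested tables) that this sketch does not handle.

```python
import csv
import io
import urllib.request
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row, self._cell = [], None
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            # Collapse runs of whitespace (including non-breaking spaces).
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row, self._cell = None, None


def fetch_html(url):
    """Step 1: download a raw report (needs network access; not called below)."""
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8", errors="replace")


def extract_rows(html):
    """Step 2: turn the HTML into a list of rows, each a list of cell strings."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def rows_to_csv(rows):
    """Step 3: output the rows in a standardized format (CSV text here)."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


# Made-up sample standing in for a downloaded report.
SAMPLE_HTML = (
    "<table>"
    "<tr><td>1</td><td>FAC</td><td>Neutral zone faceoff</td></tr>"
    "<tr><td>2</td><td>SHOT</td><td>Wrist shot,&nbsp;34 ft.</td></tr>"
    "</table>"
)

if __name__ == "__main__":
    print(rows_to_csv(extract_rows(SAMPLE_HTML)))
```

Swapping in Beautiful Soup would mostly replace the parser class; the fetch and output steps would stay the same.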
 
What I think SaskRinkRat is asking for is a way to convert play by play data into Excel form. SaskRinkRat, my guess is that there is one, but you would need some type of software that has that capability.
 
Yeah, that is the second step of the dataflow I outlined. You write a script that extracts the table data directly from the raw .html and stores/outputs it in whatever format you like.
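
As one possible output format, the rows extracted from each game could be combined into a single season-wide CSV that Excel can open, with the season and game code added to every row so plays can still be traced back to their game. This is only a sketch: `write_season_csv`, the column names, and the sample rows are hypothetical.

```python
import csv
import io


def write_season_csv(games, columns, out):
    """Write every game's rows to one CSV stream.

    games   -- iterable of (season, game_code, rows) tuples
    columns -- names for the per-play columns (hypothetical layout)
    out     -- any writable text stream (a file or io.StringIO)
    Returns the number of play rows written.
    """
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["season", "game"] + list(columns))
    count = 0
    for season, game, rows in games:
        for row in rows:
            # Prefix each play with the game it came from.
            writer.writerow([season, game] + list(row))
            count += 1
    return count


if __name__ == "__main__":
    # Made-up rows standing in for two parsed games.
    games = [
        ("20122013", "020001", [["1", "FAC"], ["2", "SHOT"]]),
        ("20122013", "020002", [["1", "FAC"]]),
    ]
    buf = io.StringIO()
    n = write_season_csv(games, ["event_no", "event"], buf)
    print(n, "plays written")
    print(buf.getvalue())
```

Keeping the game code as text preserves its leading zero when the file is read back by a script, though Excel may still reformat it on open.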
 
Is that a hard thing to learn? (I'm only a junior in high school, so I haven't really learned much when it comes to computer science and stuff like that.) I'd really like to learn how to do that, since it would give me access to far more data for my research.
 
I would be interested in working on this. I am pretty decent at PHP scripting and have done some work extracting sports data from natural language sources.

Unfortunately I'm in the middle of moving and in taking a few

But I would be willing to do two things:

1) If you send me a sample file, I'll provide some guidance on getting started and perhaps even a seed script so you can learn by example.

2) This fall, if you still need help, I would be able to dive in a bit deeper.
 
Hey, could I email you? (It seems easier to communicate that way than through this hockey board.) My email is [email protected]. If you could help me out in any way, that'd be great.
 
That's a weird one - not only does that one not come up, but all of the other associated reports link to a (blank) game between Carolina and Tampa in November of 2007.
 
Webscraping seems to be the in thing nowadays. Lots of talk of using it in statistical projects and products. In some cases scraping off the internet is being used to reduce the number of physical purchases needed to compute economic figures.

It'd be nice if somebody scraped it off into a public repository but that'd require more than what I believe anybody is willing to put together.
 
I'm working on getting at least the past six seasons into a database right now, with plans to put it online. It'll have every play from all of these play-by-play reports, and hopefully a very, very good filtering system so you can calculate whatever you want.

It's unfortunate that there are a few reports missing or incomplete for each season, though. Here's another one I've encountered:
http://www.nhl.com/scores/htmlreports/20092010/PL020081.HTM
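
To give a rough idea of what a play database with filtering could look like, here is a minimal sketch using Python's built-in `sqlite3` module. The table layout, the function names, and the sample plays are all invented for illustration and are not the schema of the database described above.

```python
import sqlite3


def build_db(plays):
    """Load (season, game, period, event, team) tuples into an in-memory DB."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE plays ("
        "season TEXT, game TEXT, period INTEGER, event TEXT, team TEXT)"
    )
    conn.executemany("INSERT INTO plays VALUES (?, ?, ?, ?, ?)", plays)
    conn.commit()
    return conn


def count_events(conn, event, season=None):
    """Count one event type per team, optionally filtered to a single season."""
    sql = "SELECT team, COUNT(*) FROM plays WHERE event = ?"
    params = [event]
    if season is not None:
        sql += " AND season = ?"
        params.append(season)
    sql += " GROUP BY team ORDER BY team"
    return conn.execute(sql, params).fetchall()


if __name__ == "__main__":
    # Made-up plays; real rows would come from the parsed reports.
    sample = [
        ("20092010", "020081", 1, "SHOT", "AAA"),
        ("20092010", "020081", 1, "SHOT", "BBB"),
        ("20092010", "020081", 2, "SHOT", "AAA"),
        ("20102011", "020001", 1, "SHOT", "AAA"),
        ("20102011", "020001", 1, "HIT", "BBB"),
    ]
    conn = build_db(sample)
    print(count_events(conn, "SHOT"))
    print(count_events(conn, "SHOT", season="20092010"))
```

Games whose reports are missing or incomplete would simply contribute no rows, so any totals computed this way would need to account for those gaps.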
 
I've already done most of this. I've event-sourced every draft and 6 years' worth of games. I can do all the games that have reports with the click of a button. I am mostly making sure the quality of the data is high. I have every player as well. I am going to start creating projections for all the standard stats, then I'll tackle advanced stats, then some cool data vis if I have time.
 
Anyone still working on this? I googled for an API and found this on my first search, http://developer.sportsdatallc.com/docs/NHL_API, but it appears to be subscription-based. I can totally understand the necessity of this (having paid, dedicated staff ensuring the data is available and accurate), but I was just curious if anyone had their own API or DB to access. I realize that if you did, then potentially several developers would attempt hitting your DB with queries, but I just thought I'd ask.

If not then I can follow the aforementioned method of scraping and formatting. Thanks.
 
For those who know R, the nhlscrapr package has a way to extract all of the data into a very usable format. Kudos to Sam Ventura and Andrew Thomas for making this publicly available. Might even be worth learning R just for that.
 
I seem to recall a website that had lots of statistical analysis of players on it. Think it might've been called something like hockey analysis but it had info on different strength success for players etc.

Also, does anyone know a website that charts where people start their shifts in a game? Or that website that has the time on ice for a player in a game? Thanks.
 
Sounds like stats.hockeyanalysis.com for the top graf, though behindthenet.ca has a lot of similar data. It's a little less robust in my opinion.

Timeonice.com might have what you're looking for in the bottom graf. Just make sure you grab the five-digit game code from the box score.
 
Where exactly is the five-digit code? I can't find it, or at least when I think I have and put it in, it's not valid.
 
