Age-Adjusted Points-Per-Team-Points-Percentage: Analyzing Prospect Stats from the OHL, WHL and QMJHL

37 others

Registered User
Apr 18, 2017
465
235
As a computer science major and huge fan of hockey, I’ve been looking for ideas for a side project to combine the two interests. Although I have no expectations for this project to produce anything useful, I think it will be a good exercise in data science and will give me a reason to learn Python.

Eventually I want to tinker with machine learning and prospect projections, but I need to start with baby steps (through this project) before I run a marathon (implementing ML and prospect projections).

As an alternative metric to points-per-game, I've been working on calculating the number of points a player puts up in each game they play relative to their team's total, and adjusting that based on the player's age. I'm not familiar with any similar metrics in the NHL, but I wouldn't be surprised if one exists.

AATPt% works out to something like this: ((points by a player / points by their team in that game) * an age modifier), averaged over every game they play in a season.
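As a rough sketch of the formula above (not the actual project script), one season's AATPt% could be computed like this. The function name, the sample games, and the 1.1 modifier are made up for illustration, and skipping games where the team recorded no points is an assumption, since the post doesn't say how those are handled.

```python
from statistics import mean

def aatpt(games, age_modifier):
    """Average of (player points / team points) * age_modifier over a season.

    `games` is a list of (player_points, team_points) tuples, one per game
    the player appeared in. Games with zero team points are skipped, which
    is an assumption made for this sketch.
    """
    shares = [
        (player_pts / team_pts) * age_modifier
        for player_pts, team_pts in games
        if team_pts > 0
    ]
    return mean(shares) if shares else 0.0

# Hypothetical four-game season and a hypothetical age modifier of 1.1.
season = [(2, 8), (0, 5), (3, 6), (0, 0)]
score = aatpt(season, age_modifier=1.1)
print(round(score, 4))
```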

With this metric, better players should, in theory, produce more of their team’s points than worse players, and better players at younger ages should, in theory, make better prospects than better players at older ages. Thus, the age adjustment is important to differentiate younger prospects from overagers who commonly put up monster seasons in Canadian junior leagues.

Similarly, good, young players on worse teams should show a higher score than good, older players on better teams.

AATPt% isn’t a perfect metric by any means. Ideally, I’d adjust the metric not only for age, but for ice time as well. I’ve noticed that the aggregated AATPt% for many players is really pulled down by seasons with low ice time in their younger years. Unfortunately, I don’t believe TOI information is available for the Canadian junior leagues. It also doesn't capture the defensive ability of players (obviously, being based on points), and it doesn't separate one player's play from his linemates' (when two linemates feed off of a dominant player on the line, you can't tell the difference between the dominant player and the non-dominant players).

As mentioned, I’ve been working on scraping data from the OHL, WHL, and QMJHL for the past 10 years, then aggregating the data into presentable tables. The workflow has been something like this: 1) scrape the data from each league overnight (and resume the scraping the next night if the scraper ran into problems); 2) collapse the game-by-game data per player to season-by-season data per player; 3) adjust the player’s season-by-season total-team-point-% by their age; 4) collapse the season-by-season data to single entries per player.
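Steps 2 and 3 of that workflow might look something like the sketch below. Everything in it is illustrative: the row layout, the player names, and especially the age multipliers are placeholders (the real multipliers come from the Fyffe table, which isn't reproduced in this thread), and games with no team points are skipped by assumption.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical game-level rows: (player, season, player_points, team_points)
game_rows = [
    ("Player A", "2016-2017", 1, 4),
    ("Player A", "2016-2017", 2, 4),
    ("Player A", "2017-2018", 3, 6),
    ("Player B", "2017-2018", 0, 6),
    ("Player B", "2017-2018", 1, 5),
]

# Placeholder age multipliers per (player, season); not the real Fyffe values.
age_multiplier = {
    ("Player A", "2016-2017"): 1.2,
    ("Player A", "2017-2018"): 1.0,
    ("Player B", "2017-2018"): 1.1,
}

def collapse_to_seasons(rows):
    """Step 2: average each player's per-game share of team points by season."""
    shares = defaultdict(list)
    for player, season, pts, team_pts in rows:
        if team_pts > 0:  # assumption: skip games with no team points
            shares[(player, season)].append(pts / team_pts)
    return {key: mean(vals) for key, vals in shares.items()}

def adjust_for_age(season_raw, multipliers):
    """Step 3: scale each season's raw share by that season's age multiplier."""
    return {key: raw * multipliers[key] for key, raw in season_raw.items()}

season_raw = collapse_to_seasons(game_rows)
season_adjusted = adjust_for_age(season_raw, age_multiplier)
for key in sorted(season_adjusted):
    print(key, round(season_raw[key], 4), round(season_adjusted[key], 4))
```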

As a quick note before we get into the analysis of the data, my age modifiers come from those put forward by Iain Fyffe in Rob Vollman's book, Stat Shot. Also, I flip-flop on the use of my acronyms, so, for all intents and purposes, AATPt% = Adjusted PPTPt%.

I include the following data in each table:
  • Adjusted PPTPt% — This equals the average of each player’s Raw PPTPt% (Points-Per-Team-Points-Percentage) multiplied by their Fyffe Multiplier (i.e. their age multiplier) for each season they have played in
  • Draft Eligibility for the 2018 NHL Draft (this is based only on the players’ ages, so players who were passed over in previous drafts but are still technically eligible for this draft are listed as draft eligible)
  • Raw PPTPt% — The average of each player’s unadjusted PPTPt%
  • Adjusted PPTPt% Standard Deviation — If a player has played more than one season in the time-frame, this equals the standard deviation of their Adjusted PPTPt% across those seasons
  • Average Player Age — The average age of the player across the time-frame
  • Average Fyffe Modifier — The average Fyffe Modifier of the player across the time-frame
  • Average Games Played — The average games played of each season for the player across the time-frame
  • Seasons Played
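To make the columns above concrete, here is a small sketch of how one player's season rows could be collapsed into a single table entry. The field names and sample numbers are invented, and using the sample standard deviation (`statistics.stdev`) is an assumption, since the post doesn't say whether the sample or population version is used.

```python
from statistics import mean, stdev

def summarize_player(seasons):
    """Collapse season rows into one entry.

    Each season row is (raw_pptpt, fyffe_multiplier, age, games_played).
    """
    adjusted = [raw * mult for raw, mult, _, _ in seasons]
    return {
        "adjusted_pptpt": mean(adjusted),
        "raw_pptpt": mean([raw for raw, _, _, _ in seasons]),
        # Only defined for players with more than one season.
        "adjusted_std": stdev(adjusted) if len(adjusted) > 1 else None,
        "avg_age": mean([age for _, _, age, _ in seasons]),
        "avg_fyffe_modifier": mean([mult for _, mult, _, _ in seasons]),
        "avg_games_played": mean([gp for _, _, _, gp in seasons]),
        "seasons_played": len(seasons),
    }

# Hypothetical two-season player.
entry = summarize_player([(0.30, 1.2, 16.5, 60), (0.40, 1.0, 17.5, 64)])
for name, value in entry.items():
    print(name, value)
```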
Unfortunately, I can't embed Airtable links here like I can on Medium, but I'll try to include some screenshots of the lists. Nonetheless, I’ll point out some of my findings:
  • Unsurprisingly, 16-year-old (and expected 2020 NHL Draft first overall pick) Alexis Lafreniere owns the highest AATPt% across all three leagues from the 2007–2008 season to the 2017–2018 season, at 0.4363. His raw PPTPt% is 0.3881 (good for 26th overall in that time frame for the QMJHL); the adjusted figure is higher because his only season so far came at an average age of 16.227.
  • Despite being unranked for this year’s upcoming draft, RW Trey Fix-Wolansky of the Edmonton Oil Kings has the third-highest AATPt% of presently undrafted, draft-eligible players. However, he is an overager by one year, and EliteProspects lists him as only 5'8" 165 lbs. I haven’t been able to find any scouting report on him.
  • Defenseman Ryan Merkley’s AATPt% of 0.3065 would put him second overall among all presently undrafted, draft-eligible players. This is a whole 0.1046 ahead of the next defenseman, overager Thomas Gregoire.
  • The most-consistent player award goes to defenseman Steven Varga, playing 4 seasons with an AATPt% standard deviation of only 0.0014177, making him more consistent than 90% of other players.
  • The least-consistent player award goes to forward Nicolas Roy, playing 3 seasons with an AATPt% standard deviation of 0.2096. Eric O’Dell leads the pack of players who have played more than 3 seasons, with a standard deviation of 0.1191.
  • I don't believe they're included in the tables below, but I did also calculate Goals-Per-Team-Goals-Percentage and Assists-Per-Team-Assists-Percentage. I can upload them if there's enough interest.
Forward Rankings

Top 10 Single-Season AATPt% (name — AATPt% — season — NHL Rights/Current League)
  1. Alexis Lafreniere — 0.4363— 2017–2018 QMJHL — 2020 Eligible
  2. John Tavares —0.4292 — 2007–2008 OHL — NYI
  3. Nicolas Roy — 0.4126 — 2016–2017 QMJHL— CAR
  4. Jason Robertson — 0.4083 — 2016–2017 OHL —  DAL
  5. Sam Reinhart — 0.4072 — 2013–2014 WHL— BUF
  6. Matt Barzal — 0.4028 — 2014–2015 WHL— NYI
  7. Sam Reinhart — 0.4024 — 2012–2013 WHL— BUF
  8. Mitchell Marner — 0.3986 — 2014–2015 OHL— TOR
  9. Nail Yakupov — 0.3980 — 2010–2011 OHL— COL
  10. Eric O’Dell — 0.3968 — 2007–2008— KHL
Top 10 Single-Season Raw PPTPt% (name — Raw PPTPt% — season — NHL Rights/Current League)
  1. Luke Philp — 0.6041 — 2015–2016 WHL— USports
  2. Brendan Shinnimin — 0.5428 — 2011–2012 WHL— SHL
  3. Jordan Weal — 0.5393 — 2011–2012 WHL— PHI
  4. Brayden Burke — 0.5227 — 2016–2017 WHL— ARI
  5. Sam Reinhart — 0.5215 — 2013–2014 WHL— BUF
  6. Michael Frolik — 0.5178 — 2007–2008 QMJHL— CGY
  7. John Tavares — 0.5154 — 2008–2009 OHL— NYI
  8. Josh Ho-Sang — 0.5135 — 2014–2015 OHL— NYI
  9. Aleksi Heponiemi — 0.5108 — 2017–2018 WHL— FLA
  10. Claude Giroux — 0.5100 — 2007–2008 QMJHL— PHI
Top 10 Presently Undrafted, Draft Eligible Aggregated AATPt% (name — AATPt% — league)
  1. Filip Zadina — 0.3145 — QMJHL
  2. Andrei Svechnikov — 0.2913 — OHL
  3. Trey Fix-Wolansky — 0.2828 — WHL
  4. Joe Veleno — 0.2762 — QMJHL
  5. Philipp Kurashev — 0.2644 — QMJHL
  6. Cameron Hillis — 0.2642 — OHL
  7. Akil Thomas — 0.2627 — OHL
  8. Linus Nyman — 0.2548 — OHL
  9. Anderson MacDonald — 0.2496 — QMJHL
  10. Gabriel Fortier — 0.2395 — QMJHL
Top 10 NHL-Associated Prospects, 2017–2018 AATPt% (name — AATPt% — league — NHL Team)
  1. Gabriel Vilardi — 0.3749 — OHL — LAK
  2. Robert Thomas — 0.3673 — OHL — STL
  3. Aleksi Heponiemi — 0.3303 — WHL — FLA
  4. Maxime Comtois — 0.3198 — QMJHL — ANA
  5. Cody Glass — 0.2696 — WHL — VGK
  6. Vitalii Abramov — 0.2772 — QMJHL — CBJ
  7. Nick Suzuki — 0.2740 — OHL — VGK
  8. Jordy Bellerive — 0.2660 — WHL — PIT
  9. Jason Robertson — 0.2654 — OHL — DAL
  10. Owen Tippett — 0.2632 — OHL — FLA
Defensemen Rankings

Top 10 Single-Season AATPt% (name — AATPt% — season — NHL Rights/Current League)
  1. Anthony DeAngelo — 0.3466 — 2013–2014 OHL — NYR
  2. Ryan Merkley — 0.3161 — 2016–2017 OHL — 2018 Eligible
  3. Zach Bogosian — 0.3106 — 2007–2008 OHL — BUF
  4. Ryan Merkley — 0.2968 — 2017–2018 OHL — 2018 Eligible
  5. Tyson Barrie — 0.2953 — 2009–2010 WHL — COL
  6. Evan Bouchard — 0.2926 — 2017–2018 OHL — 2018 Eligible
  7. Ryan Murphy — 0.2775 — 2010–2011 OHL — MIN
  8. Kyle Capobianco — 0.2711 — 2014–2015 OHL — ARZ
  9. Matthew Dumba — 0.2666 — 2011–2012 WHL — MIN
  10. Ryan Ellis — 0.2643 — 2008–2009 OHL — NSH
Top 10 Single-Season Raw PPTPt% (name — Raw PPTPt% — season — NHL Rights/Current League)
  1. Anthony DeAngelo — 0.4466 — 2013–2014 OHL — NYR
  2. Dallas Jackson — 0.4264 — 2009–2010 WHL — Retired
  3. Ryan Ellis — 0.4173 — 2010–2011 OHL — NSH
  4. Tyson Barrie — 0.4012 — 2009–2010 WHL — COL
  5. Travis Sanheim — 0.3930 — 2015–2016 WHL — PHI
  6. Ty Wishart — 0.3921 — 2007–2008 WHL — Czech
  7. Kale Clague — 0.3884 — 2017–2018 WHL — LAK
  8. Jake Bean — 0.3857 — 2017–2018 WHL — CAR
  9. Evan Bouchard — 0.2926 — 2017–2018 OHL — 2018 Eligible
  10. Anthony DeAngelo — 0.3760 — 2014–2015 OHL — NYR
Top 10 Presently Undrafted, Draft Eligible Aggregated AATPt% (name — AATPt% — league)
  1. Ryan Merkley — 0.3064 — OHL
  2. Thomas Gregoire — 0.2019 — QMJHL
  3. Nicolas Beaudin — 0.1986 — QMJHL
  4. Ty Smith — 0.1982 — WHL
  5. Calen Addison — 0.1849 — WHL
  6. Alexander Alexeyev — 0.1841 — WHL
  7. Evan Bouchard — 0.1759 — OHL
  8. Jared McIsaac — 0.1657 — QMJHL
  9. Radim Salda — 0.1620 — QMJHL
  10. Noah Dobson — 0.1614 — QMJHL
Top 10 NHL-Associated Prospects, 2017–2018 AATPt% (name — AATPt% — league — NHL Team)
  1. Kale Clague — 0.2309 — WHL — LAK
  2. Jake Bean — 0.2302 — WHL — CAR
  3. Henri Jokiharju — 0.2068 — WHL — CHI
  4. Nicolas Hague — 0.2001 — OHL — VGK
  5. Cal Foote — 0.1870 — WHL — TBL
  6. Pierre-Olivier Joseph — 0.1813 — QMJHL — ARZ
  7. Dennis Cholowski — 0.1779 — WHL — DET
  8. Josh Mahura — 0.1755 — WHL — ANA
  9. David Noel — 0.1745 — QMJHL — STL
  10. Cam Dineen — 0.1737 — OHL — ARZ
Data Tables (with Airtable links)

OHL 2007–2008 to 2017–2018 Data: OHL 2007-2018 - Airtable

[Screenshot: OHL table]

WHL 2007–2008 to 2017–2018 Data: WHL 2007-2018 - Airtable

[Screenshot: WHL table]

QMJHL 2007–2008 to 2017–2018 Data: QMJHL 2007-2018 - Airtable

[Screenshot: QMJHL table]

Aggregated Collapsed Data: Aggregated Collapsed - Airtable

[Screenshot: aggregated collapsed table]

Aggregated Uncollapsed Data:

[Screenshot: aggregated uncollapsed table]

Complete datasets and code for the project are available on my GitHub here.
 

CallMeShaft

keep calmala
Apr 14, 2014
16,297
22,719
I mean this in the best way possible, you are a huge nerd. Thanks for doing this.

One thing I do want to ask: how do these numbers equate to points in the NHL? Like you have Jokiharju, who I'm very interested in as a Hawks fan, at 0.2068. Is there any way to use that number to calculate how many points he'd likely get next season if he played in the NHL?
 

37 others

Registered User
Apr 18, 2017
465
235
Haha, thanks! Right now, I have no way of comparing prospects' metrics to NHLers', but I'll definitely add finding a way to do that to my to-do list!
 

NHL RankKing

Fantasy Guru
Aug 31, 2013
863
106
Hockeytown
www.nhlrankking.com
Awesome! Very ambitious...I love it!!

I actually finished a project like this last fall for my Masters and created a stat called PNHLe.
You can read up on it more here: http://nhlrankking.com/PNHLe.htm

Also, it's incorporated into an app I developed (for iOS). I think there is a link in my signature and it's completely free.

Send me a message if you have any questions or need someone to bounce ideas off of
 

37 others

Registered User
Apr 18, 2017
465
235
Really cool stuff! One of my next projects is to use prospects' AATPt% to project success in the NHL, and it sounds like it'll come out similar to your project. It's really cool that you were able to do that for your Master's! Was it for your thesis?
 

TheWhiskeyThief

Registered User
Dec 24, 2017
1,625
497
Interesting stuff.

If you’re looking for predictive value, I’d suggest looking at height/weight ratios for something with more granularity. Production for players a standard deviation away from average size should have a similar correlation.

I’d posit that a 6’1” 30g scorer in the O has a better shot at the next level than a 5’9” 35g scorer in the Q or a 6’3” 35g scorer in the Dub. But I’d love to know where that inflection point really is.
 

37 others

Registered User
Apr 18, 2017
465
235
That's on my to-do list. I basically want to use machine learning to figure out the best weights for height, weight, AATPt%, and probably a few other metrics as well to get the best projection of NHL stats. It won't be a perfect projection by any means, but it'll be cool to see at least.
 

Hynh

Registered User
Jun 19, 2012
6,170
5,345
Interesting stuff.

If you’re looking for predictive value, I’d suggest looking at height/weight ratios for something with more granularity. Production for players a standard deviation away from average size should have a similar correlation.

I’d posit that a 6’1” 30g scorer in the O has a better shot at the next level than a 5’9” 35g scorer in the Q or a 6’3” 35g scorer in the Dub. But I’d love to know where that inflection point really is.
Shouldn't it be a 5'11" player in the Q?

I think something that might be interesting is that the WHL is finally moving to a 68 game schedule next year. 4 fewer games per year means more time for practice.
 

TheWhiskeyThief

Registered User
Dec 24, 2017
1,625
497
That's on my to-do list. I basically want to use machine learning to figure out the best weights for height, weight, AATPt%, and probably a few other metrics as well to get the best projection of NHL stats. It won't be a perfect projection by any means, but it'll be cool to see at least.

Just from staring at numbers for years, I’ve come around to an anecdotal concept that an acceptable height/weight number (in metric, cm minus kg) in your draft year is H-W=100. If you’re closer to 90, you’re closer to NHL readiness (or fat), while if you’re closer to 110 you are going to have a tough time getting to the show unless you have sublime skill. Plenty of 1 ppg forwards in the CHL who weren’t in the “average” NHL size range never made it past the ECHL. The farther away from “average,” the more obscene the production needs to become: the small guy needs that much more skill to make up for the lack of size, and the big guy needs to show he has skill beyond his size.

It’s an operable concept considering the average NHL player number (185cm-95kg) would be 90, but the farther away you get from average, it breaks down to where shorter players in their draft year need to be closer to 90 while taller guys can be slightly above 100 and still project well. If they’re 110 it’s almost too much ground to make up.

Definitely not a hard and fast rule, but something to keep in the back of your mind.
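For anyone who wants to play with that rule of thumb, a minimal sketch of the H-W number is below. The function names and the "nearest reference point" helper are made up for illustration; the 90/100/110 reference values are the ones from this post, and the 5'8", 165 lbs example uses the EliteProspects listing for Fix-Wolansky quoted in the opening post.

```python
CM_PER_INCH = 2.54
KG_PER_LB = 0.45359237

# Reference values taken from the post; the labels paraphrase it.
REFERENCE_POINTS = {
    90: "average NHL build",
    100: "acceptable in a draft year",
    110: "a lot of ground to make up",
}

def height_weight_number(height_cm, weight_kg):
    """Height in cm minus weight in kg."""
    return height_cm - weight_kg

def from_imperial(feet, inches, pounds):
    """Same number, starting from a feet/inches/pounds listing."""
    height_cm = (feet * 12 + inches) * CM_PER_INCH
    return height_weight_number(height_cm, pounds * KG_PER_LB)

def nearest_reference(number):
    """Return whichever of 90, 100, 110 the number is closest to."""
    return min(REFERENCE_POINTS, key=lambda ref: abs(number - ref))

nhl_average = height_weight_number(185, 95)
small_forward = from_imperial(5, 8, 165)
print(nhl_average, REFERENCE_POINTS[nearest_reference(nhl_average)])
print(round(small_forward, 1), REFERENCE_POINTS[nearest_reference(small_forward)])
```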
 

93LEAFS

Registered User
Nov 7, 2009
34,173
21,367
Toronto
Just want to put this out there: while it is in Vollman's book, the work you are examining and trying to improve upon is Iain Fyffe's.

Looking at these numbers compared to Fyffe's system, I'm not sure how much percentage of points matters and to what level it should be weighted. But, I obviously appreciate the hard work involved on this.

The big questions when trying to improve on Fyffe's model are: How much does a percentage of team points matter? How much does the type of point matter (for example, a primary assist at 5v5 vs. a secondary assist on a 5-on-3 PP)? The other issue I feel exists is that, with no TOI stats available, it's unclear how much ice time is skewing things. Being on a bad team can lead to getting significantly more ice time and opportunity.

I'd also love to see it broken down into individual seasons, and possibly with a regression technique to compare it to how these guys started in pro hockey. While data from their 16-year-old season is important, the most important is their most recent season. So it would be interesting to see the results with a weighting like this for a guy entering the NHL at 19: (5 x 18-year-old season + 3 x 17-year-old season + 1 x 16-year-old season) / 9.
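That 5/3/1 weighting is easy to sketch. The function name and the sample season values below are hypothetical; dropping the unused weights and renormalizing when a player has fewer than three seasons is an assumption, not something specified above.

```python
def recency_weighted(season_values, weights=(5, 3, 1)):
    """Weighted average of season values, most recent season first.

    With fewer seasons than weights, the extra weights are dropped and the
    divisor shrinks to match (an assumption made for this sketch).
    """
    pairs = list(zip(season_values, weights))
    if not pairs:
        raise ValueError("need at least one season")
    total_weight = sum(weight for _, weight in pairs)
    return sum(value * weight for value, weight in pairs) / total_weight

# Hypothetical AATPt% values for the 18-, 17- and 16-year-old seasons.
three_seasons = recency_weighted([0.30, 0.25, 0.20])
two_seasons = recency_weighted([0.30, 0.25])
print(round(three_seasons, 4), round(two_seasons, 4))
```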
 

93LEAFS

Registered User
Nov 7, 2009
34,173
21,367
Toronto
Just from staring at numbers for years, I’ve come around to an anecdotal concept that an acceptable height/weight number (in metric, cm minus kg) in your draft year is H-W=100. If you’re closer to 90, you’re closer to NHL readiness (or fat), while if you’re closer to 110 you are going to have a tough time getting to the show unless you have sublime skill. Plenty of 1 ppg forwards in the CHL who weren’t in the “average” NHL size range never made it past the ECHL. The farther away from “average,” the more obscene the production needs to become: the small guy needs that much more skill to make up for the lack of size, and the big guy needs to show he has skill beyond his size.

It’s an operable concept considering the average NHL player number (185cm-95kg) would be 90, but the farther away you get from average, it breaks down to where shorter players in their draft year need to be closer to 90 while taller guys can be slightly above 100 and still project well. If they’re 110 it’s almost too much ground to make up.

Definitely not a hard and fast rule, but something to keep in the back of your mind.
The study he's basing this on came to the conclusion that a player's height at his draft date mattered, but his body mass index and weight didn't.
 

Henkka

Registered User
Jan 31, 2004
32,316
13,331
Tampere, Finland
That's on my to-do list. I basically want to use machine learning to figure out the best weights for height, weight, AATPt%, and probably a few other metrics as well to get the best projection of NHL stats. It won't be a perfect projection by any means, but it'll be cool to see at least.

Nice, really nice work. Could you do another adjustment that weights goal scoring more heavily? Like doubling the value of goals vs. assists.
 

37 others

Registered User
Apr 18, 2017
465
235
Yup, that's not hard to do at all! In the Python script, I have assists/team assists and goals/team goals tracked, so I might upload some of those stats soon.
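Since goals and assists are tracked separately, a goal-weighted version of the per-game share could be sketched like this. The function name and sample numbers are invented, the 2x goal weight is just the value suggested above, and returning 0 when the team has no points is an assumption.

```python
def weighted_share(goals, assists, team_goals, team_assists, goal_weight=2.0):
    """Player's share of team points in one game, with goals counted extra.

    With goal_weight=1.0 this is the ordinary points / team points share.
    """
    team_total = goal_weight * team_goals + team_assists
    if team_total == 0:
        return 0.0  # assumption: no team points means no share to assign
    return (goal_weight * goals + assists) / team_total

# Hypothetical game: player has 1 goal and 1 assist; team has 3 goals, 5 assists.
plain = weighted_share(1, 1, 3, 5, goal_weight=1.0)
doubled = weighted_share(1, 1, 3, 5)
print(round(plain, 4), round(doubled, 4))
```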

I just need to find modifier values for a player's height and I'll include those as well, although I think the modifier should only be used to discount short players' stats rather than boost super tall players' stats...

Edit: I have been giving this some thought, and my intention with this metric is not to make it a catch-all projection metric. I can totally play around with adjusting values for players' heights, but this obviously doesn't make much sense if I adjust for height for players in the NHL (since I'm planning on doing the same analysis for the NHL).
 

37 others

Registered User
Apr 18, 2017
465
235
Just want to put this out there: while it is in Vollman's book, the work you are examining and trying to improve upon is Iain Fyffe's.

Thanks for the heads up! I actually just bought the book for myself so I've been doing some reading into it.

How much does a percentage of team points matter? How much does the type of point matter (for example a primary assist at 5v5 vs a secondary assist on a 5-3 PP)?

The WHL, OHL, and QMJHL web pages make it harder to get that information than what I'm currently doing (scraping the stats table for each game summary), but I can probably get it by looking at the play-by-play summaries (assuming whoever is listed first for assists on a goal got the primary assist).
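Here is a sketch of what that tally might look like once the play-by-play goals have been scraped. The event layout below is hypothetical (the real league pages may be structured quite differently), and it bakes in the assumption stated above that the first listed assist is the primary one.

```python
from collections import Counter

# Hypothetical parsed play-by-play goal events; assists are in listed order.
goal_events = [
    {"scorer": "Player A", "assists": ["Player B", "Player C"]},
    {"scorer": "Player B", "assists": ["Player A"]},
    {"scorer": "Player C", "assists": []},
]

def tally_point_types(events):
    """Count goals, primary assists, and secondary assists per player."""
    tallies = {
        "goals": Counter(),
        "primary_assists": Counter(),
        "secondary_assists": Counter(),
    }
    for event in events:
        tallies["goals"][event["scorer"]] += 1
        # Assumption: first listed assist is primary, second is secondary.
        slots = ("primary_assists", "secondary_assists")
        for slot, name in zip(slots, event["assists"]):
            tallies[slot][name] += 1
    return tallies

tallies = tally_point_types(goal_events)
for kind, counts in tallies.items():
    print(kind, dict(counts))
```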
 

93LEAFS

Registered User
Nov 7, 2009
34,173
21,367
Toronto
I think Prospect-stats.com has all the info for those leagues.

The tracking is somewhat unreliable, but the mistakes will somewhat cancel out.
 

37 others

Registered User
Apr 18, 2017
465
235
Awesome! I’d love to see it separated out by only player seasons in their first year of NHL draft eligibility.
That should be really easy to do! Now that it's the summer, I should have plenty of time to work on this project.
 
