Corsi: What Am I Missing? | HFBoards - NHL Message Board and Forum for National Hockey League

Synergy27

So, I've been messing around with some data from War on Ice for the last few minutes. I did a very simple analysis to determine how often the team that wins the Corsi battle also wins the game. The results surprised me, so I'm wondering if I'm just overlooking something. Here's what I found, using data from WOI going back to 2002-03 (14,861 games):

In the regular season, the team that wins the Corsi battle only wins the game about 43.7% of the time.

In the playoffs the results are similar but slightly muted. The team that wins the Corsi battle wins the game about 46.3% of the time.

In both cases, losing the Corsi battle leads to winning the game (or is associated with winning the game, correlation doesn't equal causation) a lot more often than winning it. So why are so many people so worked up about CF% being such an important part of being a winning team? What am I missing?
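For anyone who wants to repeat the tally, here is a minimal sketch of the kind of count described above. The `Game` record, its field names, and the four sample games are made up for illustration and are not War on Ice's actual format. Skipping games with tied shot attempts is also an assumption, since the post doesn't say how those were handled.

```python
from dataclasses import dataclass


@dataclass
class Game:
    # Hypothetical per-game record; not War on Ice's real schema.
    home_cf: int      # shot attempts (Corsi) by the home team
    away_cf: int      # shot attempts (Corsi) by the away team
    home_goals: int
    away_goals: int


def corsi_winner_win_rate(games):
    """Share of games in which the team with more shot attempts also won."""
    decided = 0
    wins = 0
    for g in games:
        corsi_diff = g.home_cf - g.away_cf
        goal_diff = g.home_goals - g.away_goals
        # Assumption: ignore games with no Corsi winner or no game winner.
        if corsi_diff == 0 or goal_diff == 0:
            continue
        decided += 1
        if (corsi_diff > 0) == (goal_diff > 0):
            wins += 1
    return wins / decided if decided else float("nan")


# Made-up games, only to show the function running.
sample = [
    Game(55, 40, 2, 3),  # home wins Corsi, loses game
    Game(48, 52, 1, 4),  # away wins Corsi and game
    Game(60, 45, 3, 1),  # home wins Corsi and game
    Game(50, 50, 2, 1),  # Corsi tied, skipped
]
rate = corsi_winner_win_rate(sample)
print(f"Corsi winner won {rate:.1%} of decided games")
```

Note that this uses full-game totals at all score states, which is exactly what the replies below take issue with.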
 
Did you account for score effects? I.e. when a team is up, it tends to go into a shell and have many more opportunities toward their own net than toward the opposition's.
 
I didn't account for anything, and I'm aware of the impact of score on Corsi.

But what I'm getting at is, since we know that a trailing team is going to press and a leading team is going to defend, and that this will have a direct impact on the Corsi situation, why is it considered so important to have a high CF%?
 
Oh, okay. My next question then. Is it considered very important to have a high CF%?
 
A lot of people are discounting the Rangers' performance this year and/or bracing themselves for an inevitable "regression" in the playoffs because they have a poor CF%. I was under the assumption that CF% was the main "advanced stat" that people liked to use to predict whether or not good teams are really good.
 
Common misconception. Usually it's a combination of several advanced and possession stats: Corsi, Fenwick, and PDO, just to name a few.
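For reference, a short sketch of how these three are commonly defined. The function names and sample numbers are made up for illustration. PDO is written here on a scale where league average sits around 100; some sites scale it differently.

```python
def corsi(shots_on_goal, missed, blocked):
    """All shot attempts: on goal, missed the net, or blocked."""
    return shots_on_goal + missed + blocked


def fenwick(shots_on_goal, missed):
    """Unblocked shot attempts: on goal or missed the net."""
    return shots_on_goal + missed


def cf_percent(corsi_for, corsi_against):
    """A team's share of all shot attempts, in percent."""
    return 100.0 * corsi_for / (corsi_for + corsi_against)


def pdo(goals_for, shots_for, goals_against, shots_against):
    """Shooting percentage plus save percentage."""
    shooting_pct = 100.0 * goals_for / shots_for
    save_pct = 100.0 * (1 - goals_against / shots_against)
    return shooting_pct + save_pct


# Made-up single-game numbers.
team_cf = corsi(shots_on_goal=30, missed=12, blocked=15)
team_ff = fenwick(shots_on_goal=30, missed=12)
team_cf_pct = cf_percent(team_cf, 43)
team_pdo = pdo(goals_for=3, shots_for=30, goals_against=2, shots_against=25)
print(team_cf, team_ff, team_cf_pct, round(team_pdo, 1))
```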
 
I'm not a huge fan of Corsi personally.

It's certainly not a bad stat, but it clearly, clearly has not earned the "Holy Grail" status that some people have attributed to it.
 
You also have teams like Ron Wilson's Leafs that I imagine had good Corsi because they shot from everywhere. I mean, Jason Blake was probably a very good Corsi player just because he took about 12 shots a game.

Still, Corsi is very flawed by itself. It's a misunderstanding of what it means that leads people to say advanced stats are not good indicators. Truth be told, I doubt that the best advanced stats for predicting success have even been created yet. Some may be almost impossible to track without the resources of a multi-million-dollar organization like an NHL team.
 
Yeah, and this annoys me a bit. It would be so easy for the NHL to just present a simple zone stat.

It's so obvious with, say, loyal 4th lines that just get the puck deep: they will get bad Corsi while in fact having much better possession numbers, and possession is what people are looking for. Corsi also favors teams that go really deep with all three forwards. With pressure like that you open up the points, and a team often ends up with only one option: put the puck back to a D who fires it into a 5-man wall, bumping their Corsi.

The big thing in hockey that people are missing right now, when most seem to have realized the importance of puck possession, is how teams play in the attacking zone and in the transition game. While puck possession is important, scoring goals is what matters, and puck possession is not equal to scoring goals.
 
There are a lot of Corsi advocates who mistakenly believe that correlation implies causation, i.e. that high Corsi causes more goals.

However, originally Corsi was designed as an observation that the teams that performed best tended to have high CF%.

I don't know for sure, but I suspect that some NHL teams have analytics people on staff that succumb to the above fallacy. I wouldn't be surprised if Toronto paid big money to bring in a statistician who doesn't have the passion for the sport necessary to understand how Corsi should and should not be used, and he's telling them to focus on increasing shot volume.

I would be interested to see how the CF% to win% correlation looked when Corsi was a new concept and these misconceptions hadn't been born yet. Unfortunately I don't have the time to compile that data at this moment but I might take up that endeavor if nobody else beats me to it.
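If somebody does take that up, the correlation itself is the easy part. Below is a minimal sketch of a plain Pearson correlation between team CF% and win%. The function name and the five team-season pairs are invented placeholders, not real NHL numbers; the real work would be compiling per-season data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Placeholder team-seasons (CF%, win%), invented for illustration only.
cf_pct = [53.1, 51.4, 50.2, 48.7, 46.9]
win_pct = [0.61, 0.52, 0.55, 0.47, 0.41]
print(f"r = {pearson_r(cf_pct, win_pct):.2f}")
```

Running this once per season, or once per era, would give the before-and-after comparison described above.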
 
However, originally Corsi was designed as an observation that the teams that performed best tended to have high CF%.

Originally it was devised by Corsi (but not called that by him) to help him measure more accurately how busy (number of events) the goaltenders he was coaching were, rather than going by shots on goal alone.

Tim Barnes and Gabriel Desjardins ran with the idea as a proxy for possession and all that, probably because there wasn't much else available.

I don't have a lot of confidence in Corsi predicting much of anything over a meaningful time period to an active club -- which tends to be more short term (win now) because over the long term you introduce too many changes. Certainly nothing that couldn't be discerned by shots as well, anyways.
 
Another thing is that if you have a great goalie and are a middle-of-the-pack team (e.g. the Habs are 23rd in Corsi but have Carey Price), you can still get 100+ points, win your division, and make the conference finals. Another major problem with Corsi is that a team (like the Leafs under Horachek) can take a billion crappy shots that the goalie doesn't have to put effort into and then win the Corsi battle when they were easily outplayed.
 
There is some misinformation in this thread and I'll go into more detail below. First off, in answer to your question, the result you are getting is a combination of omitting score effects, assuming Corsi is the only possession metric and poor methodology.

First of all, for what you are trying to do unblocked shot attempts (Fenwick) probably works better than all shot attempts (Corsi). This is fairly minor.

The real meat is that you are ignoring score effects, as others have mentioned, and this is compounded by the method you are using. The "score effect" is the tendency of teams that are behind to push harder to catch up and therefore have better possession numbers than they normally would. There are a couple of ways to account for this, but the best one currently available is to use score-adjusted Corsi/Fenwick.

The third issue is your methodology throws away large parts of the signal you want to measure. When a good team falls behind they are more likely to out-shoot their opposition and do so by a wider margin. By using a binary win/loss for a single game you are essentially discarding all this data. Since you have thrown away so much of the signal of interest it's much easier for noise/bias like score effects to show up and skew your result.
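To see how score effects alone can flip a binary "Corsi winner vs. game winner" tally, here is a toy simulation. It is only a sketch: the two teams are identical, and every parameter (attempt rate per tick, the boost for the trailing team, the drag on the leading team, the chance an attempt scores) is a made-up assumption, not something fitted to NHL data.

```python
import random


def simulate_game(rng, score_effects, ticks=120, base_rate=0.4,
                  trailing_boost=1.3, leading_drag=0.8, goal_prob=0.05):
    """Simulate one game between two identical teams, tick by tick."""
    attempts = [0, 0]
    goals = [0, 0]
    for _ in range(ticks):
        lead = goals[0] - goals[1]  # score state at the start of the tick
        for team in (0, 1):
            rate = base_rate
            if score_effects and lead != 0:
                team_lead = lead if team == 0 else -lead
                # Leading team sits back, trailing team presses.
                rate *= leading_drag if team_lead > 0 else trailing_boost
            if rng.random() < rate:
                attempts[team] += 1
                if rng.random() < goal_prob:
                    goals[team] += 1
    return attempts, goals


def corsi_winner_win_rate(n_games, score_effects, seed=0):
    """Share of decided games where the team with more attempts won."""
    rng = random.Random(seed)
    decided = 0
    wins = 0
    for _ in range(n_games):
        attempts, goals = simulate_game(rng, score_effects)
        corsi_diff = attempts[0] - attempts[1]
        goal_diff = goals[0] - goals[1]
        if corsi_diff == 0 or goal_diff == 0:
            continue  # no Corsi winner or no game winner
        decided += 1
        if (corsi_diff > 0) == (goal_diff > 0):
            wins += 1
    return wins / decided


flat = corsi_winner_win_rate(4000, score_effects=False)
shelled = corsi_winner_win_rate(4000, score_effects=True)
print(f"no score effects:   {flat:.1%}")
print(f"with score effects: {shelled:.1%}")
```

With these settings the rate with score effects should come out lower than the rate without them, even though neither team is better than the other. That is the sense in which a single-game win/loss tally mostly measures who was protecting a lead.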


Stepping back a bit, possession metrics like Corsi are important for a few reasons.

At the highest level, score-adjusted Corsi/Fenwick are the best predictors of future wins currently available. (It's important to remember this means they tell you who has the best chance to win, not who will win. Think of it like roulette in a casino: the house has the best chance to win. That doesn't mean it wins every game or that some people don't walk away ahead, only that the odds are not in their favor.)

It's not a complete metric in that it doesn't factor in goaltending and special teams, but even without them it's the most predictive metric available, and because they are not already factored in you can still account for them separately. E.g., the Rangers are 18th overall in 5-on-5 score-adjusted Corsi, but because they have good goaltending and special teams they are certainly much better than the 18th-best team overall. They are still unlikely to be the best team in the NHL.

While Corsi is a proxy for puck possession, it also correlates well with scoring chances. This is very useful because tracking scoring chances is very subjective. Even if trackers generally agree on who is getting more, the numbers themselves end up different, so you can't just combine numbers generated by different people, and there are too many games to have them all done by the same person. Since Corsi correlates with scoring chances, you can use it as a proxy for who is getting more of them. (Which makes sense, or possession wouldn't matter.)

Mathematically speaking, the goal differential in a hockey game is (your shots on goal * (1 - their save %)) - (their shots on goal * (1 - your save %)), where shots on goal are the subset of shot attempts that reach the net.

It turns out that Corsi/Fenwick (shot attempts/unblocked shot attempts) is the most manageable part of this equation. If a team wants to improve its Corsi/Fenwick, it can identify coaches, players, systems, play styles, tactics, etc. that improve it much more readily than it can for other aspects of the game.
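A quick worked version of that arithmetic, as a sketch. It assumes expected goals are shots on goal times the share of shots the opposing goalie does not stop, i.e. (1 - save %). The function name and all the numbers are illustrative.

```python
def expected_goal_diff(shots_for, shots_against, opp_save_pct, own_save_pct):
    """Expected goal differential from shot counts and save percentages."""
    goals_for = shots_for * (1 - opp_save_pct)
    goals_against = shots_against * (1 - own_save_pct)
    return goals_for - goals_against


# Made-up baseline: 30 shots for, 28 against, .910 opposing goalie, .925 own goalie.
baseline = expected_goal_diff(30, 28, opp_save_pct=0.910, own_save_pct=0.925)

# Same goaltending, but the team generates three more shots on goal.
more_shots = expected_goal_diff(33, 28, opp_save_pct=0.910, own_save_pct=0.925)

print(f"baseline: {baseline:+.2f} goals per game")
print(f"+3 shots: {more_shots:+.2f} goals per game")
```

The save percentages are held fixed between the two calls, so the whole change comes from shot volume, which is the lever the paragraph above calls the most manageable.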

It's much more difficult to influence save % or shooting %. Over large enough samples the only thing that measurably impacts save % is the goalie, while the only thing that seems to measurably impact team shooting is the career shooting percentage of the players on the team. Teams with a lot of players who have had high sh% and lots of shots over their career may be able to sustain sh% higher than the NHL norm, but everyone else tends to regress to the norm. Likewise sv% tends to regress to the career sv% of the goaltenders on the team regardless of coaching, strategy, "attention to detail" or any of the other catch phrases that get thrown around.
 
Is there a stat yet that measures the effect of getting the puck down the ice, from the defensive zone to the offensive zone, and its effect on the subsequent shift?

What I mean is that since hockey shifts are about a minute in length a good fourth line can move the puck down the ice and then turn it over to the top line who scores the goal.

The fourth line will likely see no credit while the first line gets the points.

The fourth line should get some credit for starting the play and the handoff but that gets into a larger question of how goals are scored (turnovers, controlling offensive zone possession, number of shots within a particular timespan, etc.)

Corsi, Fenwick, and PDO are good starting points for me but I have a tendency to focus on more macro level items and then drill down trying to discover why something occurred or is occurring.
 
The real meat is that you are ignoring score effects, as others have mentioned, and this is compounded by the method you are using. The "score effect" is the tendency of teams that are behind to push harder to catch up and therefore have better possession numbers than they normally would. There are a couple of ways to account for this, but the best one currently available is to use score-adjusted Corsi/Fenwick.

Thank you for the well thought out post. I admit to not having thought about this for an extended period of time, and was really just curious why my admittedly simple analysis was wrong or just too simple to be meaningful.

I agree with just about everything you've written, but the one thing I can't completely get over is the validity of the score adjustments. I don't think that every 1-goal game is the same, for example, and I also don't think that being tied in the third period is the same as being tied at the beginning of the game. I don't know if the score adjustments try to account for these potential differences or not, but they don't seem to superficially.
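One way to check that worry against data would be to split shot attempts by period as well as score state and see whether CF% differs between, say, "tied in the 1st" and "tied in the 3rd". The sketch below does only that bucketing. The event tuples are a hypothetical format with made-up values, and this is not a description of how any published score adjustment works.

```python
from collections import defaultdict


def cf_pct_by_bucket(events):
    """CF% for one team, split by (period, score state).

    Each event is (period, lead, is_for): `lead` is the team's goal
    differential when the attempt was taken, and `is_for` is True when
    the team itself took the attempt.
    """
    counts = defaultdict(lambda: [0, 0])  # bucket -> [attempts for, against]
    for period, lead, is_for in events:
        if lead > 0:
            state = "leading"
        elif lead < 0:
            state = "trailing"
        else:
            state = "tied"
        counts[(period, state)][0 if is_for else 1] += 1
    return {bucket: 100.0 * f / (f + a) for bucket, (f, a) in counts.items()}


# Made-up shot-attempt events for a single team.
events = [
    (1, 0, True), (1, 0, False), (1, 0, True),
    (3, 0, True), (3, 0, False), (3, 0, False), (3, 0, False),
    (3, 1, False), (3, 1, False), (3, 1, True),
]
result = cf_pct_by_bucket(events)
for bucket in sorted(result):
    print(bucket, f"{result[bucket]:.1f}%")
```

If the per-period buckets for the same score state looked about the same over a big sample, lumping them together would be defensible. If not, that would support the objection.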
 
So, I've been messing around with some data from War on Ice for the last few minutes. I did a very simple analysis to determine how often the team that wins the Corsi battle also wins the game. The results surprised me, so I'm wondering if I'm just overlooking something. Here's what I found, using data from WOI going back to 2002-03 (14,861 games):

In the regular season, the team that wins the Corsi battle only wins the game about 43.7% of the time.

In the playoffs the results are similar but slightly muted. The team that wins the Corsi battle wins the game about 46.3% of the time.

In both cases, losing the Corsi battle leads to winning the game (or is associated with winning the game, correlation doesn't equal causation) a lot more often than winning it. So why are so many people so worked up about CF% being such an important part of being a winning team? What am I missing?

That's because Corsi, especially when adjusted to score in some way, has a very high predictive value. Essentially, Corsi tells you what will happen in the future better than other stats.

Which has nothing to do with who won the game that's already over.
 
Thank you for the well thought out post. I admit to not having thought about this for an extended period of time, and was really just curious why my admittedly simple analysis was wrong or just too simple to be meaningful.

I agree with just about everything you've written, but the one thing I can't completely get over is the validity of the score adjustments. I don't think that every 1-goal game is the same, for example, and I also don't think that being tied in the third period is the same as being tied at the beginning of the game. I don't know if the score adjustments try to account for these potential differences or not, but they don't seem to superficially.

One thing to keep in mind about all this is that we're using stone age tools here. Hockey's publicly available advanced stats are a joke. They are not advanced. They are just all we have at the moment, and they are being shoehorned into uses they may or may not be that great for.

Corsi is a terrible proxy for possession, really. I mean, essentially you're measuring possession by how often you get rid of the puck.

Yeah being in a position to make a shot attempt correlates with you having had the puck but there is a lot of wiggle room there.

Teams have their own internal metrics that I can only assume are much better than the stuff we're using publicly and I think we'll see a lot more interesting things come out once they are tracking all the players and the puck etc. in real time all the time.
 
Wrong. It's a pretty damn good measure of possession:
http://www.pensionplanpuppets.com/2013/9/16/4727746/leafs-attack-time-at-the-halfway-mark
 
I don't want to start a new thread for this, but I can't find regular season stats on NHL.com anymore. Am I losing my mind or just missing something?
 
I'm not a huge fan of Corsi personally.

It's certainly not a bad stat, but it clearly, clearly has not earned the "Holy Grail" status that some people have attributed to it.

This. Corsi is something to look at, but it's not everything like people make it out to be.
 
