# The Puck Stops Here

## An Introduction To Some Corsi Issues

I thought it would be a good idea to write an article in an accessible (as non-mathematical as possible) style to discuss Corsi analysis and some of the issues discussed on the internet regarding it.

Buffalo Sabres assistant coach Jim Corsi wanted to better understand how busy goalies had been, so instead of counting only shots against, he began counting blocked and missed shots as well. After a while it was realized that this could be a basis for individual player assessment. It was found that the differential of Corsi events (the difference between the number of shots attempted for and against) correlates extremely strongly with which team possesses the puck and also with which zone the puck is in. A team with a high Corsi possesses the puck for the majority of the game and keeps the puck in its opponent’s zone. These are good things that are valuable to playing winning hockey, and this is the best (known?) statistical measure available from the information the NHL routinely publishes online.

We measure the differential of Corsi events when an individual player is on the ice (his Corsi rating) to try to gauge which players are driving puck possession and puck position during games. The problem with looking at things on an individual level is that Corsi is inherently a team gauge and must be put into context to try to get individual ratings. A player on a good team will be more likely to have a good Corsi than a player on a bad team. A player who plays in offensive situations or against weak opposition will be more likely to have a good Corsi than one who doesn’t (and one can imagine several other reasons a player might have a good Corsi related to the context in which he plays).
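As a rough illustration of the counting described above, the sketch below tallies a player's Corsi differential from a list of shot-attempt events. The event layout (a dictionary with `team`, `kind` and `on_ice` fields) and the function name `corsi_differential` are assumptions made for this example, not the format of any published data set, and the sample events are made up.

```python
# Shot attempts that count as Corsi events: shots on goal, missed shots, blocked shots.
CORSI_KINDS = {"shot", "miss", "block"}


def corsi_differential(events, player, team):
    """Return shot attempts for minus attempts against while `player` is on the ice.

    Each event is a hypothetical dict:
      {"team": team taking the attempt,
       "kind": "shot", "miss", "block" or anything else,
       "on_ice": set of players on the ice at the time}
    """
    attempts_for = 0
    attempts_against = 0
    for event in events:
        # Skip non-attempts and anything that happened while the player was on the bench.
        if event["kind"] not in CORSI_KINDS or player not in event["on_ice"]:
            continue
        if event["team"] == team:
            attempts_for += 1
        else:
            attempts_against += 1
    return attempts_for - attempts_against


# Made-up events purely to show the call; not real game data.
events = [
    {"team": "A", "kind": "shot", "on_ice": {"p1", "p2"}},
    {"team": "A", "kind": "miss", "on_ice": {"p1"}},
    {"team": "B", "kind": "block", "on_ice": {"p1", "p2"}},
    {"team": "A", "kind": "faceoff", "on_ice": {"p1"}},
    {"team": "B", "kind": "shot", "on_ice": {"p2"}},
]
print(corsi_differential(events, "p1", "A"))
```

A real version would also filter to 5 on 5 play before counting, which this sketch leaves out.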

In order to remove the bias of the situation in which a player plays, Corsi is usually only recorded in 5 on 5 situations. Thus power play and penalty kill situations are removed. In some cases data is also recorded in these situations but it is usually kept separately. In power play situations any player would be expected to have a good Corsi and in penalty kill situations it is expected to be poor. Nevertheless, it is possible to apply Corsi analysis to special teams situations. Since far less time is played on special teams than at even strength, the sample size is smaller, so there is more random error in those numbers.

With an even strength Corsi, I find the most useful corrections (in that they are the biggest) are the team the player plays on and the situation in which the player plays (as measured by zone starts - does he start more shifts in the offensive or defensive zones).  These corrections make Corsi a more useful number to compare various players because they better include the context in which the player plays.
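One way to picture these two corrections is sketched below. The article does not spell out how the adjustments are calculated, so both helpers are assumptions: a team-relative rate (the player's on-ice Corsi rate minus his team's rate when he is off the ice) and the share of his non-neutral zone starts that come in the offensive zone. The function names and the per-60-minute scaling are illustrative choices only.

```python
def corsi_per_60(differential, minutes):
    """Scale a Corsi differential to a per-60-minute rate."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return differential * 60.0 / minutes


def relative_corsi(on_diff, on_minutes, off_diff, off_minutes):
    """On-ice Corsi rate minus the team's rate with the player off the ice.

    This is an assumed form of the team correction, not a formula from the article.
    """
    return corsi_per_60(on_diff, on_minutes) - corsi_per_60(off_diff, off_minutes)


def offensive_zone_start_share(offensive_starts, defensive_starts):
    """Fraction of offensive plus defensive zone starts taken in the offensive zone."""
    total = offensive_starts + defensive_starts
    if total == 0:
        return None  # no zone starts recorded, so the share is undefined
    return offensive_starts / total
```

A player with a strong relative number despite a low offensive zone start share is doing well in a hard role; the same raw Corsi earned from mostly offensive zone starts is less impressive.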

The complaint is often that Corsi (even with various adjustments) is not a “be all and end all” statistic to rank all players.  That is unfair because no existing number in hockey is held to that standard.  Goals, points, +/- ratings etc. are not a “be all and end all” statistic to rank all players, but like Corsi they give us useful information about how players are playing.

The internet father of Corsi analysis is Gabe Desjardins, who runs the fabulous Behind the Net website where he publishes the numbers needed to do Corsi analysis. The numbers go back to the 2007/08 season, so we cannot do Corsi analysis for most of the NHL’s history; only the recent years are available. Sufficient data was not recorded throughout most of the NHL’s history to allow this.

Desjardins estimates that about 40% of the game is captured by Corsi analysis. It measures the shots attempted at even strength. This is a strong measure of puck possession and puck position on the ice, but there is a lot it does not measure. It does not measure special teams play. It does not measure the ability to score goals; it only measures the ability to generate shots. Not all shots are equal. Some shots are better than others and have a better chance of becoming goals. Shot quality is not measured in Corsi analysis. Neither is a particular player’s ability to score (his finishing ability) in a given situation. It also does not include a team’s ability to prevent quality shots or to make saves (generally this is goaltending). In order to turn a Corsi rating into a more useful number, these things must be taken into account where possible. Corrections can be made for some of these effects. Some of these effects are not too dependent upon the individual player involved. The save percentage while a player is on the ice, for example, is strongly dependent upon goaltending and not the individual player involved.

In hockey, scoring goals is more important than generating shots. Some think that a better analysis can be done using goals instead of shots. This is the position of David Johnson, who runs the website Hockey Analysis. He is often seen as a “heretic” in the sabermetrics community because, although that idea sounds simple enough, it doesn’t hold up to scrutiny.

In order to go from shots to goals you merely multiply by the shooting percentage of the player involved. On the team level, as we are talking about all shots when a player is on the ice, shooting percentage is not an individual number; it is the number for all players on the ice when a player is on the ice. How much control does a player have over the shooting percentage of another player? If the player is a good set-up man who can get his teammates into good scoring opportunities, he has some control, but it is generally very limited. On the flip side, the save percentage of a given team is not well controlled by any skater on the ice.

Even on an individual level, shooting percentage for a player is not a very repeatable number.  There are often wide differences in shooting percentage for an individual player from year to year.  These shooting percentage differences are often enough to explain unusually high or low scoring seasons.  In fact Corsi is found to be a better, more sustainable number from year to year than points or shooting percentage.  It is more consistent from year to year and hence a better measure of the individual contribution of a player.  Corsi is an underlying number that does a good job of showing how well a player is playing and it is a persistent measure of his talent.

Johnson’s goals analysis is equivalent to replacing Corsi with +/- ratings. The problem with +/- ratings is they are strongly dependent on the shooting percentage and save percentage of a team, while a player is on the ice. These numbers are largely random and not under a player’s control. They can be more repeatable if a player consistently plays with the same goalie or linemates, but they are largely a random factor. This idea is quantified as PDO. This is the sum of the shooting percentage and save percentage when a given player is on the ice. The leaguewide average by definition is 1. Players who have a PDO well above 1 have a combination of save percentage and shooting percentage well above average. Usually this means they have had good luck and will not be able to sustain their numbers over the long term. Conversely if a player has a PDO well below 1, it usually means he has had bad luck and things will likely get better soon. The biggest caveat here is that a player on a team with goaltending that is far above or below average is not as likely to regress to 1 in time, as the save percentage will not be at a league average value.
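For readers who want to see the arithmetic, here is a minimal sketch of the PDO calculation as defined above. The function name `on_ice_pdo` and the sample totals are made up for illustration; percentages are expressed as fractions so that the league average comes out at 1.

```python
def on_ice_pdo(goals_for, shots_for, goals_against, shots_against):
    """PDO = on-ice shooting percentage + on-ice save percentage (as fractions)."""
    if shots_for <= 0 or shots_against <= 0:
        raise ValueError("need at least one shot for and one shot against")
    shooting_pct = goals_for / shots_for
    save_pct = 1.0 - goals_against / shots_against
    return shooting_pct + save_pct


# Made-up on-ice totals: 10 goals on 100 shots for, 8 goals on 100 shots against.
print(round(on_ice_pdo(10, 100, 8, 100), 3))

# Across the whole league every goal for is also a goal against, so the same
# totals appear on both sides and the sum is 1 by construction.
print(round(on_ice_pdo(90, 1000, 90, 1000), 3))
```

The second call shows why the leaguewide average is 1 by definition: league shooting percentage is goals over shots, league save percentage is one minus that same ratio.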

This is not to say that there is nothing to be learned looking at shooting percentage and save percentage while a player is on the ice. There are small corrections that can be made. Some players have extremely good or bad finishing ability for example. To fully understand their contributions it is necessary to take it into account. The problem with rejecting Corsi analysis for goal based analysis is that you lose information on the individual level because of the largely random PDO effects.

Johnson argues that he can identify some players who consistently have high or low PDO values. Marian Gaborik is the best example of a high PDO player and Travis Moen is the best example of a low PDO player. Something can be learned by understanding why these players have consistently extreme PDOs. Gaborik has spent his career with good goaltending. He plays with Henrik Lundqvist in New York, who is one of the top goalies in hockey. Previously in Minnesota he played in a top defensive system under Jacques Lemaire where goalies had high save percentages. Gaborik is a player who is a very good finisher and has been the top offensive forward on his line throughout his career, so teams want to have him taking shots. He helps to increase his team’s shooting percentage and has had the luck to play with a good save percentage. Travis Moen is a defensive forward who does not play in offensive situations often. He plays against top opposition and his team usually does not press for high quality shots and accepts low quality shots that reduce their shooting percentage as they are more concerned with defence. The strong opposition tends to reduce his team’s shooting percentage and his line’s lack of offensive play reduces their shooting percentage.

Despite the existence of a few players who have extreme PDOs that persist from year to year, for almost every player their PDO will regress to 1 over time. It is a far better model to try to ignore the usually random effects of team shooting and save percentages on most players than to start with numbers that include these largely random effects.

Johnson argues that you can look at the situation as an apparently linear equation:

goals scored = shots taken * shooting percentage

He then treats them all as independent variables when they are not. If I want to raise my shooting percentage, I can choose not to shoot except in situations where I have a very good chance of scoring. I will reduce my goals scored and shots taken but increase my shooting percentage significantly. Similarly if I shoot every time I am 200 feet from the goal on any angle with any number of players in the way (somebody at the games screams shoot whenever this situation presents itself), I will have a high number of shots taken and a very low shooting percentage and may even have reduced my goals scored since I am no longer patient enough to get high percentage shots. These variables are clearly dependent upon one another and cannot be viewed as independent linear variables. This is not a linear equation. The simplest way to show this point is that any player with zero goals scored will necessarily have a zero shooting percentage. It is impossible for that to not be the case.
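The dependence can be made concrete with a toy model. The sketch below assumes a shooter faces a fixed set of chances, each with its own scoring probability, and only shoots when that probability clears a threshold he chooses. The chance probabilities are invented for illustration and are not measured NHL values, and `shooting_line` is a hypothetical helper.

```python
def shooting_line(chance_probabilities, threshold):
    """Expected shots, goals and shooting percentage when only chances with
    scoring probability >= threshold are taken (hypothetical toy model)."""
    taken = [p for p in chance_probabilities if p >= threshold]
    shots = len(taken)
    expected_goals = sum(taken)
    # A player who takes no shots is reported with a zero shooting percentage.
    shooting_pct = expected_goals / shots if shots else 0.0
    return shots, expected_goals, shooting_pct


# Invented chance qualities: many poor looks, a few good ones.
chances = [0.02] * 10 + [0.10] * 5 + [0.30] * 2

for threshold in (0.0, 0.10, 0.30):
    shots, goals, pct = shooting_line(chances, threshold)
    print(f"threshold={threshold:.2f} shots={shots} goals={goals:.2f} pct={pct:.3f}")
```

As the threshold rises, shots and goals both fall while shooting percentage climbs, so the three quantities cannot be varied independently. This simple model does not capture the further cost of impatience described above, where firing from anywhere also gives up better chances later in the shift.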

The further problem is that we are not discussing the shooting percentage of individual players.  We are discussing the shooting percentage of all players on the ice when a player is on the ice.  A player has little effect on the shooting percentage of others.

In trying to find a quantifiable example of a player who looks good in a goals based analysis but not in a Corsi based analysis, Johnson picks out Brendan Morrison of the Calgary Flames. Johnson calls him a good signing despite the fact Corsi disagrees. Morrison has failed so far this season: he has no points in eight games. He has had injury problems and played very few games, but he is a perfect example to show the difference between these two schemes. Morrison has consistently had a high PDO in the past. That is unlikely to last, and it hasn’t so far this season. As a result his scoring numbers have declined. It is unlikely that when Morrison finally comes back from injury he will continue to have no points, but it is likely he will underproduce his previous numbers as his PDO is likely to drop from its past levels. He is predicted far better by Corsi analysis than by a goals based analysis because of the randomness in the goals analysis captured by his PDO.

Corsi analysis is more powerful than a goals based analysis because it is more repeatable. Shooting percentages can have a measurable effect, and including them gives a better picture of a player’s value than a Corsi based number alone, but it is usually a small correction. Starting from Corsi and adding that correction is far better than neglecting Corsi based analysis in favour of a goals based analysis.

Corsi analysis is a strong measure of puck possession and puck position. With some context based adjustments it is possible to find individual player values. This number is a useful number for rating players, just as goals or points are. Just as with goals or points, nobody in their right mind would claim this number is a “be all and end all” statistic to rank players. Much of the resistance to Corsi analysis comes from the fact that people do not understand it. They often expect it to be a “be all and end all” statistic and criticize it when it fails, as it is expected to. It gives useful information about how players are playing. With this information we can better assess players than without it.


## Comments

Thanks for trying, but my eyes glazed over about half way through. I’m not big on statistical analysis of human beings. I just watch the game and usually get a pretty good feel for who’s doing well and who’s not.

Posted by MsRedWinger from Flori-duh on 11/17/11 at 08:22 PM ET

First you suggest Corsi is a better stat but that you need to adjust it based on the team he plays for and the situations he plays in.

With an even strength Corsi, I find the most useful corrections (in that they are the biggest) are the team the player plays on and the situation in which the player plays (as measured by zone starts - does he start more shifts in the offensive or defensive zones).

And then you criticize PDO because it is heavily influenced by the team, and in particular the goalie on the team, he plays for.

Johnson argues that he can identify some players who consistently have high or low PDO values. Marian Gaborik is the best example of a high PDO player and Travis Moen is the best example of a low PDO player. Something can be learned by understanding why these players have consistently extreme PDOs. Gaborik has spent his career with good goaltending. He plays with Henrik Lundqvist in New York, who is one of the top goalies in hockey. Previously in Minnesota he played in a top defensive system under Jacques Lemaire where goalies had high save percentages.

But, like you and Corsi, I have never suggested we should use a goal based evaluation without considering teammates and situation. It is misleading to suggest otherwise. Furthermore, the most significant reason why Gaborik has an elevated PDO is not his goalie’s save percentage but his on-ice shooting percentage, which has had nothing to do with his goalie, and it is pretty difficult to conclude that Gaborik has benefited by playing with other elite level offensive players. Changing the debate from shooting percentage (which is what we were debating on my website) to PDO seems like a way of disguising the real issue.

He then treats them all as independent variables when they are not. If I want to raise my shooting percentage, I can choose not to shoot except in situations where I have a very good chance of scoring. I will reduce my goals scored and shots taken but increase my shooting percentage significantly.

What you are saying here is that my claim that we must take into account shooting percentage should be dismissed because players can in fact control shooting percentage.  Players can’t control shooting percentage because they can??

This is worse than Gabe’s argument that we can’t include sub-replacement level players when calculating league-wide average shooting percentage because these sub-replacement level players will unrealistically pull down the league wide average while on the other hand insisting shooting percentage is mostly just dumb luck.  If shooting percentage was mostly just dumb luck, why the worry of sub-replacement players pulling down the average?

Either the players can and do control shooting percentage, in which case we should take it into account, or the players don’t.  You can’t play both sides.

Posted by HockeyAnalysis on 11/17/11 at 09:26 PM ET

Please reconcile the following 2 statements you made in the above post:

Statement 1:
“Johnson’s goals analysis is equivalent to replacing Corsi with +/- ratings. The problem with +/- ratings is they are strongly dependent on the shooting percentage and save percentage of a team, while a player is on the ice. These numbers are largely random and not under a player’s control.”

Statement 2:
“If I want to raise my shooting percentage, I can choose not to shoot except in situations where I have a very good chance of scoring. I will reduce my goals scored and shots taken but increase my shooting percentage significantly. Similarly if I shoot every time I am 200 feet from the goal on any angle with any number of players in the way (somebody at the games screams shoot whenever this situation presents itself), I will have a high number of shots taken and a very low shooting percentage and may even have reduced my goals scored since I am no longer patient enough to get high percentage shots.”

Posted by HockeyAnalysis on 11/17/11 at 10:27 PM ET

David

There is no discrepancy between those two statements whatsoever.

You are aware that hockey players are not actively attempting to maximize or minimize their shooting percentage when they are playing .... right?

Posted by PuckStopsHere on 11/17/11 at 10:40 PM ET

So if they are not actively attempting to maximize or minimize their shooting percentage then why can’t I, at least in this context, consider shooting percentage and shots taken to be independent variables? I understand that you *could* control them but if players are *not* controlling them it seems reasonable to consider them independent.

In effect you are dismissing my theory because of a hypothetical situation that would make shooting percentage and shots dependent variables and then telling me that that hypothetical situation doesn’t happen in the NHL.

Posted by HockeyAnalysis on 11/17/11 at 11:55 PM ET

I am telling you there is a relationship between shots, shooting percentage and goals.  It is a complex relationship.  We do not clearly know what the relationship is.

In the end that shouldn’t matter because shots taken when a player is on the ice better shows a true talent of players than does the shooting percentage of all the players on the ice when a player is on the ice.  Shooting percentages for individual players are highly volatile and it is worse for shooting percentages of other players who happen to be on the ice with a given player.

Consider this thought experiment. Marian Gaborik has a repeatedly high PDO (the high shooting percentage of him and his teammates is a part of this). Travis Moen has a repeatedly low PDO (the low shooting percentage of him and his teammates is a part of this). What would happen if Moen and Gaborik were played on the same line? Obviously at least one of them could not keep his PDO trend. It is a consequence of how the player is used, yet you want to claim it as a true talent of the player.

Posted by PuckStopsHere on 11/18/11 at 12:25 AM ET

I have stated several times that style of play is a significant factor in shooting and save percentages.  Gaborik is on the ice to score goals as are the line mates he is on the ice with.  He generates a high on-ice shooting percentage, in part because of his talent, and in part because of the style of game he is asked to play.

Moen is on the ice to prevent goals. He isn’t asked to drive the opposition net or go deep into the offensive zone or take risks offensively that might put himself in a weak defensive position. As a result of the style of play and his talents (he probably isn’t as offensively gifted as Gaborik) he has a far worse on-ice shooting percentage.

The point is, it isn’t *why* they have different on-ice shooting percentages that matters here, it is that they *have* different on-ice shooting percentages. The typical shot taken when Moen is on the ice is far less difficult for a goalie to stop than a typical shot taken when Gaborik is on the ice. The why is irrelevant when we are trying to answer the question of whether we should take into account shooting percentage. If you don’t consider shooting percentage you are treating a Moen shot the same as a Gaborik shot and thus you will overvalue Moen’s offensive ability and undervalue Gaborik’s with a shot or Corsi based analysis.

The fact that some players have their roles change season to season, or when one coach is fired and another is hired, or as they improve as a young player or decline as an older player is part of the reason why it is difficult to identify shooting talent.  But just because shooting percentage is difficult to identify as a talent, doesn’t mean it doesn’t exist.  It does.  Intuitively it makes sense that it does (it would seem strange if all players had identical shooting talent), and statistically the evidence is there if you look hard enough.  Anyone who believes otherwise is oblivious to reality.

As for your thought experiment, it is unlikely that any coach would ever ask Gaborik and Moen to be on the ice together, though if they did, what would happen to their on-ice shooting percentage would depend in part on what role they are asked to play. Is Moen being given more freedom to be creative offensively when playing with Gaborik or is Gaborik being asked to utilize his skills in a more defensive role? From there we need to ask whether Moen’s skills will translate well to creating offense or whether Gaborik’s skills can translate well to being a responsible defensive player.

Posted by HockeyAnalysis on 11/18/11 at 12:55 AM ET

Corsi limited to score tied is around r^2=.55 with wins, I think.

Posted by Ralph on 11/19/11 at 02:13 AM ET

Team level Corsi and player level Corsi have to be treated differently.  There is less spread in shooting percentage across teams than there is between players.

Posted by HockeyAnalysis on 11/19/11 at 02:51 AM ET

There is less spread in shooting percentage across teams than there is between players.

Exactly David.  This is why you should be skeptical of players who have very good or very bad PDOs (the players your system selects that Corsi doesn’t).  The shooting percentage of all the players on the ice when that player is on the ice is an outlier and as you say there is little spread in these numbers (usually).  One player is hardly responsible for the shooting percentage of all the other players on the ice when they are.  That is why Corsi analysis is better than your goal based analysis.

Posted by PuckStopsHere on 11/19/11 at 03:03 AM ET

No, the lack of spread at the team level is an indication of two things:

1.  Teams can’t afford to have a lot of high shooting percentage players on their roster.  They would love to, but they can’t because high shooting percentage players are high priced players and teams have budgets and salary caps to adhere to.

2.  Teams build their rosters in similar ways generally. A line or two of offensive oriented players and a line or two of defensive oriented players, so at the team level several styles of play are covered by the 20 players on their roster. Individual players are often asked to focus on just one style of play - Gaborik offense, Moen defense. To have a similar spread at the team level you’d need to have one team that plays a pure offensive game lines 1 thru 4 and another team that plays a pure defensive game lines 1 thru 4. Those teams generally do not exist.

Posted by HockeyAnalysis on 11/19/11 at 03:16 AM ET

You are aware that what is important here is the shooting percentage of all players on the ice when a certain player is on the ice, which is much closer to a team level shooting percentage than an individual one.

Essentially you are making points that hurt your case - however I often think that you know that and won’t back down out of stubbornness.

Posted by PuckStopsHere on 11/19/11 at 03:20 AM ET

You are aware that what is important here is the shooting percentage of all players on the ice when a certain player is on the ice, which is much closer to a team level shooting percentage than an individual one.

You are making stuff up.  Please provide evidence.

Posted by HockeyAnalysis on 11/19/11 at 11:54 AM ET

David you are not really this silly. What matters in Corsi analysis is all shots taken when a player is on the ice. Hence to go with your goal based analysis, what matters is the shooting percentage of all the players on the ice, not individual ones.

Posted by PuckStopsHere on 11/19/11 at 04:15 PM ET

Yes, but Gaborik doesn’t play with every teammate an even amount of time. His play is not evenly distributed so his on-ice shooting percentage is vastly higher than the team’s overall shooting percentage. Similarly Moen’s on-ice shooting percentage would be vastly lower than the team’s overall shooting percentage.

So, when comparing the Rangers to the Canadiens as a team they may have fairly similar team shooting percentages.  But individual players can have vastly different on-ice shooting percentages.

For example, here are Gaborik’s and Drury’s 5v5 on-ice shooting percentages over the past 4 years.

Gaborik:  12.7, 13.16, 10.73, 9.55
Drury:  7.30, 6.68, 6.16, 6.25

Drury’s best season is 2.25 percentage points below Gaborik’s worst and in the two years they played together on the same team the difference between their on-ice shooting percentages was 4.57 and 3.3 percentage points. Now, I challenge you to find two teams that have anywhere close to that difference.

Why does that shooting percentage matter? Well, over the past 2 seasons when Gaborik was on the ice the Rangers generated 13.8 Fenwick events for per 20 minutes, while they generated 12.8 Fenwick events for per 20 minutes while Drury was on the ice. That gives Gaborik an 8% edge on Fenwick events for.

But, while Gaborik was on the ice the Rangers generated 1.031 goals per 20 minutes and just .575 goals per 20 minutes while Drury was on the ice, giving Gaborik a 79% edge in goals scored for.

Posted by HockeyAnalysis on 11/19/11 at 05:38 PM ET

Gaborik and Moen are the extreme examples you have to keep returning to in order to attempt to make a point.  They are some of a handful of players who have sustained PDOs that are significantly different from 1.  The problem is you are stuck on the extreme cases.  They can be accounted for because of the circumstances in which they play.

I believe the numbers you are quoting for Gaborik and Drury are disingenuous. I think they are individual shooting percentages and not those of everyone on the ice. At any rate you are likely asking me to explain a result which is largely a fluke. These numbers are always largely driven by random fluctuations.

The best method to judge players is a Corsi based method because in almost all cases PDO effects cannot be sustained. You implicitly assume that they can and that gets you some really bad calls like Brendan Morrison. The few cases where there is a meaningful reason PDO effects can be sustained come out largely when looking at the context (usage) of the player in question.

Posted by PuckStopsHere on 11/19/11 at 06:23 PM ET

Gaborik and Moen are the extreme examples you have to keep returning to in order to attempt to make a point.

Fine.  Use Crosby and Pahlsson.  Or either Sedin and Hanzal.  Or Ribeiro and Gregory Campbell.  Or St. Louis and Kopecky.  Or Tanguay and Gomez.  Or Stastny and Grier.  Or Semin and Dvorak.  Or even Cory Stillman and John Madden.  Not all of these are quite as extreme as Gaborik and Moen, but there are still significant differences and there are a lot more where they came from.

I believe the numbers you are quoting for Gaborik and Drury are disingenuous. I think they are individual shooting percentages and not those of everyone on the ice.

They are not individual, they are of all the players on the ice with them. I am not making stuff up. Pretty much all of the stats on stats.hockeyanalysis.com are on-ice stats, not individual stats, shooting percentages included.

At any rate you are likely asking me to explain a result which is largely a fluke.

All I am asking is that you take a look at reality with an open mind.  Ask yourself why all the top on-ice shooting percentage players are players we consider good offensive players and why almost all of the lowest on-ice shooting percentage players are defense first third liners?  Ask yourself, is that a fluke?  Is that random or is there some kind of order to that list?

If after that you honestly believe that all the players we consider offensive stars are at the top of the shooting percentage list because of luck and that all of the players we consider defensive specialists are at the bottom of the shooting percentage list because of luck, I give up.  I can’t help you.

The best method to judge players is a Corsi based method because in almost all cases PDO effects cannot be sustained.

Bogus.  I can give you dozens of forwards where unusually high or unusually low shooting percentage is sustained over the past 4 seasons starting with most of the guys I listed above.

Posted by HockeyAnalysis on 11/19/11 at 06:49 PM ET

David

You are stuck looking at an effect that is dwarfed by the year to year randomness in the PDO numbers.  That is why it is a useless place to start your analysis.

Ask yourself why all the top on-ice shooting percentage players are players we consider good offensive players and why almost all of the lowest on-ice shooting percentage players are defense first third liners?

Because offensive players tend to get into more good scoring chances and defensive players do not try to do that as often.  It is easily explained. It has been explained repeatedly in THIS THREAD.  The randomness in the numbers makes them unusable unless you have years of data where one can assume the player has played at the same level all of that time and will continue to into the future.  Shots are a far more repeatable number on a shorter term basis.  This is a comparison of apples and oranges because you are quoting individual shooting percentages and not those of all players on the ice when a given player is on the ice like is being discussed here.  Further you include power play and penalty kill time.  It’s a bait and switch game you are playing.

Posted by PuckStopsHere on 11/19/11 at 07:02 PM ET

The randomness in the numbers makes them unusable unless you have years of data

Ok, now we are getting somewhere.  So to summarize, what you are saying is:

1.  Good offensive players do in fact have better on-ice shooting percentages.
2.  But the randomness of it over small sample sizes renders shooting percentage (or goals) meaningless for player evaluation over those small sample sizes.

If you agree to those two statements, the only thing we need to be discussing is the following:

1.  What constitutes a small sample size?
2.  If we just consider Corsi, which we agree is only a part of a player’s value, how much of a player’s total value can we tell from a Corsi analysis over a small sample size?

My answer to those two questions is <1 year and probably somewhere between 40% and 45%, 50% max, if the Corsi analysis is perfect.

Just to be clear, it is my opinion that any analysis using 1 year of data or less is useless (at least using current techniques). Goal based analysis is useless because of sample size issues, and Corsi based analysis is useless because you can’t get a fair evaluation of a player by ignoring shooting percentage, which is a major part of a player’s skill.

When we start analyzing more than a single season of data a goal based analysis will provide a better evaluation of a players true talent level.  Three or four years is best.  There is no benefit to using multiple years in a corsi based analysis, it was bad at a single season or less and it is still bad at 3 or 4 seasons.

If I were a GM of a team I would never dole out a big contract to someone with less than a 2 or 3 year track record.  It is a far to risky of a bet and when you lose that bet you get stuck with someone like Ville Leino under performing on your roster for the next six years.

Shots are a far more repeatable number on a shorter term basis.

So probably is the number of curse words a player uses during the course of the game, but if it isn’t very reflective of a player’s ability to score goals, we can’t really use it as an evaluation tool.  Shots are a far better evaluation tool than curse words (I assume, I have no data to back that up), but it is still a limited evaluation tool to the point where I wouldn’t use it to make a decision of any importance, or even to claim that Player A is better than Player B.

This is a comparison of apples and oranges because you are quoting individual shooting percentages and not those of all players on the ice when a given player is on the ice like is being discussed here.

I am not quoting individual shooting percentages.  Whenever I referenced shooting percentage in these comments I was referring to on-ice shooting percentage, meaning the team’s shooting percentage when that player is on the ice.  Did you not read my last comment?

Further you include power play and penalty kill time.

No, I do not.  I only consider 5v5 ice time where each team has 5 skaters and 1 goalie on the ice.  Stop making stuff up and start reading what I am writing.

Posted by HockeyAnalysis on 11/19/11 at 07:38 PM ET
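To put rough numbers on the sample-size question raised above, here is a minimal sketch that treats every on-ice shot as an independent trial (a binomial model; that is an assumption made for illustration, not something either commenter stated). The 8% true shooting percentage and the shot counts are hypothetical.

```python
import math


def shooting_pct_std_error(true_pct: float, shots: int) -> float:
    """Standard error (in percentage points) of an observed shooting
    percentage, assuming each shot is an independent binomial trial."""
    p = true_pct / 100.0
    return 100.0 * math.sqrt(p * (1.0 - p) / shots)


# Hypothetical shot counts, chosen only to show how the noise shrinks.
for shots in (100, 500, 2000):
    se = shooting_pct_std_error(8.0, shots)
    print(f"{shots:5d} shots: 8.0% +/- {se:.2f} points (1 standard error)")
```

Under this model the noise shrinks with the square root of the number of shots, so four times as many shots are needed to halve it.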

1.  Good offensive players do in fact have better on-ice shooting percentages.

By a small amount that is lost in randomness when looked at on an individual level on an individual season.

2.  But the randomness of it over small sample sizes renders shooting percentage (or goals) meaningless for player evaluation over those small sample sizes.

Yes.

1.  What constitutes a small sample size?

Based on your work, it appears to be any period less than about four complete seasons for a player in order to use your method.  But to be fair, this answer cannot be clearly given.  It depends on circumstances.  A small sample size for goals based analysis is much bigger than a small sample size for Corsi analysis.

2.  If we just consider corsi, which we agree is only a part of a player’s value, how much of a player’s total value can we tell from a corsi analysis over a small sample size?

It depends what we consider a small sample size.  It is clear that one season is a significant sample size for a player when it comes to Corsi and it is NOT long enough to tell anything meaningful with your method.  Four years wasn’t enough data for you to get Brendan Morrison right.  One season was more than enough for Corsi to do it well.

Posted by PuckStopsHere on 11/19/11 at 10:27 PM ET

By a small amount that is lost in randomness when looked at on an individual level on an individual season.

It is not a small amount!!  See my Gaborik vs Drury example above.  Do you not read what I write?  It turns an 8% advantage in fenwick for into a 79% advantage in goals for.  Over 2 seasons so the sample size isn’t even that small and isn’t that affected by randomness.

Based on your work, it appears to be any period less than about four complete seasons for a player in order to use your method.  But to be fair, this answer cannot be clearly given.  It depends on circumstances.  A small sample size for goals based analysis is much bigger than a small sample size for Corsi analysis.

Based on my work, with anything less than one season, corsi is a better predictor of future goal scoring rates (but still not a good predictor).  At one season, corsi and goals are similarly good (or in actuality similarly bad) predictors of future goal scoring rates.  Anything beyond one season and the edge goes to goals.  The more data you have, the larger the advantage for goals as the predictor.

It is clear that one season is a significant sample size for a player when it comes to Corsi

Yes, unfortunately corsi is a poor predictor of goal scoring rates.

it is NOT long enough to tell anything meaningful with your method.

It’s not long enough, but one year makes it an equally bad predictor as corsi.

Four years wasn’t enough data for you to get Brendan Morrison right.

Love how on one hand you suggest that my theory is not good because even one or two full seasons of data is a far too small sample size for goals and yet on the other hand you are so eager to use Morrison’s 8 games as a counter example in a feeble attempt to prove my theory wrong.

Posted by HockeyAnalysis on 11/19/11 at 11:04 PM ET
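The Gaborik vs Drury figures quoted in this exchange (an 8% edge in fenwick for becoming a 79% edge in goals for) can be turned into an implied shooting-percentage edge. The sketch below assumes goals for equal shot attempts for multiplied by on-ice shooting percentage, so the relative edges multiply; the function name is hypothetical and only the two percentages come from the thread.

```python
def implied_shooting_pct_edge(shot_edge: float, goal_edge: float) -> float:
    """Relative shooting-percentage edge implied by relative edges in
    shot attempts and in goals.

    Assumes goals = shot attempts * shooting percentage, which gives
    (1 + goal_edge) = (1 + shot_edge) * (1 + shooting_pct_edge).
    """
    return (1.0 + goal_edge) / (1.0 + shot_edge) - 1.0


# 8% fenwick-for edge and 79% goals-for edge, as quoted in the comments.
edge = implied_shooting_pct_edge(shot_edge=0.08, goal_edge=0.79)
print(f"Implied on-ice shooting percentage edge: {edge:.1%}")
```

Under that assumption, most of the goal difference in this example comes from the shooting-percentage term and not from the shot-attempt term. Whether that gap reflects talent, usage, or luck is exactly what the two commenters dispute.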

David

Cut the crap.  Instead of giving meaningless facts that do not address anything I have said, let’s look at the results you produce.  As near as I can tell HART+ is your ultimate statistic in your system.  The top 10 players by this stat - Nikita Filatov, Matt Lashoff, Derek Smith, Paul Bissonnette, Scott Parse, Eric Godard, Kyle Wellwood, Ben Smith, Francis Lessard and David Perron.  That is as random a group of players as ever.  That is stupid.

Perhaps you will argue sample size is an issue.  So let’s restrict ourselves to 400 minutes or more.  Now our top 10 is Alexander Semin, Jonathan Toews, David Backes, Justin Abdelkader, Logan Couture, Brian Campbell, Brooks Laich, Adam McQuaid, Jeff Carter and Zdeno Chara.  It is still a rather random looking group.

It fails.

Posted by PuckStopsHere on 11/20/11 at 01:36 AM ET

So now you have given up on your anti-shooting percentage tirade and now you are on an anti-HART+ tirade?  I can only assume that since you have no defense you have accepted that shooting percentage matters but are too gutless to admit you were wrong.

As for HART+, 400 minutes with one season of data is still not enough of a sample size to draw any conclusions.  I think I have only mentioned that a dozen times in this thread.

Get back to me when you are done playing games and have an honest interest in actually understanding the stats.

Posted by HockeyAnalysis on 11/20/11 at 02:13 AM ET

David

It’s the same thing.  You get the math right but the understanding behind what you are doing wrong.

Shooting percentage matters.  Now that I have said that, you will hold it up as some kind of proof that you are correct and again ignore the message.  Shooting percentage when a player is on the ice is largely random.  It makes sense that it should be.  How can any player control the shooting percentage of four teammates and five opponents at a high level?  It isn’t sensible.  The numbers clearly show it to not be very repeatable.  That doesn’t mean there is nothing at all repeatable.  The problem is that in most cases what is repeatable is not nearly as big as that which isn’t and that which is repeatable is often dominated by the goaltending on their team and if the player is played in a defensive or offensive role and not things that are based on the talent of the player.  Even with that said, some talent exists (you can be dishonest and take that to mean I said you are correct) - it is just hidden behind far too many effects to be meaningful.

When you look at things that are often dominated by random effects, you get garbage in garbage out.  That is why your rankings like your HART+ fail so badly.

Posted by PuckStopsHere on 11/20/11 at 02:29 AM ET

Shooting percentage when a player is on the ice is largely random.

But it isn’t random.  See Gaborik vs Drury example above.  What don’t you understand about this?

If it is random, why are the shooting percentage leaders all offensive players and the trailers mostly defensive minded third liners?

It makes sense that it should be. How can any player control the shooting percentage of four teammates and five opponents at a high level?

For the skilled shooter it is obvious.  His good shooting elevates his on-ice shooting percentage.

A skilled passer can improve the shooting percentages with quality cross ice passes that get the goalie out of position.

The guy who can force turnovers could end up helping his team get more quality 2 on 1 chances.

Now combine all three on one line and ask them to generate offense.

The numbers clearly show it to not be very repeatable.

You keep talking about “The numbers” but you haven’t provided one single number in either your post or any of your comments.  When I provide numbers you ignore them, or worse accuse me of making them up or using the incorrect numbers and being “disingenuous” about them.  Please provide some numbers that will clearly show anything you are suggesting.

The problem is that in most cases what is repeatable is not nearly as big as that which isn’t and that which is repeatable is often dominated by the goaltending on their team and if the player is played in a defensive or offensive role and not things that are based on the talent of the player.

I have deliberately attempted to keep you focused on shooting percentage to factor out goalies.  Goalies don’t influence their team’s shooting percentage.  Additionally, I have shown that players can influence offense more than defense so the effects are more dramatic when looking at offense.  Unfortunately you keep having this urge to look at PDO, I can only presume to act as a diversion.

Even with that said, some talent exists (you can be dishonest and take that to mean I said you are correct) - it is just hidden behind far too many effects to be meaningful.

Gaborik vs Drury.  8% edge becoming a 79% edge.  Tell me that isn’t meaningful.

That is why your rankings like your HART+ fail so badly.

Yet another diversion.

Posted by HockeyAnalysis on 11/20/11 at 03:40 AM ET

But it isn’t random.

It’s not entirely random.  I agree to that.  Random effects are more meaningful than those that are talent based.  That doesn’t mean that shooting and passing skills do not matter.  Random effects dwarf them.

I have deliberately attempted to keep you focused on shooting percentage to factor out goalies.

You cannot do this.  When you look at goals based analysis to try to do a differential calculation (goals for less goals against) you include save percentages of goalies.  It is unavoidable.  You pretend you can make it go away.  That is dishonesty.  If you are looking only at offensive production and ignoring defence then you are doing something less valuable and that is one reason your method fails.

Posted by PuckStopsHere on 11/20/11 at 03:49 AM ET

Random effects dwarf them.

Prove it.  Provide me these numbers you speak so fondly of.  Explain Gaborik vs Drury.

When you look at goals based analysis to try to do a differential calculation (goals for less goals against) you include save percentages of goalies.

In a complete analysis yes, and you would try to factor out the goalie.  But we are not at that stage yet because you won’t even accept that we have to consider shooting percentage (which is independent of the goalie).  Let’s start with the basics.  And last time I checked, Gaborik and Drury played in front of the same goalies the past 2 seasons so can you address them please.  Is their on-ice shooting percentage difference all luck?

Posted by HockeyAnalysis on 11/20/11 at 03:59 AM ET

Is their on-ice shooting percentage difference all luck?

I have told you about 5 times it isn’t all luck.  But luck (non repeatable stuff) is a big part of things.  Your signal is so lost in non-repeatable stuff you can’t make much of a quantitative argument about it.  And this is the case you cherrypicked because it is the one that best suits your case.  Drury’s shooting percentage in Buffalo was very different from what it was in New York.  That is not a sign his ability changed.  He was merely used in tough defensive situations.  That isn’t a talent of his.

Let’s find a more “average” situation.

Toni Lydman.  The shooting percentages of all his teammates when he was on the ice at 5 on 5: 10.91, 7.10, 7.85, 8.07.  Last year was clearly a fluke.  He was about 30% higher in shooting percentage on shots he didn’t take.  This is typical.  Your system ranks him as a top player last year because of this fluke.  That is a much more typical case than the one you are using.  And we are still dishonestly removing the save percentage of his team, which also shows big variations from year to year.

You want a system that best captures the talent of all players and not your cherrypicked example.  There are a few outliers but for the most part any player with a high PDO one year regresses to the mean the next year and those with low ones do the same.  There are more cases of this than the few cases you cherrypick.  Many of those cases fall apart like Brendan Morrison because it was a fluke sustained a few years.

The special circumstances which are more strongly goaltending and offensive/defensive usage can be factored out next to capture most of your “signal” (the Drury/Gaborik one is usage).  After that there is some meaningful ability a player has to get a high shooting percentage or prevent his opponents from doing so, but it is a smaller effect than randomness, usage and goaltending.  It is hard to get at.  It is a smaller correction than the errors in front of it that ruin your calculations that can give you Nikita Filatov as the top player in the league (or Alexander Semin a pick that is equally wrong once the players with limited ice time are removed).

Posted by PuckStopsHere on 11/20/11 at 04:19 AM ET
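For the Toni Lydman figures quoted in the comment above, the size of the outlier season depends on the baseline it is compared against, and the comment does not say which baseline its "about 30%" uses. The sketch below picks one possible baseline, the mean of the other three seasons; that choice is an assumption made here for illustration.

```python
from statistics import mean

# On-ice shooting percentages for the four seasons quoted in the comment.
on_ice_sh_pct = [10.91, 7.10, 7.85, 8.07]

# Treat the highest season as the outlier and average the rest
# (the choice of baseline is an assumption, not the commenter's method).
outlier = max(on_ice_sh_pct)
others = [s for s in on_ice_sh_pct if s != outlier]
baseline = mean(others)
excess = outlier / baseline - 1.0

print(f"baseline {baseline:.2f}%, outlier {outlier:.2f}%, "
      f"{excess:.0%} above baseline")
```

A different baseline, such as a single neighbouring season or a league average, would give a different percentage.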

I have told you about 5 times it isn’t all luck.  But luck (non repeatable stuff) is a big part of things.

Can you please provide me a number because you seem to be hiding behind words.  Please guesstimate for me how much is luck and how much is skill.

Toni Lydman.  The shooting percentages of all his teammates when he was on the ice at 5 on 5: 10.91, 7.10, 7.85, 8.07.  Last year was clearly a fluke.

I don’t like the word “fluke” (I prefer anomaly) but yeah, Lydman’s season last year was abnormal.  I never denied that.  Again, >1 season of data is preferable.  His other three seasons were quite similar and typical of most defensemen.

That is a much more typical case than the one you are using.

Is it?  In what way?  That he had 3 very similar seasons followed by an anomaly?  Or is he a defenseman that generally can’t influence his team’s offense very much (very few can)?

But again, the point isn’t that you can identify a significant percentage of players that are neither exceptionally good nor exceptionally bad at shooting percentage or PDO.  The point is that there is a not insignificant number of players that are, and we want to be able to evaluate the full spectrum of players.  We want an evaluation system that rates Gaborik as the exceptional offensive player he is, not as a slightly better than average one.  Not to mention the fact that the variation between players is greater for their long-term on-ice shooting percentage than it is for their long-term on-ice corsi for rates.  This would indicate that shooting percentage accounts for a greater percentage of the variation in goals for rates than fenwick for rates.

Again, my estimate is shooting percentage accounts for 50-60% of scoring goals and fenwick/corsi accounts for 40-50%.

The special circumstances which are more strongly goaltending and offensive/defensive usage can be factored out next to capture most of your “signal” (the Drury/Gaborik one is usage).

How do you factor these out if you only look at corsi?

(or Alexander Semin a pick that is equally wrong once the players with limited ice time are removed).

Over the past 3 seasons, of all players with >175 games played, Semin ranks 8th in points per game and 3rd in goals per game and is tied for second at +83.  I am not ashamed that he ranks highly in my system.

Posted by HockeyAnalysis on 11/20/11 at 04:58 AM ET

Again, my estimate is shooting percentage accounts for 50-60% of scoring goals and fenwick/corsi accounts for 40-50%.

Your estimate doesn’t even make sense.  Corsi is a differential number.  It is the difference between shots for and shots against.  If a player prevents shots against, he isn’t necessarily trying to score.  It doesn’t make any sense to say that preventing shots in your own zone scores goals.

This is perhaps why you are as wrong as you are.  You are trying to force Corsi to be something that it isn’t and arguing that it doesn’t do it.  Preventing shots in your own zone won’t score any goals.  You are asking me to give numbers that do not make any sense.  Yours do not make sense.

How do you factor these out (usage differences) if you only look at corsi?

I don’t only look at Corsi.  I look at usage too.

As for Alexander Semin being first in your rankings in a year he put up 54 points in 68 games.  You are proud of that?  But to justify your pride you require including past seasons where he had better offensive numbers.  That is sad.

Posted by PuckStopsHere on 11/20/11 at 05:13 AM ET
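Part of the disagreement here is over what Corsi measures. As the article describes it, Corsi counts shots on goal plus missed and blocked shots, and a player's rating is the differential of attempts for and against while he is on the ice. Fenwick, as the term is commonly used, is the same count without blocked shots (that definition is not given in this thread). A minimal sketch with hypothetical counts and hypothetical names:

```python
from dataclasses import dataclass


@dataclass
class ShotAttempts:
    """On-ice shot attempts for one side (hypothetical container)."""
    on_goal: int
    missed: int
    blocked: int

    def corsi_events(self) -> int:
        # All attempts: shots on goal + missed shots + blocked shots.
        return self.on_goal + self.missed + self.blocked

    def fenwick_events(self) -> int:
        # Unblocked attempts only.
        return self.on_goal + self.missed


def corsi_differential(for_: ShotAttempts, against: ShotAttempts) -> int:
    """Attempts for minus attempts against while the player is on the ice."""
    return for_.corsi_events() - against.corsi_events()


# Hypothetical on-ice totals, for illustration only.
attempts_for = ShotAttempts(on_goal=30, missed=12, blocked=14)
attempts_against = ShotAttempts(on_goal=25, missed=10, blocked=11)

print("corsi for:", attempts_for.corsi_events())
print("corsi differential:", corsi_differential(attempts_for, attempts_against))
```

The "for" count on its own speaks to generating offense, while the differential also includes shot prevention, which is the distinction the two commenters are talking past each other on.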

You clearly fail to understand anything I have written or understand anything about my statistical view of hockey and the terminology I use.  I know you have no understanding of my statistical view of hockey because if you did you wouldn’t have written something so nonsensical as your last comment.

All you seem to be able to do is regurgitate the same silly findings as Gabe, and then mix in some word justifications for when, in your mind, his system fails and attempt to shoot down other people’s ideas with meaningless 8 game samples.  When you actually get a clue as to how statistics works and have reviewed and understood the work I have done maybe we can revisit this conversation.  Until then, consider this conversation over.  It’s a pointless waste of time.

Posted by HockeyAnalysis on 11/20/11 at 07:17 AM ET

## About The Puck Stops Here

The Puck Stops Here was founded during the 2004/05 lockout as a place to rant about hockey. The original site contains over 1000 posts, some of which were also published on FoxSports.com.

Who am I? A diehard hockey fan.

Why am I blogging? I want to.

Why are you reading it? ???

Email: y2kfhl@hotmail.com