I decided to look into Hunter Pence's projections because they seemed so far off when I first saw them.
ogc thoughts
Projections are good when looking at the major leagues as a whole, but you need to know when to bend the rules for individual players. Projection systems are like a casino: they play the averages in order to come out ahead overall. But that same approach sometimes sends an individual projection awry.
Take, for example, Hunter Pence.
Pence 2013
In 2012, he had a down season, producing only 1.7 WAR after producing 5.4 WAR the season before. He had averaged 3.7 WAR per 650 PA up to that season and 3.9 WAR per 650 PA for the three years before it. Because of that down year, ZiPS projected his 30 YO season in 2013 to be 2.1 zWAR with a zwOBA of .316. Had he been a free agent after the 2011 season, by contrast, he would have been assumed to produce 4.9 WAR in 2012 and 4.4 WAR in 2013, per the way free agent signings are valued. In 2013, he produced 5.5 bWAR with a wOBA of .356 (or 5.2 WAR/650PA).
Pence 2014
For 2014, ZiPS projected 2.6 zWAR and .325 zwOBA, with 2012 again damaging his projection, along with his move into his 30s. Instead, he produced 4.7 bWAR with a .341 wOBA (4.3 WAR/650PA).
Pence 2015
For 2015, ZiPS projected 2.5 zWAR and .327 zwOBA. This is where the system helps out: it assumes an average injury rate, dampening the projections to account for seasons like this one, where he was injured. He produced only 1.6 bWAR, but in just 223 PA, which works out to roughly a 4.7 WAR/650 PA pace. And he had a .347 wOBA, in line with the previous two seasons.
Pence 2016
That is why I take Pence's 1.9 zWAR projection at .329 zwOBA with a big pinch of salt. Not because the system is bad: it is doing what it is supposed to do, minimizing the residuals across a large sample of players so that its projections have the smallest overall error, especially for players over 30 YO. But our goal is not to average across many players; it is to see what is probable for one particular player.
Over the past three seasons, Pence has averaged 4.7 WAR/650PA, which is also the pace he produced at last season. He is now 33 YO, so one would expect some sort of decline, but one of the great things about him is that he takes great care of his body, as he knows that his team, and his fans, expect him to be at his best. Even if we take 0.5 WAR off the lowest rate of the past three seasons (4.3), that would leave him at 3.8 WAR, or roughly double his projection (which does take into account some injury, due to 2015, by projecting 485 PA, for a 2.5 WAR/650PA rate). Taking 0.5 off his average would put him at 4.2 WAR. And if he continues to produce like he has the past few years (except for 2012), he would be at 4.7 WAR.
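For anyone who wants to check the arithmetic, here is a minimal sketch of the rate math used above. It is just my own back-of-the-envelope calculation, not how ZiPS works; the function name `war_per_650` and the scenario labels are made up for illustration, and the inputs are the figures quoted in this post.

```python
FULL_SEASON_PA = 650  # the per-650-PA baseline used throughout this post


def war_per_650(war, pa):
    """Prorate a WAR total to a 650 PA pace (hypothetical helper)."""
    return war / pa * FULL_SEASON_PA


# Figures quoted in the post.
pace_2015 = war_per_650(1.6, 223)       # injury-shortened 2015
pace_projected = war_per_650(1.9, 485)  # 2016 ZiPS projection

# WAR/650PA rates for the past three seasons, as quoted above.
rates = {2013: 5.2, 2014: 4.3, 2015: 4.7}
average_rate = round(sum(rates.values()) / len(rates), 1)
low_rate = min(rates.values())

# The three scenarios discussed above, all expressed as 650 PA paces.
scenarios = {
    "low rate minus 0.5": round(low_rate - 0.5, 1),
    "average minus 0.5": round(average_rate - 0.5, 1),
    "no decline": average_rate,
}

print(f"2015 pace: {pace_2015:.1f} WAR/650PA")
print(f"Projected pace: {pace_projected:.1f} WAR/650PA")
for label, war in scenarios.items():
    print(f"{label}: {war:.1f} WAR")
```

Note that all three scenarios are full-season paces, so a season with fewer than 650 PA would produce proportionally less.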
As I understand your comments, the system is supposed to be minimizing total intrasystemic error. This strategy makes the system look good. But, I would think, it almost entails that the system won't serve the purpose that most users have in mind, projecting the performance of individual players. Your discussion of Pence seems to support this objection, that the system does not provide its users with what I should think those users really want. That has in fact been my experience with the projections, and especially so when one recognizes the imponderables--you mention two of these, age and injury--that the system must account for, pretty arbitrarily. Built into the system is the requirement that it put a number on something, such as how well Matt Cain's repaired arm will let him pitch, about which it can have no reasonable idea, a degree of limitation which ought to disqualify anyone from making any projection, not that this logic actually inhibits anyone from expressing her/his groundless pre-ST opinion.
Agreed, projections are not really for fans in general; they're mostly for fantasy players looking for an edge across the whole body of baseball players in their fantasy universe.
Still, I think they can serve some useful purposes. For any individual there are lots of imponderables, but when averaged across, say, a lineup, the errors do generally tend to zero out, so one can get a feel for the strength of the lineup. Or, if you want, get the estimate in the ballpark of what it should be.
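To put a rough number on that intuition, here is a toy simulation. Everything in it is an assumption of mine for illustration: a nine-man lineup, projection errors that are independent, centered on zero, and spread with a standard deviation of 1.5 WAR. Real projection errors need not behave that way (a system that is low on all older players, for example, would not cancel out), so treat it as a sketch of the idea and nothing more.

```python
import random


def simulate(n_players=9, error_sd=1.5, trials=2000, seed=0):
    """Compare the typical miss on one player with the typical
    per-player miss on a whole lineup's total.

    Assumes independent, zero-mean, normally distributed errors;
    every parameter value here is made up for illustration.
    """
    rng = random.Random(seed)
    player_abs = 0.0
    lineup_abs = 0.0
    for _ in range(trials):
        errors = [rng.gauss(0.0, error_sd) for _ in range(n_players)]
        # Average size of the miss on each individual player.
        player_abs += sum(abs(e) for e in errors) / n_players
        # Size of the miss on the lineup total, spread per player.
        lineup_abs += abs(sum(errors)) / n_players
    return player_abs / trials, lineup_abs / trials


per_player_miss, lineup_miss = simulate()
print(f"Typical miss on one player: {per_player_miss:.2f} WAR")
print(f"Typical per-player miss on the lineup: {lineup_miss:.2f} WAR")
```

Under these assumptions, the overs and unders partly cancel, so the lineup-level miss comes out noticeably smaller than the miss on any one player.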
Then you can pivot off of that, as in my example above with Pence, with adjustments for players that you feel are not being properly handled by the system. Or it could serve as a worst-case scenario.
Another way to use projections is to get a range of estimates. Some I know to usually be on the high side: Bill James. And others tend to be on the low side.
But yeah, projections are at their worst when dealing with injured players coming back, like Cain. No system can tell you what he will produce, either last season or this season.
Actually, it is bad. Injuries are a tricky thing, and the flaw of ZiPS, and many other projection systems, is that they average them out. It doesn't work like that, because the timing and nature of the injury matter.
For example, two players who break their wrists can be identical as players before and after the injury. But if one breaks his wrist in the first game of the season and misses 60 games, while the other suffers the same break in the last week of the season, the impact on their projections going forward will be disproportionate and divergent.
And that's where Pence falls. Does anyone, in their right mind, think he's only going to get 485 PAs? That's over 200 fewer than his 2012, 2013 & 2014 PAs and 115 fewer than his career average PAs (600), which includes his rookie AND injury year.
Then the nature of the injury. Is it chronic or one-off? Pence had a broken wrist. That's different than a player who might have chronic shoulder or knee issues. One will be fairly predictable in its recurrence, the other will be a fluke and unpredictable.
Yet the system treats them equally. Even though they're not.
The players one has the most curiosity, or anxiety, about are the players who are least predictable. We don't need to be told that Posey and Bumgarner are likely to do well. But players coming back from injuries, young players with scanty MLB records, players who had breakout years, players switching to a new team and ballpark--Cain and Pence, Duffy and Tomlinson, Crawford, Cueto and Samardzija--for these players, the ones we are edgy about, the system misleads as much as it informs, all with the same black-and-white declarative imperturbability. As you point out, Moses, systems misuse their own stats by being bound to look at results, effects, without being able to look at causes.
The likely performances of Cain and Span, Cueto and Samardzija, Duffy and Tomlinson, Crawford and Pagan--all of them players vital to the Giants' season--are, like the likely performance of Pence, as clear or clearer to a fan who follows the team closely and has access to Fangraphs as they are to a fan who studies ZiPS, Steamer, PECOTA, et al.
Yes, totally agree with you, MosesZD: the system fails with injured players because it is not able to account for the type and severity of the injury.
Further good points, campanari, about the variety of problems that these projection systems have with the players for whom we have the most uncertainty. And I totally agree that the fans who follow the team closely will have an advantage over the generalists.