Wednesday, May 02, 2018

newPQS Doesn't Work Well So I'm Creating ogcPQS

I'm still not happy with the new PQS methodology.  For me, the idea of this methodology is to separate the good from the average and the bad.  I'm going to have to go my own way with this going forward.  I will get to the April edition of PQS later this week, maybe next.

ogc thoughts

The old methodology did this, as shown by the data provided on the difference between old and new:

PQS:  DOM starts had 2.32 ERA (from 2012-2014), DEC had 4.54 ERA, DIS had 11.20 ERA
newPQS:  DOM starts had 1.56 ERA, DEC had 3.70 ERA, and DIS had 7.92 ERA.

These make sense, but there are problems when you look deeper.

newPQS Doesn't Make Sense

So DEC looks decent and that's good but when you look at the underlying stats, newPQS does not make sense:

0 newPQS:  10.01 ERA; 2.23 WHIP; 4.8 K/9; 1.0 K/BB
1 newPQS:    6.75 ERA; 1.86 WHIP; 5.8 K/9; 1.4 K/BB
2 newPQS:    4.56 ERA; 1.45 WHIP; 7.3 K/9; 2.4 K/BB
3 newPQS:    2.89 ERA; 1.16 WHIP; 7.7 K/9; 2.9 K/BB
4 newPQS:    2.00 ERA; 0.91 WHIP; 8.5 K/9; 4.6 K/BB
5 newPQS:    0.95 ERA; 0.71 WHIP; 8.9 K/9; 7.5 K/BB
NL 2012-14:  4.03 ERA; 1.30 WHIP; 7.2 K/9; 2.6 K/BB

How is a 2.89 ERA and 1.16 WHIP just DECENT?  The K/9 rate there is not great, just barely above average, but a 2.89 ERA is very good and a 2.9 K/BB is good, and yet it is not categorized as a good start.
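
To make that comparison concrete, here is a small Python sketch (mine, not part of the newPQS study) that takes the table above and flags every newPQS score that beats the 2012-14 NL average in all four stats.  The names NEW_PQS, NL_AVG and beats_league are just for illustration, and I am reading the "5,8" in the 1 newPQS row as 5.8.

```python
# Per-start averages by newPQS score, copied from the table above:
# score -> (ERA, WHIP, K/9, K/BB)
NEW_PQS = {
    0: (10.01, 2.23, 4.8, 1.0),
    1: (6.75, 1.86, 5.8, 1.4),  # "5,8" in the table read as 5.8
    2: (4.56, 1.45, 7.3, 2.4),
    3: (2.89, 1.16, 7.7, 2.9),
    4: (2.00, 0.91, 8.5, 4.6),
    5: (0.95, 0.71, 8.9, 7.5),
}
NL_AVG = (4.03, 1.30, 7.2, 2.6)  # NL 2012-14 line from the same table


def beats_league(line, league=NL_AVG):
    """True if a stat line is better than league average in all four stats."""
    era, whip, k9, kbb = line
    lg_era, lg_whip, lg_k9, lg_kbb = league
    # Lower is better for ERA and WHIP; higher is better for K/9 and K/BB.
    return era < lg_era and whip < lg_whip and k9 > lg_k9 and kbb > lg_kbb


better_than_average = [score for score, line in NEW_PQS.items() if beats_league(line)]
print(better_than_average)
```

By this simple check, a 3 newPQS start sits on the above-average side of the line along with the 4s and 5s, while a 2 does not.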

newPQS Doesn't Work Well

Looking over the majors pitching stats for 2012-2014, the average ERA was roughly 4.00, WHIP was roughly 1.30, strikeout rate was roughly 7.2 K/9, and K/BB was roughly 2.6, which makes 3 newPQS at 2.89 ERA, 1.16 WHIP, 7.7 K/9 and 2.9 K/BB a pretty good pitcher.  The whole point, I thought, was to find the dominant pitchers, and a 2.89 ERA is pretty dominant, as is a 1.16 WHIP.

On top of that, from 2012-2014, there were only 5 starting pitchers out of 131 qualifiers who had an ERA better than 2.89.  And for the individual seasons:  for 2012, only 9 of 88, for 2013, only 10 of 80, and for 2014, only 19 of 88 had an ERA of 2.89 or better.  A 2.89 ERA is pretty elite, and these pitchers did it over a whole season, not just in one 3 newPQS start.
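
For a sense of scale, those counts can be turned into percentages.  This is nothing more than arithmetic on the numbers quoted in the paragraph above; the variable names are mine.

```python
# (pitchers at 2.89 ERA or better, total qualifiers), as quoted above
counts = {
    "2012": (9, 88),
    "2013": (10, 80),
    "2014": (19, 88),
    "2012-2014": (5, 131),
}

# Share of qualifiers at or under the 2.89 ERA mark, in percent
shares = {label: round(100 * hits / total, 1) for label, (hits, total) in counts.items()}

for label, pct in shares.items():
    print(f"{label}: {pct}% of qualifiers")
```

So in the individual seasons it is somewhere between about one in ten and one in five qualifiers, and under 4% over the full three-year span.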

newPQS works for what it is intended to do, which is to find dominant pitchers who would be good for fantasy league teams:  high strikeout rate and high K/BB ratio (which is linked to low ERA).  For example, I took pitchers in 2017 with at least 9 starts (unfortunately, I was unable to separate out starts from relief appearances, so there is a mix) and split them up into K/BB buckets with roughly the same number of pitchers in each:

GT 3.5:    3.44 ERA; 1.131 WHIP; 9.62 K/9; 4.42 K/BB; 8.00 H/9
3.0-3.49:  3.70 ERA; 1.231 WHIP; 9.32 K/9; 3.12 K/BB; 8.09 H/9
AVG:       4.34 ERA; 1.352 WHIP; 8.4 K/9; 2.49 K/BB; 9.33 H/9
2.5-2.99:  4.41 ERA; 1.358 WHIP; 7.90 K/9; 2.74 K/BB; 9.24 H/9
2.0-2.49:  4.69 ERA; 1.396 WHIP; 7.33 K/9; 2.21 K/BB; 9.24 H/9
LT 2.0:    5.04 ERA; 1.507 WHIP; 6.88 K/9; 1.68 K/BB; 9.46 H/9

As you can see from the above, if we had a pitcher who pitched 3 newPQS in each and every game in the season, and averaged 2.89 ERA, 1.16 WHIP, and 2.9 K/BB, he would rate among the best pitchers in the league, but according to newPQS methodology, he would only be considered decent.
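
For anyone who wants to repeat the K/BB bucketing with their own data, here is a rough Python sketch of how it could be done.  The two sample pitchers are made up, not real 2017 lines.  I am also assuming that a ratio of exactly 3.5 belongs in the top bucket, and that each bucket's line is computed from the bucket's combined totals rather than by averaging each pitcher's rates; either choice could reasonably go the other way.

```python
from collections import defaultdict


def kbb_bucket(k, bb):
    """Return the K/BB bucket label used in the 2017 table above."""
    ratio = k / bb if bb else float("inf")
    if ratio >= 3.5:  # assumption: exactly 3.5 goes in the top bucket
        return "GT 3.5"
    if ratio >= 3.0:
        return "3.0-3.49"
    if ratio >= 2.5:
        return "2.5-2.99"
    if ratio >= 2.0:
        return "2.0-2.49"
    return "LT 2.0"


def bucket_lines(pitchers):
    """Combine season totals per bucket, then compute the rate stats."""
    totals = defaultdict(lambda: {"ip": 0.0, "er": 0, "h": 0, "bb": 0, "k": 0})
    for p in pitchers:
        t = totals[kbb_bucket(p["k"], p["bb"])]
        for key in t:
            t[key] += p[key]

    lines = {}
    for label, t in totals.items():
        lines[label] = {
            "ERA": round(9 * t["er"] / t["ip"], 2),
            "WHIP": round((t["bb"] + t["h"]) / t["ip"], 3),
            "K/9": round(9 * t["k"] / t["ip"], 2),
            "K/BB": round(t["k"] / t["bb"], 2) if t["bb"] else None,
            "H/9": round(9 * t["h"] / t["ip"], 2),
        }
    return lines


# Made-up season lines for illustration only. "ip" must be true decimal
# innings (200 1/3 innings is 200.333, not the box-score style 200.1).
SAMPLE = [
    {"name": "Pitcher A", "ip": 200.0, "er": 60, "h": 160, "bb": 40, "k": 220},
    {"name": "Pitcher B", "ip": 180.0, "er": 90, "h": 190, "bb": 70, "k": 130},
]

for label, line in bucket_lines(SAMPLE).items():
    print(label, line)
```

With real season lines in place of SAMPLE, the output should have the same shape as the table above: one row of ERA, WHIP, K/9, K/BB and H/9 per bucket.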

So I'm at a crossroads.  I'm probably not going to use newPQS during the season, as my goal is to illuminate which pitchers have been dominant.

Below are the oldPQS stats:

0 PQS:    12.08 ERA; 2.59 WHIP; 7.4 K/9; 1.3 K/BB
1 PQS:      7.21 ERA; 1.95 WHIP; 3.8 K/9; 0.9 K/BB
2 PQS:      5.84 ERA; 1.72 WHIP; 5.4 K/9; 1.5 K/BB
3 PQS:      4.07 ERA; 1.41 WHIP; 6.0 K/9; 2.0 K/BB
4 PQS:      2.84 ERA; 1.10 WHIP; 7.4 K/9; 3.3 K/BB
5 PQS:      1.74 ERA; 0.86 WHIP; 9.5 K/9; 5.1 K/BB
NL 2012-14:  4.03 ERA; 1.30 WHIP; 7.2 K/9; 2.6 K/BB (starting pitcher stats)

This distribution does better at what I would like PQS to do, which is to separate the good starts from the mediocre starts and the disaster starts.

[Chart: oldPQS vs. newPQS Distribution 2012-2014; source:  2017 Baseball Forecaster]

The main impetus for newPQS, judging from the study explaining it, seems to be that the study's author (who is not the person who created PQS in the first place) wanted the distribution of pitchers to follow a normal curve from 0 to 5 PQS, which is a good way to separate the good pitchers from the bad for fantasy purposes.

But that is not what I've been wanting for my studies.  I wanted to be able to identify good starts and then, through a compilation of them, see whether the starting pitcher was good or not, in order to better understand the Giants starting rotation and whether it is competitively good.  I don't care that the distribution is now closer to a normal curve, nor do I care that a DOM start means the pitcher has struck out a lot.  All I care about is telling a quality start from a non-quality start, however many there happen to be, damn the distribution.  And he has classified a bunch of good pitchers in the 3-PQS category.

That newPQS is a more normal curve is nice, but again, the purpose there is to find the best fantasy pitchers out there, whereas I want a more nuanced result:  to find out which pitchers are good and which are bad at delivering good results, that is, a good ERA.  newPQS separates out pitchers who are good but not dominant, and I want to see all the good as well as the dominant starters.  That's what 5 PQS is for, anyway.

ogcPQS

I think I have no choice but to branch off now and do my own thing.  Unfortunately, I don't have a huge database of starting pitching game stats to perform analysis on and craft a perfect distribution, so I'll have to work from what was provided in the analysis leading up to newPQS.  That is okay; I was plenty happy with what PQS had previously provided me, so these tweaks should make me happier.  I will go over each point category here.

Here is the data chart (if anyone knows how to copy a portion of the screen, and not the whole screen, please share):