
Saturday, October 18, 2008

The Expected Costs of Obtaining Good Players From the Draft: 2008

I thought it would be interesting to see what I can derive for the expected cost of obtaining a good player in the first round, based on the probabilities that I had derived in my draft study and the bonuses paid by teams in the latest draft.  I've been meaning to write on this in a gigantic post about my draft study, but I had this thought so I thought I would write on it now.

The concept behind this is elementary probability: the cost of one attempt is divided by the probability of success to derive the expected cost of one success.  For example, suppose the odds of a successful outcome are one in three and each attempt costs $3M; then the expected cost of one success is $9M.
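
As a minimal sketch of that arithmetic (the function name `expected_cost` is my own, used only for illustration, and it assumes every attempt costs the same and succeeds independently), in Python:

```python
def expected_cost(cost_per_attempt, p_success):
    """Expected total spend to obtain one success.

    Assumes each attempt costs the same and succeeds independently
    with probability p_success, so the expected number of attempts
    needed is 1 / p_success.
    """
    if not 0 < p_success <= 1:
        raise ValueError("p_success must be in (0, 1]")
    return cost_per_attempt / p_success


# The example from the text: one-in-three odds at $3M per attempt.
print(round(expected_cost(3.0, 1 / 3), 2))  # 9.0
```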

Draft Probabilities

Here are the probabilities of selecting a good player via the draft, based on ranges of picks, for the first round:

  1-   5:  43%
  6-10:  23%
11-20:  16%
21-30:  11%

2008 Bonuses Paid

Here are the average bonuses paid in the 2008 draft, based on ranges of picks:

  1-   5:  $5.50M
  6-10:  $2.25M
11-20:  $1.70M
21-30:  $1.30M

I took out outliers of $3M+ bonuses in both the 11-20 and 21-30 ranges, as they improperly skewed the averages.

Expected Cost of Finding a Good Player via the First Round of 2008 Draft

Following is the expected cost of finding a good player via the 2008 draft in the first round, based on the information above.

  1-   5:  $12.8M
  6-10:    $9.8M
11-20:  $10.6M
21-30:  $11.8M

The average across the four ranges equals $11.25M.
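
For anyone who wants to check the division, here is a rough Python sketch that rebuilds the table from the probabilities and average bonuses listed above. The variable and function names are my own, and it averages the rounded per-range figures, which is how the $11.25M comes out.

```python
# Figures copied from the tables in this post; bonuses are in $M.
PROB_GOOD = {"1-5": 0.43, "6-10": 0.23, "11-20": 0.16, "21-30": 0.11}
AVG_BONUS = {"1-5": 5.50, "6-10": 2.25, "11-20": 1.70, "21-30": 1.30}


def expected_costs(bonuses, probs):
    """Average bonus divided by the chance of a good player,
    per pick range, rounded to the nearest $0.1M."""
    return {rng: round(bonuses[rng] / probs[rng], 1) for rng in bonuses}


costs = expected_costs(AVG_BONUS, PROB_GOOD)
average = sum(costs.values()) / len(costs)

for rng, cost in costs.items():
    print(f"{rng:>5}: ${cost:.1f}M")
print(f"Average: ${average:.2f}M")
```

Averaging the unrounded quotients instead gives nearly the same answer, so the rounding does not change the conclusion.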

Giants Thoughts

What this means is that, on average, a team can expect to spend approximately $11.25M to find a good player via the draft.   It also gives you a back-of-the-napkin estimate of what major league teams think are the odds of a prospect making it to the majors and being good.  Rafael Rodriguez was given $2.5M, so that's about 20% odds of him becoming a good MLB player.  The A's gave Inoa roughly $4M, so they think his odds are around 36%.
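
The back-of-napkin odds are just the bonus divided by the $11.25M average cost of a good player. A small Python sketch (the function name `implied_odds` is my own, and it assumes a team pays in proportion to the odds it sees):

```python
AVG_COST_PER_GOOD_PLAYER = 11.25  # $M, the four-range average from this post


def implied_odds(bonus_millions, cost_per_good=AVG_COST_PER_GOOD_PLAYER):
    """Rough chance of a good player implied by the size of a bonus,
    assuming the team pays in proportion to the odds it sees."""
    return bonus_millions / cost_per_good


for name, bonus in [("Rafael Rodriguez", 2.5), ("Inoa", 4.0)]:
    print(f"{name}: {implied_odds(bonus):.0%}")
```

This gives roughly 22% for the $2.5M bonus and 36% for the $4M bonus.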

It also gives you a way of judging how many good players one can expect to get out of the draft.  For example, the other day, I noted the Giants spent a little over $9M on the draft in 2008.  Thus the Giants should be getting roughly 0.8 of a good player from the 2008 draft.  You add them up over a number of years, using this methodology, and you can get a rough idea of whether the Giants and Sabean are ahead or behind in developing good players from the draft.
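
In code form, the tally would look something like this Python sketch. The names are my own, the 2008 spend is rounded to $9.0M, and it reuses the 2008 cost of $11.25M per good player; other years would need their own spend and their own cost figures.

```python
COST_PER_GOOD_PLAYER = 11.25  # $M, 2008 average from this post


def expected_good_players(draft_spend_millions, cost_per_good=COST_PER_GOOD_PLAYER):
    """Expected number of good players bought by a given draft spend."""
    return draft_spend_millions / cost_per_good


# Only 2008 is given in this post ("a little over $9M", rounded here).
# Other years would be added to this dict as the data is collected.
spend_by_year = {2008: 9.0}

expected_total = sum(expected_good_players(s) for s in spend_by_year.values())
print(f"Expected good players: {expected_total:.1f}")
```

Comparing that running total against the number of good players actually developed over the same years would give the ahead-or-behind reading.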

In addition, it allows you to compare obtaining free agents versus drafting players.   The money spent on the draft is not insignificant.  People like to look only at the cost of the successful players making the majors, but there are all the costs involved with drafting all the other players who never make the majors or reach them only briefly.   In other words, the roughly $11M spent searching for that good player.  

Still, young draftees are much cheaper than a good free agent.  But the argument can be made that sacrificing a draftee in one year is justifiable if you acquire a useful player who can make a contribution now, when you need it, and not later, when you don't know what situation you will be in, competing or re-building, or even whether the draftee, who has a very high failure rate, will ever be a regular in the major leagues.

6 comments:

  1. How are you operationalizing the term "good player"?

    I think it's important to define the term "good player" because as I'm trying to read it, it seems a little ambiguous. I could be wrong though, but what's the definition of good?

  2. Sorry, very good point, thanks for the comment, should have provided a link to my draft study: http://sfgiants.scout.com/2/343576.html

    For my study, I had split players into Stars, Goods, Useful, Marginal, and Others, but for this post, and most times I refer to my study, "good" players refer to "Star" and "Good" players in my study, as I think that is the best way to look at the data, because of the roughness of the data I had on hand.

    Here is what I would encapsulate as "good" players:

    "For Stars, he had to hit over .295 or had an ERA under 3.30 and played at least 6 seasons, unless I recognized him as a star despite poorer stats then I would check his full career stats to see if his other stats look like a star. Yes, subjective but otherwise data collection would be a bear and, I believe, not worth the extra margin of detail. A good player had to hit from .275 to .295 or had an ERA under 4.25 plus over 6 seasons played. Also, players who had under 6 seasons played and had these stats plus who were among the top of the 100 draftees in games played, would count as a good player. To me, a good player not only has good stats but longevity and healthiness as well."

    So hitters above .275 BA and pitchers below 4.25 ERA, with adjustments (admittedly subjective) for the borderline cases and players I thought were good (checked their full batting line; pitching line) but were marked as not, plus had at least 6 years worth of playing time, 150 games per position player-year, 30 starts per starting pitcher-year, and 50 appearances per reliever-year.

  3. Thanks for the link.

    Here's another Q:

    Why use BA to sort your offensive players? Juan Pierre has a BA of over .300 in his career. Is he really a star? I wouldn't think so, at least not in the terms that you're trying to define players. Even if you bump his value up some for his baserunning, he's still been a below average player for most of his career.

    Why not use something simple like OPS? Or, get really fancy, and use a linear weights metric like wOBA (the formula is freely available and relatively easy to calculate).

    ERA could have its problems, too. But I'm not here to nitpick; these are just some ideas. It looks like you wrote this study several years ago. Ever thought about updating it with more appropriate metrics? That would be interesting.

  4. I guess you didn't read the methodology in my article very carefully or at all, I explained the situation there but will repeat here: I had no easy access to advanced data nor a flunky assistant/intern to order around to do this, so I manually copied the only free source of this data that was available at that time, from The Baseball Cube, and despite their advanced stats on each player's stat card, they inexplicably provided only each draftee's games and batting average in their draft section (and also appearances and ERA for pitchers).

    Yes, people like Pierre could have made it through based on those standards, but I know my good players. I don't recall if he was an early-round pick, but I would have changed him to useful as I don't consider him good, whereas Matt Williams didn't make it and I made him a good pick. Anybody who got rated good had to pass my smell test, else I checked their stats. And there were some who didn't rate as good but I thought should have, and I also checked them in detail. Better to do this for maybe 25-50 players than all 1800 picks.

    It's OK to nitpick, they're good nitpicks, very legit nitpicks, and I thank you for mentioning them, but as you noted, and as I did above, it is what it is, the best data I could locate for free on the internet.

    Would love to update, baseball reference has better stats for hitters now but unfortunately their pitching stats don't give me a good view of each pitcher's longevity, they only provide W-L and ERA. Else I would work on updating it now, as I had announced a few posts ago but now have to announce that I won't be updating my study until better stats are available. I want to be able to deliver a better study, as you note, rather than a compromised one again. I've written to them about an idea and they are considering it; they've added other ideas I've suggested so I think it has a decent chance of making it into their next major revision of the draft section.

    Still, despite these limitations, it was still a valuable study because nobody up to then had publicly published any sort of analysis of this type on the draft and the only one since that I've seen is Baseball Prospectus, and frankly, they totally missed the point, which is surprising to me because the author is a doctor who should understand distributions, particularly skewed ones, and magnitude. I've been wanting to post this rebuttal for a long while, because many people have rebutted me with it, so I may as well tackle it now.

    Baseball Prospectus only focused on the fact that the best value in the draft is available early and that the value is best on average early on. I had already shown that with my rough study, plus I made the more important point that it is extremely rare TO FIND GOOD PLAYERS even early on, even with the first picks overall.

    For example, they made the point that the early draft picks had WARP of 30-40, calling that good value, and thus you have to be an idiot (they have been nagging on and dissing Sabean for years now for his Michael Tucker decision) to give up a draft pick in the first round. According to their study, the Tucker pick was worth on average an 8 WARP career.

    Well, let's take a stroll down memory lane of Giants past to see what that means in context. Guess who has a WARP of approximately 30? Michael Tucker. If he is the good pick you can expect to get on average from an early first round pick (where, remember, bonuses of $6M were just paid this season), why not just pay only $1.5M to get the real thing, instead of a lottery ticket that you cash in, albeit cheaply, in 4-6 years?

    And remember, Tucker isn't even the caliber type of player you get with the pick we gave up for him, which was the 29th pick.

    Marquis Grissom, on the other hand, has a WARP of nearly 70 for WARP3, so he would be a superstar based on the average scale of even the early first round picks.

    By the last part of the first round, where the Giants gave up their pick for Tucker, the average WARP jumped around but was about 14 for picks 21-25 and 8 for 26-30.

    Marvin Benard, as lame a career as he had for us, had nearly 19 WARP. Thus the pick we gave up (#29 I think) is about half as good as a Marvin Benard.

    Ramon Martinez, a good utility infielder for us had a 11-13 WARP, so he was also better than the player, on average, we could have drafted with the Michael Tucker pick.

    Neifi Perez has a 31 WARP, so he is what you could expect with an early Top 5 draft pick. He is head and shoulders above what we could have expected to select with the pick we gave up for Michael Tucker.

    Pedro Feliz has a WARP of 21 and is much better than what we could have gotten with the Michael Tucker draft pick, about 2.5 times the WARP.

    Dustan Mohr had a WARP of nearly 12, even he's still better than the average draft pick we gave up for Michael Tucker, and he basically had a 4 year career, and part time at that.

    Jason Ellison has a WARP of 4.0, so he's not quite the draft pick we gave up.

    Deivi Cruz was up there with Tucker, 30 WARP, he is who you can look forward to on average with a top 4 pick.

    So, to say again, the average 5th pick, not sure why it dipped so low, had an 11 WARP, but average of 4th and 6th pick was about 32, so on average the 4-5-6th pick has turned out to be a Deivi Cruz or Michael Tucker. Again, not too exciting.

    Here's a good one, Ruben Rivera had a WARP of 14, so even HE is better than the average 21st-30th pick of the first round that we gave up.

    Here we go, someone closest: Armando Rios had a career 10.2 WARP, a little higher than the average draft pick that we gave up for Michael Tucker, but about what can be expected for a 20-something pick overall. Never played a full season. Played parts of 7 seasons, but only two seasons where he played more than half a season. Only accumulated 1021 ABs, 36 HR, 14 SB, .269/.341/.445/.786 batting line. So would you pay $1M to get someone like that on average or $1.5M to get someone like Tucker who could and did play three quarters of a full season and hit about that much too?

    A better example is Bobby Estalella. His WARP3 was 7.9, basically the same as that of the 29th pick we gave up. He earned most of that in his one good year for us in 2000, when he got 299 AB, about half a season's worth, for a total of 4.8 WARP3. He only got over 100 PA in 4 other seasons, totalled only 904 AB, 1056 PA, had 48 HR and a career batting line of .216/.315/.440/.755. That is basically what we gave up to sign Michael Tucker.

    To my point in my study, is that anything to cry crocodile tears over? Is that worth all the Sturm und Drang that has dominated Giants fans' talk about Sabean for the past 5 seasons since it happened? Is that worth Baseball Prospectus taking every opportunity to put a dig in on Sabean by mentioning this tactical move, that I thought and still think was brilliant?

    That is the essence of Moneyball right there, zigging when others zag, Yinging when everyone is Yanging, thinking differently, finding an angle to maximize the resources that you do have.

    Giving up a pick is not an ideal situation, nor is it something you can do indefinitely. But if you do it selectively, it won't significantly damage your player development talent level plus give you an extra edge that particular season, player budget-wise. It is, as I noted long ago, the same as throwing away a lottery ticket that you would have paid $1M for, but had very low odds of ever collecting on it.

    That's why I have been advocating that Magowan step down and have a richer owner take over, like the Angels have and D-backs have, deep pocket owners who won't blink at spending an extra $1M here and there to get what we need and not have to make such choices. But given Sabean's circumstance, he made the best of the fiscal situation that he was presented with, with some innovative thinking.

    I'll admit I kind of went on and on with the list of Giants, but I'm tired of people not getting my draft study. I'm normally a modest, humble person who doesn't toot my own horn (hurts me in the workplace) but I think I did a damn good study using free data and on my own. Why people cannot see my conclusions, I have no idea, but the only Giants fan who has publicly acknowledged my conclusion as information that is worthy is "Only Baseball Matters". It is not even complicated math and statistics, just stuff anyone would learn in Stat 101.

    I'll admit it is not a perfect study, but given how poorly the odds came out, how bad the distribution and availability of talent is, even if I was off by 100%, it is still pretty horrible odds to find a good player in the back of the first round.

    Which was what I was investigating in my study, if Sabean was as bad as everyone is saying, how bad is he, what is the magnitude of badness? I was with everybody on this and expected to expose how bad Sabean was.

    I was flabbergasted when I discovered how badly the odds of finding good players fall within the first round, let alone within the draft itself. I originally only looked at the first round picks, then expanded it to the first 50 picks, then 75 picks, then finally to 100 picks, having to go back and recollect the data each time, because the odds were just that bad, and I wanted to see how bad it gets, even within the first 3 rounds or so.

    I would like to do further analysis, much like what BP did with HS vs College, hitter vs. pitcher, but I was just focused on answering my one simple question and didn't collect such data, which would be good to have collected. Still, I think my conclusion is more powerful than anything they have published off of their data, as sophisticated and extensive as they were. And once I get my hands on the data, I will try to take that next step.

  5. OGC,

    I did glance over your methodology intro but I had problems with it which I listed above. As you know, good methodology is the foundation of good research.

    I'm a little confused with this:

    "so I manually copied the only free source of this data that was available at that time, from The Baseball Cube, and despite their advanced stats on each player's stat card, they inexplicably provided only each draftee's games and batting average in their draft section (and also appearances and ERA for pitchers)."

    When I open the draft page for a certain year, say 1998, all it does is list the players. To see the individual stats on each player, I have to click their name and go to their player page. Is this how it worked for you? And if so, if you're clicking that far, why not go ahead and do the OPS calculation, which is easy to do with any slash-stat line? Even easier if you create a spreadsheet to do the calcs for you.

    Has the Baseball Cube layout changed since then or am I on the wrong page?

    Just some thoughts. Also, if you're going to create a baseline for your definitions of what is good and what isn't good and then tweak it by subjective measures, which you admit is a little wrong, then things can get flimsy.

    On the WARP issue I need some clarification if you wouldn't mind. WARP totals are cumulative: they add up over the years, and the longer you play the more WARP you'll accumulate. I guess my question is, so what if Tucker had career WARP of 30? He only accrued 5.2 of that with the Giants. Are we giving him credit for the other 25 he accumulated with the teams he played with before the Giants? That's what I need clarification with. Why not judge what he did WITH the Giants and not with the Royals and Braves, because the production he gave to other teams doesn't influence the Giants at all.

    What I'm trying to say is that past performance -- using WARP scores from earlier in the career of players -- doesn't mean that future performance will be on those same levels. Ramon Martinez had a career WARP score of 12.6 but that's not because he gave the Giants 12 wins above the replacement level player -- he actually only gave the G's 7-8 wins -- but because he played for 12 seasons.

    Also, BP is notorious for setting their replacement level super low for WARP, thus artificially inflating scores. Most define replacement level as a AAA player or any freely available talent. BP has their level set much lower, something closer to a AA player.

    I commend you on the study and I think it's got some good roots, but I'm just trying to clarify some things about it. So that I can better understand the points you're trying to make.

    And if you're looking for more advanced stats that are readily available, First Inning and Statcorner both give wOBA scores online for free.

  6. Yeah, they changed their format, also inexplicably, about 3 years ago, foiling any thoughts of updating or extending the study. What you see is the latest format, I think they changed the format and took away some of the data and added new data, then at some point reached the no data state that you see today.

    So all good suggestions, just not available.

    Data is always flimsy compared to the future, making the first studies look dated in the process. Many of the great works in the 50's and 60's were based on a few hundred games that they scorecarded themselves listening to games by the radio.

    The art of data analysis is knowing when you can adjust the data to reach the proper conclusion. Per your example of Juan Pierre, is it better to blindly follow the definition guidance because of the limitation of the data or to remove him when you as an expert know he doesn't fit the definition? Subjective yes, but I think we would both agree that removing him as a "good" defined player improves the results of the study.

    I think I did the best that I could given the limitations of the data available at that time for free. I did what I knew to improve the study's results. If BP would give me their database, I would happily analyze it for free and they can publish it in their free content channel under their byline without my name, I don't really care.

    Yes, I'm aware that WARP is cumulative. All the WARP used by BP in their study is the cumulative first 15 years of each draftee's career, which is why I am comparing their average WARP with each player's career WARP; that gives an equivalent comparison.

    When they say that, say, the 4th to 6th pick is around 30 WARP, that means that the average prospect selected in that range of picks had a 15-year cumulative WARP of 30. Michael Tucker has a cumulative career (shorter than 15 years) WARP of approximately 30. Thus, the talent level of that average pick is approximately the level of Michael Tucker. That has nothing to do with what he accumulated while with the Giants, but with whether on a career basis he matches up with the average pick.

    I'll admit that he's at the end of his career versus the young draft pick at the start of his, but that is balanced by him being available today when you need it versus 5-6 years later, when you have no idea what you need.

    I would love to use any and all advanced stats. I even used the proper OBP times SLG in my spreadsheet analysis for my keeper league draft, plus the stuff I know, K%, BB%, contact rate%, K/BB for pitchers, BB/K for hitters, etc. Just don't have the data to do them on.

    I don't have time to cut and paste each and every draftee who made the majors into a spreadsheet and analyze. For the prior study, I cut and paste each round's list of draftees and meager stats into a spreadsheet, which I was then able to print and analyze when I had pieces of free time, like waiting in line, waiting for my wife to be done shopping, waiting for kids while they rode a kiddie ride, etc.

    I am hoping to copy just each round's stats, limiting it to just one copy per round, which would simplify things for me. If someone wants to collect all the advanced data for me, feel free to do that. :^)

    I will have to check out Statcorner, thanks, I haven't heard of them before. FYI, The Hardball Times notes that SABR has started a new minor league database that people can access right now, and in the future could be available through baseball-reference.
