Saturday, October 26, 2013

Bob Gibson, Headhunter

Over at SB Nation, Grant Brisbee has compiled a list of how people who dislike the Red Sox and the Cardinals (you know who you are, and why you do) can find likable attributes about the two teams. Check it out; it's pretty good.

One of his reasons is the two teams' living legends, Carl Yastrzemski and Bob Gibson. He says,
Have you ever heard, in your life, someone say "Ugh. I can't stand Carl Yastrzemski" or something similar about Gibson? No. Unless Gibson broke your uncle's wrist with an inside fastball, which is possible.
That's the thing about Gibson, of course. Great pitcher, had that amazing 1968 season, fast worker (Vin Scully: "Bob Gibson pitches as though he's double parked"), but if you lean in too much, or hit a home run off him, or take too much time, he'll plant a fastball in your ribs.

At least that's the reputation. Is it true?

Not really, it turns out. I looked at all pitchers who started 400 or more games. (Gibson started 482.) There have been 111 of them since 1901. For each, I calculated the percentage of opposing batters the pitcher hit by dividing hit batters by plate appearances. For a frame of reference, in 2013, pitchers hit 0.83% of the batters that faced them: 1,536 hit batters among 184,872 plate appearances.
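If you want to check that arithmetic yourself, here's a minimal Python sketch of the calculation. The function name hbp_rate is just my label for illustration; the 2013 totals are the ones quoted above.

```python
def hbp_rate(hit_batters, plate_appearances):
    """Share of plate appearances that ended with the batter hit by a pitch."""
    if plate_appearances <= 0:
        raise ValueError("plate_appearances must be positive")
    return hit_batters / plate_appearances


# 2013 major league totals quoted in the post
rate_2013 = hbp_rate(1536, 184872)
print(f"{rate_2013:.2%}")  # prints 0.83%
```

The same function applied to a pitcher's career hit batters and batters faced gives the percentages in the list below.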

So how much above 0.83% was Gibson? He wasn't. Among the 111 pitchers with 400 or more starts, there were nine who hit over 1% of batters during their careers. Gibson's not one of them. Two are knuckleballers (Tim Wakefield and Charlie Hough) who were never sure where the ball was going. Two are hard throwers who last pitched in 2009: Pedro Martinez and Randy Johnson. Getting hit by them had to hurt. There's an old-timer, Eddie Plank, and more contemporary ones, Dave Stieb and Kevin Brown. And there are two of Gibson's peers: Don Drysdale and Jim Bunning. Gibson? He's way down at 36. That's above average for that peer group, but by no means moves him into the headhunter realm. Here's the list, 1-36. It's fun to see all the starters who hit batters more often than the putative headhunter (Greg Maddux? Jamie Moyer? Orel Hershiser?).

Percentage of Batters Hit, Minimum 400 Games Started
 1. Tim Wakefield  1.33%   13. Barry Zito      0.88%   25. Frank Tanana    0.73%
 2. Pedro Martinez 1.24%   14. Walter Johnson  0.88%   26. Dennis Martinez 0.73%
 3. Randy Johnson  1.11%   15. David Cone      0.87%   27. Kevin Appier    0.72%
 4. Don Drysdale   1.09%   16. George Mullin   0.86%   28. Nolan Ryan      0.70%
 5. Charlie Hough  1.08%   17. Jeff Suppan     0.84%   29. Wilbur Cooper   0.70%
 6. Dave Stieb     1.07%   18. Jamie Moyer     0.84%   30. Lee Meadows     0.67%
 7. Eddie Plank    1.07%   19. John Burkett    0.79%   31. Greg Maddux     0.67%
 8. Kevin Brown    1.03%   20. Roger Clemens   0.79%   32. Dwight Gooden   0.67%
 9. Jim Bunning    1.02%   21. Bert Blyleven   0.76%   33. Jim Kaat        0.64%
10. Tim Hudson     0.94%   22. Javier Vazquez  0.74%   34. Earl Whitehill  0.64%
11. Orel Hershiser 0.89%   23. CC Sabathia     0.74%   35. Kevin Millwood  0.64%
12. Kenny Rogers   0.89%   24. Tom Candiotti   0.73%   36. Bob Gibson      0.63%

Wednesday, October 23, 2013

Tim Lincecum, Two Years, $35 Million

Over the past two years, there have been 86 pitchers with 300 or more innings pitched. Tim Lincecum's 4.76 ERA ranks 83rd. His 1.389 WHIP ranks 76th.

There are 53 pitchers with 350 or more innings pitched. Lincecum's ERA ranks 52nd and his WHIP ranks 48th.

There are 41 pitchers with 375 or more innings pitched. His ERA ranks last and his WHIP ranks second-to-last.

I suppose he's durable.

I don't begrudge any ballplayer making a ton of dough. But let's not hear the Giants whine about having to keep their payroll low because of the limited Bay Area market and competition from the A's. 

Tuesday, October 22, 2013

Why On Base Percentage Matters

If You Don't Want to Read the Whole Thing: I assume you know what batting average is. On base percentage, which measures a batter's ability to get on base via hits, walks, and hit by pitches (hits by pitches?) is more closely related to run scoring than batting average. So when you want to learn how good a hitter is, on base percentage does a better job than batting average.

But If You Do: I haven't been posting anything during the postseason. This is the time of year when you have many ways to enjoy baseball, from nationally televised games to wall-to-wall coverage by major media to Twitter to baseball-specific sites that are absolutely on top of their games. Easy to get lost in that shuffle. My only post has been to lament the demise of small-market teams in the postseason, though I'll concede that a World Series matching the teams with the best regular season records in their respective leagues is aesthetically, if not emotionally, pleasing.

I'm going to use this hiatus to start what I expect will be a series of posts going into the basics of baseball analysis. One of the things I've learned in a long career as a financial analyst is that you've got to constantly question your assumptions. The rules that worked a decade ago may not still be applicable. So let's start questioning.

I'm going to start with on base percentage, or OBP. It measures the frequency with which a batter gets on base. The formula is (hits + walks + hit by pitch) / (at bats + walks + hit by pitch + sacrifice flies). You've seen me use it several times in posts. Why is it important? Or, as the casual fan may ask, why should I care about it when I have batting average?

To partially answer the second question, batting average (BA) is just hits divided by at bats. It measures the percentage of time that a player gets a base hit. A good hitter hits .300. Ty Cobb, Babe Ruth, Ted Williams, Rod Carew, Tony Gwynn...all .300 hitters. A .300 batting average is good. A .200 batting average isn't. So what's wrong with that? Why do I need OBP?
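Before answering, here are the two formulas side by side as a short Python sketch. The function names and the sample season line are hypothetical, invented only to show the arithmetic; they don't belong to any real player.

```python
def batting_average(hits, at_bats):
    """Hits divided by at bats."""
    return hits / at_bats


def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sacrifice_flies):
    """(H + BB + HBP) / (AB + BB + HBP + SF), the formula given above."""
    times_on_base = hits + walks + hit_by_pitch
    opportunities = at_bats + walks + hit_by_pitch + sacrifice_flies
    return times_on_base / opportunities


# Hypothetical season line, invented for illustration only
line = dict(hits=150, walks=60, hit_by_pitch=5, at_bats=550, sacrifice_flies=5)
ba = batting_average(line["hits"], line["at_bats"])
obp = on_base_percentage(**line)
print(f"BA {ba:.3f}, OBP {obp:.3f}")  # prints BA 0.273, OBP 0.347
```

Note that walks and hit by pitches appear on both the top and bottom of OBP but nowhere in batting average.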

Well, let's look at batting. The object on offense is to score runs. Something that helps you score runs is good. Something that impedes scoring runs is bad. We can measure how good or how bad something is by using a statistic called correlation. Correlation measures how closely related two sets of numbers are. The higher the correlation, the stronger the relationship. The correlation coefficient is the statistic that correlation yields. For two things that rise together, it runs from 0 (no correlation at all) to 1 (perfect correlation); a negative value means one falls as the other rises. Once you get above 0.7 or so, you're talking about a pretty decent correlation, in general. (I've put a longer discussion of correlation in the tab Statistics at the top of the blog. You can go there to read it if you want. I figure many of you already know about correlation, and many others of you didn't come here for a math lesson.)

In order to measure how much a statistic like BA or OBP contributes to run production, I ran correlations. I used every season in the three-division era that began in 1995. That works out to 564 team seasons: 28 teams for each of 1995-97, and 30 teams for each of the 16 seasons starting in 1998, when Arizona and Tampa Bay joined the league (3 x 28 + 16 x 30 = 564).

The correlation coefficient between runs and batting average is 0.80. That's pretty high. What it means is that batting average explains a lot of how runs are scored (technically, about 65%).

Again, that's pretty high correlation. If we're trying to measure how runs are scored, batting average does a nice job, and that's based on 564 datapoints, so it's not random.

The thing is: on base percentage does better. The correlation between runs and OBP is 0.88. That's a good bit higher than for BA. If batting average provides 65% of the explanation of how runs are scored, on base percentage explains 78%. That's a big difference, big enough to make OBP much more useful than BA. Simply stated, on base percentage contributes more to scoring runs than batting average. 
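For anyone who wants to see the mechanics, here's a minimal Python sketch of a correlation coefficient and the "percent explained" figure, which is the coefficient squared. The function names and the five-team sample are made up for illustration; they are not the 564 team seasons I used. Squaring the rounded coefficients gives about 64% and 77%, a hair under the 65% and 78% quoted above, which presumably reflects rounding of the coefficients (that's an assumption on the sketch's part).

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def share_explained(r):
    """Square of the correlation: the share of variation 'explained'."""
    return r * r


# Made-up team OBPs and run totals, for illustration only (not real seasons)
team_obp = [0.300, 0.310, 0.320, 0.330, 0.340]
team_runs = [610, 650, 700, 720, 790]
r = pearson_r(team_obp, team_runs)
print(f"toy data: r = {r:.2f}, r squared = {share_explained(r):.2f}")

# The rounded coefficients reported in the post
for label, coef in [("BA", 0.80), ("OBP", 0.88)]:
    print(f"{label}: r = {coef:.2f}, r squared = {share_explained(coef):.0%}")
```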

We're talking about teams here, but this relationship applies to individuals as well. Consider, for example, two catchers: the Tigers' Alex Avila and the Angels' Chris Iannetta. They played about the same amount this year (379 plate appearances for Avila, 399 for Iannetta) with similar power (14 doubles, 1 triple, 11 homers for Avila; 15 doubles, no triples, 11 homers for Iannetta). Their batting averages were similar: .227 for Avila, .225 for Iannetta. Identical players? Not at all: Iannetta drew 68 walks and was hit twice, while Avila walked 44 times and was hit once. That difference gives Iannetta a .358 on base percentage, fifth best among the 24 catchers who played in 100 or more games, compared to Avila's .317, which ranks 16th. That makes Iannetta a more valuable offensive performer, given the importance of OBP.

Or compare two second basemen, the Mets' Daniel Murphy and Tampa Bay's Ben Zobrist. In an almost identical number of plate appearances (697 for Murphy, 698 for Zobrist) they displayed similar power (38 doubles, 4 triples, 13 home runs for Murphy; 36, 3, and 12 for Zobrist) and Murphy had a better batting average, .286-.275. So was Murphy better? Nope. Zobrist got on base an additional 45 times that don't show up in batting average compared to Murphy (72 walks and 7 hit by pitch for Zobrist, 32 and 2 for Murphy). That gives Zobrist a much bigger edge in OBP (.354-.319) compared to Murphy's edge in the less important BA.
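The "additional 45 times" is just walks plus hit by pitches for each player, subtracted. As a tiny Python sketch (free_passes is a name I made up for the illustration; the counts are the ones quoted above):

```python
def free_passes(walks, hit_by_pitch):
    """Times on base that batting average ignores: walks plus hit by pitches."""
    return walks + hit_by_pitch


zobrist = free_passes(walks=72, hit_by_pitch=7)  # 79
murphy = free_passes(walks=32, hit_by_pitch=2)   # 34
print(zobrist - murphy)  # prints 45
```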

As you can tell, the biggest difference between BA and OBP is walks. A player who walks a lot boosts his OBP more than one who walks infrequently. You know that saying from the playground when you were a kid, "a walk's as good as a hit"? From the perspective of OBP (and from the perspective of scoring runs), that's exactly right.

What's a good OBP? This year the major league average OBP was .320 compared to .256 for batting average. The top three were Miguel Cabrera (.442), Joey Votto (.435), and Mike Trout (.432). The bottom three were Alcides Escobar (.259), Darwin Barney (.266), and Adeiny Hechavarria (.267). The 75th percentile was Adam Lind's .357, and the 25th percentile was Ryan Doumit's .314. If we're looking for a rule of thumb, like for a .300 hitter, well, there were 24 hitters who batted .300 or better in 2013. Chris Davis was 24th in OBP with .370. So let's say a .370 on base percentage is like a .300 batting average.

When you're watching the Series, you might see that Mike Napoli hit .259 this season compared to his likely first base counterpart, Matt Adams, who batted .284. Here are two other numbers: Napoli had a .360 on base percentage, while Adams' was .335. So you'll know that Napoli was actually the better offensive performer, and that's before you start comparing beards.

Saturday, October 12, 2013

The Small-Market Lament

Fangraphs nails it.

No, St. Louis, you are not a small market. Not with the 11th highest payroll in all of baseball, you're not. 

Wednesday, October 9, 2013

It Was 17 Years Ago Today

There was an umpiring controversy in last night's A's-Tigers game, as Oakland claimed fan interference on a home run hit by Detroit's Victor Martinez. The umpires ruled no interference, which was probably the right call, in contrast to this play that occurred exactly 17 years ago, on October 9, 1996:

Tuesday, October 8, 2013

Reason to be Glad the Dodgers Won

This is really infantile. Dis an icon because he doesn't pick your team to win?

Uh, Really, AJ?

Juan Uribe has had an interesting four years. In 2010, he was the world champion Giants' primary shortstop, playing 148 games and batting .248 (.310 on-base, .440 slugging) with 24 homers and 84 RBI. He then signed a three-year, $21 million contract to be the Dodgers' third baseman. Over the next two years, he played 143 games, batting .199 with a .262 on-base percentage and .289 slugging percentage. He hit 6 homers and drove in 45 runs. His on-base plus slugging (OPS) of .552 was the third-worst in the majors among players with 450 or more plate appearances in 2011-2012:
Player              OPS   G  AB  R   H 2B 3B HR RBI BB  SO GDP SB CS   BA  OBP  SLG
Chone Figgins      .502 147 454 42  84 16  3  3  26 40  90   9 15  7 .185 .249 .253
Paul Janish        .515 169 503 45 103 20  2  0  32 35  76   8  4  2 .205 .262 .252
Juan Uribe         .552 143 432 36  86 21  0  6  45 30  97  18  2  1 .199 .262 .289
Jeff Mathis        .557 164 458 43  89 25  0 11  49 24 143   5  2  2 .194 .236 .321
Wilson Valdez      .564 176 467 54 108 18  4  1  45 26  77  17  6  4 .231 .270 .293
Orlando Cabrera    .573 130 450 39 107 16  0  5  51 17  57  12  8  4 .238 .267 .307
Jason Bartlett     .591 168 637 69 147 27  3  2  44 60 125  19 23 10 .231 .299 .292
Franklin Gutierrez .596 132 472 44 111 23  1  5  36 25  87  11 16  3 .235 .276 .320
Brendan Ryan       .598 264 843 93 187 38  6  6  70 78 185  11 24  8 .222 .296 .302
Brent Morel        .603 161 526 58 121 20  1 10  46 29  96  11  9  5 .230 .274 .329
Provided by Baseball-Reference.com (Play Index). Generated 10/8/2013.

This year, he had an unexpected resurgence, batting .278 and slugging .438 with a career-high .331 on-base percentage. He hit 12 homers, scored 47 runs and drove in 50 over 132 games. He also played solid defense at third. Last night he got his biggest hit of the year, a two-run, eighth-inning home run that sealed a 4-3 series-clinching victory for the Dodgers.

No question, it was a dramatic turnaround for the 34-year-old Uribe. Interviewed after the game, his teammate, catcher AJ Ellis, as heard this morning on SiriusXM, extolled Uribe's hitting and declared him "the best defensive third baseman in the majors."

Those of you who watch games at the corner of West Camden and South Eutaw: I hear ya.

Monday, October 7, 2013

Carlos Beltran and an Old-Timer

The ninth and tenth most prolific home run hitters in postseason baseball history:
                G  PA  AB  R  H 2B 3B HR RBI SB CS  BA   OBP  SLG
Carlos Beltran 37 164 136 41 49 11  0 16  31 11  0 .360 .463 .794
Babe Ruth      41 167 129 37 42  5  2 15  33  4  3 .326 .467 .744

Yes, I know. Ruth put up his numbers in 41 World Series games, while Beltran has never appeared in the Series - his 37 games comprise a wild card game, 16 Division Series games, and 20 League Championship Series games. And, of course, Mets fans will never view him as clutch.

But still. Postseason is still postseason. 

Friday, October 4, 2013

Barmes Benched? Or not.

Per Hardball Talk, the Pirates are benching shortstop Clint Barmes today in favor of better-hitting Jordy Mercer.

Except it might not be a benching. Here are the Pirates' NLDS starters:
                   GB/FB GO/AO
A.J. Burnett        1.35  1.90
Gerrit Cole         1.01  1.60
Francisco Liriano*  1.06  1.52
Charlie Morton      1.78  2.74
League Average      0.86  1.17
Team Total          1.14  1.55
Provided by Baseball-Reference.com. Generated 10/4/2013.

GB/FB is the ratio of ground balls to fly balls. (Baseball-Reference includes line drives as fly balls. Some other sites don't.) GO/AO is the ratio of outs on the ground to outs in the air.

There are two things you should get from this table.

  • First, the Pirates get batters to hit balls on the ground a lot more than the average NL team. This isn't a surprise, as I posted a couple weeks ago.
  • Second, Gerrit Cole, who's pitching today, and Francisco Liriano, who'll start Sunday, are much less dependent on ground balls than A.J. Burnett and Charlie Morton.
So taking the good-field no-hit guy out of the lineup in favor of the guy with more offense makes strategic sense when you're starting pitchers who don't produce as many grounders. I'd assume that's the case here.