The Metrics System: FIP
Written by Bill   
Thursday, 21 January 2010 20:52

As promised, time for another in our series of posts where a guy who doesn't really get math (but really loves baseball stats) attempts to explain an advanced statistic for other people who don't really get math (but want to learn something about baseball stats). The first entry, on OPS+, ERA+ and wRC+, is here.

(Note: the following is a collection of oversimplifications and approximations, meant to give the uninitiated a very basic understanding of what this is all about. If you know all this stuff already, you probably know it better than I do, and you're doing me a favor by not really reading this very closely. Thanks very much for coming, but I'm not talking to you people today. Try again tomorrow or Monday.)

FIP stands for Fielding-Independent Pitching and attempts to remove luck (good or bad) and defense (good or bad) from a pitcher's performance. It, like just about every other fun and/or useful stat, is available on FanGraphs. FIP is an estimate of what a pitcher's ERA might have been had those effects not come into play (that is, average luck, average defense).

In short, what makes FIP useful is that it does a better job than ERA does of predicting what a pitcher is likely to do in the future. If pitcher A had a 3.50 ERA and 4.50 FIP in 2009, and pitcher B had a 4.50 ERA but a 3.50 FIP, there's an excellent chance that pitcher B was actually the better pitcher, and some chance that pitcher B will have a better performance than A in 2010 (or at least that it will be much closer than it looks). Another thing is that if a pitcher is performing way above his established level (like Randy Wolf in 2009) or way below it (like Ricky Nolasco in 2009), it's a good idea to check his FIP; odds are good that the "change" is at least partially good or bad luck (or good or bad defense), in which case he's a pretty good bet to snap back to his old self next season.

It's not perfect. It makes an assumption that pitchers can control the number of homers they give up, when there's some evidence that if the ball is hit in the air, pitchers don't have a ton of control over whether it stays in the park (an adjusted version of FIP, xFIP, tries to control for this, but it has its own issues). And it's also a fact that some pitchers just do a better job of preventing runs than FIP thinks they "should." Trevor Hoffman's ERA has been better than his FIP in all but three of his 17 seasons, and is 0.26 better for his career; Mariano Rivera's career ERA is 0.53 better than his FIP. At some point, it stops meaning that the pitcher is lucky and starts meaning that he's just beating the stat. But it's incredibly useful on a season-by-season basis.

So that's what it is, more or less. What follows is a pretty long and meandering discussion about the history and theory of defense-independent pitching statistics.

Just a little over ten years ago, a young statistician with the delightful (if not actually given) name Voros McCracken had some crazy ideas about pitching and posted them on an internet baseball newsgroup (which, for the kids out there, was the kind of thing the internet was before the internet was remotely useful to most people). In 2001, after McCracken posted this article on Baseball Prospectus (and it's wonderful that it's still available and free -- the original article he wrote, even more wonderfully, is still available here), Rob Neyer picked up on it, and then Bill James picked up on it. Here's some of what James wrote about McCracken's findings later that year in his (still very much worth reading) New Historical Baseball Abstract:

McCracken proposes a way of rating pitchers based on, for example, their strikeouts/inning, their walks/inning, etc., but with a twist. McCracken did not use hits/innings [sic] as an element of his system, and this was no mere oversight. McCracken argues that, other than getting strikeouts and allowing home runs, there is little that a pitcher can do to cause his hits allowed to be higher or lower. Therefore, he argues, if the pitcher's hits allowed are higher or lower than we would expect (in view of his strikeouts), this reflects not skill, but pitching in good or bad luck.
...
[W]hile the plusses and minuses are not totally random, they are mostly random. The extent to which any pitcher is reliably over or under his expected hits allowed is just a few hits a year.
This knowledge is very useful.... If a pitcher has allowed 25 or 30 hits fewer than one would expect him to have allowed, then that pitcher has been pitching in extremely good luck, and it is enormously likely -- probably almost 100% certain -- that his value in the following season will decline. Conversely, if a pitcher has allowed a significant number of hits more than one would expect, we can expect his luck to normalize in the following season, and thus there is some chance that that pitcher's record will improve in the following season.

James also said he "fe[lt] stupid for not having realized this 30 years ago."

So the basic idea is this: pitchers can be good or bad at getting strikeouts, good or bad at avoiding walks, and good or bad at avoiding home runs. But nobody (or so the argument goes) is good or bad at what happens once the ball is put in play and stays in the park. Whether the ball falls for a hit or turns into an out is dependent largely on luck (and James doesn't say this, but of course it's also dependent on the strength of the defense behind the pitcher).

This was a really hard idea for almost everybody to accept (including me, back then), and still is. If Pedro Martinez in his prime is throwing 95 MPH fastballs and making the ball dance like crazy, isn't he more likely to get weak pop-ups and grounders than someone who just throws straight fastballs for hitters to square up on?

Well, James' answer was "yes, a little, but only a few hits a season." This is kind of true, and now we know that groundball pitchers yield slightly different hit rates and types than flyball pitchers, and that pitchers don't have as much control over their homers allowed as we might have thought, and so forth (but that's getting ahead of myself). But McCracken's answer was "actually, no." From the BP article linked above:

The pitchers who are the best at preventing hits on balls in play one year are often the worst at it the next. In 1998, Greg Maddux had one of the best rates in baseball, then in 1999 he had one of the worst. In 2000, he had one of the better ones again. In 1999, Pedro Martinez had one of the worst; in 2000, he had the best. This happens a lot.

Man, did that floor me the first time I read it. So anyway, McCracken created a statistic to measure these effects, called DIPS -- Defense-Independent Pitching Stats. Basically, DIPS took the pitcher's actual hits, walks, strikeouts, innings and some other stuff, substituted what we'd expect his hits allowed to be in place of his actual hits allowed, developed an alternate stat line for the pitcher, and used those numbers to figure out what his ERA might have been with average defense and luck (note -- it was much more complicated than that, but the hits were by far the biggest component of the difference between DIPS ERA and actual ERA). McCracken argued that Aaron Sele, who had put up a 4.79 ERA in 205 innings in 1999, was one of the best pitchers in the American League, but suffered from a bad pitchers' park (Texas), terrible defense and bad luck. By DIPS, his ERA "should have been" 3.80. (Interestingly, Sele then went to the Mariners, and in 2001 -- good pitchers' park, good D, big drop in strikeouts -- severely outperformed his own DIPS ERA.)
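To make the hit-substitution idea a little more concrete, here is a rough Python sketch. To be clear, this is not McCracken's actual DIPS procedure (which, as noted, was much more complicated); it only swaps a pitcher's hits on balls in play for what an assumed league-average rate would give him. The function names, the .300 league rate and the stat line are all made up for illustration.

```python
def expected_hits_allowed(balls_in_play, home_runs, league_rate=0.300):
    """Hits we'd 'expect' if balls in play fell in at a league-average rate.

    league_rate is an assumed, illustrative value, not a number from this post.
    """
    if balls_in_play < 0 or home_runs < 0:
        raise ValueError("counts must be non-negative")
    # Home runs are kept as they are (the pitcher is held responsible for them);
    # only the hits on balls in play are replaced by the league-average expectation.
    return home_runs + league_rate * balls_in_play


def hit_luck(actual_hits, balls_in_play, home_runs, league_rate=0.300):
    """Actual hits minus expected hits; positive means more hits than expected."""
    return actual_hits - expected_hits_allowed(balls_in_play, home_runs, league_rate)


# Hypothetical stat line: 215 hits allowed, 20 of them homers, 600 balls in play.
extra_hits = hit_luck(actual_hits=215, balls_in_play=600, home_runs=20)
print(f"{extra_hits:+.0f} hits versus expectation")
```

For that invented pitcher the gap comes out to roughly 15 extra hits, which under the DIPS way of thinking would be chalked up mostly to luck and defense rather than to anything he did wrong.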

Anyway, I did all that mostly because I've always found the story and the development of the whole idea pretty interesting. FIP is essentially a much, much simpler version of DIPS ERA, developed by the intimidatingly brilliant Tom Tango. The formula is

(13*HR + 3*BB - 2*K)/IP + 3.20

Here HR, BB and K are the pitcher's home runs allowed, walks and strikeouts, and IP is his innings pitched. The number added changes a little depending on the league, but it's usually around 3.2; the purpose of that is to make the result look like ERA. And the results tend to be very similar to McCracken's original DIPS ERA: Sele's 3.80 in 1999 becomes 3.85, Dave Burba's 4.13 in 2000 becomes 4.00.
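If it helps to see the arithmetic, here is a minimal Python sketch of the formula above. The function name and the stat line are invented for illustration, and the 3.20 default is just the rough constant quoted here, not the exact figure for any particular league or season.

```python
def fip(home_runs, walks, strikeouts, innings, constant=3.20):
    """Fielding-Independent Pitching: (13*HR + 3*BB - 2*K)/IP + constant.

    innings should be true innings, so 205 and one third is 205 + 1/3,
    not the box-score shorthand 205.1.
    """
    if innings <= 0:
        raise ValueError("innings must be positive")
    return (13 * home_runs + 3 * walks - 2 * strikeouts) / innings + constant


# Hypothetical season: 20 HR, 50 BB, 180 K in 200 innings.
# (13*20 + 3*50 - 2*180) / 200 = 50 / 200 = 0.25, plus 3.20.
print(f"{fip(20, 50, 180, 200):.2f}")  # prints 3.45
```

Notice that hits allowed never show up anywhere in the calculation, which is the whole point: only the homers, walks and strikeouts move the number.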

My sense is that FIP is considered to be just a bit better or more accurate than DIPS ERA. After all, FanGraphs chose FIP, and FanGraphs isn't exactly averse to using extremely complex formulae. But I don't know how or why, and I don't care. My take is that these measures are inexact enough that the little differences don't matter much in light of what the stat is actually useful for. Either one (or any of several other measures) is better at telling you what most pitchers are likely to do in the future than ERA is. Good enough for me.


