Lost in Translation: Why Your Projections and Dollar Values Won't Save You

Published on April 10, 2010

We've been having a robust debate on the Card Runners Fantasy Baseball Blog with the poker and Wall Street guys about who has the edge in our league and why. To sum up: I feel whoever knows the player pool best and has the best instincts, i.e., knows the facts and how to interpret them on the fly, should be the favorite, whether that person is a fantasy baseball pro or a poker player. They seem to think having a good model which can convert stats to dollar values is also relevant. Let me explain why I think they're mistaken.

When you have a hammer, everything looks like a nail

If you're able to build models that do a good job of figuring out exactly what 20 HRs, 80 RBI, 75 runs, 9 steals and a .291 batting average in 550 at-bats is worth in a given environment, you'll be inclined to imagine that you can use that to your advantage in fantasy baseball. I would argue you cannot unless very specific conditions obtain. Those conditions are that either (1) everyone is using projections substantially similar to yours, is worse at translating them to dollar values and is sticking very strictly to those dollar values at auction time; or (2) the projections you're using happen to be right. Both cases are extremely unlikely.

Few people are depending on exact projections, many projections vary significantly and many people don't even bother with projections or dollar values.

Old school types like Lawr Michaels (reigning AL Tout champ) and me (reigning mixed Tout champ) just go into the draft with a list of players to cross off. (Actually, I'm not even sure Lawr does that.) Speaking for myself, I have no preconceived dollar values in mind, haven't usually made up my mind when to stop bidding on a player, and certainly don't have precise projections for any of them. I try to know the player pool deeply (from historical performance to health to team context), use past experiences with pricing in the given format and adjust for market conditions on the fly. I translate player knowledge directly into bidding or keeping quiet based on all of these factors and skip the intermediate translation into projected numbers completely.

The argument against this is that surely I'm doing some kind of translation - I'm not just buying players at random. So I must have a sense of Derek Jeter's numbers before I pay $27 for him. That's true, but I think it misses the point. I don't speak French very well, so if I hear someone say something I can understand in French, I translate it to English first, and then I know what it means. But if I really learned to speak it, I would just hear the French, and *know* what it meant. Because I am not well versed, I must include an inefficient translation step. I think the same thing applies in fantasy baseball. If you know the players well enough, and you see what others are going for and know what similar players have gone for in the past, I believe you can translate facts directly into action. The facts plus the knowledge of the format and attention to the market as it's happening go directly into your ears the way a known language would. You don't have to translate it into the math first. (The only math I'm doing at auction involves my budget - never the players or their projections).

So you might argue that you've created a much better French-to-English translation program than mine, and therefore you have the advantage in understanding the language, but then I tell you - "Dude, I don't need one - I already speak French fluently!"

So not everyone's using the same framework you are, and I would argue your model assumes a basic inefficiency that occurs because you're relying on someone else's numbers, and you don't know the player pool fluently enough to bypass that step.

Your projections are necessarily wrong

Ron Shandler told Jeff Erickson and me on the radio this past week that something like 40 percent of the industry consensus top-15 picks actually finish in the top 15. And you can imagine the attrition rate gets worse the deeper you go into an uncertain pool. To that end, Shandler - who has advanced the ball on fantasy player evaluation about as much as Bill James has in the real-life game - has proposed the Mayberry method of simplifying player values into 1-5 ratings. Under this strategy, it no longer matters exactly what a player will do - after all, it's impossible to know whether Albert Pujols will hit 38 or 48 homers, so it gives him a rating, e.g., 5. And the aim would be to fill one's roster with the players who have the best rating-to-cost ratios. (I'm sure I'm oversimplifying here, so feel free to provide more details in the comments.)
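To make the rating-to-cost idea concrete, here is a minimal sketch in Python. It is my own oversimplified reading, not Shandler's actual method: the player names, ratings and auction prices below are all made up for illustration, and `rank_by_value` is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass
class Player:
    name: str
    rating: int  # 1-5 rating (made-up values, for illustration only)
    cost: int    # auction price in dollars (made-up values)


def rank_by_value(players):
    """Sort players by rating-to-cost ratio, best ratio first."""
    return sorted(players, key=lambda p: p.rating / p.cost, reverse=True)


# Hypothetical four-player pool; none of these numbers come from real data.
pool = [
    Player("Player A", 5, 40),
    Player("Player B", 4, 20),
    Player("Player C", 3, 10),
    Player("Player D", 2, 12),
]

ranked = rank_by_value(pool)
for p in ranked:
    print(f"{p.name}: rating {p.rating}, ${p.cost}, ratio {p.rating / p.cost:.3f}")
```

Note that a raw ratio like this leans heavily toward cheap players (the 5-rated Player A ranks last here), which is one reason to treat it as a sketch of the idea and not a draft strategy.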

Those who believe in the sacredness of projections might argue that Pujols' precise homer total is a problem of variance, but that doesn't prevent you from setting a median over/under number and using it as a baseline. Well, variance is part of the problem, but it goes deeper than that. To know the true median number requires knowledge it's just not possible to have. In a game like Texas Hold 'em, we can know with certainty the expected win percentage of JJ vs. AK. That's because every deck is exactly the same in every relevant way. Not so with baseball players, who are unique, and whose circumstances every season are unique and not repeatable. So when a supposedly declining Jeter hits 18 homers, steals 30 bases and hits .334 in 634 at-bats, it's impossible to say how much of that was due to variance, i.e., our median projection was correct, but he had his 95th percentile season, and how much of it was due to our projection flat-out being wrong.
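Here is a rough sketch of why a single season can't settle that question. Only the 18 homers come from the example above; the two medians, the spread and the bell-curve model are assumptions I've invented for illustration (real homer totals aren't normally distributed), and `percentile` is a hypothetical helper.

```python
from statistics import NormalDist

observed_hr = 18            # Jeter's homer total cited in the text
spread = 4.0                # ASSUMED standard deviation of a season's homer total
projected_median = 11.0     # ASSUMED preseason projection (hypothetical)
alternative_median = 17.0   # ASSUMED baseline if the projection was simply wrong


def percentile(observed, median, sd):
    """Where the observed total falls under a normal model (an assumption)."""
    return NormalDist(mu=median, sigma=sd).cdf(observed)


p_variance = percentile(observed_hr, projected_median, spread)
p_wrong = percentile(observed_hr, alternative_median, spread)

print(f"If the median really was {projected_median:.0f}: {p_variance:.0%} percentile season")
print(f"If the median really was {alternative_median:.0f}: {p_wrong:.0%} percentile season")
```

Under these made-up numbers, the same 18 homers read as roughly a 96th percentile season against one baseline and roughly a 60th percentile season against the other. One season gives you one data point, and nothing in it says which baseline was the real one.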

We encounter this same problem when handicapping football games. After a 4-5 game sample, it's not always too hard to determine what a team's median performance level is, and we can compare it to the team they're playing and create a line. And just because last week, one of the teams played great and the other terrible, doesn't mean that both will do the same when they meet on Sunday. The great and terrible performances can be attributed to variance - one team played to its full capacity, and another played below it. But we have 4-5 games for each to know what the baseline is for each, and while last week's contests will be a factor in that, we're going to use the entire sample to determine capacity.

The other thing we can do is season our capacity analysis with a broader historical perspective. Winning on the road in the NFL is always difficult, and maybe the game will be played in cold and windy weather where it might be harder to throw the ball downfield - okay, we factor that into the line, too - no problem.

But what happens if the team that won in a blowout last week didn't merely play to its maximum capacity but shifted its very *identity*? In other words, it didn't just play its 95th percentile game, but actually played its 60th percentile game because it's no longer a mediocre team but a very good one. The baseline moved. Examples of teams whose baselines moved are the 2007 Giants, who barely made the playoffs then beat the greatest regular season team in the history of the NFL, and the 2008 Cardinals, who limped into the playoffs then nearly took down the No. 1 defense in the NFL and would very likely have but for James Harrison's miracle 14-point play at the end of the first half and Santonio Holmes' incredible game-winning touchdown catch.

So the idea that teams or players have baselines, and all movement is variance around them is obviously fiction. Players change every year not just by varying around some median baseline but by improving their trajectory, growing their skills. Matt Holliday was a middling prospect, then a Coors Field product, and now he's a bona fide star. How did that happen? Because what a player is at age 22 or 24 or 26 does not determine what he'll be at 30. Or to the extent it does, we're just unable to read it correctly.

After the fact, we can see that Holliday's career went on a different trajectory altogether, or that the 2007 Giants had a much higher baseline than we imagined, but it's much harder to see in real time (after the Giants won in Tampa in the Wild Card round, for example) and harder still to know or "project" in advance. Moreover, it's impossible to say with any precision how much of the Giants' run was variance, and how much of it was their being a different team than they were the previous month. It's a question to which we can never know the answer. In poker, if someone hits a two-outer on you, you can know exactly the extent to which you took a bad beat.
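For contrast, the poker side of that comparison really can be worked out exactly. A minimal sketch, assuming an all-in with both hands face up (so four hole cards are known) and assuming the opponent wins only by hitting one of his two outs; `hit_probability` is a hypothetical helper name.

```python
from fractions import Fraction
from math import comb


def hit_probability(outs, unseen, cards_to_come):
    """Exact chance that at least one out appears among the cards to come."""
    misses = comb(unseen - outs, cards_to_come)
    return 1 - Fraction(misses, comb(unseen, cards_to_come))


# After the turn: 52 - 4 hole cards - 4 board cards = 44 unseen, one card to come.
on_the_turn = hit_probability(outs=2, unseen=44, cards_to_come=1)

# After the flop: 52 - 4 hole cards - 3 board cards = 45 unseen, two cards to come.
on_the_flop = hit_probability(outs=2, unseen=45, cards_to_come=2)

print(f"Two-outer with one card to come: {on_the_turn} ({float(on_the_turn):.1%})")
print(f"Two-outer with two cards to come: {on_the_flop} ({float(on_the_flop):.1%})")
```

That works out to 1 in 22 (about 4.5 percent) with one card to come and 29 in 330 (about 8.8 percent) with two. There's no equivalent calculation for a team or a player, because there's no fixed deck to count.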

Because trajectory changes are inherently unprojectable with any precision (after all, the whole concept of trajectory *is* our projection), your dollar values will necessarily be wrong. And I haven't even gotten into the difficulty of predicting and evaluating player volatility, where Conor Jackson and Cameron Maybin might fetch the same price at auction but have widely disparate ceilings and floors.

If one's projections are necessarily wrong, and not everyone needs to translate the facts into precise projections or dollar values before taking action at an auction, I don't see how doing so and having a good translation system confers a meaningful advantage.

I'll add one more speculative thought here about what I think makes a big difference in winning fantasy baseball leagues, and that's imagination. I like to try and draft off the 2011 cheat sheet usually, and the only way I have access to that is by imagining what it will look like. We can all try to do this, but I think the imagination is like a muscle of sorts, and not only must one use it and trust in it over time, but also provide it with feedback so that you know where the limits are, what's wishful thinking and what's really possible. You must be open to unlikely possibilities as well as likely ones. The biggest mistake experienced owners make is to imagine that K:BB ratio is constant, or K/9 from last year determines which pitchers have strikeout skills. The more rigid one's system, the less likely one will be ahead of the curve when something unexpected happens. Wayne Gretzky's father told him to "skate to where the puck is going, not to where it's been," and I think that's the goal here, too. If you keep an open mind, and consider the possibilities, then you have a chance to be there already when the possible becomes the actual. If you're waiting until your model can absorb the new information, you're already too late.