Friday, August 8, 2008
AL East Playoff Projections
Throughout the entire MLB season, the AL East has stood out as the strongest division. For a big part of the year, all five teams had winning records. Currently four out of the five teams have a winning record, and
Tuesday, July 22, 2008
Washington Nationals' Injury Report
In a 162-game season, injuries will undoubtedly play a role on almost every major-league team. The key to a successful season lies in part with avoiding too many significant injuries to key players.
While following Major League Baseball this season, and the Washington Nationals in particular, I have noticed an abnormal number of injuries.
Here is the injury report for the Nationals this season:
1B Nick Johnson: Out for season (only played in 38 games this year), making $5.5 million this year.
Closer Chad Cordero: Out for season (only pitched in six games this year), making $6.2 million this year.
3B Ryan Zimmerman: Out since May 26, making $465,000 this year.
1B Aaron Boone: Out since July 7, making $1 million this year.
OF Elijah Dukes: Out from beginning of season until May 9, and has been out since July 6, making $392,500 this year.
Starting Pitcher Shawn Hill: Out from March 20 to April 18 and has been out since June 25, making $402,000 this year.
OF Lastings Milledge: Out since June 29, making $402,500 this year.
C Paul Lo Duca: Out April 18 to May 2 and May 9 to June 17, making $5 million this year.
1B Dmitri Young: Out from April 8 to May 15 and out since July 19, making $5 million this year.
OF Austin Kearns: Out from May 22 to July 3, making $5 million this year.
OF Wily Mo Pena: Out March 20 to April 13 and out since July 18, making $2 million this year.
2B/3B Ronnie Belliard: Out May 20 through June 10, making $1.6 million this year.
Relief Pitcher Ryan Wagner: Out since March 20, making $450,000 this year.
C Johnny Estrada: Out from March 26 to April 9 and from May 9 to July 18, making $1.25 million this year.
Starting Pitcher Odalis Perez: Out from June 14 to June 26, making $850,000 this year.
This is a very long list, and these are all players who played, or would have played, significant roles on the Nationals this year.
The only position players who started at the beginning of the season and have avoided the DL are the middle infielders: SS Cristian Guzman and 2B Felipe Lopez. However, Lopez has lost his starting spot multiple times throughout the year.
The Nationals' payroll this season is $43.3 million. Adding up the salaries on this list, $35.5 million of that payroll belongs to players who have spent some time on the disabled list.
That is, financially, 82 percent of the team.
The Nationals currently have just over 50 percent, financially, of their team on the DL.
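The two percentages above can be checked directly from the injury list. The sketch below copies the salaries from the list; the `still_out` flags are my reading of which entries say "out for season" or "out since", so treat them as an assumption rather than official DL status.

```python
# Salaries in millions of dollars, copied from the injury list above.
# still_out is an assumption: True where the list says "out for season"
# or "out since ..." with no return date.
PAYROLL = 43.3  # total team payroll in millions, as stated in the post

INJURED = [
    ("Nick Johnson", 5.5, True),
    ("Chad Cordero", 6.2, True),
    ("Ryan Zimmerman", 0.465, True),
    ("Aaron Boone", 1.0, True),
    ("Elijah Dukes", 0.3925, True),
    ("Shawn Hill", 0.402, True),
    ("Lastings Milledge", 0.4025, True),
    ("Paul Lo Duca", 5.0, False),
    ("Dmitri Young", 5.0, True),
    ("Austin Kearns", 5.0, False),
    ("Wily Mo Pena", 2.0, True),
    ("Ronnie Belliard", 1.6, False),
    ("Ryan Wagner", 0.45, True),
    ("Johnny Estrada", 1.25, False),
    ("Odalis Perez", 0.85, False),
]


def payroll_share(players, payroll=PAYROLL, only_current=False):
    """Return (salary total, share of payroll) for the listed players.

    With only_current=True, count only players flagged as still out.
    """
    total = sum(salary for _, salary, still_out in players
                if still_out or not only_current)
    return total, total / payroll


ever_total, ever_share = payroll_share(INJURED)
now_total, now_share = payroll_share(INJURED, only_current=True)
print(f"Ever on DL:   ${ever_total:.1f}M ({ever_share:.0%} of payroll)")
print(f"Currently out: ${now_total:.1f}M ({now_share:.0%} of payroll)")
```

Under those flags the first figure comes to about $35.5 million (82 percent) and the second to about $21.8 million, just over half of the payroll.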
This is not normal, and from what I remember, this does not seem to be much different from last year, either. Nick Johnson missed almost the entire season last year as well, and Cristian Guzman missed the whole second half.
For a team that has been a bottom dweller for all of recent memory, it is extremely difficult to rebuild when all of your players are injured.
The list of players the Nationals have used in left field this season is longer than most teams’ available infielders: Rob Mackowiak, Wily Mo Pena, Elijah Dukes, Willie Harris, Paul Lo Duca, Ryan Langerhans, and Kory Casto.
I am not one to think the injury bug in DC is coincidence, and I have two possible explanations.
The first is that the Nationals’ medical staff and trainers are totally incompetent.
The second, more viable explanation is that players have no interest in coming off the disabled list.
Who can blame them? Who wants to play for a team that has been outscored by more than 100 runs this season?
The Washington Nationals have a lot of issues to address. But first and foremost, they need to get, and keep, their players healthy.
Even the players that have been healthy have no risk of being demoted because there are no available replacements. For most of the first half of the season, both Willie Harris and Wily Mo Pena struggled to hit .200. Aside from Guzman, the rest of the averages haven’t been much higher, either.
Wednesday, July 16, 2008
MLB All Star Break Report: Statistical Predictions

Teams that outscore their opponents, on average, should win a lot of games. Likewise, teams that get outscored on average should lose most of their games.
This is a very simple concept, and I will use it to analyze the MLB season before the All-Star break and make some predictions for the rest of the year.
I have used the run differential (total runs scored minus total runs allowed) in a linear regression to try to explain the win percentage of each major league team.
In general, it would make sense that teams with the highest positive run differential would also have the highest winning percentage, and vice versa: teams with the most negative run differential would have the lowest win percentage. Teams with a run differential around 0 should have a win percentage around .500 because, on average, they should win just as many games as they lose.
Of course this would never work out in real life. In addition to random variation and luck, some teams also just perform really well in close games while others do not. Some teams are more over-matched by the better teams and some teams are better at pounding the bad teams.
However, I propose that teams with a run differential much higher than their record would suggest have a strong potential to find more success in the remainder of the season because they have shown the ability to consistently outscore opponents. The reverse is also true: teams that have a win percentage much higher than their run differential would indicate (teams that are getting “lucky”) have the potential for a less successful second half of the season.
Under these assumptions, I interpreted the results from my regression and will exhibit them below. The graph of predicting win percentage from scoring difference can be seen at the top of the article.
I was very pleased with how the graph turned out for several reasons. The regression equation is:
Winning Percentage = .500 + .000941(Run Differential)
This says that for every additional run by which a team outscores its opponents, its winning percentage is expected to increase by .000941, or .0941 percentage points. It is a very good sign that this is a positive number, or else scoring more runs than your opponent would be a bad thing.
The equation also says that a team with a run differential of exactly 0 would be expected to have a winning percentage of .500. This makes sense and I was very pleased that this worked out exactly. It is a good sign that run differential is a good predictor of winning percentage.
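To make the equation concrete, here is a small sketch that plugs numbers into it. The coefficients are the ones quoted above; the figure of roughly 95 games played at the break is my own assumption, used only to translate the slope into "runs per extra win".

```python
# Coefficients quoted in the post's regression equation.
INTERCEPT = 0.500
SLOPE = 0.000941


def expected_win_pct(run_diff):
    """Predicted winning percentage for a given run differential."""
    return INTERCEPT + SLOPE * run_diff


def runs_per_extra_win(games_played):
    """Run differential needed to add one expected win over games_played.

    One run adds SLOPE to the win percentage, which is worth
    SLOPE * games_played wins, so one win costs the reciprocal.
    """
    return 1.0 / (SLOPE * games_played)


# Hypothetical inputs, chosen only for illustration.
print(f"{expected_win_pct(0):.3f}")    # a run differential of exactly 0
print(f"{expected_win_pct(100):.3f}")  # a run differential of +100
print(f"{runs_per_extra_win(95):.1f}")  # assumes about 95 games played
```

With about 95 games played, the slope works out to a little over 11 runs of differential per additional expected win.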
Finally, the R-squared value for the regression was 71.3%. This means that 28.7% of the variability in team winning percentage is left unexplained by only using run differential. This makes sense from my discussion before; some teams get lucky and some teams also have a knack for winning or losing close games. However, 71.3% is fairly high for only using one variable. While run differential is not necessarily a good precise predictor for win percentage, it is a very reasonable approximation.
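The post does not show the underlying team data or how the fit was computed, so the following is only a sketch of an ordinary least-squares line and its R-squared in plain Python. The function names are mine, and the five (run differential, win percentage) pairs are made-up placeholders, not real 2008 figures.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(xs, ys, intercept, slope):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Made-up placeholder data, NOT actual team numbers.
sample_run_diffs = [-80, -30, 0, 40, 70]
sample_win_pcts = [0.430, 0.480, 0.490, 0.550, 0.560]

a, b = fit_line(sample_run_diffs, sample_win_pcts)
print(f"intercept={a:.3f} slope={b:.6f} "
      f"R^2={r_squared(sample_run_diffs, sample_win_pcts, a, b):.3f}")
```

One property of this kind of fit is that the line always passes through the point (mean run differential, mean win percentage). Across a whole league the run differentials sum to zero and the average win percentage sits close to .500, so an intercept very near .500 is what one would expect from the arithmetic alone.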
Now that I feel fairly safe with my assumptions, here are the interpretations for the results.
First, the most interesting teams on the graph are ones that fall far from the regression line. Teams underneath the line have won fewer games than their run differential would suggest (“unlucky”), and teams above the line have won more games than their run differential would suggest (“lucky”). The further a team is from the line, the more lucky or unlucky they have been. Note that I use the terms lucky and unlucky very loosely here, as there is certainly some skill involved in winning close games.
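Distance from the line is just the residual: actual winning percentage minus the percentage the equation predicts. A minimal sketch of that calculation follows; the three teams are hypothetical, and ranking by residual is my guess at how an over-achiever list could be produced, not a confirmed description of the method used here.

```python
# Coefficients quoted in the post's regression equation.
INTERCEPT = 0.500
SLOPE = 0.000941


def luck_residual(actual_pct, run_diff):
    """Actual minus predicted win percentage; positive means above the line."""
    return actual_pct - (INTERCEPT + SLOPE * run_diff)


def rank_by_luck(teams):
    """teams: list of (name, actual_pct, run_diff). Most 'lucky' first."""
    scored = [(name, luck_residual(pct, rd)) for name, pct, rd in teams]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical teams, for illustration only.
example = [
    ("Team A", 0.580, 10),   # wins more than its run differential suggests
    ("Team B", 0.450, 40),   # outscores opponents but loses more than it wins
    ("Team C", 0.500, 0),    # sits exactly on the line
]

for name, resid in rank_by_luck(example):
    tag = "lucky" if resid > 0 else "unlucky" if resid < 0 else "on the line"
    print(f"{name}: {resid:+.3f} ({tag})")
```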
Based on the results, here are ten teams that should expect the biggest change in winning percentage for the rest of the season. The over-achievers will likely perform worse, and the under-achievers should do better.
Top 5 Over-Achievers
1. Angels
2. Marlins
3. Twins
4. Rays
5. Rangers
Top 5 Under-Achievers
1. Indians
2. Braves
3. Mariners
4. Phillies
5. Blue Jays
Now I will break down the MLB pre-All Star break season, still based on my results, for each division. I have calculated a modified version of the standings assuming that win percentage only depends on scoring margin. I included the original standings for comparison. Teams with significant changes in standings have strong potential to have differing success for the rest of the season.
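The exact procedure behind the modified standings is not spelled out, so the sketch below is one plausible reading: convert each team's run differential into an expected record with the regression equation, then compute games behind the expected division leader in the usual way. The team names, run differentials, and games played are hypothetical.

```python
# Coefficients quoted in the post's regression equation.
INTERCEPT = 0.500
SLOPE = 0.000941


def expected_record(run_diff, games):
    """Expected (wins, losses) if win percentage depended only on run_diff."""
    wins = (INTERCEPT + SLOPE * run_diff) * games
    return wins, games - wins


def games_behind(leader, team):
    """Standard games-behind formula applied to (wins, losses) pairs."""
    (lead_w, lead_l), (team_w, team_l) = leader, team
    return ((lead_w - team_w) + (team_l - lead_l)) / 2


def modified_standings(teams):
    """teams: dict of name -> (run_diff, games). Returns [(name, GB), ...]."""
    records = {name: expected_record(rd, g) for name, (rd, g) in teams.items()}
    leader = max(records.values(), key=lambda r: r[0] / (r[0] + r[1]))
    rows = [(name, games_behind(leader, rec)) for name, rec in records.items()]
    return sorted(rows, key=lambda row: row[1])


# Hypothetical division, for illustration only.
division = {"Team A": (60, 95), "Team B": (15, 95), "Team C": (-40, 94)}
for name, gb in modified_standings(division):
    print(f"{name}: {gb:.1f} games behind")
```

Rounding each result to the nearest half game would give figures in the same form as the tables below.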
AL East:
Team | Modified Standings | Original Standings |
Red Sox | - | - |
Rays | 5 | .5 |
Yankees | 7 | 6 |
Blue Jays | 7 | 9 |
Orioles | 10 | 10 |
The biggest flag here is the Tampa Bay Rays. They could be much further behind the Red Sox now, so don’t be surprised to see them fall further behind after the All-Star break.
Also, look out for the Blue Jays in the second half. It will be difficult for any team to dethrone the Red Sox, but the Blue Jays could have a strong run and at least contend for the Wild Card.
AL Central:
Team | Modified Standings | Original Standings |
White Sox | - | - |
Twins | 6 | 1.5 |
Indians | 7 | 13 |
Tigers | 7 | 7 |
Royals | 13 | 12 |
While many consider the White Sox season to date to be a fluke, the numbers suggest otherwise. They have a comfortable lead in the modified standings, so I wouldn’t expect them to fade very much.
After some small glimmers of hope, Royals fans should expect another very poor end to the season.
The Tigers still need to improve a lot to make a run at the division, and the Indians could also move up the standings a lot in the second half of the season.
The Twins may have already played their best baseball of the season, but could still hang around for a while.
AL West:
Team | Modified Standings | Original Standings |
A’s | - | 6 |
Angels | 4 | - |
Rangers | 8 | 7.5 |
Mariners | 11.5 | 20 |
The modified standings show a huge reversal at the top of this division. Even though the A’s just traded away their ace, the Angels should be a lot more worried about being caught than most people think.
The Rangers have over-achieved so far, so don’t expect them to make a serious run towards the playoffs.
Also, the Mariners aren’t quite as bad as their record would suggest. They could win a lot more games in the second half and build some momentum going into next season.
NL East:
Team | Modified Standings | Original Standings |
Phillies | - | - |
Mets | 3.5 | .5 |
Braves | 3.5 | 6.5 |
Marlins | 9.5 | 1.5 |
Nationals | 17.5 | 16 |
Like the other
The Phillies look like they are going to be tough to beat this year, but they will have to watch out not only for the Mets but the Braves as well.
The Nationals are flat out bad and should easily secure the worst record in baseball after the All-Star break.
NL Central:
Team | Modified Standings | Original Standings |
Cubs | - | - |
Cardinals | 7.5 | 4.5 |
Brewers | 8 | 5 |
Astros | 13.5 | 13 |
Reds | 14 | 11.5 |
Pirates | 15.5 | 12.5 |
From these results, the Cubs look to have the safest division lead out of all the division leaders. Every team in this division has actually over-achieved, but the Cubs and Astros have over-achieved the least.
The Brewers may be the only team with hope of making a run at the Cubs after adding C.C. Sabathia to the top of their starting rotation.
NL West:
Team | Modified Standings | Original Standings |
Dodgers | - | 1 |
Diamondbacks | .5 | - |
Giants | 6 | 7 |
Rockies | 9 | 8.5 |
Padres | 9 | 10 |
Amazingly, all of these teams under-achieved in the first part of the season. That’s a very good sign considering how bad these teams have been so far. No team has a winning record.
The Dodgers and Diamondbacks should have a very close race for the division lead, and the Giants will be looking to make it a three-way race.
The defending N.L. Champion Rockies might need another miraculous win streak to have a chance to defend their title.
Of course it is impossible to predict what is really going to happen in the future, but hopefully this analysis provides some good insight for what to expect. It will be interesting to see how well these discrepancies match up with what actually plays out, and I will be sure to keep an eye on that.
For those interested, detailed MLB Standings can be found here. I found it interesting to look at the probabilities for each team to make the playoffs, win the division, and win the Wild Card.
Monday, July 14, 2008
New York Yankees' Playoff Chances
I’d like to start off with a question: What’s wrong with the Yankees?
Sunday, June 15, 2008
Alex Rodriguez: MLB's Best Player?
Alex Rodriguez of the New York Yankees is making more money this year than the entire Florida Marlins team. This is an astounding fact in itself, especially considering the Marlins currently have a better record than the Yankees, but it is even more disturbing if you consider the value A-Rod actually adds to the Yankees. Of course, he is widely considered the best player in the league, and he is on pace to break every offensive record ever set as long as he stays healthy... but how much has he really improved the teams he has played for? Each player on a team contributes in some way to the success of that team, and intuitively, individual success for a player should imply positive contributions to team success. However, Alex Rodriguez is a perfect counterexample to this hypothesis.
Team | Last Year w/ A-Rod | Year After A-Rod | Team | First Year w/ A-Rod | Year Before A-Rod |
2000 Mariners | 91-71 | 116-46 | 2001 Rangers | 73-89 | 71-91 |
2003 Rangers | 71-91 | 89-73 | 2004 Yankees | 101-61-1 | 101-61 |
Texas Rangers Regular Season Records: 1996-2003
Year | Record | |
1996 | 90-72-1 | |
1997 | 77-85 | |
1998 | 88-74 | |
1999 | 95-67 | |
2000 | 71-91 | |
2001 | 73-89 | * A-ROD |
2002 | 72-90 | * A-ROD |
2003 | 71-91 | * A-ROD |
Major League Baseball World Series: 1996-2007
Year | Champion | Runner-Up | |
1996 | Yankees | Braves | |
1997 | Marlins | Indians | |
1998 | Yankees | Padres | |
1999 | Yankees | Braves | |
2000 | Yankees | Mets | |
2001 | Diamondbacks | Yankees | |
2002 | Angels | Giants | |
2003 | Marlins | Yankees | |
2004 | Red Sox | Cardinals | *A-ROD |
2005 | White Sox | Astros | *A-ROD |
2006 | Cardinals | Tigers | *A-ROD |
2007 | Red Sox | Rockies | *A-ROD |
Here is the list of qualified players, along with their batting averages, that were part of a team that Rodriguez left:
A-ROD LEAVING: Batting Average Comparison
Player | Team | Last Season w/ A-Rod | Season After | Improvement |
David Bell | Mariners | .247 | .260 | .013 |
Mike Cameron | Mariners | .267 | .267 | .000 |
Carlos Guillen | Mariners | .257 | .259 | .002 |
Stan Javier | Mariners | .275 | .292 | .017 |
Mark McLemore | Mariners | .245 | .286 | .041 |
John Olerud | Mariners | .285 | .302 | .017 |
Dan Wilson | Mariners | .235 | .265 | .030 |
Hank Blalock | Rangers | .300 | .276 | -.024 |
Michael Young | Rangers | .306 | .313 | .007 |
A-ROD ARRIVING: Batting Average Comparison
Player | Team | Season Before | First Season w/ A-Rod | Improvement |
Frank Catalanotto | Rangers | .291 | .330 | .039 |
Rusty Greer | Rangers | .297 | .273 | -.024 |
Gabe Kapler | Rangers | .302 | .267 | -.035 |
Ricky Ledee | Rangers* | .236 | .231 | -.005 |
Rafael Palmeiro | Rangers | .288 | .273 | -.015 |
Ivan Rodriguez | Rangers | .347 | .308 | -.039 |
Jason Giambi | Yankees | .250 | .208 | -.042 |
Derek Jeter | Yankees | .324 | .292 | -.032 |
Jorge Posada | Yankees | .281 | .292 | .011 |
Bernie Williams | Yankees | .263 | .262 | -.001 |
Enrique Wilson | Yankees | .230 | .213 | -.017 |
*played only a partial season with the Rangers in 2000 (season before A-Rod)
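The two batting-average tables above can be boiled down to an average change per group. The sketch below copies the averages from the tables, stored as whole "points" (thousandths) to avoid floating-point noise; the function and variable names are mine.

```python
# (player, average before the change, average after), in thousandths,
# copied from the two tables above.
LEAVING = [  # last season with A-Rod -> season after he left
    ("David Bell", 247, 260), ("Mike Cameron", 267, 267),
    ("Carlos Guillen", 257, 259), ("Stan Javier", 275, 292),
    ("Mark McLemore", 245, 286), ("John Olerud", 285, 302),
    ("Dan Wilson", 235, 265), ("Hank Blalock", 300, 276),
    ("Michael Young", 306, 313),
]
ARRIVING = [  # season before A-Rod -> first season with him
    ("Frank Catalanotto", 291, 330), ("Rusty Greer", 297, 273),
    ("Gabe Kapler", 302, 267), ("Ricky Ledee", 236, 231),
    ("Rafael Palmeiro", 288, 273), ("Ivan Rodriguez", 347, 308),
    ("Jason Giambi", 250, 208), ("Derek Jeter", 324, 292),
    ("Jorge Posada", 281, 292), ("Bernie Williams", 263, 262),
    ("Enrique Wilson", 230, 213),
]


def summarize(rows):
    """Mean change in batting average plus counts of who went up or down."""
    changes = [after - before for _, before, after in rows]
    return {
        "players": len(changes),
        "mean_change": sum(changes) / len(changes) / 1000,
        "improved": sum(c > 0 for c in changes),
        "declined": sum(c < 0 for c in changes),
        "unchanged": sum(c == 0 for c in changes),
    }


for label, rows in (("A-Rod leaving", LEAVING), ("A-Rod arriving", ARRIVING)):
    s = summarize(rows)
    print(f"{label}: mean change {s['mean_change']:+.3f} over "
          f"{s['players']} players ({s['improved']} up, {s['declined']} down)")
```

On these numbers the average teammate gained about 11 points after Rodriguez left and lost about 15 points after he arrived. This is only a summary of 9 and 11 players, not a significance test.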
So what does this really mean? The changes that Alex Rodriguez has brought about in his teams’ performance are entirely counterintuitive to how a player with such outstanding individual success should affect a team. His departure gave way to one of the best regular seasons in Major League history for the Seattle Mariners in 2001, and his addition to the Yankees in 2004 appears to be the event that reversed the Curse of the Bambino. Since he has only changed teams twice, it is very difficult to rule out coincidence as the cause of the differences in team success. Still, the evidence available is startling. Even if A-Rod is the victim of coincidence, he still appears to carry a great deal of bad luck with him.
Further, players who play with Rodriguez tend to do better after he leaves, and players who are new to playing with him tend to do worse than in the prior season. This claim is a slight, but reasonable, extrapolation from the evidence in the tables above of (1) a correlation between A-Rod's arrival and a decrease in teammates' batting averages and (2) a correlation between A-Rod's departure and an increase in teammates' batting averages. The reasons behind this A-Rod effect are still totally unclear, but also relatively unimportant.
The goal of any Major League franchise should be to win as many games as possible, so to maximize team success it would be logical to avoid Alex Rodriguez at all costs. This is the complete opposite of the approach of the New York Yankees, who are paying him more than any other player in the history of sports. Admittedly, A-Rod is an outstanding individual player, but the data presented here suggest his presence has not helped his Major League teams.
For some final notes, I would hypothesize that other superstars in baseball and in other sports may negatively affect teammates’ individual performance. Ideally the superstar’s own performance would more than compensate for this, but clearly this has not been the case with Rodriguez. I also found it interesting that the few players who did not follow the pattern of the rest of A-Rod's teammates (Gabe Kapler, Frank Catalanotto, and Jorge Posada) all had relatively large changes in batting averages. I was surprised that the statistical evidence found was so strong in spite of these three, and I am also curious as to what made these players immune to what happened to the rest of their teammates.