Previously, I described my hitter and pitcher projection philosophy and methods. Now it's time for the next step - valuation. The process is rooted in replacement player theory. Way back when, when we needed to name the method, REP was chosen. Since then, others have deemed it PVM for Percent Value Method.
What follows is a slightly edited essay first published for Platinum subscribers in 2010. In it, you'll note the notion of positional scarcity was debunked, something everyone else is just noticing now - another example of Mastersball setting the curve.
After the 3300-plus word monster is concluded, I'll chime in with some current thoughts as eight years is a long time, and my approach in a couple areas has changed.
Additionally, I'll link the valuation chapter of the Mastersball Annual, a book we published almost twenty years ago.
Without further ado, here's a discussion of the Mastersball Valuation Methodology.
Simply stated, fantasy baseball is a game in which you assemble a team of real baseball players whose statistics are used to score and ultimately rank your team. To do this effectively, you need to do three things
This essay will focus on the middle aspect, the quantification of performance, perhaps better known as player valuation.
There is a bevy of valuation systems in use to quantify statistics. Why are there multiple? Because there is no definitive right or wrong. There is no precise manner to put a static designation on a fluid entity. There may be better ways, but it follows that if there were truly a correct or even best way, it would be basically universal. Admittedly, in my younger, more naïve days I felt the method I am about to describe was the be‐all‐end‐all and dedicated my life to that crusade. But over the years, I have developed a truer grasp of what it really takes to succeed at this endeavor and have softened my stance. Player values and rankings are a guide, a piece of the puzzle. I much prefer being recognized for my acumen in the third element of the hobby described above, the assembling of your squad, than being known as the premier valuation guy in the industry, though that does have its advantages. That said, those of you who favor the popular SGP method are using an inferior process, as it is theoretically and mathematically flawed. After all, I did say soften, not completely change my stance.
What you need from a valuation system is a snapshot view of what a player is worth relative to other players. Since this snapshot is composed of several elements, it also helps if you have an idea of what comprises that snapshot. It really helps if you understand how the snapshot is generated, so you can do some massaging to the system to best make it work for your league and its unique tendencies.
What you don’t need is a green light, red light designation of perfectly accurate value. I chalk it up to the maturation process, both as a person and a fantasy gamer, but I honestly feel the focus I put on “proper player valuation” stunted my growth as a player, detracting from my ability to look at the big picture, understanding how to use that piece of information most efficiently.
With all that said, what follows is a description of a valuation process that I believe to be most effective when looking at the big picture. It does not presuppose anything in terms of player value. It gives an unbiased snapshot of how each player can help your squad. It is incredibly flexible, so it can handle any tweak or alteration you feel necessary. It can account for all aspects of your league’s dynamics and does so in a sound philosophical as well as logical manner.
Simply put, value is distributed in proportion to each player’s contribution to the overall player pool. If I have $1000 to pay a crew for doing a job, someone who did 50% of the work gets $500. If someone else did 30%, they get $300, leaving $200 for the remaining 20% contribution of the third member.
Of course, valuing players for rotisserie style scoring is more complex as contributions are across multiple scoring categories. The player’s contribution in each category is determined, and these are all summed for a total value.
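The crew arithmetic can be sketched in a few lines of Python. The function name `split_by_contribution` and the crew labels are illustrative assumptions, not part of any Mastersball tool.

```python
def split_by_contribution(budget, contributions):
    """Distribute a budget in proportion to each member's share of the total."""
    total = sum(contributions.values())
    return {name: budget * share / total for name, share in contributions.items()}


# Percent of the work done by each (hypothetically named) crew member.
crew = {"first": 50, "second": 30, "third": 20}
payouts = split_by_contribution(1000, crew)

for name, dollars in payouts.items():
    print(f"{name}: ${dollars:.2f}")
```

The same proportional split is what gets applied, category by category, to the player pool.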
Boundarie$ and Parameter$
Even though much of the introduction was designed to drive home the point that ultimately the value assigned should be viewed rather loosely, we still need to treat the system in a static nature. As such, there are several logical boundaries and parameters of a sound valuation method.
Let us begin with what will be referred to as the draft‐worthy pool. The draft‐worthy pool should be composed of exactly enough players to field a league full of legal rosters, taking into account positional requirements. By means of example, a 12‐team league with 14 hitters and 9 pitchers will have a draft‐worthy hitting pool totaling 168 players, with 108 in the draft‐worthy pitching pool. More specifically, if the positional requirements are the standard 2C, 1B, 2B, 3B, SS, CI, MI, 5OF and UT, then there need to be 24 C, 12 1B, 12 2B, 12 3B, 12 SS, 12 CI, 12 MI, 60 OF and 12 UT in the hitting pool.
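As a quick check on the pool sizes, here is a minimal sketch that multiplies each roster slot by the number of teams. The variable names are assumptions chosen for illustration.

```python
teams = 12
# Standard hitting slots per team, as listed in the text.
hitting_slots = {"C": 2, "1B": 1, "2B": 1, "3B": 1, "SS": 1,
                 "CI": 1, "MI": 1, "OF": 5, "UT": 1}
pitchers_per_team = 9

# Draft-worthy players needed at each slot across the whole league.
pool_by_slot = {slot: count * teams for slot, count in hitting_slots.items()}
hitting_pool = sum(pool_by_slot.values())
pitching_pool = pitchers_per_team * teams

print(pool_by_slot)
print("hitters:", hitting_pool, "pitchers:", pitching_pool)
```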
Because most rules specify a minimum bid of $1 on each player, the lowest ranked player of the draft‐worthy pool should be worth $1, with the top‐ranked non‐drafted player being $0. An argument can be made this condition should be set upon each position. That is, the worst catcher in the draft‐worthy pool be set at $1, the worst second baseman $1, etc. Later, the mathematical manner to do this will be detailed.
Values should be assigned in a zero‐sum nature. A typical team budget is $260. That means our 12‐team league has $3120 to spend on the previously discussed 168 hitters and 108 pitchers. The entire $3120 should be exactly distributed amongst the 276 players comprising the draft‐worthy pool.
Because the points earned in each scoring category in most rotisserie leagues are equal, the money assigned to each should be the same. For example, in leagues that use a $260 budget with 5 hitting and 5 pitching categories, you should plan on spending $26 for each. However, as most everyone knows, conventionally, more money is spent on hitting than pitching. Currently, the average 5x5 league spends 69% of its budget on hitters. This drops to 67% in the ever‐disappearing 4x4 leagues. This equates to each 5x5 team budgeting $179.40 ($35.88 per category) for hitters and $80.60 ($16.12 per category) for pitching. In a global sense, a 12-team league distributes $430.56 per hitting category and $193.44 per pitching category.
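The budget figures follow directly from the 69% hitting split. The sketch below reproduces them; the variable names are assumptions, and the split itself is the league-average figure quoted in the text, which you may want to change for your own league.

```python
teams = 12
team_budget = 260
hitting_share = 0.69          # share of budget spent on hitters (5x5 average from the text)
hitting_categories = 5
pitching_categories = 5

team_hitting = team_budget * hitting_share
team_pitching = team_budget - team_hitting

# League-wide dollars available in each individual category.
league_per_hit_cat = teams * team_hitting / hitting_categories
league_per_pitch_cat = teams * team_pitching / pitching_categories

print(f"team hitting ${team_hitting:.2f}, per category ${team_hitting / hitting_categories:.2f}")
print(f"team pitching ${team_pitching:.2f}, per category ${team_pitching / pitching_categories:.2f}")
print(f"league per hitting category ${league_per_hit_cat:.2f}")
print(f"league per pitching category ${league_per_pitch_cat:.2f}")
```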
Replacement Level and the Concept of Useful Stats
We have already established that each player’s value is assigned according to the summed percentage contribution across the categories. The number of players with a value of $1 and greater is dictated by the number of teams in the league and positional requirements while the total amount of money distributed is equal to the number of teams multiplied by the team budget.
In short, I have an issue paying for something I can get for free. Okay, this does not explain my penchant for buying bottled water, but I digress. In fantasy baseball terms, due to the positional restraints of legal lineups, there is a certain level of statistics that everyone has on their roster. If everyone has these, why pay for, ergo, place a value on them? It does not make sense. If you are doing a football pick‐‘em pool and everyone chooses the same team to win, the result of that game is inconsequential. If the worst catcher on a roster in a fantasy league is projected to hit 5 homers, then everyone in the league has those same 5 homers, so why pay for them? What you want to pay for is that which differentiates you from the rest. I term these “useful statistics”. To bring the point home, our system only values these useful stats.
Here is an example I like to use to demonstrate the concept and utility of useful stats. Let us set up a 2‐team HR derby league, you and me. We each need a player from Group A and one from Group B. I will give you first pick. Here is the available player pool:
So, who do you want? Hopefully Green. Why? If you take Red because he is the best hitter, I will take Green then Blue for a total of 65 HR. You get Red and Yellow for a total of 60 HR and I get to call SCOREBOARD!!!
Here is how the pool should be considered:
This represents the number of USEFUL homers each batter swats.
At this point, you may be wondering if this is the mathematical manner to deal with positional requirements, that is, what if instead of 2 groups there were 6 and instead of alphabetical designation, there were catcher, first base, second base, etc.? You are very wise, Grasshopper.
This is precisely the manner proper valuation should be conducted and will also result in the worst player in each pool being valued at $1 as discussed previously. It also explains why 20 homers from a catcher are worth more than 20 homers from a different position. In 2‐catcher leagues, the same 20 homers account for more useful homers for the catcher as the amount subtracted from the replacement catcher is smaller than that of the other positions. In mixed leagues 15 or 16 of a catcher’s 20 home runs are useful as compared to perhaps only 12 or 13 for the other positions.
While it is easiest to explain the concept of useful stats using a simple home run derby league, the fact there are multiple categories in standard rotisserie formats adds a significant layer of complexity to the process. The way we overcome this issue is to employ a mythical replacement player, who is a composite of the worst players at each position. You cannot single out a particular player as there are many reasons why a player is of low value. He could have a poor batting average but decent counting stats. He could have a poor average and low power, but a lot of steals. His average could be solid but the associated production minimal. The point is, using an individual player to set the replacement level can skew the useful stats as the adjustment could be too much or too little. So instead, we use the mythical player who has a mythical stat line representing the average production of the last few replacement level players.
A final point to be made is some draft‐worthy players may in fact contribute a negative value in a counting stat category if their contribution is lower than that of the mythical replacement player in that category.
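A minimal sketch of the composite idea follows. The three fringe stat lines are invented purely for illustration, and how many players count as "the last few" is left as a choice for the reader.

```python
def composite_replacement(stat_lines):
    """Average several fringe stat lines into one mythical replacement line."""
    categories = stat_lines[0].keys()
    return {cat: sum(line[cat] for line in stat_lines) / len(stat_lines)
            for cat in categories}


# Hypothetical fringe players: one speedster, one slugger, one in between.
fringe = [
    {"HR": 4, "SB": 12},
    {"HR": 10, "SB": 1},
    {"HR": 7, "SB": 2},
]
replacement = composite_replacement(fringe)

# A draft-worthy player can still be negative in a single category.
player = {"HR": 5, "SB": 20}
useful = {cat: player[cat] - replacement[cat] for cat in player}
print(replacement)
print(useful)
```

Here the hypothetical player is worth 15 useful steals but minus 2 useful homers, which is the negative categorical contribution described above.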
Converting Ratio Stats to Counting Stats
There is one final speed bump that we need to deal with before we are ready to tie it all up. It is straightforward to envision the distribution of value with the categories involving the counting stats such as HR, RBI, runs, SB, wins, saves and K in standard formats. It is a mite trickier with respect to the ratio categories of BA, ERA and WHIP. We need to convert a ratio stat to a counting stat. This exercise is worthy of an essay unto itself, so I will just provide the CliffsNotes version and encourage questions on the message forum.
What we do is take the player’s ratio and compare it to a baseline ratio, then multiply the difference by at bats or innings pitched to apply a weight. We have empirically determined that the most effective baseline ratio is that of the typical last place team in your league of the category in question.
Since the baseline batting average for hitters is numerically lower than what you expect for a useful hitter, the baseline average is subtracted from the player’s average and multiplied by at bats. On the other hand, since a superior ERA and WHIP are numerically lower than the baseline, they are subtracted from the baseline and multiplied by innings. The resulting number is now treated the same as a counting stat.
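The conversion can be written out as a short sketch. The function names and the baseline values used in the example (.260, 4.50, 1.40) are assumptions for illustration only; the text recommends using the typical last-place team's ratio from your own league.

```python
def batting_average_units(hits, at_bats, baseline_avg):
    """(player average - baseline average) weighted by at bats."""
    return (hits / at_bats - baseline_avg) * at_bats


def era_units(earned_runs, innings, baseline_era):
    """(baseline ERA - player ERA) weighted by innings; lower ERA is better."""
    era = 9 * earned_runs / innings
    return (baseline_era - era) * innings


def whip_units(walks_plus_hits, innings, baseline_whip):
    """(baseline WHIP - player WHIP) weighted by innings; lower WHIP is better."""
    whip = walks_plus_hits / innings
    return (baseline_whip - whip) * innings


# Hypothetical players measured against hypothetical baselines.
ba = batting_average_units(hits=165, at_bats=550, baseline_avg=0.260)
era = era_units(earned_runs=60, innings=180, baseline_era=4.50)
whip = whip_units(walks_plus_hits=200, innings=180, baseline_whip=1.40)
print(round(ba, 2), round(era, 2), round(whip, 2))
```

Each result is then handled like any other counting stat, including the subtraction of the replacement level.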
Treatment of Middle Infield, Corner Infield and Utility Positions along with Multiple Eligibility Players
The in‐depth manner to account for the fact that either a second baseman or shortstop can fill middle infield, a first baseman or third baseman can be slotted at corner and all hitters can fill utility is beyond the scope of this essay. For those interested, the explanation is provided in other site material. For this essay, let us assume in our model league above the middle infield pool is composed equally of 6 second basemen and 6 shortstops while the corner pool has 6 first basemen and 6 third basemen. We will also assume the utility pool is all outfielders and DH‐only. This means the draft‐worthy pool will include 24 catchers, 18 at each infield position and 72 outfielders. In your league, the actual spread will be different. We explain how to deal with this in primers explaining the actual usage of the site’s CVRC (customizable value and ranking calculator).
As you no doubt are aware, some players carry multiple eligibility. We use the assumption that they will be drafted at the position where they enjoy the most value. As such, we designate positions according to the following positional hierarchy:
C > 2B > SS > 3B > 1B > OF
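Assigning a primary position from this hierarchy is a simple lookup. In the sketch below, the function name and the `"UT"` fallback for DH-only players are assumptions made for illustration.

```python
# Hierarchy from the text, most valuable position first.
HIERARCHY = ["C", "2B", "SS", "3B", "1B", "OF"]


def primary_position(eligible_positions):
    """Return the first position in the hierarchy at which the player is eligible."""
    for position in HIERARCHY:
        if position in eligible_positions:
            return position
    return "UT"  # assumption: players with no fielding eligibility go to utility


print(primary_position({"OF", "2B"}))
print(primary_position({"1B", "3B"}))
print(primary_position({"DH"}))
```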
Putting it All Together
We now have all the necessary components to determine dollar values. We can determine the number of useful stats each player contributes by subtracting the corresponding replacement level across the positions. Using our model league and considering just home runs, the useful home run totals of the top 24 catchers, the top 18 players at each infield position and the top 72 outfielders are all summed and represent the total of useful homers for the draft‐worthy pool. Value is then assigned according to the percentage of useful homers each player contributes multiplied by the monetary amount assigned for the pool.
By means of example, let us say the pool of homers for the draft‐worthy pool is 2000. An outfielder is projected to hit 40 and the replacement at the position is 10. He is thus given credit for 30 useful homers. According to our calculations above, each hitting category is allocated $430.56. The player’s HR$ is then 30/2000 x $430.56 or $6.46. This is done in a similar manner for the other categories and the individual categorical contributions are summed for a final value.
To emphasize a point discussed previously, let us consider a catcher that is projected to hit the same 40 homers as the outfielder. The difference is the replacement level for catchers is no doubt less than that for outfield, perhaps only 5. This yields 35 useful homers for our catcher, translating to a HR$ of $7.53. The same 40 raw homers are worth more coming from a catcher as he contributes more useful homers to the global total.
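The two HR$ examples can be reproduced with a small helper. The function name `category_dollars` is an assumption; the inputs are the figures used in the text.

```python
def category_dollars(raw_stat, replacement_level, pool_useful_total, category_budget):
    """Dollars earned in one category: share of useful stats times the category budget."""
    useful = raw_stat - replacement_level
    return useful / pool_useful_total * category_budget


HR_BUDGET = 430.56       # league-wide dollars per hitting category
POOL_USEFUL_HR = 2000    # useful homers in the draft-worthy pool (example figure)

outfielder_hr = category_dollars(40, 10, POOL_USEFUL_HR, HR_BUDGET)
catcher_hr = category_dollars(40, 5, POOL_USEFUL_HR, HR_BUDGET)
print(f"outfielder HR$ {outfielder_hr:.2f}, catcher HR$ {catcher_hr:.2f}")
```

A full value would repeat this for every scoring category and sum the results.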
In the name of full disclosure, there is still some algebraic tweaking necessary as doing replacement in this fashion results in a hitting pool not necessarily composed of exactly 168 players and a pitching pool with precisely 108 hurlers. The take home lesson is not this adjustment, but the understanding of how we assign player value in a global sense.
Making the Theoretical Practical
You can now put away your calculator. We are done considering the value calculation as a static entity. While it is true that what has been described is a theoretically logical procedure, it is not an entirely practical means of assigning value in all instances. There is a fine balance between what a player is theoretically worth and the practical amount you need to pay to acquire his services.
The multiple eligibilities of players along with the corner, middle and utility designations cloud the picture. Who is to say every player eligible at both second and outfield is put at second? Doing this alters the number of players in each position’s draft‐worthy pool, skewing the composition of the replacement player which affects the number of relative useful stats each player contributes. The best way to combat this is to simplify your pool designation. In almost all leagues, the catcher pool needs to remain distinct. The first basemen and third basemen can be combined into a single corner infield pool. Similarly, second basemen and shortstops can be merged into the middle infield pool. This cuts down the total number of pools from 6 to 4. In addition, since many outfielders and corner infielders have eligibility in both pools, integrating those is perfectly acceptable as well. Now we only have 3 pools to deal with. Finally, partly due to the plethora of multiple eligibility players and the nature of the current player pool in general, in AL and NL only leagues and even some deeper mixed leagues, the replacement level player is so close to the same across all non‐catcher positions you can really simplify matters by using only a catcher and non‐catcher pool. All you need to do is lump the respective pools together and determine the replacement level player based on that new group.
Another consideration is you may not feel it is practical to assign the same budget to each category. You may want to invest a higher portion to more stable categories. Perhaps this entails devaluing batting average and wins. Perhaps your league’s dynamics are such that speed or saves are devalued. You can readily adjust the budget you dedicate to steals or saves. The idea of devaluing speed makes sense from a theoretical aspect as well and is something we first discussed several years ago. Our value system assumes linear distribution of stats within the final standings. However, the reality is the spread between teams is not linear, especially in steals. We have conducted some studies that show you do not need to spend as much money in the steals category to finish at the same point in the category as you do others. We call this category efficiency and adjust our category weights accordingly, shunting some budget from steals to home runs. Why home runs? Site research demonstrates the category league champions fare the best in is homers, so it makes sense to help ensure success there. In addition, the same studies show winning teams fare the poorest in steals, providing further evidence that it is practical to reduce steals allocation.
As suggested in the introduction, the beauty of our system is it is flexible enough to easily handle these and any other practical alterations. The foundation is rooted in solid theoretical principles which can be modeled mathematically. But the roots are not unmovable. So long as you understand the principles, you can adjust in any manner you deem reasonable to produce the most practical, hence useful set of bid guidelines possible. This is true for any size league with any positional requirements and any scoring categories. It can be adapted to keeper leagues. There is not a format we cannot handle. The key is understanding exactly what the value represents. It is not an (incorrect) measure of how many points you can gain in the standings with that player. It is not a measure of how many standard deviations a player is from an average player. It is the summed total of each player’s contribution of useful stats across each scoring category. We firmly believe this provides you with the optimal guidance to assist in your endeavor to assemble a championship team.
OK, back to 2018 Todd again. For those unaware, Mastersball Platinum has an Excel tool programmed to generate values using the PVM method. The original process involves iterative sorts until replacement stabilizes, something I am not skilled enough to program. Instead, I use the LARGE function to derive the replacement level player. As an example, in a 12-team league with two catchers, the replacement level for homers is the 24th highest total. The replacement for the rest of the pool is the 144th largest. This is subtracted from the rest of the pool to derive useful homers.
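For readers who prefer code to spreadsheets, here is a rough Python equivalent of the LARGE approach. It is a sketch only: the helper name `nth_largest` and the five sample homer totals are assumptions, and the example uses a scaled-down pool so the arithmetic is easy to follow.

```python
def nth_largest(values, n):
    """Python stand-in for Excel's LARGE(array, n): the n-th highest value."""
    return sorted(values, reverse=True)[n - 1]


teams, catchers_per_team, hitters_per_team = 12, 2, 14
catcher_rank = teams * catchers_per_team                   # rank of the replacement catcher
non_catcher_rank = teams * hitters_per_team - catcher_rank  # rank for the rest of the pool
print(catcher_rank, non_catcher_rank)

# Scaled-down illustration: hypothetical catcher homer projections, 2 roster spots.
catcher_hr = [30, 22, 18, 9, 5]
replacement_hr = nth_largest(catcher_hr, 2)
useful_hr = [hr - replacement_hr for hr in catcher_hr]
print(replacement_hr, useful_hr)
```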
Another small tweak from the original method is using marginal pricing, since it's easy to program. Here, every player is assigned $1, since conventionally, that's the minimum required. The replacement player then earns $0.
Mathematically, in a 5x5 league, each player gets $0.20 marginal pricing for each category. As such, the cumulative marginal pricing needs to be removed from the category pool. Recall in a 12-team, 5x5 league, each hitting category distributes $430.56 to the draft-worthy pool. There are 168 hitters, assigned $33.60 marginal value. This is subtracted from $430.56, leaving $396.96 to be distributed to the useful stats in each category.
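The marginal pricing arithmetic for one hitting category looks like this in Python. The variable names are assumptions; the inputs are the figures from the text.

```python
hitters = 168
hitting_categories = 5
category_pool = 430.56      # league-wide dollars per hitting category

# Each hitter's $1 minimum is spread evenly over the hitting categories.
marginal_per_category = 1 / hitting_categories
reserved = hitters * marginal_per_category

# What is left in the category after setting aside every hitter's share of $1.
distributable = category_pool - reserved
print(f"reserved ${reserved:.2f}, left for useful stats ${distributable:.2f}")
```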
This process is done for all five categories, then each category contribution is summed for the final value. At this point, the pool probably isn't perfectly sized, so the program adjusts the prices proportionately, so they fit within the parameters and boundaries of the specific league.
The final change from the original method is an empirical discovery, unique to Mastersball. The research is available to Platinum subscribers and will soon be brought out from behind the firewall. Even with accounting for catcher scarcity, the pre-season values aren’t representative of what will transpire over the season. In short, the calculated scarcity bump is too severe. To deal with this, I’ve coded the CVRC to price backstops more realistically. Note, this is only true for two-catcher leagues.
Now for the grand finale. In the inaugural Mastersball Annual, John Mosey authored the valuation chapter. Mosey did a great job, but some readers had trouble understanding it, so I was tasked with translating the chapter into English for our second publication. Mosey was quite gracious and helpful throughout the process.
I was not alone in the endeavor, enlisting friend and colleague Rob Leibowitz, now proprietor of Rotoheaven to be my guinea pig and editor. Rob not only made sure my words were clear, he tested the steps of the method along the way.
For those inclined to download the chapter and try it out, I can’t promise much support. Things have changed for me professionally and I may not be able to guide you through as closely as a few years ago. I can, however, preach patience. A LOT of patience. Eventually, the replacement level will settle. After going through the process several times, you’ll probably figure out some tricks. But again, to get there, BE PATIENT. Feel free to post questions on the message forum.
With that, here’s the chapter on Player Valuation, circa 2002 (or so).
National Fantasy Baseball Championship Cutline Primer
(Note - this was originally written a year ago, so some of the references are a little dated. They'll be updated for Platinum subscribers in early February)
New for the 2016 fantasy baseball season is a unique contest offered by our friends at the National Fantasy Baseball Championship (NFBC) called the Cutline Championship. A complete review of the rules can be found HERE.
In brief, the Cutline Championship is a points-based, best-ball scoring format. The leagues consist of ten teams and use standard NFBC roster requirements and position eligibilities. There will be an initial snake draft to fill 36 roster spots then a pair of in-season FAAB periods. The first is the week after the season starts where you can add up to five more players with the second in early June where you can add as many as you want to a maximum of 46 roster spots. The regular season ends right around the All-Star break where teams will be entered into the Cutline Finals, Consolation Round or have their season end. More teams will be eliminated over the next nine weeks until a Cutline Champion is crowned in early September.
What follows is a primer for those entering the inaugural Cutline Championship. Even though the discussion will focus on that contest, many of the principles transcend into other formats so hopefully all Platinum subscribers can glean a nugget or two to help in their draft prep.
The Cutline scoring is designed so that the ranking of the players by points emulates the ranking via standard 5x5 rotisserie scoring. The hitter’s correlation coefficient is .99 while pitcher’s is about .90.
A noteworthy difference between the Cutline and other NFBC contests is there isn’t a Friday transaction day for hitters. The scoring period for everyone runs from Monday through Sunday.
Points are awarded as follows:
For those not familiar, best-ball scoring means your optimum lineup will be determined automatically each week without you ever setting a lineup. The only team management required is the initial draft and the two in-season FAAB periods. The NFBC site does the rest.
The intelligence is designed to account for corner infield, middle infield, utility and multiple position eligibility. There’s no delineation between starting pitchers and relievers – your top nine arms each week contribute to your total, regardless of their role.
PROPER RANKING USING POINTS SCORING
As discussed, back-testing using previous season’s final stats was used to produce a system that correlates very well to 5x5 roto-scoring. That’s all well and good but it’s still essential to come up with a draft list incorporating principles intrinsic to points scoring.
If you play fantasy football you know where this is going. The key to points leagues is rankings should not be based on raw points but rather adjusted points using the last player drafted at each position as a baseline. The idea is everyone in the league is credited with the number of points scored by the worst active player at each position so the person with that player essentially earns no useful points from that player.
Mathematically, find the worst draft-worthy player at each position, subtract those points from everyone at the position and re-rank according to those adjusted points.
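A minimal sketch of that adjustment follows. The function name, the tiny player list and the slot counts are all hypothetical; in a real league the slot counts come from the number of teams and the roster requirements.

```python
from collections import defaultdict


def adjusted_points(players, draftable_slots):
    """Subtract each position's replacement points and re-rank.

    players: list of (name, position, projected_points) tuples.
    draftable_slots: number of draft-worthy players at each position.
    """
    points_by_position = defaultdict(list)
    for _, position, points in players:
        points_by_position[position].append(points)

    # Replacement level is the worst draft-worthy player at the position.
    baseline = {
        position: sorted(pts, reverse=True)[draftable_slots[position] - 1]
        for position, pts in points_by_position.items()
    }
    adjusted = [(name, pos, pts - baseline[pos]) for name, pos, pts in players]
    return sorted(adjusted, key=lambda row: row[2], reverse=True)


# Hypothetical projections: catchers score less but sit further above replacement.
players = [("A", "C", 300), ("B", "C", 250), ("C", "C", 200),
           ("D", "OF", 450), ("E", "OF", 420), ("F", "OF", 400), ("G", "OF", 380)]
ranking = adjusted_points(players, {"C": 2, "OF": 3})
for row in ranking:
    print(row)
```

In this made-up pool, catcher A trails outfielder E by 120 raw points yet ranks ahead of him once replacement level is removed.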
Truth be told, this is by no means perfect, especially in a best-ball format. The calculation only works if one player occupies each roster spot all season – which is obviously not the case. In addition, the use of corner, middle, utility and players that are eligible for multiple positions skew the replacement level. Still, doing the best you can to determine replacement is better than ignoring it. Ultimately, draft flow comes down to varying expectations of player performance but having a starting point where, at minimum, the players are ranked accurately relative to each other is very beneficial.
HOW MASTERSBALL GENERATES CUTLINE RANKINGS
Let’s start with the easy part – pitching. There are ten teams with nine roster spots so the expected points from the 90th highest total is subtracted from all the hurlers.
Hitting is where it gets dicey. Here’s what we know.
Players with multiple eligibility are assigned a primary position according to this hierarchy:
C > SS > 2B > 3B > OF > 1B
This is how I view the strength of positions – you may see it differently. Your team, your call.
The projected points for all the hitters are calculated. The top-140 (ten teams, 14 roster spots) are examined to see if the above criteria are satisfied, starting with catcher and moving down the hierarchy. If a position is short, the highest ranking player at that position is brought into the top-140, knocking out the lowest ranked player at a position that has not yet been checked. When finished, the top-140 should now consist of ample players at each position to fill all ten active rosters.
The lowest ranked player at each position is identified and those points are subtracted from every player with that same primary position. These adjusted points are used to rank hitters and pitchers together.
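The pool-building steps can be sketched as follows. This is one reading of the description, not the code behind the Mastersball rankings: the function names, the positional minimums passed in, and the six-player example are assumptions for illustration.

```python
def build_pool(players, pool_size, minimums, hierarchy):
    """Return the draft-worthy hitters after enforcing positional minimums.

    players: list of (name, primary_position, projected_points) tuples.
    minimums: required number of players at each primary position (hypothetical here).
    """
    ranked = sorted(players, key=lambda p: p[2], reverse=True)
    pool, bench = ranked[:pool_size], ranked[pool_size:]
    checked = set()
    for position in hierarchy:
        while sum(p[1] == position for p in pool) < minimums.get(position, 0):
            # Best player at the short position who is still outside the pool.
            promoted = next((p for p in bench if p[1] == position), None)
            # Only players at positions not yet checked may be knocked out.
            droppable = [p for p in pool
                         if p[1] != position and p[1] not in checked]
            if promoted is None or not droppable:
                break  # nothing left to promote or to knock out
            demoted = min(droppable, key=lambda p: p[2])
            pool.remove(demoted)
            bench.remove(promoted)
            pool.append(promoted)
            bench.append(demoted)
            bench.sort(key=lambda p: p[2], reverse=True)
        checked.add(position)
    return pool


def replacement_levels(pool):
    """Lowest projected points among pool members at each primary position."""
    levels = {}
    for _, position, points in pool:
        levels[position] = min(points, levels.get(position, points))
    return levels


# Hypothetical mini-league: 4 roster spots, at least 1 catcher and 2 outfielders.
hitters = [("OF1", "OF", 100), ("OF2", "OF", 90), ("OF3", "OF", 80),
           ("OF4", "OF", 70), ("C1", "C", 60), ("C2", "C", 50)]
pool = build_pool(hitters, pool_size=4,
                  minimums={"C": 1, "OF": 2}, hierarchy=["C", "OF"])
levels = replacement_levels(pool)
print(sorted(p[0] for p in pool), levels)
```

In the example, the best catcher is pulled into the pool at the expense of the fourth outfielder, and the replacement levels become 60 for catcher and 80 for outfield.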
To reiterate, this process isn’t perfect, but it’s better than using unadjusted points. Because of the unique Cutline points system, the adjustment isn’t all that steep. However, to those playing in points leagues other than the NFBC Cutline, omitting the adjustment is the biggest mistake made. The projected points for hitters and pitchers will be computed and it is wrongfully concluded that one is way more valuable than the other based on raw points.
GO BIG OR GO HOME
Before we go on to discuss some specific strategies, it’s necessary to set the proper mindset. Sure, there’s a league prize as the top-scorer in each ten-team league will pocket a nifty $250. Hopefully it’s obvious that the NFBC Cutline is a contest where you’re playing to win the whole kit and caboodle and not simply best nine others to double your money. As such, you’re going to need to take some chances along with being clever about roster construction to take best advantage of the best ball aspect.
Taking chances means jumping players with higher ceilings up the rankings. This doesn’t mean players with high stable floors but limited ceilings should be ignored. It just means you’ll need to increase your risk profile to defeat the thousands of teams trying to win the Cutline Championship.
There are three subsets of players that generally carry the most risk:
Can Carlos Correa and Francisco Lindor sustain last season’s power spike? Can Corey Seager maintain such a high BABIP? Can Miguel Sano continue to be productive despite so many strikeouts? Will the league adjust to Noah Syndergaard? Can Raisel Iglesias take the next step? These are all legit concerns that may worry conservative drafters. Sorry, but caution will not take down the Cutline.
Neither Byung-ho Park nor Hyun-soo Kim has swung at a Major League pitch in anger yet. Kenta Maeda hasn’t thrown a pitch that counts in the bigs. Sure, we’d like to see if they pass the eye test in the spring but we don’t have that luxury. Risk averse players prefer to dance with the devil they know. Winning the Cutline requires venturing into the unknown.
You can’t mess up your first round pick, right? Those that agree won’t be starting their team with Bryce Harper or Giancarlo Stanton. What about Yu Darvish? Or Carlos Gonzalez? Something to keep in mind is with best ball scoring, someone will always be there to backfill an injured player. Even though you risk carrying an empty roster spot, you won’t be getting a zero – you just have one fewer option to fill your best lineup.
Or you can combine all three and draft Rusney Castillo.
Please don’t misinterpret the above and throw a dart at every pick. All I’m saying is you need to pick and choose instances to let your hair down and go outside your comfort zone.
TAKING ADVANTAGE OF BEST-BALL SCORING
Consistency is a concept not all that relevant to standard rotisserie formats. You don’t care about the pathway; all you care about is a player’s season ending stats. However, the best-ball aspect of the Cutline affords several means to take advantage of the scoring to optimize your weekly scores.
Every player has a baseline expectation but there are factors that can raise or lower that expectation over the course of a scoring period – at least on paper. The idea is there will be some weeks a handful of players exceed their baseline and will be included in your total while others they fall below, to be replaced by some other players in a positive situation that period. Let’s take a look at some of these scenarios.
To be clear, the following should be applied to fringe players. The points of the better players will almost always end up counting in the best-ball accounting. However, there will be back-end players that have better and worse weeks that will find their way into your optimal lineup. It’s with these players you may want to look at a combination of what follows to maximize your week-to-week totals.
Home versus Away
Here’s some vitals to demonstrate the superiority a team has playing at home.
Clearly, a player produces more at home. Note the homers are close home and away. This is due to the home team not hitting in the ninth when they have a lead. However, assuming the majority of your hitters hit in the upper half of the order, this isn’t an issue since they’ll usually get the extra plate appearance.
On the surface, this may seem like more of an in-season management ploy and thus not applicable for the Cutline but when you’re looking to win a tournament of this nature, you need help at the fringes.
Again, you’re not going to fade Paul Goldschmidt because of his schedule but you may look at the Diamondbacks schedule when considering Jake Lamb. To that end, here’s a review of the weekly schedules for each team (click HERE to download the spreadsheet). The heading represents the number of weeks each team has that number of home games. On the left is the first 14 weeks (before the initial cut) while the second is the nine-week playoff period.
|7 or 6||5 or 4||3 or 2||1 or 0||7 or 6||5 or 4||3 or 2||1 or 0|
We’ll save the detailed analysis for a stand-alone piece.
The key with park factors is there are several venues that are counter-intuitive. Some examples are
Applying park factors to your rankings is also tricky as you need to do it in concert with the scoring system and type of player. The Cutline scoring system really favors homers so power hitters are really helped by parks that elevate power. Players that get hits, score runs and drive in teammates but aren’t sluggers aren’t hurt by Kauffman Stadium. With regards to pitching, as noted, Yankee Stadium isn’t horrible at all, unless you’re a fly ball pitcher.
Thinking about pitchers, it seems obvious that starters scheduled to go twice have a great chance of making your optimal lineup. Considering it’s a given to choose fringe pitchers with venues that hurt run scoring, going one step further and looking for parks with the maximum number of 6 and 7 home games in a week increases the chances of two starts at home – which will really pump up that pitcher for that week. There’s no guarantee your pitcher’s two-start weeks coincide with two home games – all we’re doing is looking to improve the chances.
This is another topic that’s worthy of further discussion, especially since the analysis goes hand-in-hand with home vs. away. Look for it soon.
Here’s a look at platoon splits from the past three seasons.
|vs RHP as LHB||0.747||0.324||104|
|vs LHP as RHB||0.739||0.320||101|
|vs RHP as RHB||0.701||0.305||90|
|vs LHP as LHB||0.668||0.295||84|
|vs RHP as LHB||0.713||0.315||100|
|vs LHP as RHB||0.731||0.322||105|
|vs RHP as RHB||0.684||0.304||90|
|vs LHP as LHB||0.647||0.290||83|
|vs RHP as LHB||0.741||0.325||104|
|vs LHP as RHB||0.738||0.323||103|
|vs RHP as RHB||0.691||0.303||91|
|vs LHP as LHB||0.645||0.287||78|
As expected, the largest spread is for left-handed hitters. Keeping in mind the idea is to embrace variance, using left-handed batters to fill out the back-end of your roster and reserves will lead to some weeks a team is facing a preponderance of righties, in theory increasing the lefty swinger’s performance over his baseline.
You’re going to have to trust me on this but players with lower contact rates are generally more inconsistent than players that don’t fan as much. Combine this with a power hitter with a low contact rate and you have a highly variable player that will score very well the weeks he goes deep a couple of times while scuffling those periods the punch outs are prevalent. That’s fine – you’re approaching this with the assumption that with ample fungible players, someone will be in a good spot to cover your slumping slugger.
Stolen base specialists
One of the tricks of DFS is to identify pitchers and catchers (hopefully forming a battery) that are weak at controlling the running game. Sometimes you can find a team whose philosophy is to focus more on the hitter than the runner, thus allowing an above-average number of stolen bases. The repercussion is this puts base stealers in the inconsistent category when it comes to points-based scoring. Speed merchants will have weeks with multiple steals when they face the likes of Carlos Ruiz or Kurt Suzuki for a series. In roto, we often avoid these one-category contributors but in the Cutline, they’re perfect examples of players to target later – perhaps avoid early.
Multiple position eligibility and utility
According to the Mastersball projections, there are seven hitters that qualify at DH/UT only with just Miguel Sano and perhaps Evan Gattis expected to gain different eligibility. This means at least half of a Cutline league will have a player that can ONLY score points at DH/UT. This seriously hinders your ability to take advantage of a great week by a lesser player. Sure, he’ll bump someone, but the difference in points you gain isn’t as significant as compared to the edge if he filled one of your fungible spots. Of course, one way to counter the likes of David Ortiz or Prince Fielder blocking your utility is having a bunch of back-end players with multiple-eligibility. That said, having the utility spot as one of your intended spots for inconsistent players discussed above will really help take advantage of the variance and give you a handful of extra points each week.
Yeah, I’m crazy thinking it’s going to take this sort of out-of-the-box thinking to win the Cutline – but I’m also right. Remember, this doesn’t apply to your early or even middle round picks. Let’s designate corner, middle, utility and two outfield spots for hitters (total of five) and four pitching spots you expect different players to occupy in your optimal lineup. That’s nine so beginning in Round 15 and through your reserves you focus on these highly variable players. Initially, that’s 22 candidates to fill nine spots. After one week that grows to 27 and eventually 32 players. Sure, there will be injuries but chances are you’ll always have more than twice as many options to take those nine spots. With those odds, there’s a good chance most, if not all, are filled by players enjoying a week over their baseline expectations. That’s how you win a contest of this nature.
As alluded to throughout this discussion, there are topics that require further treatment. Another is how to construct a best-ball pitching staff. Look for those discussions soon.
This piece was originally posted in 2010 for Platinum subscribers and has been archived there since. It's the first essay associated with Valuation Theory being brought out from behind the firewall. Please feel free to ask questions in the comments, or preferably on the newly upgraded site forum.
Much of my preseason presentation will involve discussing a player’s value. The signature element of the system is the way we set the pool size by assigning value only to useful statistics. A useful statistic is designated as production over and above that which can be had for free. That which can be had for free is specified by the replacement level player. Since this concept is so integral to our system, it is worthwhile to spend a little time focusing on replacement player theory and understanding its application and ramifications.
When I first embarked on the journey to fully comprehend the science of player valuation, the most difficult hurdle was grasping why some valuation processes gave negative value to players with positive counting stats. A player hits a home run, drives in a run or steals a base and he helps your fantasy team. How could that be worth a negative dollar amount? The answer lies in replacement player theory.
Let us begin by pretending the available player pool is just sufficient enough to provide each team with ample players at every position. Not all the statistics in that pool are useful. A certain amount of each statistic is shared by every team in the league. These shared statistics are not useful, that is they do not help you to achieve rotisserie points. Think about a typical weekly football pool where you pick the winners of all the games. If everyone picks the same team, that game does not matter. All the participants get a win or a loss.
Because a standard rotisserie roster requires players to occupy a roster spot defined by their positional eligibility, it is necessary to compare useful statistics position by position. If the useful statistics contributed by one position are different than another, the value earned by these two players is different. They may both contribute the same number of raw statistics, but they provide a dissimilar amount of useful statistics. Remember, value is only awarded to useful statistics.
A simple way to illustrate this principle is to envision a home run derby league with four available players, two catchers and two outfielders. We both need one of each. One catcher hit 30 homers, the other 10. One outfielder hit 30 homers, the other 20. You have first pick, who do you choose?
Of course, you opted for the 30-homer catcher. Of his 30 dingers, 20 would be useful as I will be stuck with the 10 from the other catcher. I’d then take the better outfielder, but he nets me a paltry 10 home run advantage over the lesser one. So, we each drafted a player that knocked 30 out of the yard, but you win as your catcher’s production as compared to my catcher surpassed the advantage my outfielder gained over yours.
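For readers who like to see the arithmetic, here is a minimal Python sketch of the home run derby example. The player names and the `useful_homers` helper are made up for illustration; the idea is simply that each player's useful homers are measured against the worst rostered player at his position.

```python
# Hypothetical four-player pool from the example: (name, position, homers).
pool = [
    ("Catcher A", "C", 30),
    ("Catcher B", "C", 10),
    ("Outfielder A", "OF", 30),
    ("Outfielder B", "OF", 20),
]

def useful_homers(players):
    """Return homers above the worst rostered player at the same position."""
    baseline = {}
    for _, pos, hr in players:
        # Track the lowest homer total seen at each position.
        baseline[pos] = min(hr, baseline.get(pos, hr))
    return {name: hr - baseline[pos] for name, pos, hr in players}

useful = useful_homers(pool)

# Your team takes the better catcher, mine takes the better outfielder.
your_total = 30 + 20   # Catcher A + Outfielder B
my_total = 10 + 30     # Catcher B + Outfielder A

print(useful)
print(your_total, my_total)
```

Both 30-homer players look identical in raw terms, but the catcher carries 20 useful homers to the outfielder's 10, which is exactly the margin between the two teams.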
Now pretend the above example is for an auction league and you have $260 from which to bid. You throw out the first name and bid $259 for the better catcher. Not that it matters, but I’ll take the superior outfielder for $2 and the lesser catcher for $1, leaving you with the $1 outfielder.
This brings us to the concept of a replacement level player. In theory, the inferior catcher and outfielder in the above scenario have no value as they supply no useful home runs. The caveat is we are forced to spend $1, even though their value is $0, thus the introduction of the replacement level player. Instead of subtracting out the statistics shared by the positively valued player pool, we subtract away the statistics of an imaginary replacement player whose estimated performance is dictated by the best remaining non-drafted players. As suggested, each position has its corresponding replacement level player.
One of the most commonly debated topics in fantasy baseball is positional scarcity. There are a couple of different types of positional scarcity. One focuses on the perception that a player pool does not contain sufficient positively valued players at each position. The second concentrates upon the overall talent of a position or perhaps the large drop-off of talent after the top few players. The former is a facet of player valuation and can be broached using replacement player theory. The latter is really a strategy-oriented entity.
The term perception was carefully chosen to suggest some player pools lack enough draft-worthy players at the so-called scarce positions. By applying replacement player theory, this perception is more properly labeled a myth. One of the principal rules of our valuation system is that a player pool is composed in such a manner that there are exactly enough players at each position for every participant in the league to field a legal lineup. In short, there is no player scarcity—everyone has a player of positive value at each position in their lineup. This should make some obvious sense – a player has positive value if he can be rostered and does not if he cannot, regardless of what their raw stats are.
The best way to convince yourself this must be a condition of a viable valuation process is to think about how you would assign value if there were no excess players at all available. That is, every Major Leaguer had to be on an active fantasy roster. The worst players at each position would be valued at $1 and everyone else would be scaled upward. The possibility exists that $1 players at different positions would be of varying quality.
Now think about how the setup really is with extra players in the pool. The best non-drafted players at each position comprise the replacement player pool and can be valued at $0. As just illustrated, depending on the depth of the player pool, it is quite possible for there to be different levels of replacement players by position. Here’s the key. After you take away the production of the replacement level player at each position, you should be left with a similar number of useful stats at each position for the $1 player. That is, $1 players may have different raw stats by position, but they have the same amount of useful stats. The thing is we don’t just score the useful stats, we score all the stats. We just don’t assign value to all the stats.
Taking one more opportunity here, one could say that a player’s value is determined not by the raw value of their statistics but by the opportunity cost given up acquiring that player. That is, if instead of drafting player A, I instead waited until replacement to fill that slot, how much extra am I buying?
Putting this all together, it is now possible to understand why a home run from a catcher is worth more than a home run from an outfielder. For simplicity, let us again think of things in terms of a home run derby league. The replacement level for catchers is far inferior to that of outfielders; therefore, fewer homers are taken away from the raw total per catcher than per outfielder. It was just explained that the number of useful homers of a catcher and outfielder of the same value is the same. When the number of homers taken away due to the replacement level player is added back, it follows that outfielders and catchers of the same value hit a different number of home runs. Specifically, since the replacement level for catchers is less than that for outfielders, the raw total is also less. This means that a catcher earning the exact same amount as the outfielder needed to hit fewer homers to attain that value. If the dollar value is expressed as dollar per homer, a home run from a catcher is worth more than a home run from an outfielder which was the original premise we set out to prove.
Let’s use some real numbers. Pretend the number of useful homers a $20 ballplayer hits is 20. Let’s say the catcher replacement level is 2 and the outfielder level is 6. A $20 catcher hits 22 homers while a $20 outfielder hits 26. Dividing $20 by 22 yields each catcher homer being worth 91 cents. Dividing $20 by 26 means each outfielder’s homer is worth only 77 cents, 14 cents less than that from a catcher.
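The same numbers can be checked with a few lines of Python. This is only a sketch of the arithmetic in the paragraph above; the function name `dollars_per_homer` is an illustrative choice, not part of the valuation engine.

```python
def dollars_per_homer(value, useful_hr, replacement_hr):
    """Dollar value divided by raw homers (useful plus replacement level)."""
    raw_hr = useful_hr + replacement_hr
    return value / raw_hr

# A $20 player with 20 useful homers at each position.
catcher = dollars_per_homer(20, 20, 2)      # 20 / 22
outfielder = dollars_per_homer(20, 20, 6)   # 20 / 26

print(round(catcher, 2), round(outfielder, 2))
```

The lower the replacement level at a position, the fewer raw homers it takes to reach the same dollar value, so each homer is worth more.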
One could fairly say that a player’s value cannot be determined by their relation to replacement alone, that there are other factors implicit in the value determination. They would be correct. However, as a baseline for setting value, we need a process which values players against each other on a fixed level. Adjustments thereafter can and should be made to take other factors into account, but they are strategic in nature.
In summary, a proper valuation system will account for the myth of position scarcity and set the player pool to render ample players at every position. The repercussions of this are that players of the same value produce at varying levels according to position.
The National Fantasy Baseball Championship offers satellite auctions, run by Andy Saxton. They're all 15-team mixed, both Draft Championship style (draft and hold, buy a 23-man active roster, draft 27 reserves, no FAAB) and standard (in-season FAAB, buy 23-man active roster, draft 7-man reserve). They come at $125, $250 and $500 price points. Click HERE for the list of available auctions and to sign up.
Below is the average auction value for the leagues to date. This will be updated as Andy completes more leagues. Click HERE to download the data in spreadsheet form.
|McCullers Jr., Lance||P||11||10||12||10||6||7||9||6||8.9||154||152.5|
|Souza Jr., Steven||OF||5||8||5||9||5||8||2||8||6.3||201||185.6|
|Taylor, Michael A.||OF||4||4||6||11||4||3||2||3||4.6||223||224.8|
|Bradley Jr., Jackie||OF||6||4||4||2||3||6||5||4||4.3||231||283.0|
|Oh, Seung Hwan||P||3||0||0||3||0.8||342||499.4|
|Edwards Jr., Carl||P||2||0||0||4||0.8||344||345.1|
|Almora Jr., Albert||OF||1||0||0||0||0.1||421||375.1|
About a week ago, I detailed my process for generating hitting projections. Now it’s time to do the same for the pitchers.
Like with hitting, skills are expressed as a rate stat. Hitting used plate appearances, as does pitching, though I’ll express it on a per-inning basis.
Using strikeout and walks as the example, K% and BB% are better than K/9 and BB/9 to get the true skill level. For those unaware, K% and BB% use batters faced (essentially plate appearances) as the denominator. It’s subtle, but K% is a better indicator than K/9. A pitcher allowing more runners faces more hitters, availing more chances to punch them out. Think of it this way. Two pitchers carry an identical 8.1 K/9, whiffing 180 in 200 innings. One faced 800 batters, akin to about a 1.10 WHIP while the other faced 844 hitters, equating to about a 1.30 WHIP. The first posted a 22.5 K% while the second registered a 21.3 K% mark. This is like two batters each garnering 160 hits, but one needing 550 AB (.291 average) while the other required 580 AB (.276 average). Which is the better hitter? Of course, the former. Well, the difference between .291 and .276 is the same as 22.5% and 21.3%.
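Here is the two-pitcher comparison as a short Python sketch. The helper names are illustrative; the inputs are the ones from the example above.

```python
def k_per_9(strikeouts, innings):
    """Strikeouts per nine innings."""
    return 9 * strikeouts / innings

def k_pct(strikeouts, batters_faced):
    """Strikeouts as a percentage of batters faced."""
    return 100 * strikeouts / batters_faced

# Both pitchers: 180 strikeouts in 200 innings.
same_k9 = k_per_9(180, 200)
pitcher_one = k_pct(180, 800)   # fewer baserunners, fewer batters faced
pitcher_two = k_pct(180, 844)   # more baserunners, more batters faced

print(round(same_k9, 1), round(pitcher_one, 1), round(pitcher_two, 1))
```

Identical K/9, but the pitcher who needed fewer batters to get there has the better K%.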
The engine projects K% and BB%, but I also project batters faced per inning. K% and BB% can easily be converted to K/9 and BB/9. It’s easier for me to project innings when doing pitching playing time, so while technically I use K% and BB% in the projections, the final projection takes K/9 and BB/9 out to K and BB using innings.
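The conversion is simple enough to sketch in Python. The function names here are hypothetical, and the sample inputs (22.5 K%, 4.0 batters faced per inning, 200 innings) are just for illustration, not a projection for any real pitcher.

```python
def rate_per_9(pct, bf_per_inning):
    """Convert a per-batter percentage (K% or BB%) to a per-nine-innings rate."""
    return pct / 100 * bf_per_inning * 9

def counting_total(rate9, innings):
    """Turn a per-nine rate into a counting stat using projected innings."""
    return rate9 * innings / 9

k9 = rate_per_9(22.5, 4.0)            # K% and batters faced per inning
strikeouts = counting_total(k9, 200)  # projected innings

print(round(k9, 1), round(strikeouts))
```

The same two steps work for walks by passing BB% in place of K%.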
Pitching projections utilize the same three-year stat spread and weighted average as hitting. Similarly, MLEs are used to fill in the blanks for prospects. Finally, composite park factors and aging adjustments are incorporated in the same manner.
A common theme with pitching will be regression, even more so than with hitters. The sample size of the different events associated with throwing a baseball is small, even for a workhorse starter. Outcomes don’t always sync with skills. Thus, almost all the components require regression to best frame what’s likely to happen.
Please keep in mind I’m a bit of an obstinate stickler with respect to the term regression. It’s come to mean “play worse” in the fantasy lexicon. In my Utopia, regression would have the specific meaning of correcting for outcomes out of the pitcher’s control. Admittedly, with improvement in data collection and analysis, we’re learning more and more about the proverbial luck versus skill delineation, but we can only go by what we know at the current time. My default level of regression is 50 percent. That is, the projected number is the average of expected and actual. I’ll then massage as appropriate, but always with a reason.
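As a sketch, the default regression is just a weighted blend. The `regress` helper and its `weight` parameter are illustrative names; the 0.5 default reflects the 50 percent level described above, and the sample rates are made up.

```python
def regress(actual, expected, weight=0.5):
    """Blend actual and expected rates; weight is the share given to expected."""
    return weight * expected + (1 - weight) * actual

# Made-up example: a 22.5% actual K% against a 20.5% expected K%.
projected = regress(0.225, 0.205)

print(round(projected, 3))
```

Massaging a projection amounts to nudging `weight` away from 0.5 when there's a reason to trust the skill or the outcome more.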
With that as a backdrop, let’s go through the four basal skills intrinsic to a pitcher’s projection: home runs, strikeouts, walks and hits. From there we’ll move onto the standard roto categories then hit some of the stats used in points-leagues scoring.
While hitters set their own home run per fly ball baseline (HR/FB), pitchers cluster around the league average. As you know, this is on the rise:
As an aside, hit types are still determined subjectively. Soon, they should be designated via objective criteria. Until then, some numbers, like HR/FB, may differ between data sources. The key for analysis is using the same source for the research, or in this case, projections.
Based on the number of fly balls a pitcher allows, an expected number of homers can be determined. After some park neutralization, the actual and expected number of long balls are averaged. After being divided by batters faced, the park-neutral HR% is calculated. This will eventually be converted to HR/9 for the final projection.
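A rough Python sketch of those steps follows. It assumes the inputs have already been park-neutralized, since the neutralization step isn't spelled out here, and the function names and sample numbers (200 fly balls, a 12.5 percent league HR/FB, 800 batters faced) are purely illustrative.

```python
def neutral_hr_pct(fly_balls, actual_hr, batters_faced, league_hr_per_fb):
    """Average expected and actual homers, then express per batter faced.

    Assumes fly_balls and actual_hr are already park-neutral.
    """
    expected_hr = fly_balls * league_hr_per_fb
    blended_hr = (actual_hr + expected_hr) / 2
    return blended_hr / batters_faced

def hr_per_9(hr_pct, bf_per_inning):
    """Convert a per-batter homer rate to HR/9."""
    return hr_pct * bf_per_inning * 9

hr_pct = neutral_hr_pct(fly_balls=200, actual_hr=19,
                        batters_faced=800, league_hr_per_fb=0.125)
hr9 = hr_per_9(hr_pct, bf_per_inning=4.0)

print(round(hr_pct, 4), round(hr9, 2))
```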
Some elegant studies show strikeouts are proportional to Swinging Strike Rate (SwStr) with an influence of First Pitch Strike Rate (FpK). I have a formula using this data to generate an expected K%. Again, a park neutral K% is computed then regressed with actual K%.
The great research team at Baseball HQ demonstrated a similar relationship between the number of balls thrown and BB%. I have developed an expected BB%. You know the rest.
There’s a reason hits are discussed last. Similar to how I project hits for batters, I use batting average on balls in play (BABIP) for pitchers. As you can likely surmise, more specifically, expected BABIP. As discussed in the hitting essay, I have data breaking batted balls into multiple classifications: groundball, infield line drive, outfield line drive, fly ball, bunt and popup. All but bunt and popup are broken into soft, medium and hard hit. The league average for each subset is determined and they are employed to calculate an expected BABIP. After the usual park neutralization, a park-neutral BABIP is determined and plugged into this formula to derive hits.
Hits = (AB – HR – K + SF) x BABIP + HR
There’s a couple of components needed not discussed yet, namely hit by pitch (HBP) and sacrifice fly (SF). They’re just a three-year weighted average like what was done with hitting.
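The hits formula translates directly into code. This sketch uses made-up inputs (750 AB against, 20 HR, 180 K, 5 SF and a .300 BABIP), and the function name is just for illustration.

```python
def projected_hits(ab, hr, k, sf, babip):
    """Hits = (AB - HR - K + SF) x BABIP + HR."""
    balls_in_play = ab - hr - k + sf
    return balls_in_play * babip + hr

hits = projected_hits(ab=750, hr=20, k=180, sf=5, babip=0.300)

print(round(hits, 1))
```

Homers are added back at the end because they count as hits but are excluded from balls in play.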
So, now I have expected hits and actual hits, all that’s left is to regress, blah, blah, blah.
The neutral H/9 and BB/9 are treated with the aging and park adjustments to get projected hit and walk rates. This is a bit circular, but based on the projected innings, projected hits and walks are determined, which are then summed and divided by projected innings to generate projected WHIP.
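To show the circularity, here's a small Python sketch with hypothetical rates (8.1 H/9, 2.7 BB/9) and 200 projected innings. The names are illustrative only.

```python
def projected_whip(h9, bb9, innings):
    """Build hits and walks from per-nine rates, then divide by innings."""
    hits = h9 * innings / 9
    walks = bb9 * innings / 9
    return (hits + walks) / innings

whip = projected_whip(h9=8.1, bb9=2.7, innings=200)

# The innings cancel out, so WHIP is also just (H/9 + BB/9) / 9.
shortcut = (8.1 + 2.7) / 9

print(round(whip, 2), round(shortcut, 2))
```

The innings matter for the projected hit and walk totals, not for the WHIP itself.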
I use a modified expected ERA formula, using the aforementioned skills to derive expected runs. This is regressed to actual neutralized earned runs to land on projected earned runs which gets the aging and park alteration for the final numbers. Using projected innings, the projected ERA follows.
This isn’t perfect, but I’ve been using it for over a decade and it works as well as any other method I’ve seen. Many years ago, Bill James came up with a manner to estimate team wins using what’s now known as the Bill James Pythagorean Theorem. It incorporates runs allowed and runs scored to calculate an expected winning percentage. To get wins, the winning percentage is multiplied by the number of decisions.
Let’s start with runs allowed. Above, earned runs are projected. Using a team defense factor, I generate the number of runs. Next is estimating a bullpen component. The number of runs allowed while the pitcher in question is in the game is added to the bullpen projection. I now have total runs allowed.
Runs scored is simply an estimation, based on previous season’s numbers and how the team has improved or declined.
Decisions are proportional to the number of projected innings, using 9 x 162 in the denominator. It’s not perfect, but correlation studies show it’s reasonable.
Calculating wins for starting pitchers plugs all this into the standard Bill James formula. Relievers are tricky, since set up men and closers have a greater chance to lose games than win them, based on their usage. As such, I flag all relievers projected for holds and saves and apply a modified Bill James formula.
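For starters, the standard calculation can be sketched as below. This is an illustration, not the engine itself: the exponent of 2 is the classic Bill James form and is an assumption here, the runs and decisions inputs are made up, and the modified reliever formula isn't shown because it isn't specified.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

def projected_wins(runs_scored, runs_allowed, decisions, exponent=2.0):
    """Winning percentage multiplied by the pitcher's projected decisions."""
    return pythagorean_win_pct(runs_scored, runs_allowed, exponent) * decisions

# Made-up inputs: 750 runs scored, 650 allowed, 22 projected decisions.
wins = projected_wins(750, 650, decisions=22)

print(round(wins, 1))
```

Runs scored and runs allowed need to be on the same scale (both per game or both full season) for the ratio to make sense.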
Based on some research I’ll present in an upcoming Z Files, available for Platinum subscribers, there’s some science involved with projecting saves. The short version is percentage of wins that are saved correlate best with team ERA. Using this, I generate a projected team saves total. A percentage of saves projection is made for relievers, which is multiplied by projected team saves to yield the saves projection.
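The last two steps reduce to a pair of multiplications, sketched here in Python. The relationship between team ERA and the percentage of wins saved isn't given in this piece, so that percentage is passed in as an input; all names and sample numbers are hypothetical.

```python
def projected_saves(team_wins, pct_of_wins_saved, reliever_share):
    """Team saves from wins, then the reliever's share of that total."""
    team_saves = team_wins * pct_of_wins_saved
    return reliever_share * team_saves

# Made-up inputs: 90 team wins, half of them saved, closer gets 80 percent.
saves = projected_saves(team_wins=90, pct_of_wins_saved=0.5, reliever_share=0.8)

print(round(saves))
```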
Currently, I haven’t found any relationship between team wins or saves and holds. It looks to be a matter of how each manager deploys his bullpen. Some use more lefty specialists, some rely on their best setup reliever to work more than an inning. As such, holds are projected manually, on a player-by-player basis.
There’s a couple of formulas available on the web to derive quality starts (QS). Each year, I look to see which did the best job of back-projecting the previous season’s number of QS and I’ll use that. I’ll be interested to see how well these hold up with the current trend of pitchers throwing fewer innings. There are coefficients in each that could need tweaking with the changing landscape. With many leagues incorporating QS into their scoring, I want to make sure I provide a usable number.
Complete Games, Shutouts, No-Hitters
Yes, some leagues give points for no hitters. No, I don’t project anyone to toss a no-no. I will project CG and SO using historical data, but it’s more a guess than scientific.
Singles, Doubles and Triples
Some points leagues score this so I need to project it. Homers are done as discussed while singles are hits minus extra base hits. That leaves doubles and triples. These are park-neutralized then adjusted via BABIP before the usual aging and park changes to yield the final projection.
All that’s left are innings. For starters, I use the past three seasons to derive an innings pitched per start number. It’s not always the three-year weighted, but that’s the starting point. I’ll tweak as I do each pitcher’s projected games started. Relievers are done on a pitcher-by-pitcher basis, based on past and expected usage.
As with hitter’s plate appearances, I try to keep each team reasonable, but I no doubt over-project some staffs. Most of the time, there’s a sixth and seventh starter pushing the total team starts over 162. They’re very likely to pick up starts as an injury replacement; I just don’t know who will get hurt. All I can do is give an honest appraisal.
That should do it for the Mastersball projection process. It’s a fluid process, constantly undergoing changes as more data is available to refine regressions. I’ve been asked on several occasions over the years how well it stacks up against other models, as well as wondering if I back-test against the previous season. The answers are I don’t know, and no. This usually disappoints the person asking, but it’s the truth. The primary reason is I have yet to see a grading system that adequately scores the components of projections. Some use rate stats, but that ignores the diligence of playing time estimations. Some use raw numbers, which are also influenced by playing time as well as luck. I suppose the obvious follow-up is why don’t I devise a system that scores playing time and basal skills. It’s a fair question. The answer is I know intuitively if there’s a deficiency in a specific area; I don’t need to quantify it. Early on, I could sense where the projections were faulty. Over the years, I’ve refined the process to the point my time is best spent boning up on the new research and incorporating the results into the engine, usually to further refine regressions.
Next up is bringing the valuation methodology out from behind the firewall.
Questions? Concerns? Criticisms? Hit me up in the comments, or preferably on the newly renovated site forum.