How to Conquer Daily Fantasy Football Competitions using Optimization— Part One

Methods for determining an optimal lineup for Fanduel, DraftKings, or Yahoo DFS

Riley Howsden
11 min readSep 28, 2021
Photo by Adrian Curiel on Unsplash

Football season is back, which means Daily Fantasy Football is back as well. This year, you’re eager to show your spouse, friends, and whoever will listen to you for more than five seconds that you know how to pick ’em. But do you? Do you really know what you’re doing? Your gut indeed convinces you every week that you’ve chosen the “optimal” lineup, but time and time again, you’re dismayed by the results. What does that “optimal” word even mean? Let’s find out.

Of course, you’re probably eager to jump in the deep end and optimize an entire fantasy football lineup so you can make mad cash this upcoming weekend but hold onto your horses. We first need to prove to ourselves that we can solve a more straightforward problem. Instead of looking to the future, we should first look to the past, where we already know each player’s outcome, and see if we can find the optimal lineup. But what is the value of that? Nobody has gotten rich by optimizing the past! The hard truth is if you can’t find the most favorable lineup from a set of known values, you sure as hell can’t do it with the additional uncertainty baked in.

Simplifying the Problem

Okay, so you’re convinced that we should figure out how to optimize on the past first. We know the salary cap and are ready to pick the entire lineup. Whoa again, the whole lineup? That still seems a tad ambitious. Is there a less complicated problem that we could solve first? Perhaps a situation where, given a salary, we choose the best quarterback? Hmm, maybe a bit too easy. After all, we could filter out any quarterback outside of our salary range and select the player with the max number of points scored from the remaining. Voila, we’re magicians! Let’s step it up a bit; what if, given a salary, we had to choose the best three wide receivers we could afford? Take a few minutes to ponder how you might do that.

To make this more practical, let’s focus on a specific week of player scores from the NFL, week one of the 2021 season. You can find the salaries and fantasy points from FanDuel HERE. Of course, all methods presented below will work regardless of the week or daily fantasy provider, so feel free to try this out in any scenario. For now, we will stick with the dataset mentioned above so you can follow along. Once you have the data and you have had some time to create a solid strategy for choosing your golden trio of wide receivers, let’s discuss the possibilities.

Manual Selection

The first thought, and quite frankly, the one we’re trying to distance ourselves from; we could take a “hands-on” approach to determine a great lineup. We could start with players we like, mix and match them around a few times, and see which combination scores the most points while staying under the salary threshold. To speed up the process, though, we could rank players by a simple heuristic: “Salary per Point.” We are most likely to choose players with low values in this statistic as they are the most efficient. For our current dataset, the top players are:

For example, it costs about $204 for Amari Cooper to score a point and about $284 for Adam Thielen to score a point; one is a better investment than the other. Of course, we could choose the top three from this list, but we need to deal with the salary cap. There are three possible outcomes in selecting the top three players by this heuristic:

  • Their cumulative salary equals the cap; congratulations, you’ve found the optimal solution for this salary.
  • Their cumulative salary is less than the cap; there may be more expensive players who are less “efficient” but still score more points — worthwhile to try and swap for these players if you can stay under the salary threshold.
  • Their cumulative salary is more than the cap; uh-oh, this violates our constraints — we need to find less expensive players to swap for that are, unfortunately, less efficient as well.

While this gives us a better guardrail for selecting players, it doesn’t scale well. We might end up lucky for a given salary, but how would we adapt if we enforced a different salary cap? We’re back at square one — even with the heuristic guiding us, it is still manual work. On top of that, it feels like we would be more comfortable stating, “I think this is the best option” instead of “I know this is the best option.” Let’s change that!

Brute Force Selection

In the previous section, our strategy has been an approximation. While it is still possible to end up with the most optimal lineup, there is no guarantee. What is the simplest solution we can think of that has that guarantee? Well, we could iterate through each possible combination, determine if they meet our salary threshold, and store the lineup if it is better than all previous lineups. The code would look something like this:

def brute_force_selection(salary_cap, wr_list):
max_points = 0
best_players = []
size = len(wr_list)

for i in range(size):
for j in range(i + 1, size):
for k in range(j + 1, size):
salary = (
wr_list[i]['FD salary'] +
wr_list[j]['FD salary'] +
wr_list[k]['FD salary']
)
points = (
wr_list[i]['FD points'] +
wr_list[j]['FD points'] +
wr_list[k]['FD points']
)
if salary <= salary_cap and points > max_points:
max_points = points
best_players = [
wr_list[i]['Name'],
wr_list[j]['Name'],
wr_list[k]['Name']
]
return best_players, max_points

There isn’t much exciting stuff to point out here as this is mostly the bread and butter of a brute force method; the only exception is the “i + 1” and “j + 1” search ranges. If we are only interested in the combinations of players instead of the permutations, we don’t need to return to players “before” the current selection. Still, since there are 160 receivers and we have to limit ourselves to three, the combination count nears 700k. Is that bad? While still possible, it doesn’t seem like this method will scale as we approach more extensive lineups.

??Random Selection??

In scenarios with a large number of combinations, one tactic I have seen someone approach firsthand is to simulate lineups randomly. In the face of brute force computation, one might assume this to be a reasonable approximation. The task is carried out by considering a subset of combinations at random, say 1000, and tracking the maximum value while iterating through them. At the end of this process, we assume that the best from the simulations is “close enough” to optimal. In this way, we can avoid searching the entire space. With some minor modifications from our brute force code above:

def random_selection(salary_cap, wr_list):
max_points = 0
best_players = []
size = len(wr_list)

for i in range(size):
for j in range(i + 1, size):
for k in range(j + 1, size):
salary = (
wr_list[i]['FD salary'] +
wr_list[j]['FD salary'] +
wr_list[k]['FD salary']
)
points = (
wr_list[i]['FD points'] +
wr_list[j]['FD points'] +
wr_list[k]['FD points']
)
if salary <= salary_cap and points > max_points:
max_points = points
best_players = [
wr_list[i]['Name'],
wr_list[j]['Name'],
wr_list[k]['Name']
]
return best_players, max_points

However, a question still looms: just exactly how good of an approximation is this? For our current problem, I decided to run through a bunch of simulations on top of this sampling method to see how well it performs:

That isn’t very reassuring; even after 1000 simulations, we’re only getting about 3/4th of the optimal value. In truth, how well this approach works will be heavily dependent on the underlying data. For example, if there were minor variance across player scores, this would not perform so poorly. However, the general intuition should be that random sampling is not an excellent tool for optimization problems.

Pruning the Search Space

There has got to be a better way to search this space without having to sample. Like what we did with the heuristic during our manual selection, we can prune some of the receivers from our list on intuition alone, making our brute force method more tractable. How would we do that? Here is the critical thought: if at least three players have higher scores than a given player and have salaries equal to or less than that player, remove that player as an option. We can guarantee that this player will not be part of the optimal set, and discarding them limits our search space. For example, there are 33 wide receivers in our list with a salary of 4500; we only need to keep the top three point-wise. Our code for pruning our receiver list would look something like this:

import heapqdef prune_players(player_list):
player_list = sorted(player_list,
key=lambda x: (x['FD salary'],
-x['FD points']))

player_list_short = []
top_three = [0, 0, 0]
heapq.heapify(top_three)

for i in range(len(player_list)):
low_value = heapq.heappushpop(top_three,
player_list[i]['FD points'])
if player_list[i]['FD points'] != low_value:
player_list_short.append(player_list[i])

return player_list_short

The process above reduced the possible receivers from 160 to 15; now, brute force doesn’t seem daunting. If we were to rerun the previous function on the pruned player list, it would be about 1000x faster in finding the optimal solution.

Scaling Issues

Imagine returning to the real-world problem for just a second, optimizing our fantasy lineup as a whole. Does any of our strategies above scale well once the situation becomes more complicated? After all, we need to choose a full roster:

  • One quarterback (QB)
  • Two running backs (RB)
  • Three wide receivers (WR)
  • A tight end (TE)
  • A flex position — RB, WR, or TE (FLEX)
  • A defense (DEF)

We could carry out the pruning and selection process at the position level, returning a list of the best points for each salary and position. Once we have the salary and point pairs for each position, we could treat them like players as in our previous step. Now we can use the brute force method to determine the amount of salary to allocate to each position, which, in turn, chooses the players for that position.

We would be successful in picking the best lineup under one assumption: none of the positions have a dependence on each other in the selection process. See that flex position, where we can select an RB, WR, or TE? That minor complication thwarts our grand scheme. Unfortunately, many daily fantasy competitions have these random nuances, such as MVP and PRO players who get a 2.0x or 1.5x multiplier or extended flex positions such as SuperFlex (includes QB in flex) or AnyFlex (anyone), so we will need a magical framework to deal with them. Luckily, as long as we can adequately express a problem’s constraints, that magical method exists. Let’s try it!

“Magical” Selection

Up until now, we’ve represented our players as two vectors: a points vector and a salary vector. However, we need an additional vector, a selection vector, that indicates whether we have decided to select the player or not. All three vectors for our data are below:

You may have noticed that the selection vector (x) takes on a binary integer value, either 0, we selected that player, or 1, we did not. From these vectors, we can generate two constraints that must be satisfied:

  1. The sum of the selection vector equals 3; this represents that we must select exactly three wide receivers.
  2. The sum of the product of the selection and the salary vector is less than the salary cap; we can not select three players if they break the salary cap.

Alongside the constraints, we also need an objective function:

  • Maximize the sum of the product of the selection and the point vector; we want the highest amount of points we can get!

It may already be apparent to some that this can be solved using integer programming (IP). Many open-source tools, such as OR-Tools, Pyomo, or PuLP, can aid us in formulating a general structure that we can send to various mathematical solvers. For the rest of the article, we’ll be using the Python implementation of OR-Tools from Google. Note that OR-Tools is also available in C++, Java, and C#.

Setting Up the IP Solver

Jumping into the thick of it, we would initialize the solver as follows:

from ortools.linear_solver import pywraplpsolver = pywraplp.Solver.CreateSolver('SCIP')salaries = [player['FD salary'] for player in player_list]
players = [player['Name'] for player in player_list]
points = [player['FD points'] for player in player_list]
size = len(players)

The most curious piece in the setup is the selection of ‘SCIP’ in CreateSolver — this specifies the solver that the system will use after we’ve identified all the constraints. While there are many open-source and proprietary solvers, such as Gurobi or CPLEX, we’ll be using SCIP for this example, which you can learn more about here. The good news is that all solvers accommodate the same input structure in OR-Tools, so we can easily swap out and test other solvers if we have access to them.

Add Constraints

Our first task is to express the constraints mentioned above in terms that the solver can ingest. To set up the binary problem, we can start with:

x = {}
for i in range(size):
x[i] = solver.IntVar(0, 1, players[i])

The code here sets up an integer variable with a lower bound of 0, an upper bound of 1, and the player’s name. Of course, we loop through all players, and “x” effectively represents the selection matrix we mentioned earlier. Now we need to constrain the number of players that can be in the “on” position:

solver.Add(
solver.Sum([x[i] for i in range(size)]) == 3
)

We also want to ensure that we can pay the players we’ve selected, so we set the salary constraint:

solver.Add(
solver.Sum(x[i]*salaries[i] for i in range(size)) <= salary_cap
)

Add Objective

Now that we’ve set the constraints, we need to specify what we want the solver to optimize. In our case, we want it to maximize the overall points for the selected players:

solver.Maximize(
solver.Sum([x[i]*points[i] for i in range(size)])
)

Solve

Now that we’ve parameterized the constraints fully, all there is left to do is let the solver do its thing and output the results!

status = solver.Solve()

best_players = []
max_points = 0
if status in (pywraplp.Solver.OPTIMAL, pywraplp.Solver.FEASIBLE):
for i in range(size):
if x[i].solution_value() > 0.5:
best_players.append(x[i].name())
max_points += points[i]

As with our other selection methods, given the salary cap and the list of players, this will output the three best players and the points they scored, which in this case, we know is the optimal solution.

Complete Code for the Three Wide Receiver Lineup

from ortools.linear_solver import pywraplpsolver = pywraplp.Solver.CreateSolver('SCIP')salaries = [player['FD salary'] for player in player_list]
players = [player['Name'] for player in player_list]
points = [player['FD points'] for player in player_list]
size = len(players)
x = {}
for i in range(size):
x[i] = solver.IntVar(0, 1, players[i])
solver.Add(
solver.Sum([x[i] for i in range(size)]) == 3
)
solver.Add(
solver.Sum(x[i]*salaries[i] for i in range(size)) <= salary_cap
)
solver.Maximize(
solver.Sum([x[i]*points[i] for i in range(size)])
)
status = solver.Solve()

best_players = []
max_points = 0
if status in (pywraplp.Solver.OPTIMAL, pywraplp.Solver.FEASIBLE):
for i in range(size):
if x[i].solution_value() > 0.5:
best_players.append(x[i].name())
max_points += points[i]

Summary

After all this work, it may seem discouraging that we have only solved a subset of the total optimization problem. However, we have built a solid foundation that we can use to impose additional constraints. In an upcoming article (now live HERE), I’ll be extending this framework to dig into the details of optimizing an entire lineup, so stay tuned!

All images by the author unless noted otherwise.

--

--

Riley Howsden
Riley Howsden

Written by Riley Howsden

“Half-Stack” machine learning propagandist in the gaming industry — a critic of all data presented deceitfully, unless it contains a meme, then it must be true.

No responses yet