No Genie App? No Problem! How to Maximize Your Day at Disney
Constructing an optimal route through Disney using graphs and combinatorial optimization
If you have visited any Disney park in your life, there is a decent chance you have encountered the following situation: one of your friends or family members crafting the “perfect” plan for the day. Either they spent way too much time beforehand carving out a route to the top attractions, or they are frantically trying to assess what is best immediately after grabbing a park map. As excessive as this may seem, the planner’s goal is to ensure that the group gets to the best attractions while minimizing the time spent in the park. The recently released Disney Genie app should have the ability to accomplish this, as theoretically, it could have all the information about the current state of the park and your visit, including:
- The distance between any two attractions
- The rides and experiences that you and your friends or family prefer
- The current wait time for every ride
Unfortunately, Disney Genie’s attempt to provide a personalized experience seems to be falling short of parkgoers’ expectations. Do we even need the Genie app, though? Could we optimize our day without it?
Building a Better Genie
While I have no inside details on how the Genie app works, I would like to explore how one might construct optimal itineraries without the app. Can we set up a structure that maximizes the value gained (rides visited) while minimizing the cost (time in the park, distance walked)? The neat thing is, if we can find a solution, we can use the same process for any theme park, shopping mall, or grocery store to help decide an optimal path. Let’s get started!
A Baseline Scenario
First, it is essential to point out that there are many moving parts when visiting a Disney theme park, and we won’t be able to address them all in this article alone, so we’re going to start with the most basic scenario and build from there. It will be unrealistic, but it will give us the core tools to develop more sophisticated solutions in the future. Here are the assumptions we will start with — no one else is in the park, and we will assume there is plenty of time to ride all of the attractions; all we need to do is walk between them. Our objective is to minimize walking distance. Where would we begin?
Representing the Disney Park as a Graph
One piece of information will be beneficial; a graphical representation of the park that maps out the distance between attractions and other points of interest. The most accurate way to do this would be to take a measuring wheel to the park itself and manually record each distance. Since that would take a substantial amount of time, we will use a less involved but still tedious solution — Google Maps. We can represent each attraction by a node on our map and edges as walking routes within the park. We can include additional nodes that designate commonly traveled intersections as well. Here is an example for Disneyland:
Some Disney enthusiasts may point out that this graph isn’t entirely accurate. We ignored some routes that ran parallel to others and pathways that did not appear to be a part of an optimal route between any two attractions. As for the attractions themselves, ones with variable ride lengths have been left out entirely. For other attractions, if they happened to be close enough to an intersection, we approximate their location by placing it at that junction. We can see all of these cases by observing New Orleans Square more closely:
Notice that many paths run parallel to the “Rivers of America” waterfront, but we have only included one. Additionally, some routes enter the shopping and food section to the south, but it shouldn’t be hard to see that these aren’t optimal routes to any attraction. Also, we do not include Tom Sawyer’s island as an attraction for reasons that will become more apparent later on. Finally, the Pirates node is further north than the ride entrance. We make these approximations in the interest of time, but if someone wanted to create a more accurate map, all that follows would still be applicable!
Extracting Distances from Google Maps KML Output
We can click on any edge within our map and see the measurement between the two points. However, we need a way to extract all of those distances. Fortunately, Google provides a way to export a KML file, an XML notation commonly used for geographical data. A slice of our map’s KML file looks like this:
While these files can contain a lot of metadata, we are solely interested in the coordinates along our paths as we can use them to calculate the total distance. We want to structure our data in a tabular format, with each edge represented by (location_A, location_B, distance_between). The following Python code extracts the relevant information from the KML file and creates a list of pair distances.
Computing Shortest Paths Between Attractions
The distance between each set of bordering points is a crucial step, but we need to determine the shortest path between each set of attractions. We could handle this in a few ways, such as uploading our nodes and edges to a graph database: Neo4j, ArangoDB, or Dgraph are some options. Each consists of the proper algorithms to determine the shortest path between two points. Since our data is relatively small, we will use the NetworkX Python package instead to make these calculations. Following the KML code snippet, we can get the shortest distances between attractions as follows:
The vital piece here is the shortest_path_length function, which defaults to Dijkstra’s algorithm for determining shortest paths. We want to filter out any node that isn’t an attraction, and since we used a numeric labeling structure to identify these, we check the source and target strings for any digit. The output of this step is a symmetric distance matrix which is the core component for this optimization.
Setting Up The Traveling Genie Optimization
Now we can start on the optimization itself. We hope to minimize the total distance one needs to walk to ride all the attractions as mentioned above. Curiously enough, this matches a standard combinatorial optimization problem within computer science — the traveling salesman problem (TSP). The main idea for TSP is, given a list of cities and the distances between each set of cities, what is the shortest possible route that visits each city exactly once and returns to the starting city? Our scenario differs slightly from the generic TSP problem; if the fastest way between two attractions passes through a node we have already visited, we’re allowed to pass through again without stopping. Formally, this is known as “Graphical TSP.” We can leverage many python libraries to formulate this optimization problem; Pyomo, PuLP, and OR-Tools are the most common. We will move forward with OR-Tools, an open-source project developed by Google.
The first step is mostly book-keeping — transforming our matrix into a list of distances and attraction names that OR-Tools can understand. For the time being, we will identify the starting node as “Entrance.” Next, we implement an index manager, which attaches the solver’s internal indices to the number of attractions. The inputs are the attraction count, the number of “vehicles” that will stay at one to denote a single group of people, and the starting location. We also initialize the routing model with the index manager.
from ortools.constraint_solver import pywrapcpdf = shortest_distances()
distance_matrix = df.values.tolist()
attraction_names = df.columns.tolist()
starting_node = attraction_names.index('Entrance')index_manager = (
pywrapcp.RoutingIndexManager(
len(distance_matrix),
1,
starting_node)
)routing_model = pywrapcp.RoutingModel(index_manager)
We need to give the routing model the ability to determine the distance between two points, creating a distance callback that we register with the model. We also need to set the arc cost evaluator, which calculates the cost of travel between any two locations. For our example, that cost is the distance callback itself, but this will change as we move into more sophisticated scenarios.
def distance_callback(start_index, end_index):
start_node = index_manager.IndexToNode(start_index)
end_node = index_manager.IndexToNode(end_index)
return distance_matrix[start_node][end_node]distance_callback_index= (
routing_model
.RegisterTransitCallback(distance_callback)
)routing_model.SetArcCostEvaluatorOfAllVehicles(distance_callback)
Finally, we can set a few different search parameters, such as time limits, local search options, and the strategy for finding the first solution. Note that the first solution will probably not be optimal, but it helps set up our solver for success in future iterations. Once we have set these search parameters, we prompt the routing model to find a solution.
from ortools.constraint_solver import routing_enums_pb2search_param = pywrapcp.DefaultRoutingSearchParameters()
search_param.first_solution_strategy = (
routing_enums_pb2.FirstSolutionStrategy.PATH_CHEAPEST_ARC)
search_param.local_search_metaheuristic = (
routing_enums_pb2.LocalSearchMetaheuristic.GUIDED_LOCAL_SEARCH)
search_param.time_limit.FromSeconds(5)solution = routing_model.SolveWithParameters(search_param)
And, that’s it! Seriously! We have set up the model and constraints for the graphical TSP in a matter of minutes. Of course, we will also need to write code to print out the solution path, which I have included in the complete program at the end of the article. But first, we are eager to know what the “best” route is for our visit to the park. Here is what the program itself outputs:
The details here suggest that the quickest route through the park while attending each attraction is a total of 11,592 feet. The general solution should not be too shocking as it effectively wants you to circle the outskirts of the park, riding each attraction as you encounter it. Highlighting the edges traveled on Google Maps, we can see the route more clearly:
Additional Modifications
The solution so far is rather underwhelming; after all, this should be relatively close to the heuristic that most Disney planners will come up with, given they intend to ride everything. We need to start addressing reality to make the scenario more interesting. Some questions we might ask:
- “What if we are only interested in a subset of the attractions?”
- “How long will we need to be in the park to ride everything?”
- “What if we limit the time that we can be in the park?”
- “What if we prefer some rides over others?”
Determining Total Time in Park
We have focused on minimizing the distance traveled by foot, but it will be easier and more insightful to work with time. To convert to time, we will need two pieces of information. The first is somewhat dependent on the traveler, walking speed, and the second is a list of ride lengths. Once we have the walking speed, we can determine the time of each path using basic algebra. As for rides, we only need to append the ride length of the target node to our overall time for that path. For example, if we are traveling from Splash Mountain to the Haunted Mansion, the total time would be walking time plus the duration of the Haunted Mansion attraction. Our distance matrix is no longer symmetric with this modification, but this is not an issue!
Adding a Time Constraint
Now that we have switched our objective to be time-based, we can add a constraint on the total hours allowed in the park. We handle the extension within the optimization code itself in two steps. The first is we need to tell it to the routing model:
routing_model.AddDimensionWithVehicleCapacity(
time_callback_index,
0,
[3600*hours_in_park],
True,
'TimeCapacity'
)
The first argument of this added dimension, the callback function, informs the system that we are setting time constraints. For our scenario, this is relatively easy as we use a similar callback function as before. The third argument allows us to specify the time limit for staying in the park. If we set this time to be lower, the optimization will drop some of the ride nodes. We need to add some code to allow for this, logging a penalty whenever it occurs:
penalty = 10000for node in range(1, len(distance_matrix)):
routing_model.AddDisjunction(
[index_manager.NodeToIndex(node)],
penalty)
The idea here is simple; we add the possibility of a disjunction for every attraction. The solver can now remove any node (create a disjunction), but it must pay the penalty. We choose the penalty to be magnitudes higher since we still want to ride as many attractions as possible; we will only accept a disjunction as a last resort. If the value is too small, we may drop more locations than necessary. To show the constraint in action, assume we have given ourselves five hours in the park — essentially, we can ride everything:
Note that the optimal route here requires 4.94 hours in the park. What happens if we set the allowed time to 4.9 hours? We will have to get rid of at least one attraction.
“Rise of the Resistance” is cut from the list. Since this attraction is located in the far back corner of the park and has a lengthy ride time, it is intuitive to drop it from the route. What if we only had 30 minutes in the park?
As expected, we drop the majority of attractions and make a beeline towards a high ride density area with the shortest lengths, Fantasyland
Capturing Different Ride Preferences
In the last section, we saw that “Rise of the Resistance,” a popular ride, was dropped from our route. Many parkgoers would likely prefer certain attractions over others, and currently, we are setting every ride as equal in priority. We need a way to capture our preference for different rides to ensure that we are less likely to drop the most sought-after ones. Fortunately, we can modify the penalty code from the last section to achieve this goal. If we have a list of ride priorities, say zero through ten, we can adjust our penalty to be a multiple of this score, and therefore it will avoid removing rides with high priority. The nice thing is, if we are not interested in an attraction, we can set a penalty of zero.
penalty = 10000for node in range(1, len(distance_matrix)):
routing_model.AddDisjunction(
[index_manager.NodeToIndex(node)],
penalty*ride_priority[attraction_names[node]])
Recall that “Rise of the Resistance” was the first ride to be removed when constrained on time. However, if we raise the ride’s priority by two times, it is kept as part of our route, and we remove the Riverboat attraction instead.
Final Thoughts
We have done enough for now, but we are far from being done! There are easily over a dozen additional modifications to this problem to make it more realistic. “What if we would like to ride something multiple times?” “What if we want to eat at a certain restaurant at a specified time?” “What if we want to ensure I not only visit the nodes but walk along certain edges, such as the one that goes through the castle?” All of these are amusing alterations to consider, but one piece of information could significantly affect our optimal route; wait times. We need to relax the primary assumption that we are alone in the park, which has substantial consequences. We will attempt to tackle this issue in a future article. Stay tuned!
You can find the entire program for this blog post on GitHub or below: