The Neighborhood Shuffle Paradox

Riley Howsden
9 min read · Feb 9, 2021
Photo by Ross Joyner on Unsplash

Envision for a second a long row of houses in a neighborhood. For the sake of this exercise, we shall number the houses from zero to D − 1, where D is the total number of dwellings. At the current time, you and a close friend live in adjacent houses, so visiting them takes little effort. Imagine an annual “housing shuffle,” where families are relocated to another home in the neighborhood by a uniform, random selection. For this thought exercise, it may be easier to think of the houses as dorm rooms, with the shuffle occurring at the end of the quarter or semester.

Due to this shuffling process, your friend is worried that you will no longer be close to one another. You’re determined to find out the expected distance between your houses before this process occurs. How might you do that?

The Paradox

Let’s start with a naive relaxation: assume instead that we’re working with a continuous uniform distribution. In this situation, if your house could sit anywhere between 0 and D with equal probability, then the expectation of your location, E(you), is D divided by 2. Likewise, for your friend, their expectation, E(friend), is also D divided by 2. Since E(you) − E(friend) evaluates to 0, it appears that the anticipated location for both you and your friend is the same spot!

However, something seems off; if we were to carry out this exercise for every family in the neighborhood, we would get the same result. It looks like someone will need to construct a skyscraper at this central location to accommodate the expectation! It should be clear that we are missing something, as this doesn’t seem plausible.

Could it be that our shift from a discrete distribution to a continuous one caused this issue? After all, if there were eleven houses in the neighborhood and we were selected to live in house number 5, then we should set a constraint that the next neighbor can’t occupy this spot. The remaining locations are [0, 1, 2, 3, 4, 6, 7, 8, 9, 10], and since the expectation of this set of numbers is also 5, we might wrongly assume that the next assigned person will reside at 4 or 6. In reality, since the selection is uniform, that person is just as likely to occupy 0 or 10. Our use of expectation here is leading us astray.

What we have uncovered, however, is that the assignments are not independent; knowledge from previous designations limits future possibilities and, in turn, affects our solution. The expectation we should be computing is E(max{you, friend}) − E(min{you, friend}) rather than |E(you) − E(friend)|; these are not equivalent. The question remains, how far away do we expect our friend to live?

Before going on, I suggest you either speculate on the answer or write down a rough solution.

The Brute Force Solution

One way to solve this problem is to evaluate every possible pair of locations, calculate the distance between each pair, and take the average. If the number of houses were 100, a short brute-force script should do the trick.
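
A minimal Python sketch of such a script, assuming 100 houses numbered 0 through 99 and that the two families cannot share a home:

    # Brute force: average the distance over every pair of distinct house assignments.
    D = 100  # number of houses, numbered 0 through D - 1

    total, count = 0, 0
    for you in range(D):
        for friend in range(D):
            if you != friend:  # two families cannot occupy the same home
                total += abs(you - friend)
                count += 1

    print(total / count)  # ~33.67 for D = 100, i.e., (D + 1) / 3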

It appears that the average distance between the two households in question is between 33 and 34 houses. At first glance, the general solution for the expected gap between two people seems to be roughly D/3. If we change the number of homes and rerun the script, a pattern emerges: 20 houses gives an expected distance of 21/3, 500 gives 501/3, and 1000 gives 1001/3. The exact answer appears to be (D + 1)/3.

Are there other ways we could have arrived at the same conclusion?

The Simulation Solution

Instead of using brute force, we could randomly sample a set of location pairs from the distribution and calculate the distance between them. The process continues for a specified number of simulations, or until the running average of the distance has converged.
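
A minimal sketch of this approach, assuming numpy and the same 100-house setup as before:

    import numpy as np

    rng = np.random.default_rng(0)
    D = 100            # number of houses
    n_sims = 100_000   # number of simulated shuffles

    distances = []
    for _ in range(n_sims):
        # Draw two distinct locations uniformly at random: one for you, one for your friend.
        you, friend = rng.choice(D, size=2, replace=False)
        distances.append(abs(you - friend))

    print(np.mean(distances))  # hovers around 33.7, close to (D + 1) / 3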

Unlike the brute-force method, this may not yield the exact solution, but it should be close. In situations where the number of houses is large, this approach becomes more appealing because it requires far less computation. Another benefit of this method is that it is easy to substitute other distributions for evaluation, something that is not as straightforward with the brute-force approach.

Generalization

Both of the methods above result in approximately D/3. Given this number, is there any other intuition that we can draw from the problem? Recall that a single resident’s expected location was D/2, which splits the uniform distribution in half. Curiously, the first two individuals’ locations seem to break the uniform distribution into three equal parts, with the first person’s expected location at D/3 and the second person at 2D/3. We can extend this to any number of people, P, with the expectation that each person is D/(P+1) away from the person before them, equally splitting the uniform distribution into P+1 parts. Note that there is still some hand-waving going on here, as we are loosely equating the problem’s discrete and continuous versions. Still, the observations above are reasonable approximations in most situations.
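
A quick numerical check of the equal-splitting claim, assuming numpy, a hypothetical P = 4 people, and a neighborhood of width D = 100 with continuous positions:

    import numpy as np

    rng = np.random.default_rng(0)
    D, P, n_sims = 100, 4, 200_000

    # Drop P people uniformly at random and sort their positions within each simulation.
    positions = np.sort(rng.uniform(0, D, size=(n_sims, P)), axis=1)

    # The average sorted positions land near D/5, 2D/5, 3D/5, 4D/5 = 20, 40, 60, 80.
    print(positions.mean(axis=0))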

Order Statistics

While both the brute force and simulation methods are simple to create, could an analytical solution to the problem exist? The relatively unknown area of order statistics may hold the key. If you’ve never heard of order statistics, do not be afraid; they are fairly easy to understand at a high level. Imagine you had observed a series of numbers drawn from some distribution, such as 7, 3, 2, 8. The first order statistic (X_1) would be 2, the minimum of the series, while the nth order statistic (X_n) would be 8, the maximum of the series. We could rank each number in between in a similar fashion.

Returning to our original problem, it is evident that we can represent the two friends’ positions by these order statistics. All that is needed now is a way to measure the expected value of each order statistic. Recall that for a random variable X, we can define the cumulative distribution function (CDF) as:
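
F_X(x) = P(X \le x)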

A quick reminder on CDFs: this calculation represents the probability that X will take on a value less than or equal to x. For example, if we had a uniform distribution on [0, 10], P(X ≤ 3) would evaluate to 30%.

Now turning to order statistics: if we wanted to calculate the CDF for the maximum order statistic, it would follow this form:
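
F_{X_n}(x) = P(\text{all } n \text{ samples} \le x) = [F_X(x)]^n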

This new CDF represents the probability that all n samples take on a value less than or equal to x, which is precisely the event that the maximum does.

Similarly, the cumulative distribution for the minimum order statistic is:
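
F_{X_1}(x) = 1 - [1 - F_X(x)]^n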

This is one minus the probability that all n samples take on a value larger than x; equivalently, it is the probability that at least one sample, and hence the minimum, is less than or equal to x.

If we are looking for the CDF of any other order statistic, we can combine the pieces above, raising F_X(x) and 1 − F_X(x) to powers that depend on that order statistic’s position and summing over the ways the samples can split around x. The full formula would be:
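
F_{X_k}(x) = \sum_{j=k}^{n} \binom{n}{j} [F_X(x)]^j [1 - F_X(x)]^{n-j}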

That may seem daunting, but an example may help. The kth order statistic is less than or equal to x exactly when at least k of the n numbers are less than or equal to x. So for the CDF of the 2nd order statistic, we sum the probabilities that exactly two of the numbers are smaller than x and the remaining n − 2 are larger, that exactly three are smaller, and so on up to all n. The binomial coefficient in each term represents the number of ways that outcome can occur. Remember, we do not require specific variables to be smaller than x, just that some j of them are while the remaining variables are larger.
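
For the 2nd order statistic, that sum is:

F_{X_2}(x) = \sum_{j=2}^{n} \binom{n}{j} [F_X(x)]^j [1 - F_X(x)]^{n-j}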

Finally, we can obtain the probability density function (PDF) by taking the derivative of any of the above formulas with respect to x. For example, the PDF of the maximum order statistic would be:
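
f_{X_n}(x) = n \, [F_X(x)]^{n-1} \, f_X(x)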

and the PDF for the minimum order statistic would be:
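
f_{X_1}(x) = n \, [1 - F_X(x)]^{n-1} \, f_X(x)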

To visualize this for our simple case, below is the rough probability mass function for the 1st and 2nd order statistics with n = 2. You will notice that for a neighborhood size of 100, any single resident has a 1% chance of being at any given location. However, when looking at the order statistics, the minimum is more likely to sit near the low end of the neighborhood and the maximum near the high end.
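
A rough way to generate such a chart, assuming numpy and matplotlib and ignoring the shared-home constraint for simplicity:

    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(0)
    D, n_sims = 100, 500_000

    # Two independent uniform draws per simulated shuffle; track their min and max.
    samples = rng.integers(0, D, size=(n_sims, 2))
    mins, maxes = samples.min(axis=1), samples.max(axis=1)

    bins = np.arange(D + 1)
    plt.hist(mins, bins=bins, density=True, alpha=0.5, label="1st order statistic (min)")
    plt.hist(maxes, bins=bins, density=True, alpha=0.5, label="2nd order statistic (max)")
    plt.xlabel("House location")
    plt.ylabel("Probability")
    plt.legend()
    plt.show()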

Waving our hands a little bit, we can extend this to any order statistic as:
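
f_{X_k}(x) = \frac{n!}{(k-1)! \, (n-k)!} \, [F_X(x)]^{k-1} \, [1 - F_X(x)]^{n-k} \, f_X(x)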

For a continuous uniform distribution — Unif(0,1) — this translates into:
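
f_{X_k}(x) = \frac{n!}{(k-1)! \, (n-k)!} \, x^{k-1} (1 - x)^{n-k}, \qquad 0 \le x \le 1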

The formula may look familiar. It is just a beta distribution with an alpha value of k and a beta value of (n − k + 1), and the expected value of a beta distribution is simply alpha/(alpha + beta). With this in mind, we can find a quick analytical solution for the expectations of the first and second order statistics in our neighborhood problem: D/3 and 2D/3.
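
E[X_k] = \frac{\alpha}{\alpha + \beta} = \frac{k}{n + 1}, \qquad \text{which for } n = 2 \text{ gives } \tfrac{1}{3} \text{ and } \tfrac{2}{3} \text{ of the interval, or } D/3 \text{ and } 2D/3 \text{ after scaling by } D.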

We can plot any order statistic; below is an example of the maximum order statistic when there are n = 10 samples (the 10th order statistic). Like the earlier chart, most of the probability lies at the high end of the housing locations, but it is more extreme in this situation. It is doubtful that the maximum order statistic will land anywhere before the 60th house.
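
Under the continuous approximation, P(X_{10} < 0.6 D) = 0.6^{10} \approx 0.006, so there is well under a 1% chance that the maximum lands before the 60th house.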

Often, the underlying distribution will not be as easy to work with as the uniform distribution, and we will need to calculate the expected value directly using:
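
E[X_k] = \int_{-\infty}^{\infty} x \, f_{X_k}(x) \, dx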

Evaluating this formula by hand can become quite gnarly for most distributions, but this integral should be a reasonable calculation for most computers.
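
For illustration, a sketch of such a numerical calculation, assuming scipy and, say, the expected maximum of ten standard normal draws:

    import numpy as np
    from math import factorial
    from scipy import integrate, stats

    def order_stat_pdf(x, n, k, dist=stats.norm):
        # PDF of the k-th order statistic out of n i.i.d. draws from `dist`.
        coef = factorial(n) / (factorial(k - 1) * factorial(n - k))
        F, f = dist.cdf(x), dist.pdf(x)
        return coef * F ** (k - 1) * (1 - F) ** (n - k) * f

    n, k = 10, 10  # the maximum of 10 standard normal draws
    expected, _ = integrate.quad(lambda x: x * order_stat_pdf(x, n, k), -np.inf, np.inf)
    print(expected)  # ~1.54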

Summary

Compared to both the brute-force and simulation methods, order statistics are harder to grasp initially but have some clear advantages. For example, the brute-force method does not generalize well; if we had a normal distribution instead of a uniform distribution, we would have to weight each pair by its likelihood of occurrence. The simulation approach handles this issue well, as it can sample directly from any distribution, discrete or continuous. However, it needs many samples to converge, even more so for heavy-tailed distributions; therefore, this solution can be expensive to carry out. If the distribution is known, order statistics can give an immediate answer as long as the integral is solvable.

Note: The order-statistics formulation above is far simpler than the actual problem. In the original setting, the selection process is discrete and constrained by earlier assignments, meaning no two families can live in the same home. Unfortunately, accounting for this discreteness alongside the “without replacement” property is beyond this post’s scope and is also beyond my ability to explain adequately.


Riley Howsden

“Half-Stack” machine learning propagandist in the gaming industry — a critic of all data presented deceitfully, unless it contains a meme, then it must be true.