find_addends - Part 1

Explaining my approach

Published: 5/4/2021

A preliminary graph of estimated performance.

This series of posts outlines one of my more technical projects to date. I attempt to solve a puzzle using two approaches. My goal is to optimize my custom approach until it can consistently outperform the more traditional approach on a variety of data sets. I anticipate this will require a rather long explanation so this project will come in parts.

Beyond simply improving my own algorithm I also plan to benchmark the varying results, perform noise reduction on the benchmarks, and compile for multiple targets including WebAssembly. All source code for this project is available on GitHub.

Background

As I worked through the Advent of Code 2020 day 1 puzzle it occurred to me there was a considerably faster approach to computing the solution. The puzzle itself is rather straightforward. You are given a list of numbers and are asked to find the two numbers in the list which sum to 2020. Afterwards you are asked to repeat this process but to instead find three addends summing to 2020.

The most straightforward solution is to generate every two-element combination of the listed numbers, sum each pair, and check whether the result is 2020. Then repeat this process for all three-element combinations.
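As a sketch of this brute-force approach for the two-addend case (the actual use_combos leans on a combinations iterator, discussed below; this hand-rolled pair loop, with the invented name use_combos_pair, is an equivalent stand-in for r = 2, run against the sample list from the puzzle description):

```rust
// Illustrative stand-in for the straightforward approach at r = 2:
// check every unordered pair until one sums to the target.
fn use_combos_pair(entries: &[i64], target: i64) -> Option<(i64, i64)> {
    for i in 0..entries.len() {
        for j in (i + 1)..entries.len() {
            if entries[i] + entries[j] == target {
                return Some((entries[i], entries[j]));
            }
        }
    }
    None
}

fn main() {
    // sample list from the Advent of Code 2020 day 1 description
    let entries = [1721, 979, 366, 299, 675, 1456];
    assert_eq!(use_combos_pair(&entries, 2020), Some((1721, 299)));
}
```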

Developing a New Approach

I realized it made no sense to try adding numbers that were both less than half of the desired sum: two numbers both less than (or both greater than) half the desired number can never sum to it. This is the basis of my approach.

Terminology: Going forward I will refer to addends less than half the desired sum as smalls and addends greater than half the desired sum as bigs. The "straightforward" solution I referred to previously will be called use_combos and my own custom algorithm will be called find_addends.

The small and big values from our initial list will be sorted into their own respective lists by comparing each entry to half the sum. Now only combinations of small + big need to be considered.
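A minimal sketch of this two-addend scheme (find_addends_pair is a name invented for this sketch; it compares 2 · x against the target to sidestep integer-division rounding, and routes values exactly equal to half into bigs for now, an edge case revisited later in the post):

```rust
// Sketch of find_addends for r = 2: partition entries around half the
// target, then only test small + big pairs.
fn find_addends_pair(entries: &[i64], target: i64) -> Option<(i64, i64)> {
    // 2 * x < target  <=>  x < target / 2, without integer-division rounding
    let (smalls, bigs): (Vec<i64>, Vec<i64>) =
        entries.iter().copied().partition(|&x| 2 * x < target);
    for &s in &smalls {
        for &b in &bigs {
            if s + b == target {
                return Some((s, b));
            }
        }
    }
    None
}

fn main() {
    let entries = [1721, 979, 366, 299, 675, 1456];
    assert_eq!(find_addends_pair(&entries, 2020), Some((299, 1721)));
}
```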

In the worst case scenario for find_addends there are an equal number of bigs and smalls meaning we may have to compare each small to every big until we find a solution. Compare this to use_combos's worst case where every entry must be compared against every other entry.

Where $n$ is the number of entries in the list we expect something like $(\frac{n}{2})^2$ iterations for find_addends and $\binom{n}{2}$ iterations for use_combos.

In further detail, if there are an equal number of bigs and smalls and $n$ entries in total then we know there are $\frac{n}{2}$ addends of either type. Worst case, we compare $\frac{n}{2}$ smalls to $\frac{n}{2}$ bigs, thus we expect $(\frac{n}{2})^2$ iterations. For use_combos we simply expect the number of two-selection combinations of $n$ entries.

So far, so good. In theory we should be down from $\binom{n}{2}$ iterations to $\frac{n^2}{4}$ in the worst case, roughly a two-times speed increase, barring the overhead of sorting the entries into two lists.

The General Case

The current solution only covers cases where two numbers are added to find our desired sum. However, you may recall that we are also asked to find three addends from the list. Rather than create a tailored solution that only finds three addends I will take this opportunity to extend my algorithm to the general case. That is, I will adapt find_addends to find any r number of entries which sum to our desired number. E.g. where r = 4 we will be looking for an x1 + x2 + x3 + x4 which yields our desired number.

For use_combos this is incredibly simple, as I am using the handy Rust Itertools library, which includes a combinations iterator. Here I just pass in r selections instead of 2.

For find_addends things get a little more complicated. Remember that for two selections we only needed to consider the case small + big. Let's dwell on why this is for a moment.

Consider the following: we are asked to find r numbers which sum to 12. Where r = 2, 4 + 8 and 7 + 5 are both solutions. Notice how 4 is 2 less than half the sum, 6, and 8 is 2 greater than half. In other words the big addend, 8, makes up for the deficiency of the small addend, 4. In mathematical terms we refer to the offset between an addend and the 'half value' as a residual. So by whatever residual one addend is less than half, the other addend is greater than half by that same residual.

Now notice why we compare these values against 'half': we are asked to find 2 addends, therefore our residuals are determined by subtracting sum / 2 from each addend. Using three addends, the values cannot be all less than or all greater than one third of the desired sum.

E.g. if r = 3 and our sum was 12 a third would be 4. Therefore 1 + 2 + 3 (all smalls) and 5 + 6 + 7 (all bigs) will never sum to 12.

More generally, the number we compare our entries to is sum / r. But what should we call such a value? The sum divided by the number of addends is otherwise known as the average.

With this in mind we can shift our entire perspective of the situation. Every big value carries a 'positive' residual (it exceeds our average) and every small value carries a 'negative' residual (it falls below our average). When we sum our addends the positive and negative residuals must cancel (sum to zero).

Let's return to our sum of 12, this time considering r = 3. Valid solutions include 6 + 3 + 3 and 5 + 5 + 2. Our average is sum / r or 4.

For our first solution both 3s have a residual of −1 with our average and 6 has a residual of +2. The two −1 residuals combine into a single −2, which cancels with the single +2 residual.

A more visual approach: we start with 6 + 3 + 3 and subtract the average 4 from each addend yielding 2 - 1 - 1. Each addend now represents the aforementioned 'residual' with the average. 2 - 1 - 1 then sums to 0 confirming our solution.
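The residual check above can be expressed directly in code (a toy helper written for this post's examples; it assumes the target divides evenly by the number of addends, as 12 does by 3):

```rust
// Subtract the average from each addend; a valid solution's residuals
// must cancel, i.e. sum to zero.
fn residuals_cancel(addends: &[i64], target: i64) -> bool {
    let avg = target / addends.len() as i64; // exact when target % r == 0
    addends.iter().map(|&x| x - avg).sum::<i64>() == 0
}

fn main() {
    // 6 + 3 + 3 = 12 yields residuals 2, -1, -1, which sum to zero
    assert!(residuals_cancel(&[6, 3, 3], 12));
    assert!(residuals_cancel(&[5, 5, 2], 12));
    assert!(!residuals_cancel(&[1, 2, 3], 12)); // all smalls can never work
}
```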

The possible solutions for three addends assume the form small + big + big or small + small + big.

Jumping ahead, for four addends we must consider the three following cases: small + big + big + big, small + small + big + big, and small + small + small + big.

By now you may have noticed a pattern developing. For two addends we considered one case, for three addends we considered two cases, and for four addends we consider three cases. More generally, for r addends we must consider (r - 1) cases.

In practice find_addends will begin by testing all 1 small + (r − 1) bigs sums, then 2 smalls + (r − 2) bigs, up to (r − 1) smalls + 1 big, incrementing the number of small addends to use as it goes along. For each count i of small addends we test, in the worst case, every one of the $\binom{n/2}{i}$ combinations of smalls against every one of the $\binom{n/2}{r-i}$ combinations of bigs.
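That scheme might look like the following sketch (the combinations helper is a minimal stand-in for the Itertools adaptor, and values equal to the average are lumped into bigs here for brevity; the perfect-handling refinement comes later in the post):

```rust
// Minimal stand-in for itertools' combinations adaptor: every k-element
// selection of `items`, in input order.
fn combinations(items: &[i64], k: usize) -> Vec<Vec<i64>> {
    if k == 0 {
        return vec![vec![]];
    }
    let mut out = Vec::new();
    for (idx, &first) in items.iter().enumerate() {
        for mut rest in combinations(&items[idx + 1..], k - 1) {
            rest.insert(0, first);
            out.push(rest);
        }
    }
    out
}

// Sketch of the general find_addends: partition around the average, then
// test i smalls against r - i bigs for i = 1..r.
fn find_addends(entries: &[i64], target: i64, r: usize) -> Option<Vec<i64>> {
    // r * x < target  <=>  x < target / r (the average), avoiding division
    let (smalls, bigs): (Vec<i64>, Vec<i64>) =
        entries.iter().copied().partition(|&x| (r as i64) * x < target);
    for i in 1..r {
        for small_combo in combinations(&smalls, i) {
            for big_combo in combinations(&bigs, r - i) {
                if small_combo.iter().chain(big_combo.iter()).sum::<i64>() == target {
                    let mut solution = small_combo;
                    solution.extend(big_combo);
                    return Some(solution);
                }
            }
        }
    }
    None
}

fn main() {
    // 1 + 5 + 6 = 12: one small plus two bigs
    assert_eq!(find_addends(&[1, 2, 3, 5, 6, 7], 12, 3), Some(vec![1, 5, 6]));
}
```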

Performance-wise this puts find_addends at an estimated

$$\sum_{i=1}^{r-1} \binom{n/2}{i}\binom{n/2}{r-i}$$

iteration count with use_combos still sitting around $\binom{n}{r}$ iterations.

It turns out that as r approaches infinity these two expressions actually converge. (By Vandermonde's identity $\sum_{i=0}^{r} \binom{n/2}{i}\binom{n/2}{r-i} = \binom{n}{r}$, so find_addends' count is exactly $\binom{n}{r} - 2\binom{n/2}{r}$, and the subtracted term becomes a vanishing fraction of the total as r grows.) So long as r is kept finite and memory allocations and sorting overhead are kept in check, find_addends should out-perform use_combos on average.

Dealing with Edge Cases

Allow me to take a step back and revisit the core principle of my approach. Before any comparisons are made, find_addends first separates the initial list of inputs into smalls and bigs depending on whether the values are less than or greater than our average, respectively. But how should we handle cases where an input is equal to our average?

Let's call such numbers perfects. In previous versions of find_addends I had performed mathematical gymnastics and even recursion to cover all perfect cases. However, while writing this article and working to optimize my implementation a far simpler approach struck me.

While sorting our inputs into bigs and smalls, if a perfect value is found, place it into the bigs list. Then if another perfect value is found, place it into the smalls list. Repeat this alternating pattern for all perfect values.
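That alternating placement might look like this in the partitioning step (a sketch; partition_with_perfects is a name invented for this post):

```rust
// Sketch: while splitting entries around the average, send values equal to
// the average (perfects) to bigs and smalls in turn via a toggled boolean.
fn partition_with_perfects(entries: &[i64], target: i64, r: i64) -> (Vec<i64>, Vec<i64>) {
    let (mut smalls, mut bigs) = (Vec::new(), Vec::new());
    let mut perfect_to_big = true; // flipped each time a perfect is found
    for &x in entries {
        if r * x < target {
            smalls.push(x); // below the average: small
        } else if r * x > target {
            bigs.push(x); // above the average: big
        } else if perfect_to_big {
            bigs.push(x);
            perfect_to_big = false;
        } else {
            smalls.push(x);
            perfect_to_big = true;
        }
    }
    (smalls, bigs)
}

fn main() {
    // r = 2 and target = 12, so the average is 6; the three 6s alternate
    let (smalls, bigs) = partition_with_perfects(&[1, 6, 6, 9, 6], 12, 2);
    assert_eq!(smalls, vec![1, 6]);
    assert_eq!(bigs, vec![6, 9, 6]);
}
```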

For r = 2 addends we compare small + big. If we find two perfects then the perfect in the smalls list and the perfect in the bigs list will be compared according to this form and produce a solution.

Likewise for r = 3 addends we compare small + big + big and small + small + big. The bigs list will end up with two perfect values (due to alternation). Therefore, while iterating through all small + big + big combinations of values we will make the comparison over perfect values.

For any number of perfect inputs found there will be either at most one more perfect in the bigs list or an equal number of perfects in each list. Since find_addends covers every split of addends from one small and (r − 1) bigs up to (r − 1) smalls and one big, we are guaranteed to cover all cases with one more big addend than small addends or with equal small and big addends.

Finally, reconsider perfect values as an addend with a 0 residual with the average. As such, when searching for 3 addends our solution may contain only one perfect so long as the remaining big and small addends cancel with each other. The big + big + small case will cover this scenario since the big addends test ALL values in the bigs list including the perfect values stored therein. We can rest assured find_addends has complete coverage from one perfect to r perfects.

Since our perfect values are incorporated nicely into find_addends we manage to perform no extra steps outside of toggling a boolean value while sorting our inputs. Therefore, dealing with this edge case adds a negligible performance impact.

Part One - Conclusion

To test the validity of my mathematical models I created a test input which contains 100 elements, 51 smalls, 49 bigs, and five correct addends. Furthermore, this input places the correct addends at the very end of the file and uses 4 small addends. This ought to produce the worst-case iteration count for both algorithms which the models seek to calculate.

Entering this data into WolframAlpha yields 71031576 using the find_addends model and 75287520 using the use_combos model.
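Those model numbers are easy to reproduce: substituting the test input's actual list sizes (51 smalls, 49 bigs) for the $n/2$ terms, the find_addends model becomes $\sum_{i=1}^{4}\binom{51}{i}\binom{49}{5-i}$, while the use_combos model is simply $\binom{100}{5}$. A quick check (binom is a helper written for this sketch):

```rust
// Exact binomial coefficient via the multiplicative formula; each partial
// product stays an integer because C(n, i) * (n - i) is divisible by i + 1.
fn binom(n: u64, k: u64) -> u64 {
    if k > n {
        return 0;
    }
    let mut result = 1u64;
    for i in 0..k {
        result = result * (n - i) / (i + 1);
    }
    result
}

fn main() {
    // find_addends model with 51 smalls, 49 bigs, r = 5
    let find_addends_model: u64 = (1..=4).map(|i| binom(51, i) * binom(49, 5 - i)).sum();
    // use_combos model: all 5-selections of 100 entries
    let use_combos_model = binom(100, 5);
    assert_eq!(find_addends_model, 71_031_576);
    assert_eq!(use_combos_model, 75_287_520);
}
```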

Running the test input through both algorithms and counting the iterations outputs:

```
find_addends completed in 75808 ms with 71031576 iterations
solution set: [10.0, 10.0, 10.0, 10.0, 1980.0], sum: 2020, product: 19800000
use_combos completed in 87519 ms with 75287520 iterations
solution_set: [10.0, 10.0, 10.0, 10.0, 1980.0], sum: 2020, product: 19800000
```

Both values match exactly with their respective model which implies these models are accurate.

Therefore, the performance of find_addends versus use_combos should resemble the following formulas.

| | find_addends | use_combos |
| --- | --- | --- |
| Iterations | $\displaystyle\sum_{i=1}^{r-1} \binom{n/2}{i}\binom{n/2}{r-i}$ | $\binom{n}{r}$ |
| Big O | $O(n^r)$ | $O(n^r)$ |


Although I was hoping for a larger performance lead than calculated, find_addends still manages to remove the "all small" and "all big" combinations and prove more efficient than use_combos. However, simplifying the number of combinations from the form $\binom{n}{i}$ to powers of $n$, we might expect $(\frac{n}{2})^i \cdot (\frac{n}{2})^{r-i} = \frac{n^r}{2^r}$ iterations for each of the (r − 1) counts of small addends we consider. This yields roughly $\frac{r-1}{2^r} \cdot n^r$ iterations for find_addends.

In other words, as r grows the excluded all-small and all-big combinations account for an ever-smaller fraction of the total, so the performance gains become increasingly negligible; but as the difference between the number of smalls and bigs increases, performance greatly improves.

Despite these shortcomings this exercise has been far from fruitless. As I will detail in future posts find_addends actually manages to hit upon several key ideas leveraged by leading k-SUM algorithms.

There is plenty more analysis I would like to share in the following parts, so keep an eye out on LinkedIn, and thank you for reading! I hope this post was interesting and the explanations clear. If you have any questions or comments please feel free to contact me. Until later!