This series of posts outlines one of my more technical projects to date. I attempt to solve a puzzle using two approaches. My goal is to optimize my custom approach until it can consistently outperform the more traditional approach on a variety of data sets. I anticipate this will require a rather long explanation, so this project will come in parts.
Beyond simply improving my own algorithm I also plan to benchmark the varying results, perform noise reduction on the benchmarks, and compile for multiple targets including WebAssembly. All source code for this project is available on GitHub.
As I worked through the Advent of Code 2020 day 1 puzzle it occurred to me there was a considerably faster approach to computing the solution. The puzzle itself is rather straightforward. You are given a list of numbers and are asked to find the two numbers in the list which sum to 2020. Afterwards you are asked to repeat this process but to instead find three addends summing to 2020.
The most straightforward solution is to generate all two-selection combinations of the listed numbers, add each pair, and check whether the result is 2020. Then repeat this process for all three-selection combinations.
I realized it made no sense to try adding numbers that were both less than half of the desired sum: two numbers both less than (or both greater than) half the desired number can never sum to the desired number. This is the basis of my approach.
Terminology: Going forward I will refer to addends less than half the desired sum as `smalls` and addends greater than half the desired sum as `bigs`. The "straightforward" solution I referred to previously will be called `use_combos` and my own custom algorithm will be called `find_addends`.
The `small` and `big` values from our initial list will be sorted into their own respective lists by comparing each entry to half the sum. Now only combinations of `small + big` need to be considered.
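To make the two-addend case concrete, here is a minimal sketch (the `find_pair` name and integer types are mine for illustration; the real `find_addends` generalizes this):

```rust
/// Illustrative two-addend search: partition around half the target,
/// then only compare smalls against bigs.
fn find_pair(entries: &[u32], target: u32) -> Option<(u32, u32)> {
    let half = target as f64 / 2.0;
    let (smalls, bigs): (Vec<u32>, Vec<u32>) =
        entries.iter().copied().partition(|&e| (e as f64) < half);

    for &small in &smalls {
        for &big in &bigs {
            if small + big == target {
                return Some((small, big));
            }
        }
    }
    None
}
```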
In the worst-case scenario for `find_addends` there are an equal number of `bigs` and `smalls`, meaning we may have to compare each `small` to every `big` until we find a solution. Compare this to `use_combos`' worst case, where every entry must be compared against every other entry.
Where $n$ is the number of entries in the list we expect something like $\frac{n^2}{4}$ iterations for `find_addends` and $\binom{n}{2}$ iterations for `use_combos`.
In further detail, if there are an equal number of `bigs` and `smalls` and $n$ entries in total then we know there are $\frac{n}{2}$ addends of either type. Worst case, we compare $\frac{n}{2}$ `smalls` to $\frac{n}{2}$ `bigs`, thus we expect $\frac{n}{2} \cdot \frac{n}{2} = \frac{n^2}{4}$ iterations. For `use_combos` we simply expect the number of two-selection combinations of $n$ entries, $\binom{n}{2} = \frac{n(n-1)}{2}$.
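Plugging in a concrete size makes the gap visible. For $n = 100$ entries split evenly:

$$\frac{n^2}{4} = \frac{100^2}{4} = 2500 \qquad \text{versus} \qquad \binom{100}{2} = \frac{100 \cdot 99}{2} = 4950$$

which is just shy of the factor of two promised below.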
So far, so good. In theory we should be down from $\frac{n(n-1)}{2}$ iterations to $\frac{n^2}{4}$ in the worst case, roughly a two-times speed increase, barring the overhead of sorting the entries into two lists.
The current solution only covers cases where two numbers are added to find our desired sum. However, you may recall that we are also asked to find three addends from the list. Rather than create a tailored solution that only finds three addends, I will take this opportunity to extend my algorithm to the general case. That is, I will adapt `find_addends` to find any number $r$ of entries which sum to our desired number. E.g. where $r = 4$ we will be looking for an $x_1 + x_2 + x_3 + x_4$ which yields our desired number.
For `use_combos` this is incredibly simple, as I am using the handy Rust Itertools library, which includes a `combinations` iterator. Here I just pass in $r$ selections instead of `2`.
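A minimal sketch of what this might look like (the signature and types are my own illustration; the `combinations` iterator is the Itertools method mentioned above):

```rust
use itertools::Itertools;

/// Brute force: test every r-selection combination of the entries.
fn use_combos(entries: &[u32], r: usize, target: u32) -> Option<Vec<u32>> {
    entries
        .iter()
        .copied()
        .combinations(r) // generalizes from .combinations(2)
        .find(|combo| combo.iter().sum::<u32>() == target)
}
```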
For `find_addends` things get a little more complicated. Remember that for two selections we only needed to consider the case `small + big`. Let's dwell on why this is for a moment.
Consider the following: we are asked to find $r$ numbers which sum to `12`. Where $r = 2$, `4 + 8` and `7 + 5` are both solutions. Notice how `4` is `2` less than half the sum, `6`, and `8` is `2` greater than half. In other words the `big` addend, `8`, makes up for the deficiency of the `small` addend, `4`. In mathematical terms we refer to the offset between an addend and the 'half value' as a residual. So by whatever residual one addend is less than half, the other addend is larger than half by that same residual.
Now notice why we compare these values against 'half'. We are asked to find `2` addends, therefore our residuals are determined by subtracting `sum / 2` from each addend. With three addends, the same logic applies against one third: the addends cannot all be less than, or all greater than, one third of the desired sum.
E.g. if $r = 3$ and our sum was `12`, a third would be `4`. Therefore `1 + 2 + 3` (all `smalls`) and `5 + 6 + 7` (all `bigs`) will never sum to `12`.
More generally, the number we compare our entries to is `sum / r`. But what should we call such a value? The sum divided by the number of addends, otherwise known as the average.
With this in mind we can shift our entire perspective of the situation. Every `big` value has a 'positive' residual (it sits above our average), and every `small` value has a 'negative' residual (it sits below our average). When we sum our addends, the positive and negative residuals must cancel (sum to zero).
Let's return to our sum of `12`, this time considering $r = 3$. Valid solutions include `6 + 3 + 3` and `5 + 5 + 2`. Our average is `sum / r`, or `4`. For our first solution both `3`s have a negative `1` residual with our average and `6` has a positive `2` residual with our average. Two negative residuals of `1` equal a single negative `2` residual, which cancels with our single positive `2` residual.
A more visual approach: we start with `6 + 3 + 3` and subtract the average `4` from each addend, yielding `2 - 1 - 1`. Each addend now represents the aforementioned 'residual' with the average. `2 - 1 - 1` then sums to `0`, confirming our solution.
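To make the residual view concrete, here is a small illustrative check (my own helper, not part of the project's code) that validates a candidate solution using only residuals:

```rust
/// A solution is valid exactly when its residuals against the average cancel.
fn residuals_cancel(addends: &[i32], target: i32) -> bool {
    // Scale each residual by r to stay in integers: the residual a - target/r
    // becomes r*a - target, and the scaled residuals sum to zero iff the
    // true residuals cancel.
    let r = addends.len() as i32;
    addends.iter().map(|&a| r * a - target).sum::<i32>() == 0
}

fn main() {
    // 6 + 3 + 3 = 12: residuals 2, -1, -1 sum to zero.
    assert!(residuals_cancel(&[6, 3, 3], 12));
    // 5 + 6 + 7 (all bigs) can never cancel.
    assert!(!residuals_cancel(&[5, 6, 7], 12));
}
```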
The possible solutions for three addends assume the form `small + big + big` or `small + small + big`. Jumping ahead, for four addends we must consider the three following cases: `small + big + big + big`, `small + small + big + big`, and `small + small + small + big`.
By now you may have noticed a pattern developing. For two addends we considered one case, for three addends two cases, and for four addends three cases. More generally, for $r$ addends we must consider $(r - 1)$ cases.
In practice `find_addends` will begin by testing all `1 small + (r - 1) bigs` sums, then `2 smalls + (r - 2) bigs`, and so on up to `(r - 1) smalls + 1 big`, incrementing the number of `small` addends it uses as it goes along. For each number $i$ of `small` addends we compare $\binom{n/2}{i}$ combinations of `smalls` to $\binom{n/2}{r-i}$ combinations of `bigs` in the worst case (taking the even $n/2$ split).
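Putting that loop structure into code, a hypothetical sketch of the generalized search might look like the following (again leaning on Itertools' `combinations`; the names and integer types are illustrative, not the project's exact code):

```rust
use itertools::Itertools;

/// Illustrative generalized search, assuming `smalls` and `bigs` have
/// already been partitioned around the average.
fn find_addends(smalls: &[u32], bigs: &[u32], r: usize, target: u32) -> Option<Vec<u32>> {
    // Try 1 small + (r - 1) bigs, then 2 smalls + (r - 2) bigs, and so on
    // up to (r - 1) smalls + 1 big.
    for i in 1..r {
        for small_combo in smalls.iter().copied().combinations(i) {
            let small_sum: u32 = small_combo.iter().sum();
            for big_combo in bigs.iter().copied().combinations(r - i) {
                let big_sum: u32 = big_combo.iter().sum();
                if small_sum + big_sum == target {
                    let mut solution = small_combo.clone();
                    solution.extend(big_combo);
                    return Some(solution);
                }
            }
        }
    }
    None
}
```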
Performance-wise this puts `find_addends` at an estimated $\sum_{i=1}^{r-1} \binom{n/2}{i} \binom{n/2}{r-i}$ iteration count, with `use_combos` still sitting around $\binom{n}{r}$ iterations.
It turns out as $r$ approaches infinity these two expressions actually converge, meaning so long as $r$ is kept below infinity and memory allocations and sorting overhead are kept in check, `find_addends` should out-perform `use_combos` on average.
Allow me to take a step back and revisit the core principle of my approach. Before any comparisons are made, `find_addends` first separates the initial list of inputs into `smalls` and `bigs` depending on whether the values are less than or greater than our average respectively. But how should we handle cases where an input is equal to our average?
Let's call such numbers `perfects`. In previous versions of `find_addends` I had performed mathematical gymnastics and even recursion to cover all `perfect` cases. However, while writing this article and working to optimize my implementation, a far simpler approach struck me.
While sorting our inputs into `bigs` and `smalls`, if a `perfect` value is found, place it into the `bigs` list. Then if another `perfect` value is found, place it into the `smalls` list. Repeat this alternating pattern for all `perfect` values.
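A sketch of that partitioning step (the function name and float types are illustrative; the only addition over a plain partition is the toggled boolean mentioned later):

```rust
/// Illustrative partition step: smalls below the average, bigs above,
/// and perfects alternating between the two lists.
fn partition_entries(entries: &[f64], average: f64) -> (Vec<f64>, Vec<f64>) {
    let mut smalls = Vec::new();
    let mut bigs = Vec::new();
    let mut next_perfect_goes_big = true; // the toggled boolean

    for &value in entries {
        if value < average {
            smalls.push(value);
        } else if value > average {
            bigs.push(value);
        } else if next_perfect_goes_big {
            bigs.push(value); // first, third, ... perfect
            next_perfect_goes_big = false;
        } else {
            smalls.push(value); // second, fourth, ... perfect
            next_perfect_goes_big = true;
        }
    }
    (smalls, bigs)
}
```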
For $r = 2$ addends we compare `small + big`. If we find two `perfects` then the `perfect` in the `smalls` list and the `perfect` in the `bigs` list will be compared according to this form and produce a solution.
Likewise for $r = 3$ addends we compare `small + big + big` and `small + small + big`. Given three `perfect` inputs, the `bigs` list will end up with two `perfect` values and the `smalls` list with one (due to alternation). Therefore, while iterating through all `small + big + big` combinations of values we will make the comparison over the `perfect` values.
For any number of `perfect` inputs found there will be either at most one more `perfect` in the `bigs` list or an equal number of `perfects` in each list. Since `find_addends` covers every mix of addends from one `small` with $(r - 1)$ `bigs` through $(r - 1)$ `smalls` with one `big`, we are guaranteed to cover all cases with one more `big` addend than `small` addends or with equal `small` and `big` addends.
Finally, reconsider `perfect` values as addends with a `0` residual with the average. As such, when searching for `3` addends our solution may contain only one `perfect` so long as the remaining `big` and `small` addends cancel with each other. The `big + big + small` case will cover this scenario since the `big` addends test ALL values in the `bigs` list, including the `perfect` values stored therein. We can rest assured `find_addends` has complete coverage from one `perfect` to $r$ `perfects`.
Since our `perfect` values are incorporated nicely into `find_addends`, we manage to perform no extra steps beyond toggling a boolean value while sorting our inputs. Therefore, dealing with this edge case adds a negligible performance impact.
To test the validity of my mathematical models I created a test input which contains 100 elements: 51 `smalls`, 49 `bigs`, and five correct addends. Furthermore, this input places the correct addends at the very end of the file and uses 4 `small` addends. This ought to produce the worst-case iteration count for both algorithms, which the models seek to calculate.
Entering this data into WolframAlpha (with 51 `smalls`, 49 `bigs`, and $r = 5$) yields $\sum_{i=1}^{4} \binom{51}{i} \binom{49}{5-i} = 71031576$ using the `find_addends` model and $\binom{100}{5} = 75287520$ using the `use_combos` model.
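Expanding the `find_addends` model term by term, one term per number of `small` addends used:

$$\binom{51}{1}\binom{49}{4} + \binom{51}{2}\binom{49}{3} + \binom{51}{3}\binom{49}{2} + \binom{51}{4}\binom{49}{1} = 10805676 + 23490600 + 24490200 + 12245100 = 71031576$$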
Running the test input through both algorithms and counting the iterations outputs:

    find_addends completed in 75808 ms with 71031576 iterations
    solution set: [10.0, 10.0, 10.0, 10.0, 1980.0], sum: 2020, product: 19800000
    use_combos completed in 87519 ms with 75287520 iterations
    solution_set: [10.0, 10.0, 10.0, 10.0, 1980.0], sum: 2020, product: 19800000
Both values match exactly with their respective model, which implies these models are accurate. Therefore, the performance of `find_addends` versus `use_combos` should resemble the following formulas.
| | `find_addends` | `use_combos` |
|---|---|---|
| Iterations | $\sum_{i=1}^{r-1} \binom{n/2}{i} \binom{n/2}{r-i}$ | $\binom{n}{r}$ |
| Big O | $O(n^r)$ | $O(n^r)$ |
Although I was hoping for a larger performance lead than calculated, `find_addends` still manages to remove the "all `small`" and "all `big`" combinations and proves to be more efficient than `use_combos`.
However, simplifying the number of combinations from the form $\binom{n}{k}$ to $\frac{n^k}{k!}$, we might expect about $\frac{(n/2)^r}{i!\,(r-i)!}$ iterations for each number $i$ of `small` addends we consider. Summing over the $(r - 1)$ cases yields roughly $\frac{n^r}{r!}\left(1 - 2^{1-r}\right)$ iterations for `find_addends`.
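Under that same $\binom{n}{k} \approx \frac{n^k}{k!}$ approximation, the sum collapses via the binomial theorem:

$$\sum_{i=1}^{r-1} \frac{(n/2)^r}{i!\,(r-i)!} = \frac{(n/2)^r}{r!} \sum_{i=1}^{r-1} \binom{r}{i} = \frac{(n/2)^r \left(2^r - 2\right)}{r!} = \frac{n^r}{r!}\left(1 - 2^{1-r}\right)$$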
In other words, as $r$ exceeds 4 the performance gains become increasingly negligible, but as the difference between the number of `smalls` and `bigs` increases, performance greatly improves.
Despite these shortcomings, this exercise has been far from fruitless. As I will detail in future posts, `find_addends` actually manages to hit upon several key ideas leveraged by leading k-SUM algorithms.
There is plenty more analysis I would like to share in the following parts, so keep an eye out on LinkedIn, and thank you for reading! I hope this post was interesting and the explanations clear. If you have any questions or comments please feel free to contact me. Until later!