A narcisistic number is a number where the sum of the nth powers of the digits
of an n-digit number is equal to the number. For instance, 153 is a
narcissistic number of length 3 because 1^3 + 5^3 + 3^3 = 1 + 125 + 27 = 153
or 1^4 + 6^4 + 3^4 + 4^4 = 1634 for a 4 digit number.
The Challenge set on Programming Praxis is to find all the narcissistic numbers. We are given a little help, in that although 61*(9^61) < 10^60 (meaning there are no narcissistic numbers greater than 10^61, we are also told that the largest narcissistic number is 115132219018763992565095597973971522401 (n=39).
Starting with a naive implimentation
My first approach iterated through the numbers from the shortest number to be found, up to the longest.
A problem with range.
Using range() to generate the list of numbers to iterate over soon caused problems. When the program started to calculate values for n=10, there was a rather nasty out of memory exception. range() in python 2.7 creates the full list in one go, rather than a generator. The xrange method should be used instead. NOTE: python 3.x returns a generator for either range or xrange, so this would be less of a problem there. In the end, a while loop seemed like the most simple option.
def generate_power_list(power): return [i**power for i in range(0,10)] def find_narcissistic_numbers_naive(min_length, max_length): for number_length in range(min_length, max_length): power_dict = generate_power_dictionary(number_length) max_number = 10 ** number_length number = 10** (number_length -1) while number < max_number: value = 0 for digit in str(number): value += power_dict[digit] if value == number: logging.debug('narcissistic %s ' % number) number += 1
With the range fix, the method works correctly, but performance already became an issue as soon as digit length reached 11-12 – given the performance was O(10^n), getting to n=39 was going to take some time! Clearly we need something that isn’t O(n^10) if we are going to find all narcissistic numbers.
Unique Digit combinations
For any unique combination of digits, there could clearly only ever be one narcissistic number (the digits would only add up to one value). The number of permutations for selecting unique combinations of digits is based on the r-topic (or figurate) numbers, found in Pascals Triangle (in our case, as there are 10 possible digits, so we use the 10th diagonal).
There is a marked improvement on the order of the algorithm, and with that I started on a first implimentation of a recursive algorithm that would use combinations of digits, rather than each number.
In this solution each recursion handles a single digit of the array of digits being used, and tries all appropriate combinations of that digit.
def execute_recursive(digits, number_length): index = len(digits) if digits: number = digits[-1] else: number = 0 results =  digits.append(number) if len(digits) < number_length: while number < 10: results += execute_recursive(digits[:], number_length) number += 1 digits[index] = number else: while number < 10: digit_value = sum_digits(digits) if check_numbers_match_group(digit_value, digits): results.append(digit_value) logging.debug(digit_value) number += 1 digits[index] = number return results def find_narcissistic_numbers(min_length, max_length): for number_length in range(min_length, max_length): digits =  t_start = time.clock() results = execute_recursive(digits, number_length) print 'duration: %s for number length: %s' %(time.clock() - t_start, number_length)
This improved performance drastically, taking 1 minute to find all narcissistic numbers with n=19.
Performance still needs to be improved, in order to reach n=39 in a reasonable time. Time for some code optimisations…
Lists instead of dictionaries
In the first Naive implimentation, I used a dictionary, to cache the values of the powers of a digit for a given number length. Given that digits are all integers, I had been rather slow in not using lists, and simply using the digit as an index. As well as providing faster access to the result, it also meant that we no longer had to do a number of costly string conversions to dictionary keys. This change had a significant performance improvement of about 33%.
Narcissistic number check
In the base version, when checking that a number matched the digits, we iterated through each digit type, to ensure that there were the same number of each type. In this version we have added the optimisation of checking the digit length is correct before doing the full check.
I expected that this would have more of an effect on small number lengths, because as number length increases, there will tend to be more numbers in the middle of the distribution. This was somewhat upheld by the results:
- n=16: 11.5% improvement
- n=19: 9.8% improvement
def check_numbers_match_group(number, digits): number_search = str(number) # new in v1.3 if len(number_search) != len(digits): return False for digit in digit_list: if number_search.count(digit) != digits.count(digit): return False return True
The function to check if a set of digits still used a number of string conversions, as the method gets called on each iteration, removing these conversions improved performance by 6-7%. Lesson learned – don’t use string conversions anywhere performance sensitive – unsurprisingly!
Caching the digits sum
Up to now we have been using a sum_digits() function that calculates the sum
for each combination of digits. However, particularly for long numbers this
has significant repetition, as most of the time, it is only the last number
that has changed. Version 1.5, keeps a running total, of the
n-1, n-2…n-(n-1) sums, so that it only needs to recalulate the sum for digits that have changed.
This made a huge performance improvement for 19 length digits, 30% and this will increase in impact as the number length increases.
The end of recursion
The way the recursive algorithm is structured, means that the code has to create and execute lots of functions, one for every 10 permutations. Step one is to remove all the recursion, and use an array for the digits, and use a couple of counters to work out which digit should be changed next.
def execute_loops(number_length): results =  digits = *number_length segment_sums = *number_length index_limit = column_marker = number_length - 1 while True: while digits[column_marker]==9: column_marker -= 1 if column_marker < 0: return results digits[column_marker] = digits[column_marker]+1 while column_marker < index_limit: if column_marker == 0: segment_sums[column_marker] = power_lists[number_length][digits[column_marker]] else: segment_sums[column_marker] = segment_sums[column_marker-1] + power_lists[number_length][digits[column_marker]] digits[column_marker + 1] = digits[column_marker] column_marker += 1 new_val = segment_sums[column_marker-1] + power_lists[number_length][digits[column_marker]] if check_numbers_match_group(new_val, digits): results.append(new_val) logging.debug(new_val)
Removal of recursion, improved performance by a further 30%. From a time of over 60s for n=19 at the start of the optimisations, the program has now been improved to run in 17s for n=19, an improvement of 70%.
Although the algorithm has been sped up noticeably, there are still a couple more things that could improve performance.
- Threading – Currently the program only uses one core, and watching in task manager, seeing that the CPU is only working at 25%, makes me cry a little in side. The main problem here will be dividing the work up evenly between each thread.
- Pruning – as we saw at the start of the post, as n approaches 61, it gets hard to make numbers big enough to be narcissistic numbers. At n=40, if the digits are only 0 or 9, then there needs to be at least 7 9′s in order to create a number larger than 10**39. Therefore there we can skip combinations with more than 33 0′s, meaning that we save ourselves 48,000 iterations. As n increases, the contribution of other digits to the total reduces (9**40 > 100 * (8**40)), so there are other combinations of digits that will simply be too small even to be considered.
- Magic – As mentioned in the original post Dik Winter calculated all narcissistic numbers in 1985, which took 30 minutes then (equivalent to about 1s today) – I think there is clearly a much more efficient algorithm for finding these numbers if that is the case, fun for a rainy day!