Island of Cannibal Horses: June 2010

Wednesday, June 23, 2010

Project Euler: Problem 10

http://projecteuler.net/index.php?section=problems&id=10

The sum of the primes below 10 is 2 + 3 + 5 + 7 = 17.

Find the sum of all the primes below two million.

Another "prime number" question. Let's reuse and adapt our solution from problem 7:

def f1(n):
    # Define test for primeness
    def isprime(x):
        if x <= 1: return False

        c = 2

        while c <= x**0.5:
            if x%c == 0:
                return False

            if c != 2:
                c += 2
            else:
                c += 1

        return True

    # Iterate over odd numbers and test for primeness
    res = 2

    for i in xrange(3,int(n),2):
        if isprime(i):
            res += i

    # Return the sum
    return res

This is starting to feel slow, though. On my machine, the correct result is returned in about 47 seconds. I'm gonna have to either better optimize my prime generation algorithm or find a new one altogether.

Project Euler: Problem 9

http://projecteuler.net/index.php?section=problems&id=9

A Pythagorean triplet is a set of three natural numbers, a < b < c, for which, a^(2) + b^(2) = c^(2) For example, 3^(2) + 4^(2) = 9 + 16 = 25 = 5^(2). There exists exactly one Pythagorean triplet for which a + b + c = 1000. Find the product abc.

To generate the triplets, we'll use Euclid's formula (http://en.wikipedia.org/wiki/Pythagorean_triple#Generating_a_triple).

def f1(x):
    m,n = 1,0

    while True:
        m += 1

        # No solution?
        if (2*m*m + 2*m) > x:
            return None

        for n in xrange(1,m):
            # Euclid's formula
            a = m*m - n*n
            b = 2*m*n
            c = m*m + n*n

            if (a+b+c) == x:
                return a*b*c

Obviously there are many possible input values for which there are no solutions. In these cases, since 1 is the lowest possible value for n, we can check to see if m has become too large. That's the point of the "return None" statement above.

Project Euler: Problem 8

http://projecteuler.net/index.php?section=problems&id=8

Find the greatest product of five consecutive digits in the 1000-digit number.

7316717653133062491922511967442657474235534919493496983520312774506326239578318016984801869478851843858615607891129494954595017379583319528532088055111254069874715852386305071569329096329522744304355766896648950445244523161731856403098711121722383113622298934233803081353362766142828064444866452387493035890729629049156044077239071381051585930796086670172427121883998797908792274921901699720888093776657273330010533678812202354218097512545405947522435258490771167055601360483958644670632441572215539753697817977846174064955149290862569321978468622482839722413756570560574902614079729686524145351004748216637048440319989000889524345065854122758866688116427171479924442928230863465674813919123162824586178664583591245665294765456828489128831426076900422421902267105562632111110937054421750694165896040807198403850962455444362981230987879927244284909188845801561660979191338754992005240636899125607176060588611646710940507754100225698315520005593572972571636269561882670428252483600823257530420752963450

Python simplifies this a bit, though it wouldn't be too hard to implement in C/C++.

def f2(n):
    s = str(n)
    res = 0

    for i in xrange(len(s)-5):
        x = 1
        for j in xrange(5):
            x *= int(s[i+j])
            if x == 0:
                break
        res = max(res, x)

    return res

By breaking out of the second iteration if the product is zero (it ain't gettin' any bigger!), we shave off 9% of our execution time. On my machine, the result is returned in 5e-03 seconds.

Project Euler: Problem 7

http://projecteuler.net/index.php?section=problems&id=7

By listing the first six prime numbers: 2, 3, 5, 7, 11, and 13, we can see that the 6th prime is 13.

What is the 10001st prime number?

To my knowledge, there aren't any good (efficient) know methods of generating primes. This leaves only quasi-brute force methods. We'll keep iterating our test index (by 2 to only check odd numbers) and check for primeness.

def f1(n):
    # Define test for primeness
    def isprime(x):
        if x <= 1: return False

        c = 2

        while c <= x**0.5:
            if x%c == 0:
                return False

            if c != 2:
                c += 2
            else:
                c += 1

        return True

    # Iterate over odd numbers and test for primeness
    count = 1
    i = 1

    # Stop when we've found n primes
    while count < n:
        i += 2
        if isprime(i):
            count += 1

    # Return the last found prime
    return i

This obviously isn't very good, but it does return the 10001st prime in about 0.8 seconds.

Tuesday, June 22, 2010

Project Euler: Problem 6

http://projecteuler.net/index.php?section=problems&id=6

The sum of the squares of the first ten natural numbers is,
1^(2) + 2^(2) + ... + 10^(2) = 385

The square of the sum of the first ten natural numbers is,
(1 + 2 + ... + 10)^(2) = 55^(2) = 3025

Hence the difference between the sum of the squares of the first ten natural numbers and the square of the sum is 3025 − 385 = 2640.

Find the difference between the sum of the squares of the first one hundred natural numbers and the square of the sum.

This one is easy. Brute force--iterating over all values from 1 to n--is a non-starter. It can't scale. Instead, use the equations for mathematical series.

def f1(n):
    # Calculate the sum
    a = n*(n+1)/2

    # Calculate the sum of squares
    b = n*(n+1)*(2*n+1)/6

    # Return the square of sums minus the sum of squares
    return ( a*a - b )

This returns near instantaneously and is independent of n. I.e., it goes as O(1).

Project Euler: Problem 5

http://projecteuler.net/index.php?section=problems&id=5

2520 is the smallest number that can be divided by each of the numbers from 1 to 10 without any remainder.

What is the smallest positive number that is evenly divisible by all of the numbers from 1 to 20?

Thinking about this for a few minutes, the solution will be the product of the minimal set of prime factors needed to produce each possible value from 1 to n (in our case, n=20). Let's recycle that factorization code we wrote for p003 and add a little somethin' somethin'.

def f1(n):
    factors = {}

    for i in xrange(2, n+1):
        # Recycle factoring code from p003
        primes = []
        last = i

        while last != 1:
            try:
                c = primes[-1]
            except:
                c = 2

            while last % c != 0:
                if c==2:
                    c += 1
                else:
                    c += 2

            primes.append(c)
            last /= c

        # Iterate through the factors and group/accumulate
        for p in primes:
            c = primes.count(p)

            # Keep track of the least amount of factors needed
            if c > factors.get(p,0):
                factors[p] = c

    # Multiply together the minimal set of primes
    results = 1
    for p,c in factors.iteritems():
        results *= p**c

    return results

On my set-up, f1(20) returns in about 8.53e-05 seconds.

Something interesting: I investigated using the built-in SET object to get a unique list of members in 'primes'. That was actually slightly slower than simply iterating over all members in 'primes'. It would likely be quicker for larger, more redundant lists.

Project Euler: Problem 4

http://projecteuler.net/index.php?section=problems&id=4

A palindromic number reads the same both ways. The largest palindrome made from the product of two 2-digit numbers is 9009 = 91 x 99.

Find the largest palindrome made from the product of two 3-digit numbers.

First I tried a brute force method: iterate over unique pairs of three digit numbers and store the largest product palindrome.

def f1(n):
    res = 0

    for a in xrange(10**n - 1, 10**(n-1) - 1, -1):
        for b in xrange(a, 10**(n-1) - 1, -1):
            if str(a*b) == str(a*b)[::-1]:
                res = max(res,a*b)

    return res

I coded this to take in the number of digits for the input pairs. I don't know why; it seemed like the thing to do. This gets the answer, but brute force is usually a bad choice.

For my next iteration, I tried adding in some sanity checks to the a,b iterators. Specifically: if a*b is less than the current best answer, break out of the b iteration and decrement a. Furthermore, of a*a is less than the current best answer, stop altogether.

def f3(n):
    res = 0

    for a in xrange(10**n - 1, 10**(n-1) - 1, -1):
        if a*a < res:
            break

        for b in xrange(a, 10**(n-1) - 1, -1):
            if a*b < res:
                break

            if str(a*b) == str(a*b)[::-1]:
                res = max(res,a*b)

    return res

Much better! On my set-up, f1(3) returns in about 0.44 seconds. The better optimized version, f3(3), returns in about 0.0075 seconds.

As a final note, the largest palindrome made from the product of two 6-digit numbers is 999000000999. The largest palindrome made from the product of two 8-digit numbers is 9999000000009999. Nice symmetry!

Monday, June 21, 2010

Project Euler: Problem 3

http://projecteuler.net/index.php?section=problems&id=3

The prime factors of 13195 are 5, 7, 13 and 29.

What is the largest prime factor of the number 600851475143 ?

Prime factorization should be easy, right? I started with some old code I had (which was probably stolen off the Internet some time in the past):

def f1(n):
    factors = []
    last = n

    if n <  1: return 0
    if n == 1: return 1

    while last != 1:
        c = 2

        while last % c != 0:
            c += 1

        factors.append(c)
        last /= c

    return max(factors)

So far so good; it gets us the right answer. Look at those nasty inefficiencies, though. Specifically, we always start our search with 2 even after we've determined there are no more factors of two left. Also, after 2 we should only check odd candidates. Let's try again:

def f2(n):
    factors = []
    last = n

    if n <  1: return 0
    if n == 1: return 1

    while last != 1:
        try:
            c = factors[-1]
        except:
            c = 2

        while last % c != 0:
            if c==2:
                c += 1
            else:
                c += 2

        factors.append(c)
        last /= c

    return max(factors)

Much better! On my set-up, f1(600851475143) returns in about 3.14e-03 seconds. The better optimized version, f2(600851475143), returns in about 1.24e-03 seconds. We might be able to do better still if we first create a list of prime numbers (up to sqrt(n)) and only use those as possible candidates. I might try that later.

Project Euler: Problem 2

http://projecteuler.net/index.php?section=problems&id=2

Each new term in the Fibonacci sequence is generated by adding the previous two terms. By starting with 1 and 2, the first 10 terms will be:

1, 2, 3, 5, 8, 13, 21, 34, 55, 89, ...

Find the sum of all the even-valued terms in the sequence which do not exceed four million.

The Fibonacci sequence is a recursive sequence. Let's keep generating the sequence terms (starting with 0,1) and only add the even terms to our running total.

def f1(n, s0=0, s1=1):
    total = 0

    while s1 < n:
        s1 = s1+s0
        s0 = s1-s0

        if s1%2 is 0:
            total += s1

    return total

Surprise: this yields the correct answer. I was interested in the efficiency of the "evenness" test. Specifically, which is quicker: checking if n mod 2 is 0 or checking if n bitwise-and 0x1 is 1. Turns out, the modulo calculation is reliably quicker, though not by much.

On my set-up, f1(4e6) returns in about 1.0002e-05 seconds. Using the bitwise operation instead of the modulo check returns in about 1.0090e-5 seconds.

Project Euler: Problem 1

http://projecteuler.net/index.php?section=problems&id=1

If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23.

Find the sum of all the multiples of 3 or 5 below 1000.

The simplest, and naive, approach is to iterate over all numbers less than N (in this case, 1000) and test against modulo 3 and 5.

def f1(n):
    total = 0
    for i in xrange(1,n):
        # It's slightly quicker to check against 3 first 
        if i%3 is 0 or i%5 is 0:
            total += i
    return total

This yields the correct answer, but is relatively slow. Can we do better? Yes! Instead of iterating over all numbers from 1 to N, let's use the arithmetic series equation.

def f3(n):
    total = 0

    # Count the number of terms with d=3
    c = (n-1)//3
    # Add in the arithmetic series for d=3
    total += c*(6 + 3*(c-1) )/2

    # Count the number of terms with d=5
    c = (n-1)//5
    # Add in the arithmetic series for d=5
    total += c*(10 + 5*(c-1) )/2

    # Count the number of terms with d=15
    c = (n-1)//15
    # Subtract out the (double counted) series for d=15
    total -= c*(30 + 15*(c-1) )/2

    return total

On my set-up, f1(1000) returns in about 1.99e-04 seconds and f3(1000) returns in about 1.70e-6 seconds. Furthermore, f1(n) scales as O(n) while f3(n) scales as O(1). Nice!

Island of Cannibal Horses

About Me

Blog Archive