Sunday 1 March 2009

Prime factorization, a comparison between Haskell and Scala

Haskell is a powerful and expressive pure lazy functional programming language with some interesting properties. http://www.haskell.org/ Haskell is of course named after the famous logician Haskell Curry http://en.wikipedia.org/wiki/Haskell_Curry who's name can be found in both the Haskell language and the process of "currying" functions.

By way of example and comparison the task of finding prime numbers is shown in Haskell and then its equivalent is shown in Scala.

We can see the conciseness of Haskell and the power of lazyness in expressions of the form:

[3..]

which creates an infinite list of integers from 3 upwards. This seeming impossibility can be handled in a lazy functional language because the infinite list is not evaluated up front, rather the list is only evaluated when needed so the list is finite at any given point in time.

So, in essence a lazy language is one where an expression is only evaluated when it's needed, unlike an imperative language where it's evaluated when declared.


Haskell: prime factorization example:


divides :: Integer -> Integer -> Bool
divides d n = rem n d == 0

ld :: Integer -> Integer
ld n = ldf 2 n

ldf :: Integer -> Integer -> Integer
ldf k n | divides k n = k
| k^2 > n = n
| otherwise = ldf (k+1) n

factors :: Integer -> [Integer]
factors n | n < 1 = error "Invalid argument: argument must be positive"
| n == 1 = []
| otherwise = p : factors (div n p) where p = ld n


If the above is saved in a file (called say primefactorization) and loaded into Hugs (Hugs is a Haskell interpreter shell based on Haskell 98 - http://www.haskell.org/hugs/) using the :l command as follows:

Main> :l c:\Dev\haskell\primefactorisation


Scala: prime factorization example:


package test

object PrimeFactorisationTest {

def divides(d : Int, n : Int) = {
(n % d) == 0
}

def ld(n : Int) : Int = {
ldf(2, n)
}

def ldf(k : Int, n : Int) : Int = {
if (divides(k, n)) k
else if ((k*k) > n) n
else ldf((k+1), n)
}

def factors(n : Int) : List[Int] = n match {
case 1 => Nil;
case _ => {
val p = ld(n)
p :: factors(n / p)
}
}

def main(args : Array[String]) {
if (args.length == 1)
{
val n = Integer.parseInt(args(0))
println(factors(n))
}
}
}


Results:



Main> factors 12343434
[2,3,79,26041]


Both programs produce the same output and are functionally equivalent - it's also clear to see there's a reasonably similarity in both conciseness and expressiveness of the two languages, where Haskell possibly has the slight edge in some respects such as use of the "where" construct.


Update:

Following on from the excellent comments on the Haskell version from Don Stewart, turning the code into something that can be compiled and executed natively, rather than being interpreted using Hugs, roughly involves the following steps.

Install GHC http://www.haskell.org/ghc/

Rename the file to have a .hs file extension

Add a main that can process the arguments:

main = do
[n] <- getArgs
print (factors (read n))

add import to provide environment functions such as getArgs to the top of the file:

import System.Environment

Compile the file:

$ghc --make primefactorisation.hs

Run the executable:

$ time ./primefactorisation.exe 12343434
[2,3,79,26041]

real 0m0.140s
user 0m0.015s
sys 0m0.015s

and there we have it, a functional executable.

Don Stewart also makes a good point about being able to leave out all the type information.

In particular Haskell uses the powerful Hindley-Milner type inference algorithm to allow minimal (often zero) use of explicit type declarations.

http://c2.com/cgi/wiki?HindleyMilnerTypeInference


.

1 comment:

Don Stewart said...

You might want to just leave off those type annotations, since they're fully inferred:

import System.Environment

divides d n = rem n d == 0

ld n = ldf 2 n

ldf k n
     | divides k n = k
     | k^2 > n = n
     | otherwise = ldf (k+1) n

factors n
     | n < 1 = error "Invalid argument: argument must be positive"
     | n == 1 = []
     | otherwise = p : factors (div n p) where p = ld n

main = do
     [n] <- getArgs
     print (factors (read n))

And I'd strongly consider using GHC instead of Hugs (hugs is really a toy).


$ ghc -O2 --make A.hs
[1 of 1] Compiling Main ( A.hs, A.o )
Linking A ...

$ time ./A 12343434
[2,3,79,26041]
./A 12343434 0.00s user 0.00s system 162% cpu 0.002 total