In the following sections we give some slightly more complex examples of list processing functions.
3.3.1. sort
Program 3.1 is a variant of the well-known quicksort algorithm. sort(X)
returns a sorted list of the elements of the list X
.
-module(sort).
-export([sort/1]).
sort([]) -> [];
sort([Pivot|Rest]) ->
{Smaller, Bigger} = split(Pivot, Rest),
lists:append(sort(Smaller), [Pivot|sort(Bigger)]).
split(Pivot, L) ->
split(Pivot, L, [], []).
split(Pivot, [], Smaller, Bigger) ->
{Smaller,Bigger};
split(Pivot, [H|T], Smaller, Bigger) when H < Pivot ->
split(Pivot, T, [H|Smaller], Bigger);
split(Pivot, [H|T], Smaller, Bigger) when H >= Pivot ->
split(Pivot, T, Smaller, [H|Bigger]).
The first element of the list to be sorted is used as a pivot. The original list is partitioned into two lists Smaller
and Bigger
: all the elements in Smaller
are less than Pivot
and all the elements in Bigger
are greater than or equal to Pivot
. The lists Smaller
and Bigger
are then sorted and the results combined.
The function split(Pivot, L)
returns the tuple {Smaller,Bigger}
, where all the elements in Bigger
are greater than or equal to Pivot
and all the elements in Smaller
are less than Pivot
. split(Pivot, L)
works by calling the auxiliary function split(Pivot, L, Smaller, Bigger)
. Two accumulators, Smaller
and Bigger
, are used to store the elements in L
which are smaller than and greater than or equal to Pivot
, respectively. The code in split/4
is very similar to that in reverse/2
except that two accumulators are used instead of one. For example:
> lists:split(7,[2,1,4,23,6,8,43,9,3]).
{[3,6,4,1,2],[9,43,8,23]}
If we call sort([7,2,1,4,23,6,8,43,9,3])
, the first thing which happens is that split/2
is called with pivot 7
. This results in two lists: [3,6,4,1,2]
whose elements are less than the pivot, 7
, and [9,43,8,23]
whose elements are greater than or equal to the pivot.
Assuming that sort
works then sort([3,6,4,1,2]) ⇒ [1,2,3,4,6]
and sort([9,43,8,23]) ⇒ [8,9,23,43]
. Finally, the sorted lists are appended with the call:
> append([1,2,3,4,6], [7 | [8,9,23,43]]).
[1,2,3,4,6,7,8,9,23,43]
With a little ingenuity the call to append
can be removed, as in the following:
qsort(X) ->
qsort(X, []).
%% qsort(A,B)
%% Inputs:
%% A = unsorted List
%% B = sorted list where all elements in B
%% are greater than any element in A
%% Returns
%% sort(A) appended to B
qsort([Pivot|Rest], Tail) ->
{Smaller,Bigger} = split(Pivot, Rest),
qsort(Smaller, [Pivot|qsort(Bigger,Tail)]);
qsort([], Tail) ->
Tail.
We can compare the performance of this with the first version of sort by using the BIF statistics/1
(see appendix, which provides information about the performance of the system). If we compile and run the code fragment:
...
statistics(reductions),
lists:sort([2,1,4,23,6,7,8,43,9,4,7]),
{_, Reductions1} = statistics(reductions),
lists:qsort([2,1,4,23,6,7,8,43,9,4,7]),
{_, Reductions2} = statistics(reductions),
...
We can find out how many reductions (function calls) it took to evaluate the call the sort
and qsort
functions. In our example sort
took 93
reductions and qsort
took 74
, a 20
percent improvement.
3.3.2. Sets
Program 3.2 is a simple collection of set manipulation functions. The obvious way to represent sets in Erlang is as an unordered list of elements without duplication.
The set manipulation functions are as follows:
new()
- Returns an empty set.
add_element(X, S)
- Adds an element
X
to the setS
and returns a new set. del_element(X, S)
- Deletes the element
X
from the setS
and returns a new set. is_element(X, S)
- Returns true if the element
X
is contained in the setS
, otherwisefalse
. is_empty(S)
- Returns true if the set
S
is empty otherwisefalse
. union(S1, S2)
- Returns the union of the sets
S1
andS2
, i.e. the set of all elements which are contained in eitherS1
orS2
. intersection(S1, S2)
- Returns the intersection of the sets
S1
andS2
, i.e. the set of all elements which are contained in bothS1
andS2
.
Strictly speaking, we should not say new returns an empty set but rather new
returns a representation of an empty set. If we represent the sets as lists, then the set operations can be written as follows:
-module(sets).
-export([new/0, add_element/2, del_element/2,
is_element/2, is_empty/1, union/2, intersection/2]).
new() -> [].
add_element(X, Set) ->
case is_element(X, Set) of
true -> Set;
false -> [X|Set]
end.
del_element(X, [X|T]) -> T;
del_element(X, [Y|T]) -> [Y|del_element(X,T)];
del_element(_, []) -> [].
is_element(H, [H|_]) -> true;
is_element(H, [_|Set]) -> is_element(H, Set);
is_element(_, []) -> false.
is_empty([]) -> true;
is_empty(_) -> false.
union([H|T], Set) -> union(T, add_element(H, Set));
union([], Set) -> Set.
intersection(S1, S2) -> intersection(S1, S2, []).
intersection([], _, S) -> S;
intersection([H|T], S1, S) ->
case is_element(H,S1) of
true -> intersection(T, S1, [H|S]);
false -> intersection(T, S1, S)
end.
Running the code in Program 3.2:
> S1 = sets:new().
[]
> S2 = sets:add_element(a, S1).
[a]
> S3 = sets:add_element(b, S2).
[b,a]
> sets:is_element(a, S3).
true
> sets:is_element(1, S2).
false
> T1 = sets:new().
[]
> T2 = sets:add_element(a, T1).
[a]
> T3 = sets:add_element(x, T2).
[x,a]
> sets:intersection(S3, T3).
[a]
> sets:union(S3,T3).
[b,x,a]
This implementation is not particularly efficient, but it is sufficiently simple to be (hopefully) correct. At a later stage it could be replaced by a more efficient version.
3.3.3. Prime numbers
In our final example (Program 3.3) we see how a list of prime numbers can be generated using the sieve of Eratosthenes algorithm.
-module(siv).
-compile(export_all).
range(N, N) ->
[N];
range(Min, Max) ->
[Min | range(Min+1, Max)].
remove_multiples(N, [H|T]) when H rem N == 0 ->
remove_multiples(N, T);
remove_multiples(N, [H|T]) ->
[H | remove_multiples(N, T)];
remove_multiples(_, []) ->
[].
sieve([H|T]) ->
[H | sieve(remove_multiples(H, T))];
sieve([]) ->
[].
primes(Max) ->
sieve(range(2, Max)).
Note that in Program 3.3 we use the compiler annotation -compile(export_all)
– this implicitly exports all functions in the module so they can be called without giving explicit export declarations.
range(Min, Max)
returns a list of the integers between Min
and Max
. remove_multiples(N, L)
removes all multiples of N
from the list L
:
> siv:range(1,15).
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]
> siv:remove_multiples(3,[1,2,3,4,5,6,7,8,9,10]).
[1,2,4,5,7,8,10]
sieve(L)
retains the head of the list L
and recursively removes all multiples of the head of the list from the sieved tail of the list:
> siv:primes(25).
[2,3,5,7,11,13,17,19,23]