We have seen tuples as one way of building
composite types in Haskell.
Another kind of composite types are the list types.
Lists in Haskell are homogeneous (i.e., all elements in a list are of the
same types), so we have types such as lists of Integers, lists of Characters,
lists of lists of Integers, and so on.
If A is a type, then the type of lists of As
is denoted [A].
For example, [Integer] is the type of lists of integers,
[Char] is the type of lists of characters (or strings), and
[[Float]] is the type of lists of lists of floating-point
numbers.
Lists are also dynamic; unlike arrays in imperative languages, there is no
bound on the length of lists in Haskell, and elements can be added and removed
from lists.
There are three different notations that can be used for writing lists in Haskell:
1 : 3 : 8 : 4 : [][1,3,8,4][Char]): for example,
"hello"String in Haskell is an alias for
[Char], and the notation "hello" for
['h', 'e', 'l', 'l', 'o']
has been introduced as a convenience for the programmer:
as well as requiring fewer keystrokes, the first notation is easier to read,
and the interpreter takes care to print out all lists of characters in
string notation.
Similarly, the square-brackets notation is more convenient and readable
than the "cons" notation, and again the interpreter will translate
from the cons notation to the square-brackets notation (and again to
string notation in the case of lists of Char):
Prelude> 1 : 2 : 4 : [] [1,2,4] Prelude> 'c' : 'a' : 't' [] "cat" Prelude> ['c', 'a', 't'] "cat" |
However, it is the "cons" notation that is the most fundamental,
and it is this notation that is used in pattern-matching.
The cons notation directly reflects the structure of lists:
a list is either empty (the empty list is written [],
and this represents the empty list of any type),
or it is not empty, in which case it has a first element
(called the head), and a remainder (called the tail).
That is, every list is either empty ([]) or is of the
form h:t for some element h and list
t.
The constant [] and the "cons" operator (:)
are examples of constructors.
Constructors are special operations that have two defining properties:
(Integer,Char)
can be written down using the tuple (_,_) constructor
in the form (i,c), where i is an constant
of type Integer and c is a constant of type
Char.
Moreover, if we type such an expression, e.g., (-3,'d')
at the hugs prompt, the fully-evaluated result is (-3,'d')
itself.
The constructors for lists are [] and cons (:).
As we saw, a list is either:
[]), ora:l
for some element a and list l.1:2:4:[]
is evaluated to [1,2,4], the latter is just a notational
convenience, and the "real" list is still 1:2:4:[].
We have seen constructors in patterns when we looked at tuple types; the general definition of a pattern is any expression built from variables (names), constants and constructors. The following are all examples of patterns:
x0(x,y)(x,1)[]x:lx:y:[](x,y):lExpressions match with patterns when they have the same form, and the effect is that any variables in the pattern become bound to subexpressions. The rules for matching are:
E matches with a variable, and E
is bound to that variable (for example, 24+5
matches a variable x);E matches a constant only if E
is equal to that constant (for example, only 0
matches the constant 0);E matches a pattern
(p1,p2)
if E is of the form
(E1,E2)
and E1 matches p1 and E2 matches
p2 (for example, ([1,9],1) matches
the pattern (x,1) and the variable x
is bound to the list [1,9]);E matches a pattern
p1:p2
if E is of the form E1:E2
and E1 matches p1
and E2 matches p2
(for example, the list 1:4:9:[] matches the pattern
x:y:l, with x bound to 1,
y bound to 4 and
l bound to 9:[]).
-- add up all the numbers in a list sumAll :: [Integer] -> Integer sumAll [] = 0 sumAll (x:xs) = x + sumAll xs |
++ that concatenates two lists
(the two lists must be of the same type). For example,
[1,3,5,7] ++ [9,11] = [1,3,5,7,9,11] .
Another infix operator is !!, where xs!!n
returns the nth element of the list xs,
counting from 0; for example,
[1,3,5,7] !! 1 = 3
Definitions for the predefined functions are:
-- the number of elements in a list: -- e.g., length [2,4,6] = 3 length [] = 0 length (x:xs) = 1 + length xs -- "flatten" a list of lists into a list -- e.g., concat [[1,2], [], [3], [4,5] -- = [1,2] ++ [] ++ [3] ++ [4,5] -- = [1,2,3,4,5] concat [] = [] concat (l:ls) = l ++ concat ls -- return the first element in a list, -- giving an error if the list is empty -- e.g., head "string" = 's' head [] = error "head []" head (x:xs) = x -- return all but the first element in a list, -- giving an error if the list is empty -- e.g., tail "string" = "tring" tail [] = error "tail []" tail (x:xs) = xs -- take two lists and make a list of pairs -- e.g., zip [1,2] [3,4,5] = [(1,3),(2,4)] zip xs [] = [] zip [] ys = [] zip (x:xs) (y:ys) = (x,y) : zip xs ys -- take a list of pairs and return a pair of lists -- e.g., unzip [(1,3),(2,4)] = ([1,2], [3,4]) unzip [] = ([], []) unzip ((x,y):ps) = (x:l1, y:l2) where (l1,l2) = unzip ps |
Lists are to declarative languages what arrays are to imperative languages.
Becuase lists are so widely used, a special notation has been introduced
in some languages (such as Miranda and Haskell) for constructing lists.
This notation is called list comprehension (in analogy with
set comprehension in set theory).
Suppose xs is the list [1,2,3,4,5,6],
then the set comprehension
[ 2*x | x <- xs ]
denotes the list [2,4,6,8,10,12], obtained by multiplying
each list in xs by 2.
In a list comprehension [ E | P <- LExp ],
we refer to E as the body and P <- lExp
as the selector; note that LExp can be any expression
of a list type, P can be a pattern, and E
any expression containing the variables in P.
To illustrate the use of a pattern, here's a function that takes a
list of pairs of Integers and adds each pair:
-- add each pair in a list addPairs :: [(Integer,Integer)] -> [Integer] addPairs ps = [ x+y | (x,y) <- ps ] |
The selector can be combined with one or more guards, separated by commas:
[ 2*x | x <- xs, even x, x < 5]
If xs is the list [1,2,3,4,5,6] then the result
will be the list [4,8] (since only the elements 2
and 4 of xs make both guards true).
There can be more than one selector in a list comprehension, again separated by commas; for example,
Prelude> [ (x,y) | x <- [1,2,3], y <- [2,3] ] [(1,2), (1,3), (2,2), (2,3), (3,2), (3,3)] Prelude> [ [ (x,y) | y <- [2,3] ] | x <- [1,2,3] ] [ [(1,2), (1,3)], [(2,2), (2,3)], [(3,2), (3,3)] ] Prelude> concat [ [ (x,y) | y <- [2,3] ] | x <- [1,2,3] ] [(1,2), (1,3), (2,2), (2,3), (3,2), (3,3)] |
Haskell is a strongly typed language, which means that every expression
has a type, and that the interpreter can infer the type of an expression
from the types of its subexpressions.
All the list operators, functions and expressions that we've looked at
so far have types, but up until now we've omitted any mention of types,
simply saying things like "the cons operator takes an element and a list
and puts the element at the head of the list" - but how can we write down
the Haskell type of the cons operator, or of the length
function?
Let's start by looking at the length function.
Clearly, length returns a number of some kind;
in fact, it returns a 32-bit integer, i.e., an Int.
And clearly it takes a list as argument, but what kind of list?
The answer is that it takes any kind of list as argument:
Prelude> length [2,4,8] 3 Prelude> length "word" 4 Prelude> length [(1,2)] 1 Prelude> length [ [1,2], [2,3,4], [], [4], [4,5,6] ] 5 Prelude> length [] 0 Prelude> :type length length :: [a] -> Int |
In the last line above, we ask the interpreter what the type of
length is, and the response is
length :: [a] -> Int
The type of length's argument is [a],
which means a list whose elements are of any type a.
This a is simply a name standing for any Haskell type;
when we evaluate length "word", for example, the
a stands for (or more precisely, is instantiated to)
Char, and when we evaluate
length [(1,2)], the a stands for
(is instantiated to) (Int,Int).
This is an example of polymorphism, where a function can
have more than one type.
Note that length can not be applied to
elements of any type; it can only be applied to any
list type (which is why the type of its argument is
[a] and not just a).
The "type variable" a is similar to the generic types
in Ada.
Similarly, the type of the cons operator is
a -> [a] -> [a]
which means that it takes an element of some type a
and a list with the same type of elements, and returns a
list of the same type.
For example, instantiations of this type include:
Int -> [Int] -> [Int]
Char -> [Char] -> [Char]
(Int,Bool) -> [(Int,Bool)] -> [(Int,Bool)]
[Float] -> [[Float]] -> [[Float]]
but not:
Int -> [Bool] -> [Bool]
Int -> [Bool] -> [Int]
[Int] -> [Int] -> [Int]
Recall the zip function above that takes two lists and
"zips" them together into a list of pairs.
Clearly the two lists can have different types;
we would expect valid instantiations of zip's type to include:
[Int] -> [Char] -> [(Int,Char)]
[Bool] -> [Float] -> [(Bool,Float)]
[Int] -> [[Int]] -> [(Int,[Int])]
as well as allowing the case where the two lists do have the same type:
[Int] -> [Int] -> [(Int,Int)]
[Bool] -> [Bool] -> [(Bool,Bool)]
etc.
The polymorphic type of zip is
zip :: [a] -> [b] -> [(a,b)]
where the use of two type variables, a and b,
allows the two argument lists to have different types (but note again
that they must both be list types).
sumAll
that takes a list of numbers and returns the result of multiplying
together all the numbers in the list.
shortest :: [[a]] -> [a]
that takes a list of lists and returns the shortest list in the list
(and returns [] if the list of lists is empty).
occursIn :: a -> [a] -> Bool
that takes an element x and a list xs
and returns True if x occurs in xs,
and False otherwise.
(Do not declare the type of the function in your source code;
the reason for this will be covered in a later lecture.)
occurs :: a -> [[a]] -> [[a]]
that takes an element x and a list of listsxss
and returns the list containing all those lists in which x
occurs.
Give a recursive definition, then give anither definition using
a list comprehension.
(Again, do not declare the type of the function in your source code;
the reason for this will be covered in a later lecture.)
crossProduct :: [a] -> [b] -> [(a,b)]
that takes two lists xs and ys,
and returns the list of all possible pairings:
[ (x,y) | x <- xs, y <- ys ]
without using the above list comprehension.
(As an exercise in problem decomposition, try first defining a
"helper" function
pairWithAll :: a -> [b] -> [(a,b)]
that pairs its first argument with each element in its second.)