Lists in Haskell


Contents


Lists and List Types

We have seen tuples as one way of building composite types in Haskell. Another kind of composite types are the list types. Lists in Haskell are homogeneous (i.e., all elements in a list are of the same types), so we have types such as lists of Integers, lists of Characters, lists of lists of Integers, and so on. If A is a type, then the type of lists of As is denoted [A]. For example, [Integer] is the type of lists of integers, [Char] is the type of lists of characters (or strings), and [[Float]] is the type of lists of lists of floating-point numbers. Lists are also dynamic; unlike arrays in imperative languages, there is no bound on the length of lists in Haskell, and elements can be added and removed from lists.

There are three different notations that can be used for writing lists in Haskell:

  1. "cons notation": for example, 1 : 3 : 8 : 4 : []
  2. "Square brackets notation": for example, [1,3,8,4]
  3. "String notation" (only for [Char]): for example, "hello"
Note that these are just three different ways of denoting the same thing. The type String in Haskell is an alias for [Char], and the notation "hello" for
    ['h', 'e', 'l', 'l', 'o']
has been introduced as a convenience for the programmer: as well as requiring fewer keystrokes, the first notation is easier to read, and the interpreter takes care to print out all lists of characters in string notation. Similarly, the square-brackets notation is more convenient and readable than the "cons" notation, and again the interpreter will translate from the cons notation to the square-brackets notation (and again to string notation in the case of lists of Char):

Prelude> 1 : 2 : 4 : []
[1,2,4]
Prelude> 'c' : 'a' : 't' []
"cat"
Prelude> ['c', 'a', 't']
"cat"

However, it is the "cons" notation that is the most fundamental, and it is this notation that is used in pattern-matching. The cons notation directly reflects the structure of lists: a list is either empty (the empty list is written [], and this represents the empty list of any type), or it is not empty, in which case it has a first element (called the head), and a remainder (called the tail). That is, every list is either empty ([]) or is of the form h:t for some element h and list t. The constant [] and the "cons" operator (:) are examples of constructors.


Constructors

Constructors are special operations that have two defining properties:

For example, all elements of type (Integer,Char) can be written down using the tuple (_,_) constructor in the form (i,c), where i is an constant of type Integer and c is a constant of type Char. Moreover, if we type such an expression, e.g., (-3,'d') at the hugs prompt, the fully-evaluated result is (-3,'d') itself.

The constructors for lists are [] and cons (:). As we saw, a list is either:

Although it looks as though a list such as 1:2:4:[] is evaluated to [1,2,4], the latter is just a notational convenience, and the "real" list is still 1:2:4:[].


Pattern-Matching

We have seen constructors in patterns when we looked at tuple types; the general definition of a pattern is any expression built from variables (names), constants and constructors. The following are all examples of patterns:

Expressions match with patterns when they have the same form, and the effect is that any variables in the pattern become bound to subexpressions. The rules for matching are:

Pattern-matching is useful in defining recursive functions over lists; for example, a function to add up all the numbers in a list of numbers could be defined as follows:

-- add up all the numbers in a list

sumAll :: [Integer] -> Integer

sumAll []  =  0
sumAll (x:xs)  =  x + sumAll xs


Some Functions on Lists

Haskell has a number of useful predefined functions and operators on lists. Besides the cons opertor that adds an element to the start of a list, there is an infix operator ++ that concatenates two lists (the two lists must be of the same type). For example,
    [1,3,5,7] ++ [9,11]  =  [1,3,5,7,9,11] .
Another infix operator is !!, where xs!!n returns the nth element of the list xs, counting from 0; for example,
    [1,3,5,7] !! 1  =  3
Definitions for the predefined functions are:

-- the number of elements in a list:
-- e.g., length [2,4,6] = 3

length []  =  0
length (x:xs)  =  1 + length xs


-- "flatten" a list of lists into a list
-- e.g., concat [[1,2], [], [3], [4,5]
--     = [1,2] ++ [] ++ [3] ++ [4,5]
--     = [1,2,3,4,5]

concat []  =  []
concat (l:ls)  =  l ++ concat ls


-- return the first element in a list,
-- giving an error if the list is empty
-- e.g., head "string" = 's'

head []  =  error "head []"
head (x:xs)  =  x


-- return all but the first element in a list,
-- giving an error if the list is empty
-- e.g., tail "string" = "tring"

tail []  =  error "tail []"
tail (x:xs)  =  xs


-- take two lists and make a list of pairs
-- e.g., zip [1,2] [3,4,5] = [(1,3),(2,4)]

zip xs []  =  []
zip [] ys  =  []
zip (x:xs) (y:ys)  =  (x,y) : zip xs ys


-- take a list of pairs and return a pair of lists
-- e.g., unzip [(1,3),(2,4)] = ([1,2], [3,4])

unzip []  =  ([], [])
unzip ((x,y):ps)  =  (x:l1, y:l2)
  where
  (l1,l2) = unzip ps


List Comprehensions

Lists are to declarative languages what arrays are to imperative languages. Becuase lists are so widely used, a special notation has been introduced in some languages (such as Miranda and Haskell) for constructing lists. This notation is called list comprehension (in analogy with set comprehension in set theory). Suppose xs is the list [1,2,3,4,5,6], then the set comprehension

    [ 2*x | x <- xs ]
denotes the list [2,4,6,8,10,12], obtained by multiplying each list in xs by 2. In a list comprehension [ E | P <- LExp ], we refer to E as the body and P <- lExp as the selector; note that LExp can be any expression of a list type, P can be a pattern, and E any expression containing the variables in P. To illustrate the use of a pattern, here's a function that takes a list of pairs of Integers and adds each pair:

-- add each pair in a list

addPairs :: [(Integer,Integer)] -> [Integer]

addPairs ps  =  [ x+y | (x,y) <- ps ]

The selector can be combined with one or more guards, separated by commas:

    [ 2*x | x <- xs, even x, x < 5]
If xs is the list [1,2,3,4,5,6] then the result will be the list [4,8] (since only the elements 2 and 4 of xs make both guards true).

There can be more than one selector in a list comprehension, again separated by commas; for example,

Prelude> [ (x,y) | x <- [1,2,3], y <- [2,3] ]
[(1,2), (1,3), (2,2), (2,3), (3,2), (3,3)]

Prelude> [ [ (x,y) | y <- [2,3] ] | x <- [1,2,3] ]
[ [(1,2), (1,3)], [(2,2), (2,3)], [(3,2), (3,3)] ]

Prelude> concat [ [ (x,y) | y <- [2,3] ] | x <- [1,2,3] ]
[(1,2), (1,3), (2,2), (2,3), (3,2), (3,3)]


Polymorphism

Haskell is a strongly typed language, which means that every expression has a type, and that the interpreter can infer the type of an expression from the types of its subexpressions. All the list operators, functions and expressions that we've looked at so far have types, but up until now we've omitted any mention of types, simply saying things like "the cons operator takes an element and a list and puts the element at the head of the list" - but how can we write down the Haskell type of the cons operator, or of the length function? Let's start by looking at the length function.

Clearly, length returns a number of some kind; in fact, it returns a 32-bit integer, i.e., an Int. And clearly it takes a list as argument, but what kind of list? The answer is that it takes any kind of list as argument:

Prelude> length [2,4,8]
3
Prelude> length "word"
4
Prelude> length [(1,2)]
1
Prelude> length [ [1,2], [2,3,4], [], [4], [4,5,6] ]
5
Prelude> length []
0
Prelude> :type length
length :: [a] -> Int

In the last line above, we ask the interpreter what the type of length is, and the response is

    length :: [a] -> Int
The type of length's argument is [a], which means a list whose elements are of any type a. This a is simply a name standing for any Haskell type; when we evaluate length "word", for example, the a stands for (or more precisely, is instantiated to) Char, and when we evaluate length [(1,2)], the a stands for (is instantiated to) (Int,Int). This is an example of polymorphism, where a function can have more than one type. Note that length can not be applied to elements of any type; it can only be applied to any list type (which is why the type of its argument is [a] and not just a). The "type variable" a is similar to the generic types in Ada.

Similarly, the type of the cons operator is

    a -> [a] -> [a]
which means that it takes an element of some type a and a list with the same type of elements, and returns a list of the same type. For example, instantiations of this type include:
    Int -> [Int] -> [Int]
    Char -> [Char] -> [Char]
    (Int,Bool) -> [(Int,Bool)] -> [(Int,Bool)]
    [Float] -> [[Float]] -> [[Float]]
but not:
    Int -> [Bool] -> [Bool]
    Int -> [Bool] -> [Int]
    [Int] -> [Int] -> [Int]

Recall the zip function above that takes two lists and "zips" them together into a list of pairs. Clearly the two lists can have different types; we would expect valid instantiations of zip's type to include:

    [Int] -> [Char] -> [(Int,Char)]
    [Bool] -> [Float] -> [(Bool,Float)]
    [Int] -> [[Int]] -> [(Int,[Int])]
as well as allowing the case where the two lists do have the same type:
    [Int] -> [Int] -> [(Int,Int)]
    [Bool] -> [Bool] -> [(Bool,Bool)]
etc. The polymorphic type of zip is
    zip :: [a] -> [b] -> [(a,b)]
where the use of two type variables, a and b, allows the two argument lists to have different types (but note again that they must both be list types).

Exercises

  1. Write a function similar to sumAll that takes a list of numbers and returns the result of multiplying together all the numbers in the list.

  2.  
  3. Write a function
        shortest :: [[a]] -> [a]
    
    that takes a list of lists and returns the shortest list in the list (and returns [] if the list of lists is empty).

  4.  
  5. Write a function
        occursIn :: a -> [a] -> Bool
    
    that takes an element x and a list xs and returns True if x occurs in xs, and False otherwise. (Do not declare the type of the function in your source code; the reason for this will be covered in a later lecture.)
     
  6. Write a function
        occurs :: a -> [[a]] -> [[a]] 
    
    that takes an element x and a list of listsxss and returns the list containing all those lists in which x occurs. Give a recursive definition, then give anither definition using a list comprehension. (Again, do not declare the type of the function in your source code; the reason for this will be covered in a later lecture.)

  7.  
  8. Write a function
        crossProduct :: [a] -> [b] -> [(a,b)]
    
    that takes two lists xs and ys, and returns the list of all possible pairings:
        [ (x,y) | x <- xs, y <- ys ]
    
    without using the above list comprehension. (As an exercise in problem decomposition, try first defining a "helper" function
        pairWithAll :: a -> [b] -> [(a,b)]
    
    that pairs its first argument with each element in its second.)
     
  9. What are the types of the other functions in the list of predefined functions above?


 
Grant Malcolm
Last modified: Fri Nov 9 15:59:51 GMT 2001