Types, pattern matching, and polymorphism

Algebraic types give us a very concise way to model composite types, even recursive ones. Pattern matching makes it easy to work with algebraic types. Type classes enable both the fundamental types of polymorphism: parametric and ad-hoc.

Together, these capabilities allow us to easily express many of the Gang of Four patterns.

Algebraic types and pattern matching

Algebraic data types can express a combination of types, for example:

type Name = String
type Age = Int
data Person = P String Int -- combination

They can also express a composite of alternatives:

data MaybeInt = NoInt | JustInt Int

Here, each alternative represents a valid constructor of the algebraic type:

maybeInts = [JustInt 2, JustInt 3, JustInt 5, NoInt]

Type combination is also known as "product of types" and type alternation as "sum of types". In this way, we can create an "algebra of types", with sum and product as operators, hence the name Algebraic data types.

By parameterizing algebraic types, we create generic types:

data Maybe' a = Nothing' | Just' a

Algebraic data type constructors also serve as "deconstructors" in pattern matching:

fMaybe f (Just' x) = Just' (f x)
fMaybe f Nothing' = Nothing'

fMaybes = map (fMaybe (* 2)) [Just' 2, Just' 3, Nothing']

On the left of the = sign we deconstruct; on the right, we construct. In this sense, pattern matching is the complement of algebraic data types: they are two sides of the same coin.

Recursive types

We capture the "composite pattern" very succinctly by creating recursive algebraic types, for example:

data Tree a = Leaf a | Branch (Tree a) (Tree a)

This pattern describes the need to sometimes unify a composite structure with individual members of that structure. In this case, we're unifying Leaf (a leaf being a part of a tree) and Tree (the composite structure). Now we can write functions that act on trees and leaves:

size :: Tree a -> Int
size (Leaf x) = 1
size (Branch t u) = size t + size u + 1

Functions over recursive types are typically recursive themselves.

Polymorphism

Polymorphism points at the phenomenon of something taking many forms. In Haskell, there are two kinds of polymorphism: parametric and ad-hoc (first described by Strachey in Fundamental Concepts in Programming Languages, 1967).

Parametric polymorphism

In the following code, we have a function defined on a list of any type. The function is defined at such a high level of abstraction that the precise input type simply never comes into play, yet the result is of a particular type:

length' :: [a] -> Int
length' [] = 0
length' (x:xs) = 1 + length xs

The length object exhibits parametric polymorphism because it acts uniformly on a range of types that share a common structure, in this case, lists:

length' [1,2,3,5,7]
length' ['1','2','3','5','7']

In this sense, length is a generic function. Functions defined on parametric data types tend to be generic.

Ad-hoc polymorphism

	"Wadler conceived of type classes in a conversation with Joe Fasel. Fasel had in mind a different idea, but it was he who had the key insight that overloading should be reflected in the type of the function. Wadler misunderstood what Fasel had in mind, and type classes were born!"
	--History of Haskell, Hudak et al.

The canonical example of ad-hoc polymorphism (also known as "overloading") is that of the polymorphic + operator, defined for all Num variables:

class Num a where
   (+) :: a -> a -> a

instance Int Num where
   (+) :: Int → Int → Int
   x + y = intPlus x y

instance Float Num where
   (+) :: Float → Float → Float
   x + y = floatPlus x y

In fact, the introduction of type classes into Haskell was driven by the need to solve the problem of overloading numerical operators and equality.

When we call (+) on two numbers, the compiler will dispatch evaluation to the concrete implementation, based on the types of numbers being added:

let x_int = 1 + 1        -- dispatch to 'intPlus'
let x_float = 1.0 + 2.5  -- dispatch to 'floatPlus'
let x  = 1 + 3.14        –- dispatch to 'floatPlus'

In the last line, we are adding what looks like an int to a float variable. In many languages, we'd have to resort to explicit coercion (of int to float, say) to resolve this type of "mismatch". In Haskell, this is resolved by treating the value of 1 as a type-class polymorphic value:

ghci> :t 1 -- Num a => a

1 is a generic value (a Num variable); whether 1 is an int variable or a float variable (or a fractional, say) depends on the context in which it will appear.

Alternation-based ad-hoc polymorphism

There are two kinds of ad-hoc polymorphism. We've seen the first type already in this chapter:

data Maybe' a = Nothing' | Just' a
fMaybe f (Just' x) = Just' (f x)
fMaybe f Nothing' = Nothing'

The fMaybe function is polymorphically defined over the alternations of Maybe. In order to directly contrast the two kinds of polymorphism, let's carry this idea over into another example:

data Shape = Circle Float | Rect Float Float

area :: Shape -> Float
area (Circle r) = pi * r^2
area (Rect length width) = length * width

The area function is dispatched over the alternations of the Shape type.

Class-based ad-hoc polymorphism

We could also have achieved a polymorphic area function over shapes in this way:

data Circle = Circle Float
data Rect = Rect Float Float

class Shape a where
  area :: a -> Float

instance Shape Circle where
  area (Circle r) = pi * r^2
instance Shape Rect where
  area (Rect length' width') = length' * width'

Tip

Downloading the example code

You can download the example code files from your account at http://www.packtpub.com for all the Packt Publishing books you have purchased. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

Instead of unifying shapes with an algebraic "sum of types", we created two distinct shape types and unified them through a Shape class. This time the area function exhibits class-based polymorphism.

Alternation-based versus class-based

It is tempting to ask "which approach is best?" Instead, let's explore the important ways in which they differ:

	Alternation-based	Class-based
Different coupling between function and type	The function type refers to the algebraic type `Shape` and then defines special cases for each alternative.	The function type is only aware of the type it is acting on, not the `Shape` "super type".
Distribution of function definition	The overloaded functions are defined together in one place for all alternations.	Overloaded functions all appear in their respective class implementations. This means a function can be overloaded in very diverse parts of the codebase if need be.
Adding new types	Adding a new alternative to the algebraic type requires changing all existing functions acting directly on the algebraic "super type"	We can add a new type that implements the type class without changing any code in place (only adding). This is very important since it enables us to extend third-party code.
Adding new functions	A perimeter function acting on `Shape` won't be explicitly related to area in any way.	A perimeter function could be explicitly related to area by adding it to the `Shape` class. This is a powerful way of grouping functions together.
Type expressivity	This is useful for expressing simple type hierarchies.	We can have multiple, orthogonal hierarchies, each implementing the type class (For example, we can express multiple-inheritance type relations). This allows for modeling much richer data types.

Polymorphic dispatch and the visitor pattern

While exploring ad-hoc polymorphism, we saw how we can simulate static type dispatch ("static" meaning that the dispatch is resolved at compile time, as opposed to "dynamic dispatch", resolved only at runtime). Let's return to our area function:

area (Circle 10)

The preceding command will dispatch to the overloaded area function by matching:

A sub type of the Shape algebraic type (subtype-based)
The type class to which Circle belongs that is, Shape (class-based)

We've referred to this as "dispatching on type" but, strictly speaking, type dispatch would have to resemble the following invalid Haskell:

f v = case (type v) of
  Int -> "Int: " ++ (show v)
  Bool -> "Bool" ++ (show v)

Having said that, pattern-based and type-based dispatching are not that far apart:

  data TypeIntBool = Int' Int | Bool' Bool

  f :: TypeIntBool -> String
  f (Int' v) = "Int: " ++ (show v)
  f (Bool' v) = "Bool: " ++ (show v)

So far, we have only seen dispatching on one argument or "single dispatch". Let's explore what "double-dispatch" might look like:

data CustomerEvent = InvoicePaid Float | InvoiceNonPayment
data Customer = Individual Int | Organisation Int

payment_handler :: CustomerEvent -> Customer -> String

payment_handler (InvoicePaid amt) (Individual custId)
  = "SendReceipt for " ++ (show amt)
payment_handler (InvoicePaid amount) (Organisation custId)
  = "SendReceipt for " ++ (show amt)

payment_handler InvoiceNonPayment (Individual custId)
  = "CancelService for " ++ (show custId)
payment_handler InvoiceNonPayment (Organisation custId)
  = "SendWarning for " ++ (show custId)

The payment_handler object defines behavior for all four permutations of CustomerEvent and Customer. In an OOP language, we would have to resort to the visitor pattern to achieve multiple dispatch.

"Visitor lets you define a new operation without changing the classes of the elements on which it operates...
Languages that support double or multiple dispatch lessen the need for the Visitor pattern." - Design Patterns, Gamma et al.

Unifying parametric and ad-hoc polymorphism

On the one hand, we have parametric polymorphism, where a single generic function acts on a variety of types. This is in contrast to ad-hoc polymorphism, where we have an overloaded function that is resolved to a particular function by the compiler. Put another way, parametric polymorphism allows us to lift the level of abstraction, whereas ad-hoc polymorphism gives us a powerful tool for decoupling.

In this sense, parametric polymorphism is considered to be "true polymorphism", while ad hoc is only "apparent" (or "synthetic").

Haskell blurs the distinction between ad hoc (specialized) and parametric (generic) polymorphism. We can see this clearly in the definition of the type class for equality:

class Eq a where
  (==), (/=) :: a -> a -> Bool
  x == y = not (x /= y)
  x /= y = not (x == y)

(==) and (/=) are both given mutually recursive default implementations. An implementer of the Eq class would have to implement at least one of these functions; in other words, one function would be specialized (ad-hoc polymorphism), leaving the other defined at a generic level (parametric polymorphism). This is a remarkable unification of two very different concepts.

Functions, types, and patterns

Functions and types intersect in several ways. Functions have a type, they can act on algebraic types, they can belong to type classes, and they can be specific or generic in type. With these capabilities, we can express several more Gang of Four patterns.

The strategy pattern

Thanks to higher-order functions (HOF), we can easily inject behavior:

strategy fSetup fTeardown
  = do
      setup
      -– fullfil this function's purpose
      teardown

Here, we are defining an abstract algorithm by letting the caller pass in functions as arguments, functions that complete the detail of our algorithm. This corresponds to the strategy pattern, also concerned with decoupling an algorithm from the parts that may change.

	"Strategy lets the algorithm vary independently from clients that use it."
	--Design Patterns, Gamma et al.

The template pattern

In "OOP speak", the strategy pattern uses delegation to vary an algorithm, while the template pattern uses inheritance to vary parts of an algorithm. In Haskell, we don't have OOP inheritance, but we have something far more powerful: type classes. We might easily abstract an algorithm with this type class that acts as an abstract class:

class TemplateAlgorithm where
  setup :: IO a → a
  teardown :: IO a → a
  doWork :: a → a
  fulfillPurpose
    = do
       setup
       doWork
       teardown

	"Define the skeleton of an algorithm in an operation, deferring some steps to subclasses. Template Method lets subclasses redefine certain steps of an algorithm without changing the algorithm's structure."
	--Design Patterns, Gamma et al.

The iterator pattern

	"Provide a way to access the elements of an aggregate object sequentially without exposing its underlying representation"
	--Design Patterns, Gamma et al.

The map function takes care of navigating the structure of the list, while the square function only deals with each element of the list:

map square [2, 3, 5, 7]

We have decoupled flow control from function application, which is akin to the iterator pattern.

Decoupling behavior and modularizing code

Whenever we pass one function into another, we are decoupling two parts of code. Besides allowing us to vary the different parts at different rates, we can also put the different parts in different modules, libraries, or whatever we like.

Haskell Design Patterns

By : Lemmer

Haskell Design Patterns

By: Lemmer

Overview of this book

Types, pattern matching, and polymorphism

Algebraic types and pattern matching

Recursive types

Polymorphism

Parametric polymorphism

Ad-hoc polymorphism

Alternation-based ad-hoc polymorphism

Class-based ad-hoc polymorphism

Tip

Alternation-based versus class-based

Polymorphic dispatch and the visitor pattern

Unifying parametric and ad-hoc polymorphism

Functions, types, and patterns

The strategy pattern

The template pattern

The iterator pattern

Decoupling behavior and modularizing code

Haskell Design Patterns

By : Lemmer

Haskell Design Patterns

By: Lemmer

Overview of this book

Types, pattern matching, and polymorphism

Algebraic types and pattern matching

Recursive types

Polymorphism

Parametric polymorphism

Ad-hoc polymorphism

Alternation-based ad-hoc polymorphism

Class-based ad-hoc polymorphism

Tip

Alternation-based versus class-based

Polymorphic dispatch and the visitor pattern

Unifying parametric and ad-hoc polymorphism

Functions, types, and patterns

The strategy pattern

The template pattern

The iterator pattern

Decoupling behavior and modularizing code

Create a Note

Delete Bookmark

Delete Note

Edit Note

Confirmation

Buy this book with your credits?