**Abstract**: We discuss the importance of typing, side effects whose absence gives rise to useful efficiency theorems, evaluation schemes that allow infinite structures, and functions that can act on arbitrary types, occasionally making comparisons with the `C#` language.
Essentially, we briefly summarise some parts of the fifth chapter of Hutton's *Programming in Haskell*, 1ˢᵗ edition.



```haskell
module Main where

import Data.Char (chr, ord, isLower)
```

## Comprehensions

Suppose you have a list `xs = [x₀, …, xₙ₋₁]` and you only want those items that satisfy a predicate `p`, to which you then wish to apply some operation `f`. This is obtained by the *list comprehension* `[ f x | x <- xs, p x ]`. For the C#-inclined, using LINQ syntax, this is essentially

```csharp
from x in xs where p(x) select f(x)
```

For example, taking `f = const 1`, the length of a list `xs` is obtained by `sum [1 | x <- xs]`. More concisely, we can use `_` in place of the unused `x`.

- The symbols `|` and `<-` are read *such that* and *is drawn from*, respectively.
- The clauses `x <- xs` are known as *generators*, whereas `p x` is known as a *guard*.
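The length trick above can be checked directly; a minimal sketch, where `length'` is a hypothetical name for our comprehension-based length:

```haskell
-- Length of a list as a comprehension: one 1 per element, then sum.
length' :: [a] -> Int
length' xs = sum [1 | _ <- xs]

main :: IO ()
main = do
  print (length' "haskell")      -- 7
  print (length' ([] :: [Int]))  -- 0
```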

## Zip

`zip` produces a new list by pairing successive elements from two existing lists until either or both are exhausted.
For example, the list of all adjacent elements of a list can be found by pairing the list with its tail:

```haskell
neighbours :: [a] -> [(a, a)]
neighbours xs = zip xs (tail xs)
```

If all neighbouring pairs of a list are in the correct order, then the list is sorted:

```haskell
sorted :: Ord a => [a] -> Bool
sorted xs = and [x <= y | (x, y) <- neighbours xs]
```
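A quick self-contained check of these two definitions; the sample lists are arbitrary:

```haskell
neighbours :: [a] -> [(a, a)]
neighbours xs = zip xs (tail xs)

sorted :: Ord a => [a] -> Bool
sorted xs = and [x <= y | (x, y) <- neighbours xs]

main :: IO ()
main = do
  print (neighbours [1, 2, 3 :: Int])  -- [(1,2),(2,3)]
  print (sorted [1, 2, 3 :: Int])      -- True
  print (sorted [3, 1, 2 :: Int])      -- False
```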

## The Caesar Cipher

- The library functions `ord :: Char -> Int` and `chr :: Int -> Char` convert between characters and their Unicode representations as integers.
- We are only considering `['a'..'z']`, not punctuation or capital letters.

We convert between letters and their positions in the alphabet:

```haskell
import Data.Char (chr, ord)

fromChar :: Char -> Int
fromChar c = ord c - ord 'a'

toChar :: Int -> Char
toChar n = chr (ord 'a' + n)
```

A letter is encoded by replacing it with the character `n` letters down the alphabet, wrapping around if `n` is not in `[0..25]`:

```haskell
shift :: Int -> Char -> Char
shift n c | isLower c = toChar $ (fromChar c + n) `mod` 26
          | otherwise = c
```

We do this for every letter:

```haskell
encode :: Int -> String -> String
encode n xs = [shift n x | x <- xs]
```

The last step is valid since every string is a list of its characters and so list comprehensions apply to strings.

For example,

```
ghci> encode 3 "haskell for life"
"kdvnhoo iru olih"
ghci> encode (-3) "kdvnhoo iru olih"
"haskell for life"
```

It is cute to note that `encode n . encode (-n) == id`.

If we did not know the shift `n`, the cipher could still be cracked, for the most part, by appealing to certain statistical methods.
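One such statistical method, a simplified stand-in for the frequency-table approach Hutton develops, assumes the most common letter of the ciphertext encodes 'e', the most frequent letter in English; this is why the crack works only "for the most part":

```haskell
import Data.Char (chr, ord, isLower)
import Data.List (maximumBy)
import Data.Ord (comparing)

fromChar :: Char -> Int
fromChar c = ord c - ord 'a'

toChar :: Int -> Char
toChar n = chr (ord 'a' + n)

shift :: Int -> Char -> Char
shift n c | isLower c = toChar ((fromChar c + n) `mod` 26)
          | otherwise = c

encode :: Int -> String -> String
encode n xs = [shift n x | x <- xs]

-- Count how often each lowercase letter occurs.
counts :: String -> [(Char, Int)]
counts xs = [(c, length [x | x <- xs, x == c]) | c <- ['a' .. 'z']]

-- Guess the shift by assuming the most frequent letter encodes 'e'.
crack :: String -> String
crack xs = encode (negate guess) xs
  where
    commonest = fst (maximumBy (comparing snd) (counts xs))
    guess     = (fromChar commonest - fromChar 'e') `mod` 26

main :: IO ()
main = putStrLn (crack (encode 3 "we are well defended"))  -- we are well defended
```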

## Everything below summarises *Real World Haskell*, Chapter 2

## Why care about types

Every value and expression has a *type*. For example, `True` has type `Bool` while `"foo"` has type `String`. The type of a value indicates that it shares certain properties with other values of the same type. In particular, there are operations that only make sense for values of a certain type. For example, we can add a pair of numbers and `or` a pair of Booleans, but it makes no sense —that is, it is generally undefined— to `or` a number with a Boolean.

Whenever you have some items for which an operation is sensible only when they satisfy a property, you may benefit from introducing a new type consisting of those items that satisfy the necessary property.

For example, we can catenate paths in a graph if the source of one is the target of the other: If we have a path in a graph from a node `x` to a node `y` and another path from node `y` to node `z`, we can catenate them to obtain a path from `x` to `z`. Rather than consider arbitrary paths and check the compatibility condition to assert the result is a valid path, we can simply introduce typed paths between nodes and define catenation as an operation on such paths. ( This is the “Category of paths over a graph”. )
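A minimal Haskell rendition of "a new type for items satisfying a property", using a hypothetical `EvenInt` wrapper whose validating constructor checks the property once and for all:

```haskell
-- An integer known to be even; in a real module the constructor
-- would be hidden, so the only way in is via the validating mkEven.
newtype EvenInt = EvenInt Int deriving Show

mkEven :: Int -> Maybe EvenInt
mkEven n | even n    = Just (EvenInt n)
         | otherwise = Nothing

-- halve need not re-check the property: the type carries the proof.
halve :: EvenInt -> Int
halve (EvenInt n) = n `div` 2

main :: IO ()
main = do
  print (fmap halve (mkEven 10))  -- Just 5
  print (fmap halve (mkEven 7))   -- Nothing
```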

Type systems give us *abstraction* —this cannot be overstated! The benefit of introducing abstraction is that it lets us forget or ignore low-level details.

If I know that a value in my program is a string, I don’t have to know the intricate details of how strings are implemented —whether it be an array or a linked list of characters or what have you. I can just assume that my string is going to behave like all other strings I’ve worked with before. So, *a type is a simple form of interfacing*.

At the lowest level, a computer is concerned with bytes, with barely any additional structure. *A type system gives us abstraction* so that we can forget about bytes and treat them as certain values. A type adds meaning to plain bytes: It lets us say that “these bytes are text and those bytes are an airline reservation.” A type system will not allow us to mix up those values.

Programming is a bit like putting pieces into a jigsaw puzzle. A good type system won’t let us put a piece of the wrong shape wherever we like. Conversely, the lack of a type system, or a so-called *dynamic type system*, means that all the pieces are 1-by-1 squares and always fit (!), so you have to constantly examine the resulting picture and check (through testing) whether it is correct.

### Haskell is *strongly* typed

A `C` compiler will automatically and *silently*(!) coerce a value of type `int` into a `float` in a context expecting a `float`, but a Haskell compiler will raise a compilation error in such a scenario.
We must explicitly coerce types by applying coercion functions. If we want something done, we must say so. Unlike other languages, silence does not mean consent to coercion!

We will not “accidentally” use a string when a number is expected, since there is no default coercion. We side-step such *silent* dangers by being explicit about what we want to accomplish.
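For instance, the standard coercion function `fromIntegral` must be applied explicitly; a minimal sketch:

```haskell
average :: [Int] -> Double
-- sum xs / length xs would be a type error: (/) needs fractional
-- arguments, so we coerce explicitly with fromIntegral.
average xs = fromIntegral (sum xs) / fromIntegral (length xs)

main :: IO ()
main = print (average [1, 2, 3, 4])  -- 2.5
```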

### Haskell is *statically* typed

A *static* type system means that the compiler knows the type of every value and expression at compile time, before any code is executed. For example, `True && 0` may be passable in a C-like language, but not for a Haskell compiler: Both arguments of the conjunction `&&` must be Boolean.
This feature allows Haskell to use the surrounding code to *infer*, i.e. deduce, the type of a variable. In other languages, the type system is a burden serving the compiler; in Haskell, it serves *you*. I can open a script and write `f x = x + 1` and a Haskell compiler can accurately deduce that `f` takes numeric types to numeric types; which we can indicate explicitly as well by, say, placing `f :: Int -> Int` before the definition of `f`. We read `x :: T` as “`x` is of type `T`”; the arrow `->` is read “to”.

In languages such as Python, *duck typing* is common, where an object acts enough like another to be used as a substitute for it: If it walks like a duck, and quacks like a duck, then let’s call it a duck.

It is the combination of strong and static typing that results in Haskell code often having fewer trivial bugs than that of another language. Indeed such systems help to ensure the absence of runtime type errors.

## Purity : the case for being explicit yet again

A *side effect* introduces a dependency between the global state of the system and the behaviour of a function. For example, in an imperative language, consider a method that reads and returns the value of a global variable. If some other code can modify that global variable, then the result of a particular application of that method depends on the current value of the global variable. The method has a side effect, even though it never modifies the variable itself.

Side effects are essentially invisible inputs to, or outputs from, methods. In Haskell, the default is for functions to *not* have side effects: The result of a function depends only on the inputs that we explicitly provide. Such functions are called *pure*; methods with side effects are *impure*.

In an imperative language such as `C#`, one may explicitly mark a function as `Pure`; otherwise it is assumed impure, which is the usual imperative default since pure functions are the less common utility there. In Haskell, it is the other way around: All functions are pure by default, since they are the more common utility, and impure functions are indicated by their type, say by `IO` appearing somewhere in the type.

## Variables: names versus references

A variable is an *alias* for an expression and serves as an abbreviation, for the sake of clarity and to avoid having to write out the entire expression in full each time it is used. In Haskell, a function is also a variable and so it is treated similarly. Such an alias, or naming, can only be bound to one expression.

In a functional setting, once we’ve bound a variable to an expression, we know that we can always substitute one for the other, because it will not change. This notion of substitutability does not hold in general in imperative programs since, for example, the code `g( f(), f() )` cannot be replaced with `let x = f() in g(x, x)`: the two calls to `f` may have side effects, such as altering global state or printing output, whose outcome differs from performing `f` once.
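In Haskell this substitution is always valid; a small illustration with arbitrary pure functions:

```haskell
f :: Int -> Int
f x = x * x

g :: Int -> Int -> Int
g a b = a + b

main :: IO ()
main = do
  -- Since f is pure, both forms are interchangeable.
  print (g (f 3) (f 3))         -- 18
  print (let x = f 3 in g x x)  -- 18
```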

## Conditionals : the necessity of `else`

A conditional is only sensible when both branches are present; Haskell provides such *conditional expressions* for any type. In pseudo-Haskell notation (in fact, this is close to Agda notation):

```
if_then_else_ :: ∀{type a} → Bool → a → a → a
if True  then t else f = t
if False then t else f = f
```

Functional programs are expression-oriented and so the branches of a conditional must have the same type for the conditional to be sensible. Indeed, what sensible type does one assign to something like `if b then 1 else "apple"`? If `b` is true this is a number, otherwise it is a string. Hence, we cannot *statically* assign it a type and as such any context in which it appears would also need to consider the cases of the result being a number or a string —even though it may always be one and not the other! ( That is, when `b` is constant. )

In an imperative language, it can make sense to omit the `else` branch from a conditional, since there one is working with *statements*, or commands. In such a case, the “missing else” is implicitly defined to be the empty statement, called `skip` or simply `;` in most `C`-based languages.

( In an imperative program, commands can be composed by sequencing, which has the empty statement as unit, and so they form a monoid. For a monoid `(M, ⊕, e)`, we can define an `else`-less conditional:

```
if_then_ :: ∀{monoid (m, ⊕, e)} → Bool → m → m
if b then t = if b then t else e
```

The monoid structure ensures we have a default element of our type, namely `e`, to use when the condition does not hold. In fact, the monoid structure is a bit overkill; it suffices to just request a function `default :: ∀{nonempty type a} → a` and use that in place of `e`. Such a function is used in `C#` whenever one wants a default value for a type. )
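The parenthetical sketch can be written in actual Haskell, where `mempty` plays the role of the unit `e`; `ifThen` is a hypothetical name for the `else`-less conditional:

```haskell
ifThen :: Monoid m => Bool -> m -> m
ifThen b t = if b then t else mempty  -- mempty is the default element

main :: IO ()
main = do
  putStrLn (ifThen True  "shown")        -- prints "shown"
  putStrLn (ifThen False "hidden")       -- prints an empty line
  print (ifThen True  [1, 2, 3 :: Int])  -- [1,2,3]
  print (ifThen False [1, 2, 3 :: Int])  -- []
```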

## Lazy Evaluation

Most languages evaluate a function’s arguments before applying the function; this is *strict evaluation*. Haskell, on the other hand, is *non-strict*: It evaluates arguments on a need-by-need basis, possibly never evaluating them at all! For example, defining the function `f b n = if b then 0 else n` and then calling it with arguments `0 < 1` and the length of the infinite list `[1..]`, we have
```
f (0 < 1) (length [1..])
= { rewrite via definition }
if (0 < 1) then 0 else length [1..]
= { conditional can only be simplified further if we simplify the guard }
if True then 0 else length [1..]
= { conditional with True guard selects then-branch }
0
```

Notice that we never evaluated `length [1..]` since it was never needed! Which is great, since we can never compute the length of an infinite list in finite time anyway!

Haskell creates a “promise” that when the value of a sub-expression is needed, it can compute it. The record type that is used to track an un-evaluated expression is known as a thunk.
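Laziness is what lets Haskell manipulate infinite structures; a quick demonstration:

```haskell
-- An infinite list of naturals; only the demanded prefix is ever built.
naturals :: [Integer]
naturals = [0 ..]

main :: IO ()
main = do
  print (take 5 naturals)            -- [0,1,2,3,4]
  print (head (map (* 2) naturals))  -- 0; map never runs past the head
```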

However, such a lazy evaluation scheme has some surprises when working with IO. For example, suppose you request input and bind it to some variable; if you never use that variable for anything, it is never actually evaluated and so the actual request never occurs! More on this later.

## Polymorphism

A function that can act on values of any type is said to be *polymorphic*. The prefix *poly* means *many* and *morph* means *form*; so such functions have *many forms of usage*.

In Haskell, (type) variables always start with a lowercase letter, whereas concrete types are distinguished by having their names start with a capital letter.

Haskell supports *parametric polymorphism*: Types may have parameters that we can later bind to concrete types. Not only does the implementing code not care what the concrete type is, but *it has no way to find out what the real type is*, or manipulate a value of that type. It cannot create a value; nor can it inspect one. All it can do is treat it as a fully abstract black-box.

For example, consider a function that takes a list of any type, denoted `[a]`, and returns an element of that type `a`. Such a function has type `[a] -> a`. If the list is empty, then we cannot do anything to obtain a value of type `a` and so such a function cannot be defined on empty lists —we have no `default` as in `C#` to obtain a value of the type parameter. If the list is non-empty, then it is of the form `x : xs`, that is, an element of type `a` followed by another list of type `[a]`. At this point we can simply return `x` or continue to look at `xs` and pick an element from there. Hence, the *only* functions of type `[a] -> a` are the indexing functions `[x₀, x₁, …, xₙ₋₁] ↦ xᵢ` for some `i`.
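One standard way around the empty-list problem is to make the possibility of failure explicit in the type with `Maybe`; a sketch:

```haskell
-- A total variant of "give me an element": Nothing on the empty list.
safeHead :: [a] -> Maybe a
safeHead []      = Nothing
safeHead (x : _) = Just x

main :: IO ()
main = do
  print (safeHead [1, 2, 3 :: Int])  -- Just 1
  print (safeHead ([] :: [Int]))     -- Nothing
```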

As seen, parametric polymorphism means that the implementation of a function can only have a particular form due to the typing.

In contrast, other languages such as `C#` provide *subtype polymorphism* and allow one to inspect the abstract type. For example, using the `is` and `as` keywords in `C#`, we can check whether the type `a` is a subtype of a numeric type, or whether it implements a particular interface; if so, we can cast the elements accordingly and, after indexing to obtain an element from the list, perform any numeric operation —or bypass the indexing altogether and perform some numeric computation fully unrelated to the abstract type.

This latter kind of polymorphism promises less about an implementation than the parametric scheme. Furthermore, parametric polymorphism gives us theorems for free that allow us to replace code segments with possibly more efficient versions! For example, from the type `p :: [a] -> [a]` we know that `p . map f = map f . p`: We may relabel the elements of a list via a given function `f :: a -> b` by the call `map f` and then run our computation `p`, or equivalently run `p` first and then relabel. In some circumstances it is preferable to do one over the other. All parametrically polymorphic functions come with similar assurances! How cool is that!

Such free theorems reflect the idea that *the definition of a polymorphic function is natural* in that relabelling elements before or after the natural computation does not matter since the operation makes no inspection of the element-type in question and so can only duplicate, index, or rearrange elements —in the case of lists, that is.
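We can observe such a free theorem concretely; here the polymorphic computation is instantiated to `reverse`, whose type `[a] -> [a]` promises exactly this naturality (a demonstration, not a proof):

```haskell
p :: [a] -> [a]
p = reverse  -- any parametrically polymorphic list function would do

f :: Int -> String
f = show

main :: IO ()
main = do
  print ((p . map f) [1, 2, 3])  -- ["3","2","1"]
  print ((map f . p) [1, 2, 3])  -- ["3","2","1"]; the two always agree
```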

Of course, this is only guaranteed since we only allow *pure* functions!