Advent of Code 2017 – Day 3 Part 2

Advent of Code 2017 Day 3 part 2 was quite the gruelling exercise. We needed to locate the first memory value larger than the input we are given. The memory locations store an accumulative value (still spiralled) based on its existing neighbours. This is the example given.

147  142  133  122   59
304    5    4    2   57
330   10    1    1   54
351   11   23   25   26
362  747  806--->   ...

Location one still starts at 1. Location two only has one neighbour with a value of 1, so it get’s a value of 1.

1   1

Location three gets a value of 2 since it has neighbours one and two in which both have a value of 1.

    2
1   1

Location four has a value of 4. This is because it has three, two, and 1 as neighbours. We then continue on to generate the rings.

Once we get passed the initial ring, we can use some general rules to generate the values. At most a memory location will have 4 active neighbours. It came down to identifying the unique location requirements.

Since a location may potentially have neighbours in its own ring and the previous ring, we need to keep the last ring generated. I initially thought about the convoluted code that would need to be created, but decided to use Active Patterns.

I was able to determine the points of interest by using some of the data I had obtained from part 1. Specifically the row length. This allowed me to identify whether I was in a corner, before the corner, or after the corner. Once that was established I needed to handle the first and last element in the ring. Everything else was considered general middle elements.

I’ve included a couple of helper functions to handle wrapping the previous array’s index and generating the index offset from the current ring to the previous.

I opted to use mutation for generating a ring. I have my active patterns in the pattern match to help with code readability.

With all the building blocks in place I can now create a sequence of memory values based on the spiral structure.

memoryValues |> Seq.take 12 |> Seq.toList;;
val it : int list = [1; 1; 2; 4; 5; 10; 11; 23; 25; 26; 54; 57]

The part was to take the first element greater than my input.

Advent of Code 2017 – Day 3 Part 1

Day 3 part 1 was about calculating the Manhattan Distance for a specific memory location to memory location 1. The memory blocks are stored in a spiralling out formation with the first block being in the middle of the grid. The memory blocks would look like this:

17  16  15  14  13
18   5   4   3  12
19   6   1   2  11
20   7   8   9  10
21  22  23---> ...

I decided to try and figure out an equation for calculating the locations and distance instead of using the brute force method. (It wasn’t until later I learned about the squares in the bottom right hand corner… oops)

Generating the Rings

The equation I came up with for generating a ring ended up being:
ringNumber * 2 * 4 + (sum of previous rings).

I would generate the sequence of rings with an index and the final value.

I could calculate the number of elements in a ring wall (middle and 1 corner) by taking the ring number and multiplying by 2.

Identifying location 1 as ring Zero. I could say ring 1 would have 2 elements to make the middle and one corner. To make the full cube I could multiply it by 4. This gives us 8 elements in ring 1. Ring 2 would have 2 * 2 * 4 elements to make the square, 16. In the example below, there are 8 ones in ring 1, and 16 twos in ring 2.

2   2   2   2   2
2   1   1   1   2
2   1   0   1   2
2   1   1   1   2
2   2   2   2   2

To get the final memory location I would add the previous ring’s final location to the number of elements required for the current ring.

Running the code would result in:

rings |> Seq.take 6 |> Seq.toList;;
[(0, 1); (1, 9); (2, 25); (3, 49); (4, 81); (5, 121)]

Determining the Distance

The manhattan distance can be calculated by the ring id and the element offset from the centre of a ring wall. Here is the code I used to determine the distance.

The premise of this part is to locate the ring before the passed location. We then determine whether the location is on a corner or on the low side of centre. This will determine the amount of steps taken to get to the centre of the ring wall. We then add the number of rings we need to step through to get to location 1.

Bonus

Computing the rings using square:

let rings = Seq.initInfinite (fun i -> (i, pown (i * 2 + 1) 2))

Creating an Elmish Fable Project

Fable is a compiler that lets you write your web UI in F# and change it to Javascript. This has development benefits of immutability, static types, and discriminated unions (Sum Types). These benefits alone make for more predictable functions and fewer bugs. There is also a library that let’s us use an Elm like architecture for added stability and fun.

Prerequisites

We need to install a few things before we can create the project:

  • .Net Core 2
  • NodeJS
  • Yarn (package manager)

Installing the .Net Core Templates

In your console, enter the commands:

> dotnet new -i Fable.Template
> dotnet new -i Fable.Template.Elmish.React

Creating and Running Our Elmish App

After the templates are installed we can create and run our Elmish app by using the commands:

> dotnet new fable-elmish-react -n MyProjectName
> cd MyProjectName
> dotnet restore
> cd src
> yarn install

With all the dependencies installed we can now run the app using:

> dotnet fable yarn-run start

Open your browser and navigate to localhost:8080.

If we just want to build the app:

> dotnet fable yarn-run build

The Immutable Tool

Immutability is a useful tool. Think of it as an extension of your favourite editor or ReSharper. Using immutable values helps eliminate some of the easy to make mistakes when programming. Even if your language doesn’t fully support immutability you can still take advantage of the benefits.

What Is Immutability?

Immutability is something which does not change. You set a value and it doesn’t change through its lifetime. There are absolutely no mutations of that value. Unexpected mutations are a very common cause of issues in a program. Mutations can happen seemingly anywhere. They can happen concurrently from another thread causing all sorts of issues which are difficult to track down. During my talk I said “there is nothing wrong with global state, the problem is when global state changes.” I was trying to make them think about what the real danger was in their applications. A value which cannot change is immutable.

Benefits

The main benefit with using immutable values is the removal of unexpected changes. This leads to easier to reason about code. Future readers will have confidence that the values set don’t change later in the function or by any function. We limit the mutation or changes to specific areas of the code. This in turn let’s us define locations as being dangerous areas. Solutions using immutable values require less effort to move to a concurrent paradigm. Mutation is the bane of concurrent programming. Limiting mutation lowers the complexity of understanding the solution.

Disadvantages

The main disadvantage of immutable values is always having to make a copy of the values or object to apply the changes. This is not a big disadvantage if the language supports immutability. It is tedious in a language that doesn’t as this would have to be implemented by the developer. There are also some performance issues that may be noticed in mutations of many values in a collection. This is due to all the copying, updating, and garbage collection. This again can be handled by the language if supported. Mutations are a form of optimization.

Levels of Immutability

There are many functional languages that have rigid no mutations allowed. Some notable ones are Haskell and Erlang. If you come from a language that heavily relies on mutating variables, then these rigid languages can really force you to think of different ways to solve the task at hand. There are middle ground languages that support both immutable and mutable values. F# and Clojure are good languages that are immutable by default. It’s up to the developer to mark the values as mutable and accept the responsibility of that decision. Using an immutable by default language will allow you to design your solution using immutability and later modify to using mutations if the performance is required.

What’s nice about Clojure and F# is you get visual cues as to the dangerous areas in your code. F# uses the mutable keyword and a left arrow <- to define mutation.

let mutable x = 5
x <- 6

Clojure uses the earmuffs operator.

(def ^:dynamic *there-be* "dragons")

Summary

Immutability is a tool we can use to help avoid easy to make mistakes of unintentional mutations. This can lead to solutions that are easier to reason about as the code is not littered with mutations. Limiting mutations to specific areas allowing the reader to focus on those dangerous areas instead of worrying about unexpected changes. Like any tool, there are times when it doesn't fit. If performance is paramount to the solution, using mutation may be the way to go. In saying that, it's generally a good practice to make the code correct and then add mutations to make it fast.

There and Back Property

Coming up with properties can be difficult and require a lot of thought. We need to really understand the code or process we are testing against. Thankfully there are some common patterns that our predecessors have identified to help get us started. We’ll start looking at the there and back property.

There and Back

This is a property we can use to verify anything that sets and gets values. Verify data transfer objects or writing and reading from disk. We can use this property to any process that performs a transform that can be reversed. Serialization and deserialization are a good example of a transformation with a reversal. We can serialize and deserialize the data and confirm we get the exact same data that we put in.

I went searching the internet for a good example to test the there and back property. I ended up on the MSDN website with an encryption/decryption, example. I’ve experienced some pain with an encryption library in the past, so this was perfect. Long story short, we had to use a 3rd party encryption library and 8 months into production we found that every key had a subset of values that would not decrypt.

Encryption Library

I placed the encryption code in a C# library which can be found Property Based Testing – There and Back. The project can be built and there should be a ThereAndBack.dll in the bin/debug directory.

FsCheck

The property based testing library for .Net is FsCheck. The FsCheck library can be downloaded using nuget or paket. We can now reference FsCheck and the ThereAndBack.dll built from the downloaded project.

The function ``Check there and back of DateTime encryption`` is the property we are asserting. The body of the function creates an instance of the Encryption object from the C# library. It then converts the input DateTime to a string, encrypts, decrypts, and parses it back to a DateTime. The result of that process is then compared to the original input DateTime value. This gives us our “There and Back” property.

We execute the function with Check.Quick. FsCheck executes the function 100 times while generating a new DateTime value each time.

Summary

Starting with the “There and Back” property is a good way to get introduced to property based testing. This property can be used for any data transformation that has a means to reverse the data back to its original form. This makes the property good for testing any new serialization library we may be using, or writing. We can also use this property for testing writing and reading from disk to find corruption or hardware errors.

What Is Property Based Testing

I originally looked into property based testing a while ago. I knew that it was something I wanted to come back to and dig deeper. I got my chance and recently gave a talk on it at Prairie Dev Con Deliver. This conference is focused on Agile methodologies and testing.

Property Based Testing

Property based testing differs from unit testing in that it doesn’t use specific inputs and results. It instead uses a property to determine correct behaviour. I looked up the definition of Property in the dictionary and came up with “a trait belonging to a process with results common to all members of a set of inputs.” This is still a little abstract, but I found adding an example helped me understand further.

First Property

Any positive number multiplied by negative one will have a result less than zero. In this example:

  • The process is multiplication with negative one: (*) -1
  • The input set is any positive number: choose {1 .. }
  • The result is a number less than zero: < 0

These three pieces make up the property we are asserting.

Generative Advantage

One of the big advantages to property based testing over normal testing is the generative aspect of it. Property based testing doesn’t rely on specific example based testing. The system generates values based on the specifications required. Since the system is generating values at each run, the tests are always actively looking for defects. This also lets the system come up with many variations of inputs that we humans may not think of. This means we may detect a defect that can occur when given a large set of data.

A nice feature to some testing libraries, such as .Net’s FsCheck, is shrinking. When a falsifiable input is found, Shrinking allows the system to start minimizing (shrinking) the input values required to generate the defect. The idea is to make the inputs be as manageable as possible for the developer to be able to reproduce the error with minimal effort. The ideal use case would be to take that falsifiable example and put it in a unit test.

Like unit tests, we come up with many different properties of the process we are testing. The difficulty with property based testing is coming up with properties can be time consuming and difficult. On the other hand, the time consuming and thought does allow us to gain greater knowledge of the process we are testing. Thankfully though, there are some common patterns that have emerged as good ways to get started. We will look at the There and Back property in the next post.

Summary

Property based testing is a unique way to ensure your code is as defect free as possible. We’ve defined three requirements for a property. The process is what we are testing. The input set or subset required to test the process. In the example above we used any positive number. The results with some defined commonality of evaluating the process on our input set.

Function Pattern Shorthand

Match Shorthand

There is a shorthand for a function that immediately passes the last argument (See Currying and Applying Arguments). We can use this shorthand to rewrite our greeting function to be:

let greeting = function
    | {Name = "Dave"; Year = 2001 } 
        -> "I'm sorry Dave, I'm afraid I can't do that."
    | {Name = "Neil"}
        -> "The first person to set foot on the moon."
    | {Name = name}
        -> sprintf "Hello %s." name

Our shorthand above will still have a type signature of Astronaut -> string. The difference is the argument is implied. Loading the Astronaut type and the shorthand greeting function in Visual Studio Code with Ionide-FSharp you will see a result of val greeting : _arg1:Astronaut -> string. Using the shorthand is a very common practice when your function is immediately doing a pattern match.

This shorthand is a matter of preference by the developer. There are multiple styles to use for a function which goes directly into a pattern match.

If a pattern match has only 2 values, some people prefer to use an if then else block. We can still use a match expression or a function pattern. The functions below have the same result and behaviour. It becomes a matter of readability and preference.

let boolYesNo b = if b then "Yes"
                  else "No"

let boolYesNo' b = 
    match b with
    | true  -> "Yes"
    | false -> "No"

let boolYesNo'' = function
    | true  -> "Yes"
    | false -> "No" 

Summary

Using a Function Pattern Shorthand can remove some overhead of the developer by indicating an immediate match. The function pattern shorthand will immediately match on the last argument supplied to the function. Some prefer to use if then else if a match has only two outcomes. It is the preference of the developer as to which style of matching they will use.

Experimenting with Partial Application

Partial application helps us reuse functions by allowing us to create new functions based on the original. This is achieved by passing fewer arguments to the original function and binding the new function that is created. You can read more on this at Currying and Applying Arguments. We’re going to experiment with partial application and understand when execution happens.

Short notes: currying is responsible for taking a function with multiple parameters and turning it into a sequence of functions where each function only takes one argument. Partial Application is the process of applying arguments to unwrap the layers of functions until there are no more functions only the result.

Our Test Code

We’re going to start with a simple key lookup. We’ll have a function which takes a list of integers and another integer to verify it is in the list of keys. We’ve created some sample data for testing our functions.

let datapoints = [1 .. 400000]
let keys = [1; -4; 45000; 2501; 56000]

// seq -> int -> bool
let seekKey keys key =
    printfn "Creating index"
    let index = Set.ofSeq keys
    printfn "Seeking key: %d" key
    index |> Seq.contains key

Looking at our starting code, we are taking in a sequence of keys and turning them into a Set. We then look in the set of keys for verification of the key we are looking for. We’ve added some print statements so we can see the execution.

Let’s turn #time on in our REPL to give us some statistics of execution.

When we execute the function with the data points and a key we should see results similar to this:

> seekKey datapoints 67;;

Creating index
Seeking key: 67
Real: 00:00:00.653, CPU: 00:00:00.650 ...

// Your times will differ.

These are the results we expected. We are creating an index, using a Set, and searching the index. Next we’ll create a partially applied function so we can do multiple lookups.

> let seek = seekKey dataPoints;;

Real: 00:00:00.000, CPU: 00:00:00.000 ...

Again the results aren’t to shocking here. We are unwrapping the first layer of currying to reveal the function below. Now let’s use our partially applied function seek to verify a list of keys.

> keys |> List.map (seekKey datapoints);;

Creating index
Seeking key: 1
Creating index
Seeking key: -4
Creating index
Seeking key: 45000
Creating index
Seeking key: 2501
Creating index
Seeking key: 56000
Real: 00:00:03.396, CPU: 00:00:03.512 ...

We’ve revealed a problem with our function. We are creating the index on every lookup. The reason for creating an index is to improve the performance for multiple lookups. We have not accomplished this.

Multiple Seek Improvement

Let’s rewrite our function. Instead of relying on currying, we’ll have our function return a function. This will allow us to inject some execution before performing the lookup.

// seq -> (int -> int)
let seekKey' keys = 
    printfn "Creating index"
    let index = Set.ofSeq keys
    (fun key -> 
        printfn "Seeking key: %d" key
        index |> Seq.contains key)

> seekKey' datapoints 67;;

Creating index
Seeking key: 67
Real: 00:00:00.659, CPU: 00:00:00.656 ...

The type signature of the new function is close to the original function. In fact, we can call the new function the exact same way as the original.

Look at what happens when we apply only the first argument.

> let seek' = seekKey' datapoints;;

Creating index
Real: 00:00:00.690, CPU: 00:00:00.737 ...

We can see the index is created immediately. How is this going to affect multiple key lookups?

> keys |> List.map (seekKey' datapoints);;

Creating index
Seeking key: 1
Seeking key: -4
Seeking key: 45000
Seeking key: 2501
Seeking key: 56000
Real: 00:00:00.710, CPU: 00:00:00.705

That’s a more performant lookup for multiple values. What we’ve done is made the creation of the index happen after applying the first argument. The index then gets wrapped in a closure to the returned function. This means we don’t have to re-create the index every time we apply the second argument.

What more can we do?

We’re close to how we want the function to work. There’s one last thing. It would be nice to create the index only if we use it. We can accomplish this using a lazy computation.

// seq -> (int -> int)
let seekKey'' keys =
    let index = lazy (
                  printfn "Creating index"
                  Set.ofSeq keys)
    (fun key ->
        printfn "Seeking key: %d" key
        index.Value |> Seq.contains key)

We’ve placed the index creation in the lazy expression and bind it to the label index. We then call index.Value when we want to use the result of the lazy computation.

Let’s run through our testing scenarios.

> seekKey'' datapoints 67;;

Seeking key: 67
Creating index
Real: 00:00:00.679, CPU: 00:00:00.732 ...

This revealed an interesting result. The times are fairly equivalent to the first two code snippets, but in this case the print statements are in a different order. We are printing the “Seeking” before the “Creating”. This is due to the lazy and not evaluating until we tell it to.

Applying one argument:

> let seek'' = seekKey'' datapoint;;

Real: 00:00:00.000, CPU: 00:00:00.000 ...

Yes! This is what we wanted. Immediate return of the function without creating the index. This allows us to apply the first argument without any penalty if we don’t use the returned function later in processing.

For completeness, running multiple lookups.

> keys |> List.map (seekKey' datapoints);;

Seeking key: 1
Creating index
Seeking key: -4
Seeking key: 45000
Seeking key: 2501
Seeking key: 56000
Real: 00:00:00.734, CPU: 00:00:00.727 ...

Again the index is created just before the seek of the first key and the index isn’t created again.

Summary

What we’ve done is take a curried function and turn it into a higher order function. By making this change we’ve improved the performance of our function when doing multiple lookups. We used the lazy computation to remove the initial index creation if we don’t do any lookups. This was a fun experiment.

Full Source Code

Improving an Old Pattern Continued

The active patterns we’ve seen so far are enough to take some complex code and make it clean and readable. Let’s take the active patterns we know and improve the pattern matching code we created earlier.

Reviewing the Code

We were reviewing a chunk of code to determine whether a client requires an update based on the version numbers. Here is where we left off with the pattern match.

let checkForUpdate (rMaj, rMin, rBld) userVer = 
    match userVer with
    | (uMaj, _, _) 
        when rMaj > uMaj                                     
            -> Major
    | (uMaj, uMin, _) 
        when rMaj = uMaj && rMin > uMin                   
            -> Minor
    | (uMaj, uMin, uBld) 
        when rMaj = uMaj && rMin = uMin && rMin > uBld 
            -> Build
    | _     -> NoUpdate

The match is checking the major version, followed by the minor, and the build. The issue I have with this pattern match is we keep having to compare values we already know. As the match falls through the patterns we end up checking the major version three times. This isn’t very readable.

Adding a Partial Active Pattern

The first thing we can do is add a partial active pattern to match on all the cases that don’t require an update. We can leverage the fact that F# does structural equality by default that we can do this quite easily.

let (|NoUpdatePartial|_|) (relVer, userVer) =
    if relVer <= userVer then Some ()
    else None

With this partial pattern we've now eliminated the need to do multiple comparisons of any individual values.

Combining Active Patterns

We'll take the partial active pattern and use it in a multicase active pattern. Our multicase pattern needs only be concerned with matching the individual version values. This is providing we add the partial active pattern before the individual value patterns. It will look something like this:

let (|Major|Minor|Build|NoUpdate|) (relVer, userVer) =
    match (relVer, userVer) with
    | NoUpdatePartial                        -> NoUpdate
    | (rMaj,_,_),(uMaj,_,_) when rMaj > uMaj -> Major
    | (_,rMin,_),(_,uMin,_) when rMin > uMin -> Minor
    | _                                      -> Build

We now have a complete active pattern we can use when determining what kind of update is required.

let checkForUpdate relVer userVer =
    match relVer, userVer with
    | Major    -> doMajorUpdate ()
    | Minor    -> doMinorUpdate () 
    | Build    -> doBuildUpdate ()
    | NoUpdate -> doNoUpdate ()

Summary

We’ve greatly increased the expressiveness of our pattern matching by using Active Patterns. We’ve used a Partial Active Pattern and a Multi-Case Active Pattern. Active pattern can be combined together to create larger patterns all in the name of readability. All of this helps lead us to correct and expressive solutions.

Partial Active Patterns

There are four kinds of active patterns. We’ve already looked at the Single and Multicase Active Patterns. We’ll now look at Partial Active Pattern.

Partial Active Pattern

A partial active pattern is used when the match expression needs to be transformed in different ways for the individual patterns. Partial active patterns are mutually exclusive in that they don’t necessarily have to relate to the other patterns.

We define the partial active pattern using the banana clips, a label, and an underscore. We identify a successful match by returning Some value and a non match with None.

let (|StartsWithCapital|_|) (v : string) = 
    if v.[0] |> System.Char.IsUpper then Some v
    else None

We can use the active pattern in our match expressions for readability.

let processFile (filePath : string) =
    filePath |> System.IO.File.ReadAllLines
      |> Seq.filter (function StartsWithCapital _ -> true
                            | _                   -> false)

Our processFile function takes in a file, reads all the lines, and then gives us a sequence of lines that begin with a capital letter. There is an issue though. We are using an indexer (v.[0]) on the string in our active pattern which will fail if the value is null or empty. Let’s add another partial active pattern.

let (|NullOrEmptyLine|_|) (v : string) = 
    if System.String.IsNullOrEmpty v then Some ()
    else None

The filter of the processFile will now look like:

Seq.filter (function NullOrEmptyLine     -> false
                   | StartsWithCapital _ -> true
                   | _                   -> false)

The order in which we place the pattern expressions matters. We have to do the EmptyLine pattern match before the StartsWithCapital or we will end up with the same error as before.

Putting it to use

We’ve received some specifications for interfacing with an old system. The system streams a sentence per line. A sentence is identified by the starting capital letter of the sentence. There are some issues with the system. Every once in a while the system will send a sentence in reverse or it will send garbage data (no starting or ending capital letter). Some lines may be null and they should be converted to newlines. Format the incoming data where all sentences are not reversed and include the empty lines.

Based on the specs above we have four cases:

  • Normal sentence
  • Reversed sentence
  • Empty line
  • Garbage data

We seem to already have 2 partial patterns for this exercise which means we only need one more to fulfill our requirements. The reversed sentence.

let (|EndsWithCapital|_|) (v : string) =
    if v.[v.Length - 1] |> System.Char.IsUpper then Some v
    else None

With the final partial active pattern in place we can setup our pattern match to fix the streamed data.

let reverseString : string -> string = 
    Seq.rev >> Seq.toArray >> (fun s' -> new string(s'))

let mungeLine = function
    | NullOrEmptyLine     -> Some "\n"
    | StartsWithCapital s -> Some s
    | EndsWithCapital s   -> s |> reverseString |> Some
    | _                   -> None

Taking my word that the funny syntax in the reverseString function actually reverses a string, we can see the mungeLine function uses our partial active patterns for validating the data. Partial active patterns have made out match expressive and readable to non coders.

Summary

Partial active patterns give us a means to do pattern matching when it may seem like the patterns have nothing in common. Partial active patterns allow us to transform the value we are matching before evaluating the pattern. We have not added partial active patterns to our collection of tools to create readable and expressive code.