Monthly Archives: July 2017

Experimenting with Partial Application

Partial application helps us reuse functions by allowing us to create new functions based on the original. This is achieved by passing fewer arguments to the original function and binding the new function that is created. You can read more on this at Currying and Applying Arguments. We’re going to experiment with partial application and understand when execution happens.

Short notes: currying is responsible for taking a function with multiple parameters and turning it into a sequence of functions where each function only takes one argument. Partial Application is the process of applying arguments to unwrap the layers of functions until there are no more functions only the result.

Our Test Code

We’re going to start with a simple key lookup. We’ll have a function which takes a list of integers and another integer to verify it is in the list of keys. We’ve created some sample data for testing our functions.

let datapoints = [1 .. 400000]
let keys = [1; -4; 45000; 2501; 56000]

// seq -> int -> bool
let seekKey keys key =
    printfn "Creating index"
    let index = Set.ofSeq keys
    printfn "Seeking key: %d" key
    index |> Seq.contains key

Looking at our starting code, we are taking in a sequence of keys and turning them into a Set. We then look in the set of keys for verification of the key we are looking for. We’ve added some print statements so we can see the execution.

Let’s turn #time on in our REPL to give us some statistics of execution.

When we execute the function with the data points and a key we should see results similar to this:

> seekKey datapoints 67;;

Creating index
Seeking key: 67
Real: 00:00:00.653, CPU: 00:00:00.650 ...

// Your times will differ.

These are the results we expected. We are creating an index, using a Set, and searching the index. Next we’ll create a partially applied function so we can do multiple lookups.

> let seek = seekKey dataPoints;;

Real: 00:00:00.000, CPU: 00:00:00.000 ...

Again the results aren’t to shocking here. We are unwrapping the first layer of currying to reveal the function below. Now let’s use our partially applied function seek to verify a list of keys.

> keys |> List.map (seekKey datapoints);;

Creating index
Seeking key: 1
Creating index
Seeking key: -4
Creating index
Seeking key: 45000
Creating index
Seeking key: 2501
Creating index
Seeking key: 56000
Real: 00:00:03.396, CPU: 00:00:03.512 ...

We’ve revealed a problem with our function. We are creating the index on every lookup. The reason for creating an index is to improve the performance for multiple lookups. We have not accomplished this.

Multiple Seek Improvement

Let’s rewrite our function. Instead of relying on currying, we’ll have our function return a function. This will allow us to inject some execution before performing the lookup.

// seq -> (int -> int)
let seekKey' keys = 
    printfn "Creating index"
    let index = Set.ofSeq keys
    (fun key -> 
        printfn "Seeking key: %d" key
        index |> Seq.contains key)

> seekKey' datapoints 67;;

Creating index
Seeking key: 67
Real: 00:00:00.659, CPU: 00:00:00.656 ...

The type signature of the new function is close to the original function. In fact, we can call the new function the exact same way as the original.

Look at what happens when we apply only the first argument.

> let seek' = seekKey' datapoints;;

Creating index
Real: 00:00:00.690, CPU: 00:00:00.737 ...

We can see the index is created immediately. How is this going to affect multiple key lookups?

> keys |> List.map (seekKey' datapoints);;

Creating index
Seeking key: 1
Seeking key: -4
Seeking key: 45000
Seeking key: 2501
Seeking key: 56000
Real: 00:00:00.710, CPU: 00:00:00.705

That’s a more performant lookup for multiple values. What we’ve done is made the creation of the index happen after applying the first argument. The index then gets wrapped in a closure to the returned function. This means we don’t have to re-create the index every time we apply the second argument.

What more can we do?

We’re close to how we want the function to work. There’s one last thing. It would be nice to create the index only if we use it. We can accomplish this using a lazy computation.

// seq -> (int -> int)
let seekKey'' keys =
    let index = lazy (
                  printfn "Creating index"
                  Set.ofSeq keys)
    (fun key ->
        printfn "Seeking key: %d" key
        index.Value |> Seq.contains key)

We’ve placed the index creation in the lazy expression and bind it to the label index. We then call index.Value when we want to use the result of the lazy computation.

Let’s run through our testing scenarios.

> seekKey'' datapoints 67;;

Seeking key: 67
Creating index
Real: 00:00:00.679, CPU: 00:00:00.732 ...

This revealed an interesting result. The times are fairly equivalent to the first two code snippets, but in this case the print statements are in a different order. We are printing the “Seeking” before the “Creating”. This is due to the lazy and not evaluating until we tell it to.

Applying one argument:

> let seek'' = seekKey'' datapoint;;

Real: 00:00:00.000, CPU: 00:00:00.000 ...

Yes! This is what we wanted. Immediate return of the function without creating the index. This allows us to apply the first argument without any penalty if we don’t use the returned function later in processing.

For completeness, running multiple lookups.

> keys |> List.map (seekKey' datapoints);;

Seeking key: 1
Creating index
Seeking key: -4
Seeking key: 45000
Seeking key: 2501
Seeking key: 56000
Real: 00:00:00.734, CPU: 00:00:00.727 ...

Again the index is created just before the seek of the first key and the index isn’t created again.

Summary

What we’ve done is take a curried function and turn it into a higher order function. By making this change we’ve improved the performance of our function when doing multiple lookups. We used the lazy computation to remove the initial index creation if we don’t do any lookups. This was a fun experiment.

Full Source Code

Improving an Old Pattern Continued

The active patterns we’ve seen so far are enough to take some complex code and make it clean and readable. Let’s take the active patterns we know and improve the pattern matching code we created earlier.

Reviewing the Code

We were reviewing a chunk of code to determine whether a client requires an update based on the version numbers. Here is where we left off with the pattern match.

let checkForUpdate (rMaj, rMin, rBld) userVer = 
    match userVer with
    | (uMaj, _, _) 
        when rMaj > uMaj                                     
            -> Major
    | (uMaj, uMin, _) 
        when rMaj = uMaj && rMin > uMin                   
            -> Minor
    | (uMaj, uMin, uBld) 
        when rMaj = uMaj && rMin = uMin && rMin > uBld 
            -> Build
    | _     -> NoUpdate

The match is checking the major version, followed by the minor, and the build. The issue I have with this pattern match is we keep having to compare values we already know. As the match falls through the patterns we end up checking the major version three times. This isn’t very readable.

Adding a Partial Active Pattern

The first thing we can do is add a partial active pattern to match on all the cases that don’t require an update. We can leverage the fact that F# does structural equality by default that we can do this quite easily.

let (|NoUpdatePartial|_|) (relVer, userVer) =
    if relVer <= userVer then Some ()
    else None

With this partial pattern we've now eliminated the need to do multiple comparisons of any individual values.

Combining Active Patterns

We'll take the partial active pattern and use it in a multicase active pattern. Our multicase pattern needs only be concerned with matching the individual version values. This is providing we add the partial active pattern before the individual value patterns. It will look something like this:

let (|Major|Minor|Build|NoUpdate|) (relVer, userVer) =
    match (relVer, userVer) with
    | NoUpdatePartial                        -> NoUpdate
    | (rMaj,_,_),(uMaj,_,_) when rMaj > uMaj -> Major
    | (_,rMin,_),(_,uMin,_) when rMin > uMin -> Minor
    | _                                      -> Build

We now have a complete active pattern we can use when determining what kind of update is required.

let checkForUpdate relVer userVer =
    match relVer, userVer with
    | Major    -> doMajorUpdate ()
    | Minor    -> doMinorUpdate () 
    | Build    -> doBuildUpdate ()
    | NoUpdate -> doNoUpdate ()

Summary

We’ve greatly increased the expressiveness of our pattern matching by using Active Patterns. We’ve used a Partial Active Pattern and a Multi-Case Active Pattern. Active pattern can be combined together to create larger patterns all in the name of readability. All of this helps lead us to correct and expressive solutions.