Partial application helps us reuse functions by allowing us to create new functions based on the original. This is achieved by passing fewer arguments to the original function and binding the new function that is created. You can read more on this at Currying and Applying Arguments. We’re going to experiment with partial application and understand when execution happens.
Short notes: currying is responsible for taking a function with multiple parameters and turning it into a sequence of functions where each function only takes one argument. Partial Application is the process of applying arguments to unwrap the layers of functions until there are no more functions only the result.
Our Test Code
We’re going to start with a simple key lookup. We’ll have a function which takes a list of integers and another integer to verify it is in the list of keys. We’ve created some sample data for testing our functions.
let datapoints = [1 .. 400000] let keys = [1; -4; 45000; 2501; 56000] // seq
-> int -> bool let seekKey keys key = printfn "Creating index" let index = Set.ofSeq keys printfn "Seeking key: %d" key index |> Seq.contains key
Looking at our starting code, we are taking in a sequence of keys and turning them into a
Set. We then look in the set of keys for verification of the key we are looking for. We’ve added some print statements so we can see the execution.
#time on in our REPL to give us some statistics of execution.
When we execute the function with the data points and a key we should see results similar to this:
> seekKey datapoints 67;; Creating index Seeking key: 67 Real: 00:00:00.653, CPU: 00:00:00.650 ... // Your times will differ.
These are the results we expected. We are creating an index, using a
Set, and searching the index. Next we’ll create a partially applied function so we can do multiple lookups.
> let seek = seekKey dataPoints;; Real: 00:00:00.000, CPU: 00:00:00.000 ...
Again the results aren’t to shocking here. We are unwrapping the first layer of currying to reveal the function below. Now let’s use our partially applied function
seek to verify a list of keys.
> keys |> List.map (seekKey datapoints);; Creating index Seeking key: 1 Creating index Seeking key: -4 Creating index Seeking key: 45000 Creating index Seeking key: 2501 Creating index Seeking key: 56000 Real: 00:00:03.396, CPU: 00:00:03.512 ...
We’ve revealed a problem with our function. We are creating the index on every lookup. The reason for creating an index is to improve the performance for multiple lookups. We have not accomplished this.
Multiple Seek Improvement
Let’s rewrite our function. Instead of relying on currying, we’ll have our function return a function. This will allow us to inject some execution before performing the lookup.
-> (int -> int) let seekKey' keys = printfn "Creating index" let index = Set.ofSeq keys (fun key -> printfn "Seeking key: %d" key index |> Seq.contains key) > seekKey' datapoints 67;; Creating index Seeking key: 67 Real: 00:00:00.659, CPU: 00:00:00.656 ...
The type signature of the new function is close to the original function. In fact, we can call the new function the exact same way as the original.
Look at what happens when we apply only the first argument.
> let seek' = seekKey' datapoints;; Creating index Real: 00:00:00.690, CPU: 00:00:00.737 ...
We can see the index is created immediately. How is this going to affect multiple key lookups?
> keys |> List.map (seekKey' datapoints);; Creating index Seeking key: 1 Seeking key: -4 Seeking key: 45000 Seeking key: 2501 Seeking key: 56000 Real: 00:00:00.710, CPU: 00:00:00.705
That’s a more performant lookup for multiple values. What we’ve done is made the creation of the index happen after applying the first argument. The index then gets wrapped in a closure to the returned function. This means we don’t have to re-create the index every time we apply the second argument.
What more can we do?
We’re close to how we want the function to work. There’s one last thing. It would be nice to create the index only if we use it. We can accomplish this using a
-> (int -> int) let seekKey'' keys = let index = lazy ( printfn "Creating index" Set.ofSeq keys) (fun key -> printfn "Seeking key: %d" key index.Value |> Seq.contains key)
We’ve placed the index creation in the
lazy expression and bind it to the label
index. We then call
index.Value when we want to use the result of the
Let’s run through our testing scenarios.
> seekKey'' datapoints 67;; Seeking key: 67 Creating index Real: 00:00:00.679, CPU: 00:00:00.732 ...
This revealed an interesting result. The times are fairly equivalent to the first two code snippets, but in this case the print statements are in a different order. We are printing the “Seeking” before the “Creating”. This is due to the
lazy and not evaluating until we tell it to.
Applying one argument:
> let seek'' = seekKey'' datapoint;; Real: 00:00:00.000, CPU: 00:00:00.000 ...
Yes! This is what we wanted. Immediate return of the function without creating the index. This allows us to apply the first argument without any penalty if we don’t use the returned function later in processing.
For completeness, running multiple lookups.
> keys |> List.map (seekKey' datapoints);; Seeking key: 1 Creating index Seeking key: -4 Seeking key: 45000 Seeking key: 2501 Seeking key: 56000 Real: 00:00:00.734, CPU: 00:00:00.727 ...
Again the index is created just before the seek of the first key and the index isn’t created again.
What we’ve done is take a curried function and turn it into a higher order function. By making this change we’ve improved the performance of our function when doing multiple lookups. We used the
lazy computation to remove the initial index creation if we don’t do any lookups. This was a fun experiment.