Improving the performance of your apps is a broad subject, and it becomes especially important when you're trying to meet app size limits, improve user experience, or simply keep your costs in check. There's a litany of small tricks available, and today we'll talk about memoization and how it can help you avoid repeating expensive work.

What is Memoization?

You might have already used memoization before without knowing it, especially if you've written some React and used its useMemo hook.

Memoization is a fancy name for what is essentially a cache for functions: by storing a function's return value the first time it is called with a given set of arguments, we can skip re-executing it on subsequent calls with the same arguments.
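As a minimal sketch in Dart (the `memoize1` helper and `slowSquare` function are names I'm inventing for illustration):

```dart
// A minimal memoization helper for single-argument functions.
// On a cache hit we return the stored result; otherwise we compute,
// store, and return it.
R Function(A) memoize1<A, R>(R Function(A) fn) {
  final cache = <A, R>{};
  return (A arg) => cache.putIfAbsent(arg, () => fn(arg));
}

void main() {
  // Stand-in for an expensive computation.
  int slowSquare(int n) => n * n;

  final fastSquare = memoize1(slowSquare);
  print(fastSquare(12)); // computed: 144
  print(fastSquare(12)); // cached: 144
}
```

Map.putIfAbsent does the heavy lifting here: it only invokes the callback when the key is missing, which is exactly the memoization contract.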

As with any cache, you'll need to make sure that your function's results don't depend on anything that can be invalidated by external factors, or that you account for such invalidation.

As an example, if you were to cache a query to an API that returns the most recent news reports, forgetting to invalidate entries based on elapsed time will leave you serving outdated news after a while.
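A minimal sketch of such time-based invalidation, assuming a single cached value and an arbitrary time-to-live:

```dart
// Sketch of a cached value that expires after a time-to-live (TTL).
// The entry is recomputed when missing or older than the TTL.
class TtlCache<T> {
  TtlCache(this.ttl);
  final Duration ttl;
  T? _value;
  DateTime? _storedAt;

  T getOrCompute(T Function() compute) {
    final storedAt = _storedAt;
    if (storedAt == null || DateTime.now().difference(storedAt) > ttl) {
      // Entry missing or expired: recompute and restamp.
      _value = compute();
      _storedAt = DateTime.now();
    }
    return _value as T;
  }
}

void main() {
  final newsCache = TtlCache<String>(Duration(minutes: 5));
  // First call computes; calls within the next 5 minutes reuse the result.
  print(newsCache.getOrCompute(() => 'latest headlines'));
}
```

A real news cache would wrap an async fetch rather than a synchronous closure, but the expiry check is the same idea.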

Implementing Memoization

As is tradition when explaining memoization, we'll use the famed Fibonacci sequence:

// Implement the Fibonacci sequence using recursion
int fibonacci(int n) {
  if (n < 2) {
    return n;
  } else {
    // Recursively call the function until we reach the base case
    return fibonacci(n - 1) + fibonacci(n - 2);
  }
}

The more astute readers will realize that this function is not only recursive but also extremely inefficient, as it often repeats work.

For example, calling fibonacci(5) ends up calling fibonacci(3) two times, fibonacci(2) three times, and fibonacci(1) a total of five times. It is easy to see that with a higher n, this redundant work balloons out of control.
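You can see this duplication directly by instrumenting the naive function with a call counter (the `calls` map is instrumentation I'm adding purely for illustration):

```dart
// Count how many times countedFibonacci(n) is invoked for each n.
final calls = <int, int>{};

int countedFibonacci(int n) {
  calls[n] = (calls[n] ?? 0) + 1;
  if (n < 2) {
    return n;
  }
  return countedFibonacci(n - 1) + countedFibonacci(n - 2);
}

void main() {
  countedFibonacci(5);
  // fibonacci(3) is counted twice, fibonacci(2) three times,
  // and fibonacci(1) five times.
  print(calls);
}
```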

This makes the Fibonacci sequence the perfect illustration of the performance gains that the memoization pattern, and caching in general, can deliver.

// Implement the Fibonacci sequence using recursion and a cache
int memoFibonacci(int n, Map<int, int> memo) {
  if (memo.containsKey(n)) {
    // We've already computed this value, so we just look it up and return it
    return memo[n]!;
  }
  if (n < 2) {
    return n;
  } else {
    // We need to compute and store this value
    memo[n] = memoFibonacci(n - 1, memo) + memoFibonacci(n - 2, memo);
    return memo[n]!;
  }
}

void main() {
  final myCache = <int, int>{};
  memoFibonacci(43, myCache);
}

Using this pattern, the deep chain of recursive calls to memoFibonacci(n - 1) computes each value once and stores it in the shared Map. The subsequent calls to memoFibonacci(n - 2) then simply fetch these precomputed values, avoiding redundant work.

Here’s the complete example using Stopwatch so you can run it and compare the two.

Be careful about the input you use for the naive fibonacci implementation; my own computer can’t handle more than around 45 on DartPad. No such issue with the memoized version.

int fibonacci(int n) {
  if (n < 2) {
    return n;
  } else {
    return fibonacci(n - 1) + fibonacci(n - 2);
  }
}

int memoFibonacci(int n, Map<int, int> memo) {
  if (memo.containsKey(n)) {
    return memo[n]!;
  }
  if (n < 2) {
    return n;
  } else {
    memo[n] = memoFibonacci(n - 1, memo) + memoFibonacci(n - 2, memo);
    return memo[n]!;
  }
}

void main() {
  final watch = Stopwatch();
  watch.start();
  print(fibonacci(43));
  watch.stop();
  print("fibonacci took ${watch.elapsed}");

  watch.reset();

  final myCache = <int, int>{};

  watch.start();
  // Note: values above fibonacci(92) overflow a 64-bit int,
  // so we stop at 90 for the memoized run.
  print(memoFibonacci(90, myCache));
  watch.stop();
  print("memoFibonacci took ${watch.elapsed}");
}

So, I hope you’ve been convinced by the performance improvement that memoization brings, but before I send you on your way there are a few caveats we need to talk about (some of them specific to Dart):

  • The Fibonacci example is quite simple and only features one argument, which lets us use a simple Map<int, int>. With multiple arguments, you’ll need another way to build and compare cache keys before running the function. This can introduce overhead that makes your expected performance improvement vanish.

  • A generic implementation supporting a variable number of arguments will most likely require some copy-pasting (as varargs are not implemented in Dart yet) and letting callers supply the key or comparison function themselves.

  • Make sure your memoized function is pure and free of side effects, depending only on the inputs you give it and no other external state.
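For the multi-argument case, one option in Dart 3 is to pack the arguments into a record, which gives structural equality and hashing for free; a sketch, with `memoize2` and `slowAdd` as names I'm inventing for illustration:

```dart
// Memoize a two-argument function by using a record (a, b) as the
// composite cache key. Records compare by value (Dart 3+), so equal
// argument pairs hit the same cache entry.
R Function(A, B) memoize2<A, B, R>(R Function(A, B) fn) {
  final cache = <(A, B), R>{};
  return (A a, B b) => cache.putIfAbsent((a, b), () => fn(a, b));
}

void main() {
  int slowAdd(int a, int b) => a + b; // stand-in for expensive work

  final fastAdd = memoize2(slowAdd);
  print(fastAdd(2, 3)); // computed: 5
  print(fastAdd(2, 3)); // cached: 5
}
```

The record-key lookup itself costs a hash and an allocation per call, which is exactly the kind of overhead that can eat your gains if the wrapped function is cheap.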

A cache-money real-life example

I’ve recently had the pleasure of doing some consulting work for an acquaintance’s startup offering financial data analysis. Finance is a bit of a weird place: data is extremely expensive (real-time data even more so), and calculations can quickly become complex and long-running, partly due to the ever-increasing size of conflicting datasets that need resolution.

As the team did not include any professional programmers, they ran head first into quite a few walls:

  • Running a complete analysis on every single user connection, querying expensive data every time and breaching their monthly data limit in a matter of two or three days

  • Long-running calculations, with users expected to wait around 40 seconds before gaining access to their dashboard

  • Server code written in single-threaded Python, so if someone queried data right before you did, you were in for an extra 40-second wait

  • An increase in Google Cloud Platform usage in order to meet the demand

To be frank, it was quite incredible to me to see a product that was already hitting scalability issues and bleeding cash with 30 nonpaying daily users.

Since the startup was self-funded, these issues were a real drain and were preventing progress elsewhere in the business, namely customer acquisition, design (they were using a Bootstrap template whose responsiveness they had broken), and increasing the scope of their still-barebones product.

Implementing a cache let them solve a big chunk of their issues: bringing compute time down to nearly nothing meant a vastly improved user experience, while caching the data for reuse cut their data costs significantly. The cash they generate can now be used to improve their product instead of being poured down a black hole.

If you ever find yourself developing a product in such a situation, please consider using a cache or a database to store reusable data or precomputed results. This strategy can be a very effective and easily configurable stopgap measure.

One caveat for those of you using cloud functions / Lambda: you can’t just cache in the function itself because of how short-lived lambdas can be. Thankfully, AWS has a blog post covering the subject.