
Serverless, DynamoDB, Streams, Filter Patterns and You

April 06, 2022 · Nathan Dolzonek · 4 min read

Picture of coffee filtering

This article covers filter patterns in the Serverless Framework as they apply to DynamoDB Streams, but the principles apply to filter patterns on other event types as well (Kinesis, EventBridge, etc.).


DynamoDB Streams provide near-real-time information on all data modifications that occur in your database at no extra cost (no paying for event messages or any in-between computation). We can then use these streams to trigger Lambdas without having to set up any extra infrastructure! Read more about DynamoDB Streams here.

This sounds great, but the problem quickly becomes that a Lambda is triggered on every single data modification to your DynamoDB table, wasting the saved costs by spinning up a Lambda whose only job is to recognize it has no job to do and then stop processing. The answer to this problem is filter patterns. Filter patterns let you execute a listener for an event only under certain circumstances, removing the data-validation step from our Lambda and relying on the filter pattern to invoke the Lambda only when necessary. Better still, stream filter patterns come at no extra cost, making them the perfect solution.


Before we dive in

Filter patterns work on any event structure, but our examples are specific to DynamoDB streams, which have an event structure like the following:

{
  "eventName": "INSERT" | "MODIFY" | "REMOVE",
  "dynamodb": {
    "Keys": {...},
    "NewImage": {...},
    "OldImage": {...}
  }
}
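To make the shape concrete, here is what a single MODIFY record might look like in Python (the `id` and `jobState` attribute names are illustrative, not from the stream spec). Note that every attribute value is wrapped in a type descriptor: `S` for strings, `N` for numbers.

```python
# Illustrative MODIFY record in DynamoDB Stream shape. The attribute names
# (id, jobState) are made up for this example.
record = {
    "eventName": "MODIFY",
    "dynamodb": {
        "Keys": {"id": {"S": "job-123"}},
        "NewImage": {"id": {"S": "job-123"}, "jobState": {"S": "DONE"}},
        "OldImage": {"id": {"S": "job-123"}, "jobState": {"S": "RUNNING"}},
    },
}

# Reading a value means unwrapping its type descriptor:
new_state = record["dynamodb"]["NewImage"]["jobState"]["S"]
```

This typed wrapping is why the filter patterns below always end with an `S:` or `N:` key before the expected values.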

The examples below use the serverless-lift plugin to create our database instances. This is why the stream ARNs look so clean: they are direct references to a database created in the serverless file. Read more about Serverless Lift and DynamoDB here!


How to implement filter patterns in your project

Let's say you have a series of jobs making updates to a single record, and you have a Lambda that will send an email to your users when the jobs finish processing. To denote when a job is finished, there is a jobState column on your record. You could have a Serverless function set up like this:

emailUsersWhenJobComplete:
  handler: src/emailUsersWhenJobComplete.handler
  events:
    - stream:
        type: dynamodb
        arn: ${construct:batchJobs.tableStreamArn}
        filterPatterns:
          - eventName: [INSERT, MODIFY]
            dynamodb:
              NewImage:
                jobState:
                  S: [DONE]
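With the filter doing the gating, the handler itself stays trivially simple. The article doesn't show the handler body or its runtime, so here is a minimal sketch assuming a Python runtime and a hypothetical `userEmail` attribute on the record:

```python
# Minimal handler sketch (assumed Python runtime; the userEmail attribute and
# the email-sending step are hypothetical stand-ins).
def handler(event, context):
    emails_sent = []
    for record in event["Records"]:
        new_image = record["dynamodb"]["NewImage"]
        # Thanks to the filter pattern, every record delivered here already
        # has jobState == DONE, so no defensive state check is needed.
        user_email = new_image["userEmail"]["S"]
        emails_sent.append(user_email)  # stand-in for a real email call
    return emails_sent
```

The point is what's missing: there is no `if jobState != "DONE": return` branch, because the filter pattern guarantees the invariant before the Lambda is ever invoked.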

Let's break the filter pattern down.

First, the filter pattern says to pay attention only to INSERT or MODIFY events, meaning any REMOVE events are automatically ignored. The next few lines of the filter pattern drill down into the event structure to reach the attribute we need. Finally, it says that we expect the value of jobState to be DONE. Any inserts or updates to our table records that happen before the job is done are ignored by our pattern, and our Lambda doesn't even spin up.

This is good, but what about updates that happen to our record after it hits the DONE state? Those will trigger our Lambda as well, causing us to send emails we don't want to send. So we only want the Lambda to run when the jobState was not DONE before the modification. We can accomplish that with filter patterns too. Our Serverless function will now look like this:

emailUsersWhenJobComplete:
  handler: src/emailUsersWhenJobComplete.handler
  events:
    - stream:
        type: dynamodb
        arn: ${construct:batchJobs.tableStreamArn}
        filterPatterns:
          - eventName: [INSERT]
            dynamodb:
              NewImage:
                jobState:
                  S: [DONE]
          - eventName: [MODIFY]
            dynamodb:
              NewImage:
                jobState:
                  S: [DONE]
              OldImage:
                jobState:
                  S:
                    - anything-but: [DONE]

This splits the filter pattern we had before into separate INSERT and MODIFY patterns. AWS applies filter patterns so that the Lambda is triggered if the stream event matches any of the provided patterns.

The MODIFY pattern has also been updated to reference OldImage: the state the database record was in before the modify event was emitted. We have now specified that if an update leaves our record with a jobState of DONE, we only trigger our Lambda and send an email if the jobState before the update was not DONE, solving our earlier problem. Users are now notified only when the job finishes, not when further updates are made afterward.
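The matching semantics (OR across patterns, AND within a pattern, `anything-but` for exclusion) can be sketched as a rough Python simulation. This is a simplified model for intuition, not AWS's actual implementation:

```python
def matches(patterns, record):
    """Rough simulation of AWS event filtering: a record passes if it
    matches ANY pattern; within a pattern, every rule must match (AND)."""

    def rule_matches(rule, value):
        # A leaf rule is a list of alternatives: literal values, or
        # matcher objects like {"anything-but": [...]}.
        for alternative in rule:
            if isinstance(alternative, dict) and "anything-but" in alternative:
                if value not in alternative["anything-but"]:
                    return True
            elif alternative == value:
                return True
        return False

    def pattern_matches(pattern, data):
        for key, sub in pattern.items():
            if key not in data:
                return False
            if isinstance(sub, dict):  # drill deeper into the event shape
                if not pattern_matches(sub, data[key]):
                    return False
            else:  # leaf: list of allowed values
                if not rule_matches(sub, data[key]):
                    return False
        return True

    return any(pattern_matches(p, record) for p in patterns)


# The two patterns from the config above, as Python data:
patterns = [
    {"eventName": ["INSERT"],
     "dynamodb": {"NewImage": {"jobState": {"S": ["DONE"]}}}},
    {"eventName": ["MODIFY"],
     "dynamodb": {"NewImage": {"jobState": {"S": ["DONE"]}},
                  "OldImage": {"jobState": {"S": [{"anything-but": ["DONE"]}]}}}},
]

# A job finishing (RUNNING -> DONE) should match; a later edit to an
# already-DONE record (DONE -> DONE) should not.
finishing = {"eventName": "MODIFY",
             "dynamodb": {"NewImage": {"jobState": {"S": "DONE"}},
                          "OldImage": {"jobState": {"S": "RUNNING"}}}}
later_edit = {"eventName": "MODIFY",
              "dynamodb": {"NewImage": {"jobState": {"S": "DONE"}},
                           "OldImage": {"jobState": {"S": "DONE"}}}}
```

Running `matches` over these two events shows exactly the behavior described above: the finishing transition passes the second pattern, while the post-DONE edit is filtered out by `anything-but`.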

What else can filter patterns do? Quite a lot. Let's say we have a shopping platform and we want to send our users a coupon when they place an order over $100.00. We could have a Serverless function set up like this:

sendCoupon:
  handler: src/sendCoupon.handler
  events:
    - stream:
        type: dynamodb
        arn: ${construct:orders.tableStreamArn}
        filterPatterns:
          - eventName: [INSERT]
            dynamodb:
              NewImage:
                total_amount_cents:
                  N:
                    - numeric: [">", 10000]
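One subtlety worth noting: DynamoDB serializes numbers (`N`) as strings, yet the `numeric` rule compares them as numbers, so `"12550"` correctly passes a `> 10000` check. A sketch of what that single-comparison rule amounts to (real numeric rules also support bounded ranges, which this sketch omits):

```python
import operator

# Map the comparison symbols used in numeric filter rules to Python operators.
OPS = {">": operator.gt, ">=": operator.ge,
       "<": operator.lt, "<=": operator.le, "=": operator.eq}

def numeric_rule_matches(rule, raw_value):
    """Sketch of a single numeric rule, e.g. {"numeric": [">", 10000]}.
    DynamoDB serializes N values as strings, so convert before comparing."""
    op, threshold = rule["numeric"]
    return OPS[op](float(raw_value), threshold)
```

So a $125.50 order (`"12550"` cents) triggers the coupon Lambda, while a $100.00 order (`"10000"`, not strictly greater) does not.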

Or say we need to run extra structural checks in a 3D modeling system whenever a user adds a pipe with a radius greater than 5 cm, a length greater than 10 m, and any PVC-based material type:

structureCheck:
  handler: src/structureCheck.handler
  events:
    - stream:
        type: dynamodb
        arn: ${construct:structures.tableStreamArn}
        filterPatterns:
          - eventName: [INSERT]
            dynamodb:
              NewImage:
                type:
                  S: [piping]
                radius_cm:
                  N:
                    - numeric: [">", 5]
                length_m:
                  N:
                    - numeric: [">", 10]
                material:
                  S:
                    - prefix: [PVC]
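The `prefix` matcher is what makes "any PVC-based material" work: `PVC` matches `PVC-U`, `PVC-C`, and so on. A sketch of the string-matching side, following the list form used in the example above (material names like `PVC-U` are illustrative):

```python
def string_rule_matches(rule, value):
    """Sketch of a string filter rule: literal values match exactly, and a
    {"prefix": ...} object matches when the value starts with a listed
    prefix (e.g. "PVC" matches "PVC-U")."""
    for alternative in rule:
        if isinstance(alternative, dict) and "prefix" in alternative:
            prefixes = alternative["prefix"]
            if isinstance(prefixes, str):  # accept both string and list forms
                prefixes = [prefixes]
            if any(value.startswith(p) for p in prefixes):
                return True
        elif alternative == value:
            return True
    return False
```

Because all four conditions live in one pattern, they are ANDed together: only a `piping` record that clears both numeric thresholds and has a PVC-prefixed material invokes the Lambda.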

As you can see, pairing DynamoDB streams with Serverless filter patterns lets you perform in-depth event logic without spinning up any extra infrastructure or invoking any extra functions. Next time you need to publish an event after a database write in your serverless project, why not try out a filter pattern? You just might save yourself some time and money.


Links

Images from Unsplash

Information on DynamoDB stream event structures and testing events

Full list of filter pattern rules and raw json filter pattern examples

Information on the Serverless lift plugin

Nathan Dolzonek


Web Developer at Theodo