# Performance on AWS Lambda

# Introduction

Kumologica Runtime has been architected around three pillars: developer productivity, security and performance.

The key benefit of Kumologica Runtime is the ability to run your APIs and integrations serverlessly: you don't have to manage servers or runtimes, and the billing model is based on usage rather than pre-provisioned infrastructure, resulting in significant cost savings.

# Benchmarking process

Given that Kumologica Runtime sits on top of the NodeJS runtime, the ultimate goal is to achieve zero overhead over the NodeJS runtime provided by the cloud provider.

For our benchmark, a typical HelloWorld flow will be deployed on AWS Lambda with various memory allocations: 128 MB, 256 MB, 512 MB and 1024 MB.

The API attached to the flow (lambda) will be exposed through AWS API Gateway using a Lambda proxy integration.
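As an illustration, the sketch below shows one way such a deployment could be described with AWS SAM. The resource name, handler, runtime version and code location are assumptions made for this example rather than the exact setup used in the benchmark, and the memory size is exposed as a parameter so the same template can be redeployed for each allocation.

```yaml
# Hypothetical SAM template: HelloWorld flow behind API Gateway (Lambda proxy integration)
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31

Parameters:
  MemorySize:
    Type: Number
    Default: 512             # redeploy with 128, 256, 512 and 1024 for each benchmark run

Resources:
  HelloWorldFlow:
    Type: AWS::Serverless::Function
    Properties:
      Handler: index.handler # assumed entry point exported by the packaged flow
      Runtime: nodejs18.x    # assumed NodeJS runtime version
      CodeUri: ./            # assumed location of the packaged Kumologica flow
      MemorySize: !Ref MemorySize
      Timeout: 30
      Events:
        PerfEndpoint:
          Type: Api          # API Gateway REST API with Lambda proxy integration
          Properties:
            Path: /perf
            Method: get
```

Each memory allocation can then be deployed with, for example, `sam deploy --parameter-overrides MemorySize=256`.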

The load test will be generated using Artillery with the following configuration:

```yaml
config:
  target: <url of our api>
  phases:
    - duration: 60
      arrivalRate: 10
      rampTo: 50
scenarios:
  - flow:
      - get:
          url: '/perf'
```

This will emulate a single phase of load lasting 60 seconds. The phase will start at 10 new virtual users per second and ramp up to 50 new virtual users per second over the 60 seconds.
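Assuming the configuration above is saved as, say, perf-test.yml (with the API Gateway endpoint substituted for the target placeholder), the test can be launched with `artillery run perf-test.yml`.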

Our benchmark will be run on two different AWS configurations: on-demand and provisioned concurrency.

Note that Provisioned Concurrency can be enabled to achieve the best performance.

# Provisioned Concurrency

All tests resulted in a 100% success rate.

# On-Demand

All tests resulted in a 100% success rate. While most of the requests complete near the expected 1-second execution time, there is a long tail where p95 and p99 times are slower. This happens while the function is scaling up: new concurrent invocations are slower to start because of execution environment initialization (cold starts).

To avoid this long-tail behaviour, provisioned concurrency or a higher memory allocation is recommended.
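As a sketch of how both options could be applied, the snippet below extends the function from the earlier SAM template; the alias name and the concurrency value are illustrative assumptions rather than the benchmark's exact configuration.

```yaml
# Hypothetical snippet: provisioned concurrency and a higher memory allocation on the same function
Resources:
  HelloWorldFlow:
    Type: AWS::Serverless::Function
    Properties:
      Handler: index.handler                 # as in the earlier sketch (assumed)
      Runtime: nodejs18.x
      CodeUri: ./
      MemorySize: 512                        # larger allocation shortens cold starts
      AutoPublishAlias: live                 # SAM publishes a version and points this alias at it
      ProvisionedConcurrencyConfig:
        ProvisionedConcurrentExecutions: 10  # keep 10 execution environments initialized
```

With `AutoPublishAlias` set, SAM wires the API Gateway event to the alias, so requests are served by the pre-initialized environments.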

# Recommendations

For most latency-sensitive serverless APIs, a minimum memory allocation of 512 MB is recommended, combined with provisioned concurrency for best performance.

For APIs that are not latency-sensitive, such as cron jobs, a minimum memory allocation of 128 MB is more than sufficient in most cases.