Enabling Cold Start Optimization for the Java Runtime Using the AWS Lambda SnapStart Opt-in Feature
Introduction:
As more organizations adopt serverless architectures, AWS continues to release enhancements and new services for building robust serverless solutions for enterprises. Companies can rapidly build and run cloud-native solutions for use cases across industries.
Despite many improvements, including support for additional language runtimes and custom runtimes, cold start remains an important consideration. A cold start is the extra time Lambda takes to initialize the execution environment and the function handler before it can serve the first request. It is especially noticeable on popular runtimes such as the JVM, where a framework like Spring Boot can still take more than 10 seconds to start before serving any request. At present, this keeps much of the JVM framework developer community from realizing the full benefit of AWS Lambda for application development. To address this concern, AWS introduced a new opt-in feature called Lambda SnapStart, which can make startup up to 10x faster for the Java runtime, reducing cold start latencies from several seconds to hundreds of milliseconds.
What is the AWS Lambda SnapStart feature?
AWS Lambda SnapStart is a new opt-in feature that reduces startup latency for functions running on the Amazon Corretto Java 11 runtime.
By default, AWS Lambda creates a fresh execution environment when a function is invoked for the first time or is scaled up to handle additional traffic.
As depicted in the preceding diagram, the Lambda initialization process involves several phases: downloading the function code, initializing the runtime, and initializing the application. This process can sometimes take several seconds and adds a variable amount of latency when your function is invoked for the first time.
When SnapStart is enabled, function code is initialized once when a function version is published. Lambda then takes a snapshot of the memory and disk state of the initialized execution environment, persists the encrypted snapshot, and caches it for low-latency access. When the function is first invoked or subsequently scaled, Lambda resumes new execution environments from the persisted snapshot instead of initializing from scratch, improving startup latency.
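To make the snapshot idea concrete, here is a minimal Java sketch of the coding pattern that stands to benefit most: expensive setup runs once during initialization, which is the phase SnapStart captures in the snapshot, while the per-request path stays light. The class name `GreetingHandler`, the `loadCatalog` helper, and the plain `handleRequest(String)` signature are assumptions made for illustration. A real function would typically implement the `RequestHandler` interface from the `aws-lambda-java-core` library instead.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical handler: names and signature are illustrative, not an AWS API.
class GreetingHandler {
    // Counts how often the expensive setup ran; exists only for this sketch.
    static int initCount = 0;

    // Built once when the class is initialized. With SnapStart enabled, this
    // work would run when the version is published and be part of the snapshot.
    private static final Map<String, String> CATALOG = loadCatalog();

    // Stand-in for costly startup work such as loading configuration
    // or wiring framework components.
    private static Map<String, String> loadCatalog() {
        initCount++;
        Map<String, String> catalog = new HashMap<>();
        catalog.put("en", "Hello");
        catalog.put("fr", "Bonjour");
        return catalog;
    }

    // Per-request path: a cheap lookup only, no setup work.
    String handleRequest(String language) {
        return CATALOG.getOrDefault(language, "Hello") + ", Lambda";
    }
}
```

However many requests the handler serves, `loadCatalog` runs only once per class initialization. That is why moving heavy work out of the request path and into initialization pairs well with a snapshot-and-restore model.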
Enabling SnapStart on your Lambda function and capturing load test results
Follow these steps to enable the feature and capture the performance results:
1. Enable the SnapStart feature: go to Console → Lambda → Function-name → Configuration → General configuration → Edit → SnapStart → PublishedVersions
2. Publish a function version and wait for it to be created successfully. You can confirm that SnapStart is enabled by opening the corresponding version and checking its configuration:
Versions → Published Version Number → Configuration
3. Update API Gateway to use the published function version and redeploy the REST API
4. Performance testing
Run traffic-pattern and concurrency tests (for example, 100 requests/sec for 10 seconds) against the application both with and without SnapStart enabled, so that the results can be compared. Use Apache Bench (ab) or Artillery to send 1,000 requests with 100 concurrent clients.
a. For Apache AB:
ab -n 1000 -c 100 https://<URL>/
b. For Artillery:
artillery run load-test.yml
Contents of load-test.yml:
config:
  target: "https://<URL>/"
  phases:
    - duration: 10
      arrivalRate: 100
      name: "Sustained load"
scenarios:
  - name: "Search"
    flow:
      - get:
          url: "/"
5. Query data from Amazon CloudWatch Logs Insights
Use the following CloudWatch Logs Insights benchmark query to check the results. The restore-duration pattern is non-greedy (.*?) so that it stops at the first " ms" after the value:
filter @type = "REPORT"
| parse @log /\d+:\/aws\/lambda\/(?<function>.*)/
| parse @message /Restore Duration: (?<restoreDuration>.*?) ms/
| stats
    count(*) as invocations,
    pct(@duration+coalesce(@initDuration,0)+coalesce(restoreDuration,0), 50) as p50,
    pct(@duration+coalesce(@initDuration,0)+coalesce(restoreDuration,0), 90) as p90,
    pct(@duration+coalesce(@initDuration,0)+coalesce(restoreDuration,0), 99) as p99,
    pct(@duration+coalesce(@initDuration,0)+coalesce(restoreDuration,0), 99.9) as p99.9
  group by function, (ispresent(@initDuration) or ispresent(restoreDuration)) as coldstart
| sort by coldstart desc
6. Capturing the results
Use the table below to compare the performance results with and without the SnapStart feature.
Benefits of the Lambda SnapStart opt-in feature
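As a rough offline cross-check of the benchmark query, the same arithmetic can be sketched in Java. Total latency per invocation is duration plus init duration plus restore duration, with missing fields treated as zero, and percentiles are taken over those totals. The class and method names here are hypothetical, and the REPORT line layout is an assumption based on typical Lambda log lines. The nearest-rank percentile may also differ slightly from what the `pct()` function in CloudWatch computes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for offline analysis; not part of any AWS SDK.
class ReportLatency {
    // Lookbehinds keep plain "Duration" from matching inside
    // "Billed Duration", "Init Duration", or "Restore Duration".
    private static final Pattern DURATION = Pattern.compile(
        "(?<!Billed )(?<!Init )(?<!Restore )Duration: ([0-9.]+) ms");
    private static final Pattern INIT = Pattern.compile(
        "(?<!Billed )Init Duration: ([0-9.]+) ms");
    private static final Pattern RESTORE = Pattern.compile(
        "(?<!Billed )Restore Duration: ([0-9.]+) ms");

    // Returns the captured value, or 0 when the field is absent,
    // mirroring coalesce(field, 0) in the query.
    private static double field(Pattern pattern, String line) {
        Matcher m = pattern.matcher(line);
        return m.find() ? Double.parseDouble(m.group(1)) : 0.0;
    }

    // Total latency as in the query: duration + init + restore.
    static double totalMs(String reportLine) {
        return field(DURATION, reportLine)
             + field(INIT, reportLine)
             + field(RESTORE, reportLine);
    }

    // Nearest-rank percentile over a non-empty list of values.
    static double percentile(List<Double> values, double pct) {
        List<Double> sorted = new ArrayList<>(values);
        Collections.sort(sorted);
        int rank = (int) Math.ceil(pct / 100.0 * sorted.size());
        return sorted.get(Math.max(rank, 1) - 1);
    }

    // How many times faster the SnapStart figure is than the baseline.
    static double improvementFactor(double baselineMs, double snapStartMs) {
        return baselineMs / snapStartMs;
    }
}
```

Feeding the p99 values for the two configurations into `improvementFactor` gives the "Nx" figure for the comparison table. A placeholder baseline of 5000 ms against 500 ms, for instance, gives a factor of 10; those numbers are illustrative only, not measurements.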
This feature significantly helps to reduce the outlier latencies caused by cold starts and improve overall performance of applications running on AWS Lambda.
- Adoption of serverless APIs at scale, with sub-second performance for Java-based enterprise business services, benefits greatly from this feature.
- You can use any JVM-based framework (Spring, Micronaut, Quarkus), including Spring Core, Spring Cloud Function, or Spring Boot.
- It improves overall function response time and can therefore reduce the cost of running the application.
- Migrations to a microservice-based architecture, for both APIs and integrations, benefit greatly because the majority of monolithic applications are built on Java-based frameworks.
- Because the feature uses encrypted snapshots for the cold start improvement, it does not add a security burden for you; under the Shared Responsibility Model, AWS is responsible for securing the snapshot.
Conclusion:
AWS Lambda SnapStart is a great feature for JVM-based runtimes and helps mitigate cold start latency. It is available at no additional cost and can save up to several seconds of startup time.
In our testing across different Java and Spring frameworks, a custom framework, and different design patterns, we found varying results, with overall savings of up to 10x. Many open-source Java frameworks and modules follow different patterns and serve different use cases, such as APIs, integration, batch, and orchestration, which leads to different class-loading patterns and architectures. We suggest choosing the Java framework carefully and verifying its cold start behavior before proceeding with application development.
We believe the SnapStart feature will not only increase the adoption of serverless enterprise application development but also open the door for newer JVM-based frameworks to support it.
This blog was written in collaboration with the AWS Lambda product team. The AWS team members involved were Prateek Agrawal (Partner Solution Architect), Thiago Padua, Young Seok Jeong, and Srinivas Jasti, along with the service team.
Additional References:
https://aws.amazon.com/lambda/
https://cloud.spring.io/spring-cloud-function/reference/html/spring-cloud-function.html