Dirt-cheap Serverless Flask hosting on AWS

Update 2023-01-30: Updated the code to work with CDK 2.0

Today I want to tell you how you can host a dynamic Flask application for cheap on AWS using Serverless technologies. If you are interested in hosting a low-traffic Flask app that can scale easily, and that can be deployed with a single command for almost free, you will find this blog helpful. If you are not interested in reading why I started on this journey, feel free to skip to the overview section.

Table of Contents:

  1. My Low-Cost Hosting Journey
    1. A Cheap VM (2015)
    2. ECS (early 2018)
    3. Adding CloudFormation to the mix (late 2018)
    4. Trying out Zappa (2020)
    5. AWS SAM (2020)
    6. Meet CDK (2021)
  2. Overview of the Holy Grail
    1. Lambda
    2. API Gateway
    3. apig-wsgi
    4. CloudFront
    5. Cache Control
    6. Session Handling
  3. Wrapping Up
  4. Cost (update)

My Low-Cost Hosting Journey

As a hobbyist programmer, one of the things I've spent a lot of time thinking about is how to host a dynamic HTTPS website (including this blog) as cheaply and easily as possible. The cheaply part refers to the literal dollar cost. I wanted to spend as little as possible and not be wasteful with what I am paying for. The easily part refers to the ease of development and deployment, like being able to stand up a new website with a single command. My programmer instinct told me to do as little manual work as possible.

You might ask, why not just host a static website? I just found being able to host dynamic service code very cool, and it requires less thinking, so that's what I am focusing on here.

A Cheap VM (2015)

The first obvious choice for me was to use a cheap VM. You can get a cheap instance that can host a Flask website for less than $10/month. This is probably the popular method today due to its low barrier to entry - This is how everyone do their web programming locally, after all. There is no shortage of documentation on how to set up NGINX with your Flask application. It involves clicking a few times to get a new VM, then sshing into the instance, and then installing your software.

However, I grew pretty unsatisfied with the setup over time:

  1. Setting up the instance was time consuming and tedious. I tried using solutions like Ansible to automate the setup within the instance but it took a long time to test and get it right. There were many manual steps. For example, the DNS entry for the website was outside the setup script. All these manual steps had to be documented, or else I would just forget about them and would have no idea how to bring the website up again.
  2. It also takes a lot of effort to set up an instance that is "prod"-ready. "Production" concerns include things like rotating logs so that it doesn't fill up your disk, updating software so you don't end up running a botnet. Reading the access logs taught me that the Internet is a fairly dangerous place - you get a ton of random break-in attempts (mainly targeting PHP message boards, but there are others too).
  3. Since the setup was complicated, testing my change in a prod-like setting was out of question. So I just tested in prod.
  4. Setting up HTTPS took way more effort than I imagined. Even after letsencrypt came out, it was quite a bit of hassle to make sure certificate renewal works correctly and that the cert is not lost over instance loss. I could have slapped an ELB in front to get the certificate from AWS, but that cost $15/month so I decided not to use that.
  5. It was wasteful. The resource utilization was very low (single digit % CPU utilization) most of the time, which meant most of the money I paid for the instance was basically thrown away. Even the smallest unit of instances proved to be too coarse for my purpose.
  6. At the same time, the top end of the scaling limit was quite low. At most, the VM was able to handle a few dozen requests simultaneously. On the other hand, I couldn't find a way to make it possible to scale up, without forking at least $20 a month.
  7. It was really easy to lose data. So instead, I used the free tier RDS instance for a year, but it started charging $10+/month after that (I eventually moved data to DynamoDB to save cost, at the expense of re-writing some code).

ECS (early 2018)

My next attempt was to use Elastic Container Service (ECS). For those who don't know, ECS is a container orchestration engine from AWS. This was before Kubernetes became dominant like today.

Dockerizing the application meant that I was at least able to launch the instance from the ground up easily, and that if the instance is down, ECS will start it back up. I still had to setup the whole NGINX + uWSGI + Flask combo since ECS doesn't help me with that. It solved some problems but it was not any cheaper or that much simpler. It was still up to me to make sure the instances are up to date.

Adding CloudFormation to the mix (late 2018)

By the end of 2018, I've caught up with the whole Infrastructure-as-Code (IaC) thing, so I decided to migrate my ECR setup to a CloudFormation template. In case you are not familiar with it, CloudFormation (CFN) is an Infrastructure-as-Code (IaC) solution to deploy infrastructure easily. Simply put, IaC allows you to deploy your entire infrastructure like code. IaC lets you manage your infrastructure like code so you can version control it, easily rollback, and deploy your infrastructure with a single command.

This setup worked, and I was even able to make a very basic form of zero-downtime deployment work, by deploying another stack and swapping the Elastic IP between two instances. That was done outside CFN but it worked well enough. Deploying a whole new server with just a command was a cool achievement so I was proud of that.

However, it did take many, many days to get the template right. The CloudFormation template had almost no type checking. It wasn't easy to find out which fields were mandatory or not, other than by consulting the scattered documentation (it has one web page per data type... really?). The whole "edit-compile-test" iteration time was long. It took minutes for CloudFormation to tell me something was wrong, and then it took many more minutes for it to get back to the state where I could try another change.

The final CFN template was definitely something that I did not want to touch ever again, once it was working. There was also no cost-savings still.

Trying out Zappa (2020)

AWS Lambda came out in 2014, and popularized so-called "serverless" computing, also often called function-as-a-service. I'd explain Lambda like this: Lambda lets you run a function, rather than the whole operating system. A JSON event goes in, and your code runs based on that. You can call it however often as you'd like because scaling is handled by Lambda. Lambda bills for the usage in millisecond precision. If you don't use it, you don't pay for it. If you use it for a second a month, you pay for the second, not anything more than that. It's hard for me to explain how revolutionary this is - every single highlighted issue is a hard problem.

A minor bonus for hobbyists like us is that Lambda's free tier lasts forever unlike EC2 and in my opinion, pretty generous. You can host a low-traffic website for close to free, forever.

When I first heard about Lambda, I thought it would be perfect for me but I was worried about a few things: Cold-start time sounded scary, and it wasn't very obvious to me how to migrate an existing app, and the local testing story was not there, so I didn't think to use it.

Eventually in 2020, I gave it another look when I started hearing more about the benefits of Lambda and how more mature the ecosystem around it is.

My first attempt was using Zappa. It was pleasantly simple to to use and it did convince me that Lambda was way to go. However, it became apparent to me soon that it wasn't for me. Zappa was quite opaque in its operation and it didn't look like there was any integration point or an escape hatch into the rest of the CloudFormation ecosystem.

For example, I wanted to attach a CDN in front such that I can cache contents any way I want. It was not possible to do this with Zappa. Even today, the main page suggests to use something else like S3 to serve contents in conjunction with Zappa for hosting static contents.

It seemed that I had a fundamental disagreement with the project's architecture and direction. I believed this unnecessarily complicated the local testing story. I didn't agree that Flask-generated text content are somehow any less cachable. And I still don't think it's any less "serverless" to serve binary with Flask when the CDN is in front.

In summary, Zappa convinced me to go serverless but ironically, I wasn't convinced Zappa was the way to go, so I kept searching.

AWS SAM (2020)

AWS Serverless Application Model (SAM) is a tool around CloudFormation to make it easier to develop serverless Applications.

SAM does including, but not limited to the following:

  1. Various CloudFormation transformations that make common serverless application definitions simpler.
  2. Other helpers to make deployment of Lambda bundles easier for common languages (e.g., Python).
  3. Harness to test and execute Lambda functions locally. It will essentially parse out the CloudFormation template to setup a local Lambda environment that's similar enough.

Since the infrastructure layer is thin, I was able to setup the infrastructure around my Lambda, exactly the way I wanted. The cold start time was not bad at all - it was at worst a second which was acceptable in my opinion (tons of websites perform much poorer). Since there was a CDN in front, the cold start delays were not perceptible most of the time.

I was very pleased with this setup. It was pretty close to the Holy Grail of easy & cheap hosting. Local testing story was acceptable. Excluding the cost of a Route 53 Hosted Zone ($0.50/month), I started paying way less than a dollar per month. A single command deployment was now possible, and there was no disruption to the service.

There were things that I was still unsatisfied with this setup. Mainly, working with CloudFormation was still a big pain. I started using CFN professionally and I still didn't like it. In addition, I didn't see SAM adopted widely, so it wasn't so easy to Google problems with using it. In other words, it was not something I'd recommend to a friend, unless they were willing to spend a lot of time going through the same pain I went through with multiple days of trial-and-error around the infrastructure.

Meet CDK (2021)

Finally, this year I gave CDK a try this year, and I was immediately sold on it. Cloud Development Kit (CDK) improves the CloudFormation experience significantly. It makes CloudFormation so much better that I would always recommend using CDK to define infrastructure, no matter how small your project is.

CDK is essentially a typed CloudFormation template generator. CDK supports writing the infrastructure definition in TypeScript (among many other languages, but please, just use TypeScript - it's not so hard to pick up). That means you get all the usual benefits of typed languages like earlier validation of errors, auto-complete and navigation support in IDEs like Visual Studio Code out of the box. It still generates (synthesis in the CDK parlance) a CloudFormation template at the end of the day so you will have to know a little bit about how CloudFormation works but that's not hard.

Migrating from the raw CloudFormation template was fairly simple because CDK can even import your CloudFormation template into a CDK app. After importing it, it was just a matter of moving one construct at a time to CDK. Unlike in CloudFormation template, referring to an existing resource in CDK was also fairly trivial. It took me less than a day to migrate the whole thing.

This was it, I finally had something good - something I can recommend to a friend. In fact, this blog you are reading is hosted using this exact setup.

SAM still had a place in the CDK world because it can be used to emulate Lambda locally based on the CDK-generated CFN template, if necessary. However, I rarely ended up using it once I got the infrastructure setup such that the local execution environment matches the remote environment.

Overview of the Holy Grail

For the rest of the blog, I want to explain how the Holy Grail is put together. I made a runnable starter kit available on GitHub so you can clone and host your own Serverless Flask on AWS easily. I'll include links to code in the post so you can refer back to the actual code.

We are going to use the all the components discussed previously: CDK, CloudFormation, Lambda, API Gateway, CloudFront CDN and S3. Here's a diagram of how they relate to each other.

CDK->CloudFormation deploys Lambda, API Gateway, CloudFront and S3

Let's start from Lambda, since that's where the code runs.

Lambda

Defining Lambda in CDK is pretty straightforward. The following sample shows how it can be done:

let webappLambda = new lambda.Function(this, "ServerlessFlaskLambda", {
  functionName: `serverless-flask-lambda-${stageName}`,
  code: lambda.Code.fromAsset(__dirname + "/../build-python/",),
  runtime: lambda.Runtime.PYTHON_3_9,
  handler: "serverless_flask.lambda.lambda_handler",
  role: lambdaRole,
  timeout: Duration.seconds(30),
  memorySize: 256,
  environment: {"JSON_CONFIG_OVERRIDE": JSON.stringify(lambdaEnv)},
  // default is infinite, and you probably don't want it
  logRetention: logs.RetentionDays.SIX_MONTHS,
});

(link to code in the starter kit)

By using lambda.Code.fromAsset, you can just dump your self-contained Python environment and let CDK upload it to S3 and link it to Lambda automagically.

There are two more main problems we need to tackle before we can actually host a Flask app. First, Lambda doesn't speak HTTP so something else needs to convert HTTP into a JSON event. Second, since Flask app doesn't speak the JSON object, somebody also needs to translate the JSON event into something Flask understands. Using API Gateway and apig-wsgi, I was able to solve both problems nicely.

API Gateway

API Gateway is a fairly complex product. I, myself, am not sure how to explain it. In any case, API Gateway is frequently used to give Lambda an HTTP interface, so the CDK module for API Gateway already provides a construct called LambdaRestApi. The following is all you need to define the API Gateway for the Lambda:

let restApi = new agw.LambdaRestApi(this, "FlaskLambdaRestApi", {
  restApiName: `serverless-flask-api-${stageName}`,
  handler: webappLambda, // this is the lambda object defined
  binaryMediaTypes: ["*/*"],
  deployOptions: {
  throttlingBurstLimit: MAX_RPS_BUCKET_SIZE,
  throttlingRateLimit: MAX_RPS
  }
});

(link to code in the starter kit)

The binaryMediaTypes is set to all types such that it simplifies handling of all content types. throttlingBurstLimit and throttlingRateLimit are one of the simplest ways I've seen to apply a token-bucket style throttling to your web app. It also serves as a control to protect yourself from an unwanted billing disasters.

apig-wsgi

The LambdaRestApi object from the previous section takes a HTTP request and hands it over to the Lambda. But Flask doesn't understand this particular format, which speaks Web Server Gateway Interface (WSGI) only. Fortunately, there is a Python library named apig-wsgi that can convert the API Gateway-format into WSGI and vice versa. The library is very simple to use, you simply need to wrap the Flask app with it. In the following code, create_app is the function that creates your Flask app.

from apig_wsgi import make_lambda_handler
from serverless_flask import create_app

inner_handler = make_lambda_handler(app, binary_support=True)

def lambda_handler(event, context):
  return inner_handler(event, context)

(link to code in the starter kit)

CloudFront

I suggest fronting the API with CloudFront, which is a Content Distribution Network (CDN) service by AWS. It has two main purposes. First, when you create an API Gateway-backed API, your application root is always prefixed by the stage name (e.g., /prod). CloudFront can re-write the URL to provide a clean URL (/my-url to /prod/my-url). Second, it can improve your application performance by being closer to your users and caching responses. The following code snippet assembles a simple CloudFront CDN. This is by far the longest CDK snippet:

let cdn = new cloudfront.Distribution(this, "CDN", {
  defaultBehavior: {
    functionAssociations: [{
      eventType: cloudfront.FunctionEventType.VIEWER_REQUEST,
      function: new cloudfront.Function(this, "RewriteCdnHost", {
        functionName: `${this.account}RewriteCdnHostFunction${stageName}`,
        // documentation: https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/functions-event-structure.html#functions-event-structure-example
        code: cloudfront.FunctionCode.fromInline(`
        function handler(event) {
          var req = event.request;
          if (req.headers['host']) {
            req.headers['x-forwarded-host'] = {
            value: req.headers['host'].value
            };
          }
          return req;
        }
        `)
      })
    }],
    origin: new origins.HttpOrigin(restApiUrl, {
      originPath: "/prod",
      protocolPolicy: cloudfront.OriginProtocolPolicy.HTTPS_ONLY,
      connectionAttempts: 3,
      connectionTimeout: Duration.seconds(10),
      httpsPort: 443,
    }),
    smoothStreaming: false,
    viewerProtocolPolicy: cloudfront.ViewerProtocolPolicy.REDIRECT_TO_HTTPS,
    cachedMethods: cloudfront.CachedMethods.CACHE_GET_HEAD_OPTIONS,
    allowedMethods: cloudfront.AllowedMethods.ALLOW_ALL,
    compress: true,
    cachePolicy: new cloudfront.CachePolicy(this, 'DefaultCachePolicy', {
      // need to be overriden because the names are not automatically randomized across stages
      cachePolicyName: `CachePolicy-${stageName}`, 
      headerBehavior: cloudfront.OriginRequestHeaderBehavior.allowList("x-forwarded-host"),
      // allow Flask session variable
      cookieBehavior: cloudfront.CacheCookieBehavior.allowList("session"),
      queryStringBehavior: cloudfront.CacheQueryStringBehavior.all(),
      maxTtl: Duration.hours(1),
      defaultTtl: Duration.minutes(5),
      enableAcceptEncodingGzip: true,
      enableAcceptEncodingBrotli: true
    }),
  },
  priceClass: cloudfront.PriceClass.PRICE_CLASS_200,
  enabled: true,
  httpVersion: cloudfront.HttpVersion.HTTP2,
});
new CfnOutput(this, "CDNDomain", {
  value: cdn.distributionDomainName
});

(link to code in the starter kit)

Most of the configuration is self-explanatory, but there are a few things that need explanation: domain re-writing and cache control.

Domain re-writing

Domain re-writing is implemented so that Flask can know to which domain it's serving the content. This is important if your Flask app needs to know its own domain for things like sub-domain support and absolute URL generation. If you are hosting Flask in a more traditional architecture, this is not an issue but in this case, we are going through CloudFront and API Gateway so it's a bit more involved.

CloudFront is capable of passing the Host header but this is not possible when using API Gateway in the middle because API Gateway uses the Host header to distinguish its clients. (Googling this suggests this to be a common problem).

If you simply pass through the Host header, you will get a mysterious 403 error from API Gateway (most like this is because they use SNI to differentiate different originating domains).

Fortunately, we can use a super cool feature named CloudFront Functions to solve this problem. CloudFront Functions lets you give the CDN a JavaScript function which can modify the request and response objects at will, so long as they finish in a millisecond. In our setup, the function code will rename the original Host header into x-forwarded-host. We also need to allow the specific header to be forwarded.

Since the Flask application doesn't really know about the x-forwarded-host, we need to re-write the header once more to restore the Host header:

def lambda_handler(event, context):
  app.logger.debug(event)
  headers = event['headers']
  cf_host = headers.pop("X-Forwarded-Host", None)
  if cf_host:
    app.config["SERVER_NAME"] = cf_host
    # patch host header
    headers['Host'] = cf_host
    event['multiValueHeaders']['Host'] = [cf_host]
    app.logger.info(f"Host header is successfully patched to {cf_host}")

  return inner_handler(event, context)

(link to code in the starter kit)

Note that the HTTP header casing is inconsistent - CloudFront only accepts lower-case HTTP header names in the configuration but API Gateway turns them all into Camel-Kebab-Case headers.

After this, the Flask application will work pretty seamlessly with respect to the Host header.

Cache Control

The sample CDN configuration caches responses for 5 minutes by default, up to an hour. This is a sensible default for a mostly static website, but there are times when you don't want the response to be cacheable.

Since CloudFront CDN simply follows the HTTP cache directives, you can use the same mechanism to prevent caching of resources.

from flask import make_response
import time

@app.route("/example_json_api")
def example_json_api():
  resp = make_response({"body": "ok", "time": round(time.time())})
  resp.headers["Content-Type"] = "application/json"
  return resp

@app.route("/example_json_api_no_cache")
def example_json_api_no_cache():
  resp = make_response({"body": "ok", "time": round(time.time())})
  resp.headers["Content-Type"] = "application/json"
  resp.headers["Cache-Control"] = "no-store, max-age=0"
  return resp

You can observe that the first resource is cached for 5 minutes whereas the second resource is always fetched from the source by examining the time field.

The current configuration passes through a cookie named session because that's what's used by Flask to store session data. This effectively disables caching if you start using session (e.g., for logged in users). For a more robust control (such as always caching images regardless of cookies), you will want to create new CloudFront behaviours based on the URL.

Session Handling

This section is only relevant if you are planning to use the Flask session.

Session refers to the state that gets persisted across HTTP requests that the client cannot tamper with. For example, one of the ways to implement the "logged in" state is to use a session variable to indicate the current user name. A typical way this is implemented is by storing the session in a database.

It is possible to also implement the session without a database if you utilize cryptography (with a different set of trade-offs). This is the approach Flask takes by default (Flask quick start section on session). What is relevant in our context is that, you need to securely store the secret key backing the session encryption. If you were to re-generate the secret key every time, the session would not work.

In my setup, I decided to use S3 to store the key. You could use Secret Manager but it is totally not aligned with our goal of minimizing cost.

Here's how to define the S3 bucket in CDK:

let appStore = new s3.Bucket(this, "S3Storage", {
  blockPublicAccess: BlockPublicAccess.BLOCK_ALL,
  removalPolicy:RemovalPolicy.RETAIN,
  encryption: BucketEncryption.S3_MANAGED,
  bucketName: `${this.account}-serverlessflask-s3storage-${stageName}`
});

// grant permissions to the Lambda IAM Role
appStore.grantReadWrite(lambdaRole);

(link to code in the starter kit)

In the code, I opted to simply create a new secret key, if it does not exist - the code is not free of race-condition but it's good enough for our purposes.

Incident Response Plan - if the key ever gets compromised, you can just delete the S3 object and you will get a new key.

Wrapping Up

Feel free to try this yourself with the runnable example code in my serverless-flask-on-aws Github repo. I tried making the sample as realistic as possible - it has simple unit tests samples along with a very detailed documentations on how to run it.

If you found this helpful, please share with your friends using the permalink. Feel free to tweet at me or email me, if you have any comments.

Cost (Update 2021-12-30)

Only_As_I_Fall on Reddit asked how much this costs. This was my answer:

Since this is a mostly static website, I'll assume there aren't that many hits all the way to Lambda - which means the CDN is the dominating factor (Lambda+API Gateway would not cost much). As of now, loading the main page (with 5 latest article) costs about 120kiB per page load, but after compression it's 50KiB. Let's assume it's all not cached. So 1GiB gives me 20,000 hits. I opted for the "price class 200" which can be up to $0.120/GiB.

CloudFront now gives 1TiB for free, so it's free up to 20 million hits per month or 7.9 requests per second flat (as a comparison, reddit gets just 80x more than that). After that, it's about $6 per a million visits.

Catergorized under: programming

Published: 2021-12-26T19:54:46.143107
Last modified: 2023-02-12T02:30:40.531027
Permalink