Working at a startup poses interesting challenges for those working in product development.
There are always jobs to be done, but entering the market quickly, bringing your product up to competitive feature parity and building out the "why we're different" moat requires ruthless prioritization, creative and short feature timelines, and a willingness to roll forward or pivot at short notice.
For the engineers (myself included), that can tend towards what I will term "code-complete vision": the idea that, given a problem, an engineer proposes and builds a vision-perfect version of their solution. I sometimes think of it as the engineer being their own consumer.
This approach has a time and place (and I can count the number of times I've under-engineered a solution, including the problem that led to today's post), but in a startup the trade-off comes at the cost of:
Time-to-market.
Feature validation.
Feedback cycles from the end-user.
Without overstating the obvious, there is always a fine balance between feature release and engineering. Underbake your cake, and you won't serve the customer what they want. Spend too long in the oven, and the customers may have already eaten and be satisfied elsewhere.
At Visibuild, we work closely with our customers and remain transparent so that our finger is on the pulse and we can prioritize what needs to be done as effectively as possible. Once those decisions have been made, the internal team puts together the vision of the feature and works backwards, slicing the end-goal into actionable pieces that bring value to the customer as soon as possible.
One feature that I recently built out was the ability for customers to bulk export PDFs of their "Visis" (a universal umbrella term that covers a project's inspections, issues, tasks and non-conformance reports).
Bulk PDF exports for Visis in the UI
This feature was heavily requested, and we wanted to slice up the end-result into iterations that enabled us to get the feature out to users quicker.
We split this feature into two iterations:
The first iteration would introduce the customer-facing UI on the web application and utilize the backend flow we already had for emailing out PDF exports for a single Visi as an email attachment.
The second iteration would focus on storing the generated ZIP of PDFs remotely and replacing the email attachment with a link to the download.
The problem with the first flow is the limitation of email attachments. There is an attachment size limit that we knew we would eventually hit as project data gets heavier and exports for those Visis become larger, and as usage of the feature grows, hitting that limit becomes inevitable. This was a known assumption when defining our first iteration, so we capped bulk exports at a maximum of fifty Visis requested at one time. The cap is not ideal for us or our customers, so rolling forward to download links is the goal.
Note: it is still possible for the job to fail even with an export cap of fifty, but we knew the cap would help mitigate the risk.
With the second iteration kept high on the prioritization list, the breathing space after the release of the first iteration gave us the chance to spike out a solution for replacing the emailed ZIP of PDFs.
Spiking the solution
The current flow looks something like this:
Old flow
To clarify, the backend setup for the first iteration would generate the PDFs in-memory and eventually generate the ZIP folder with each PDF inside. The requirements for the second iteration release looked like this:
No longer send the ZIP as an email attachment for bulk exports.
Store the generated ZIP folder remotely.
Manage the lifecycle of the asset (from the moment of request to the removal of the folder).
Set up a way to manage a secure download link to the asset (if valid).
The new flow can be pictured like so:
New flow
This blog post goes over the process I went through to build a working spike that emulated our stack closely enough to validate the concept.
Our current stack uses React.js on the frontend and Ruby on Rails on the backend, and is hosted on Amazon Web Services. Because of this, the rest of the blog post has the following two goals:
Building out a Rails + React.js project that can demonstrate the ability to download remote ZIP folder assets securely.
Setting up an infrastructure folder to organize and deploy an infrastructure stack to AWS using the AWS CDK.
I deemed that going deep into the processes that we already had set up (emailing and generating the PDFs) was out-of-scope for this project.
The spike starts by cloning an existing project and building from there.
$ git clone https://github.com/okeeffed/demo-rails-with-react-frontend demo-aws-sdk-s3-gem
$ cd demo-aws-sdk-s3-gem
At this stage, you could install the required dependencies with bundle and yarn, but the next step is to get the project ready for the AWS CDK.
$ mkdir infra
$ cd infra
$ npx cdk init app --language typescript
At this stage, we have the foundations set up for the frontend, backend and infrastructure.
The next logical step is to focus on the infrastructure for hosting the assets that we wish to serve through a download link.
Building out the infrastructure for hosting our assets
For this particular spike, I am suggesting that we use AWS S3 as the destination for our assets. This blog post won't explain S3 too deeply, but the S3 features I am interested in are:
Lifecycle rules let us set a rule to expire a bucket's assets, which means every uploaded asset is ephemeral from the moment it is stored. Removing assets after a set amount of time keeps storage costs down, which is a huge win for those on a tight budget. It does mean we need to track the expiry ourselves, but we will do that later on when we move to the Rails application.
Pre-signed URLs mean we can use our access key (which we will scope to this S3 bucket only) to generate a URL that anyone holding it can use to download the asset. We can surface this on the frontend as the download link, so the client downloads their file directly from the frontend application (where a link is valid and ready for download).
To create our bucket, let's edit the default example stack in the file infra/lib/hello-cdk-stack.ts:
import * as cdk from "aws-cdk-lib";
import { Construct } from "constructs";
import * as s3 from "aws-cdk-lib/aws-s3";

export class InfraStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    new s3.Bucket(this, "DemoBlogRailsAssetsBucket", {
      bucketName: "demo-blog-rails-assets-bucket",
      removalPolicy: cdk.RemovalPolicy.DESTROY,
      encryption: s3.BucketEncryption.KMS,
      bucketKeyEnabled: true,
      lifecycleRules: [
        {
          enabled: true,
          expiration: cdk.Duration.days(1),
        },
      ],
    });
  }
}
This stack is straightforward, but the options we are passing cover the following:
Specifically setting the bucket name to be demo-blog-rails-assets-bucket.
Setting the removal policy of the bucket to DESTROY. This means that when we tear down the stack, it will also delete the bucket so that we do not receive any unexpected costs (for production applications, you will NOT want this).
I am setting S3 server-side bucket encryption for objects at rest. The options can be found on the docs. This is optional for the demo, but worth exploring if you plan to set this up for production.
I am setting a lifecycle rule to remove an uploaded asset one day after upload. This is contrived for demo purposes, but it is worth setting up sensible defaults for your production application.
Once this is done, you can deploy the infrastructure from the infra folder.
Please note that you must have your AWS credentials set up for the deployment to work. I won't cover this, but I personally use aws-vault to manage my accounts locally.
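From the infra folder, deploying is a single command (if the account and region have not been used with the CDK before, you may also need to run npx cdk bootstrap first):

$ npx cdk deploy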
If everything deploys successfully, you can double check your bucket is there using the AWS CLI:
$ aws s3 ls | grep rails
2022-09-30 13:41:17 demo-blog-rails-assets-bucket
At this point, we can move back to the Rails application and set things up there.
Setting up for Ruby
For the demo, we will also add a few more gems that will aid in emulating the functionality that we want:
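The Gemfile additions look like the following (version constraints are omitted here and the gems are added at the top level for simplicity in this spike; pin versions and group them as you see fit):

gem 'aws-sdk-s3'
gem 'dotenv-rails'
gem 'faker'
gem 'parallel'
gem 'rubyzip'
gem 'sidekiq'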
Once they are in your Gemfile, run bundle to install the gems.
What each Gem is used for:
aws-sdk-s3 is our SDK to upload files and generate pre-signed URLs.
dotenv-rails allows us to load environment variables from a local .env file.
faker will generate some fake data for us.
parallel is used for a helper script to upload zip files in parallel.
rubyzip helps us generate ZIP folders with the files we are creating.
sidekiq is our asynchronous background job gem. You can see a basic implementation on my blog post here.
Note: if you are using a different version of Ruby than specified in the .ruby-version and Gemfile, be sure to update those values.
Setting up the dummy assets
The first thing we will do is write a script to add some objects to our bucket. This is essentially a quick sense-check that things work as we expect at this current point.
In bin/sync-assets, add the following:
#!/usr/bin/env ruby
# frozen_string_literal: true

require 'dotenv/load'
require 'zip'
require 'faker'
require 'aws-sdk-s3'
require 'parallel'
require 'rack/mime'
require 'active_support/isolated_execution_state'
require 'active_support/core_ext/numeric/time'
require 'securerandom'

unless ENV['ASSETS_S3_BUCKET']
  puts 'ASSETS_S3_BUCKET environment variable is not set'
  exit 1
end

files = [
  { title: 'file1', body: Faker::Lorem.paragraph },
  { title: 'file2', body: Faker::Lorem.paragraph },
  { title: 'file3', body: Faker::Lorem.paragraph }
]

# Write the file body into a single text entry inside a temporary ZIP file.
def generate_zip(file)
  # Pass the extension separately so the temp file actually ends in ".zip"
  # and the MIME type lookup below resolves correctly.
  temp_file = Tempfile.new([file[:title], '.zip'])

  Zip::OutputStream.open(temp_file) do |zos|
    zos.put_next_entry("#{file[:title]}.txt")
    zos.puts(file[:body])
  end

  temp_file
end

zips = files.map do |file|
  generate_zip(file)
end

# Upload each ZIP under a random object key, in parallel.
Parallel.each(zips, in_threads: 5) do |zip|
  object_key = SecureRandom.uuid

  Aws::S3::Object.new(ENV['ASSETS_S3_BUCKET'], "#{object_key}.zip").upload_file(
    zip,
    {
      content_type: Rack::Mime.mime_type(File.extname(zip.path))
    }
  )
end
This script will do the following:
Generate three ZIP files, each containing a single text file.
Upload them to S3.
Run bin/sync-assets in the command line to execute the Ruby script.
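Assuming the script has been made executable and ASSETS_S3_BUCKET is available (exported in your shell or set in a local .env file), that looks like:

$ chmod +x bin/sync-assets
$ bin/sync-assets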
If there are no errors, you can validate that this was successful by heading to the S3 bucket in the AWS portal or checking the files using the AWS CLI:
$ aws s3api list-objects-v2 --bucket demo-blog-rails-assets-bucket
# You will get back something like this
{
  "Contents": [
    {
      "Key": "60df0050-abb3-4397-ae4b-ffef637ae682.zip",
      "LastModified": "2022-10-12T06:30:27+00:00",
      "ETag": "\"b87c63f5300caa176b930780679590ce\"",
      "Size": 186,
      "StorageClass": "STANDARD"
    },
    {
      "Key": "a51226cd-ad72-4975-a218-59d5250813ab.zip",
      "LastModified": "2022-10-12T06:30:27+00:00",
      "ETag": "\"91267f5c8ba69738489ccafabe4f4c2e\"",
      "Size": 182,
      "StorageClass": "STANDARD"
    },
    {
      "Key": "a63e4be5-93c2-44da-b198-769baae353af.zip",
      "LastModified": "2022-10-12T06:30:27+00:00",
      "ETag": "\"f0f70aec61ef1cce2cebb6cce5b4c5e9\"",
      "Size": 182,
      "StorageClass": "STANDARD"
    }
  ]
}
All three expected ZIP files have been uploaded (it may not be obvious from the metadata alone, but we can infer it from the three objects returned).
At this point, we know we have files in the bucket and can confirm that the infrastructure is working!
Creating a limited policy for our bucket
Now that we have set up our bucket and checked that it works with our script, it is time to create a programmatic AWS access key that we can use for writing to and reading from our new S3 bucket.
Log into the AWS portal and head to IAM. Once there, the first thing we want to do is create a new policy.
Create new policy
We want to select S3 as the service.
Select service
We then want to add GetObject, PutObject access for our new bucket that we've created.
Adding the access we need
Add the specific bucket
Since we are encrypting with KMS, we also need to add the kms:Decrypt and kms:GenerateDataKey capabilities (as outlined in the AWS support guide).
Add an additional permission block and follow the same process as above (this time for the KMS service), selecting those two actions. Leave the resource set to all resources.
With the KMS permissions needed
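Stitched together, the resulting policy document looks roughly like this (the bucket ARN is based on the bucket name we chose earlier; adjust it if yours differs):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:::demo-blog-rails-assets-bucket/*"
    },
    {
      "Effect": "Allow",
      "Action": ["kms:Decrypt", "kms:GenerateDataKey"],
      "Resource": "*"
    }
  ]
}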
Creating an AWS Access Key for our bucket
Next, in IAM, select users and create a new user. We want the user to have programmatic access.
Adding a new user
On the next page, add the new policy that we created to give the limited access to the specific S3 bucket that we want.
Add policy to the user
Create the user and take note of the keys that AWS gives you.
Once that is all done, take the access key ID and secret access key and add them to a new .env file in the root of our Rails application, which we will read in development.
Ensure that .env is ignored and not staged by Git.
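The .env file ends up looking something like the following. The AWS_* variable names are the standard ones the AWS SDK reads, ASSETS_S3_BUCKET matches the scripts in this post, and the values shown are placeholders:

# Placeholder values - replace with the key, region and bucket for your account
AWS_ACCESS_KEY_ID=your-access-key-id
AWS_SECRET_ACCESS_KEY=your-secret-access-key
AWS_REGION=your-bucket-region
ASSETS_S3_BUCKET=demo-blog-rails-assets-bucket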
We can now begin on making changes to the Rails application.
Setting up the additional files for our Rails application
Given that we've pulled the code from a previous blog post, there are some things worth noting:
The components controller file app/controllers/components_controller.rb was generated and components#index is the root route.
Our application entry point for the frontend is app/javascript/application.ts, which imports the base React.js application from app/javascript/components/application.tsx.
With that understanding out of the way, we want to make some modifications to our Rails app by adding some new models, controllers and jobs. Run the following from the CLI in the root folder:
$ bin/rails g model ExportAsset ref_id:string content:string s3_url:string status:string expires_at:datetime
$ bin/rails db:migrate
$ bin/rails g controller api/v1/jobs create
$ bin/rails g controller api/v1/assets index
$ bin/rails g sidekiq:job upload_asset
The above does the following (in order):
Creates a new model ExportAsset. We will use this to track our asset export status, as well as whether or not it has expired.
We run the migration for ExportAsset.
We create an Api::V1::Jobs controller to schedule a new background job.
We create an Api::V1::Assets controller for us to query when we land on our downloads page.
We scaffold out a new job UploadAssetJob for generating a ZIP file, then uploading that file to S3.
The POST endpoint to schedule our upload job
Let's first update our controller in the file app/controllers/api/v1/jobs_controller.rb:
require 'faker'
require 'securerandom'

class Api::V1::JobsController < ApplicationController
  def create
    @asset = ExportAsset.new(content: params[:content], status: 'pending', ref_id: SecureRandom.uuid)

    if @asset.save
      UploadAssetJob.perform_async(@asset.id)
      render json: { message: 'Accepted', id: @asset.ref_id }, status: :accepted
    else
      render json: { errors: @asset.errors.full_messages }, status: :unprocessable_entity
    end
  end
end
This API endpoint creates a new ExportAsset entity with a pending status and schedules the background job. If the asset saves successfully, it returns a "202 Accepted" response; if it does not, the validation errors are returned.
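As a sanity check once the app is running (and with CSRF protection disabled for development, which we cover further down), the endpoint can be exercised directly with curl; the content value here is arbitrary:

$ curl -X POST http://localhost:3000/api/v1/jobs \
    -H "Content-Type: application/json" \
    -d '{"content": "Hello, Visibuild"}'
# Returns something like {"message":"Accepted","id":"<ref_id uuid>"}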
Creating our job to generate a zip and upload the asset
In app/sidekiq/upload_asset_job.rb, add the following:
require 'zip'
require 'aws-sdk-s3'

unless ENV['ASSETS_S3_BUCKET']
  puts 'ASSETS_S3_BUCKET environment variable is not set'
  exit 1
end

class UploadAssetJob
  include Sidekiq::Job

  def perform(asset_id)
    # find_by (rather than find) so the guard below handles a missing record instead of raising
    asset = ExportAsset.find_by(id: asset_id)
    return unless asset

    begin
      sleep 10 # simulate waiting-to-process time
      asset.update(status: 'processing')

      zip = generate_zip(asset.content)
      upload_to_s3(asset, zip)

      sleep 10 # simulate processing time

      s3_url = generate_presigned_url(asset)
      asset.update(status: 'completed', expires_at: Time.now + 1.minute, s3_url:)
    rescue StandardError
      asset.update(status: 'failed')
    end
  end

  private

  # Write the content into a single text entry and return the ZIP as a binary string.
  def generate_zip(content)
    string_io = Zip::OutputStream.write_buffer do |zos|
      zos.put_next_entry('content.txt')
      zos.puts(content)
    end

    string_io.string
  end

  def upload_to_s3(asset, zip)
    Aws::S3::Object.new(ENV['ASSETS_S3_BUCKET'], "#{asset.ref_id}.zip").put(
      {
        body: zip,
        content_type: 'application/zip'
      }
    )
  end

  # Generate a pre-signed GET URL that matches the one-minute expiry tracked on the asset.
  def generate_presigned_url(asset)
    s3 = Aws::S3::Resource.new
    object_key = "#{asset.ref_id}.zip"
    s3.bucket(ENV['ASSETS_S3_BUCKET']).object(object_key).presigned_url(:get, expires_in: 60) # 1 minute
  end
end
The background job has been written to do the following:
Simulate processing time with sleep.
Update the status to be processing.
Add the uploaded content to a zipped text file.
Upload that file to S3.
Generate a pre-signed URL.
Set that pre-signed URL and expires_at field in the asset.
Some things to note about this code:
sleep is specifically for emulation in development for this spike. Using that in production is an obvious footgun.
This job workflow will be iterated upon and changes will be made prior to the final implementation. Some of the error handling and functionality here is contrived. Use this to get a feel for things and then use best practices.
Now that our code can manage the lifecycle emulation and upload assets to our S3 bucket, let's create an API endpoint to get the status of the job.
The GET endpoint for asset state
Finally, we need a way for our download page to set up the download link, or to provide information if the status of the job is not completed.
In app/controllers/api/v1/assets_controller.rb, update the code to be the following:
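The controller body here is a sketch inferred from the contract that the download page (shown further down) expects: it looks the asset up by its ref_id, returns a 404 when it does not exist, and reports an expired status once expires_at has passed.

class Api::V1::AssetsController < ApplicationController
  def show
    asset = ExportAsset.find_by(ref_id: params[:id])
    return render json: { error: 'Not found' }, status: :not_found unless asset

    # Treat a completed asset whose pre-signed URL has lapsed as expired
    if asset.status == 'completed' && asset.expires_at && asset.expires_at < Time.now
      return render json: { status: 'expired' }, status: :ok
    end

    render json: { status: asset.status, url: asset.s3_url }, status: :ok
  end
end

Note that although the generator scaffolded an index action, the route we define later uses show, which is what the frontend calls with the ref_id.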
Now that we have our backend business logic set up, we can move to the frontend to tie it all together.
Setting up the frontend
Our production application at Visibuild does not use the ESBuild setup that I have cloned from my other blog post, so for the spike I ended up using React Router to manage both the home and download pages. This code is contrived to test out the business logic, so please take what I am doing with React Router here with a grain of salt.
First, install React Router with the Node package manager of your choosing:
$ yarn add react-router-dom
Next, we can override the code from the cloned repository in app/javascript/components/application.tsx:
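I won't reproduce the original file exactly, but a sketch along these lines works, assuming a React 18 createRoot mount and a root element in the Rails view (the mount element id is an assumption and may differ in the cloned repository):

import * as React from "react";
import { createRoot } from "react-dom/client";
import { BrowserRouter, Routes, Route } from "react-router-dom";
import { HomePage } from "./HomePage";
import { DownloadPage } from "./DownloadPage";

function App() {
  return (
    <BrowserRouter>
      <Routes>
        <Route path="/" element={<HomePage />} />
        <Route path="/download" element={<DownloadPage />} />
      </Routes>
    </BrowserRouter>
  );
}

// Mount the app once the DOM is ready; "root" is a placeholder element id.
document.addEventListener("DOMContentLoaded", () => {
  const rootElement = document.getElementById("root");

  if (rootElement) {
    createRoot(rootElement).render(<App />);
  }
});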
The above code will help us load the correct page component for the / and /download routes.
Again, I must emphasize that this is contrived work. I am unclear on best practices for React Router when using Hotwire (and we do not use Hotwire at Visibuild).
We now need to add our missing page files for the home page and download page.
Adding in our home page
In app/javascript/components/HomePage.tsx, add the following:
import * as React from "react";
import axios from "axios";
import { Link } from "react-router-dom";

export function HomePage() {
  const [status, setStatus] = React.useState<string | null>(null);
  const [id, setId] = React.useState<string | null>(null);

  const handleSubmit = React.useCallback(
    (e: React.FormEvent<HTMLFormElement>) => {
      e.preventDefault();

      // currentTarget is typed as the form element, so FormData accepts it directly
      const data = new FormData(e.currentTarget);

      axios
        .post("/api/v1/jobs", {
          content: data.get("content"),
        })
        .then(({ data }) => {
          setStatus("success");
          setId(data.id);
        })
        .catch(() => setStatus("error"));
    },
    [setStatus, setId]
  );

  return (
    <div>
      <h2>Welcome to home</h2>
      <h3>Create a job to upload to S3</h3>
      <form onSubmit={handleSubmit}>
        <input type="text" name="content" placeholder="Content" />
        <input type="submit" value="Submit" />
        {status === "success" && (
          <>
            <p>Job created successfully</p>
            <Link to={`/download?ref_id=${id}`}>Go to download page</Link>
          </>
        )}
        {status === "error" && <p>Job creation failed</p>}
      </form>
    </div>
  );
}
This basic page will present an input to start a new job to zip up the text value and upload it to S3.
If successful, it will render a download link. If there is an error, it will just let us know.
Next on the jobs-to-do list is to create our downloads page that we link to after the successful scheduling of a job.
Creating the downloads page
In app/javascript/components/DownloadPage.tsx, add the following:
import * as React from "react";
import axios from "axios";
import { useSearchParams } from "react-router-dom";

export function DownloadPage() {
  const [searchParams] = useSearchParams();
  const [status, setStatus] = React.useState<string | null>(null);
  const [url, setUrl] = React.useState<string | null>(null);

  // The effect callback is intentionally not async: useEffect expects a
  // cleanup function (or nothing) to be returned, not a Promise.
  React.useEffect(() => {
    axios
      .get(`/api/v1/assets/${searchParams.get("ref_id")}`)
      .then(({ data }) => {
        setUrl(data.url);
        setStatus(data.status);
      })
      .catch((err) => {
        if (err.response.status === 404) {
          return setStatus("not_found");
        }

        setStatus("failed");
      });
  }, [searchParams, setUrl, setStatus]);

  return (
    <div>
      <p>File download</p>
      {status === "not_found" && <p>File not found</p>}
      {status === "pending" && <p>Job is pending</p>}
      {status === "processing" && <p>Job is processing. Check again soon.</p>}
      {status === "failed" && <p>Job failed</p>}
      {status === "completed" && url && (
        <a href={url} download>
          Download
        </a>
      )}
      {status === "expired" && (
        <p>The URL for the export has expired. Please re-order.</p>
      )}
    </div>
  );
}
When you head to /download and provide a query param for ref_id, it will make a request to search for that asset, and if the asset is ready with a status of completed, it will provide a download link.
The value for that download link will be our pre-signed S3 URL to download the zip file asset!
The above code could do with a refactor, but I am leaving it in (as I would normally leave a refactor for an actual implementation).
We are almost ready to run the application, but just need to do some clean-up for the Rails router and CSRF token check.
Setting up the Rails router
At this point, we can head to our router and make some changes so that our API routes are set up, the /download path is connected, and the components#index action from this particular Git clone serves as our home page.
Rails.application.routes.draw do
  get 'jobs/create'

  root 'components#index'
  get 'download', to: 'components#index'

  # Define your application routes per the DSL in https://guides.rubyonrails.org/routing.html
  namespace :api do
    namespace :v1 do
      resources :jobs, only: [:create]
      resources :assets, only: [:show]
    end
  end
end
Visiting /download will render the components#index action, which in turn lets React Router do its thing. Again, I am unsure about this particular implementation detail with regard to Hotwire best practices and React Router, but I figure it is out of scope for the spike and we can leave it be.
Finally, let's sort out the CSRF token check in development.
Disabling CSRF in development
We need to update the application config so that CSRF forgery protection is disabled in development. We can do that in config/application.rb:
require_relative 'boot'

require 'rails/all'

# Require the gems listed in Gemfile, including any gems
# you've limited to :test, :development, or :production.
Bundler.require(*Rails.groups)

# Load dotenv only in development or test environment
Dotenv::Railtie.load if %w[development test].include? ENV['RAILS_ENV']

module DemoRailsWithReactFrontend
  class Application < Rails::Application
    # Initialize configuration defaults for originally generated Rails version.
    config.load_defaults 7.0

    # Enable us to send requests without auth token
    config.action_controller.default_protect_from_forgery = false if ENV['RAILS_ENV'] == 'development'
  end
end
This is not something you want disabled in production environments. I haven't set this project up to manage the token correctly in development, and will ignore it for the sake of the spike. There are plenty of resources on setting up your React.js frontend to pass the CSRF token.
Running the application
At this point, we can run bin/dev to boot up the Rails app and head to port 3000.
bin/dev is set up to run Procfile.dev for us, which will start up ESBuild, the Rails server and our Sidekiq server.
Note: Sidekiq requires Redis to be configured.
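If you do not already have Redis running locally, a throwaway Docker container (or a local redis-server) on the default port is enough for the spike:

$ docker run --rm -p 6379:6379 redis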
Once the server is running, we can head to http://localhost:3000 and see our application running.
Running application
Enter a value, select submit and, when successful, we will get our accepted response from the backend.
Content accepted
When accepted, the link that displays will link through to our download page. Select it now and you will end up on a page with a pending response.
Job pending
If you remember, our job was designed to emulate processing time with sleep, so after ~10 seconds, if you reload the page you will see the job in the processing state.
Job process
After the job has begun processing, another sleep will take us into the completed stage.
Job completed
The completed stage has a download link to our new asset. If you click the link, the download will begin for the zip file.
Once downloaded, if you open the zip file, you will get our content.txt file that was generated with the input text we sent.
Downloaded
We can also confirm that the asset was uploaded to S3 with the same AWS CLI call we made earlier:
$ aws s3api list-objects-v2 --bucket demo-blog-rails-assets-bucket
# Notice that we now have four items! The new entry is the export we just generated.
{
  "Contents": [
    {
      "Key": "370d66b7-28c8-464e-8940-e53fd03aaef3.zip",
      "LastModified": "2022-10-12T09:10:23+00:00",
      "ETag": "\"f0f726f898163d0f934046033389a6c6\"",
      "Size": 139,
      "StorageClass": "STANDARD"
    },
    {
      "Key": "60df0050-abb3-4397-ae4b-ffef637ae682.zip",
      "LastModified": "2022-10-12T06:30:27+00:00",
      "ETag": "\"b87c63f5300caa176b930780679590ce\"",
      "Size": 186,
      "StorageClass": "STANDARD"
    },
    {
      "Key": "a51226cd-ad72-4975-a218-59d5250813ab.zip",
      "LastModified": "2022-10-12T06:30:27+00:00",
      "ETag": "\"91267f5c8ba69738489ccafabe4f4c2e\"",
      "Size": 182,
      "StorageClass": "STANDARD"
    },
    {
      "Key": "a63e4be5-93c2-44da-b198-769baae353af.zip",
      "LastModified": "2022-10-12T06:30:27+00:00",
      "ETag": "\"f0f70aec61ef1cce2cebb6cce5b4c5e9\"",
      "Size": 182,
      "StorageClass": "STANDARD"
    }
  ]
}
Notice that now we have four items in the Contents array, whereas we had three before. We can infer that our zip file is the new entry.
If you wait out the expiry time that we set in the database, you can reload and see the expired state on the downloads page.
Expired
If the pre-signed URL has expired but the user is still on a page with a download link, clicking the link will display the standard AWS expired-object response.
AWS expired link
There is also a not-found state in my code example, although you will likely want to redirect to a 404 page when an asset does not exist instead of showing a message like I have here.
Not found
The S3 bucket after expiry time has elapsed
The last important piece of the puzzle is the validation that the objects are automatically cleared when the lifecycle policy time we set for the bucket elapses.
Below is a screenshot of the bucket one day after the initial work was completed. You can see that the bucket is now empty, woohoo!
Lifecycle expired
Cleanup
Now that we are done, we can tear down the AWS S3 bucket that we created to ensure we aren't paying money for unused spike assets.
$ npx cdk destroy
This will remove the bucket (since we set its removal policy to DESTROY at the beginning of this post), as well as any other resources in that CloudFormation stack.
The bucket will need to be empty before it can be destroyed. Either write a script to clean the bucket up, use the UI, or wait out the one day for the assets to be automatically removed by the lifecycle rule.
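If you would rather not wait, the AWS CLI can empty the bucket in one command; this permanently deletes the objects, so double-check the bucket name first:

$ aws s3 rm s3://demo-blog-rails-assets-bucket --recursive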
Wrap up
The focus today was to follow along with the raw process that I go through when spiking out new features and/or iterations.
We successfully set up a project that demonstrates a rudimentary version of serving download links with an associated lifecycle status. We also created an S3 bucket that removes files after a set period of time, which is great for cost optimization and keeping the number of stored assets manageable.
Spikes like this one give us the opportunity to explore technology and see what a first version looks like. From here, I will review the models and lifecycle flow, talk with the product team, and iron out any security and execution concerns. Once the product team here at Visibuild is aligned, we will move into action with a more polished version of this workflow.