Building a Dynamic Image Gallery
Posting images on social media has always been a challenge for me. I have lacked the discipline to organize my photos into collections, leaving my photography scattered in a disorganized gallery.
In an effort to solve this, I tried building a tiny Instagram clone a few years ago. The site was a grid of square images, and my goal was to arrange them in an aesthetically pleasing way. Unfortunately, the project stalled shortly after; perhaps I got distracted and moved on. This time, however, it is much clearer to me what this could become.
The Application
The customer-facing application is a web application built with Next.js. It displays a collection of high-quality photographs, sorted algorithmically for an aesthetically pleasing layout. It is based on Vercel's Image Gallery Starter template.
The Backend
I built a custom backend to ensure my photographs were delivered with the desired quality. This required several services and media processing pipelines.
Generating Assets From Images
Every photograph begins its journey when exported from Lightroom. Once exported, the photograph can be dropped in a dedicated S3 bucket and the corresponding Asset is generated by a Lambda function. An Asset is an entity which represents an image, containing several pieces of information about the image as well as variants of the original image.
Variants are resized versions of the image. They let a client request an image that is only “big enough” for a given screen size rather than the full-resolution original. These variants are created with the sharp module which, so far, has not caused any noticeable loss in quality. Other pieces of information about the image include its EXIF data, aspect ratio, dimensions, and dominant colours.
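To make the idea concrete, here is a minimal sketch of what an Asset and its variants might look like, along with one way to choose a “big enough” variant. The field names and the pickVariant helper are assumptions made for illustration, not the actual schema.

```typescript
// Hypothetical shapes: these field names are illustrative, not the real schema.
interface Variant {
  name: string;  // e.g. "small", "medium", "large"
  width: number; // pixel width of the resized image
  url: string;   // where a client fetches this variant
}

interface Asset {
  id: string;
  width: number;
  height: number;
  aspectRatio: number; // width / height
  dominantColour: [number, number, number]; // RGB, 0-255
  variants: Variant[];
}

// Return the narrowest variant that still covers the requested width,
// falling back to the widest variant when none is big enough.
function pickVariant(asset: Asset, targetWidth: number): Variant {
  if (asset.variants.length === 0) {
    throw new Error(`asset ${asset.id} has no variants`);
  }
  const byWidth = [...asset.variants].sort((a, b) => a.width - b.width);
  const match = byWidth.find((v) => v.width >= targetWidth);
  return match ?? byWidth[byWidth.length - 1];
}
```

A client with a 700px-wide slot would get the first variant at least that wide, and a very large screen would simply get the widest variant available.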
Storing Assets
After the Asset is generated, it needs to be persisted so it can be consumed by other services. Every Asset requires two types of data: the metadata and the media for its variants.
To keep things within AWS, the metadata is persisted to a DynamoDB table, while the media is put in S3. DynamoDB can be used as a key-value store, and for this case, each Asset is given an ID which is used as the key. Each Asset's metadata contains URLs which point to the media stored in S3. Variant URLs are written the way a client would request the media, through a CDN. For example, an Asset variant would have a URL similar to: https://cdn.example.com/:id/:variant.
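As a rough sketch, the metadata item for an Asset could be built like this. The cdn.example.com host comes from the example URL above; the item shape and the helper names are assumptions for illustration, and the actual DynamoDB write is left out.

```typescript
// Placeholder host taken from the example URL; a real deployment would differ.
const CDN_BASE = "https://cdn.example.com";

// Build a URL following the :id/:variant pattern.
function variantUrl(id: string, variant: string): string {
  return `${CDN_BASE}/${encodeURIComponent(id)}/${encodeURIComponent(variant)}`;
}

// Hypothetical item shape: the Asset ID is the key, variants map name to URL.
interface AssetItem {
  id: string;
  variants: Record<string, string>;
}

function toAssetItem(id: string, variantNames: string[]): AssetItem {
  const variants: Record<string, string> = {};
  for (const name of variantNames) {
    variants[name] = variantUrl(id, name);
  }
  return { id, variants };
}
```

Storing CDN URLs in the metadata means consumers never need to know where the media lives in S3.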
Retrieving Assets from Storage
With the data prepared and stored in DynamoDB, I needed to provide a way for clients to query it. I had to be careful about the costs associated with AWS services, as leaving my DynamoDB table open to anyone could have disastrous results.
To address this, I used API Gateway to create the Registry. This Registry queries the DynamoDB table and returns results to authenticated consumers. This authentication, handled by Auth0, ensures that only permitted external services can trigger reads and consume resources.
The application server uses the Asset metadata to determine the presentation order. This operation is deterministic, and the results can be configured to alter client-side presentation. This approach has some exciting advantages, such as the ability to control the images shown and their order at various levels of the stack.
Algorithmically Sorting Assets
There is potential to improve the presentation order. Currently, the dominant colour of the Asset is the most important factor. This usually leads to a good experience, but it could be improved.
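One way a colour-driven, deterministic ordering could work is to sort by the hue of each Asset's dominant colour and break ties by ID. This is only a sketch: ordering by hue is an assumption, the dominantColour field name is hypothetical, and the real sort may weigh colour differently.

```typescript
type RGB = [number, number, number]; // each channel 0-255

interface SortableAsset {
  id: string;
  dominantColour: RGB;
}

// Convert RGB to a hue angle in degrees, in [0, 360). Greys map to 0.
function hue([r, g, b]: RGB): number {
  const max = Math.max(r, g, b);
  const min = Math.min(r, g, b);
  const delta = max - min;
  if (delta === 0) return 0;
  let h: number;
  if (max === r) h = ((g - b) / delta) % 6;
  else if (max === g) h = (b - r) / delta + 2;
  else h = (r - g) / delta + 4;
  return (h * 60 + 360) % 360;
}

// Sort a copy by hue; ties fall back to ID so the order is always the same.
function sortByDominantColour<T extends SortableAsset>(assets: T[]): T[] {
  return [...assets].sort((a, b) => {
    const diff = hue(a.dominantColour) - hue(b.dominantColour);
    if (diff !== 0) return diff;
    return a.id < b.id ? -1 : a.id > b.id ? 1 : 0;
  });
}
```

Because the comparison depends only on the Asset data, the same set of Assets always produces the same order, which is what makes the result safe to cache.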
Any property on the Asset could be used to affect sorting. Aspect ratio, dominant colours, and palette are all possibilities. Machine learning could be used to include image content in the sort criteria.
A pipe dream is to have the sorting algorithm become powerful enough to determine which images work best for specific criteria. Then, the application server could query the Registry for Assets that match. This would be a drastic change, meaning I could upload all my photographs and have the algorithm pick the best ones.
Optimizations
Upstash is used to cache responses from the Registry in order to reduce the number of calls handled by the AWS gateway within a given timeframe. This window can be configured based on load. Additionally, the cache is used to save application servers the cost of sorting, as this work only needs to be done once for a given set of Assets.
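The caching idea can be sketched with a small time-boxed cache. This uses an in-memory Map as a stand-in for Upstash, so it only illustrates the pattern; the TtlCache class and its method names are hypothetical, and a real Upstash client would be called asynchronously over the network.

```typescript
interface CacheEntry<T> {
  value: T;
  expiresAt: number; // timestamp in milliseconds
}

// In-memory stand-in for a remote cache, with a configurable time window.
class TtlCache<T> {
  private entries = new Map<string, CacheEntry<T>>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Return the cached value while it is fresh; otherwise compute and store it.
  getOrCompute(key: string, compute: () => T): T {
    const hit = this.entries.get(key);
    if (hit !== undefined && hit.expiresAt > this.now()) {
      return hit.value;
    }
    const value = compute();
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    return value;
  }
}
```

With a 60-second window, for example, repeated requests for the same set of Assets within that minute would reuse the stored, already-sorted result instead of calling the Registry and sorting again.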
References
- Securing AWS HTTP APIs with JWT Authorizers
- Getting Started | Upstash: Documentation
- sharp - High performance Node.js image processing
- Using AWS Lambda with Amazon API Gateway
- Tutorial: Build a CRUD API with Lambda and DynamoDB
- Image Gallery Starter - Vercel
- JSON Web Tokens
- Responsive images - Learn web development | MDN