Full Stack Ecoinformatics

Laboratory Information Management System

Private · R Shiny · AWS Lambda · DynamoDB · Docker

A cloud-native laboratory information management system (LIMS) for tracking environmental DNA samples from client intake through DNA extraction, assay processing, and reporting. Built as a modular R Shiny application backed by a custom API deployed to AWS Lambda with DynamoDB.

Role: Lead Developer & System Architect
Context: Developed while serving as Software Engineer at Jonah Ventures
Status: Production Deployment

Laboratory Information Management for Commercial eDNA Operations

The eDNA LIMS is a full-stack laboratory information management system built for Jonah Ventures, a commercial environmental DNA laboratory. The system manages the complete sample lifecycle — from client intake and kit fulfillment through DNA extraction, sequencing and qPCR assays, and results delivery — across two coordinated codebases: an interactive R Shiny application and a bespoke API deployed as an R package on AWS Lambda.

I served as the lead developer and system architect for both components.


The Problem

Commercial eDNA laboratories process hundreds to thousands of samples concurrently, each progressing through a multi-stage workflow: client onboarding, kit and vial orders, sample receipt, batching, DNA extraction, assay setup, sequencing or qPCR runs, and results delivery. Each stage involves distinct personnel, equipment, and data requirements.

Before the LIMS, sample tracking relied on a combination of spreadsheets and manual record-keeping. This approach introduced several operational risks:

  • No centralized view of sample status across workflow stages
  • Manual data entry at each handoff point, increasing error rates
  • Difficulty coordinating work across lab technicians, project managers, and clients
  • No structured mechanism for managing extraction plates, assay runs, or task assignment
  • Limited ability to search, audit, or report on historical sample data

The laboratory needed a purpose-built system that could track samples through every stage while remaining responsive to the fast-paced, hands-on realities of bench work.


System Architecture

The system is split into two repositories, each deployed independently:

Shiny Application (UI)

The user interface is a modular Shiny application built with the golem framework, comprising over 70 modules organized around core laboratory workflows. The application uses an event-driven architecture via the gargoyle package, decoupling module communication from Shiny’s reactive graph to maintain clarity as the system grows.

Key interface areas include:

  • Dashboard — Real-time overview of receiving, order, extraction, and task status
  • Receiving — Sample intake via manual entry, plate maps, or bulk sample sheet upload with validation
  • Orders — Kit and vial request management with ShipStation fulfillment integration
  • Lab Workflows — DNA extraction plate management, assay configuration (NGS and qPCR), task assignment and run tracking
  • Search — Cross-entity lookup by client, sample, batch, order, or project with fuzzy matching
  • Client Management — Account administration, project organization, personnel, and communications

All API calls are executed asynchronously using the promises and mirai packages, keeping the interface responsive during database operations.
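A minimal sketch of that async pattern, assuming a hypothetical `lims_api()` client function standing in for the package's signed Lambda calls (the real function names are not shown in this write-up). `mirai()` runs the call in a background process, and the returned object is promise-compatible, so the reactive loop never blocks on it:

```r
library(promises)
library(mirai)

# Sketch only: lims_api() is a hypothetical stand-in for the signed Lambda
# client. mirai() evaluates the call in a background daemon; everything the
# daemon needs must be passed in explicitly as named arguments.
fetch_samples_async <- function(lims_api, batch_id) {
  mirai(
    lims_api("getSamplesByBatch", list(batchId = batch_id)),
    lims_api = lims_api,
    batch_id = batch_id
  )
}

# In a module server (illustrative): chain the result as a promise, with
# %...>% for success and %...!% for errors, keeping the UI responsive.
# fetch_samples_async(lims_api, input$batch) %...>%
#   (function(samples) output$table <- renderTable(samples)) %...!%
#   (function(err) showNotification(conditionMessage(err), type = "error"))
```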

Lambda API (Backend)

The API is an R package deployed to AWS Lambda as a Docker container image via ECR. It exposes over 175 handler functions through a single Lambda URL, with each request specifying the target function and its parameters. Authentication uses AWS SigV4 request signing tied to IAM roles.
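The single-URL dispatch pattern can be sketched as a registry lookup; the handler names and response shape below are illustrative, not the production function list:

```r
library(jsonlite)

# Sketch of single-endpoint dispatch: each request body names the target
# function and its parameters; the Lambda entrypoint resolves the handler
# from a registry and invokes it. (Handler names here are invented.)
handlers <- list(
  getSample = function(params) list(id = params$id, status = "RECEIVED")
)

handle_request <- function(body) {
  req <- fromJSON(body, simplifyVector = FALSE)
  fn  <- handlers[[req$fun]]
  if (is.null(fn)) {
    return(list(
      statusCode = 404L,
      body = toJSON(list(error = "unknown function"), auto_unbox = TRUE)
    ))
  }
  list(statusCode = 200L, body = toJSON(fn(req$params), auto_unbox = TRUE))
}

# handle_request('{"fun": "getSample", "params": {"id": "S1"}}')
```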

The primary data store is a DynamoDB table organized using single-table design. Eleven record types — including projects, samples, batches, extracts, tasks, runs, orders, and sample sheets — share a single table with composite keys and five global secondary indexes supporting access patterns ranging from client-scoped queries to status-based filtering with temporal sharding.

Additional AWS integrations include S3 for sample sheet storage and email archival, SES for client communications, Cognito for user authentication, and Secrets Manager for third-party API credentials.


Technical Design

Several architectural decisions reflect the operational constraints of a working laboratory:

Event-driven over reactive. The gargoyle event system avoids deeply nested reactive dependencies that become difficult to reason about as module count grows. State changes propagate through explicit event triggers rather than implicit reactive invalidation.
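A minimal sketch of the gargoyle pattern, with invented event and module names: events are registered once in the top-level server, and modules then communicate through `trigger()`/`on()` rather than shared reactive values.

```r
library(shiny)
library(gargoyle)

# Event names below are illustrative, not the production ones.
app_server <- function(input, output, session) {
  gargoyle::init("sample_received", "extraction_updated")
  mod_receiving_server("receiving")
  mod_dashboard_server("dashboard")
}

mod_receiving_server <- function(id) {
  moduleServer(id, function(input, output, session) {
    observeEvent(input$save, {
      # ...persist the sample, then announce the state change explicitly
      gargoyle::trigger("sample_received")
    })
  })
}

mod_dashboard_server <- function(id) {
  moduleServer(id, function(input, output, session) {
    # Runs only when the event fires, not on every reactive invalidation
    gargoyle::on("sample_received", {
      # refresh_counts()  # hypothetical helper
    })
  })
}
```

The two modules never reference each other's reactives; the event name is the only shared contract.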

Async-first API layer. Every database operation runs asynchronously, preventing long-running queries from freezing the interface. This is particularly important during batch operations that touch hundreds of records.

Single-table DynamoDB. The single-table design consolidates all record types into one table with carefully designed key structures, enabling efficient cross-entity queries without joins. Status-based indexes use temporal sharding (e.g., TASK#COMPLETE#2024Q1) to distribute read load across partitions.
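The key shapes can be sketched as follows; attribute names and prefixes are assumptions based on the patterns described, with only the `TASK#COMPLETE#2024Q1` shard format taken from the design above:

```r
# Illustrative key construction for the single-table design. PK/SK give the
# base access path; a GSI attribute overloads a status index sharded by
# quarter so reads spread across partitions. (Attribute names are invented.)
make_task_item <- function(client_id, task_id, status, completed_on) {
  shard <- paste0(format(completed_on, "%Y"), quarters(completed_on))
  list(
    PK     = paste0("CLIENT#", client_id),           # client-scoped queries
    SK     = paste0("TASK#", task_id),               # entity type + id
    GSI1PK = paste0("TASK#", status, "#", shard),    # status + temporal shard
    GSI1SK = format(completed_on, "%Y-%m-%d")        # sortable within shard
  )
}

item <- make_task_item("acme", "T-0042", "COMPLETE", as.Date("2024-02-15"))
# item$GSI1PK is "TASK#COMPLETE#2024Q1"
```

Querying "all completed tasks in Q1 2024" then becomes a single GSI query on one shard key rather than a scan.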

Two-stage Docker builds. Both the Shiny application and the Lambda API use a two-stage Docker build process: a base image with pinned R dependencies (managed by renv), and a thin application layer installed on top. This separates dependency management from deployment, reducing build times and ensuring reproducibility.
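The pattern might look roughly like the following Dockerfile pair; image names, paths, and package names are illustrative, not the production configuration:

```dockerfile
# Stage 1 (published separately as a base image): pinned dependencies,
# rebuilt only when renv.lock changes.
FROM rocker/r-ver:4.3.2 AS base
COPY renv.lock renv.lock
RUN R -e "install.packages('renv'); renv::restore()"

# Stage 2: thin application layer on the cached base, so routine deploys
# skip dependency installation entirely.
FROM base
COPY . /app
RUN R -e "remotes::install_local('/app', dependencies = FALSE)"
CMD ["R", "-e", "myapp::run_app()"]
```

Because the base layer changes only when `renv.lock` does, application deploys rebuild just the thin top layer.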


Outcome

The eDNA LIMS replaced a manual, spreadsheet-driven workflow with a structured system that tracks samples from intake through delivery. Lab technicians, project managers, and administrators now operate from a shared interface with real-time visibility into sample status, extraction progress, and task assignments.

The project demonstrates how R — often associated with analysis scripts — can serve as the foundation for production-grade operational software when paired with appropriate architectural patterns and cloud infrastructure.

Have a Similar Challenge?

If you're developing or modernizing software and data systems for ecological or scientific work, let's connect. We'll begin with a focused conversation about your goals, technical constraints, and how to build infrastructure that supports long-term impact.

Discuss a Similar Project