Nextflow in the Cloud

External Link

Repository

Role:

  • AWS Solutions Architect
    • Designed the entire architecture
    • Mentored a cloud-inexperienced engineering team on AWS resources and their interactions
  • SCRUM Team Product Owner
    • Prioritized the backlog
    • Planned features and was the primary contributor to the backlog
    • Iteration execution
  • Engineer
    • Participated in sprints
      • Nextflow pipeline development
      • AWS deployment

Phase 1 (complete):

  • Queues submissions received from an API
  • On a schedule, starts on-demand compute workers that run Nextflow pipelines with the submitted parameters
  • Runs pipeline processes on AWS Batch (spot instances) under Nextflow orchestration
  • Stores status updates in a NoSQL database (written through an API and a Lambda function)
  • Stores run output metadata in a NoSQL database
  • Stores output data in an S3 bucket
  • Stores temporary process files in an S3 bucket with a lifecycle policy configured
  • Provides a serverlessly hosted front-end website that can query and display the status and metadata tables
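
The Phase 1 flow above (queue, scheduled worker, Nextflow launch, status record) can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the `Submission` shape, the `drain_queue` helper, the bucket name, and the in-memory status table are all hypothetical stand-ins for the real queue, NoSQL database, and S3 resources. The only external facts assumed are the `nextflow run` options `-params-file` and `-work-dir`.

```python
import json
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Submission:
    """A queued pipeline request (hypothetical shape, not the real schema)."""
    run_id: str
    pipeline: str
    params: dict = field(default_factory=dict)


def build_nextflow_command(sub, work_bucket, params_path):
    """Build the argv list a compute worker could hand to subprocess.run."""
    return [
        "nextflow", "run", sub.pipeline,
        "-params-file", params_path,
        # Temporary process files go to S3 so a lifecycle rule can expire them.
        "-work-dir", f"s3://{work_bucket}/work/{sub.run_id}",
    ]


def drain_queue(queue, status_table, work_bucket):
    """One scheduled worker pass: pop each submission, record a status row,
    and return the commands that would be launched (nothing is executed)."""
    commands = []
    while queue:
        sub = queue.popleft()
        # A real worker would write sub.params to this path before launching.
        params_path = f"/tmp/{sub.run_id}.params.json"
        status_table[sub.run_id] = {
            "status": "SUBMITTED",
            "params": json.dumps(sub.params, sort_keys=True),
        }
        commands.append(build_nextflow_command(sub, work_bucket, params_path))
    return commands
```

In this sketch the worker only assembles commands and status rows; a real worker would also write the parameters file, launch the process, and post later status updates through the API.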

Phase 2 (in progress, currently on hold):

  • Build a front-end website that offers a user interface for submissions
  • Develop a JSON Schema for templating submissions
  • Customize the compute worker to handle multiple pipeline types
  • Handle file uploads for pipeline input parameters
  • Create a CloudFormation template for deployment

Phase 3 (in planning):

  • Build Alexa skills for:
    • Querying statuses
    • Starting batch runs
    • Querying metadata outputs