ViraTrend
Analyses COVID‑19 spread dynamics in Australia using the SEIRS-V model.
ViraTrend is a predictive system that analyses COVID‑19 variant dynamics in Australia. It uses the SEIRS-V model to assess spread, mutation rates, and the impact of emerging variants. The insights support data‑driven public health strategies and timely interventions, helping teams anticipate trends and respond with confidence.
Contributions
Role: Backend Team Lead
I built the core infrastructure that powers the product — from data mining and pipelines to databases, internal data flows, and RESTful APIs. I ensure reliable ingestion, maintain scalable and secure data stores, and design services that deliver clean, timely data to the frontend and modeling layers. This foundation keeps the platform performant, maintainable, and ready to evolve with new requirements.
Project Teams
This project unites specialized teams with clear scopes and strong separation of concerns. By aligning frontend, backend, and data modeling responsibilities, we enable deep expertise and faster iteration. Together, the teams deliver a cohesive, data-driven experience from ingestion to insight.
Teams
Frontend Team: Built the user-facing app, including UI/UX, dashboards, and the user guide.
Backend Team: Handled GISAID data extraction and integrated both data and model into the frontend.
Data Modelling Team: Developed epidemiological models for future projections (Predictive Dashboard) and variant interaction analysis (Comparative Analysis).
System Architecture
Invisible to users, the backend is the glue that connects every component. All parts communicate through a unified backend API, ensuring consistent data flow, minimal integration friction, and the freedom for each component to evolve independently without breaking the system.
Overview

Three services, each with a clear, modular role: Frontend, Backend, and Model.
Frontend: Display and export information; built with React and Material‑UI
Backend: Manage data flow, storage, and inter‑service communication; built with Koa, SQLite, and RPA
Model: Host the SEIRS-V model implementation for forecasts and comparisons; built with Flask
Each service runs in its own Docker container, connected over a shared Docker network.
The Backend mediates all communication, routing Frontend requests to the database and the Model.
User Interface (UI/UX)
This section highlights how the frontend leverages data from the backend to deliver clear, user‑friendly visuals and insights. It also explains how the backend infrastructure seamlessly connects every component so the application consistently delivers value.
Predictive Dashboard
Displays the predicted spread of a selected COVID‑19 variant in a time‑series graph, with custom data filters and model parameters. Powered by the mono-variant SEIRS-V model.

Comparative Analysis
Compares the spread of two selected variants in a time‑series graph, with custom data filters and model parameters. Powered by the bi-variant SEIRS-V model.
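The actual SEIRS-V implementation lives in the Model service; as a rough illustration of the compartment structure behind both dashboards, here is a minimal single-variant sketch using forward Euler integration. The parameter names (beta, sigma, gamma, omega, nu) and values are illustrative assumptions, not the project's calibrated parameters.

```python
# Minimal single-variant SEIRS-V sketch (forward Euler).
# Compartments: S(usceptible), E(xposed), I(nfectious),
# R(ecovered, with waning immunity), V(accinated).
def seirsv_step(state, beta, sigma, gamma, omega, nu, dt=1.0):
    S, E, I, R, V = state
    N = S + E + I + R + V
    new_inf = beta * S * I / N   # S -> E (transmission)
    incubate = sigma * E         # E -> I (end of incubation)
    recover = gamma * I          # I -> R (recovery)
    wane = omega * R             # R -> S (waning immunity: the "S" loop in SEIRS)
    vacc = nu * S                # S -> V (vaccination: the "V" extension)
    return (
        S + dt * (wane - new_inf - vacc),
        E + dt * (new_inf - incubate),
        I + dt * (incubate - recover),
        R + dt * (recover - wane),
        V + dt * vacc,
    )

def simulate(state, days, **params):
    trajectory = [state]
    for _ in range(days):
        state = seirsv_step(state, **params)
        trajectory.append(state)
    return trajectory

# 120-day projection from a small seed outbreak in a population of 1000.
traj = simulate((990.0, 0.0, 10.0, 0.0, 0.0), days=120,
                beta=0.3, sigma=0.2, gamma=0.1, omega=0.01, nu=0.005)
```

The bi-variant model behind Comparative Analysis extends this idea with per-variant E/I compartments and cross-variant interaction terms.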

Backend Infrastructure
This section explains how the backend manages end‑to‑end data flow, robust storage, and inter‑service communication. It outlines the data pipelines and databases that ensure reliability and scale, and the service interfaces that coordinate components securely and efficiently.
Backend API
Related Project(s): SwaggerQL Plus
REST endpoints for database and model access, built on SwaggerQL, powered by Koa (Node.js) with OpenAPI and Swagger UI.

Data Pipeline (RPA)
Related Project(s): GISAID EpiCoV Downloader, GISAID EpiCoV Updater
Software bots that curate data and apply updates, built on the Laiye Automation Platform.


Design Justifications
This section presents each major backend design decision using the STAR framework to clearly explain the context, rationale, implementation, and outcomes behind the choices made.
✨ Dataset Crawling
Related Project(s): GISAID EpiCoV Downloader
Situation: We needed COVID‑19 variant data from GISAID for detailed analysis, but it lacked a real‑time API, limited downloads to 10,000 entries, was slow, and had unpredictable delays.
Task: Acquire, store, and process GISAID data efficiently despite these constraints.
Action:
Localised the data in our own database to avoid repeat GISAID queries.
Built an RPA bot to automate browser‑based downloads from GISAID.
Ran integrity checks to remove duplicate, missing, and unwanted records.
Cleaned the data by dropping irrelevant fields to optimise for analysis.
See "GISAID EpiCoV Downloader" for details.
Result: Delivered a fast, reliable, and clean localised dataset, removing GISAID bottlenecks and enabling timely analysis and modelling.
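The integrity checks and cleaning steps above can be sketched roughly as follows. The field names (accession_id, collection_date, lineage, submitter) are illustrative stand-ins, not the actual GISAID export schema.

```python
# Sketch of the post-download integrity checks: drop irrelevant fields,
# discard incomplete rows, and de-duplicate on the accession ID.
def clean_records(records, keep_fields=("accession_id", "collection_date", "lineage")):
    seen = set()
    cleaned = []
    for rec in records:
        row = {k: rec.get(k) for k in keep_fields}       # drop unwanted fields
        if any(v in (None, "") for v in row.values()):   # drop missing/incomplete rows
            continue
        if row["accession_id"] in seen:                  # drop duplicate records
            continue
        seen.add(row["accession_id"])
        cleaned.append(row)
    return cleaned

raw = [
    {"accession_id": "EPI_ISL_1", "collection_date": "2022-01-03",
     "lineage": "BA.1", "submitter": "labX"},            # valid, has extra field
    {"accession_id": "EPI_ISL_1", "collection_date": "2022-01-03",
     "lineage": "BA.1"},                                 # duplicate accession
    {"accession_id": "EPI_ISL_2", "collection_date": "",
     "lineage": "BA.2"},                                 # missing collection date
]
cleaned = clean_records(raw)
```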
✨ Annotation Analysis
Related Project(s): PANGO Lineage ➜ WHO Label
Situation: GISAID lacked variant labels, which are essential for variant‑specific analyses. Without them, we couldn’t directly study the spread or impact of specific COVID‑19 variants.
Task: Create a reliable way to map Pango lineages to WHO labels to enable variant‑level analysis.
Action:
Built an annotation table linking Pango lineages to WHO labels.
Started with CoVariants’ base definitions (e.g., BA.* → Omicron).
Expanded coverage by:
Unaliasing known lineages to their canonical forms (e.g., BA.* → B.1.1.529.* ⇒ B.1.1.529.* → Omicron).
Adding aliases and derivatives until no further matches remained (e.g., BD.* → B.1.1.529.1.17.2.* ⇒ BD.* → Omicron).
Used Cov‑Lineages to validate unaliasing and alias discovery.
Applied regex matching to assign WHO labels across records.
See "PANGO Lineage ➜ WHO Label" for details.
Result: Labelled ~86% of entries, meeting modelling needs and enabling accurate variant‑specific analysis.
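The unaliasing and regex-matching steps can be sketched as below. The alias and label tables here contain only two illustrative entries; the real tables were built from CoVariants and validated against Cov-Lineages as described above.

```python
import re

# Illustrative alias table: alias prefix -> canonical Pango prefix.
ALIASES = {"BA": "B.1.1.529", "BD": "B.1.1.529.1.17.2"}

# Base annotation table: canonical lineage prefix -> WHO label.
WHO_LABELS = {"B.1.1.529": "Omicron", "B.1.617.2": "Delta"}

def unalias(lineage):
    """Expand the alias at the head of a lineage, e.g. BA.5 -> B.1.1.529.5."""
    head, _, tail = lineage.partition(".")
    canonical = ALIASES.get(head, head)
    return f"{canonical}.{tail}" if tail else canonical

def who_label(lineage):
    """Assign a WHO label by matching the unaliased lineage against known prefixes."""
    full = unalias(lineage)
    for prefix, label in WHO_LABELS.items():
        # Match the canonical prefix followed by a sub-lineage or end of string.
        if re.match(rf"^{re.escape(prefix)}(\.|$)", full):
            return label
    return None  # unlabelled, like the residual ~14% of entries
```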
✨ Real-time Data Ingestion
Related Project(s): GISAID EpiCoV Updater
Situation: The app needed real‑time data to stay current with COVID‑19 variant information, but GISAID’s lack of real‑time access made this challenging.
Task: Implement automated downloads and imports to keep the database up to date.
Action:
Used RPA for real‑time ingestion.
Configured a bot to download the latest GISAID data and load it into the localised database.
Ensured a seamless, consistent pipeline with minimal manual effort.
See "GISAID EpiCoV Updater" for details.
Result: The app ingests data in real time, remaining up‑to‑date and reliable for variant‑specific analysis and modelling.
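The import half of the pipeline can be sketched as below, assuming the bot hands over a tab-separated export. The table and column names are illustrative; the key idea is that re-running the import is idempotent, so repeated bot runs keep the database consistent.

```python
import csv, io, sqlite3

# Sketch of the import step the updater bot performs after each download.
def import_export(conn, tsv_text):
    conn.execute("""CREATE TABLE IF NOT EXISTS sequences (
        accession_id TEXT PRIMARY KEY,
        collection_date TEXT,
        lineage TEXT)""")
    rows = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    conn.executemany(
        # INSERT OR REPLACE keeps repeated imports idempotent.
        "INSERT OR REPLACE INTO sequences "
        "VALUES (:accession_id, :collection_date, :lineage)",
        rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
import_export(conn, "accession_id\tcollection_date\tlineage\n"
                    "EPI_ISL_1\t2022-01-03\tBA.1\n")
import_export(conn, "accession_id\tcollection_date\tlineage\n"
                    "EPI_ISL_1\t2022-01-03\tBA.1.1\n")  # re-run updates in place
```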
Database Schema
Situation: Normalising crawled data to 3NF reduced redundancy, but the many joins and correlated sub‑queries it required slowed query responses.
Task: Design a schema that balanced performance with redundancy for fast data retrieval.
Action:
Switched from full 3NF to a large summary table.
Precomputed joins and stored results in the summary table.
Optimised for direct reads to avoid runtime computation.
Result:
Query performance improved markedly with lower overhead.
Accepted some redundancy to meet the app’s performance needs.
Schema now supports efficient, real‑time, large‑scale analysis.
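The precomputation step can be sketched as below, using simplified, illustrative table and column names rather than the actual schema. The join is paid once at build time, so request-time reads hit a single denormalised table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified 3NF layout (names are illustrative).
    CREATE TABLE sequences (accession_id TEXT PRIMARY KEY,
                            lineage_id INTEGER,
                            collection_date TEXT);
    CREATE TABLE lineages (lineage_id INTEGER PRIMARY KEY,
                           pango TEXT, who_label TEXT);

    INSERT INTO lineages VALUES (1, 'BA.1', 'Omicron');
    INSERT INTO sequences VALUES ('EPI_ISL_1', 1, '2022-01-03');

    -- Precompute the join once; reads never pay for it at request time.
    CREATE TABLE summary AS
        SELECT s.accession_id, s.collection_date, l.pango, l.who_label
        FROM sequences s JOIN lineages l USING (lineage_id);
""")

# Request-time query: a direct read, no joins or sub-queries.
row = conn.execute(
    "SELECT who_label FROM summary WHERE accession_id = 'EPI_ISL_1'").fetchone()
```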
DBMS Selection
Situation: The backend needed a reliable, vertically scalable, and query‑efficient way to store GISAID’s tabular data.
Task: Choose a database structure and engine that deliver robustness, scalability, and speed for a smooth user experience.
Action:
Compared relational vs non‑relational options.
Selected a relational database for robustness, ACID compliance, and fit with tabular data.
Benchmarked SQLite, MySQL, and PostgreSQL.
Chose SQLite for primary storage due to speed and zero network latency; ruled out MySQL/PostgreSQL because added latency hurt performance.
Result: Implemented SQLite, delivering fast, reliable, and efficient queries. The lack of network latency improved user experience and met the app’s performance and scalability needs.
Component Integration
Situation: We needed seamless integration between frontend, backend, and modelling components, while keeping teams modular and collaborative.
Task: Enable the frontend to access both the database and the model, and let backend and modelling teams work independently.
Action:
Built a Backend RESTful API with SwaggerQL to send SQL to the SQLite database with minimal setup.
Exposed a Model RESTful API using Flask to host the modelling team’s standalone Python algorithms.
Wired SwaggerQL to interface with Flask so the frontend could reach both data and model via a single web interface.
Maintained clear separation of responsibilities across frontend, backend, and modelling.
Result: Delivered cohesive integration with smooth data flow across layers. SwaggerQL and Flask simplified development, reduced interdependencies, and let each team focus on their domain within a modular, unified system.
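The Model-side wrapping can be sketched as below: a standalone function exposed over HTTP with Flask. The route, parameter names, and the stand-in `run_model` function are illustrative assumptions; the real endpoint hosts the modelling team's SEIRS-V code.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Stand-in for the modelling team's standalone algorithm; the real
# function runs the SEIRS-V model. Names here are illustrative.
def run_model(variant, days):
    return {"variant": variant, "horizon": days,
            "projection": [100 + 10 * d for d in range(days)]}

@app.route("/predict")
def predict():
    # Query parameters mirror the frontend's filter controls.
    variant = request.args.get("variant", "Omicron")
    days = int(request.args.get("days", 7))
    return jsonify(run_model(variant, days))
```

In the deployed system the frontend never calls this service directly; the Backend API proxies requests to it, preserving the single web interface described above.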
Data Flow Abstraction
Situation: We needed to abstract data retrieval and processing so the frontend received structured, predictable data for easy UI integration.
Task: Design a simple data‑flow abstraction that delivers consistent, easily consumable data between frontend and backend.
Action:
Frontend sent HTTP requests to dedicated backend API endpoints for database/model access.
Backend processed requests and returned outputs in a consistent JSON structure.
Standardised JSON made consumption straightforward for the frontend.
Hid database queries and model logic behind the API, so the frontend dealt only with endpoints.
Result: The frontend received predictable, well‑structured JSON, enabling seamless UI integration. The abstraction reduced frontend complexity and improved communication efficiency and user experience.
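A response envelope in the spirit of this abstraction might look like the sketch below; the exact field names the project used may differ. The point is that every endpoint, whether backed by a database query or a model run, returns the same predictable shape.

```python
import json

# Illustrative standardised envelope returned by every backend endpoint.
def envelope(data=None, error=None):
    return {"success": error is None, "data": data, "error": error}

# A successful data response and a failure share one structure,
# so the frontend needs a single consumption path.
ok = json.dumps(envelope(data=[{"week": "2022-01", "cases": 1042}]))
fail = json.dumps(envelope(error="unknown variant"))
```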
Collaborations
This section outlines how the backend team collaborates across the project to drive success. Through clear interfaces, shared standards, and frequent touchpoints, the backend enables each team to move independently while keeping the whole system aligned, reliable, and fast.
Backend-Model
Wrapped standalone algorithms in a web framework and exposed them over HTTP, simplifying cross-component communication.
Iteratively aligned model inputs/outputs and encoded the formats in REST APIs, enabling frontend access to model data.
Selected which fields to store in the database, ensuring the model has sufficient attributes for analysis.