Marfle Australia Region: Local Infrastructure for Australian Customers

Juha Männikkö

May 26, 2026 • 5 mins

Background

Marfle has deployed a dedicated regional infrastructure cluster in Australia to serve Australian customers. The primary driver for this work was a contractual requirement from the Port Authority of New South Wales that all vessel data be stored locally within Australian jurisdiction. Beyond compliance, the deployment provides lower latency for Australian-based users and lays the foundation for serving additional customers in the region.

Infrastructure

The Australian cluster runs on Google Cloud Platform in the australia-southeast1 (Sydney) region. A dedicated Google Kubernetes Engine (GKE) cluster hosts application workloads, and a regional PostgreSQL database instance stores vessel operational data. All data at rest, backups, and disk snapshots are stored within the Australian region.

The following services are deployed in the Australian cluster:

Web backend: serves the Marfle UI and API for Australian customers.
Consumer: ingests vessel telemetry from Kafka and writes it to the Australian database.
Whipper: runs background processing, aggregation, and scheduled tasks against the regional shard.
Supporting services: Redis, an MQTT endpoint, and metric collection.

Database Architecture

To support multiple regions without duplicating the entire database, the platform was restructured into a sharded architecture with a central metadata database and regional data shards.

The central database holds user accounts, vessel configurations, customer definitions, motor models, sensor configurations, warning thresholds, and other reference data — roughly 44 tables shared across all regions. This database is the single source of truth for administrative and configuration data.

Regional shard databases hold high-volume operational data: vessel telemetry (vessel_data_maintrips), logbook entries, fuel tank readings, motion profiles, engine economy metrics, battery data, and similar time-series and event tables, roughly 57 tables per shard. Each shard stores only the data belonging to customers assigned to that region.

Customer-to-shard assignment is tracked in the customers table via a datashard column. When a customer is assigned to the Australian shard, all their vessel data is written to and read from the Australian database exclusively.
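The lookup described above can be sketched as follows. This is a minimal illustration, not the production code: the `Customer` shape, the shard names `eu` and `au`, and the connection strings are assumptions made for the example. Only the customers table and its datashard column come from the description.

```typescript
// Hypothetical row shape; only the datashard column is taken from the text.
interface Customer {
  id: string;
  name: string;
  datashard: string; // name of the regional shard that owns this customer's vessel data
}

// Assumed shard registry. Real shard names and connection strings will differ.
const shardConnections: Record<string, string> = {
  eu: "postgres://shard-eu.example.internal/marfle",
  au: "postgres://shard-au.example.internal/marfle",
};

// Resolve the connection string for a customer's shard. An unknown shard is an
// error: silently falling back to another region would break data residency.
function connectionForCustomer(customer: Customer): string {
  const conn = shardConnections[customer.datashard];
  if (conn === undefined) {
    throw new Error(`Unknown datashard "${customer.datashard}" for customer ${customer.id}`);
  }
  return conn;
}
```

Failing on an unknown shard, rather than defaulting to one, is a design choice in this sketch: it keeps a misconfigured customer from being written to the wrong region.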

Data Replication

Central metadata is replicated to each regional shard using PostgreSQL logical replication. A publication on the central database streams changes to subscriptions on each shard. This means that each shard has a local read replica of the reference data it needs (vessel definitions, user accounts, motor configurations) without requiring cross-region queries at request time.
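As a rough sketch of the setup, the helper below builds the two PostgreSQL statements involved: `CREATE PUBLICATION` on the central database and `CREATE SUBSCRIPTION` on a shard. The publication, subscription, and table names in the usage note are illustrative assumptions, and the actual replication setup may be managed differently.

```typescript
// Quote an SQL identifier by doubling embedded double quotes.
function quoteIdent(name: string): string {
  return '"' + name.replace(/"/g, '""') + '"';
}

// Quote an SQL string literal by doubling embedded single quotes.
function quoteLiteral(value: string): string {
  return "'" + value.replace(/'/g, "''") + "'";
}

// Statement to run on the central database: publish changes to the listed tables.
function createPublicationSql(publication: string, tables: string[]): string {
  if (tables.length === 0) throw new Error("At least one table is required");
  const tableList = tables.map(quoteIdent).join(", ");
  return `CREATE PUBLICATION ${quoteIdent(publication)} FOR TABLE ${tableList};`;
}

// Statement to run on a shard: subscribe to the central publication.
function createSubscriptionSql(
  subscription: string,
  centralConnInfo: string,
  publication: string,
): string {
  return (
    `CREATE SUBSCRIPTION ${quoteIdent(subscription)} ` +
    `CONNECTION ${quoteLiteral(centralConnInfo)} ` +
    `PUBLICATION ${quoteIdent(publication)};`
  );
}
```

For example, `createPublicationSql("central_meta", ["customers", "vessels"])` yields `CREATE PUBLICATION "central_meta" FOR TABLE "customers", "vessels";`. The names `central_meta` and `vessels` are placeholders.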

For write operations that modify central data (e.g., updating vessel configuration), the application writes to the central database, then waits for replication to propagate the change to the relevant shard before confirming the operation with the user. This is handled by a waitShardSync mechanism that polls the shard's replication lag against the central WAL position.
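The polling idea can be sketched as below. Only the name waitShardSync comes from the description above; the signature, the injected query functions, and the timeout behaviour are assumptions for illustration. In practice the central position might come from `pg_current_wal_lsn()` and the shard position from a column such as `latest_end_lsn` in `pg_stat_subscription`, but which queries the real mechanism uses is not stated here.

```typescript
// A PostgreSQL LSN looks like "16/B374D848": two hexadecimal 32-bit halves.
interface Lsn {
  hi: number;
  lo: number;
}

function parseLsn(text: string): Lsn {
  const match = /^([0-9A-Fa-f]{1,8})\/([0-9A-Fa-f]{1,8})$/.exec(text);
  if (match === null) throw new Error(`Invalid LSN: ${text}`);
  return { hi: parseInt(match[1], 16), lo: parseInt(match[2], 16) };
}

// True when position a is at or past position b.
function lsnAtLeast(a: Lsn, b: Lsn): boolean {
  return a.hi > b.hi || (a.hi === b.hi && a.lo >= b.lo);
}

// Hypothetical query hook: returns an LSN as text.
type LsnQuery = () => Promise<string>;

// Capture the central WAL position once, then poll the shard until it has
// replayed at least that far, or give up after timeoutMs.
async function waitShardSync(
  centralLsn: LsnQuery,
  shardLsn: LsnQuery,
  opts: { intervalMs: number; timeoutMs: number } = { intervalMs: 50, timeoutMs: 5000 },
): Promise<void> {
  const target = parseLsn(await centralLsn());
  const deadline = Date.now() + opts.timeoutMs;
  for (;;) {
    if (lsnAtLeast(parseLsn(await shardLsn()), target)) return;
    if (Date.now() >= deadline) {
      throw new Error("Timed out waiting for shard replication");
    }
    await new Promise<void>((resolve) => setTimeout(resolve, opts.intervalMs));
  }
}
```

Capturing the central position immediately after the write, rather than on every poll, means the wait ends once the shard has caught up with that specific change, even if other writes keep arriving on the central database.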

Application-Level Data Routing

All application services were modified to be shard-aware:

Request handling: When a web request arrives, the application determines the customer's shard from the authentication context and routes all database queries to the correct regional database for the duration of that request.
Data ingestion: The consumer service looks up the datalogger's customer and their assigned shard, then writes incoming vessel telemetry directly to the correct regional database.
Background processing: Scheduled tasks and aggregation jobs iterate over all configured shards, processing each region's data independently.
GraphQL API and subscriptions: The external API respects shard boundaries, so third-party integrations for Australian customers query the Australian database.

This routing is implemented using an AsyncLocalStorage-based context that propagates the shard selection through the call stack without requiring explicit parameters at every level.
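A minimal sketch of this pattern, using Node.js's `AsyncLocalStorage` from `node:async_hooks`, is shown below. The helper names (`runInShard`, `currentShard`, `describeQuery`, `forEachShard`) and the shard names are hypothetical; the real services will structure this differently.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

interface ShardContext {
  shard: string;
}

const shardStorage = new AsyncLocalStorage<ShardContext>();

// Run fn with the given shard selected for everything it calls, including
// asynchronous continuations.
function runInShard<T>(shard: string, fn: () => T): T {
  return shardStorage.run({ shard }, fn);
}

// Read the shard selected for the current call chain.
function currentShard(): string {
  const ctx = shardStorage.getStore();
  if (ctx === undefined) throw new Error("No shard selected for this call chain");
  return ctx.shard;
}

// Deep in the call stack: no shard parameter is passed in.
async function describeQuery(vesselId: string): Promise<string> {
  await Promise.resolve(); // stands in for real asynchronous I/O
  return `shard=${currentShard()} vessel=${vesselId}`;
}

// Background-job style: process each configured shard in turn.
async function forEachShard<T>(shards: string[], job: () => Promise<T>): Promise<T[]> {
  const results: T[] = [];
  for (const shard of shards) {
    results.push(await runInShard(shard, job));
  }
  return results;
}
```

A request handler would call `runInShard` once, after authentication has identified the customer's shard, and code further down the stack would use `currentShard()` to pick a connection. Two requests handled concurrently keep separate contexts, so their shard selections do not leak into each other.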

Database Optimisation

As part of preparing the database for sharding, several structural changes were made:

Sequence removal from shard tables: Auto-increment id columns were removed from shard-side tables (vessel_traction, motor_economy_hourly, vessel_warnings, trips, event tables, and others) and replaced with UUIDs where necessary. Sequences on independent shards would issue overlapping values and are not kept in sync by logical replication, so each shard must be able to generate identifiers independently.
Replication safety triggers: Protective triggers were added on the shard side to prevent accidental updates or deletes on primary key columns that would break logical replication.
Migration script separation: The migration framework was adapted to run migrations against both the central database and each shard as appropriate, depending on which tables a migration touches.
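The migration split can be illustrated with a small sketch. It rests on one assumption that the text does not state outright: because PostgreSQL logical replication does not replicate schema changes, a migration touching a central table is taken to need running on the central database and on every shard holding a replicated copy, while a migration touching only shard tables runs on the shards alone. The table classification and database labels are hypothetical.

```typescript
// Assumed classification: tables that exist only on the shards.
// The three names are shard-side tables mentioned above; the set is not complete.
const shardOnlyTables = new Set(["trips", "vessel_warnings", "vessel_traction"]);

// Decide which databases a migration should run against, given the tables it
// touches and the configured shard names.
function databasesForMigration(tables: string[], shards: string[]): string[] {
  if (tables.length === 0) throw new Error("Migration must declare the tables it touches");
  const touchesCentral = tables.some((t) => !shardOnlyTables.has(t));
  const shardDbs = shards.map((s) => `shard:${s}`);
  // Central first, so the publisher's schema changes before the subscribers' copies.
  return touchesCentral ? ["central", ...shardDbs] : shardDbs;
}
```

With shards `eu` and `au`, a migration touching `customers` would run against `central`, `shard:eu`, and `shard:au`, while one touching only `trips` would run against the two shards.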

Deployment

The deployment pipeline was extended to support multi-cluster operations. A single deployment run can target the EU cluster, the Australian cluster, or both. Each cluster deployment includes:

Pre-deployment disk snapshots (stored in the respective region)
Database migrations on the central database and each shard
Container image rollout to the regional GKE cluster
Configuration updates via Helm values specific to each region

The CI/CD workflow authenticates against both GKE clusters and applies Kubernetes manifests through a unified Helm chart with per-region value overrides.
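The target selection can be sketched as a small plan builder. The cluster labels `eu` and `au`, the step wording, and the assumption that steps run in the listed order are illustrative only; the real pipeline is defined in the CI/CD workflow and the Helm chart.

```typescript
type Cluster = "eu" | "au";
type DeployTarget = Cluster | "both";

// Steps taken from the list above; their order here is an assumption.
const stepsPerCluster: string[] = [
  "snapshot disks (stored in the cluster's own region)",
  "run database migrations",
  "roll out container images to the regional GKE cluster",
  "apply region-specific Helm values",
];

// Expand a deployment target into the clusters it covers.
function clustersFor(target: DeployTarget): Cluster[] {
  return target === "both" ? ["eu", "au"] : [target];
}

// Produce the ordered list of steps for one deployment run.
function deploymentPlan(target: DeployTarget): string[] {
  const plan: string[] = [];
  for (const cluster of clustersFor(target)) {
    for (const step of stepsPerCluster) {
      plan.push(`${cluster}: ${step}`);
    }
  }
  return plan;
}
```

Calling `deploymentPlan("both")` returns the four steps for `eu` followed by the same four for `au`, which mirrors a single run that targets both clusters.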

End-to-End Testing

The test infrastructure was updated to validate sharded operation. The end-to-end test suite provisions a local central database and at least two shard databases, verifies that shard isolation is enforced (shard databases cannot write to central tables), and tests data routing for consumers, web requests, and background tasks across shards.

Benefits for Australian Customers

Data sovereignty: All vessel operational data for Australian customers is stored in the Sydney region, satisfying local data residency requirements.
Reduced latency: Australian users interact with web and API services running in the same region as their data, reducing round-trip times for page loads and data queries.
Regulatory compliance: The architecture supports the contractual requirements defined in the Cloud Services Agreement with the Port Authority of New South Wales.
Independent scaling: The Australian shard can be scaled independently as customer count and data volume grow in the region, without affecting European operations.
Operational isolation: Database maintenance, backups, and recovery operations in one region do not impact the other.

Current Status

The infrastructure, database sharding, application routing, and deployment automation have been implemented and are operational in the development environment. The architecture is designed to extend to production deployment and to accommodate additional regional shards in the future.
