Case study · Architect · 2022–present

Encount Data Lake

Automated survey processing for Norwegian transport

510K+ survey responses processed
6 survey programs processed
3 incompatible upstream APIs unified
01 — The challenge

Six survey programs. Three incompatible APIs. One unified output.

Opinion AS needed automated, weighted survey datasets for the Norwegian transport authorities it serves. Each authority runs its own survey program through a different vendor — and results had to come together in one consistent, usable format for reporting and analysis.

The challenge was unifying data that came from fundamentally incompatible systems, applying statistically rigorous weighting, and delivering outputs that matched the metadata fidelity of professionally produced SPSS files.

Three incompatible upstream APIs

Decipher, WALR, and QuenchTec each have their own data model, authentication scheme, and pagination pattern.

Complete SPSS metadata fidelity

Variable labels, value labels, and measurement levels preserved through every stage of the pipeline.

Time-windowed statistical weighting

Weights must account for departure times, contract periods, and seasonal patterns simultaneously.

Multi-operator contract mapping

Each authority has different contract structures that shape how data is attributed across operators.

02 — The solution

A pipeline that carries metadata through every transformation.

The processing pipeline ingests from three upstream APIs, transforms the data through 4–6 stages, and applies RIM weighting (raking) along time-windowed, contract-based, and departure-limited dimensions. It runs automatically, producing clean SPSS-compatible datasets for each survey program.
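As a rough illustration of the weighting step, RIM weighting can be sketched as iterative proportional fitting: weights are rescaled one dimension at a time until the weighted shares match every target margin. The function name `rake`, the column names, and the target shares are assumptions made for this example, not the production code, and the sketch assumes every category found in the sample has a target share.

```python
from collections import defaultdict


def rake(rows, targets, max_iter=50, tol=1e-6):
    """Return one weight per row so weighted shares match the target margins.

    rows:    list of dicts, one per respondent (hypothetical column names)
    targets: {column: {category: target share}}, shares summing to 1 per column
    """
    weights = [1.0] * len(rows)
    for _ in range(max_iter):
        max_shift = 0.0
        for column, shares in targets.items():
            # Current weighted total for each category of this dimension.
            totals = defaultdict(float)
            for row, weight in zip(rows, weights):
                totals[row[column]] += weight
            grand_total = sum(weights)
            # Scale each category so its weighted share hits the target.
            factors = {
                category: shares[category] * grand_total / total
                for category, total in totals.items()
            }
            weights = [w * factors[row[column]] for row, w in zip(rows, weights)]
            max_shift = max(max_shift, max(abs(f - 1.0) for f in factors.values()))
        if max_shift < tol:
            break  # every margin already matches, so further passes change nothing
    # Normalise so the mean weight is 1.0.
    scale = len(rows) / sum(weights)
    return [w * scale for w in weights]


# Illustrative sample: a time-window dimension and an operator dimension.
sample = [
    {"time_window": "peak", "operator": "A"},
    {"time_window": "peak", "operator": "A"},
    {"time_window": "peak", "operator": "B"},
    {"time_window": "offpeak", "operator": "A"},
    {"time_window": "offpeak", "operator": "B"},
]
margins = {
    "time_window": {"peak": 0.5, "offpeak": 0.5},
    "operator": {"A": 0.6, "B": 0.4},
}
sample_weights = rake(sample, margins)
```

In a sketch like this, contract periods and departure limits would simply be further entries in `margins`, each with its own target shares.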

A key internal component is the DataFrameWithMeta library — a custom data structure that carries SPSS metadata (variable labels, value labels, measurement levels) alongside the raw data through every stage of the pipeline. This ensures that the output datasets have the same professional quality as hand-produced SPSS files.
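The idea of metadata travelling with the data can be sketched with a small wrapper around a pandas DataFrame. This is a minimal illustration under assumptions, not the DataFrameWithMeta library itself: the class name `FrameWithMeta`, its fields, and its two methods are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field

import pandas as pd


@dataclass
class FrameWithMeta:
    """Hypothetical wrapper: a DataFrame plus SPSS-style metadata per column."""

    data: pd.DataFrame
    variable_labels: dict = field(default_factory=dict)  # column -> label
    value_labels: dict = field(default_factory=dict)     # column -> {code: label}
    measure_levels: dict = field(default_factory=dict)   # column -> nominal/ordinal/scale

    def _map_meta(self, transform):
        # Apply the same key transformation to all three metadata dicts.
        return (
            transform(self.variable_labels),
            transform(self.value_labels),
            transform(self.measure_levels),
        )

    def select(self, columns):
        """Keep only the given columns, dropping metadata for the rest."""
        cols = list(columns)
        keep = lambda meta: {k: v for k, v in meta.items() if k in cols}
        return FrameWithMeta(self.data[cols].copy(), *self._map_meta(keep))

    def rename(self, mapping):
        """Rename columns and move their metadata to the new names."""
        move = lambda meta: {mapping.get(k, k): v for k, v in meta.items()}
        return FrameWithMeta(self.data.rename(columns=mapping), *self._map_meta(move))


frame = FrameWithMeta(
    pd.DataFrame({"q1": [1, 2], "age": [30, 41]}),
    variable_labels={"q1": "Overall satisfaction", "age": "Age"},
    value_labels={"q1": {1: "Dissatisfied", 2: "Satisfied"}},
    measure_levels={"q1": "ordinal", "age": "scale"},
)
result = frame.rename({"q1": "satisfaction"}).select(["satisfaction"])
```

Because every transformation returns a new wrapper with its metadata already adjusted, an export step can read the labels straight from the final object instead of rebuilding them.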

The abstraction layer was the key design decision. By building DataFrameWithMeta early, metadata fidelity became a property of the data itself rather than something to retrofit at export time. Adding new survey programs is configuration, not code.
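One way to picture "configuration, not code" is a registry of vendor adapters plus a declarative entry per survey program. The sketch below is an assumption about how such a setup could look: the config fields, the `ADAPTERS` registry, and the stub fetchers are all hypothetical, and the stubs return placeholder records instead of calling the vendors' APIs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterator, List, Tuple

Record = Dict[str, object]


def _stub_fetcher(vendor: str) -> Callable[[str], Iterator[Record]]:
    """Build a stand-in fetcher; a real adapter would handle auth and pagination."""

    def fetch(survey_id: str) -> Iterator[Record]:
        yield {"respondent_id": f"{survey_id}-1", "source": vendor}

    return fetch


# Hypothetical registry: one adapter per upstream vendor.
ADAPTERS: Dict[str, Callable[[str], Iterator[Record]]] = {
    "decipher": _stub_fetcher("decipher"),
    "walr": _stub_fetcher("walr"),
    "quenchtec": _stub_fetcher("quenchtec"),
}


@dataclass(frozen=True)
class ProgramConfig:
    name: str
    vendor: str
    survey_id: str
    weight_dimensions: Tuple[str, ...]


def load_programs(raw: List[dict]) -> List[ProgramConfig]:
    """Validate raw config entries and turn them into ProgramConfig objects."""
    programs = []
    for entry in raw:
        if entry["vendor"] not in ADAPTERS:
            raise ValueError(f"Unknown vendor: {entry['vendor']}")
        programs.append(
            ProgramConfig(
                name=entry["name"],
                vendor=entry["vendor"],
                survey_id=entry["survey_id"],
                weight_dimensions=tuple(entry["weight_dimensions"]),
            )
        )
    return programs


def ingest(program: ProgramConfig) -> List[Record]:
    """Fetch all records for a program through its vendor's adapter."""
    return list(ADAPTERS[program.vendor](program.survey_id))


# Adding a program means adding an entry here, with no new code paths.
programs = load_programs(
    [
        {
            "name": "Example program",
            "vendor": "walr",
            "survey_id": "s-001",
            "weight_dimensions": ["time_window", "contract"],
        }
    ]
)
records = ingest(programs[0])
```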

03 — Tech stack
Python · FastAPI · PostgreSQL · Pandas
04 — Client

Built for Opinion AS — one of Norway's leading market research agencies — the pipeline processes the on-board and travel-behaviour surveys Opinion conducts for Norwegian transport authorities.

Let's build something ambitious.

Have a complex problem? Let's talk.

martin@encount.co