Spreadometer Pipeline Technical Documentation¶
The Spreadometer Pipeline is a data processing pipeline that ingests raw natural gas futures trades and orders, processing them into time series of native and synthetic spreads. You should familiarize yourself with the methodology chapter before doing any significant work on the Spreadometer.
The pipeline is developed in Synapse and follow the medallion architecture.
Pipeline specification:
Integrate/Pipelines/spreadometer/spreadometer_update_spreads
Code location:
Develop/Notebooks/silver/spreadometer/Develop/Notebooks/gold/spreadometer/
Data location:
- Data lakes:
ngttsynapsestorage(prod),ngttsynapsestoragedev(dev) - At:
silver/spreadometer,gold/spreadometer
The following diagram illustrates the pipeline architecture. Please refer to the diagram to understand data flow and dependencies between the different components.

Schedule¶
The pipeline executes a full load every Monday at 00:00 London time, fully overwriting tables with the latest data to incorporate historical corrections. From Tuesday to Friday, it runs daily incremental updates at 00:00 London time, ensuring the most recent seven days of data are frequently refreshed and kept up-to-date throughout the week.
Data Sources and other dependencies¶
The Spreadometer uses the standard dataset for Trayport trades and Trayport orders provided by Alpha Squad through Atlas.
The Mapping Table¶
The mapping table /trayport_instrument_mapping_with_units.csv in the lake ngttsynapsestorage contain the mapping for all the instruments relevant to the Spreadometer. The mapping table also include the price and volume units for each of these instrument. The mapping table was handed over to the Whale Squad from the Traders.
Modules¶
Synapse does not support .py files, and .ipynb notebooks cannot be directly imported as modules. Executing another notebook (.ipynb) using %run risks overwriting variables in the current notebook, particularly if common variable names are used.
To enable safe modularization and reusable code, the following convention is adopted:
- Create a notebook (
.ipynb) insilver/spreadometer/modules. - Within this notebook, define a class matching the notebook's filename, containing reusable logic in the form of static methods.
Minimal Example:
In silver/spreadometer/modules/my_module.ipynb, define:
class MyModule:
GLOBAL_VAR = "Some global information"
@staticmethod
def my_method():
print(GLOBAL_VAR)
Usage:
To reuse this module in another notebook, follow these steps:
Cell 1: (Must contain only the %run command)
%run silver/spreadometer/modules/my_module
Cell 2: (Invoke the static method)
MyModule.my_method()
out[1]: 'Some global information'
Lake manager Module¶
Module to read/write from/to <silver/gold>/spreadometer/ in ngttsynapsestorage or ngttsynapsestoragedev.
Contract Module¶
Contain logic to extract the tenor of a contract from the SequenceItemName column in the Trayport datasets.
Conversion Module¶
Contain logic to convert prices to euro using FX curves and logic to convert all volume units to MWh/h. The conversion logic is based on the mapping table.
Ask bid spread Module¶
Contain logic to correctly join quotes to calculate the ask and bid price for a spread.
Consider a spread contract A / contract B = price of contract A - price of contract B;
The best available price for buying this spread is buying A at the best asking price and selling B at the best bid price. Conversely, the best available price for selling this spread is selling A at the best bid price and buying B at the best asking price. Consequently, quotes from A should be matched with quotes from the opposite side of B.
Bronze Layer¶
The Spreadometer does not include a bronze layer as the raw data is already sourced by the Alpha Squad and made available in Atlas.
Silver Layer¶
The notebooks in the silver layer prepares the raw Trayport trades and orders to be processed into times series of spreads and outright prices. Additionally, it prepares a table of Equinor trades with a visible counterparty.
The schema is provided only for the trayport_trades and trayport_orders tables to keep the documentation easier to maintain. Schemas for the other tables are available in their respective notebooks.
Notebook: silver_prep_trayport_trades¶
Prepares the raw Trayport trades to be processed into time series of spreads and outright prices.
Input: Trayport trades from Atlas
Output: trayport_trades, Schema:
| schema | type | description |
|---|---|---|
| trade_id | string | The trade_id linked to the trade |
| aggressor | string | The company that initiated the trade. |
| initiator | string | The company that responded to the trade. |
| datetime | timestamp | The datetime the trade was executed in the market. Not to be confused with the trade creation time. |
| price | double | The traded price. |
| price_unit | string | The unit of the traded price. |
| volume | double | The traded volume. |
| volume_unit | string | The unit of the traded volume. |
| instrument_group | string | The market in which the contract is traded. E.g., TTF, NBP, .. |
| contract_group | string | The tenor/delivery within the year. E.g., DA, Jan, .., Dec, Q1, .., Q2, Sum, Win, CAL. |
| contract_year | integer | The tenor/delivery year. |
| contract_start | date | The start date of the tenor/delivery. |
| contract_mid | date | The mid date of the tenor/delivery. |
| quote_group | string | trade, ask or bid .. |
| update_timestamp | timestamp | The timestamp in which the row was added/updated. |
Notebook: silver_prep_trayport_orders¶
Prepares the raw Trayport orders to be processed into time series of ask-bid quotes for spreads and outright prices. Note that the Trayport orders dataset contain the top 3 quotes per minute for each instrument. The notebook will only select the best available quote for each instrument per minute.
Input: Trayport orders from Atlas
Output table: trayport_trades, Schema:
| schema | type | description |
|---|---|---|
| datetime | timestamp | The minute the quotes was recorded in the market. |
| price | double | The traded price. |
| price_unit | string | The unit of the traded price. |
| volume | double | The traded volume. |
| volume_unit | string | The unit of the traded volume. |
| instrument_group | string | The market in which the quote was recorded. E.g., TTF, NBP, .. |
| contract_group | string | The tenor/delivery within the year. E.g., DA, Jan, .., Dec, Q1, .., Q2, Sum, Win, CAL. |
| contract_year | integer | The tenor/delivery year. |
| contract_start | date | The start date of the tenor/delivery. |
| contract_mid | date | The mid date of the tenor/delivery. |
| quote_group | string | trade, ask or bid .. |
| update_timestamp | timestamp | The timestamp in which the row was added/updated. |
Notebook: silver_outright_prices¶
Creates a 1-second aggregated time series of outright prices by calculating the volume-weighted average price (VWAP) and the total volume for each contract at 1-second intervals.
Input: trayport_trades, trayport_orders
Output: outright_prices
Notebook: silver_join_native_spreads¶
Computes all the native time spreads and the native location spreads.
The native location spreads are computed by joining each non-TTF outright time series with the corresponding TTF outright time series with equal tenor/delivery. The native spread is then calculated by simply computing the difference between the two time series.
The native time spreads are computed by joining each contract of the same market — with another contract of the same market — with an earlier tenor/delivery. Note that only the native time spreads are only computed for the TTF due to liquidity reasons.
Input: outright_prices
Output: native_spreads
Notebook: silver_prep_equinor_trades¶
Creates a table with all the Equinor trades with a visible counterparty and the corresponding spread.
Input: trayport_trades, native_spreads
Output: equinor_trades
Gold Layer¶
The gold layer aggregates the data into hourly and daily time series before combining them into synthetic spreads.
Notebook: gold_aggregated_outright_prices¶
Aggregates outright prices into hourly and daily time series by computing the volume-weighted average price (VWAP) and total traded volume for each contract within each interval. Additionally, it calculates the TTF 16:30 End-of-Day (EOD) assessment price — a simplified benchmark based on the mid-price (average of best bid and ask) observed at exactly 16:30 London time. While the official ICIS assessment includes fallback rules when a mid-price is unavailable, this notebook assumes sufficient liquidity in TTF at that time and relies solely on the observed mid-price, primarily to support cancellation of TTF in daily synthetic time spread calculations.Input:
outright_prices
Output: hourly_outright_prices, daily_outright_prices, ttf_1630_assessment_price
Notebook: gold_aggregated_native_spreads¶
Aggregates the native spreads into hourly and daily time series by computing the volume-weighted average spread and corresponding total volume for each interval. The weighting is based on the volume of the illiquid leg of the spread (i.e., the non-TTF leg on the left side).
Input: native_spreads
Output: hourly_native_spreads, daily_native_spreads
Notebook: gold_aggregated_equinor_trades¶
Aggregates Equinor trades into hourly and daily time series by computing the volume-weighted average spread and corresponding total volume for each interval, grouped by counterparty. The weighting is based on the volume of the illiquid leg of the spread (i.e., the non-TTF leg on the left side). Only trades with a non-null spread price are included in the weighted average calculation.
Input: equinor_trades
Output: hourly_equinor_trades, daily_equinor_trades
Notebook: gold_join_synthetic_spreads¶
Computes the synthetic spreads in accordance with the method outlined in the methodology chapter.
Calculates synthetic location spreads by joining hourly or daily native spreads with the same delivery period but different illiquid legs (i.e., different non-TTF legs on the left side). The synthetic location spread is computed as the difference between the volume-weighted average spreads of the matched native spreads.
Calculates synthetic time-location spreads by joining hourly or daily native location spreads with the same timestamp and illiquid leg (i.e., the same non-TTF leg on the left side), where the right-hand spread has an earlier delivery period. The synthetic time-location spread is computed as the difference between the volume-weighted average spreads of the two native spreads.
Calculates synthetic time spreads by adding the hourly TTF outright price or daily TTF assessment price to the corresponding native location spread, then recomputing the time-location spread. This effectively cancels out the influence of the liquid leg in the native spread (i.e., the TTF leg on the right side) and isolates the effect of the illiquid leg (i.e., the non-TTF leg on the left side).
Input: hourly_native_spreads, daily_native_spreads, hourly_outright_prices, ttf_1630_assessment_price
Output: hourly_synthetic_spreads, daily_synthetic_spreads