Core ideas from Basic Statistics and Essential Statistical Tools — mean, spread, correlation, and regression — in a clear, playable way for trading and risk.
1 Basic statistics
To describe a set of numbers (e.g. daily prices or returns), we use a few key measures:
Mean (average): sum of values ÷ number of values. Sensitive to outliers.
Median: the middle value when sorted. More robust to extreme values.
Standard deviation (σ): measures spread around the mean. Low σ = values cluster near the mean; high σ = more variability.
Skewness: measures asymmetry. Positive skew = long right tail (e.g. lottery payoffs); negative skew = long left tail (e.g. many return distributions).
Kurtosis: measures tail heaviness relative to the normal distribution. High kurtosis = fatter tails, so extreme events (crashes, spikes) occur more often than a normal curve suggests — critical for risk.
Percentiles / quantiles: the value below which a given % of data falls. The 95th percentile means 95% of values are below it. VaR is often the 95th or 99th percentile of the loss distribution.
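The measures above can be computed in a few lines. This is a minimal sketch using only the Python standard library; the return values are made up for illustration, the names `returns`, `percentile` and `var_95` are hypothetical, and the population (divide by n) formulas for skewness and kurtosis are one convention among several.

```python
import math
import statistics

# Hypothetical daily returns in percent; illustrative values only.
returns = [0.4, -0.2, 0.1, 0.3, -0.5, 0.2, 0.0, -3.0, 0.6, 0.1]

n = len(returns)
mean = statistics.mean(returns)
median = statistics.median(returns)
sigma = statistics.pstdev(returns)  # population standard deviation

# Standardised third and fourth moments (population versions).
skewness = sum((x - mean) ** 3 for x in returns) / (n * sigma ** 3)
kurtosis = sum((x - mean) ** 4 for x in returns) / (n * sigma ** 4)
excess_kurtosis = kurtosis - 3.0  # a normal distribution has kurtosis 3


def percentile(values, p):
    """Percentile by linear interpolation between sorted values, p in [0, 100]."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    weight = rank - lower
    return ordered[lower] * (1 - weight) + ordered[upper] * weight


# 95th percentile of losses (loss = -return): a simple historical VaR estimate.
losses = [-r for r in returns]
var_95 = percentile(losses, 95)

print(f"mean={mean:.3f} median={median:.3f} sigma={sigma:.3f}")
print(f"skewness={skewness:.3f} excess kurtosis={excess_kurtosis:.3f}")
print(f"95% historical VaR={var_95:.3f}")
```

The single −3.0 day pulls the mean below the median, makes the skewness negative, and pushes the kurtosis above the normal value of 3. That is the pattern the definitions above describe for outliers and fat tails.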
2 Distributions
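A distribution describes how values are spread across their possible range. The normal (Gaussian) distribution is a common modelling assumption in finance: returns and price changes are treated as clustering around the mean with symmetric tails, although real returns often show the skew and fat tails described in section 1.
For a normal distribution, about 68% of values fall within 1 standard deviation of the mean, about 95% within 2σ, and about 99.7% within 3σ. This rule underlies many risk and option models.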
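The 68 / 95 / 99.7 figures can be checked directly from the normal cumulative distribution function. This is a small sketch using `statistics.NormalDist` from the Python standard library; the helper name `coverage` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)


def coverage(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


# Coverage for 1, 2 and 3 standard deviations.
within = {k: coverage(k) for k in (1, 2, 3)}

for k, p in within.items():
    print(f"within {k} sigma: {p:.4f}")
```

The probabilities depend only on k, not on the particular mean or standard deviation, which is why the rule can be stated in units of σ.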
[Figure: normal distribution curve, horizontal axis marked at −3σ, −2σ, −1σ, μ, +1σ, +2σ, +3σ]
3 Correlation
Correlation measures how two variables move together, from −1 (perfect opposite) to +1 (perfect same direction). Zero means no linear relationship.
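As an illustration of the definition, here is a minimal sketch of the Pearson correlation coefficient in plain Python. The `gas` and `power` series are invented numbers, and the function name `pearson_r` is hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical price series; illustrative values only.
gas = [20.0, 21.0, 19.5, 22.0, 23.5, 22.5]
power = [45.0, 47.5, 44.0, 49.0, 52.0, 50.5]

r_same = pearson_r(gas, power)                     # close to +1
r_opposite = pearson_r(gas, [-p for p in power])   # same size, opposite sign

print(f"r(gas, power)  = {r_same:.3f}")
print(f"r(gas, -power) = {r_opposite:.3f}")
```

Flipping the sign of one series flips the sign of r without changing its size, which matches the −1 to +1 scale described above.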
In trading: gas vs power prices, spot vs forward, or two commodities may be correlated. Correlation helps with hedging and diversification.
⚠️ Correlation does not imply causation. Two variables can be strongly correlated without one causing the other. A classic spurious example: ice cream sales and drownings are positively correlated (both rise in summer), but ice cream does not cause drownings — a third factor (warm weather) drives both. In trading, two prices may move together because of a common driver (e.g. oil) rather than one causing the other. Always ask: is there a genuine causal link, or just shared influences?
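The common-driver effect can be reproduced with simulated data. In this toy sketch two series are each built as a shared driver plus their own independent noise, so neither causes the other. All names and parameters (`driver`, `series_a`, `series_b`, the noise levels, the seed) are assumptions chosen for illustration, and the driver is known exactly here, which is rarely true in practice.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


random.seed(0)  # fixed seed so the run is repeatable
n = 2000

# Shared driver (think of oil price changes) plus independent noise per series.
driver = [random.gauss(0.0, 1.0) for _ in range(n)]
series_a = [d + random.gauss(0.0, 0.5) for d in driver]
series_b = [d + random.gauss(0.0, 0.5) for d in driver]

# Raw correlation is high even though neither series influences the other.
r_raw = pearson_r(series_a, series_b)

# Strip out the driver's contribution and correlate what is left.
resid_a = [a - d for a, d in zip(series_a, driver)]
resid_b = [b - d for b, d in zip(series_b, driver)]
r_resid = pearson_r(resid_a, resid_b)

print(f"raw correlation:           {r_raw:.3f}")
print(f"after removing the driver: {r_resid:.3f}")
```

With these settings the theoretical raw correlation is 1 / (1 + 0.25) = 0.8, while the correlation of the leftovers is zero in theory. The simulated values should land near those figures.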
4 Regression
Regression fits a line (or curve) to data. Simple linear regression finds the line that minimizes the sum of squared vertical distances from the points to the line (least squares).
Formula: y = a + b·x. Here b is the slope (sensitivity of y to x) and a is the intercept. Used for forecasting, hedge ratios, and explaining one variable by another.
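For one explanatory variable, the least-squares slope and intercept have closed-form solutions: b is the sum of cross-deviations divided by the sum of squared x-deviations, and a makes the line pass through the point of means. The sketch below applies them to invented data; the function name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx              # slope: sensitivity of y to x
    a = mean_y - b * mean_x    # intercept: line passes through the means
    return a, b


# Illustrative data, roughly y = 2x with some noise.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_line(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
forecast = a + b * 6.0  # simple forecast at x = 6

print(f"intercept a = {a:.3f}, slope b = {b:.3f}")
print(f"forecast at x = 6: {forecast:.3f}")
```

On this data the fit gives b = 1.99 and a = 0.05, and the residuals sum to zero, which is a property of any least-squares line that includes an intercept.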
Key assumptions:
(1) Linearity — the relationship between x and y is linear;
(2) Independence — residuals are not correlated (e.g. no autocorrelation in time series);
(3) Homoscedasticity — constant variance of residuals (no fan-shaped pattern).
Violating these assumptions can lead to biased coefficients, unreliable standard errors, and misleading hedge ratios or forecasts.
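Two of these assumptions can be given a rough numerical check on the residuals of a fit. The sketch below computes the Durbin-Watson statistic (values near 2 suggest no first-order autocorrelation, values toward 0 or 4 suggest positive or negative autocorrelation) and a crude spread ratio comparing residual size in the upper and lower halves of the x range. The spread ratio is an informal stand-in for a proper heteroscedasticity test, and all names and data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def durbin_watson(residuals):
    """Sum of squared successive differences divided by sum of squared residuals."""
    num = sum((later - earlier) ** 2 for earlier, later in zip(residuals, residuals[1:]))
    den = sum(e ** 2 for e in residuals)
    return num / den


def spread_ratio(residuals):
    """Mean squared residual of the second half divided by that of the first half.

    Assumes residuals are ordered by x; values far from 1 hint at a fan shape.
    """
    half = len(residuals) // 2
    first = sum(e ** 2 for e in residuals[:half]) / half
    second = sum(e ** 2 for e in residuals[half:]) / (len(residuals) - half)
    return second / first


# Invented data whose noise grows with x (a fan-shaped pattern).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
noise = [0.1, -0.1, 0.2, -0.2, 0.8, -0.9, 1.2, -1.1]
ys = [1.0 + 2.0 * x + e for x, e in zip(xs, noise)]

a, b = fit_line(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

dw = durbin_watson(residuals)
ratio = spread_ratio(residuals)

print(f"Durbin-Watson: {dw:.2f}")
print(f"spread ratio (upper half / lower half): {ratio:.1f}")
```

Here the spread ratio is far above 1, flagging the fan shape built into the noise, and the alternating signs of the noise push the Durbin-Watson value above 2. With so few points these are only hints, not formal tests.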
Hedge ratios in practice: Regress changes in the spot price on changes in the forward price to estimate the minimum-variance hedge ratio (the slope b tells you how many units of forward to hold per unit of spot exposure). Regressing price levels instead can give a misleading slope when both series trend. Similarly, regress one asset’s returns on another’s to measure beta or exposure. Example: regressing power spot returns on gas spot returns yields the sensitivity of power to gas, useful for spark spread hedging.
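A hedge ratio estimate along these lines might look like the sketch below. It regresses period-to-period spot price changes on forward price changes, a common way to estimate a minimum-variance hedge ratio. The price series, the exposure size and the names `fit_line`, `hedge_ratio` and `forward_units` are all hypothetical, and six observations are far too few for a real estimate.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


# Hypothetical spot and forward prices observed on the same dates.
spot = [50.0, 51.2, 50.5, 52.0, 53.1, 52.4, 54.0]
forward = [49.0, 50.0, 49.5, 50.8, 51.6, 51.0, 52.5]

# Work with price changes rather than levels.
d_spot = [later - earlier for earlier, later in zip(spot, spot[1:])]
d_forward = [later - earlier for earlier, later in zip(forward, forward[1:])]

# y = spot changes, x = forward changes; the slope is the hedge ratio.
intercept, hedge_ratio = fit_line(d_forward, d_spot)

spot_exposure = 10_000.0  # hypothetical exposure, e.g. in MWh
forward_units = hedge_ratio * spot_exposure

print(f"hedge ratio: {hedge_ratio:.3f}")
print(f"forward units to hedge the exposure: {forward_units:.0f}")
```

A slope above 1 means spot moves more than the forward on average, so slightly more than one unit of forward is needed per unit of spot exposure. A slope below 1 means the opposite.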
Why this matters in trading & ETRM
Risk: Standard deviation of returns is volatility; it feeds into VaR and option pricing.
Hedging: Correlation and regression help choose how much to hedge and with which instrument.
Forecasting: Historical mean and variance inform simple forecasts; regression can model relationships (e.g. spark spread vs gas).
Reporting: Basic stats (mean, median, std) are the building blocks of P&L and risk reports.
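To make the risk point concrete, here is a minimal sketch that turns a return series into a volatility figure and a simple parametric VaR. It assumes normally distributed, zero-mean, independent daily returns and 252 trading days per year. Those are modelling assumptions rather than facts about any market, and the return values, position size and variable names are hypothetical.

```python
import math
import statistics
from statistics import NormalDist

# Hypothetical daily returns as decimals; illustrative values only.
daily_returns = [0.004, -0.002, 0.001, 0.003, -0.005, 0.002, 0.000, -0.012, 0.006, 0.001]

# Volatility is the standard deviation of returns (sample version here).
daily_vol = statistics.stdev(daily_returns)

# Square-root-of-time scaling; assumes independent returns and 252 trading days.
annual_vol = daily_vol * math.sqrt(252)

# One-day 95% parametric VaR under a zero-mean normal assumption.
position_value = 1_000_000.0            # hypothetical position size
z_95 = NormalDist().inv_cdf(0.95)       # about 1.645
var_95 = z_95 * daily_vol * position_value

print(f"daily volatility:  {daily_vol:.4%}")
print(f"annual volatility: {annual_vol:.2%}")
print(f"1-day 95% VaR:     {var_95:,.0f}")
```

Because this approach relies on the normal distribution, it tends to understate risk when returns have fat tails. Comparing it with a percentile taken directly from historical losses is a useful cross-check.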