Mean reversion z-score

A 20-day rolling z-score signal generator — the textbook mean-reversion idiom. When the price drifts more than 1.25 σ away from its rolling mean, mark it as oversold (signal = +1) or overbought (signal = -1). Otherwise flat (0).

ma_20[t]   = mean of close[t-19 … t]
std_20[t]  = sample std of close[t-19 … t]   (ddof = 1)
zscore[t]  = (close[t] − ma_20[t]) / std_20[t]
signal[t]  = +1   if zscore < −1.25
             −1   if zscore > +1.25
              0   otherwise

Query

WITH
    '2024-01-01' AS start_date,
    '2024-12-31' AS end_date,
    20 AS window_size,
    1.25 AS n_std,
    candles AS (
        SELECT toDate(start) AS day, toFloat64(close) AS close
        FROM api.ohlcv(candle_duration_in_minutes = 1440)
        WHERE exchange = 'binance'
          AND market = 'BTC-USDT'
          AND start BETWEEN start_date AND end_date
    ),
    z AS (
        SELECT day, close,
               avg(close) OVER w AS ma_20,
               stddevSamp(close) OVER w AS std_20,
               (close - avg(close) OVER w) / nullIf(stddevSamp(close) OVER w, 0) AS zscore,
               count(*) OVER w AS valid_rows
        FROM candles
        WINDOW w AS (ORDER BY day ROWS BETWEEN 19 PRECEDING AND CURRENT ROW)
    )
SELECT day, close, ma_20, std_20, zscore,
       if(zscore < -n_std, 1, if(zscore > n_std, -1, 0)) AS signal
FROM z
WHERE valid_rows = window_size
ORDER BY day

Two SQL details worth flagging:

  • stddevSamp, not stddevPop. Pandas .rolling(N).std() defaults to ddof = 1 (Bessel's correction). Mixing pop / sample here changes the z-score by a factor of √(N / (N − 1)) ≈ 1.026 for N = 20 — small numerically but enough to flip a signal that's hovering near ±1.25. Use stddevSamp to match pandas exactly.

  • count(*) OVER w = window_size filters out the first 19 rows before the rolling window has filled. Same effect as dropna(subset=['zscore']) in pandas.

Output (2024)

347 rows × 6 columns. The first few signal-firing days:

day
close
ma_20
std_20
zscore
signal

2024-01-22

39568.02

43335.61

1974.41

−1.91

+1

2024-01-23

39897.60

43188.22

2117.76

−1.55

+1

2024-01-24

40084.88

42984.91

2213.47

−1.31

+1

2024-02-07

44349.60

42064.01

1310.55

+1.74

−1

2024-02-08

45288.65

42245.49

1490.48

+2.04

−1

2024-02-09

47132.77

42517.32

1839.83

+2.51

−1

Signal counts for the year: 53 long, 99 short, 195 flat. Short signals dominate because BTC trended upward through 2024 — the rolling mean lagged the price most of the time, pushing the z-score above the upper threshold whenever the rally re-accelerated.

Extending it

  • Different window or threshold — change window_size and n_std. Tighter window (e.g. 10) reacts faster but flips more often; wider window (60) is smoother but slower.

  • Backtest the strategy — feed signal into a cumulative-returns CTE: signal[t-1] · pct_change(close)[t] then a running product via exp(sum(log(1 + ...))) (see Maximum drawdown & Calmar for the cumprod trick).

Last updated