> For the complete documentation index, see [llms.txt](https://docs.koinju.io/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://docs.koinju.io/sql-cookbook/historical-volatility.md).

# Historical volatility

This page shows how to compute the rolling 30-day **realised volatility** of a crypto pair entirely server-side, using two of the most common estimators:

* **Close-to-close** — the textbook approach: standard deviation of daily log returns, annualised by `√365`.
* **Parkinson** — uses the daily range (high/low) instead of just closes. Lower variance estimator because it sees more of the intraday move.

The full pipeline runs in a single SQL query against `api.ohlcv(...)`. We walk through it CTE-by-CTE then provide the full code at the end. The example is pinned to `2024-01-01` → `2024-12-31` to match the parity-tested baseline; for live data swap the two date literals for `now() - interval 13 month` and `toStartOfDay(now())`.

## Select the data

We pull daily candles for `BTC-USDT` on `binance` for the pinned window, with named bindings for the date range, window length, and annualisation factor.

```sql
WITH
    '2024-01-01' AS start_date,
    '2024-12-31' AS end_date,
    30 AS window_days,
    365 AS sessions_per_year,
    candles AS (
        SELECT toDate(start) AS day, open, high, low, close
        FROM api.ohlcv(candle_duration_in_minutes = 1440)
        WHERE exchange = 'binance'
          AND market = 'BTC-USDT'
          AND start >= start_date
          AND start <= end_date
    )
```

Functions used: [`toDate`](https://clickhouse.com/docs/sql-reference/functions/type-conversion-functions#todate), [`ln`](https://clickhouse.com/docs/sql-reference/functions/math-functions#ln), [`nullIf`](https://clickhouse.com/docs/sql-reference/functions/conditional-functions#nullif), [`lagInFrame`](https://clickhouse.com/docs/sql-reference/window-functions#laginframe), [`pow`](https://clickhouse.com/docs/sql-reference/functions/math-functions#pow), [`stddevSamp`](https://clickhouse.com/docs/sql-reference/aggregate-functions/reference/stddevsamp), [`sqrt`](https://clickhouse.com/docs/sql-reference/functions/math-functions#sqrt)

`candles` contains:

| day        | open     | high     | low      | close    |
| ---------- | -------- | -------- | -------- | -------- |
| 2024-01-01 | 42283.58 | 44184.10 | 42180.77 | 44179.55 |
| 2024-01-02 | 44179.55 | 45879.63 | 44148.34 | 44946.91 |
| 2024-01-03 | 44946.91 | 45500.00 | 40750.00 | 42845.23 |
| 2024-01-04 | 42845.23 | 44729.58 | 42613.77 | 44151.10 |
| ...        | ...      | ...      | ...      | ...      |

## Compute log returns and intraday range

Two per-row quantities feed the two estimators:

* `log_return = ln(close / close[t-1])` — close-to-close log return.
* `hl_log_sq = ln(high / low)²` — squared log range, the input to Parkinson.

We use `nullIf(..., 0)` on the lagged close so the very first row (no predecessor) yields `NULL` rather than tripping a division-by-zero on `Decimal(76, 20)`.

```sql
    returns AS (
        SELECT day, high, low, close,
               ln(close / nullIf(lagInFrame(close, 1) OVER (ORDER BY day), 0)) AS log_return,
               pow(ln(high / low), 2) AS hl_log_sq
        FROM candles
    )
```

`returns` contains:

| day        | close    | log\_return | hl\_log\_sq |
| ---------- | -------- | ----------- | ----------- |
| 2024-01-01 | 44179.55 | NULL        | 0.002153    |
| 2024-01-02 | 44946.91 | 0.017220    | 0.001480    |
| 2024-01-03 | 42845.23 | -0.047888   | 0.012156    |
| 2024-01-04 | 44151.10 | 0.030024    | 0.002348    |
| ...        | ...      | ...         | ...         |

## Roll the 30-day window

ClickHouse window functions compute both estimators in one pass:

* Close-to-close: `stddevSamp(log_return)` over the 30-row window, annualised by `√365` and reported in percent.
* Parkinson: `√( Σ ln(H/L)² / (4·N·ln 2) )` over the same window, also annualised.

We carry `count(log_return) OVER w` so we can drop early rows where the window isn't yet full (the first 30 days, before we have 30 valid log returns).

```sql
    vols AS (
        SELECT day,
               stddevSamp(log_return) OVER w * sqrt(sessions_per_year) * 100 AS vol_30d_cc_pct,
               sqrt(sum(hl_log_sq) OVER w / (4 * window_days * ln(2))) * sqrt(sessions_per_year) * 100 AS vol_30d_park_pct,
               count(log_return) OVER w AS valid_returns
        FROM returns
        WINDOW w AS (ORDER BY day ROWS BETWEEN 29 PRECEDING AND CURRENT ROW)
    )
SELECT day, vol_30d_cc_pct, vol_30d_park_pct
FROM vols
WHERE valid_returns = window_days
ORDER BY day
```

Output:

| day        | vol\_30d\_cc\_pct | vol\_30d\_park\_pct |
| ---------- | ----------------- | ------------------- |
| 2024-01-31 | 53.90             | 58.88               |
| 2024-02-01 | 53.70             | 58.73               |
| 2024-02-02 | 51.02             | 54.19               |
| ...        | ...               | ...                 |
| 2024-12-29 | 44.39             | 56.48               |
| 2024-12-30 | 44.37             | 56.94               |
| 2024-12-31 | 44.38             | 57.53               |

## Notes on the estimators

* `stddevSamp` (Bessel-corrected, `ddof=1`) matches the convention used by most published HV charts — including Deribit's reference implementation and `pandas.rolling.std()` with default arguments. Use `stddevPop` if you prefer the population estimator.
* `√365` annualisation is appropriate for crypto (24/7 trading). For traditional markets, swap in `√252`.
* The Parkinson constant `1 / (4·N·ln 2)` ≈ `0.3607 / N` — switch the window length only by changing the `window_days` binding.

## Full example

```sql
WITH
    '2024-01-01' AS start_date,
    '2024-12-31' AS end_date,
    30 AS window_days,
    365 AS sessions_per_year,
    candles AS (
        SELECT toDate(start) AS day, open, high, low, close
        FROM api.ohlcv(candle_duration_in_minutes = 1440)
        WHERE exchange = 'binance'
          AND market = 'BTC-USDT'
          AND start >= start_date
          AND start <= end_date
    ),
    returns AS (
        SELECT day, high, low, close,
               ln(close / nullIf(lagInFrame(close, 1) OVER (ORDER BY day), 0)) AS log_return,
               pow(ln(high / low), 2) AS hl_log_sq
        FROM candles
    ),
    vols AS (
        SELECT day,
               stddevSamp(log_return) OVER w * sqrt(sessions_per_year) * 100 AS vol_30d_cc_pct,
               sqrt(sum(hl_log_sq) OVER w / (4 * window_days * ln(2))) * sqrt(sessions_per_year) * 100 AS vol_30d_park_pct,
               count(log_return) OVER w AS valid_returns
        FROM returns
        WINDOW w AS (ORDER BY day ROWS BETWEEN 29 PRECEDING AND CURRENT ROW)
    )
SELECT day, vol_30d_cc_pct, vol_30d_park_pct
FROM vols
WHERE valid_returns = window_days
ORDER BY day
```

The whole pipeline runs in well under a second of compute time; round-trip latency dominates.

| window length | execution time |
| ------------- | -------------- |
| 1 year        | \~ 5–8 s       |
| 5 years       | \~ 7–10 s      |

The dataset is downloadable as-is by running the `candles` sub-query if you prefer to compute the volatilities client-side (see the companion [Python reference](https://gitlab.com/koinju/connector/exporter/-/blob/master/api/python_vs_sql/historical_volatility/original.py) for a reproducible parity baseline).

## See also

* [Backtesting a simple SMA strategy](/sql-cookbook/backtesting-simple-sma-strategy.md) — uses the same `api.ohlcv(...)` view and `lagInFrame` window pattern.
* Source article that inspired this example: [Historical volatility calculations (Deribit Insights)](https://insights.deribit.com/dev-hub/historical-volatility-calculations-python-code/).
