Backtesting simple SMA strategy

This doc will explain how to run a backtest for a simple SMA (simple moving average) crossing strategy between 50 and 200 period SMA's directly inside the Clickhouse instance. For simplicity, we assume that there is no transaction cost, slippage or financing cost thus the returns obtained are unrealistic.

We will present it step by step, then we will provide the full code at the end.

Select the data

We are selecting the 1 minutes candles for BTC-USD from coinbase for the last 12 months.

with now() - interval 12 month as start_date, toStartOfMinute(now()) as end_date, 10000 as initial_cash,
    -- Fetching BTC data for the specified period
    btc_data as (SELECT timestamp,
                        argMaxMerge(close) as close
                 FROM public_data.candle_1m
                 WHERE market = 'BTC-USD'
                   AND exchange = 'coinbase'
                   and timestamp > start_date
                   and timestamp < end_date
                 GROUP BY timestamp)

btc_data contains:

timestamp
close

2025-03-07T08:50Z

88409.84

2024-07-07T09:24Z

57631.88

2024-10-29T23:04Z

72649.13

...

...

Compute the SMA

The 50 and 200 period SMA is computed using a window function.

Clickhouse offers a wide range of statistical functions documented here.

--- previous code snippet ---
    sma AS (SELECT timestamp,
                   close,
                   avg(close) OVER w50  AS sma50,
                   avg(close) OVER w200 AS sma200
            FROM btc_data
            -- using -1 to avoid the current row in the moving average calculation and thus avoid lookahead bias
            WINDOW w50 AS (ORDER BY timestamp ROWS BETWEEN 50 PRECEDING AND 1 PRECEDING ),
                   w200 AS (ORDER BY timestamp ROWS BETWEEN 200 PRECEDING AND 1 PRECEDING )),

sma contains:

timestamp
close
sma50
sma200

2024-06-11T15:23Z

66908.00

66918.2

66918.2

2024-06-11T15:24Z

66870.67

66913.1

66913.1

2024-06-11T15:25Z

66835.35

66898.95666666667

66898.95666666667

2024-06-11T15:26Z

66757.10

66883.055

66883.055

2024-06-11T15:27Z

66779.25

66857.864

66857.864

2024-06-11T15:28Z

66795.82

66844.76166666666

66844.76166666666

2024-06-11T15:29Z

66817.05

66837.76999999999

66837.76999999999

2024-06-11T15:30Z

66776.63

66835.18000000001

66835.18000000001

2024-06-11T15:31Z

66780.80

66828.67444444445

66828.67444444445

...

...

...

...

Generate the signals and extract the trades

We detect the crossing of the two SMA's and generate the signals accordingly.

    --- previous code snippet ---
    signals AS (SELECT timestamp,
                          close,
                          sma50,
                          sma200,
                          CASE
                              WHEN sma50 > sma200 AND lagInFrame(sma50) OVER w < lagInFrame(sma200) OVER w THEN 'long' -- golden cross
                              WHEN sma50 < sma200 AND lagInFrame(sma50) OVER w > lagInFrame(sma200) OVER w THEN 'short' -- death cross
                              ELSE NULL
                              END AS signal
                   FROM sma
                   WINDOW w AS (ORDER BY timestamp)),
    trades AS (SELECT timestamp,
                      close                                  AS price,
                      signal,
                      row_number() OVER (ORDER BY timestamp) AS trade_id
               FROM signals
               WHERE signal is not null)

trades contains:

timestamp
price
signal
trade_id

2024-06-11T17:29Z

66258.98

short

1

2024-06-11T18:03Z

66784.28

long

2

2024-06-12T00:10Z

67354.27

short

3

...

...

...

...

Compute strategy performance

We regroup the trades by their signal and ids to present the data with entry and exit prices for each trade for the PnL calculation.

    --- previous code snippet ---
    returns AS (SELECT entry.trade_id,
                           entry.timestamp          AS entry_date,
                           exit.timestamp           AS exit_date,
                           entry.price              AS entry_price,
                           exit.price               AS exit_price,
                           entry.signal             as signal,
                           if(signal = 'long', exit_price / entry_price, entry_price / exit_price) - 1  as trade_return
                    FROM trades entry
                             LEFT JOIN trades exit ON entry.trade_id + 1 = exit.trade_id
                    where exit_price != 0 -- ignoring the last unclosed trade
                    order by trade_id)

returns contains:

trade_id
entry_date
exit_date
entry_price
exit_price
signal
trade_return

1

2024-06-11T17:22Z

2024-06-11T18:11Z

66289.25

66695.99

short

-0.00609841761101379559

2

2024-06-11T18:11Z

2024-06-12T00:10Z

66695.99

67354.27

long

0.00986985874263205329

3

2024-06-12T00:10Z

2024-06-12T02:32Z

67354.27

67455.74

short

-0.00150424559866958691

4

2024-06-12T02:32Z

2024-06-12T05:33Z

67455.74

67362.65

long

-0.00138001599270870055

5

2024-06-12T05:33Z

2024-06-12T07:57Z

67362.65

67455.94

short

-0.00138297679937452507

6

2024-06-12T07:57Z

2024-06-12T16:33Z

67455.94

69424.35

long

0.02918067704638020017

7

2024-06-12T16:33Z

2024-06-12T21:51Z

69424.35

68534.74

short

0.01298042423448312490

8

2024-06-12T21:51Z

2024-06-12T23:30Z

68534.74

68173.60

long

-0.00526944437229936234

...

...

...

...

...

...

...

Sharpe Ratio

We compute the Sharpe ratio of the strategy using the returns from the trades.

    --- previous code snippet ---
    SELECT  avg(trade_return) / stddevPop(trade_return)   AS sharpe_ratio from returns

outputs 0.022227898711626174

The above results can be also grouped by day and the Sharpe ratio can then be annualised to get a better understanding of the strategy performance over time.

    --- previous code snippet ---
    daily_retuns AS (
    SELECT  sum( trade_return) AS daily_return
    from returns
    group by toStartOfDay(exit_date)
    )
    SELECT  sqrt(365) *  avg(daily_return) / stddevPop(daily_return)   AS sharpe_ratio from daily_retuns;

outputs 1.3117539455990057

Full example

with now() - interval 12 month as start_date, toStartOfMinute(now()) as end_date,
    -- Fetching BTC data for the specified period
    btc_data as (SELECT timestamp,
                        argMaxMerge(close) as close
                 FROM public_data.candle_1m
                 WHERE market = 'BTC-USD'
                   AND exchange = 'coinbase'
                   and timestamp > start_date
                   and timestamp < end_date
                 GROUP BY timestamp),
    -- Calculating 50 and 200 period simple moving averages
    sma AS (SELECT timestamp,
                   close,
                   avg(close) OVER w50  AS sma50,
                   avg(close) OVER w200 AS sma200
            FROM btc_data
            -- using -1 to avoid the current row in the moving average calculation and thus avoid lookahead bias
            WINDOW w50 AS (ORDER BY timestamp ROWS BETWEEN 50 PRECEDING AND 1 PRECEDING ),
                   w200 AS (ORDER BY timestamp ROWS BETWEEN 200 PRECEDING AND 1 PRECEDING )),
    signals AS (SELECT timestamp,
                       close,
                       sma50,
                       sma200,
                       CASE
                           WHEN sma50 > sma200 AND lagInFrame(sma50) OVER w < lagInFrame(sma200) OVER w
                               THEN 'long' -- golden cross
                           WHEN sma50 < sma200 AND lagInFrame(sma50) OVER w > lagInFrame(sma200) OVER w
                               THEN 'short' -- death cross
                           END AS signal
                FROM sma
                WINDOW w AS (ORDER BY timestamp)),
    trades AS (SELECT timestamp,
                      close                                  AS price,
                      signal,
                      row_number() OVER (ORDER BY timestamp) AS trade_id
               FROM signals
               WHERE signal is not null),
    returns AS (SELECT entry.trade_id,
                       entry.timestamp                                                             AS entry_date,
                       exit.timestamp                                                              AS exit_date,
                       entry.price                                                                 AS entry_price,
                       exit.price                                                                  AS exit_price,
                       entry.signal                                                                as signal,
                       if(signal = 'long', exit_price / entry_price, entry_price / exit_price) - 1 as trade_return
                FROM trades entry
                         LEFT JOIN trades exit ON entry.trade_id + 1 = exit.trade_id
                where exit_price != 0 -- ignoring the last unclosed trade
                order by trade_id),
    daily_retuns AS (SELECT sum(trade_return) AS daily_return
                     from returns
                     group by toStartOfDay(exit_date))
SELECT sqrt(365) * avg(daily_return) / stddevPop(daily_return) AS sharpe_ratio
from daily_retuns;

The execution speed of this backtest is rapid, even for large datasets.

number of candles
Execution time

525,525 ( 1 Year)

1 s 678 ms

2,628,415 ( 5 Years )

7 s 315 ms

The dataset can always be downloaded from the table by running the btc_data sub-query.

Was this helpful?