Backtesting simple SMA strategy
This doc will explain how to run a backtest for a simple SMA (simple moving average) crossing strategy between 50 and 200 period SMA's directly inside the Clickhouse instance. For simplicity, we assume that there is no transaction cost, slippage or financing cost thus the returns obtained are unrealistic.
We will present it step by step, then we will provide the full code at the end.
Select the data
We are selecting the 1 minutes candles for BTC-USD from coinbase for the last 12 months.
with now() - interval 12 month as start_date, toStartOfMinute(now()) as end_date, 10000 as initial_cash,
-- Fetching BTC data for the specified period
btc_data as (SELECT timestamp,
argMaxMerge(close) as close
FROM public_data.candle_1m
WHERE market = 'BTC-USD'
AND exchange = 'coinbase'
and timestamp > start_date
and timestamp < end_date
GROUP BY timestamp)btc_data contains:
2025-03-07T08:50Z
88409.84
2024-07-07T09:24Z
57631.88
2024-10-29T23:04Z
72649.13
...
...
Compute the SMA
The 50 and 200 period SMA is computed using a window function.
Clickhouse offers a wide range of statistical functions documented here.
--- previous code snippet ---
sma AS (SELECT timestamp,
close,
avg(close) OVER w50 AS sma50,
avg(close) OVER w200 AS sma200
FROM btc_data
-- using -1 to avoid the current row in the moving average calculation and thus avoid lookahead bias
WINDOW w50 AS (ORDER BY timestamp ROWS BETWEEN 50 PRECEDING AND 1 PRECEDING ),
w200 AS (ORDER BY timestamp ROWS BETWEEN 200 PRECEDING AND 1 PRECEDING )),sma contains:
2024-06-11T15:23Z
66908.00
66918.2
66918.2
2024-06-11T15:24Z
66870.67
66913.1
66913.1
2024-06-11T15:25Z
66835.35
66898.95666666667
66898.95666666667
2024-06-11T15:26Z
66757.10
66883.055
66883.055
2024-06-11T15:27Z
66779.25
66857.864
66857.864
2024-06-11T15:28Z
66795.82
66844.76166666666
66844.76166666666
2024-06-11T15:29Z
66817.05
66837.76999999999
66837.76999999999
2024-06-11T15:30Z
66776.63
66835.18000000001
66835.18000000001
2024-06-11T15:31Z
66780.80
66828.67444444445
66828.67444444445
...
...
...
...
Generate the signals and extract the trades
We detect the crossing of the two SMA's and generate the signals accordingly.
--- previous code snippet ---
signals AS (SELECT timestamp,
close,
sma50,
sma200,
CASE
WHEN sma50 > sma200 AND lagInFrame(sma50) OVER w < lagInFrame(sma200) OVER w THEN 'long' -- golden cross
WHEN sma50 < sma200 AND lagInFrame(sma50) OVER w > lagInFrame(sma200) OVER w THEN 'short' -- death cross
ELSE NULL
END AS signal
FROM sma
WINDOW w AS (ORDER BY timestamp)),
trades AS (SELECT timestamp,
close AS price,
signal,
row_number() OVER (ORDER BY timestamp) AS trade_id
FROM signals
WHERE signal is not null)trades contains:
2024-06-11T17:29Z
66258.98
short
1
2024-06-11T18:03Z
66784.28
long
2
2024-06-12T00:10Z
67354.27
short
3
...
...
...
...
Compute strategy performance
We regroup the trades by their signal and ids to present the data with entry and exit prices for each trade for the PnL calculation.
--- previous code snippet ---
returns AS (SELECT entry.trade_id,
entry.timestamp AS entry_date,
exit.timestamp AS exit_date,
entry.price AS entry_price,
exit.price AS exit_price,
entry.signal as signal,
if(signal = 'long', exit_price / entry_price, entry_price / exit_price) - 1 as trade_return
FROM trades entry
LEFT JOIN trades exit ON entry.trade_id + 1 = exit.trade_id
where exit_price != 0 -- ignoring the last unclosed trade
order by trade_id)returns contains:
1
2024-06-11T17:22Z
2024-06-11T18:11Z
66289.25
66695.99
short
-0.00609841761101379559
2
2024-06-11T18:11Z
2024-06-12T00:10Z
66695.99
67354.27
long
0.00986985874263205329
3
2024-06-12T00:10Z
2024-06-12T02:32Z
67354.27
67455.74
short
-0.00150424559866958691
4
2024-06-12T02:32Z
2024-06-12T05:33Z
67455.74
67362.65
long
-0.00138001599270870055
5
2024-06-12T05:33Z
2024-06-12T07:57Z
67362.65
67455.94
short
-0.00138297679937452507
6
2024-06-12T07:57Z
2024-06-12T16:33Z
67455.94
69424.35
long
0.02918067704638020017
7
2024-06-12T16:33Z
2024-06-12T21:51Z
69424.35
68534.74
short
0.01298042423448312490
8
2024-06-12T21:51Z
2024-06-12T23:30Z
68534.74
68173.60
long
-0.00526944437229936234
...
...
...
...
...
...
...
Sharpe Ratio
We compute the Sharpe ratio of the strategy using the returns from the trades.
--- previous code snippet ---
SELECT avg(trade_return) / stddevPop(trade_return) AS sharpe_ratio from returnsoutputs 0.022227898711626174
The above results can be also grouped by day and the Sharpe ratio can then be annualised to get a better understanding of the strategy performance over time.
--- previous code snippet ---
daily_retuns AS (
SELECT sum( trade_return) AS daily_return
from returns
group by toStartOfDay(exit_date)
)
SELECT sqrt(365) * avg(daily_return) / stddevPop(daily_return) AS sharpe_ratio from daily_retuns;outputs 1.3117539455990057
Full example
with now() - interval 12 month as start_date, toStartOfMinute(now()) as end_date,
-- Fetching BTC data for the specified period
btc_data as (SELECT timestamp,
argMaxMerge(close) as close
FROM public_data.candle_1m
WHERE market = 'BTC-USD'
AND exchange = 'coinbase'
and timestamp > start_date
and timestamp < end_date
GROUP BY timestamp),
-- Calculating 50 and 200 period simple moving averages
sma AS (SELECT timestamp,
close,
avg(close) OVER w50 AS sma50,
avg(close) OVER w200 AS sma200
FROM btc_data
-- using -1 to avoid the current row in the moving average calculation and thus avoid lookahead bias
WINDOW w50 AS (ORDER BY timestamp ROWS BETWEEN 50 PRECEDING AND 1 PRECEDING ),
w200 AS (ORDER BY timestamp ROWS BETWEEN 200 PRECEDING AND 1 PRECEDING )),
signals AS (SELECT timestamp,
close,
sma50,
sma200,
CASE
WHEN sma50 > sma200 AND lagInFrame(sma50) OVER w < lagInFrame(sma200) OVER w
THEN 'long' -- golden cross
WHEN sma50 < sma200 AND lagInFrame(sma50) OVER w > lagInFrame(sma200) OVER w
THEN 'short' -- death cross
END AS signal
FROM sma
WINDOW w AS (ORDER BY timestamp)),
trades AS (SELECT timestamp,
close AS price,
signal,
row_number() OVER (ORDER BY timestamp) AS trade_id
FROM signals
WHERE signal is not null),
returns AS (SELECT entry.trade_id,
entry.timestamp AS entry_date,
exit.timestamp AS exit_date,
entry.price AS entry_price,
exit.price AS exit_price,
entry.signal as signal,
if(signal = 'long', exit_price / entry_price, entry_price / exit_price) - 1 as trade_return
FROM trades entry
LEFT JOIN trades exit ON entry.trade_id + 1 = exit.trade_id
where exit_price != 0 -- ignoring the last unclosed trade
order by trade_id),
daily_retuns AS (SELECT sum(trade_return) AS daily_return
from returns
group by toStartOfDay(exit_date))
SELECT sqrt(365) * avg(daily_return) / stddevPop(daily_return) AS sharpe_ratio
from daily_retuns;The execution speed of this backtest is rapid, even for large datasets.
525,525 ( 1 Year)
1 s 678 ms
2,628,415 ( 5 Years )
7 s 315 ms
The dataset can always be downloaded from the table by running the btc_data sub-query.
Was this helpful?