SEASONALITY

Does "Sell in May" Still Work? Calendar Anomaly Backtest in Python

June 28, 2026

What's the question?

"Sell in May and go away" is one of the oldest market adages. The claim is that stocks perform better during the November-through-April "winter" months than during the May-through-October "summer" months. The hypothesis has a formal name in academic literature — the Halloween indicator — and a long history of empirical support in studies going back decades.

The question is whether this seasonal pattern persists in recent data and across different market sectors. If the anomaly once existed but has been arbitraged away, a strategy built on it would no longer offer an edge. If it varies by sector, it might still be useful in specific contexts. A proper test requires not just computing average returns but also assessing statistical significance, that is, whether any observed difference is distinguishable from random variation.

The approach

1. Pull 10 years of daily return data for SPY (broad market) and 7 sector ETFs: XLK (technology), XLF (financials), XLE (energy), XLV (healthcare), XLY (consumer discretionary), XLP (consumer staples), and XLI (industrials).
2. For each complete seasonal cycle, compute cumulative returns for "winter" (November through April) and the "summer" that follows it (May through October).
3. Calculate the average return for each window, the win rate (share of cycles in which winter beat summer), and the return difference in percentage points.
4. Run a paired t-test for each ETF to determine whether the winter-summer difference is statistically significant at the 5% level.

A paired t-test is appropriate here because each observation pairs a winter and summer return from the same seasonal cycle, controlling for year-to-year market conditions.
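To make the pairing concrete: a paired t-test is a one-sample t-test on the per-cycle differences d = winter - summer, with t = mean(d) / (sd(d) / sqrt(n)) and n - 1 degrees of freedom. The sketch below checks that equivalence with scipy. The return values are made-up illustrative numbers, not the article's data.

```python
import numpy as np
from scipy import stats

# Hypothetical seasonal returns for nine cycles (illustrative only).
winter = np.array([0.08, 0.12, -0.03, 0.15, 0.05, 0.10, -0.01, 0.09, 0.07])
summer = np.array([0.05, 0.02, 0.04, 0.20, -0.06, 0.11, 0.03, 0.01, 0.06])

diff = winter - summer          # one difference per seasonal cycle
n = len(diff)

# Paired t statistic computed by hand from the differences.
t_manual = diff.mean() / (diff.std(ddof=1) / np.sqrt(n))
p_manual = 2 * stats.t.sf(abs(t_manual), df=n - 1)

# The same test via scipy, two ways.
t_rel, p_rel = stats.ttest_rel(winter, summer)
t_one, p_one = stats.ttest_1samp(diff, 0.0)

print(f"manual:      t={t_manual:.4f}  p={p_manual:.4f}")
print(f"ttest_rel:   t={t_rel:.4f}  p={p_rel:.4f}")
print(f"ttest_1samp: t={t_one:.4f}  p={p_one:.4f}")
```

All three lines should agree, which is why only the spread of the differences, not the spread of each season's returns, determines significance.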

Code

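import xfinlink as xfl
import pandas as pd
import numpy as np
from scipy import stats

xfl.set_api_key("YOUR_API_KEY")  # free at https://xfinlink.com/signup

tickers = ["SPY", "XLK", "XLF", "XLE", "XLV", "XLY", "XLP", "XLI"]
df = xfl.prices(tickers, period="10y", fields=["close", "return_daily"])

df["date"] = pd.to_datetime(df["date"])
df = df.sort_values(["ticker", "date"])

# A full six-month window holds roughly 125 trading days. Requiring 110
# keeps only complete windows, so a partially elapsed season is never
# compared against a full one.
MIN_DAYS = 110

results = []
for ticker in tickers:
    t = df[df["ticker"] == ticker].set_index("date").sort_index()
    years = sorted(t.index.year.unique())
    winter_rets, summer_rets = [], []

    for year in years:
        # Cycle labelled `year`: winter runs Nov (year-1) to Apr (year),
        # followed by summer May to Oct of the same year.
        winter = t.loc[f"{year-1}-11-01":f"{year}-04-30", "return_daily"].dropna()
        summer = t.loc[f"{year}-05-01":f"{year}-10-31", "return_daily"].dropna()
        if len(winter) >= MIN_DAYS and len(summer) >= MIN_DAYS:
            # Compound daily returns into one cumulative return per window.
            winter_rets.append((1 + winter).prod() - 1)
            summer_rets.append((1 + summer).prod() - 1)

    w, s = np.array(winter_rets), np.array(summer_rets)
    # Paired test: each winter is matched with the summer of the same cycle.
    t_stat, p_val = stats.ttest_rel(w, s)

    results.append({"ticker": ticker, "winter": w.mean(), "summer": s.mean(),
                    "t_stat": t_stat, "p_value": p_val, "n_years": len(w)})

    print(f"{ticker}: Nov-Apr {np.mean(w)*100:+.1f}%  May-Oct {np.mean(s)*100:+.1f}%  "
          f"diff {(np.mean(w)-np.mean(s))*100:+.1f}pp  "
          f"win {np.mean(w>s)*100:.0f}%  p={p_val:.4f}")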

Full script with formatting and visualisation: sell-in-may-calendar-anomaly-backtest-python.py

Output

[Figure: grouped bar chart comparing average Nov-Apr and May-Oct returns for SPY and 7 sector ETFs over 2016-2025]

Ticker  Avg Nov-Apr (%)  Avg May-Oct (%)  Diff (pp)  Win Rate (%)  t-stat  p-value  n_years
   SPY             7.78             8.16      -0.38          66.7   -0.09   0.9331        9
   XLK            10.32            13.95      -3.62          44.4   -0.59   0.5712        9
   XLF             9.20             6.00       3.20          66.7    0.53   0.6134        9
   XLE            10.07             1.34       8.73          55.6    1.19   0.2670        9
   XLV             5.88             4.75       1.13          55.6    0.50   0.6321        9
   XLY             7.61             7.67      -0.06          55.6   -0.01   0.9907        9
   XLP             5.86             1.50       4.36          66.7    1.15   0.2826        9
   XLI             8.04             6.61       1.43          66.7    0.21   0.8374        9

Statistically significant (p < 0.05): 0 of 8 ETFs
Average winter-summer difference across all ETFs: 1.85 pp
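The two summary lines can be re-derived from the rounded figures in the table. The check below is a small sketch with the table transcribed by hand into a DataFrame; the column names are chosen for illustration and do not come from the full script.

```python
import pandas as pd

# Values copied from the output table above (already rounded to 2 d.p.).
table = pd.DataFrame(
    {
        "ticker": ["SPY", "XLK", "XLF", "XLE", "XLV", "XLY", "XLP", "XLI"],
        "winter": [7.78, 10.32, 9.20, 10.07, 5.88, 7.61, 5.86, 8.04],
        "summer": [8.16, 13.95, 6.00, 1.34, 4.75, 7.67, 1.50, 6.61],
        "diff": [-0.38, -3.62, 3.20, 8.73, 1.13, -0.06, 4.36, 1.43],
        "p_value": [0.9331, 0.5712, 0.6134, 0.2670, 0.6321, 0.9907, 0.2826, 0.8374],
    }
)

# Largest gap between the reported diff and winter - summer; anything
# around 0.01 is rounding in the published figures.
max_gap = (table["winter"] - table["summer"] - table["diff"]).abs().max()

n_sig = int((table["p_value"] < 0.05).sum())   # ETFs significant at 5%
avg_diff = round(table["diff"].mean(), 2)      # average difference in pp

print(f"max rounding gap: {max_gap:.2f} pp")
print(f"significant at 5%: {n_sig} of {len(table)}")
print(f"average winter-summer difference: {avg_diff:.2f} pp")
```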

What this tells us

Not a single ETF shows a statistically significant difference between winter and summer returns. The lowest p-value is 0.2670 (XLE), far above the 0.05 threshold. Across 8 ETFs and 9 seasonal cycles, the data cannot reject the null hypothesis that winter and summer returns are equal.
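A rough way to see how far these results sit from significance is to back out the variability that the reported t-statistics imply. Since t = mean(d) / (sd(d) / sqrt(n)), the standard deviation of the per-cycle differences is approximately mean(d) * sqrt(n) / t. The sketch below does this for the two ETFs with the lowest p-values, using the rounded table values, so the figures are approximate.

```python
import numpy as np
from scipy import stats

n = 9                 # seasonal cycles per ETF
dof = n - 1           # degrees of freedom for the paired test

# (mean difference in pp, t-stat) as reported in the table, rounded.
reported = {"XLE": (8.73, 1.19), "XLP": (4.36, 1.15)}

# Two-sided 5% critical value of Student's t with 8 degrees of freedom.
t_crit = stats.t.ppf(0.975, dof)

summary = {}
for ticker, (mean_diff, t_stat) in reported.items():
    implied_sd = mean_diff * np.sqrt(n) / t_stat      # sd of differences, pp
    needed = t_crit * implied_sd / np.sqrt(n)         # mean diff needed for p < 0.05
    p_val = 2 * stats.t.sf(abs(t_stat), dof)          # p-value implied by rounded t
    summary[ticker] = {"implied_sd": implied_sd, "needed": needed, "p": p_val}
    print(f"{ticker}: implied sd {implied_sd:.1f} pp, "
          f"needs {needed:.1f} pp for p<0.05, p from t = {p_val:.3f}")
```

Under these assumptions, XLE's winter-summer gap would have to be roughly 17 pp, about double the observed 8.73 pp, to clear the 5% threshold at the same variability. With nine observations, only very large seasonal effects are detectable.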

The direction of the effect varies by sector. For SPY itself, summer actually outperformed winter by 0.38 percentage points on average, the opposite of what "sell in May" predicts. XLK (technology) shows the largest reversal: summer returned 13.95% on average versus 10.32% in winter, a gap that plausibly reflects the strong tech rallies of the 2020s, although the test here does not isolate the cause. Over this sample, technology's growth has shown no deference to the calendar.

Some defensive and cyclical sectors tell a different story in direction, if not in significance. XLE (energy) and XLP (consumer staples) show winter premiums of 8.73 and 4.36 percentage points respectively. Possible explanations include demand uncertainty and geopolitical risk weighing on energy during the summer months, and low-beta staples attracting capital during risk-off winter positioning, but this backtest does not test either story. With p-values of 0.27 and 0.28, both differences could easily be noise.

The win rates are instructive. SPY's winter won in 66.7% of cycles (6 of 9) despite a lower average return, which means that in the cycles summer won, it won by large margins. This pattern is consistent with positively skewed summer returns driven by a few exceptional years, the post-COVID recovery in summer 2020 being a likely example.
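A season can win most cycles and still lose on average when the other season's wins are much larger. The toy example below uses made-up returns (in percent, not the article's data) chosen so that winter wins 6 of 9 cycles while summer has the higher mean.

```python
import numpy as np

# Hypothetical per-cycle returns in percent (illustrative only).
winter = np.array([5, 6, 7, 8, 9, 10, 2, 3, 4], dtype=float)
summer = np.array([3, 4, 5, 6, 7, 8, 20, 25, 18], dtype=float)

win_rate = np.mean(winter > summer)    # share of cycles where winter wins
mean_winter = winter.mean()
mean_summer = summer.mean()

print(f"winter win rate: {win_rate:.1%}")
print(f"mean winter: {mean_winter:.2f}%  mean summer: {mean_summer:.2f}%")
```

Winter's six wins here are each by 2 points, while summer's three wins are by 14 to 22 points, which is the kind of asymmetry a win rate alone hides.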

So what?

Over the nine seasonal cycles tested, the "sell in May" anomaly does not hold at a statistically significant level for the broad U.S. market or for any of the seven sector ETFs. Moving to cash from May through October would have meant forgoing an average six-month return of 8.16% on SPY and 13.95% on XLK.

This does not mean seasonality is irrelevant. The sector-level variation suggests that if a seasonal strategy has any merit, it would need to be sector-specific rather than applied to the broad market. A rotation that overweights energy and staples in winter and tilts toward technology in summer matches the directional patterns in the data (consumer discretionary, at -0.06 pp, shows essentially no seasonal difference). The high p-values, however, indicate that the edge, if it exists, is small relative to return variability.

For portfolio construction, the practical takeaway is straightforward: this sample gives no reason to leave the market in May. If seasonal signals are incorporated at all, they should be one input among many in a multi-factor model, not a standalone timing rule. Nine cycles is a small sample, so failing to reject the null is not proof that no seasonal effect exists. Still, zero significant results across 8 ETFs suggest that any historical anomaly has either dissipated or is too weak to be reliably harvested, and this backtest does not even account for transaction costs.

Built with xfinlink — free financial data API for Python. pip install xfinlink
