Options Backtesting Execution Realism

Read this page with Quotes, Trades, Option Quote and Trade Conditions, Options Slippage Modeling, and Quote-Aware Options Backtests.

Quick definition: execution realism is the part of an options backtest that decides whether the selected contract could actually be entered and exited at the modeled prices.

Execution realism is the difference between a signal backtest and a tradable options backtest. Options do not trade continuously across every strike. Last price can be stale, midpoint can be optimistic, and a few cents of spread can dominate a short-dated trade.

Why this matters

The strategy can be directionally right and still fail as an options trade. A call can move in the expected direction while the spread, quote age, DTE, or size makes the simulated entry unrealistic. A fill model that ignores those details is more than optimistic; it changes what the backtest claims to measure.

A useful execution model records both the price and the reason that price was allowed. If the simulator buys near the ask, exits near the bid, uses midpoint with a haircut, or rejects the trade entirely, the trade log should say so.
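One way to keep the price and its justification together is a small trade-log record. The sketch below is illustrative only: `FillRecord`, `record_long_entry`, and the reason strings are hypothetical names, not part of any particular framework.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FillRecord:
    """One trade-log row: the fill price plus why that price was allowed."""
    filled: bool
    price: Optional[float]  # None when the trade was rejected
    price_source: str       # which side of the market was used, e.g. "ask"
    reason: str             # policy rule that allowed or rejected the fill


def record_long_entry(bid: float, ask: float) -> FillRecord:
    """Buy at the ask, or log a rejection with its reason (hypothetical policy)."""
    if ask <= 0 or bid < 0 or ask < bid:
        return FillRecord(False, None, "none", "invalid_quote")
    return FillRecord(True, ask, "ask", "marketable_limit")
```

A rejected trade still produces a row, so the log can later be grouped by `reason` as easily as by PnL.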

Bars, quotes, and trades

Use each data object for the job it answers:

Data	Best use	Do not use it for
Underlying bars	Signal features and underlying stop/target logic.	Option fill prices.
Option bars	Option price path and fallback valuation.	Proof that a specific bid/ask fill existed.
Option quotes	Executable bid/ask context and spread quality.	Long-term signal features without freshness checks.
Option trades	Activity evidence and last-sale context.	Universal fill proxy.

A serious framework often uses bars for path and quotes for fills. For a long option, entry near the ask and exit near the bid is more conservative than assuming midpoint both ways. More advanced models can allow midpoint or price improvement only when quote freshness, spread width, and trade evidence support it.
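To see how much the fill assumption matters, the same pair of quotes can be priced both ways. This is a minimal sketch, assuming a 100-share contract multiplier and ignoring commissions; `long_round_trip_pnl` is a hypothetical helper name.

```python
def long_round_trip_pnl(
    entry_bid: float,
    entry_ask: float,
    exit_bid: float,
    exit_ask: float,
    cross_spread: bool = True,
    multiplier: int = 100,  # assumption: standard equity-option multiplier
) -> float:
    """PnL in dollars per contract for a long option round trip."""
    if cross_spread:
        # Conservative: buy at the ask, sell at the bid.
        entry_price, exit_price = entry_ask, exit_bid
    else:
        # Optimistic: midpoint on both sides.
        entry_price = (entry_bid + entry_ask) / 2
        exit_price = (exit_bid + exit_ask) / 2
    return (exit_price - entry_price) * multiplier


conservative = long_round_trip_pnl(1.00, 1.10, 1.20, 1.30, cross_spread=True)
optimistic = long_round_trip_pnl(1.00, 1.10, 1.20, 1.30, cross_spread=False)
print(round(conservative, 2), round(optimistic, 2))  # 10.0 20.0
```

With a 0.10 spread on both quotes, the midpoint assumption doubles the reported profit on this trade even though the market moved identically in both runs.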

Fill policy

Make the fill model explicit:

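python

def long_option_entry_fill(bid: float, ask: float, mode: str) -> float:
    """Return the modeled entry price for a long option under the named fill mode."""
    # Reject crossed, negative, or non-positive quotes before pricing anything.
    if ask <= 0 or bid < 0 or ask < bid:
        raise ValueError("invalid_quote")
    if mode == "marketable_limit":
        # Cross the spread: pay the full ask.
        return ask
    if mode == "mid_with_haircut":
        # Midpoint plus a quarter of the spread, i.e. three quarters of the way from bid to ask.
        return (bid + ask) / 2 + 0.25 * (ask - bid)
    raise ValueError(f"unsupported_fill_mode: {mode}")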

The important part is not this exact formula. It is that the simulator records what side of the market was used and rejects quotes that do not pass the policy.

Fill model table

Fill model	Entry assumption	Exit assumption	Best use
Midpoint reference	Midpoint only as a diagnostic value.	Midpoint only as a diagnostic value.	Comparing theoretical edge before execution costs.
Midpoint with haircut	Midpoint plus part of spread.	Midpoint minus part of spread.	Conservative research when quotes are fresh and tight.
Marketable long option	Buy at or near ask.	Sell at or near bid.	Stress-testing whether a long-premium idea survives crossing spreads.
Quote reject	No fill when quote is stale, crossed, missing, or too wide.	No exit fill until observable.	Keeping weak liquidity from becoming fake PnL.

Use one model across a comparison. Do not let the winning profile use midpoint while a later profile is judged with marketable assumptions.
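One way to hold the model fixed across a comparison is to express the priced rows of the table as a single haircut parameter and pass the same value to every profile. The sketch below is an assumption about how that could look: `haircut_fill` is a hypothetical helper, where a haircut of 0 is the midpoint reference, 0.25 matches the mid-with-haircut formula shown earlier on this page, and 0.5 is a full spread cross.

```python
def haircut_fill(bid: float, ask: float, side: str, haircut: float) -> float:
    """Fill price as midpoint moved against the trader by haircut * spread."""
    if ask <= 0 or bid < 0 or ask < bid:
        raise ValueError("invalid_quote")
    if not 0.0 <= haircut <= 0.5:
        raise ValueError("haircut_out_of_range")
    if side not in ("buy", "sell"):
        raise ValueError(f"unsupported_side: {side}")
    mid = (bid + ask) / 2
    cost = haircut * (ask - bid)
    # Buys pay above the midpoint; sells receive below it.
    return mid + cost if side == "buy" else mid - cost


# Every profile in the comparison shares this one setting.
COMPARISON_HAIRCUT = 0.5  # full cross: buy at ask, sell at bid
entry = haircut_fill(1.00, 1.10, "buy", COMPARISON_HAIRCUT)
exit_ = haircut_fill(1.20, 1.30, "sell", COMPARISON_HAIRCUT)
```

Changing the assumption then means rerunning every profile with the new value, not re-scoring only the one under review.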

Stops and exits

Stops and targets must be observable. If a premium stop uses option quotes, the framework should check quotes after entry and apply the stop only when a quote pair proves the stop level was reachable. If an underlying stop uses underlying bars, the option exit still needs a corresponding option quote or option bar at the exit time.
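An observable premium stop can be sketched as a scan over quotes that arrive after entry. The example assumes quotes are sorted by timestamp and shaped as `(timestamp, bid, ask)` tuples; that layout and the name `first_premium_stop` are hypothetical. For a long option the stop is treated as reachable only when a valid bid at or below the stop level is observed, and the exit is priced at that bid, not at the stop level itself.

```python
from typing import Iterable, Optional, Tuple

Quote = Tuple[float, float, float]  # (timestamp_seconds, bid, ask)


def first_premium_stop(
    quotes: Iterable[Quote], entry_ts: float, stop_premium: float
) -> Optional[Tuple[float, float]]:
    """Return (timestamp, exit_bid) for the first quote that proves the stop."""
    for ts, bid, ask in quotes:
        if ts <= entry_ts:
            continue  # only quotes after entry can trigger the stop
        if ask <= 0 or bid < 0 or ask < bid:
            continue  # an invalid quote cannot prove anything
        if bid <= stop_premium:
            return ts, bid  # exit at the observed bid, which may gap below the stop
    return None  # stop never observable; another exit rule must close the trade


quotes = [(10.0, 1.00, 1.10), (20.0, 0.80, 0.90), (30.0, 0.60, 0.70)]
hit = first_premium_stop(quotes, entry_ts=10.0, stop_premium=0.70)
```

In this example the bid moves from 0.80 straight to 0.60, so the stop at 0.70 exits at 0.60. Filling at exactly 0.70 would claim a price no quote supported.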

Common exit reasons:

profit target
premium stop
underlying stop
time stop
end-of-day exit
max-hold exit
invalid or missing quote
forced rejection because entry would occur after exit

Every exit should preserve the timestamp and price source.

Spread and slippage

Track cost in dollars alongside percentages. At the standard 100-share contract multiplier, a 0.10 option spread is 10 dollars per contract before commissions. For short-dated options, that may be larger than the expected edge.
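The arithmetic is simple enough to put next to every trade. The sketch below assumes a 100-share multiplier and a caller-supplied estimate of expected edge in dollars; both function names are hypothetical.

```python
def spread_cost_dollars(
    bid: float, ask: float, contracts: int = 1, multiplier: int = 100
) -> float:
    """Dollar cost of buying at the ask and selling at the bid of one quote."""
    if ask <= 0 or bid < 0 or ask < bid:
        raise ValueError("invalid_quote")
    return (ask - bid) * multiplier * contracts


def spread_exceeds_edge(bid: float, ask: float, expected_edge_dollars: float) -> bool:
    """True when crossing the spread on one contract costs more than the expected edge."""
    return spread_cost_dollars(bid, ask) > expected_edge_dollars


cost = spread_cost_dollars(1.00, 1.10)  # roughly 10 dollars per contract
```

A trade with an expected edge of 8 dollars per contract would be flagged here, because the spread alone costs about 10.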

Useful execution metrics include entry spread percentage, exit spread percentage, spread as a share of premium, expected move to spread ratio, entry quote age, exit quote age, rejected trade count by reason, and fill source. These metrics should sit beside PnL, Sharpe, drawdown, and win rate so a reader can see whether returns came from tradable markets or optimistic assumptions.
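Several of these metrics can be computed from a single quote at decision time. The definitions below are assumptions for illustration (spread percentage is measured against the midpoint, share of premium against the ask), and the function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def quote_metrics(
    bid: float, ask: float, quote_ts: float, decision_ts: float, expected_move: float
) -> Dict[str, float]:
    """Per-quote execution metrics; expected_move is in option price units."""
    if ask <= 0 or bid < 0 or ask < bid:
        raise ValueError("invalid_quote")
    spread = ask - bid
    mid = (bid + ask) / 2
    return {
        "spread_pct_of_mid": spread / mid,
        "spread_share_of_premium": spread / ask,
        # A locked market (zero spread) has no finite ratio.
        "expected_move_to_spread": expected_move / spread if spread > 0 else float("inf"),
        "quote_age_seconds": decision_ts - quote_ts,
    }


def reject_counts(reasons: Iterable[str]) -> Counter:
    """Rejected trade count by reason."""
    return Counter(reasons)
```

Storing the dictionary on each trade row lets a report place spread and quote-age columns beside PnL without recomputing them from raw quotes.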

Rejecting trades is success

A realistic simulator should reject trades when the market is not good enough. That can reduce trade count and make performance less exciting, but it improves the scientific value of the result.

Good rejection reasons include:

no quote near entry
quote crossed or invalid
spread too wide
ask below minimum premium
entry after effective exit
stop or target cannot be priced
quantity below one after risk caps

Specific rejects make no-go research useful. A strategy that fails because the signal is wrong calls for a different next experiment than one that fails because the option market is too wide.
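Most of the reasons listed above can be checked in a fixed order so that each rejected trade carries exactly one label. This is a sketch under stated assumptions: the thresholds are placeholders, the function name and reason strings are hypothetical, and the "stop or target cannot be priced" check is left out because it depends on the exit logic.

```python
from typing import Optional


def entry_reject_reason(
    bid: Optional[float],
    ask: Optional[float],
    quote_age_s: Optional[float],
    entry_ts: float,
    exit_ts: float,
    quantity: int,
    *,
    max_quote_age_s: float = 5.0,  # placeholder threshold
    max_spread_pct: float = 0.15,  # placeholder threshold, spread / midpoint
    min_premium: float = 0.10,     # placeholder threshold
) -> Optional[str]:
    """Return the first failing reason, or None when the entry is allowed."""
    if bid is None or ask is None or quote_age_s is None or quote_age_s > max_quote_age_s:
        return "no_quote_near_entry"
    if ask <= 0 or bid < 0 or ask < bid:
        return "quote_crossed_or_invalid"
    if (ask - bid) / ((ask + bid) / 2) > max_spread_pct:
        return "spread_too_wide"
    if ask < min_premium:
        return "ask_below_min_premium"
    if entry_ts >= exit_ts:
        return "entry_after_effective_exit"
    if quantity < 1:
        return "quantity_below_one"
    return None
```

Because the checks run in a fixed order, a trade that fails several rules is always counted under the same reason, which keeps reject tallies comparable between runs.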

Execution rules depend on the market-data surface described in Quotes, Trades, and Aggregates Semantics, Option Quote and Trade Conditions, and Market Hours, Timestamps, and Time Zones. A fill model should name bid, ask, midpoint, NBBO, spread width, quote age, exchange condition, and time-in-force assumptions explicitly.

Fill realism implementation notes

Execution realism starts with side-specific markets. A buy candidate needs the ask, ask size, quote timestamp, quote condition, and spread policy. A sell or exit candidate needs the bid, bid size, no-bid handling, quote age, and any stop or target ordering rule. The midpoint can be useful for marking, but the artifact needs the bid and ask that made the midpoint believable.
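The exit side of a long option can be sketched with the fields named above. The policy here is only one possible choice: a stale quote, a missing bid, or a bid size smaller than the position each block the exit, so the position stays open until a usable bid appears. The function name, reason strings, and age threshold are hypothetical.

```python
from typing import Optional, Tuple


def long_exit_candidate(
    bid: float,
    bid_size: int,
    quote_age_s: float,
    quantity: int,
    *,
    max_quote_age_s: float = 5.0,  # placeholder threshold
) -> Tuple[Optional[float], str]:
    """Return (exit_price, price_source) or (None, reject_reason)."""
    if quote_age_s > max_quote_age_s:
        return None, "stale_quote"
    if bid <= 0:
        return None, "no_bid"  # nothing to sell into; do not assume a midpoint exit
    if bid_size < quantity:
        return None, "bid_size_below_quantity"
    return bid, "bid"
```

A stricter policy could allow partial fills up to `bid_size`. Whatever the choice, the trade log should record which rule applied.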

Pair this page with Option Quote and Trade Conditions, Quotes, Trades, and the Options Slippage Calculator. When a strategy survives only at midpoint fills, label it as research-only until it passes side-specific fills, spread caps, stale-quote rejects, and no-bid exit handling. That label prevents a pretty backtest from becoming an execution claim.
