Overfitting Reconsidered: An Epistemic Framework for Strategy Evaluation

Introduction

In quantitative finance, overfitting is usually defined in statistical terms. A model is said to overfit when it captures noise in historical data rather than persistent structure. This definition assumes that the past contains reliable information about the future and that the main risk lies in extracting too much of it. This essay adopts a stricter definition. Overfitting is treated as an epistemic failure rather than a statistical one. A model is overfit whenever it relies on information that is not known ex-ante or not known with certainty. This shift in definition changes which models are admissible and reverses many common intuitions about realism and sophistication.

Ex-Ante Knowledge and Certainty

The foundation of this framework is a simple axiom. Any quantity used in a model must be known before the future unfolds and must be known with certainty. Ex-ante knowledge refers to information that is available before the realization of future prices. Certainty means that the information does not rely on inference, estimation, belief, or historical extrapolation. If either condition fails, the information is inadmissible.

This axiom applies broadly. It applies not only to prices but also to distributions, parameters, structural features, and probabilities. A tail exponent, a regime transition matrix, or a jump intensity violates the axiom in the same way as a future return does. None of these quantities can be known before the fact, and none can be known with certainty. Encoding them into a model therefore constitutes overfitting, even if no explicit calibration to data is performed.

Overfitting as Epistemic Contamination

Under this definition, overfitting is not about using too much data. It is about using unjustified information. A model can be fully non-empirical and still be overfit if it encodes structural claims about the future that cannot be justified ex-ante. Conversely, a model can be extremely simple and still be admissible if it avoids such claims.

This reframing clarifies a common confusion. Many models are described as realistic because they reproduce historical features such as fat tails, volatility clustering, or regime shifts. These features are observed ex-post, but they are not certifiable ex-ante. Treating them as primitives in a forward-looking model amounts to assuming that the future will resemble the past in specific ways. This assumption is neither provable nor falsifiable before the fact. Under the present framework, it is therefore a form of overfitting.

The Problem of Probability and Belief

Probability deserves special attention. Assigning a probability to an event is often presented as a way to model uncertainty. In reality, it shifts the problem one level back. One must now justify the probability itself. Without ex-ante certainty, probability assignments become statements of belief rather than knowledge.

This leads to a version of Juvenal’s question of who will guard the guards themselves: who will guard the probabilities? If a model states that a crash has a probability of one percent, the model asserts knowledge of something that cannot be known ex-ante or with certainty. The number may be small, but the epistemic claim is large. Under the axiom stated above, such probability assignments are inadmissible. Probability may be used as a mathematical weighting device, but not as a claim about frequencies or beliefs regarding specific outcomes.

Wins and the Illusion of Validation

Another foundational claim follows directly from epistemic uncertainty. Any observed success is indistinguishable from an engineered success. A strategy that performs well on a realized path may do so because of genuine structure, alignment with historical idiosyncrasies, or pure chance. No amount of performance can separate these explanations without additional assumptions about the future.
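The role of pure chance can be made concrete with a small simulation. The following Python sketch is an illustration only, not part of the framework: it assumes a single path of zero-mean Gaussian returns and a population of hypothetical strategies that go long or short by coin flip. The function name and all parameters (n_strategies, n_periods, sigma, seed) are assumptions chosen for the example.

```python
import random

def best_of_random_strategies(n_strategies=1000, n_periods=250, sigma=0.01, seed=0):
    """Return (best_pnl, mean_pnl) across strategies that trade at random.

    Illustrative assumptions: returns are i.i.d. Gaussian with zero mean and
    scale sigma; each strategy goes long (+1) or short (-1) by coin flip.
    """
    rng = random.Random(seed)
    # One shared realized path of returns.
    returns = [rng.gauss(0.0, sigma) for _ in range(n_periods)]
    pnls = []
    for _ in range(n_strategies):
        # Positions are chosen without any information about the returns.
        pnl = sum(rng.choice((-1, 1)) * r for r in returns)
        pnls.append(pnl)
    return max(pnls), sum(pnls) / len(pnls)

best, mean = best_of_random_strategies()
print(f"best P&L: {best:.4f}, mean P&L: {mean:.4f}")
```

None of these strategies has any edge by construction, yet the best of them shows a clearly positive record while the average stays near zero. Seen in isolation, that record looks the same as one produced by genuine structure.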

This implies that wins cannot validate a model or a strategy. They can describe what happened, but they cannot certify robustness or truth. Performance therefore has no epistemic authority. It cannot justify additional structure, parameters, or beliefs. Models that rely on backtested success to legitimize their assumptions are engaged in circular reasoning.

Negative Principles and Extremal Constraints

Once unjustified knowledge is excluded, modeling becomes a negative exercise. One cannot prefer a specific point in time, a specific distributional shape, or a specific structural feature. One cannot encode arbitrary beliefs about events or regimes. The only admissible assumptions are those that express ignorance through symmetry.

From this perspective, several extremal principles emerge naturally. The model must exhibit maximum symmetry. It must not privilege any time, scale, or state beyond what is imposed by definition. It must commit only to scale, not to shape or frequency. It must be invariant to representation, meaning that it should not depend on an arbitrary choice such as a fixed time step. Finally, it must be closed under refinement and aggregation, so that no hidden structure appears when the model is examined more closely.
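Closure under refinement and aggregation can be written out for one concrete case. The statement below is a sketch that assumes independent Gaussian increments; the symbols (drift \mu, scale \sigma, interval \Delta t, number of sub-intervals n) are notation introduced here for illustration.

```latex
X_{\Delta t} = \sum_{k=1}^{n} \xi_k ,
\qquad
\xi_k \overset{\text{i.i.d.}}{\sim}
\mathcal{N}\!\left(\mu\,\tfrac{\Delta t}{n},\; \sigma^{2}\,\tfrac{\Delta t}{n}\right)
\;\Longrightarrow\;
X_{\Delta t} \sim \mathcal{N}\!\left(\mu\,\Delta t,\; \sigma^{2}\,\Delta t\right)
\quad \text{for every } n \ge 1 .
```

Splitting the interval into finer steps, or summing fine steps back into a coarse one, leaves the law of the increment unchanged, because means and variances of independent Gaussians add. In this case no choice of time step reveals additional structure.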

These extremals are not aesthetic choices. They are the direct consequences of refusing to encode uncertifiable information. Violating any one of them introduces a belief that cannot be defended ex-ante.

The Narrow Admissible Set

Under these constraints, the space of admissible models collapses. Most conventional models fail immediately. Regime models assume the existence and persistence of regimes. Jump models assume the frequency and magnitude of discontinuities. Heavy-tailed models assume a specific tail shape or exponent. All of these assumptions go beyond what can be known ex-ante with certainty.

What remains is a narrow set of null models. In discrete time, this includes independent Gaussian increments of the log-price with a specified scale and drift. In continuous time, this leads to geometric Brownian motion for the price, as the resolution-free closure of the same idea. These models do not claim that markets are Gaussian in reality. They claim only that uncertainty has a typical scale and that no further structure is known.
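A minimal sketch of such a null model in Python follows. It assumes the Gaussian increments apply to the log-price, which gives geometric Brownian motion sampled on an even time grid. The function name, the parameter names (s0, mu, sigma, horizon, n_steps), and the numeric values are illustrative assumptions, not prescriptions.

```python
import math
import random

def simulate_gbm_path(s0, mu, sigma, horizon, n_steps, rng):
    """Simulate one geometric Brownian motion path on an even time grid.

    Log-price increments are independent Gaussians with mean
    (mu - sigma**2 / 2) * dt and standard deviation sigma * sqrt(dt).
    """
    dt = horizon / n_steps
    drift = (mu - 0.5 * sigma ** 2) * dt
    vol = sigma * math.sqrt(dt)
    path = [s0]
    for _ in range(n_steps):
        # Each step multiplies the price by the exponential of a Gaussian draw.
        path.append(path[-1] * math.exp(drift + vol * rng.gauss(0.0, 1.0)))
    return path

rng = random.Random(42)  # fixed seed so the example is reproducible
path = simulate_gbm_path(s0=100.0, mu=0.05, sigma=0.2, horizon=1.0, n_steps=252, rng=rng)
print(len(path), path[0])
```

The only inputs are a starting level, a drift, and a scale. The number of steps changes the resolution of the sampled path but not the law of the price at the horizon.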

This narrowness is often criticized as unrealistic. Under the present framework, the opposite is true. These models are realistic in the epistemic sense because they avoid pretending to know what cannot be known. Their simplicity is not a lack of sophistication. It is a refusal to overfit.

Why “More Realistic” Models Fail

Models described as more realistic typically embed past information deeply into their structure. They reproduce historical features by construction. When used for strategy testing, they answer the question of what would happen if the future resembled the past in specific ways. This may be useful for storytelling or risk reporting, but it is not epistemically clean.

If the past is trusted, it should be used directly. Historical testing already answers what happened. Transforming historical data into synthetic data that preserves selected features adds no new information. It either preserves what is already known or injects assumptions about which features matter. In both cases, it increases the risk of overfitting under the strict definition.

By contrast, null models that erase past structure except for scale serve a different purpose. They test whether a strategy survives ignorance. They do not attempt to predict or reproduce history. They act as filters rather than forecasts.
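Using a null model as a filter might look like the following Python sketch. Everything in it is an assumption made for illustration: the driftless geometric Brownian motion generator, the hypothetical moving-average strategy, the parameter values, and the two summary statistics. The essay does not specify what counts as surviving, and the sketch does not set a threshold.

```python
import math
import random

def gbm_path(s0, mu, sigma, n_steps, dt, rng):
    """One GBM path: i.i.d. Gaussian log-increments with fixed scale."""
    path = [s0]
    for _ in range(n_steps):
        z = rng.gauss(0.0, 1.0)
        step = (mu - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * z
        path.append(path[-1] * math.exp(step))
    return path

def momentum_pnl(path, lookback=20):
    """Hypothetical strategy: hold one unit long when the price is above the
    average of the previous `lookback` prices, otherwise stay flat."""
    pnl = 0.0
    for t in range(lookback, len(path) - 1):
        avg = sum(path[t - lookback:t]) / lookback
        position = 1.0 if path[t] > avg else 0.0  # uses only data up to time t
        pnl += position * (path[t + 1] - path[t])
    return pnl

def null_test(strategy, n_paths=500, seed=1):
    """Run the strategy on many null-model paths and summarize outcomes."""
    rng = random.Random(seed)
    pnls = [strategy(gbm_path(100.0, 0.0, 0.2, 252, 1 / 252, rng))
            for _ in range(n_paths)]
    mean = sum(pnls) / n_paths
    win_rate = sum(p > 0 for p in pnls) / n_paths
    return mean, win_rate

mean_pnl, win_rate = null_test(momentum_pnl)
print(f"mean P&L: {mean_pnl:.3f}, fraction of winning paths: {win_rate:.2f}")
```

The paths carry no structure beyond scale, so whatever the strategy earns here cannot come from historical features. The output describes how the strategy behaves under ignorance. It is not a forecast of how it will behave in a market.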

Conclusion

A stricter definition of overfitting leads to a radically different modeling landscape. Overfitting is not the misuse of data. It is the use of uncertifiable information. Under this definition, many models that are considered realistic are overfit by construction. They encode beliefs about the future that cannot be justified ex-ante or with certainty.

The admissible alternatives are few and austere. They rely on symmetry, scale, and minimal structure. They do not claim to describe reality. They claim only to avoid epistemic fraud. This narrow set does not win by being more accurate. It wins by refusing to lie.

Appendix

At the deepest level, the decisive fault of most alternative models is not that they are complex, historical, or statistical. The fault is that they attempt to be predictive. Prediction requires claims about the future. Claims about the future require information. When that information is neither known ex-ante nor known with certainty, prediction becomes belief disguised as modeling.

Conventional models do not merely provide a mathematical environment in which strategies are evaluated. They assert that certain features of the future will occur with specific structure. A heavy-tailed model asserts that extreme events will occur with a particular frequency profile. A regime model asserts that regimes exist, recur, and transition in a structured way. A jump model asserts that discontinuities occur with a definable intensity. These are not neutral abstractions. They are predictive statements, even when framed probabilistically.

This predictive intent is the cardinal epistemic error. It violates the core constraint that the future does not grant ex-ante access to its structure. The models are not wrong because the future will fail to exhibit these features. They are wrong because the modeler has no epistemic standing to assert them in advance. The act of prediction itself is the overfit.

By contrast, the narrow admissible class of models does not attempt to predict. It does not assert that the future will behave in a particular way. It asserts only that uncertainty exists, that it has a scale, and that no further structure is known. Any resemblance between paths generated under such models and realized market behavior is incidental. It is not the objective of the model, and it does not validate it.

This distinction is critical. A model that aims to predict must be judged by its correspondence to outcomes. A model that aims to avoid unjustified belief must be judged by the discipline of its assumptions. These are incompatible goals. When critics argue that scale-only or symmetry-based models are unrealistic, they are implicitly demanding predictive content. Under the present framework, that demand is illegitimate.

The admissible models still produce behavior that may resemble real markets. Paths fluctuate, trends emerge over long horizons, noise accumulates, and variability matters. These features are not claimed as truths about markets. They arise as consequences of symmetry and scale, not as hypotheses about reality. The resemblance is a byproduct, not a justification.

In this sense, the narrow set of models is not weak because it says little. It is strong because it refuses to say more than can be known. Predictive ambition is replaced with epistemic restraint. The goal is not to forecast the future, but to test strategies against ignorance. Anything that survives this test has not been proven correct. It has merely not been proven dishonest.