### Thoughts on Quantitative Trading, Again

Food for thought. The first part leads into a discussion of the surprisingly complicated question of what "good" analysis really is. In my opinion at least, any attempt to bridge the divide between true quantitative finance and qualitative analysis benefits from a robust definition of good analysis, regardless of its origin. I attempt one at the end of this long-winded piece.

Parametric complexity refers to the number of factors that are analyzed in the investment process. Various price processes are examples of factors, but so are more qualitative things like the weather, the number of buildings a company has, the quality of a CEO’s education or the tenor of his voice.

Depth of understanding refers to how deeply one comprehends a given factor. So rather than evaluating a thesis, model or strategy by how many parameters were analyzed, this measure evaluates how well one understands each of those parameters.

I’ll use option valuation as an example to better explain what I mean by these definitions. The Black-Scholes equation is what many use to value an option. It prices the option with five parameters: the current price of the underlying, the current volatility of the underlying, the current risk-free rate, the time to maturity and the strike price. So even though a stock’s price is governed by a practically unlimited number of factors (fear, greed, sentiment, etc.), no-arbitrage restrictions have reduced this plethora of factors down to just five. The model is pretty sparse in terms of parametric complexity, and I believe the reason it does so well has to do with the depth of understanding of the parameters used. Only by fully understanding how the option’s value is derived from the cost of its replicating portfolio can you model its price in such a succinct way.
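To make the sparseness concrete, here is a minimal sketch of the standard Black-Scholes formula for a European call on a non-dividend-paying stock. The function and argument names (`black_scholes_call`, `spot`, `vol`, and so on) and the sample inputs are my own choices for illustration; the only inputs are the five parameters listed above.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x: float) -> float:
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def black_scholes_call(spot: float, strike: float, rate: float,
                       vol: float, maturity: float) -> float:
    """European call price under constant volatility and a constant rate.

    Assumes no dividends; rate and vol are annualized, maturity is in years.
    """
    d1 = (log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / (vol * sqrt(maturity))
    d2 = d1 - vol * sqrt(maturity)
    return spot * norm_cdf(d1) - strike * exp(-rate * maturity) * norm_cdf(d2)


# Illustrative inputs only: at-the-money call, 5% rate, 20% vol, one year.
price = black_scholes_call(spot=100.0, strike=100.0, rate=0.05, vol=0.20, maturity=1.0)
print(round(price, 2))  # about 10.45
```

Nothing about fear, greed or sentiment appears in the signature; whatever those factors contribute has to arrive through the spot price and the volatility input.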

Depth of understanding is also responsible for the more advanced incarnations of the model. People realized volatility wasn’t constant, so they found ways to model it.

GARCH can often give a much better estimate of one-period-ahead volatility while conditioning on very little additional information: in the common GARCH(1,1) form, just the prior period's variance estimate and squared return.
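As a rough sketch of how little the forecast needs, the GARCH(1,1) recursion updates the variance estimate from the previous estimate and the previous squared return. The coefficients `omega`, `alpha` and `beta` below are made-up illustrative values, not fitted ones (in practice they would be estimated, typically by maximum likelihood), and starting the recursion at the unconditional variance is one convention among several.

```python
def garch11_next_variance(returns, omega, alpha, beta):
    """One-period-ahead variance forecast from the GARCH(1,1) recursion.

    sigma2[t+1] = omega + alpha * r[t]**2 + beta * sigma2[t]

    The recursion is seeded with the unconditional variance
    omega / (1 - alpha - beta), which requires alpha + beta < 1.
    """
    if alpha + beta >= 1.0:
        raise ValueError("alpha + beta must be below 1 for a finite unconditional variance")
    variance = omega / (1.0 - alpha - beta)
    for r in returns:
        variance = omega + alpha * r * r + beta * variance
    return variance


# Hypothetical daily returns and coefficients, for illustration only.
daily_returns = [0.01, -0.02]
forecast = garch11_next_variance(daily_returns, omega=1e-6, alpha=0.1, beta=0.8)
print(forecast ** 0.5)  # forecast volatility for the next day
```

The large move on the second day pushes the forecast well above the long-run level, which is the volatility clustering a constant-volatility model cannot express.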

Volatility functions try to give better estimates with a couple more factors, like moneyness and time to maturity. People realized interest rates weren’t constant either, so they adjusted for stochastic interest rates. These adjustments add few new parameters, yet in some cases they can greatly improve the estimate of the parameter in question. That is possible only through a deep understanding of the variables involved.

No one can say what makes a successful quant (I definitely can’t), but judging from the past, some of the biggest breakthroughs seem to have come not from constructing an incredibly parametrically complex model, or from vastly increasing the complexity of an existing one, but from reaching new depths of understanding of the process or processes under study while keeping parametric complexity manageable.

What would a deep value investor look at when deciding whether to buy an option on a stock? I can guarantee they would allocate their time in a completely different fashion, probably focusing more on what is driving the stock process itself. They would probably take in far more factors than would ever be healthy for a quant model; the difference in the number of factors is probably orders of magnitude. Even the implied volatility function (IVF), which is supposed to capture the seemingly obvious skew factor, empirically does little better on out-of-sample data than constant volatility!

The depth of a manager’s understanding of each factor that goes into that manager’s model is probably one of the big differentiators of quality among investment managers. Parametric complexity can only go so far before one runs into serious problems. I’m not entirely sure that deep value investing has such a limit on complexity. The commoditization of many quant strategies may correlate with how parametrically complex the process is.

Ways to Evaluate Analysis:

Temporal Relevance: If one were to decompose the information in an analysis into its component factors, what is the time horizon for each factor? Is that horizon so short that it fails to capture all the relevant information about the factor, making the analysis less robust? Is it so long that you are incorporating irrelevant information? Keep in mind that the time horizon can be a union of disjoint periods; it is in no way limited to one continuous stretch of time.

Idiosyncrasy: If one were to decompose an analysis into its component factors, how idiosyncratic are the various factors? Can we get outright datasets for the less idiosyncratic factors to improve the robustness of the analysis? Might it be possible to create 'fuzzy datasets' of our own, using pre-specified classification rules and a large news database, for the more idiosyncratic variables? Could we make our analysis more "rich" by incorporating more idiosyncratic elements instead of strictly hugging the data? Finally, could we be placing an irrational amount of weight on idiosyncratic elements to make up for a genuine lack of numerical data?

Dimensionality: If one were to decompose an analysis into its component factors, just how many factors are there, and are we weighting them rationally? How sensitive is our valuation to each of our assumptions? Did we make sure to construct a full dataset for each factor? If not, we again run into robustness issues. In general, the more factors you introduce at the same time, the greater the risk that what you are seeing is a chance combination that just happened to fit the back-test.
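The back-test risk can be illustrated with a small simulation, sketched here under deliberately artificial assumptions: the "returns" and every candidate "factor" are independent Gaussian noise, so no factor has any real predictive power. The helper names (`pearson`, `best_of_n_backtest`) and the sample sizes are hypothetical choices of mine.

```python
import random
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def best_of_n_backtest(n_factors, n_obs=120, seed=0):
    """Pick the noise factor that best fits the first half of the sample.

    Returns (|in-sample correlation|, |out-of-sample correlation|) for the
    selected factor. Everything is pure noise by construction.
    """
    rng = random.Random(seed)
    half = n_obs // 2
    target = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
    best_in, best_out = 0.0, 0.0
    for _ in range(n_factors):
        factor = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
        in_corr = pearson(factor[:half], target[:half])
        if abs(in_corr) > abs(best_in):
            best_in = in_corr
            best_out = pearson(factor[half:], target[half:])
    return abs(best_in), abs(best_out)


for n in (5, 50, 200):
    in_sample, out_of_sample = best_of_n_backtest(n)
    print(n, round(in_sample, 3), round(out_of_sample, 3))
```

As more candidates are tried, the best in-sample fit can only improve, while the out-of-sample figure for the chosen factor has no reason to follow, since the selection was made on noise.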

Orthogonality: In our analysis, are we using the same one general line of reasoning the whole way through, or are we calling on a large number of distinct and unrelated factors (or sources of information for those factors)? The more unrelated our factors and their sources are, the better, because that means our conclusion is less sensitive to any one factor or source and our inferential ability could be higher.
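One crude way to put a number on how unrelated a set of factors is, assuming the factors can be expressed as numeric series at all, is the average absolute pairwise correlation: close to 0 means the factors are roughly orthogonal, close to 1 means they are largely restating one another. This is only a sketch; the function name and the toy series are hypothetical, and linear correlation will miss nonlinear dependence and shared sources of information.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mean_abs_pairwise_corr(factors):
    """Average |correlation| over all distinct pairs of factor series."""
    pairs = list(combinations(factors.values(), 2))
    return sum(abs(pearson(x, y)) for x, y in pairs) / len(pairs)


# Toy series: factor_b is just factor_a rescaled, factor_c is unrelated to both.
toy_factors = {
    "factor_a": [1.0, 2.0, 3.0, 4.0],
    "factor_b": [2.0, 4.0, 6.0, 8.0],
    "factor_c": [1.0, -1.0, -1.0, 1.0],
}
overlap = mean_abs_pairwise_corr(toy_factors)
print(round(overlap, 3))  # 0.333: one redundant pair out of three
```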

Popularity: If you've found an important factor that other people either don't know about or are weighing improperly, you have much to gain. However, if everyone knows that your factor is important, then it ceases to be useful as an input for inferential purposes, because prices update too quickly. Once that happens, you need to find factors with inferential power over the original factor that other people either don't know about or are weighing improperly. So an important question to ask yourself becomes: if one were to decompose an analysis into its component factors, how popular are those factors? For the better-known factors, are their values themselves predicted using other factors which are perhaps less well known? For the lesser-known factors, are we sure they aren't less well known for a reason?

A Factors-Based Approach:

In the same way that two orthogonal unit vectors can form a basis for 2-space, perhaps a substantial number of factors of various forms can form a basis for our prediction space. A factors-based approach makes some theoretical sense to me because the factors you create can be used over and over again for different predictions. Factors can also be used to predict other factors, should your original factors become very popular. I suppose the key driver of this approach is the belief that everything is interconnected. You will also probably end up as a market historian as well as a mathematician; I think the two skill sets complement one another. Btw, I would be interested to see that proof on high dimensionality without overfitting. Unless the explanatory variables are completely unrelated to one another on at least a couple of levels, it doesn't make intuitive sense to me why I should be able to avoid overfitting.
