The Manual Work in Mechanical Trading: Evaluating a Trading System

This article was published in 2006 in SFO Magazine (now defunct).


Not everything in mechanical system trading is automatic. Before you begin trading a system, you must evaluate it carefully.


On May 6, 2106, Rebecca Adams woke up at 6:30 a.m. and started preparing for her day. She showered, dressed, gathered her beach equipment and then went downstairs for breakfast. At 7:45 a.m. she stepped into her office. Displayed on the industry-standard, wall-sized OLED monitor were the results of her overnight trading. Her profits from trading the Shanghai 1000 e-mini futures contracts were partly offset by her losses from the Bombay 50 options, but overall she had booked a nice profit while she was sleeping. The computer informed her that there had been a communications glitch but that the backup lines had kicked in, so her positions were never in jeopardy. Had there been a problem that couldn’t be rectified, the computer would have woken her up with a phone call or sounded a shrill alarm guaranteed to wake the dead. She smiled with satisfaction, turned around and went to spend her day on the beach where she would kick her friends’ butts in beach volleyball. The computer would continue trading while she was gone…

Many beginning system traders have been seduced by this fantasy, and as technology races ahead, portions of this fantasy edge tantalizingly closer to reality. However, a self-correcting, self-adjusting, automated-trading cash machine that will fulfill traders’ fantasies remains as far off in the future as the stories of Isaac Asimov. For now, system traders will have to develop (or purchase) one or more trading systems and create an infrastructure and routine for implementing the trading signals. So given the time-consuming tasks system traders must undertake, this article will focus on developing and evaluating a trading system – either one that is being written by the end user or one that is being considered for purchase or lease.

Reality Check

Every day thousands of extremely intelligent folks (think Ph.D. types) try to find new ways to wring money out of market inefficiencies but have yet to develop a system that consistently makes money over a long period of time. Experienced system traders know this. Instead, they will trade a basket of systems on a basket of commodities on multiple time frames. Over the long run this is the only way to consistently make money with mechanical systems—at least until we can program a machine to do the bulk of the work for us.

Of course many readers may disagree, but that depends on how a system is defined. After all, many traders have a single “system” that they use day in and day out, and they make a good living trading it. However, these “systems” allow some wiggle room for the trader. For the purposes of this article the term “trading system” or “system” will mean a mechanical trading system that has strict entry and exit rules with no room for discretion.

It is extremely hard, if not almost impossible, to consistently make money over a long period trading a single mechanical system. A single robust system generally has only a small edge, which will dull over time, and returns only a small profit on a risk-adjusted basis. Systems must be combined into a portfolio in order to maximize returns and reduce risk.

Take a Good, Long Look at a System

There are hundreds of statistical measures that any trading system can generate. Regardless of what numerical attributes of a trading system are used for evaluation, all of those factors are based on historical data. Without the luxury of peeking into the future, the best you can do is monitor the system and compare the performance to historical norms. When the system starts to deviate — and it will — the trader will have to evaluate the reasons for the deviation and determine if the underlying assumptions on which the system is based still hold water.

Over time I have arrived at my own favorite measures for evaluating systems. Out of the hundreds (maybe thousands) of measures, I’ve narrowed mine down to 16, listed here. Although this list may not necessarily be the best set of factors to use, they work for me and can be a good starting point for newer traders.

1. Underlying premise: The first question that should be asked when looking at a system is this: What’s the underlying assumption or driver of the system? Then a trader should ask if that underlying assumption makes sense. Every system has an underlying driver that should make intuitive sense to a trader. In my experience most robust trading systems are usually based on very simple underlying drivers.

2. Profitability: Many of you reading this will say, “Duh! Of course I’ll look at profitability.” But what I mean by this is to look at whether or not the system is profitable on a single contract. Many systems use money management that varies the number of contracts on each trade. But remove that and the profitability on a single contract is not so enticing – in some cases the system may even be a losing system. Varying the number of contracts is used to control risk and leverage the system; it should not be used to turn a losing system into a winning one.

3. Profit factor: This is just another way of looking at profitability. It is calculated by dividing gross profit (the sum of all winning trades) by gross loss (the sum of all losing trades). A profit factor above 1.0 indicates a winning system. Short-term trading systems tend to have profit factors below 2.0 with a large number of trades.
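As a rough illustration of the profit factor calculation, here is a minimal Python sketch. The function name and the trade figures are hypothetical; the list is assumed to hold single-contract results in dollars, one entry per closed trade.

```python
def profit_factor(trade_pnls):
    """Gross profit divided by gross loss for a list of per-trade results."""
    gross_profit = sum(p for p in trade_pnls if p > 0)
    gross_loss = -sum(p for p in trade_pnls if p < 0)
    if gross_loss == 0:
        # No losing trades: the ratio is undefined, reported here as infinity.
        return float("inf")
    return gross_profit / gross_loss


# Hypothetical single-contract results in dollars (illustrative only).
trades = [500, -200, 300, -400, 250, -150]
pf = profit_factor(trades)
print(round(pf, 2))  # 1050 in wins / 750 in losses = 1.4
```

Running the same function on single-contract results and on the money-managed results is one way to see whether position sizing is hiding a weak system.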

4. Drawdown: What is the maximum drawdown as a percentage of the profits made? A drawdown of $100,000 in order to make $100,000 over five years is not attractive. Where did the max drawdown occur? Was it at the start, end or middle of the equity curve? If it’s at the end (i.e. recently), it’s a warning sign that the system is starting to fail — unless it’s due to unusual circumstances like 9/11.
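The two drawdown questions above (how large, and where on the equity curve) can be sketched as follows. This is only an illustration under simple assumptions: drawdown is measured on closed-trade equity starting from zero, and the function name and trade list are made up for the example.

```python
def max_drawdown(trade_pnls):
    """Return (largest peak-to-valley drop, index of the trade at the valley)."""
    equity = 0.0
    peak = 0.0
    worst = 0.0
    worst_index = None
    for i, pnl in enumerate(trade_pnls):
        equity += pnl
        peak = max(peak, equity)
        drop = peak - equity
        if drop > worst:
            worst = drop
            worst_index = i
    return worst, worst_index


# Hypothetical closed-trade results in dollars (illustrative only).
trades = [500, -200, 300, -400, -250, 600, 150]
dd, where = max_drawdown(trades)
net = sum(trades)
print(dd, where, net)  # worst drop of 650 ends at trade index 4; net profit 700
print(round(100.0 * dd / net, 1))  # drawdown as a percentage of net profit
```

A valley index near the end of the list corresponds to the warning sign described above: the worst drawdown happened recently.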

5. Percent winners and average profit per trade: What is the percentage of winning trades? A high percentage of winning trades will be accompanied by a lower profit per trade and a higher average loss per trade. If it’s not, you are, more than likely, looking at a system that will not hold up over time. Does the underlying premise correspond with these numbers? For example, breakout trend following systems tend to have a lower percentage of winning trades, but the average winning trade will be larger than the average losing trade.

6. Testing period(s): What was the length of the testing period? What types of markets did that period cover? For systems based on daily charts, there should be, ideally, ten years of data. The market data should show multiple uptrends, multiple downtrends and multiple periods of sideways price action, so you can see how the system handled each type of action. Did the system catch the majority of the action that it was designed to catch? For example, if the system is a trend following system, it should have caught most of the trends.

Next, look at how the system was developed and tested. Ideally, the developer would have taken the data and broken it into three segments. The middle 60 percent would be the development set (the set on which the initial testing was conducted). The first 20 percent and last 20 percent together would be the out-of-sample set. Out-of-sample data is data on which the system is tested that was not used in the initial system development. At best, testing on this data set is done once and only once: it either works or it doesn’t. Some development shops will only allow their developers access to the development set. They aren’t even allowed to see the out-of-sample data. Instead, the system is handed off to others for the out-of-sample testing. This goes a long way towards preventing curve-fitting.
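The 20/60/20 split described above might be sketched like this. The function name and the `oos_percent` parameter are hypothetical, and the bars are assumed to be in chronological order.

```python
def split_history(bars, oos_percent=20):
    """Split chronological data into early out-of-sample, development, late out-of-sample."""
    n = len(bars)
    cut = n * oos_percent // 100  # size of each out-of-sample slice
    early_oos = bars[:cut]
    development = bars[cut:n - cut]
    late_oos = bars[n - cut:]
    return early_oos, development, late_oos


# Roughly ten years of daily bars, represented here by placeholder integers.
bars = list(range(2500))
early, dev, late = split_history(bars)
print(len(early), len(dev), len(late))  # 500 1500 500
```

Only the development slice would be handed to whoever is building the system; the two outer slices stay untouched until the single out-of-sample test.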

7. Sample size: The biggest and most difficult factor that system developers have to guard against is accidental curve-fitting of data. A large sample size goes a long way towards mitigating that risk. So the larger the sample size, the better. For short-term swing and day trading systems I want to see 200+ trades. For long-term trend following systems I like to see 50+ trades. The sample size can be on one market or across multiple markets.

8. Consecutive losers: I’m simply looking at one thing here – can I psychologically handle the number of consecutive losers that the system has had in the past? If the system had ten consecutive losers, can I handle that? Will I stop trading it in frustration after losing five times?
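Counting the longest losing streak is straightforward. The sketch below assumes a list of per-trade results in which a negative number is a loser and a break-even trade ends the streak; both the function name and the sample data are illustrative.

```python
def max_consecutive_losers(trade_pnls):
    """Length of the longest run of losing trades."""
    longest = 0
    current = 0
    for pnl in trade_pnls:
        if pnl < 0:
            current += 1
            longest = max(longest, current)
        else:
            current = 0  # a winner or break-even trade ends the streak
    return longest


# Hypothetical per-trade results in dollars (illustrative only).
trades = [100, -50, -20, -30, 200, -10, -10]
print(max_consecutive_losers(trades))  # 3
```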

9. Equity curve: You do not want to see a 45-degree straight line on the equity curve. If there is one, then chances are it is curve-fitted and will not hold up over time. Of course there should be a steady trend up, but a robust system will be punctuated by drawdowns. Another item to look at here is the length of time it took to recover from the maximum drawdown.

10. Year-by-year analysis: Look at some performance measures by year. For example, I want to see a consistent number of trades on a yearly basis and a consistent winning rate year over year. It’s important to note here that in some markets a system may produce less absolute profit today than it did ten years ago. A big reason for this can be a reduction in volatility. Look at the underlying premise of the system to determine if reduced volatility will result in reduced profits. If so, then you do not necessarily want to discard a system since profitability can be restored via money management rules that factor in volatility.

In addition, determine the return you would expect next year on a percentage basis. If the system spends a significant amount of time exposed to the market, then you would want this to be much higher than the so-called risk-free return you could earn with T-bills.
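A year-by-year breakdown of trade count, winning rate and net profit could be sketched as follows. The input format (a list of year and profit pairs) and all figures are assumptions made for the example.

```python
from collections import defaultdict


def yearly_summary(trades):
    """trades: iterable of (year, pnl). Returns {year: (count, win_rate, net)}."""
    by_year = defaultdict(list)
    for year, pnl in trades:
        by_year[year].append(pnl)
    summary = {}
    for year in sorted(by_year):
        pnls = by_year[year]
        wins = sum(1 for p in pnls if p > 0)
        summary[year] = (len(pnls), wins / len(pnls), sum(pnls))
    return summary


# Hypothetical trades as (year, profit in dollars) pairs.
trades = [
    (2003, 400), (2003, -150), (2003, 250),
    (2004, -100), (2004, 300),
    (2005, 200), (2005, -50), (2005, 120), (2005, -80),
]
for year, (count, win_rate, net) in yearly_summary(trades).items():
    print(year, count, round(win_rate, 2), net)
```

Large swings in the trade count or winning rate from one year to the next are the kind of inconsistency this item is meant to expose.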

11. Reaction to unusual events: How did the system respond to unusual events like the market crash of 1987, 9/11 or the Asian currency crisis in 1997? If the system made money, how much was it relative to the overall profitability of the system? What does the system performance look like when those trades were taken out? If the system was unprofitable during those times, how well did the system control risk?

12. Friction costs: In trading, as in physics, friction slows things down. Friction refers to slippage, commission and the costs of missed trades, all of which must be considered when looking at a trading system. You should factor in an appropriate slippage cost for each trade: as little as one tick in highly liquid markets, up to multiple ticks or points in less liquid markets. There have been many systems that went from nicely profitable to break-even or losing simply because of slippage and nothing else. Also, make sure that you analyze the system with reasonable costs in mind. Commission costs ten years ago were very different from costs today. And finally, consider how well the system would perform if you missed ten or 20 percent of the trades. It would not be unusual to miss 20 percent of a system’s trades if you were trading the system manually, or if limit orders aren’t filled or are only partially filled.
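To make the effect of friction concrete, here is a small sketch that charges a fixed cost per trade and then randomly drops a share of the trades. The dollar amounts, the fixed per-trade cost model and the function names are all assumptions; real slippage varies by market and by order type.

```python
import random


def net_after_friction(trade_pnls, slippage, commission):
    """Subtract a fixed per-trade slippage and commission from each trade."""
    cost = slippage + commission
    return [pnl - cost for pnl in trade_pnls]


def drop_missed_trades(trade_pnls, miss_rate, rng):
    """Randomly skip roughly miss_rate of the trades (missed or unfilled orders)."""
    return [pnl for pnl in trade_pnls if rng.random() >= miss_rate]


# Hypothetical gross results in dollars, before any costs.
gross = [120, -80, 60, -40, 90, -70, 50, -30]
net = net_after_friction(gross, slippage=10, commission=5)
print(sum(gross), sum(net))  # 100 gross becomes -20 after costs

# Simulate missing about 20 percent of the trades (seeded for repeatability).
rng = random.Random(1)
partial = drop_missed_trades(net, miss_rate=0.2, rng=rng)
print(len(partial), sum(partial))
```

Repeating the missed-trade step with different seeds gives a feel for how sensitive the results are to which trades happen to be skipped.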

13. Multiple time frames: What time frames does the system consider when trading? Systems that use a longer time frame to determine the dominant trend and shorter time frames for entries tend to be more robust than systems that use a single time frame.

14. Multiple types of data: What types of data is the system using? Systems that use some fundamental data combined with technical data are more comprehensive.

15. Multiple markets: How well does the system work on multiple markets? For example, a system developed on the S&Ps should at least be profitable on other broad market measures such as the Russell and Dow. How well does the system work on non-correlated markets? Systems that are at least profitable on non-correlated markets may be more reliable.

16. Monte Carlo simulation: A Monte Carlo simulation is a relatively simple concept. It takes the trades that a system produces and randomly mixes them up to produce new sequences of trades. This tests whether the system would still be tradable if the trades had occurred in a different sequence. Most of the simulations should be profitable, and most of the drawdowns and runs of consecutive winners or losers should be within your tolerance level.
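A bare-bones version of this reshuffling can be sketched in Python. Note one assumption: with fixed one-contract sizing, simply reordering the same trades leaves the net profit unchanged, so this sketch looks only at how the maximum drawdown varies. A variant that samples trades with replacement, or that applies money management, would also make the profit vary from run to run. The function names, the number of runs and the trade list are illustrative.

```python
import random


def worst_drawdown(pnls):
    """Largest peak-to-valley drop in cumulative closed-trade equity."""
    equity = peak = worst = 0.0
    for pnl in pnls:
        equity += pnl
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst


def monte_carlo_drawdowns(trade_pnls, runs=1000, seed=42):
    """Shuffle the trade order `runs` times; return the sorted drawdowns."""
    rng = random.Random(seed)
    results = []
    for _ in range(runs):
        shuffled = list(trade_pnls)
        rng.shuffle(shuffled)
        results.append(worst_drawdown(shuffled))
    results.sort()
    return results


# Hypothetical single-contract results in dollars (illustrative only).
trades = [500, -200, 300, -400, -250, 600, 150, -100, 350, -300]
dds = monte_carlo_drawdowns(trades)
median_dd = dds[len(dds) // 2]
pct95_dd = dds[len(dds) * 95 // 100]
print(median_dd, pct95_dd)
```

If the 95th-percentile drawdown is more than you could tolerate, the historical drawdown figure is probably flattering the system.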

Mix It Up

The last thing on most beginning system traders’ minds is diversification — evaluating a system is complicated enough — yet this has the most potential to keep a system trader out of trouble. Diversification can help to mitigate the two largest stumbling blocks to trading systems: adequate risk control and adequate return on equity.

When you diversify, the drawdown periods of different systems can often offset one another. Diversification can mean trading multiple systems or multiple system types (short-term, long-term, trend following, counter-trend, etc.), trading systems on multiple time frames, or trading systems in multiple, non-correlated markets.

Ideally, a system trader’s portfolio will comprise all three forms of diversification, but even incorporating one can add a much-needed measure of diversity to a portfolio. An entire book can be written about how to ensure that your portfolio diversification choices mitigate risk instead of adding risk, but the point to be made here in this short article is that some diversification is usually better than no diversification.
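The way drawdowns in different systems can offset each other is easy to show with a toy example. The monthly figures below are invented and deliberately chosen so that the two systems lose money in different months; systems whose losing periods coincide would not show this benefit.

```python
def worst_drawdown(pnls):
    """Largest peak-to-valley drop in cumulative equity."""
    equity = peak = worst = 0.0
    for pnl in pnls:
        equity += pnl
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst


# Hypothetical monthly results in dollars for two systems (illustrative only).
system_a = [300, -500, -200, 400, 600, -100]   # e.g. a trend following system
system_b = [-100, 350, 250, -300, -150, 400]   # e.g. a counter-trend system
combined = [a + b for a, b in zip(system_a, system_b)]

print(sum(system_a), worst_drawdown(system_a))   # 500 net, 700 drawdown
print(sum(system_b), worst_drawdown(system_b))   # 450 net, 450 drawdown
print(sum(combined), worst_drawdown(combined))   # 950 net, 150 drawdown
```

In this contrived case the combined portfolio keeps the full profit of both systems while its worst drawdown is smaller than that of either system alone.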

Make the Most of Mechanical System Trading

A disciplined, systematic evaluation process and a properly diversified portfolio of systems are two of the foundations of successful mechanical system trading. Between your own ideas for trading systems and the hundreds of additional systems that are available commercially or in the public domain, you can and will spend a lot of time evaluating systems. Thus, in order to make the most of your time, having a fixed process to examine each system is essential.

It also takes a portfolio of systems to be successful in the long run. Markets change and individual systems fail. That’s a fact of system trading life. As with any type of trader, system traders have to adapt to these changing conditions. A well-diversified portfolio of systems ensures that a sudden change does not put you out of business before you’ve had a chance to adapt.

Incorporating diversification and a systematic evaluation of trading systems will boost your chances of success as a mechanical system trader, and may make you almost as efficient as that powerful computer of the future.
