Open Access

Machine Learning Methods in Algorithmic Trading Strategy Optimization – Design and Time Efficiency



Fig. 1

The equity lines of the strategies selected by all the methods for SPXDAX - in-sample. SPX - S&P500 Index, DAX - Deutscher Aktienindex; ES, EHC, GD, DEM - equity lines of the median strategies resulting from Exhaustive Search, Extended Hill Climbing, Grid Method and Differential Evolution Method, respectively. Prices of both assets have been normalized to an initial value of 1000. The equity lines have been calculated for strategies working on daily frequency, investing 20% of capital in a position on each asset, with rebalancing every 5 trading days. Trading from the beginning of 1998 to the end of 2013 has been simulated, assuming a fee equal to 0.25% of the position value.
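The normalization mentioned in the caption can be illustrated with a minimal sketch. The function name `normalize_prices` and the sample prices are hypothetical and serve only as an illustration; the sketch assumes that normalization means rescaling each series by its first value.

```python
def normalize_prices(prices, base=1000.0):
    """Rescale a price series so that its first value equals `base`.

    Assumption: "normalized to an initial value of 1000" means dividing
    every price by the first price and multiplying by 1000.
    """
    if not prices:
        raise ValueError("prices must be non-empty")
    first = prices[0]
    if first == 0:
        raise ValueError("the first price must be non-zero")
    return [base * p / first for p in prices]


# Hypothetical prices, not taken from the data set used in the paper.
sample = [50.0, 55.0, 45.0]
print(normalize_prices(sample))  # [1000.0, 1100.0, 900.0]
```

After such rescaling, assets with very different price levels can be drawn on one chart next to the equity lines.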

Fig. 2

The histograms of the reached optimization criterion and the execution time of EHC for SPXDAX – in-sample. OC - optimization criterion calculated as 100 * (%ARC * %ARC) / (%ASD * %MDD). The optimization criterion has been calculated from a sample of 1000 independent algorithm executions. The strategies work on daily frequency, investing 20% of capital in a position on each asset, with rebalancing every 5 trading days. Trading from the beginning of 1998 to the end of 2013 has been simulated, assuming a fee equal to 0.25% of the position value.
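The optimization criterion from the caption can be written as a short function. This is a minimal sketch of the formula exactly as stated in the caption; the function name and the argument names are assumptions. The formula as written is never negative, while the tables report negative OC values for strategies with negative %ARC, so the authors may use a sign-preserving variant that is not reproduced here.

```python
def optimization_criterion(arc_pct, asd_pct, mdd_pct):
    """OC = 100 * (%ARC * %ARC) / (%ASD * %MDD), as given in the caption.

    All three inputs are expressed in percent. The handling of negative
    %ARC in the original study is not covered by this sketch.
    """
    if asd_pct <= 0 or mdd_pct <= 0:
        raise ValueError("%ASD and %MDD must be positive")
    return 100.0 * (arc_pct * arc_pct) / (asd_pct * mdd_pct)


# Rounded in-sample statistics of the ES strategy for AAPLMSFT, taken
# from the tables. The result is close to the tabulated OC of 368.83;
# small differences come from rounding of the inputs.
print(optimization_criterion(17.79, 11.13, 7.71))
```

Because %ARC enters squared, a strategy that doubles its return at unchanged risk gets four times the criterion value.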

Fig. 3

The histograms of the reached optimization criterion and the execution time of DEM for SPXDAX - in-sample. OC - optimization criterion calculated as 100 * (%ARC * %ARC) / (%ASD * %MDD). The equity lines have been calculated for strategies working on daily frequency, investing 20% of capital in a position on each asset, with rebalancing every 5 trading days. Trading from the beginning of 1998 to the end of 2013 has been simulated, assuming a fee equal to 0.25% of the position value.

Fig. 4

The equity lines of the strategies selected by all the methods for SPXDAX – out-of-sample. ES, EHC, GD, DEM - equity lines of the median strategies resulting from Exhaustive Search, Extended Hill Climbing, Grid Method and Differential Evolution Method, respectively. Prices of both assets have been normalized to an initial value of 1000. The equity lines have been calculated for strategies working on daily frequency, investing 20% of capital in a position on each asset, with rebalancing every 5 trading days. Trading from the beginning of 2014 to the end of 2017 has been simulated, assuming a fee equal to 0.25% of the position value.

Fig. 5

The equity lines of the strategies selected by all the methods for AAPLMSFT – in-sample. AAPL - Apple Inc. stock, MSFT - Microsoft Corp. stock; ES, EHC, GD, DEM - equity lines of the median strategies resulting from Exhaustive Search, Extended Hill Climbing, Grid Method and Differential Evolution Method, respectively. Prices of both assets have been normalized to an initial value of 1000. The equity lines have been calculated for strategies working on daily frequency, investing 20% of capital in a position on each asset, with rebalancing every 5 trading days. Trading from the beginning of 1998 to the end of 2013 has been simulated, assuming a fee equal to 0.25% of the position value.

Fig. 6

The histograms of the reached optimization criterion and the execution time of EHC for AAPLMSFT – in-sample. OC - optimization criterion calculated as 100 * (%ARC * %ARC) / (%ASD * %MDD). The optimization criterion has been calculated from a sample of 1000 independent algorithm executions. The strategies work on daily frequency, investing 20% of capital in a position on each asset, with rebalancing every 5 trading days. Trading from the beginning of 1998 to the end of 2013 has been simulated, assuming a fee equal to 0.25% of the position value.

Fig. 7

The histograms of the reached optimization criterion and the execution time of DEM for AAPLMSFT – in-sample. OC - optimization criterion calculated as 100 * (%ARC * %ARC) / (%ASD * %MDD). The optimization criterion has been calculated from a sample of 1000 independent algorithm executions. The strategies work on daily frequency, investing 20% of capital in a position on each asset, with rebalancing every 5 trading days. Trading from the beginning of 1998 to the end of 2013 has been simulated, assuming a fee equal to 0.25% of the position value.

Fig. 8

The equity lines of the strategies selected by all the methods for AAPLMSFT – out-of-sample. AAPL - Apple Inc. stock, MSFT - Microsoft Corp. stock; ES, EHC, GD, DEM - equity lines of the median strategies resulting from Exhaustive Search, Extended Hill Climbing, Grid Method and Differential Evolution Method, respectively. Prices of both assets have been normalized to an initial value of 1000. The equity lines have been calculated for strategies working on daily frequency, investing 20% of capital in a position on each asset, with rebalancing every 5 trading days. Trading from the beginning of 2014 to the end of 2017 has been simulated, assuming a fee equal to 0.25% of the position value.

Fig. 9

The equity lines of the strategies selected by all the methods for HGFCBF – in-sample. HGF - High Grade Copper Futures, CBF - Crude Oil Brent Futures; ES, EHC, GD, DEM - equity lines of the median strategies resulting from Exhaustive Search, Extended Hill Climbing, Grid Method and Differential Evolution Method, respectively. Prices of both assets have been normalized to an initial value of 1000. The equity lines have been calculated for strategies working on daily frequency, investing 20% of capital in a position on each asset, with rebalancing every 5 trading days. Trading from the beginning of 1998 to the end of 2013 has been simulated, assuming a fee equal to 0.25% of the position value.

Fig. 10

The histograms of the reached optimization criterion and the execution time of EHC for HGFCBF – in-sample. OC - optimization criterion calculated as 100 * (%ARC * %ARC) / (%ASD * %MDD). The optimization criterion has been calculated from a sample of 1000 independent algorithm executions. The strategies work on daily frequency, investing 20% of capital in a position on each asset, with rebalancing every 5 trading days. Trading from the beginning of 1998 to the end of 2013 has been simulated, assuming a fee equal to 0.25% of the position value.

Fig. 11

The histograms of the reached optimization criterion and the execution time of DEM for HGFCBF – in-sample. OC - optimization criterion calculated as 100 * (%ARC * %ARC) / (%ASD * %MDD). The optimization criterion has been calculated from a sample of 1000 independent algorithm executions. The strategies work on daily frequency, investing 20% of capital in a position on each asset, with rebalancing every 5 trading days. Trading from the beginning of 1998 to the end of 2013 has been simulated, assuming a fee equal to 0.25% of the position value.

Fig. 12

The equity lines of the strategies selected by the different methods for HGFCBF – out-of-sample. HGF - High Grade Copper Futures, CBF - Crude Oil Brent Futures; ES, EHC, GD, DEM - equity lines of the median strategies resulting from Exhaustive Search, Extended Hill Climbing, Grid Method and Differential Evolution Method, respectively. Prices of both assets have been normalized to an initial value of 1000. The equity lines have been calculated for strategies working on daily frequency, investing 20% of capital in a position on each asset, with rebalancing every 5 trading days. Trading from the beginning of 2014 to the end of 2017 has been simulated, assuming a fee equal to 0.25% of the position value.

Fig. 13

The box plot of the optimization criterion of the strategies selected by the machine learning methods, as a percentage of the global maximum found by the Exhaustive Search. The samples are denoted by the algorithm acronym and the number of the trading case, where 1, 2 and 3 stand for SPXDAX, AAPLMSFT and HGFCBF, respectively. The box plots present the quartiles of the empirical distribution and highlight the outliers. Half of the observations lie inside the corresponding box, and the line inside marks the median. An observation is considered an outlier and marked by a circle if its distance from the nearest quartile (the nearest side of the box) is greater than 1.5 times the interquartile range. The range of the observations without outliers is marked by the whiskers. This type of box plot is often called the Tukey box plot. It is worth noting that the box plots of the Grid Method results are just a line, because the results of that method are deterministic.
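The outlier rule described in the caption can be sketched as follows. The function name `tukey_outliers` is an assumption, and the quartiles are computed with the "inclusive" method of Python's `statistics.quantiles`; plotting software may use a slightly different quartile definition, so the exact fences in the published figures may differ.

```python
import statistics


def tukey_outliers(values, k=1.5):
    """Return observations lying more than k * IQR outside the box.

    The box spans the first quartile (Q1) to the third quartile (Q3);
    IQR = Q3 - Q1. Observations below Q1 - k * IQR or above
    Q3 + k * IQR are treated as outliers.
    """
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical execution times in seconds, with one very slow run.
times = [1, 2, 3, 4, 5, 6, 7, 8, 100]
print(tukey_outliers(times))  # [100]

# A deterministic method returns the same value on every run, so the
# box collapses to a line and no observation is flagged.
print(tukey_outliers([97.04] * 5))  # []
```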

Fig. 14

The box plot of the empirical distribution of the machine learning methods' computation time. The samples are denoted by the algorithm acronym and the number of the trading case, where 1, 2 and 3 stand for SPXDAX, AAPLMSFT and HGFCBF, respectively. The box plots present the quartiles of the empirical distribution and highlight the outliers. Half of the observations lie inside the corresponding box, and the line inside marks the median. An observation is considered an outlier and marked by a circle if its distance from the nearest quartile (the nearest side of the box) is greater than 1.5 times the interquartile range. The range of the observations without outliers is marked by the whiskers. This type of box plot is often called the Tukey box plot. It is worth noting that the box plots of the Grid Method results are just a line, because the results of that method are deterministic.

The summary of the reached optimization criterion and the execution time of methods for SPXDAX – in-sample

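| Statistic | ES OC | ES Time [sec] | EHC OC | EHC Time [sec] | GM OC | GM Time [sec] | DEM OC | DEM Time [sec] |
|---|---|---|---|---|---|---|---|---|
| Minimum | 77.79 | 35562.17 | 65.58 | 11.87 | 77.16 | 128.04 | 71.93 | 13.11 |
| 1st quartile | 77.79 | 35562.17 | 74.39 | 13.93 | 77.16 | 128.04 | 77.16 | 24.84 |
| Median | 77.79 | 35562.17 | 77.16 | 30.97 | 77.16 | 128.04 | 77.16 | 31.15 |
| Mean | 77.79 | 35562.17 | 75.94 | 43.1 | 77.16 | 128.04 | 77.34 | 42.73 |
| 3rd quartile | 77.79 | 35562.17 | 77.16 | 65.32 | 77.16 | 128.04 | 77.79 | 61.5 |
| Max | 77.79 | 35562.17 | 77.79 | 569.39 | 77.16 | 128.04 | 77.79 | 141.08 |
| Standard deviation | 0.00 | 0.00 | 2.61 | 48.77 | 0.00 | 0.00 | 0.36 | 24.06 |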

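The parameters and statistics of the median strategies resulting from all the methods for SPXDAX

| | In-sample ES | In-sample EHC | In-sample GM | In-sample DEM | Out-of-sample ES | Out-of-sample EHC | Out-of-sample GM | Out-of-sample DEM |
|---|---|---|---|---|---|---|---|---|
| k1 | 60.00 | 100.00 | 100.00 | 100.00 | 60.00 | 100.00 | 100.00 | 100.00 |
| k2 | 45.00 | 35.00 | 35.00 | 35.00 | 45.00 | 35.00 | 35.00 | 35.00 |
| k1.2 | 65.00 | 45.00 | 45.00 | 45.00 | 65.00 | 45.00 | 45.00 | 45.00 |
| k2.2 | 75.00 | 85.00 | 85.00 | 85.00 | 75.00 | 85.00 | 85.00 | 85.00 |
| %ARC | 4.27 | 3.92 | 3.92 | 3.92 | -0.03 | -0.62 | -0.62 | -0.62 |
| %ASD | 5.17 | 4.63 | 4.63 | 4.63 | 4.02 | 3.74 | 3.74 | 3.74 |
| IR | 0.83 | 0.85 | 0.85 | 0.85 | -0.01 | -0.17 | -0.17 | -0.17 |
| %MDD | 4.53 | 4.30 | 4.30 | 4.30 | 7.20 | 6.34 | 6.34 | 6.34 |
| OC | 77.79 | 77.16 | 77.16 | 77.16 | 0.00 | -1.62 | -1.62 | -1.62 |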

The descriptive statistics of the considered assets

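| | In-sample SPX | In-sample DAX | In-sample AAPL | In-sample MSFT | In-sample HGF | In-sample CBF | Out-of-sample SPX | Out-of-sample DAX | Out-of-sample AAPL | Out-of-sample MSFT | Out-of-sample HGF | Out-of-sample CBF |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| %ARC | 3.92 | 4.79 | 35.22 | 6.32 | 9.41 | 12.14 | 9.67 | 8.07 | 22.68 | 25.70 | -0.62 | -11.01 |
| %ASD | 20.39 | 24.97 | 46.69 | 33.06 | 28.48 | 34.54 | 11.94 | 18.37 | 22.27 | 21.43 | 19.25 | 33.07 |
| IR | 0.19 | 0.19 | 0.75 | 0.19 | 0.33 | 0.35 | 0.81 | 0.44 | 1.02 | 1.2 | -0.03 | -0.33 |
| %MDD | 56.78 | 72.68 | 43.80 | 71.65 | 68.37 | 73.48 | 14.16 | 29.27 | 30.45 | 18.05 | 42.47 | 75.83 |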
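The IR values in this table are numerically consistent with the ratio of %ARC to %ASD. The sketch below checks that relationship on three columns; treating IR as %ARC / %ASD is an inference from the tabulated numbers, not a definition given in this section, and the function name is an assumption.

```python
def information_ratio(arc_pct, asd_pct):
    """Return %ARC / %ASD (assumed definition, inferred from the table)."""
    if asd_pct <= 0:
        raise ValueError("%ASD must be positive")
    return arc_pct / asd_pct


# (%ARC, %ASD, tabulated IR) for SPX in-sample, AAPL in-sample
# and CBF out-of-sample.
rows = [(3.92, 20.39, 0.19), (35.22, 46.69, 0.75), (-11.01, 33.07, -0.33)]
for arc, asd, tabulated in rows:
    print(round(information_ratio(arc, asd), 2), tabulated)
```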

Mean and median optimization criterion reached by the different methods, relative to the ES method, in percent – in-sample

| | ES | Grid | EHC median | DEM median | EHC mean | DEM mean |
|---|---|---|---|---|---|---|
| SPXDAX | 100 | 99.19 | 99.19 | 99.19 | 97.62 | 99.42 |
| AAPLMSFT | 100 | 100.00 | 100.00 | 100.00 | 99.55 | 99.92 |
| HGFCBF | 100 | 88.94 | 88.94 | 88.94 | 89.65 | 90.74 |
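The percentages in this table can be reproduced from the in-sample summary tables by dividing each method's value by the ES value. A minimal sketch, with a hypothetical helper name:

```python
def percent_of_reference(value, reference):
    """Express `value` as a percentage of `reference` (here the ES result)."""
    if reference == 0:
        raise ValueError("reference must be non-zero")
    return 100.0 * value / reference


# Optimization criterion values taken from the in-sample summary tables.
print(round(percent_of_reference(77.16, 77.79), 2))   # SPXDAX, Grid: 99.19
print(round(percent_of_reference(75.94, 77.79), 2))   # SPXDAX, EHC mean: 97.62
print(round(percent_of_reference(97.04, 109.11), 2))  # HGFCBF, Grid: 88.94
```

The same ratio, applied to execution times instead of the optimization criterion, gives the relative computation times.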

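The parameters and statistics of the median strategies resulting from all the methods for AAPLMSFT

| | In-sample ES | In-sample EHC | In-sample GM | In-sample DEM | Out-of-sample ES | Out-of-sample EHC | Out-of-sample GM | Out-of-sample DEM |
|---|---|---|---|---|---|---|---|---|
| k1 | 50.00 | 50.00 | 50.00 | 50.00 | 50.00 | 50.00 | 50.00 | 50.00 |
| k2 | 60.00 | 60.00 | 60.00 | 60.00 | 60.00 | 60.00 | 60.00 | 60.00 |
| k1.2 | 75.00 | 75.00 | 75.00 | 75.00 | 75.00 | 75.00 | 75.00 | 75.00 |
| k2.2 | 40.00 | 40.00 | 40.00 | 40.00 | 40.00 | 40.00 | 40.00 | 40.00 |
| %ARC | 17.79 | 17.79 | 17.79 | 17.79 | 1.37 | 1.37 | 1.37 | 1.37 |
| %ASD | 11.13 | 11.13 | 11.13 | 11.13 | 5.68 | 5.68 | 5.68 | 5.68 |
| IR | 1.60 | 1.60 | 1.60 | 1.60 | 0.24 | 0.24 | 0.24 | 0.24 |
| %MDD | 7.71 | 7.71 | 7.71 | 7.71 | 9.54 | 9.54 | 9.54 | 9.54 |
| OC | 368.83 | 368.83 | 368.83 | 368.83 | 3.49 | 3.49 | 3.49 | 3.49 |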

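The parameters and statistics of the median strategies resulting from all the methods for HGFCBF

| | In-sample ES | In-sample EHC | In-sample GM | In-sample DEM | Out-of-sample ES | Out-of-sample EHC | Out-of-sample GM | Out-of-sample DEM |
|---|---|---|---|---|---|---|---|---|
| k1 | 60.00 | 60.00 | 60.00 | 60.00 | 60.00 | 60.00 | 60.00 | 60.00 |
| k2 | 75.00 | 75.00 | 75.00 | 75.00 | 75.00 | 75.00 | 75.00 | 75.00 |
| k1.2 | 50.00 | 30.00 | 30.00 | 30.00 | 50.00 | 30.00 | 30.00 | 30.00 |
| k2.2 | 25.00 | 95.00 | 95.00 | 95.00 | 25.00 | 95.00 | 95.00 | 95.00 |
| %ARC | 8.18 | 9.53 | 9.53 | 9.53 | -1.59 | 6.60 | 6.60 | 7.38 |
| %ASD | 8.16 | 9.83 | 9.83 | 9.83 | 7.09 | 8.17 | 8.17 | 8.07 |
| IR | 1.00 | 0.97 | 0.97 | 0.97 | -0.22 | 0.81 | 0.81 | 0.91 |
| %MDD | 7.51 | 9.52 | 9.52 | 9.52 | 15.86 | 12.16 | 12.16 | 12.16 |
| OC | 109.11 | 97.04 | 97.04 | 97.04 | -2.24 | 43.81 | 43.81 | 55.49 |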

The summary of the reached optimization criterion and the execution time of the methods for HGFCBF – in-sample

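| Statistic | ES OC | ES Time [sec] | EHC OC | EHC Time [sec] | GM OC | GM Time [sec] | DEM OC | DEM Time [sec] |
|---|---|---|---|---|---|---|---|---|
| Minimum | 109.11 | 42193.57 | 77.29 | 12.29 | 97.04 | 113.75 | 97.04 | 9.76 |
| 1st quartile | 109.11 | 42193.57 | 93.62 | 14.73 | 97.04 | 113.75 | 97.04 | 19.92 |
| Median | 109.11 | 42193.57 | 97.04 | 33.09 | 97.04 | 113.75 | 97.04 | 23.08 |
| Mean | 109.11 | 42193.57 | 97.82 | 42.69 | 97.04 | 113.75 | 99.01 | 27.34 |
| 3rd quartile | 109.11 | 42193.57 | 109.11 | 36.85 | 97.04 | 113.75 | 97.04 | 29.15 |
| Max | 109.11 | 42193.57 | 109.11 | 622.17 | 97.04 | 113.75 | 109.11 | 110.93 |
| Standard deviation | 0.00 | 0.00 | 8.59 | 51.28 | 0.00 | 0.00 | 4.46 | 12.5 |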

The summary of the reached optimization criterion and the execution time of methods for AAPLMSFT – in-sample

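| Statistic | ES OC | ES Time [sec] | EHC OC | EHC Time [sec] | GM OC | GM Time [sec] | DEM OC | DEM Time [sec] |
|---|---|---|---|---|---|---|---|---|
| Minimum | 368.83 | 32821.18 | 301.34 | 11.96 | 368.83 | 150.66 | 274.97 | 11.82 |
| 1st quartile | 368.83 | 32821.18 | 368.83 | 14 | 368.83 | 150.66 | 368.83 | 19.67 |
| Median | 368.83 | 32821.18 | 368.83 | 18.18 | 368.83 | 150.66 | 368.83 | 22.19 |
| Mean | 368.83 | 32821.18 | 367.16 | 27.4 | 368.83 | 150.66 | 368.55 | 22.71 |
| 3rd quartile | 368.83 | 32821.18 | 368.83 | 32.97 | 368.83 | 150.66 | 368.83 | 25.27 |
| Max | 368.83 | 32821.18 | 368.83 | 174.06 | 368.83 | 150.66 | 368.83 | 45.8 |
| Standard deviation | 0.00 | 0.00 | 5.51 | 18.71 | 0.00 | 0.00 | 5.14 | 4.52 |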

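Mean and median computation time of the methods, relative to the ES method, in percent

| | ES | Grid | EHC median | DEM median | EHC mean | DEM mean |
|---|---|---|---|---|---|---|
| SPXDAX | 100 | 0.35 | 0.08 | 0.09 | 0.12 | 0.12 |
| AAPLMSFT | 100 | 0.46 | 0.06 | 0.07 | 0.08 | 0.07 |
| HGFCBF | 100 | 0.27 | 0.08 | 0.05 | 0.10 | 0.06 |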