Preface
Introduction
1 Market Data
1.1 Tick and bar data
1.2 Corporate actions and adjustment factor
1.3 Linear vs log returns
2 Forecasting
2.1 Data for forecasts
2.1.1 Point-in-time and lookahead
2.1.2 Security master and survival bias
2.1.3 Fundamental and accounting data
2.1.4 Analyst estimates
2.1.5 Supply chain and competition
2.1.6 M&A and risk arbitrage
2.1.7 Event-based predictors
2.1.8 Holdings and flows
2.1.9 News and social media
2.1.10 Macroeconomic data
2.1.11 Alternative data
2.1.12 Alpha capture
2.2 Technical forecasts
2.2.1 Mean reversion
2.2.2 Momentum
2.2.3 Trading volume
2.2.4 Statistical predictors
2.2.5 Data from other asset classes
2.3 Basic concepts of statistical learning
2.3.1 Mutual information and Shannon entropy
2.3.2 Likelihood and Bayesian inference
2.3.3 Mean square error and correlation
2.3.4 Bias-variance tradeoff
2.3.5 PAC learnability, VC dimension, and generalization error bounds
2.4 Machine learning
2.4.1 Types of machine learning
2.4.2 Overfitting
2.4.3 Ordinary and generalized least squares
2.4.4 Deep learning
2.4.5 Types of neural networks
2.4.6 Nonparametric methods
2.4.7 Cross-validation
2.4.8 Curse of dimensionality, eigenvalue cleaning, and shrinkage
2.4.9 Smoothing and regularization
2.4.9.1 Smoothing spline
2.4.9.2 Total variation denoising
2.4.9.3 Nadaraya-Watson kernel smoother
2.4.9.4 Local linear regression
2.4.9.5 Gaussian process
2.4.9.6 Ridge and kernel ridge regression
2.4.9.7 Bandwidth and hypertuning of kernel smoothers
2.4.9.8 Lasso regression
2.4.10 Generalization puzzle of deep and overparameterized learning
2.4.11 Online machine learning
2.4.12 Boosting
2.4.13 Randomized learning
2.4.14 Latent structure
2.4.15 No free lunch and AutoML
2.4.16 Computer power and machine learning
2.5 Dynamical modeling
2.6 Alternative reality
2.7 Timeliness-significance tradeoff
2.8 Grouping
2.9 Conditioning
2.10 Pairwise predictors
2.11 Forecast for securities from their linear combinations
2.12 Forecast research vs simulation
3 Forecast Combining
3.1 Correlation and diversification
3.2 Portfolio combining
3.3 Mean-variance combination of forecasts
3.4 Combining features vs combining forecasts
3.5 Dimensionality reduction
3.5.1 PCA, PCR, CCA, ICA, LCA, and PLS
3.5.2 Clustering
3.5.3 Hierarchical combining
3.6 Synthetic security view
3.7 Collaborative filtering
3.8 Alpha pool management
3.8.1 Forecast development guidelines
3.8.1.1 Point-in-time data
3.8.1.2 Horizon and scaling
3.8.1.3 Type of target return
3.8.1.4 Performance metrics
3.8.1.5 Measure of forecast uncertainty
3.8.1.6 Correlation with existing forecasts
3.8.1.7 Raw feature library
3.8.1.8 Overfit handling
3.8.2 Pnl attribution
3.8.2.1 Marginal attribution
3.8.2.2 Regression-based attribution
4 Risk
4.1 Value at risk and expected shortfall
4.2 Factor models
4.3 Types of risk factors
4.4 Return and risk decomposition
4.5 Weighted PCA
4.6 PCA transformation
4.7 Crowding and liquidation
4.8 Liquidity risk and short squeeze
4.9 Forecast uncertainty and alpha risk
5 Trading Costs
5.1 Slippage
5.2 Impact
5.2.1 Empirical observations
5.2.2 Linear impact model
5.2.3 Impact arbitrage
5.3 Cost of carry
6 Portfolio Construction
6.1 Hedged allocation
6.2 Single-period vs multi-period mean-variance utility
6.3 Single-name multi-period optimization
6.3.1 Optimization with fast impact decay
6.3.2 Optimization with exponentially decaying impact
6.3.3 Optimization conditional on a future position
6.3.4 Position value and utility leak
6.3.5 Optimization with slippage
6.4 Multi-period portfolio optimization
6.4.1 Unconstrained portfolio optimization with linear impact costs
6.4.2 Iterative handling of factor risk
6.4.3 Optimizing future EMA positions
6.4.4 Portfolio optimization using utility leak rate
6.4.5 Notes on portfolio optimization with slippage
6.5 Portfolio capacity
6.6 Portfolio optimization with forecast revision
6.7 Portfolio optimization with forecast uncertainty
6.8 Kelly criterion and optimal leverage
6.9 Intraday optimization and execution
6.9.1 Trade curve
6.9.2 Forecast-timed execution
6.9.3 Algorithmic trading and HFT
6.9.4 HFT controversy
7 Simulation
7.1 Simulation vs production
7.2 Simulation and overfitting
7.3 Research and simulation efficiency
7.4 Paper trading
7.5 Bugs
Afterword: Economic and Social Aspects of Quant Trading
Appendix
A1 Secmaster mappings
A2 Woodbury matrix identities
A3 Toeplitz matrix
Index
Questions index
Quotes index
Stories index

MICHAEL ISICHENKO, PhD, is a theoretical physicist and a quantitative portfolio manager who worked at Kurchatov Institute, University of Texas, University of California, SAC Capital Advisors, Société Générale, and Jefferies. He received his doctorate in physics and mathematics from the Moscow Institute of Physics and Technology and is an expert in plasma physics, nonlinear dynamics, and statistical and chaos theory.