Fusion Strategies for Multi-Class Stock Movement Prediction: Balancing Temporal, Spatial, and Tabular Models
Abstract
Accurate short-horizon stock-movement forecasting remains a central problem in computational finance, where even small directional errors can accumulate into significant trading risk. The most challenging regime is the neutral state: intervals with minor price changes that are easily masked by noise. To address this challenge, we compare three complementary learning paradigms and their combinations across multiple lookback horizons for three representative equities (AAPL, GOOG, TSLA). We evaluate Long Short-Term Memory (LSTM) networks for temporal dynamics, Convolutional Neural Networks (CNNs) on polar-transformed price images for spatial pattern extraction, and XGBoost on tabular technical indicators for structured feature learning.
Empirical results (Appendices A–C) reveal distinct horizon-dependent behaviors: CNNs excel at ultra-short windows (W = 1–3) with perfect accuracy and Neutral-F1 ≈ 1.00 but deteriorate rapidly as horizons lengthen; LSTMs gain overall accuracy with longer windows (W = 30–60) but lose sensitivity to neutral segments; and XGBoost remains the most stable single model, maintaining accuracy ≈ 0.89–0.93, low loss ≈ 0.4–0.6, and Neutral-F1 ≈ 0.89–0.96 across assets.
Building on these complementary patterns, we propose fusion frameworks that integrate CNN and XGBoost outputs through weighted voting, cascaded thresholds, and probability-smoothed blending. The best configuration, probability-smoothed fusion, achieves roughly a 3–4 percentage-point improvement in Neutral-F1 over the strongest standalone model while preserving comparable accuracy and calibration loss. The LSTM is retained solely as a benchmark to illustrate sequence-model trade-offs and is not included in the fusion.
Together, the results demonstrate that combining spatial and tabular perspectives yields more balanced recognition of neutral states without sacrificing directional accuracy. Accuracy measures overall correctness, loss captures probabilistic calibration, and F1 quantifies class-wise precision–recall balance. Viewed jointly, these metrics show that CNN–XGBoost fusion produces smoother and more interpretable predictions across assets and horizons. Such stability can reduce overtrading during ambiguous market phases, improving risk-adjusted decision-making in algorithmic trading strategies.
Introduction
Predicting short-horizon stock direction remains a central challenge in financial modeling and algorithmic trading. Near-term price movements are affected by volatility, microstructure effects, and external shocks, producing highly non-stationary data that obscure clear predictive patterns [7]. Among the three outcome classes—up, down, and neutral—the neutral state is the most difficult to detect because its price changes are small and easily masked by random market noise. Misclassifying neutral periods as directional often results in unnecessary trades, increasing turnover and reducing risk-adjusted returns.
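To make the three-class framing concrete, the following is a minimal sketch of how up, down, and neutral labels could be derived from a price series. The forward-return definition, the symmetric neutral band of 0.2%, the class codes, and the function name `label_movements` are all illustrative assumptions; the labeling rule actually used in the study is not stated here.

```python
from typing import List

# Hypothetical class codes; the ordering is an assumption for illustration.
DOWN, NEUTRAL, UP = 0, 1, 2


def label_movements(prices: List[float], horizon: int = 1,
                    band: float = 0.002) -> List[int]:
    """Label each step by its forward return over `horizon` steps.

    Returns inside [-band, +band] are treated as neutral. The band width
    is an assumed value, not one taken from the study.
    """
    labels = []
    for t in range(len(prices) - horizon):
        forward_return = prices[t + horizon] / prices[t] - 1.0
        if forward_return > band:
            labels.append(UP)
        elif forward_return < -band:
            labels.append(DOWN)
        else:
            labels.append(NEUTRAL)
    return labels


prices = [100.0, 100.1, 101.5, 100.0, 100.05]
print(label_movements(prices))  # [1, 2, 0, 1]
```

With a band this narrow, the two small moves are labeled neutral while the larger swings are labeled directional. Widening the band would enlarge the neutral class, which is one reason the choice of threshold matters for the class balance any classifier sees.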
Conclusion
This study investigated short-horizon stock movement prediction with a specific focus on accurately identifying the neutral class across multiple forecasting intervals. Three complementary modeling approaches were analyzed: Convolutional Neural Networks (CNNs) using polar-transformed price images, Long Short-Term Memory (LSTM) networks for sequence learning, and XGBoost trained on structured financial indicators. Results across AAPL, GOOG, and TSLA (Appendices A–F) show that each method exhibits distinct behavior depending on the prediction window, motivating the proposed hybrid design.
The CNN demonstrated exceptional performance at ultra-short windows (W = 1–3), achieving perfect accuracy and Neutral-F1 scores near 1.00. However, its effectiveness declined sharply as the temporal window expanded due to increasing spatial complexity in the polar representation. The LSTM exhibited the opposite trend: accuracy improved with longer horizons (W = 30–60), yet sensitivity to neutral patterns weakened, biasing its predictions toward the directional classes. XGBoost delivered the most stable and well-balanced performance across all settings, maintaining accuracy between approximately 0.89 and 0.93, loss values near 0.4–0.6, and consistently high Neutral-F1 scores between 0.89 and 0.96. This made XGBoost the most reliable standalone approach.
To leverage complementary strengths, three fusion strategies combining CNN and XGBoost outputs were evaluated: weighted soft voting, cascaded thresholds, and probability-smoothed blending. Among these, probability-smoothed fusion yielded the most robust results, improving Neutral-F1 by approximately 3–4 percentage points relative to the strongest single model while maintaining comparable accuracy and calibration. The LSTM remained part of the study only as a conceptual benchmark to highlight contrast with non-sequential architectures and to help interpret horizon-dependent trade-offs.
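The three fusion strategies can be sketched as follows. This is an illustrative reading rather than the study's implementation: the weight `w_cnn`, the confidence threshold `tau`, the smoothing factor `alpha`, and all function names are hypothetical. Interpreting "probability-smoothed blending" as a weighted blend followed by exponential smoothing over time is an assumption.

```python
from typing import List, Sequence

Probs = List[float]  # class probabilities, assumed order: [down, neutral, up]


def weighted_soft_vote(p_cnn: Sequence[float], p_xgb: Sequence[float],
                       w_cnn: float = 0.4) -> Probs:
    """Convex combination of the two models' class probabilities."""
    return [w_cnn * a + (1.0 - w_cnn) * b for a, b in zip(p_cnn, p_xgb)]


def cascaded_threshold(p_cnn: Sequence[float], p_xgb: Sequence[float],
                       tau: float = 0.8) -> Probs:
    """Use the CNN output when it is confident; otherwise defer to XGBoost."""
    return list(p_cnn) if max(p_cnn) >= tau else list(p_xgb)


def smoothed_blend(cnn_seq: Sequence[Sequence[float]],
                   xgb_seq: Sequence[Sequence[float]],
                   w_cnn: float = 0.4, alpha: float = 0.5) -> List[Probs]:
    """Blend per time step, then exponentially smooth the blended probabilities.

    Assumed interpretation: alpha weights the current blend and (1 - alpha)
    weights the previous smoothed output.
    """
    smoothed: List[Probs] = []
    previous = None
    for p_cnn, p_xgb in zip(cnn_seq, xgb_seq):
        current = weighted_soft_vote(p_cnn, p_xgb, w_cnn)
        if previous is not None:
            current = [alpha * c + (1.0 - alpha) * q
                       for c, q in zip(current, previous)]
        total = sum(current)
        current = [c / total for c in current]  # guard against rounding drift
        smoothed.append(current)
        previous = current
    return smoothed


cnn = [[0.1, 0.8, 0.1], [0.6, 0.2, 0.2]]
xgb = [[0.3, 0.4, 0.3], [0.5, 0.3, 0.2]]
for step in smoothed_blend(cnn, xgb):
    print([round(p, 2) for p in step])
```

In this toy input, both models lean toward "down" at the second step, yet the smoothed output still favors neutral (about 0.38, 0.41, 0.21) because the previous step was strongly neutral. That damping is one way a smoothed fusion could suppress abrupt directional flips, at the cost of reacting more slowly to genuine regime changes.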
Across all experiments, accuracy, loss, and F1 score provided complementary evaluation perspectives. Accuracy measured overall correctness, loss captured probability calibration and confidence alignment, and F1 quantified the precision–recall balance essential for neutral prediction. Together, these metrics show that the CNN–XGBoost fusion approach produces more stable and interpretable forecasts across assets and horizons, particularly for applications where avoiding false directional signals is important.
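As a rough illustration of how these three metrics differ, the sketch below computes them from scratch for a small hand-made example. It assumes that "loss" refers to multi-class cross-entropy (log loss) and that the neutral class has index 1; both are assumptions, as are the function names.

```python
import math
from typing import Sequence

NEUTRAL = 1  # hypothetical class index for the neutral state


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def log_loss(y_true: Sequence[int], probs: Sequence[Sequence[float]],
             eps: float = 1e-12) -> float:
    """Mean negative log-probability assigned to the true class."""
    total = sum(math.log(max(p[t], eps)) for t, p in zip(y_true, probs))
    return -total / len(y_true)


def class_f1(y_true: Sequence[int], y_pred: Sequence[int],
             cls: int = NEUTRAL) -> float:
    """Harmonic mean of precision and recall for a single class."""
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


y_true = [1, 1, 2, 0, 1]
y_pred = [1, 2, 2, 0, 1]
print(accuracy(y_true, y_pred))           # 0.8
print(round(class_f1(y_true, y_pred), 3))  # 0.8
```

Here one neutral interval is misclassified as "up", which lowers neutral recall to 2/3 while neutral precision stays at 1. Accuracy alone would not reveal which class absorbed the error, which is why a class-wise F1 is reported alongside it.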
From an applied standpoint, improved neutral detection can reduce unnecessary trades, thereby lowering transaction costs and improving risk-adjusted returns in algorithmic trading systems. The stability of XGBoost and the incremental yet consistent improvement from fusion methods provide practical insight: combining spatial and tabular representations mitigates overconfidence during volatile regimes, while sequence models remain relevant for broader directional forecasting tasks.
Several limitations warrant further study. The experiments were conducted on three individual equities; extending evaluation to diversified assets, market indices, and higher-frequency intraday data would strengthen generalization. Macroeconomic, sentiment, and options-derived variables were not incorporated and may provide additional predictive value. Finally, fusion weights were static; adaptive, regime-aware ensembles may better capture dynamic market behavior.
Future work may explore:
1) Incorporating external signals such as sentiment, macroeconomic indicators, or options-derived features into the polar-CNN architecture;
2) Testing alternative spatial encodings to preserve structure at longer windows;
3) Developing adaptive or self-tuning fusion systems responsive to volatility shifts; and
4) Extending the framework to portfolio-level or multi-asset forecasting.
Advancing these directions may establish the CNN–XGBoost hybrid as a generalizable and interpretable method for robust financial time-series prediction across diverse market environments.