Geometric Trajectory Forecasting

Dorman, Stephen

Introduction

Prediction in life-course sociology is contentious: the discipline’s primary goal is explanation rather than forecasting. Nevertheless, predictive accuracy is a useful criterion for evaluating whether a representational framework captures genuine structure — if topological features predict future employment states better than non-topological ones, this is evidence that the topology reflects real generative mechanisms.

Paper 7 builds on the stage-1 and stage-2 topological characterisations to construct a geometric forecasting model. The question is: given the topological profile of an individual’s employment history up to time $t$, how accurately can we predict their trajectory from $t$ to $t+12$ months?

Background

Sequence Prediction in Sociology

Conventional employment forecasting uses Markov transition matrices or discrete-event survival models. Both condition on the history only through its most recent state (or a crude summary of it). Geometric trajectory forecasting instead uses the full shape of the history as the predictor.
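To make the contrast concrete, a first-order Markov baseline can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function names and the toy sequences are hypothetical, and only the six state labels are taken from the Methods section.

```python
from collections import Counter, defaultdict

# State labels as listed in the Methods section.
STATES = ["employed", "self-employed", "unemployed",
          "inactive", "in education", "retired"]

def fit_transition_counts(sequences):
    """Count observed transitions between consecutive observations."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    return counts

def transition_probabilities(counts, current):
    """One row of the transition matrix, normalised to sum to 1."""
    row = counts.get(current, Counter())
    total = sum(row.values())
    return {state: n / total for state, n in row.items()} if total else {}

def predict_next(counts, current):
    """Most frequent successor of `current`; if the state was never
    seen as an origin, fall back to predicting no change."""
    row = counts.get(current)
    if not row:
        return current
    return row.most_common(1)[0][0]

# Hypothetical toy histories, for illustration only.
toy_sequences = [
    ["employed", "employed", "unemployed", "employed"],
    ["in education", "employed", "employed", "employed"],
    ["unemployed", "unemployed", "inactive", "inactive"],
]
counts = fit_transition_counts(toy_sequences)
employed_row = transition_probabilities(counts, "employed")
```

The prediction depends only on the current state: two people who are both employed now receive the same forecast, however different their earlier histories were. That is the limitation the geometric features are meant to address.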

Feature Engineering from Topology

The topological programme generates three classes of features for each individual:

MML features (Paper 1): H₀/H₁ persistence summary statistics
Mapper cluster membership (Paper 2): which topological cluster the individual’s trajectory falls in
Zigzag complexity index (Paper 3): the rolling topological complexity profile of their sequence

Methods

For each individual in the Understanding Society panel, features are extracted from their trajectory at each wave. A 12-month forward prediction target is constructed. The training set uses waves 1–10; the holdout uses waves 11–14. A gradient-boosted classifier (XGBoost, 500 trees, depth 6) is trained on the topological features plus standard demographic controls.
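One way to build the forward target is sketched below. It assumes that consecutive waves are roughly 12 months apart, so that the state at wave $w+1$ stands in for the state at $t+12$ months; that assumption, the function name and the toy history are illustrative and not taken from the paper.

```python
def build_forward_pairs(history):
    """Pair the history observed up to each wave with the next wave's state.

    history: dict mapping wave number -> employment state for one person.
    Returns a list of (wave, states_up_to_wave, target_state) tuples.
    Assumption: wave w+1 is treated as the 12-month-ahead observation.
    Waves with no observation at w+1 yield no pair.
    """
    waves = sorted(history)
    pairs = []
    for wave in waves:
        if wave + 1 in history:
            observed = [history[w] for w in waves if w <= wave]
            pairs.append((wave, observed, history[wave + 1]))
    return pairs

# Hypothetical respondent who missed wave 4.
toy_history = {1: "in education", 2: "employed", 3: "employed", 5: "unemployed"}
pairs = build_forward_pairs(toy_history)
```

Only the history up to and including each wave is paired with the later state, so features computed from `states_up_to_wave` cannot see the target. The gap at wave 4 means waves 3 and 5 produce no training pair.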

Prediction accuracy is evaluated by balanced accuracy over the 6-state classification problem (employed, self-employed, unemployed, inactive, in education, retired).
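Balanced accuracy is the unweighted mean of per-class recall, which stops a large class such as "employed" from dominating the score. The sketch below computes it from scratch on hypothetical toy labels; it is meant to illustrate the metric, not to reproduce the paper's evaluation code.

```python
def balanced_accuracy(y_true, y_pred, labels):
    """Unweighted mean of per-class recall.

    Classes that never occur in y_true are skipped, since their
    recall is undefined.
    """
    recalls = []
    for label in labels:
        idx = [i for i, y in enumerate(y_true) if y == label]
        if not idx:
            continue
        correct = sum(1 for i in idx if y_pred[i] == label)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)

def plain_accuracy(y_true, y_pred):
    """Share of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical toy example: a classifier that always predicts "employed".
y_true = ["employed"] * 8 + ["unemployed"] * 2
y_pred = ["employed"] * 10
labels = ["employed", "unemployed"]

acc = plain_accuracy(y_true, y_pred)              # 8 of 10 correct
bal = balanced_accuracy(y_true, y_pred, labels)   # mean of recalls 1.0 and 0.0
```

In the toy example, plain accuracy is 0.8 but balanced accuracy is 0.5, because the minority class is never recovered. With six states, a classifier that always predicts the same state scores 1/6 on this metric.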

Data

Understanding Society waves 1–14. Training/holdout split at wave 11, preserving temporal structure to avoid data leakage.
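A temporal split of this kind can be sketched as follows. The row layout and the choice to assign each row by the wave at which its target is observed are assumptions made for illustration; the source states only that the split falls at wave 11.

```python
TRAIN_WAVES = range(1, 11)     # waves 1-10
HOLDOUT_WAVES = range(11, 15)  # waves 11-14

def temporal_split(rows):
    """Split rows by the wave at which the prediction target is observed.

    rows: list of dicts with a 'target_wave' key (hypothetical layout).
    Assigning by target wave means no training row uses an outcome
    observed at wave 11 or later.
    """
    train = [r for r in rows if r["target_wave"] in TRAIN_WAVES]
    holdout = [r for r in rows if r["target_wave"] in HOLDOUT_WAVES]
    return train, holdout

# Hypothetical rows: one person with a target observed at waves 2..14.
toy_rows = [{"pid": 1, "target_wave": w} for w in range(2, 15)]
train, holdout = temporal_split(toy_rows)
```

Unlike a random split, every training outcome here precedes every holdout outcome, which is what preserving temporal structure requires.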

Results

Prediction Accuracy

The geometric model achieves 79% balanced accuracy on the holdout set, versus 68% for the Markov-1 baseline and 71% for a conventional LSTM. Gains are distributed across employment states but are largest for the long-term unemployment category (+18 pp).

Feature Importance

A SHAP decomposition attributes 41% of combined feature importance to the topological features. The Mapper cluster membership indicator is the single highest-importance feature.
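A group-level share of this kind is commonly obtained by summing mean absolute SHAP values within each feature group and dividing by the total. The source does not state how combined importance was computed, so the sketch below is one plausible reading; the feature names and numbers are hypothetical and are not the paper's values.

```python
def group_importance_share(mean_abs_shap, groups):
    """Share of total mean |SHAP| attributable to each feature group.

    mean_abs_shap: dict feature name -> mean absolute SHAP value.
    groups: dict group name -> list of feature names in that group.
    """
    total = sum(mean_abs_shap.values())
    return {
        group: sum(mean_abs_shap[f] for f in features) / total
        for group, features in groups.items()
    }

# Hypothetical values for illustration only (not results from the paper).
toy_shap = {
    "h1_persistence_mean": 0.20,
    "mapper_cluster": 0.30,
    "zigzag_index": 0.10,
    "age": 0.25,
    "region": 0.15,
}
toy_groups = {
    "topological": ["h1_persistence_mean", "mapper_cluster", "zigzag_index"],
    "demographic": ["age", "region"],
}
shares = group_importance_share(toy_shap, toy_groups)
top_feature = max(toy_shap, key=toy_shap.get)
```

Because each feature belongs to exactly one group, the shares sum to 1, and the single most important feature can be read off the same dictionary.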

Discussion

Geometric trajectory forecasting validates the applied utility of the topological programme. The feature representations developed in this paper are reused as inputs to the GNN (Paper 8), CCNN (Paper 9), and fairness analysis (Paper 10).

Conclusion

Topological trajectory features provide substantial forecasting gains over Markov and sequence-based baselines. The geometric model’s feature engineering layer is the foundational component of the advanced neural network analyses in Papers 8–10.

Computational Requirements

Hardware: GPU
Runtime: Hours
Cloud: Cloud compute required

