
Embedding

Time-delay embedding is the foundation of empirical dynamic modeling (EDM). It transforms a scalar time series into a multidimensional representation that reconstructs the dynamics of the underlying system.

Given a time series x(t), the embedding vector at time t is:

[x(t), x(t - tau), x(t - 2*tau), ..., x(t - (E-1)*tau)]

where:

  • E (embedding dimension): the number of lagged coordinates
  • tau (time delay): the spacing between lags

from edmkit.embedding import lagged_embed

# x is the scalar time series (a 1-D array)
embedded = lagged_embed(x, tau=2, e=3)
# Each row is [x(t), x(t-2), x(t-4)]
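
To make the definition concrete, here is a minimal NumPy sketch of the same construction. It is not edmkit's implementation: the function name delay_embed and its handling of the first (E-1)*tau samples (dropped, because they lack a full set of lags) are assumptions made for illustration.

```python
import numpy as np

def delay_embed(x, tau, E):
    """Hypothetical helper: rows are [x(t), x(t - tau), ..., x(t - (E-1)*tau)]."""
    x = np.asarray(x, dtype=float)
    start = (E - 1) * tau              # first t that has a full set of lags
    if start >= len(x):
        raise ValueError("time series too short for this E and tau")
    t = np.arange(start, len(x))       # valid time indices
    lags = np.arange(E) * tau          # 0, tau, 2*tau, ...
    # Broadcasting builds an index grid of shape (len(t), E)
    return x[t[:, None] - lags[None, :]]

x = np.arange(10.0)                    # toy series 0, 1, ..., 9
emb = delay_embed(x, tau=2, E=3)
print(emb.shape)                       # (6, 3)
print(emb[0])                          # [4. 2. 0.]
```

Note that the embedded series is shorter than the input by (E-1)*tau rows, so larger values of E or tau leave fewer points for prediction.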

The scan function performs a grid search over candidate values of E and tau, evaluating prediction skill using cross-validation:

from edmkit.embedding import scan, select

scores = scan(
    x, None,
    E=list(range(1, 11)),
    tau=[1, 2, 3, 5],
)
best_E, best_tau, best_score = select(scores, E=list(range(1, 11)), tau=[1, 2, 3, 5])

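scan returns a 3D array of shape (len(E), len(tau), K), where K is the maximum number of cross-validation folds. select picks the (E, tau) combination that maximizes mean - SE, the mean score across folds minus its standard error. Penalizing the standard error favors combinations that perform consistently across folds over ones with a high but erratic average.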
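
The mean - SE rule can be sketched in plain NumPy as follows. This is an illustration of the criterion, not edmkit's select: the function name select_mean_minus_se is hypothetical, and two details are assumptions, namely that unused fold slots are padded with NaN and that the standard error uses the sample standard deviation (ddof=1).

```python
import numpy as np

def select_mean_minus_se(scores, E_values, tau_values):
    """Hypothetical helper: pick the (E, tau) cell maximizing mean - SE over folds."""
    scores = np.asarray(scores, dtype=float)
    valid = ~np.isnan(scores)                   # assumption: NaN marks a missing fold
    n = valid.sum(axis=2)                       # folds available per (E, tau)
    with np.errstate(invalid="ignore", divide="ignore"):
        mean = np.where(valid, scores, 0.0).sum(axis=2) / n
        dev = np.where(valid, scores - mean[..., None], 0.0)
        sd = np.sqrt((dev ** 2).sum(axis=2) / (n - 1))   # sample SD (ddof=1)
        criterion = mean - sd / np.sqrt(n)
    # Cells with fewer than two folds have no defined SE; exclude them
    criterion = np.where(np.isnan(criterion), -np.inf, criterion)
    i, j = np.unravel_index(np.argmax(criterion), criterion.shape)
    return E_values[i], tau_values[j], criterion[i, j]

# Toy scores with shape (2 values of E, 2 values of tau, 3 folds)
toy_scores = np.array([
    [[1.0, 0.4, 1.0], [0.6, 0.6, 0.6]],
    [[0.7, 0.65, 0.75], [0.5, 0.5, np.nan]],
])
best_E, best_tau, best = select_mean_minus_se(toy_scores, [1, 2], [1, 2])
print(best_E, best_tau)                         # 2 1
```

In the toy data, the cell (E=1, tau=1) has the highest mean (0.8) but scores erratically across folds, so the steadier cell (E=2, tau=1) with mean 0.7 wins.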

scan accepts pluggable components:

  • split: Cross-validation strategy (default: sliding_folds)
  • predict: Prediction function (default: simplex_projection)
  • metric: Evaluation metric (default: mean_rho, i.e., Pearson correlation)
  • n_ahead: Prediction horizon (default: 1 step)
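
For intuition about what the default components compute, the sketch below implements a textbook form of simplex projection and scores it with Pearson correlation. It is not edmkit's simplex_projection or mean_rho, and it does not show the callable signature that scan expects. The name simplex_sketch, its arguments, and the simple half-and-half split are assumptions made for illustration.

```python
import numpy as np

def simplex_sketch(lib_X, lib_y, query_X):
    """Hypothetical helper: predict each query point from its E+1 nearest library points."""
    lib_X = np.asarray(lib_X, dtype=float)
    lib_y = np.asarray(lib_y, dtype=float)
    query_X = np.asarray(query_X, dtype=float)
    k = lib_X.shape[1] + 1                          # E + 1 neighbors form a simplex
    # Pairwise Euclidean distances, shape (n_query, n_lib)
    dist = np.linalg.norm(query_X[:, None, :] - lib_X[None, :, :], axis=2)
    idx = np.argsort(dist, axis=1)[:, :k]           # indices of the k nearest neighbors
    d = np.take_along_axis(dist, idx, axis=1)
    d_min = np.maximum(d[:, :1], 1e-12)             # guard against zero distance
    w = np.exp(-d / d_min)                          # closer neighbors weigh more
    w /= w.sum(axis=1, keepdims=True)
    return (w * lib_y[idx]).sum(axis=1)             # weighted mean of neighbor targets

# Toy data: a sine wave, embedded with E=2, tau=1, predicting 1 step ahead
t = np.arange(400)
x = np.sin(0.3 * t)
E, tau, n_ahead = 2, 1, 1
times = np.arange((E - 1) * tau, len(x) - n_ahead)
X = np.column_stack([x[times - j * tau] for j in range(E)])
y = x[times + n_ahead]                              # target: value n_ahead steps later

half = len(times) // 2                              # first half is the library
pred = simplex_sketch(X[:half], y[:half], X[half:])
rho = np.corrcoef(pred, y[half:])[0, 1]             # Pearson correlation as the metric
print(round(rho, 3))
```

A cross-validation strategy such as sliding folds would repeat the library/query split over several windows and collect one score per fold, which is what fills the last axis of the array returned by scan.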