Singular Spectrum Analysis (SSA)#

Import required libraries#

We begin this tutorial by importing the required libraries:

import circstudio as cs
import plotly.graph_objs as go
import numpy as np
import os
import plotly.io as pio
pio.templates.default = "plotly_white"
pio.renderers.default = "notebook"

Open a sample file:

# Create file path for sample files within the circStudio package
fpath = os.path.join(os.path.dirname(cs.__file__), 'data')

raw = cs.io.read_awd(
    os.path.join(fpath, 'example_01.AWD'), 
    start_time='1918-01-24 08:00:00',
    period='9 days'
) 

Theoretical background#

Singular Spectrum Analysis (SSA) is a data-driven method for decomposing a time series into interpretable components such as trends, oscillations (e.g., circadian, ultradian rhythms), and noise. Conceptually, SSA plays a role similar to principal component analysis (PCA), but it is specifically designed for time-series data.

Whereas cosinor analysis assumes a predefined sinusoidal structure, SSA does not assume any particular waveform. Instead, it allows rhythmic patterns to emerge directly from the data.

An SSA analysis can be understood in three main steps.

Embedding: constructing the Trajectory Matrix.#

The first step is to transform the one-dimensional time series into a matrix.

We choose a window length L, and then create overlapping segments (sliding windows) of the signal. Each column of the resulting matrix contains one window of length L, shifted by one time step relative to the previous column.

For example:

  • Column 1: y1, y2, …, yL
  • Column 2: y2, y3, …, yL+1
  • Column 3: y3, y4, …, yL+2

This matrix is called the trajectory matrix.

Intuitively, this step reorganizes the signal so that temporal structure (e.g., repeating oscillations) becomes visible as patterns across rows and columns.

In circStudio, you can create a trajectory matrix with by creating an SSA instance:

# Create a SSA instance with L=24h (typicall for human activities)
ssa = cs.SSA(raw.activity, window_length='24h')

Decomposition: Singular Value Decomposition (SVD)#

Once the trajectory is constructed, we apply Singular Value Decomposition (SVD).

For readers less familiar with linear algebra, SVD can be understood as a method that:

  • Finds the dominant patterns in the matrix.

  • Orders them by importance.

  • Separates signal structure from noise.

Mathematically, SVD decomposes the trajectory matrix X into: $$X = U \Sigma V^\tau$$

Conceptually, this decomposition converts the trajectory matrix X in three other matrices:

  • U: caracteristic time patterns.

  • Σ: importance (strength) of each pattern.

  • Vτ: how each pattern evolves across windows.

In the context of analyzing signals mainly with a circadian structure:

  • The largest components often correspond to the circadian rhythm.

  • Secondary components may reflect ultradian rhythms, trends or artifacts.

  • Small components typically capture noise.

In circStudio, the decomposition step is performed using the fit() method:

# Perform Singular Value Decomposition (SVD)
ssa.fit()

Reconstruction: Diagonal Averaging#

After decomposition, we can reconstruct selected components back into a time series.

However, recall that the signal was embedded into overlapping windows. Because of this, each time point appears multiple times in the trajectory matrix.

Specifically:

  • A given time point appears once in each window that contains it.

  • In the matrix, identical time points lie along diagonals (from top-right to bottom-left).

To convert the matrix representation back into a one-dimensional signal, we perform diagonal averaging. Essentially, for each diagonal of the reconstructed matrix, we average the values along that diagonal and assign the average to the corresponding time point. This step ensures that overlapping estimates or the same time point are combined into a single value.

In intuitive terms, embedding spreads the signal across overlapping windows (in order to measure which components explain most of the variance in the signal); diagonal averaging stitches those overlapping pieces back together.

Reconstruction step in circStudio#

After performing decomposition, one can run a quick sanity test by summing the partial variances, which should add up to one:

Partial variances#

# Calculate the sum of partial variances, which should add up to 1
ssa.variance_explained.sum()
np.float64(1.0)

Scree diagram#

Similar to PCA, SSA allows you to visualize the relative contribution of each component using a scree plot. This plot shows how much variance each SSA component explains, helping identify the dominant structures in the signal (e.g., circadian, ultradian, or trend components).

We begin by setting up the plotting layout and then visualize the explained variance:

# Create layout for Plotly figure
layout = go.Layout(
    height=600,
    width=800,
    title="Scree Plot of SSA Component Variance",
    xaxis=dict(
        title="Component Index (k)",
        type='linear',
        showgrid=True,
        gridwidth=1,
        gridcolor='lightgray',
        title_font = {"size": 18}),
    yaxis=dict(title=r'Explained variance (λₖ / Σλ)',
               type='linear',
               showgrid=True,
               gridwidth=1,
               gridcolor='lightgrey'),
    showlegend=False
)
# Create Scree plot
go.Figure(
    data=[go.Scatter(
        x=np.arange(0,len(ssa.variance_explained)+1),
        y=ssa.variance_explained)],
    layout=layout)

The scree plot indicates that a large proportion of the variance is captured by the first two components. The first component explains approximately 30% of the total variance (the trend). The second component accounts for an additional ~7%, after which there is a pronounced drop in explained variance (the “elbow” of the Scree plot).

In circadian data, the first two components often represent a dominant oscillatory mode (typically the circadian rhythm), while subsequent components may reflect secondary dynamics (e.g., ultradian processes) or noise. The long, low-amplitude tail suggests progressively smaller contributions from stochastic variability.

Overall, the spectrum supports the presence of a structured rhythmic signal rather than purely random fluctuations. We can now reconstruct the signal using the dominant components.

Reconstructing one or several components as a time series#

Using the method reconstruct_component(), you can reconstruct one or several grouped SSA components as a time series, using diagonal averaging.

The first component (k=0) corresponds to the trend:

# Construct time series using the first component (k=0)
trend = ssa.reconstruct_component(0)

You can also plot the subsequent components:

# Construct time series using second and third components (k=1,2)
components_12 = ssa.reconstruct_component([1,2])

# Construct time series using the fourth and fifth components (k=3,3)
components_34 = ssa.reconstruct_component([3,4])

Finally, you can visualize what each reconstructed component looks like:

# Define layout of the plot for the time series generated by the components
layout = go.Layout(
    height=600,
    width=1000,
    title="Reconstructed time series",
    xaxis=dict(title='Date Time'),
    yaxis=dict(title='Activity'),
    shapes=[],
    showlegend=True
)
# Construct time series generated by components
go.Figure(data=[
    go.Scatter(
        x=raw.activity.index,
        y=raw.activity,
        name='Activity'
    ),
    go.Scatter(
        x=raw.activity.index,
        y=trend,
        name='Trend'
    ),
    go.Scatter(
        x=raw.activity.index,
        y=trend+components_12,
        name='Circadian component'
    ),
    go.Scatter(
        x=raw.activity.index,
        y=trend+components_34,
        name='Ultradian component'
    )
], layout=layout)

Reconstructing actigraphy signal by summing several SSA components#

It is also possible to reconstruct the actigraphy signal by summing a subset of SSA components (e.g., circadian and ultradian components). The reconstruct_signal() method reconstructs multiple SSA components (each as a time series via diagonal averaging) and then adds them together to produce a combined reconstructed signal:

reconstructed_signal = ssa.reconstruct_signal([0,1,2,3,4,5,6])
# Plot reconstructed actigraphy signal based on the first seven components
go.Figure(data=[
    go.Scatter(
        x=raw.activity.index,
        y=raw.activity,
        name='Activity'
    ),
    go.Scatter(
        x=raw.activity.index,
        y=reconstructed_signal, 
        name='Reconstructed signal'
    )
], layout=go.Layout(
    height=600,
    width=1000,
    showlegend=True)
         )

Next steps#

For a more detailed mathematical treatment of SSA, we encourage readers to consult the original pyActigraphy tutorial here.

In the next tutorial, we will introduce (Multifractal) Detrended fluctuation analysis (MF-DFA), a complementary approach for characterizing scale-invariant and long-range correlation properties in actigraphy time series.