Opening and processing actigraphy data
Welcome to circStudio! This introduction provides an overview of how to load actigraphy data using the Raw class and demonstrates how adaptor subclasses can be used to convert common actigraphy file formats into Raw instances.
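We begin the tutorial by importing circStudio, NumPy, pandas, os, and Plotly's I/O module, and by setting the default Plotly renderer so that figures display inside the notebook: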
import circstudio as cs
import numpy as np
import pandas as pd
import os
import plotly.io as pio
pio.renderers.default = "notebook"
Reading actigraphy files using the Raw class directly
The preferred method for loading actigraphy data in circStudio is to create a new instance of the Raw class. This requires the actigraphy data to be preformatted as a table containing an activity and light series.
Unlike pyActigraphy, circStudio decouples the computation of actigraphy metrics from the Raw class, which is used exclusively for preprocessing the actigraphy signal, allowing users greater flexibility. For instance, users may define custom preprocessing pipelines or apply circStudio metrics to other types of time series data, such as skin temperature.
To illustrate the creation of a Raw object, we build a pd.DataFrame filled with random data sampled from normal distributions. We start by initializing NumPy's default random number generator (RNG):
rng = np.random.default_rng()
Next, we generate synthetic actigraphy data by creating random time series for activity and light. These values are indexed using a DatetimeIndex to simulate a seven-day recording period with a sampling frequency of sixty seconds. This step is intended for demonstration purposes and can be skipped if the user already has actigraphy data stored in a pandas.DataFrame:
# Define the number of samples for seven days of data at a 60s interval
n = 1440*7 # 1440 minutes/day x 7 days
# Generate a random time series to simulate activity counts
activity = rng.normal(loc=10, scale=1, size=n)
# Generate a random time series to simulate light exposure (in lux)
light = rng.normal(loc=100, scale=10, size=n)
# Create a datetime index starting on January 1st, 2025, with 60s intervals
index = pd.date_range(start='01-01-2025', freq='60s', periods=n)
# Store the synthetic data into a pd.DataFrame
data = pd.DataFrame(
    index=index,
    data={
        'Activity': activity,
        'Light': light,
    },
)
# Display the resulting DataFrame
data
| | Activity | Light |
|---|---|---|
| 2025-01-01 00:00:00 | 9.972924 | 101.290872 |
| 2025-01-01 00:01:00 | 11.201004 | 106.171440 |
| 2025-01-01 00:02:00 | 9.341955 | 90.995576 |
| 2025-01-01 00:03:00 | 9.973207 | 87.974919 |
| 2025-01-01 00:04:00 | 9.714965 | 98.316874 |
| ... | ... | ... |
| 2025-01-07 23:55:00 | 11.685443 | 98.549054 |
| 2025-01-07 23:56:00 | 11.018233 | 103.698525 |
| 2025-01-07 23:57:00 | 10.162152 | 93.505352 |
| 2025-01-07 23:58:00 | 9.488018 | 106.932148 |
| 2025-01-07 23:59:00 | 11.765320 | 97.512196 |
10080 rows × 2 columns
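Before handing a table like this to Raw, it can be useful to confirm that it has the expected shape. The following is a minimal sketch using only pandas and NumPy; it rebuilds the synthetic recording (with a seed, added here only for reproducibility) and checks that both columns exist, that the timestamps are evenly spaced, and that no values are missing. The variable names `has_columns`, `is_regular`, and `n_missing` are illustrative and not part of circStudio.

```python
import numpy as np
import pandas as pd

# Rebuild the synthetic recording (the seed is an addition for reproducibility)
rng = np.random.default_rng(seed=0)
n = 1440 * 7
index = pd.date_range(start="2025-01-01", freq="60s", periods=n)
data = pd.DataFrame(
    {
        "Activity": rng.normal(loc=10, scale=1, size=n),
        "Light": rng.normal(loc=100, scale=10, size=n),
    },
    index=index,
)

# Both series must be present as columns
has_columns = {"Activity", "Light"}.issubset(data.columns)

# A regularly sampled index has exactly one distinct step between timestamps
is_regular = data.index.to_series().diff().dropna().nunique() == 1

# Total number of missing values across all columns
n_missing = int(data.isna().sum().sum())

print(has_columns, is_regular, n_missing)
```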
Finally, we use the Raw class from circStudio.io to create a new Raw instance from the synthetic data. We specify the DataFrame (df), the activity (activity) and light (light) time series, the start time (start_time), the total duration (period), and the sampling frequency (frequency):
# Create a new Raw instance
raw = cs.io.Raw(
    df=data,                                  # pd.DataFrame
    activity=data['Activity'],                # Activity time series
    light=data['Light'],                      # Light time series
    start_time=data.index[0],                 # Start time
    period=(data.index[-1] - data.index[0]),  # Total duration
    frequency=data.index.freq                 # Sampling frequency
)
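To make the time-related arguments concrete, the sketch below evaluates the same expressions on the synthetic index using pandas alone. Note that the difference between the last and first timestamps is one sampling interval short of seven full days; whether Raw treats the period as inclusive of the final sample is not covered here, so this is only an illustration of what the expressions return.

```python
import pandas as pd

# Same index as in the synthetic example: seven days at 60 s intervals
index = pd.date_range(start="2025-01-01", freq="60s", periods=1440 * 7)

start_time = index[0]          # first timestamp
period = index[-1] - index[0]  # span from first to last sample
step = index[1] - index[0]     # sampling interval as a Timedelta

print(start_time)  # 2025-01-01 00:00:00
print(period)      # 6 days 23:59:00
print(step)        # 0 days 00:01:00
```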
Reading actigraphy files using adaptor subclasses
To simplify the data import process, circStudio includes several adaptor subclasses for commonly used actigraphy file formats. These adaptors enable users to easily convert supported file types into Raw instances.
The Raw class is designed to be format-agnostic and flexible. Users can implement custom functions to convert data from actigraphy file formats for which circStudio has no native adaptor, as long as the resulting data is compatible with the structure expected by the Raw class. As seen in the previous section, a Raw object requires a DataFrame containing all the data (df), the activity (activity) and light (light) time series, the start time (start_time), the total duration (period), and the sampling frequency (frequency).
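As a rough sketch of such a custom converter, the function below parses a CSV source with pandas and assembles the six arguments listed above. The function name `build_raw_kwargs`, the CSV layout, and the sample values are all hypothetical; only the argument names come from this tutorial.

```python
import io

import pandas as pd
from pandas.tseries.frequencies import to_offset

# Hypothetical export from an unsupported device (made-up layout and values)
CSV_TEXT = """timestamp,counts,lux
2025-01-01 00:00:00,12,100.5
2025-01-01 00:01:00,0,98.2
2025-01-01 00:02:00,7,101.9
"""


def build_raw_kwargs(source, activity_col, light_col, time_col):
    """Parse a CSV source and assemble the keyword arguments described for Raw."""
    df = pd.read_csv(source, parse_dates=[time_col], index_col=time_col)
    return {
        "df": df,
        "activity": df[activity_col],
        "light": df[light_col],
        "start_time": df.index[0],
        "period": df.index[-1] - df.index[0],
        # Infer the sampling step from the timestamps and convert it to an offset
        "frequency": to_offset(pd.infer_freq(df.index)),
    }


kwargs = build_raw_kwargs(io.StringIO(CSV_TEXT), "counts", "lux", "timestamp")
print(kwargs["period"])  # 0 days 00:02:00

# With circStudio installed, the result could then be passed on as in the
# previous section: raw = cs.io.Raw(**kwargs)
```

Returning a dictionary keeps the parsing step independent of circStudio, so it can be checked on its own before a Raw instance is created.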
In the following example, we open a .txt file generated by an ActTrust (Condor Instruments) actigraphy device. To access example files included with circStudio, we construct a file path using os.path:
fpath = os.path.join(os.path.dirname(cs.__file__))
This retrieves the directory where circStudio is installed by referencing its __file__ attribute, which stores the path from which circStudio was imported. os.path.dirname extracts the directory name from that path.
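The same pattern works for any imported package. The sketch below applies it to the standard-library json package, chosen only so that the example runs without circStudio; the file name `example.txt` is hypothetical and is not expected to exist.

```python
import json
import os

# __file__ holds the path the package was imported from
pkg_dir = os.path.dirname(json.__file__)

# Build a path to a (hypothetical) data file inside the package directory
sample = os.path.join(pkg_dir, "data", "example.txt")

print(pkg_dir)
print(sample)
```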
Next, we create a new Raw instance using the auxiliary function read_atr, which is located in circStudio.io:
raw = cs.io.read_atr(os.path.join(fpath, 'data', 'test_sample_atr.txt'))
Some ActTrust files contain an additional line, such as #ActLogModel=2.0.0, which should be excluded when importing data. The optional skip_rows parameter allows users to skip lines during import:
raw = cs.io.read_atr(os.path.join(fpath, 'data', 'test_sample_atr.txt'), skip_rows=1)
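The effect of skipping a leading line can be illustrated with plain pandas. This sketch does not show how read_atr is implemented, and the semicolon-separated layout below is an assumption made for the example; only the idea of discarding the first line before parsing carries over.

```python
import io

import pandas as pd

# Hypothetical file content: one extra line ahead of the column header
TEXT = """#ActLogModel=2.0.0
DATE/TIME;PIM;LIGHT
01/01/1918 09:00:00;1851;384.44
01/01/1918 09:01:00;2683;324.20
"""

# skiprows=1 drops the first line so the second line is read as the header
table = pd.read_csv(io.StringIO(TEXT), sep=";", skiprows=1)
print(list(table.columns))  # ['DATE/TIME', 'PIM', 'LIGHT']
```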
Each adaptor subclass can be accessed using a helper function with the format read_XXX, where XXX indicates the file format. Besides read_atr, the available adaptor functions are:
# Actiwatch
raw = cs.io.read_awd(os.path.join(fpath, 'data', 'example_01.AWD'))
# Actigraph
raw = cs.io.read_agd(os.path.join(fpath, 'data', 'test_sample.agd'))
# Daqtometer
raw = cs.io.read_dqt(os.path.join(fpath, 'data', 'test_sample_dqt.csv'))
# Multi-Ethnic Study of Atherosclerosis (MESA)
raw = cs.io.read_mesa(os.path.join(fpath, 'data', 'test_sample_mesa.csv'))
# Respironics
raw = cs.io.read_rpx(os.path.join(fpath, 'data', 'test_sample_rpx_eng.csv'))
# Tempatilumi
raw = cs.io.read_tal(os.path.join(fpath, 'data', 'test_sample_tal.txt'))
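Because the helper functions share the read_XXX naming pattern, a reader can be looked up by format name. The helper `get_reader` below is hypothetical and not part of circStudio; it is exercised against a stand-in object so that the sketch runs without the package installed.

```python
from types import SimpleNamespace


def get_reader(io_module, fmt):
    """Return the read_<fmt> function from io_module, or raise if it is missing."""
    name = f"read_{fmt.lower()}"
    reader = getattr(io_module, name, None)
    if reader is None:
        raise ValueError(f"No reader named {name!r} found")
    return reader


# Stand-in object used only to exercise the helper without circStudio
fake_io = SimpleNamespace(read_atr=lambda path: f"parsed {path}")

print(get_reader(fake_io, "ATR")("file.txt"))  # parsed file.txt
```

With circStudio imported as cs, the same helper could presumably be called as `get_reader(cs.io, "awd")(path)`.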
Retrieving information from Raw instances
After converting an original actigraphy file into a Raw object, users may access the underlying data and respective metadata. Below are some examples; for a complete description of available attributes and methods, please refer to the API.
Activity (pd.Series):
raw.activity
DATE/TIME
1918-01-01 09:00:00 1851
1918-01-01 09:01:00 2683
1918-01-01 09:02:00 5260
1918-01-01 09:03:00 8
1918-01-01 09:04:00 4
...
1918-01-05 08:55:00 0
1918-01-05 08:56:00 1271
1918-01-05 08:57:00 543
1918-01-05 08:58:00 837
1918-01-05 08:59:00 4817
Freq: min, Name: PIM, Length: 5760, dtype: int64
Interactive activity plot:
raw.plot(mode='activity', log=False)
Light (pd.Series):
raw.light
DATE/TIME
1918-01-01 09:00:00 384.44
1918-01-01 09:01:00 324.20
1918-01-01 09:02:00 320.28
1918-01-01 09:03:00 312.00
1918-01-01 09:04:00 309.69
...
1918-01-05 08:55:00 0.01
1918-01-05 08:56:00 0.01
1918-01-05 08:57:00 0.02
1918-01-05 08:58:00 0.02
1918-01-05 08:59:00 0.03
Freq: min, Name: LIGHT, Length: 5760, dtype: float64
# Interactive light plot with log=True
raw.plot(mode='light', log=True)
Plot of the temperature signal (extracted from raw.df):
raw.plot(ts='TEMPERATURE', log=True)
First five rows of the pd.DataFrame:
raw.df.head()
| DATE/TIME | MS | EVENT | TEMPERATURE | EXT TEMPERATURE | ORIENTATION | PIM | PIMn | TAT | TATn | ZCM | ZCMn | LIGHT | AMB LIGHT | RED LIGHT | GREEN LIGHT | BLUE LIGHT | IR LIGHT | UVA LIGHT | UVB LIGHT | STATE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1918-01-01 09:00:00 | 0 | 0 | 29.38 | 28.19 | 0 | 1851 | 30.850000 | 109 | 1.816670 | 41 | 0.683333 | 384.44 | 155.78 | 51.36 | 73.89 | 37.99 | 13.36 | 0.0 | 0.0 | 0 |
| 1918-01-01 09:01:00 | 0 | 0 | 29.47 | 28.25 | 0 | 2683 | 44.716700 | 134 | 2.233330 | 64 | 1.066670 | 324.20 | 131.37 | 43.02 | 62.64 | 31.88 | 11.46 | 0.0 | 0.0 | 0 |
| 1918-01-01 09:02:00 | 0 | 0 | 29.56 | 28.38 | 0 | 5260 | 87.666700 | 199 | 3.316670 | 171 | 2.850000 | 320.28 | 129.78 | 41.71 | 62.46 | 31.69 | 11.69 | 0.0 | 0.0 | 0 |
| 1918-01-01 09:03:00 | 0 | 0 | 29.75 | 28.56 | 0 | 8 | 0.133333 | 1 | 0.016667 | 0 | 0.000000 | 312.00 | 126.43 | 41.20 | 60.67 | 30.30 | 11.16 | 0.0 | 0.0 | 0 |
| 1918-01-01 09:04:00 | 0 | 0 | 29.98 | 28.75 | 0 | 4 | 0.066667 | 0 | 0.000000 | 0 | 0.000000 | 309.69 | 125.49 | 40.63 | 60.21 | 30.14 | 11.19 | 0.0 | 0.0 | 0 |
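Since raw.df is a regular pandas DataFrame, the usual selection tools apply to it. The sketch below uses a small stand-in frame holding the first three rows of three columns copied from the table above, so that it runs without the example file; with a real Raw instance, `df` would simply be `raw.df`.

```python
import pandas as pd

# Stand-in for raw.df: first three rows of three columns from the table above
index = pd.date_range("1918-01-01 09:00:00", periods=3, freq="60s", name="DATE/TIME")
df = pd.DataFrame(
    {
        "TEMPERATURE": [29.38, 29.47, 29.56],
        "PIM": [1851, 2683, 5260],
        "LIGHT": [384.44, 324.20, 320.28],
    },
    index=index,
)

# Select a single channel as a pd.Series
temperature = df["TEMPERATURE"]

# Select several channels from a given timestamp onwards
subset = df.loc["1918-01-01 09:01:00":, ["PIM", "LIGHT"]]

print(temperature.max())  # 29.56
print(subset.shape)       # (2, 2)
```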
Acquisition frequency:
raw.frequency
Timedelta('0 days 00:01:00')
Duration of the data acquisition period:
raw.duration()
Timedelta('4 days 00:00:00')
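These two values are consistent with the length of the series shown earlier: dividing the duration by the sampling interval gives the number of samples. A short check with pandas, using the values printed above:

```python
import pandas as pd

duration = pd.Timedelta("4 days 00:00:00")   # value returned by raw.duration()
frequency = pd.Timedelta("0 days 00:01:00")  # value of raw.frequency

# Dividing two Timedelta objects yields a plain number
n_samples = duration / frequency
print(n_samples)  # 5760.0
```

The result matches the `Length: 5760` reported for raw.activity and raw.light.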
Next Steps
In the next section, we will explore data masking and resampling using methods provided by the Raw class.