Loading and Exploring the CBCIC Dataset Using bciflow
The bciflow library provides convenient tools for working with EEG datasets for Brain-Computer Interface (BCI) research. In this tutorial, we will focus on loading and exploring the CBCIC dataset using bciflow.
Objectives of this Tutorial
Learn how to load EEG data from the CBCIC dataset using bciflow
Understand the structure of the dataset
Print and interpret key dataset components such as EEG signals, labels, and metadata
1. Installation
pip install bciflow
Note
Ensure you are using Python 3.7 or higher.
2. Loading the Dataset
from bciflow.datasets.CBCIC import cbcic
dataset = cbcic(subject=1, path='data/cbcic/')
Note
This command loads the dataset for subject 1 and stores it in a dictionary called dataset.
Ensure the dataset is available at data/cbcic/ or adjust the path accordingly.
3. Exploring the Dataset Contents
Let’s explore what’s inside this dataset. We will print different keys of the dictionary to understand the data structure.
3.1 EEG Signals: dataset["X"]
print(dataset["X"])
This prints the EEG signals organized as a 4D array:
trials: how many repetitions (epochs) of the task were recorded
frequency_bands: for each trial, the signals are filtered in different frequency bands (if applicable)
channels: each electrode in the EEG cap used
time_samples: the EEG signal over time (in samples)
Example shape: (120, 1, 12, 4096) → 120 trials, 1 frequency band, 12 electrodes, 4096 time samples.
At a sampling frequency of 512 Hz, 4096 samples correspond to 8 seconds of signal per trial (4096 / 512 = 8).
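The relationship between shape and trial duration can be checked with a few lines of NumPy. This is a minimal sketch that uses a synthetic array in place of dataset["X"], assuming the example shape and the 512 Hz sampling frequency quoted above; with the real data you would read both values from the loaded dictionary.

```python
import numpy as np

# Synthetic stand-in for dataset["X"] (assumption: same shape as the example above).
X = np.zeros((120, 1, 12, 4096))
sfreq = 512.0  # assumed sampling frequency in Hz, as in dataset["sfreq"]

# Unpack the four dimensions in the order described in the text.
n_trials, n_bands, n_channels, n_samples = X.shape

# Trial duration in seconds = number of samples / samples per second.
duration_s = n_samples / sfreq

print(f"{n_trials} trials, {n_bands} band(s), {n_channels} channels, {n_samples} samples")
print(f"Trial duration: {duration_s} s")
```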
3.2 Labels per Trial: dataset["y"]
print(dataset["y"])
[0, 0, 0, ..., 1, 1, 1]
3.3 Class Meaning: dataset["y_dict"]
print(dataset["y_dict"])
{'left-hand': 0, 'right-hand': 1}
3.4 Events: dataset["events"]
print(dataset["events"])
{'get_start': [0, 3],
'beep_sound': [2],
'cue': [3, 8],
'task_exec': [3, 8]}
This tells us when each event happened (in seconds) during data collection. It is useful for segmenting the signals around specific events.
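Segmenting around an event amounts to converting its times into sample indices and slicing the last axis of the signal array. The sketch below uses synthetic data and a hypothetical helper, crop_to_event, which is not part of bciflow. It assumes that a two-element entry such as [3, 8] means a start time and an end time in seconds, and that the sampling frequency is 512 Hz with a start time of 0.0.

```python
import numpy as np

# Synthetic stand-ins for the dictionary entries (assumptions, not real recordings).
X = np.zeros((120, 1, 12, 4096))
events = {"get_start": [0, 3], "beep_sound": [2], "cue": [3, 8], "task_exec": [3, 8]}
sfreq = 512.0
tmin = 0.0


def crop_to_event(X, events, name, sfreq, tmin=0.0):
    """Hypothetical helper: keep only the samples between an event's start and end.

    Assumes events[name] holds [start_seconds, end_seconds].
    """
    start_s, end_s = events[name]
    # Convert seconds (relative to tmin) into sample indices.
    start = int(round((start_s - tmin) * sfreq))
    stop = int(round((end_s - tmin) * sfreq))
    # Slice the time axis, which is the last dimension of X.
    return X[..., start:stop]


X_task = crop_to_event(X, events, "task_exec", sfreq, tmin)
print(X_task.shape)  # (120, 1, 12, 2560): 5 seconds at 512 Hz
```

Single-element entries such as 'beep_sound' mark one instant rather than an interval, so this helper would need a separate window length to handle them.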
3.5 Channel Names: dataset["ch_names"]
print(dataset["ch_names"])
['F3', 'FC3', 'C3', 'CP3', 'P3', 'FCz', 'CPz', 'F4', 'FC4', 'C4', 'CP4', 'P4']
3.6 Sampling Frequency: dataset["sfreq"]
print(dataset["sfreq"])
Returns the sampling frequency in Hz (e.g., 512.0). This tells us how many samples per second were recorded.
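The sampling frequency also lets you build a time axis for plotting or indexing, since consecutive samples are 1 / sfreq seconds apart. The following is a small sketch with assumed values (512 Hz, 4096 samples, and a start time of 0.0) rather than values read from the real dataset.

```python
import numpy as np

# Assumed values; with real data, take them from dataset["sfreq"],
# dataset["tmin"], and the last dimension of dataset["X"].
sfreq = 512.0
tmin = 0.0
n_samples = 4096

# Time (in seconds) of every sample, starting at tmin.
times = tmin + np.arange(n_samples) / sfreq

print("First sample at", times[0], "s")
print("Spacing between samples:", times[1] - times[0], "s")
print("Last sample at", times[-1], "s")
```

Note that the last sample falls just before the 8-second mark, because the first sample sits at time 0.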
3.7 Start Time: dataset["tmin"]
print(dataset["tmin"])
Returns the start time in seconds (e.g., 0.0).
4. Dataset Structure Summary
| Key | Description | Example |
|---|---|---|
| X | EEG data (trials × bands × channels × time) | shape (120, 1, 12, 4096) |
| y | Labels for each trial | [0, 0, 0, …] |
| y_dict | Class mapping | {'left-hand': 0, 'right-hand': 1} |
| events | Event timestamps | {'get_start': […]} |
| ch_names | Channel names | ['F3', 'FC3', 'C3', …] |
| sfreq | Sampling frequency (Hz) | 512.0 |
| tmin | Start time (seconds) | 0.0 |
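Because the entries in the table describe the same recording, their sizes should agree with one another. The sketch below shows one way to check this with a hypothetical helper, check_dataset, which is not provided by bciflow. It runs on a synthetic dictionary with placeholder channel names and an assumed 60/60 label split.

```python
import numpy as np

EXPECTED_KEYS = ["X", "y", "y_dict", "events", "ch_names", "sfreq", "tmin"]


def check_dataset(dataset):
    """Hypothetical helper: return a list of inconsistencies (empty if none)."""
    missing = [key for key in EXPECTED_KEYS if key not in dataset]
    if missing:
        return [f"missing keys: {missing}"]

    X = dataset["X"]
    if X.ndim != 4:
        return [f"X has {X.ndim} dimensions, expected 4"]

    problems = []
    n_trials, _, n_channels, _ = X.shape
    if len(dataset["y"]) != n_trials:
        problems.append("number of labels differs from number of trials")
    if len(dataset["ch_names"]) != n_channels:
        problems.append("number of channel names differs from number of channels")
    known_labels = set(dataset["y_dict"].values())
    if not set(np.unique(dataset["y"]).tolist()) <= known_labels:
        problems.append("y contains labels that are not in y_dict")
    return problems


# Synthetic dictionary mimicking the structure in the table (values are assumptions).
example = {
    "X": np.zeros((120, 1, 12, 4096)),
    "y": np.array([0] * 60 + [1] * 60),
    "y_dict": {"left-hand": 0, "right-hand": 1},
    "events": {"get_start": [0, 3], "beep_sound": [2], "cue": [3, 8], "task_exec": [3, 8]},
    "ch_names": [f"ch{i}" for i in range(12)],  # placeholder names
    "sfreq": 512.0,
    "tmin": 0.0,
}

print(check_dataset(example))  # [] when everything is consistent
```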
5. Complete Example Code
from bciflow.datasets.CBCIC import cbcic
dataset = cbcic(subject=1, path='data/cbcic/')
print("EEG signals shape:", dataset["X"].shape)
print("Labels:", dataset["y"])
print("Class dictionary:", dataset["y_dict"])
print("Events:", dataset["events"])
print("Channel names:", dataset["ch_names"])
print("Sampling frequency (Hz):", dataset["sfreq"])
print("Start time (s):", dataset["tmin"])
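As a possible next step, the labels and the class dictionary can be combined to count how many trials belong to each class. This sketch uses synthetic labels with an assumed 60/60 split; the real proportions depend on the subject you load.

```python
import numpy as np

# Synthetic stand-ins for dataset["y"] and dataset["y_dict"] (the split is an assumption).
y = np.array([0] * 60 + [1] * 60)
y_dict = {"left-hand": 0, "right-hand": 1}

# Invert the mapping so that numeric codes can be translated back to class names.
code_to_name = {code: name for name, code in y_dict.items()}

# Count how often each numeric label occurs.
codes, counts = np.unique(y, return_counts=True)
trials_per_class = {code_to_name[int(c)]: int(n) for c, n in zip(codes, counts)}

print(trials_per_class)
```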