Introduction to Data Processing using bciflow

The bciflow library is designed for developing Brain-Computer Interface (BCI) systems in Python. It provides modular tools for data loading, preprocessing, feature extraction, feature selection, and classification of EEG signals.

In this tutorial, you’ll learn how to use bciflow to build a complete EEG analysis pipeline by implementing FBCSP (Filter Bank Common Spatial Pattern), a well-known BCI algorithm that combines a filterbank, CSP, log-power features, MIBIF feature selection, and an LDA classifier.

Objectives of this Tutorial

  • Introduce the main functionalities of bciflow

  • Demonstrate how to load the CBCIC dataset

  • Apply the pre-processing and post-processing stages of the pipeline correctly

  • Ensure the correct execution of the created pipeline

  • Report the accuracy of the results

Prerequisites

  • Basic Python knowledge

  • Familiarity with EEG and BCI concepts is helpful, but not required

1. Installation

Install bciflow using pip:

pip install bciflow

Note

Ensure you are using Python 3.7 or higher.

2. Loading Data

This tutorial uses the CBCIC dataset (Clinical Brain-Computer Interface Challenge). Load the data for subject 1:

from bciflow.datasets.CBCIC import cbcic

dataset = cbcic(subject=1, path='data/cbcic/')

Note

Ensure the dataset is available at data/cbcic/ or adjust the path accordingly.

3. Preprocessing: Applying a Filterbank

FBCSP starts by passing the data through a filterbank, which applies several bandpass filters so that patterns in different frequency bands can be captured separately. Here the bandpass filters are of the Chebyshev type II kind:

from bciflow.modules.tf.filterbank import filterbank

pre_folding = {'tf': (filterbank, {'kind_bp': 'chebyshevII'})}
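
For intuition about what a filterbank stage does, the following is a minimal sketch built on SciPy rather than bciflow's own implementation. The band edges (4 to 40 Hz in 4 Hz steps), the 512 Hz sampling rate, the filter order, the stop-band attenuation, and the helper name apply_filterbank are all assumptions chosen for illustration; bciflow's defaults may differ.

```python
import numpy as np
from scipy.signal import cheby2, sosfiltfilt


def apply_filterbank(X, sfreq, bands, order=4, rs=40.0):
    """Band-pass X (trials, channels, samples) once per band.

    Returns an array of shape (trials, bands, channels, samples).
    Hypothetical helper for illustration, not part of bciflow.
    """
    filtered = []
    for low, high in bands:
        # Chebyshev type II design: rs is the minimum stop-band attenuation
        # in dB, and (low, high) are the stop-band edge frequencies in Hz
        sos = cheby2(order, rs, [low, high], btype='bandpass',
                     fs=sfreq, output='sos')
        # Zero-phase (forward-backward) filtering along the time axis
        filtered.append(sosfiltfilt(sos, X, axis=-1))
    return np.stack(filtered, axis=1)


sfreq = 512.0  # assumed sampling rate in Hz
bands = [(low, low + 4) for low in range(4, 40, 4)]  # 4-8, 8-12, ..., 36-40 Hz

rng = np.random.default_rng(0)
X = rng.standard_normal((10, 12, 1024))  # synthetic stand-in for EEG trials
X_bank = apply_filterbank(X, sfreq, bands)
print(X_bank.shape)  # (10, 9, 12, 1024)
```

Each trial gains a new "band" axis, so later stages can be fitted separately on every frequency band.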

4. Building the Post-processing Pipeline

Next, define the remaining stages of the algorithm, in order:

  1. sf: Common Spatial Patterns (CSP) - finds spatial filters that maximize the variance difference between classes

  2. fe: logpower - extracts logarithmic power of filtered signals

  3. fs: MIBIF - selects the 8 best features based on mutual information

  4. clf: LDA classifier - classifies data

from bciflow.modules.sf.csp import csp
from bciflow.modules.fe.logpower import logpower
from bciflow.modules.fs.mibif import MIBIF
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as lda

sf = csp()
fe = logpower
fs = MIBIF(8, clf=lda())
clf = lda()

pos_folding = {
    'sf': (sf, {}),
    'fe': (fe, {}),
    'fs': (fs, {}),
    'clf': (clf, {})
}
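
To make the sf and fe stages less abstract, here is a rough NumPy/SciPy sketch of two-class CSP followed by log-power extraction on synthetic data. It is not bciflow's code: the function names fit_csp and log_power, the trace normalization, and the number of filter pairs are assumptions for illustration.

```python
import numpy as np
from scipy.linalg import eigh


def fit_csp(X, y, n_pairs=1):
    """Return two-class CSP filters, shape (2 * n_pairs, channels).

    X has shape (trials, channels, samples); y holds exactly two labels.
    Hypothetical helper for illustration, not part of bciflow.
    """
    covs = []
    for label in np.unique(y):
        # Trace-normalized spatial covariance, averaged over the class trials
        trial_covs = [t @ t.T / np.trace(t @ t.T) for t in X[y == label]]
        covs.append(np.mean(trial_covs, axis=0))
    # Generalized eigenproblem C0 w = lambda (C0 + C1) w, eigenvalues ascending
    _, W = eigh(covs[0], covs[0] + covs[1])
    n_channels = W.shape[1]
    # Filters at both ends of the spectrum separate the classes best
    keep = np.r_[np.arange(n_pairs), np.arange(n_channels - n_pairs, n_channels)]
    return W[:, keep].T


def log_power(X):
    """Log of the mean squared amplitude along the time axis."""
    return np.log(np.mean(X ** 2, axis=-1))


# Synthetic data: each class has extra variance on a different channel
rng = np.random.default_rng(0)
X = rng.standard_normal((40, 4, 256))
y = np.repeat([0, 1], 20)
X[y == 0, 0, :] *= 3.0
X[y == 1, 1, :] *= 3.0

filters = fit_csp(X, y, n_pairs=1)
projected = np.einsum('fc,tcs->tfs', filters, X)  # apply filters to every trial
features = log_power(projected)
print(features.shape)  # (40, 2)
```

The filters are fitted on all trials here only to show the shapes involved. In a real evaluation they would be fitted on training trials alone.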

5. Running the Pipeline

Now run the pipeline with k-fold cross-validation. The analysis window starts 0.5 seconds after the cue:

from bciflow.modules.core.kfold import kfold

results = kfold(
    target=dataset,
    start_window=dataset['events']['cue'][0] + 0.5,
    pre_folding=pre_folding,
    pos_folding=pos_folding
)
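
The idea behind running the later stages inside cross-validation can be sketched with scikit-learn alone. In this illustration, which assumes synthetic features and substitutes scikit-learn's mutual_info_classif for MIBIF, feature ranking and LDA are fitted on each training fold only and scored on the held-out fold. The number of folds, features, and selected features are arbitrary choices, and bciflow's kfold may be organized differently.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(0)
n_trials, n_features, k_best = 80, 12, 4  # arbitrary illustrative sizes
y = np.repeat([0, 1], n_trials // 2)
features = rng.standard_normal((n_trials, n_features))
features[y == 1, :3] += 1.5  # only the first three features carry class information

fold_scores = []
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, test_idx in cv.split(features, y):
    # Rank features on the training fold only, so the test fold stays unseen
    mi = mutual_info_classif(features[train_idx], y[train_idx], random_state=0)
    selected = np.argsort(mi)[-k_best:]  # indices of the k_best highest scores
    clf = LinearDiscriminantAnalysis()
    clf.fit(features[train_idx][:, selected], y[train_idx])
    fold_scores.append(clf.score(features[test_idx][:, selected], y[test_idx]))

mean_accuracy = float(np.mean(fold_scores))
print(f"Mean accuracy over {len(fold_scores)} folds: {mean_accuracy:.2f}")
```

Fitting the selector and classifier inside the loop keeps information from the test trials out of the model, which is the usual reason for separating stages applied before folding from those applied after it.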

6. Displaying Raw Results

Display a table of the results:

print(results)

7. Analyzing Performance Metrics

To summarize the results, compute the classification accuracy:

import pandas as pd
from bciflow.modules.analysis.metric_functions import accuracy

df = pd.DataFrame(results)
acc = accuracy(df)

print(f"Accuracy: {acc:.4f}")
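
As a rough picture of what an accuracy computation on such a table involves, the sketch below builds a small, hypothetical results table and scores it by hand. The column names (fold, true_label, and one probability column per class) are assumptions about the layout, not a documented bciflow format.

```python
import pandas as pd

# Hypothetical results table: one row per test trial,
# one probability column per class (assumed layout)
df = pd.DataFrame({
    'fold': [0, 0, 1, 1],
    'true_label': ['left-hand', 'right-hand', 'left-hand', 'right-hand'],
    'left-hand': [0.9, 0.2, 0.4, 0.3],
    'right-hand': [0.1, 0.8, 0.6, 0.7],
})
class_columns = ['left-hand', 'right-hand']

# The predicted class is the column with the highest probability in each row
predicted = df[class_columns].idxmax(axis=1)
correct = predicted == df['true_label']

manual_accuracy = correct.mean()
per_fold = correct.groupby(df['fold']).mean()  # accuracy within each fold

print(f"Accuracy: {manual_accuracy:.4f}")  # Accuracy: 0.7500
print(per_fold.to_dict())  # {0: 1.0, 1: 0.5}
```

Looking at per-fold values alongside the overall figure gives a sense of how stable the estimate is across splits.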

8. Complete Pipeline Code

Here is the entire pipeline code:

from bciflow.datasets.CBCIC import cbcic
from bciflow.modules.core.kfold import kfold
from bciflow.modules.tf.filterbank import filterbank
from bciflow.modules.sf.csp import csp
from bciflow.modules.fe.logpower import logpower
from bciflow.modules.fs.mibif import MIBIF
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as lda
import pandas as pd
from bciflow.modules.analysis.metric_functions import accuracy

dataset = cbcic(subject=1, path='data/cbcic/')

pre_folding = {'tf': (filterbank, {'kind_bp': 'chebyshevII'})}

sf = csp()
fe = logpower
fs = MIBIF(8, clf=lda())
clf = lda()

pos_folding = {
    'sf': (sf, {}),
    'fe': (fe, {}),
    'fs': (fs, {}),
    'clf': (clf, {})
}

results = kfold(
    target=dataset,
    start_window=dataset['events']['cue'][0] + 0.5,
    pre_folding=pre_folding,
    pos_folding=pos_folding
)

df = pd.DataFrame(results)
acc = accuracy(df)
print(f"Accuracy: {acc:.4f}")

Note

The pipeline structure makes the analysis reproducible, standardized, and automated. Feel free to experiment by changing parameters or modules to explore new approaches.