spikingjelly.datasets.base module#

class spikingjelly.datasets.base.NeuromorphicDatasetFolder(root, train=None, data_type='event', frames_number=None, split_by=None, duration=None, custom_integrate_function=None, custom_integrated_frames_dir_name=None, transform=None, target_transform=None)[source]#

Bases: DatasetFolder

The base class for SpikingJelly's neuromorphic datasets. Users can define a new dataset by inheriting this class and implementing all abstract methods. Users can refer to DVS128Gesture.

Users can control data formats by setting arguments:

If data_type == 'event': each sample is a dict whose keys are ['t', 'x', 'y', 'p'] and values are numpy.ndarray.
If data_type == 'frame' and frames_number is not None: events are integrated to a fixed number of frames. split_by defines how to split the events. See cal_fixed_frames_number_segment_index for more details.
If data_type == 'frame' and duration is not None: events are integrated with a fixed duration for each frame. The resulting sequences can have different lengths.
If data_type == 'frame' and custom_integrate_function is not None: events are integrated by the user-defined function and saved to the custom_integrated_frames_dir_name directory in root. See Neuromorphic Datasets Processing for more details.

The dataset preparation process consists of the following steps:

1. Argument check. This is done by NeuromorphicDatasetConfig.
2. Prepare the raw dataset.
   1. Dataset files are downloaded to root/download (if supported) and verified.
   2. Downloaded files are extracted to root/extract.
   3. Extracted data are converted into a unified raw event format (e.g., .npz) and saved to raw_root.
3. Convert the raw dataset to the processed dataset. The raw event data are converted into the final dataset format according to data_type and related parameters. This process is done by NeuromorphicDatasetBuilder. The processed dataset is saved to an auto-generated directory, processed_root.
4. Load the processed dataset. This is done by inheriting DatasetFolder and using its __getitem__().

Parameters:

root (Union[str, Path]) -- root path of the dataset
train (Optional[bool]) -- whether to use the training set. Set to True or False for datasets that provide a train/test split, e.g., DVS128 Gesture. If the dataset does not provide a train/test split, e.g., CIFAR10-DVS, set this to None and use the split_to_train_test_set function to obtain the train/test sets
data_type (str) -- "event" or "frame"
frames_number (Optional[int]) -- the number of integrated frames
split_by (Optional[str]) -- "time" or "number"
duration (Optional[int]) -- the time duration of each frame, whose unit is the same as the time unit of the specific dataset
custom_integrate_function (Optional[Callable]) -- a user-defined function whose inputs are events, H, W. events is a dict whose keys are ['t', 'x', 'y', 'p'] and values are numpy.ndarray. H is the height of the data and W is the width of the data. For example, H=128 and W=128 for the DVS128 Gesture dataset. The function should return the integrated frame sequence (np.ndarray).
custom_integrated_frames_dir_name (Optional[str]) -- the name of the directory for saving the frames integrated by custom_integrate_function. If None, it is set to custom_integrate_function.__name__
transform (Optional[Callable]) -- a function/transform that takes in a sample and returns a transformed version, e.g., transforms.RandomCrop for images.
target_transform (Optional[Callable]) -- a function/transform that takes in the target and transforms it.
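
To make the custom_integrate_function contract above concrete, here is a minimal sketch of such a function. It splits the events of one sample into two halves by event count and accumulates each half into one frame. The function name integrate_two_halves and the output layout [T, 2, H, W] (two polarity channels per frame) are assumptions made for this example; see Neuromorphic Datasets Processing for the layout the library actually expects.

```python
import numpy as np


def integrate_two_halves(events, H, W):
    """Hypothetical custom integrate function: two frames per sample."""
    x, y = events["x"], events["y"]
    p = events["p"].astype(np.int64)  # polarity is used as a channel index
    n = len(events["t"])
    mid = n // 2
    # Assumed layout: [T, polarity, H, W] with T = 2.
    frames = np.zeros((2, 2, H, W), dtype=np.float32)
    for i, (lo, hi) in enumerate(((0, mid), (mid, n))):
        # np.add.at accumulates correctly when a pixel fires more than once.
        np.add.at(frames[i], (p[lo:hi], y[lo:hi], x[lo:hi]), 1.0)
    return frames


# Four toy events on a 4x4 sensor.
events = {
    "t": np.array([0, 10, 20, 30]),
    "x": np.array([1, 1, 2, 3]),
    "y": np.array([0, 0, 2, 3]),
    "p": np.array([1, 1, 0, 1]),
}
frames = integrate_two_halves(events, H=4, W=4)
```

If a function like this is passed as custom_integrate_function and custom_integrated_frames_dir_name is left as None, the directory name described above would be the function's __name__, here integrate_two_halves.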

property raw_root: Path#

Root directory of the raw dataset.

The raw dataset serves as an intermediate, unified representation of the original dataset. The processed dataset is generated from the raw dataset.

Returns: root/events_np by default
Return type: Path

prepare_raw_dataset()[source]#

Prepare the raw dataset.

This method ensures that the raw dataset exists under raw_root. If not, it performs the following steps sequentially:

Download dataset files to root/download (if supported) or verify existing downloads.
Extract downloaded files into root/extract by calling extract_downloaded_files().
Convert extracted data into raw dataset by calling create_raw_from_extracted(), and save the raw dataset to raw_root.
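
The sequence above can be sketched as a small standalone function. This is an illustration of the control flow only, not the library's implementation: prepare_raw and its three callback parameters are hypothetical names, and treating "the raw dataset exists" as "the raw_root directory exists" is an assumption.

```python
import tempfile
from pathlib import Path


def prepare_raw(root, download, extract, convert):
    """Run download -> extract -> convert only if the raw dataset is missing."""
    root = Path(root)
    raw_root = root / "events_np"  # default raw_root named on this page
    if raw_root.exists():  # assumption: directory existence is the check
        return raw_root, False
    download_root = root / "download"
    extract_root = root / "extract"
    for directory in (download_root, extract_root):
        directory.mkdir(parents=True, exist_ok=True)
    download(download_root)                # step 1: download or verify
    extract(download_root, extract_root)   # step 2: extract archives
    raw_root.mkdir()
    convert(extract_root, raw_root)        # step 3: write raw event files
    return raw_root, True


calls = []
with tempfile.TemporaryDirectory() as tmp:
    args = (
        tmp,
        lambda d: calls.append("download"),
        lambda d, e: calls.append("extract"),
        lambda e, r: calls.append("convert"),
    )
    _, built_first = prepare_raw(*args)
    _, built_second = prepare_raw(*args)  # raw_root now exists, so this is a no-op
```

The second call returns immediately, which mirrors the "if not, it performs the following steps" wording above.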

get_dataset_builder()[source]#

Create a dataset builder according to the configuration.

The builder defines how the raw dataset is converted into the final processed dataset. The specific builder is selected based on data_type and related parameters.

Returns: a dataset builder instance.
Return type: NeuromorphicDatasetBuilder
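
The mapping from arguments to builders documented on this page can be sketched as follows. The helper select_builder_name is hypothetical and returns class names as strings; the precedence used when several frame options are set at once is an assumption, not something this page states.

```python
def select_builder_name(data_type, frames_number=None, duration=None,
                        custom_integrate_function=None):
    """Return the name of the builder this page associates with the arguments."""
    if data_type == "event":
        return "EventBuilder"
    if data_type != "frame":
        raise ValueError(f"unsupported data_type: {data_type!r}")
    # Assumed precedence: frames_number, then duration, then the custom function.
    if frames_number is not None:
        return "FrameFixedNumberBuilder"
    if duration is not None:
        return "FrameFixedDurationBuilder"
    if custom_integrate_function is not None:
        return "FrameCustomIntegrateBuilder"
    raise ValueError("data_type='frame' needs frames_number, duration, "
                     "or custom_integrate_function")


event_builder = select_builder_name("event")
fixed_number = select_builder_name("frame", frames_number=16)
fixed_duration = select_builder_name("frame", duration=1000)
custom = select_builder_name("frame", custom_integrate_function=len)
```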

get_root_when_train_is_none(_root)[source]#

Determine the directory of the processed dataset when train is None.

This method is used for datasets that do not provide a predefined train/test split. Subclasses may override this method to implement custom directory layouts.

Parameters: _root (Path) -- root directory of the processed dataset.
Returns: directory of the processed dataset used by DatasetFolder.
Return type: Path

classmethod get_extensions()[source]#

Return valid file extensions for processed dataset samples.

These extensions are passed to DatasetFolder to identify valid data files.

Returns: tuple of supported file extensions, currently ('.npy', '.npz').
Return type: Tuple[str]

abstractmethod classmethod get_H_W()[source]#

Returns: a tuple (H, W), where H is the height of the data and W is the width of the data. For example, this function returns (128, 128) for the DVS128 Gesture dataset.
Return type: Tuple[int]

abstractmethod classmethod resource_url_md5()[source]#

Returns: a list url where url[i] is a tuple containing the i-th file's name, download link, and MD5 checksum.
Return type: list
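
As an illustration of how a list in this (name, link, MD5) format might be used to verify downloads, here is a minimal sketch built on the standard library only. The helper names md5_of and verify_downloads, the file names, and the example.com links are all hypothetical; the library's own verification code may differ.

```python
import hashlib
import tempfile
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(download_root, resources):
    """Return the names of files that are missing or whose MD5 does not match."""
    bad = []
    for name, _url, md5 in resources:
        path = Path(download_root) / name
        if not path.is_file() or md5_of(path) != md5:
            bad.append(name)
    return bad


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "a.zip").write_bytes(b"hello")
    # Hypothetical resource list in the (name, link, MD5) format described above.
    resources = [
        ("a.zip", "https://example.com/a.zip", hashlib.md5(b"hello").hexdigest()),
        ("b.zip", "https://example.com/b.zip", "0" * 32),  # never downloaded
    ]
    bad = verify_downloads(tmp, resources)
```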

abstractmethod classmethod downloadable()[source]#

Returns: whether the dataset can be downloaded directly by Python code. If False, users need to download it manually.
Return type: bool

abstractmethod classmethod extract_downloaded_files(download_root, extract_root)[source]#

Define how downloaded dataset files are extracted.

Parameters:

download_root (Path) -- root directory that stores downloaded dataset files.
extract_root (Path) -- root directory that stores files extracted from the downloaded archives.

abstractmethod classmethod create_raw_from_extracted(extract_root, raw_root)[source]#

Define how to convert the extracted dataset in extract_root to the raw dataset format and save the converted files to raw_root.

Parameters:

extract_root (Path) -- root directory where extracted files are saved.
raw_root (Path) -- root directory where converted raw dataset files are saved.
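
A conversion step of this kind typically ends by writing one file per sample. The sketch below shows one way to store a sample as an .npz file with the keys ['t', 'x', 'y', 'p'] used on this page and read it back with np.load. The helper name save_raw_sample and the class_0/sample_0.npz layout (one subdirectory per class, as DatasetFolder expects) are assumptions for illustration.

```python
import tempfile
from pathlib import Path

import numpy as np


def save_raw_sample(path, t, x, y, p):
    """Hypothetical helper: store one sample's events in a single .npz file."""
    np.savez(path, t=t, x=x, y=y, p=p)


with tempfile.TemporaryDirectory() as tmp:
    # Assumed layout: raw_root/<class name>/<sample>.npz
    sample_path = Path(tmp) / "class_0" / "sample_0.npz"
    sample_path.parent.mkdir(parents=True)
    save_raw_sample(
        sample_path,
        t=np.array([0, 5, 9]),
        x=np.array([1, 2, 3]),
        y=np.array([4, 5, 6]),
        p=np.array([0, 1, 1]),
    )
    # Read the arrays back while the file is still open.
    with np.load(sample_path) as data:
        loaded = {key: data[key] for key in ("t", "x", "y", "p")}
```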

class spikingjelly.datasets.base.NeuromorphicDatasetBuilder(cfg, raw_root)[source]#

Bases: ABC

Abstract base class for neuromorphic dataset builders.

A dataset builder defines how raw event data are converted into a processed dataset that can be loaded by DatasetFolder. Each builder encapsulates one concrete preprocessing strategy (e.g., event data, fixed-frame integration, fixed-duration integration).

The builder is responsible for:

Determining the directory where the processed dataset is saved.
Creating processed files if they do not already exist.
Providing a loader function for torchvision.datasets.DatasetFolder.

Subclasses should implement the abstract methods build_impl(), get_loader() and property processed_root.

Parameters:

cfg (NeuromorphicDatasetConfig) -- dataset configuration.
raw_root (Path) -- root directory of the raw dataset. The builder will read data from this directory.

abstract property processed_root: Path#

Root directory of the processed dataset.

This directory stores the output of the preprocessing step defined by the builder.

Returns: root directory of the processed dataset
Return type: Path

build()[source]#

Build the processed dataset if necessary.

If the processed dataset directory already exists, this method skips preprocessing. Otherwise, it invokes build_impl() to generate processed files.

Returns: a tuple (processed_root, loader). processed_root is defined by the property processed_root. loader is a function that loads individual samples.
Return type: Tuple[Path, Callable]
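
The build-if-missing behaviour described above can be illustrated with a stand-in class hierarchy. SketchBuilder and UpperCaseBuilder below are hypothetical and are not the library classes; the toy "preprocessing" upper-cases text files instead of integrating events, so that the example runs without any dataset.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class SketchBuilder(ABC):
    """Stand-in for the builder contract on this page; not the library class."""

    def __init__(self, raw_root):
        self.raw_root = Path(raw_root)

    @property
    @abstractmethod
    def processed_root(self): ...

    @abstractmethod
    def build_impl(self): ...

    @abstractmethod
    def get_loader(self): ...

    def build(self):
        # Skip preprocessing when the processed directory already exists.
        if not self.processed_root.exists():
            self.build_impl()
        return self.processed_root, self.get_loader()


class UpperCaseBuilder(SketchBuilder):
    builds = 0  # counts how often build_impl actually runs

    @property
    def processed_root(self):
        return self.raw_root.parent / "upper"

    def build_impl(self):
        type(self).builds += 1
        self.processed_root.mkdir()
        for src in self.raw_root.glob("*.txt"):
            (self.processed_root / src.name).write_text(src.read_text().upper())

    def get_loader(self):
        return lambda path: Path(path).read_text()


with tempfile.TemporaryDirectory() as tmp:
    raw = Path(tmp) / "raw"
    raw.mkdir()
    (raw / "a.txt").write_text("spike")
    builder = UpperCaseBuilder(raw)
    root, loader = builder.build()
    builder.build()  # second call finds processed_root and skips build_impl
    text = loader(root / "a.txt")
    build_count = UpperCaseBuilder.builds
```

One consequence of this pattern is that a partially written processed directory would also be treated as complete; whether the library guards against that is not stated here.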

abstractmethod build_impl()[source]#

Implement dataset-specific preprocessing logic.

This method defines how raw data are transformed into processed dataset files and saved under processed_root.

Subclasses must implement this method.

Return type: None

abstractmethod get_loader()[source]#

Return a loader function for processed dataset files.

The returned callable should load a single processed file and return the corresponding sample. It will be passed to DatasetFolder.

Returns: a loader function that returns a single sample from a processed file
Return type: Callable

class spikingjelly.datasets.base.EventBuilder(cfg, raw_root)[source]#

Bases: NeuromorphicDatasetBuilder

Dataset builder for raw event data.

This builder performs no preprocessing and directly uses the raw dataset as the processed dataset. Each sample is loaded directly by np.load as a raw event file (e.g., .npz) without frame integration.

Typically, this builder is used when data_type == "event".

Parameters:

cfg (NeuromorphicDatasetConfig) -- dataset configuration
raw_root (Path) -- root directory of the raw data

build_impl()[source]#

Return type: None

build()[source]#

Use the raw dataset directory as the processed dataset directory directly without any additional preprocessing.

Returns: a tuple (processed_root, loader), where processed_root is the raw dataset directory and loader is np.load.
Return type: Tuple[Path, Callable]

property processed_root: Path#

get_loader()[source]#

Return type: Callable

class spikingjelly.datasets.base.FrameFixedNumberBuilder(cfg, raw_root, H, W)[source]#

Bases: NeuromorphicDatasetBuilder

Dataset builder for fixed-frame-number integration.

This builder converts raw event data into a fixed number of frames per sample. Events are split according to the specified strategy (by time or by event count) and integrated into frames.

It is used when data_type == "frame" and frames_number is specified.

Parameters:

H (int) -- height of the output frames.
W (int) -- width of the output frames.
cfg (NeuromorphicDatasetConfig)
raw_root (Path)

Other arguments are the same as those in NeuromorphicDatasetBuilder.
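
To show what splitting "by time" or "by event count" means in practice, here is a small sketch that computes the index range of each frame for sorted timestamps. It is not cal_fixed_frames_number_segment_index; the helper name segment_bounds is hypothetical, and how the library treats remainders and boundary events is an assumption here (leftover events go to the last frame).

```python
from bisect import bisect_left


def segment_bounds(t, split_by, frames_number):
    """Return [(start, stop), ...] index ranges, one per frame, for sorted t."""
    n = len(t)
    if split_by == "number":
        # Equal event counts per frame; the last frame takes the remainder.
        step = n // frames_number
        edges = [i * step for i in range(frames_number)] + [n]
    elif split_by == "time":
        # Equal time spans per frame; event counts may differ a lot.
        span = t[-1] - t[0]
        cuts = [t[0] + span * i / frames_number for i in range(1, frames_number)]
        edges = [0] + [bisect_left(t, c) for c in cuts] + [n]
    else:
        raise ValueError(f"split_by must be 'time' or 'number', got {split_by!r}")
    return list(zip(edges[:-1], edges[1:]))


# Four early events close together, then four events spread over a long time.
t = [0, 1, 2, 3, 100, 200, 300, 400]
by_number = segment_bounds(t, "number", 2)
by_time = segment_bounds(t, "time", 2)
```

With these timestamps, splitting by number puts four events in each frame, while splitting by time puts five events in the first frame and three in the second, because the cut falls at the temporal midpoint.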

build_impl()[source]#

Return type: None

property processed_root: Path#

get_loader()[source]#

Return type: Callable

class spikingjelly.datasets.base.FrameFixedDurationBuilder(cfg, raw_root, H, W)[source]#

Bases: NeuromorphicDatasetBuilder

Dataset builder for fixed-duration integration.

This builder converts raw event data into frame sequences where each frame corresponds to a fixed time duration. Different samples may have different lengths.

It is used when data_type == "frame" and duration is specified.

Parameters:

H (int) -- height of the output frames.
W (int) -- width of the output frames.
cfg (NeuromorphicDatasetConfig)
raw_root (Path)

Other arguments are the same as those in NeuromorphicDatasetBuilder.
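
The reason samples end up with different lengths under fixed-duration integration can be seen in a short sketch. The helper frames_for_duration is hypothetical and only counts events per frame; how the library handles the final, partially filled frame is an assumption here (it is kept).

```python
def frames_for_duration(t, duration):
    """Count events per frame when each frame spans `duration` time units."""
    if len(t) == 0:
        return []
    # The number of frames depends on the recording length, not on a fixed T.
    n_frames = (t[-1] - t[0]) // duration + 1
    counts = [0] * n_frames
    for ti in t:
        counts[(ti - t[0]) // duration] += 1
    return counts


short_sample = frames_for_duration([0, 3, 9], duration=5)
long_sample = frames_for_duration([0, 3, 9, 14, 21], duration=5)
```

The first sample yields two frames and the second yields five, so batching such samples generally requires padding or a custom collate step.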

build_impl()[source]#

Return type: None

property processed_root: Path#

get_loader()[source]#

Return type: Callable

class spikingjelly.datasets.base.FrameCustomIntegrateBuilder(cfg, raw_root, H, W)[source]#

Bases: NeuromorphicDatasetBuilder

Dataset builder for custom event-to-frame integration.

This builder applies a user-defined integration function to convert raw event data into frame sequences. The resulting frames are saved on disk under a user-specified directory. Refer to Neuromorphic Datasets Processing for the way to define a custom integration function.

It is used when data_type == "frame" and custom_integrate_function is specified.

Parameters:

H (int) -- height of the output frames.
W (int) -- width of the output frames.
cfg (NeuromorphicDatasetConfig)
raw_root (Path)

Other arguments are the same as those in NeuromorphicDatasetBuilder.

build_impl()[source]#

Return type: None

property processed_root: Path#

get_loader()[source]#

Return type: Callable

class spikingjelly.datasets.base.NeuromorphicDatasetConfig(root, train, data_type='event', frames_number=None, split_by=None, duration=None, custom_integrate_function=None, custom_integrated_frames_dir_name=None, transform=None, target_transform=None)[source]#

Bases: object

Configuration container for neuromorphic datasets.

This dataclass encapsulates all user-specified options that control how a neuromorphic dataset is prepared, processed, and loaded. It is immutable, and is validated upon initialization.

Parameters:

root (Path)
train (bool | None)
data_type (str)
frames_number (int | None)
split_by (str | None)
duration (int | None)
custom_integrate_function (Callable | None)
custom_integrated_frames_dir_name (str | None)
transform (Callable | None)
target_transform (Callable | None)
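
The combination of "immutable" and "validated upon initialization" is commonly achieved with a frozen dataclass and __post_init__. The sketch below shows that pattern with a reduced set of fields. SketchConfig is a hypothetical stand-in, and the specific validation rules are assumptions; the library's actual checks are not listed on this page.

```python
from dataclasses import FrozenInstanceError, dataclass
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class SketchConfig:
    """Hypothetical stand-in showing the frozen + validated pattern."""

    root: Path
    train: Optional[bool]
    data_type: str = "event"
    frames_number: Optional[int] = None
    split_by: Optional[str] = None
    duration: Optional[int] = None

    def __post_init__(self):
        # Assumed rules, chosen to match the parameter descriptions on this page.
        if self.data_type not in ("event", "frame"):
            raise ValueError("data_type must be 'event' or 'frame'")
        if self.frames_number is not None:
            if self.frames_number <= 0:
                raise ValueError("frames_number must be positive")
            if self.split_by not in ("time", "number"):
                raise ValueError("split_by must be 'time' or 'number'")
        if self.duration is not None and self.duration <= 0:
            raise ValueError("duration must be positive")


cfg = SketchConfig(root=Path("data"), train=True, data_type="frame",
                   frames_number=16, split_by="number")

try:
    cfg.frames_number = 32  # frozen dataclasses reject attribute assignment
    mutated = True
except FrozenInstanceError:
    mutated = False

try:
    SketchConfig(root=Path("data"), train=None, data_type="video")
    rejected = False
except ValueError:
    rejected = True
```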

root: Path#

train: bool | None#

data_type: str = 'event'#

frames_number: int | None = None#

split_by: str | None = None#

duration: int | None = None#

custom_integrate_function: Callable | None = None#

custom_integrated_frames_dir_name: str | None = None#

transform: Callable | None = None#

target_transform: Callable | None = None#