Attention Layers

SpikingJelly's attention layers implement attention mechanisms for deep spiking neural networks (SNNs), including attention layers for convolutional SNNs and for spiking Transformers. For more information about spiking Transformers, see Spiking Transformer Construction, Training, and Improvements.

class spikingjelly.activation_based.layer.attention.TemporalWiseAttention(T: int, reduction: int = 16, dimension: int = 4)

Bases: Module, MultiStepModule

The TemporalWiseAttention layer was proposed in Temporal-Wise Attention Spiking Neural Networks for Event Streams Classification.

It should be placed after the convolution layer and before the spiking neurons, e.g.,

Conv2d -> TemporalWiseAttention -> LIF

The input has shape [T, N, C, H, W] or [T, N, L], and the output has the same shape as the input.

reduction is the reduction ratio, which corresponds to \(r\) in the paper.

Parameters:
  • T (int) -- number of time steps of the input

  • reduction (int) -- reduction ratio

  • dimension (int) -- dimensionality of the input. Use dimension = 4 for input of shape [T, N, C, H, W] and dimension = 2 for input of shape [T, N, L].

Returns:

None

Return type:

None
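As a rough illustration of what a temporal-wise attention layer computes, the NumPy sketch below squeezes each time step to one scalar per sample, passes the resulting T-dimensional vector through a two-layer bottleneck whose hidden width is T // reduction, and rescales every time step by the resulting score. It is a simplified sketch, not SpikingJelly's implementation: the function name `temporal_wise_attention_sketch`, the random weights `w1` and `w2`, and the use of mean pooling alone are assumptions made for illustration.

```python
import numpy as np


def temporal_wise_attention_sketch(x_seq, w1, w2):
    """Rescale each time step of x_seq by a score in (0, 1).

    x_seq: array of shape [T, N, ...] (e.g. [T, N, C, H, W] or [T, N, L]).
    w1:    array of shape [T, T // reduction] (hypothetical weights).
    w2:    array of shape [T // reduction, T] (hypothetical weights).
    """
    T, N = x_seq.shape[:2]
    # Squeeze: average every non-time, non-batch axis -> [T, N], then [N, T].
    pooled = x_seq.reshape(T, N, -1).mean(axis=2).T
    # Excitation: bottleneck with ReLU, then a sigmoid -> scores of shape [N, T].
    hidden = np.maximum(pooled @ w1, 0.0)
    scores = 1.0 / (1.0 + np.exp(-(hidden @ w2)))
    # Broadcast the [T, N] scores over the remaining axes of x_seq.
    extra_axes = (1,) * (x_seq.ndim - 2)
    return x_seq * scores.T.reshape(T, N, *extra_axes)


rng = np.random.default_rng(0)
T, N, C, H, W = 16, 2, 3, 4, 4
reduction = 4  # plays the role of r; this sketch needs T // reduction >= 1
w1 = rng.standard_normal((T, T // reduction))
w2 = rng.standard_normal((T // reduction, T))

x_5d = rng.standard_normal((T, N, C, H, W))
x_3d = rng.standard_normal((T, N, 8))
y_5d = temporal_wise_attention_sketch(x_5d, w1, w2)
y_3d = temporal_wise_attention_sketch(x_3d, w1, w2)
print(y_5d.shape, y_3d.shape)
```

In this sketch the hidden width is T // reduction, so a reduction larger than T would leave no hidden units; choosing reduction no larger than T avoids that.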

forward(x_seq: Tensor)

class spikingjelly.activation_based.layer.attention.MultiDimensionalAttention(T: int, C: int, reduction_t: int = 16, reduction_c: int = 16, kernel_size=3)

Bases: Module, MultiStepModule

The MA-SNN model and the MultiDimensionalAttention layer were proposed in Attention Spiking Neural Networks.

Example projects for MA-SNN are available at:

  • MA-SNN/MA-SNN

  • ridgerchu/SNN_Attention_VGG

The input has shape [T, N, C, H, W], and the output has the same shape.

Parameters:
  • T (int) -- number of time steps of the input

  • C (int) -- number of input channels

  • reduction_t (int) -- temporal reduction ratio

  • reduction_c (int) -- channel reduction ratio

  • kernel_size (int) -- convolution kernel size of the spatial attention

Returns:

None

Return type:

None
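The parameter names suggest attention along three axes: time, channel and space. The NumPy sketch below illustrates only how score tensors for these three axes can broadcast against a [T, N, C, H, W] input so that the output keeps the input shape. The scores are random placeholders and the function name `apply_scores` is hypothetical; how the layer actually computes and combines its scores is not shown here.

```python
import numpy as np

rng = np.random.default_rng(0)
T, N, C, H, W = 4, 2, 8, 5, 5
x = rng.standard_normal((T, N, C, H, W))

# Placeholder scores in [0, 1); a real layer would compute them from x.
temporal_scores = rng.random((T, N, 1, 1, 1))  # one score per time step
channel_scores = rng.random((T, N, C, 1, 1))   # one score per channel
spatial_scores = rng.random((T, N, 1, H, W))   # one score per spatial location


def apply_scores(x, *scores):
    """Multiply x by each score tensor in turn, relying on broadcasting."""
    out = x
    for s in scores:
        out = out * s
    return out


y = apply_scores(x, temporal_scores, channel_scores, spatial_scores)
print(y.shape)
```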

forward(x: Tensor)

class spikingjelly.activation_based.layer.attention.SpikingSelfAttention(dim, num_heads=8, backend: str = 'torch')

Bases: Module, MultiStepModule

The Spiking Self-Attention layer proposed in Spikformer: When Spiking Neural Network Meets Transformer. This module is based on the Spikformer source code, with several improvements that significantly improve runtime efficiency. For more details about Spikformer and the implementation of this module, see the tutorial Spiking Transformer Construction, Training, and Improvements.

The input to this module is a spike tensor of shape [T, N, C, L], where T is the number of time steps, N is the batch size, C is the number of channels, and L is the number of tokens (for vision tasks, L = H * W). The output is a spike tensor of the same shape [T, N, C, L].

Parameters:
  • dim (int) -- number of channels

  • num_heads (int) -- number of heads in multi-head self-attention. Default: 8

  • backend (str) -- backend used by the internal neurons of this module. Default: torch

Returns:

None

Return type:

None
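The core idea of spiking self-attention in the Spikformer paper is that queries, keys and values are binary spike matrices, so the attention map is computed with plain matrix products and a scaling factor rather than a softmax. The NumPy sketch below shows this for a single time step, a single sample and a single head, with tokens along the first axis. It is a simplified illustration, not this module's implementation: the Heaviside threshold stands in for a spiking neuron layer, and the function name, `scale` and `threshold` are hypothetical.

```python
import numpy as np


def spiking_self_attention_sketch(q, k, v, scale, threshold=1.0):
    """Single-step, single-head spiking self-attention on binary inputs.

    q, k, v: binary arrays of shape [L, d] (L tokens, d channels per head).
    Returns a binary array of shape [L, d].
    """
    attn = q @ k.T            # [L, L], non-negative spike-coincidence counts
    out = (attn @ v) * scale  # [L, d]; no softmax is applied
    # Heaviside step as a stand-in for a spiking neuron layer (assumption).
    return (out >= threshold).astype(q.dtype)


rng = np.random.default_rng(0)
L, d = 6, 4
q = (rng.random((L, d)) < 0.5).astype(np.float32)
k = (rng.random((L, d)) < 0.5).astype(np.float32)
v = (rng.random((L, d)) < 0.5).astype(np.float32)

spikes = spiking_self_attention_sketch(q, k, v, scale=0.25)
print(spikes.shape, np.unique(spikes))
```

Because every entry of q, k and v is 0 or 1, the products are non-negative counts, which is why a simple scale followed by a thresholding neuron can replace the softmax normalization.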

property backend

Once set, the backend of all neurons in this module is changed to the same value.
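The behavior described above can be pictured as a property whose setter forwards the new value to every neuron the module owns. The sketch below uses stand-in classes (`NeuronStub`, `AttentionStub`) that are hypothetical and unrelated to SpikingJelly's real classes; the string 'cupy' is used only as an example backend name.

```python
class NeuronStub:
    """Stand-in for a spiking neuron that records which backend it uses."""

    def __init__(self, backend="torch"):
        self.backend = backend


class AttentionStub:
    """Stand-in module whose backend setter updates all of its neurons."""

    def __init__(self, backend="torch"):
        self.neurons = [NeuronStub(backend) for _ in range(3)]
        self._backend = backend

    @property
    def backend(self):
        return self._backend

    @backend.setter
    def backend(self, value):
        self._backend = value
        for neuron in self.neurons:  # propagate to every internal neuron
            neuron.backend = value


attn = AttentionStub()
attn.backend = "cupy"
print([n.backend for n in attn.neurons])
```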

forward(x_seq: Tensor)
Parameters:

x_seq (torch.Tensor) -- shape=[T, N, C, L]

Returns:

shape=[T, N, C, L]

Return type:

torch.Tensor
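For vision inputs, feature maps of shape [T, N, C, H, W] need to be flattened to [T, N, C, L] with L = H * W before being passed to forward, and the output can be reshaped back afterwards. A minimal NumPy sketch of that bookkeeping follows; the helper names `to_tokens` and `to_feature_map` are hypothetical, and equivalent reshapes exist for tensors in other array libraries.

```python
import numpy as np


def to_tokens(x):
    """Flatten [T, N, C, H, W] to [T, N, C, L] with L = H * W."""
    T, N, C, H, W = x.shape
    return x.reshape(T, N, C, H * W)


def to_feature_map(x, H, W):
    """Restore [T, N, C, L] to [T, N, C, H, W]; requires L == H * W."""
    T, N, C, L = x.shape
    if L != H * W:
        raise ValueError(f"L={L} does not equal H*W={H * W}")
    return x.reshape(T, N, C, H, W)


x = np.arange(4 * 2 * 3 * 5 * 5, dtype=np.float32).reshape(4, 2, 3, 5, 5)
tokens = to_tokens(x)
restored = to_feature_map(tokens, 5, 5)
print(tokens.shape, restored.shape)
```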

class spikingjelly.activation_based.layer.attention.QKAttention(dim: int, num_heads: int = 8, qka_type: str = 'token', backend: str = 'torch')

Bases: Module, MultiStepModule

The Q-K Attention layer proposed in QKFormer: Hierarchical Spiking Transformer using Q-K Attention. This module is based on the QKFormer source code, with several improvements that significantly improve runtime efficiency. The improvement strategy is similar to the one used for Spikformer; see the tutorial Spiking Transformer Construction, Training, and Improvements for details.

The input to this module is a spike tensor of shape [T, N, C, L], where T is the number of time steps, N is the batch size, C is the number of channels, and L is the number of tokens (for vision tasks, L = H * W). The output is a spike tensor of the same shape [T, N, C, L].

Parameters:
  • dim (int) -- number of channels.

  • num_heads (int) -- number of heads in multi-head self-attention. Default: 8.

  • qka_type (str) -- type of QKAttention. Available options are token and channel. Default: token, which generates a token-wise mask.

  • backend (str) -- backend used by the internal neurons of this module. Default: torch.

Returns:

None

Return type:

None
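The difference between the two qka_type options can be sketched as follows: a token-wise mask has one entry per token and is shared across channels, while a channel-wise mask has one entry per channel and is shared across tokens. The NumPy sketch below, for a single time step and a single sample, sums the binary query along the other axis, thresholds the sum into a binary mask, and applies the mask to the key. It is an illustration under assumptions, not this module's implementation: the function name `qk_attention_sketch`, the Heaviside threshold standing in for a spiking neuron, and the `threshold` value are hypothetical.

```python
import numpy as np


def qk_attention_sketch(q, k, qka_type="token", threshold=1.0):
    """Mask the key k with a binary mask derived from the query q.

    q, k: binary arrays of shape [C, L] (C channels, L tokens).
    Returns (mask, masked_key).
    """
    if qka_type == "token":
        # Sum over channels -> one value per token, shape [1, L].
        summed = q.sum(axis=0, keepdims=True)
    elif qka_type == "channel":
        # Sum over tokens -> one value per channel, shape [C, 1].
        summed = q.sum(axis=1, keepdims=True)
    else:
        raise ValueError(f"unknown qka_type: {qka_type!r}")
    # Heaviside step as a stand-in for a spiking neuron (assumption).
    mask = (summed >= threshold).astype(k.dtype)
    return mask, mask * k  # the mask broadcasts over the other axis


q = np.array([[1, 0, 0, 1],
              [0, 0, 0, 1],
              [0, 0, 0, 0]], dtype=np.float32)
k = np.ones((3, 4), dtype=np.float32)

token_mask, token_out = qk_attention_sketch(q, k, "token")
channel_mask, channel_out = qk_attention_sketch(q, k, "channel")
print(token_mask.shape, channel_mask.shape)
```

Neither variant forms an L-by-L attention matrix in this sketch; each reduces the query along one axis and applies an elementwise mask.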

property backend

Once set, the backend of all neurons in this module is changed to the same value.

property qka_type

Read-only. Set at construction time and cannot be modified afterwards.

forward(x_seq)
Parameters:

x_seq (torch.Tensor) -- shape=[T, N, C, L]

Returns:

shape=[T, N, C, L]

Return type:

torch.Tensor

class spikingjelly.activation_based.layer.attention.TokenQKAttention(dim: int, num_heads: int = 8, backend: str = 'torch')

Bases: QKAttention

Equivalent to QKAttention(..., qka_type="token"). See QKAttention.

Parameters:
  • dim (int) -- input dimension

  • num_heads (int) -- number of attention heads

  • backend (str) -- backend

Returns:

None

Return type:

None

class spikingjelly.activation_based.layer.attention.ChannelQKAttention(dim: int, num_heads: int = 8, backend: str = 'torch')

Bases: QKAttention

Equivalent to QKAttention(..., qka_type="channel"). See QKAttention.

Parameters:
  • dim (int) -- input dimension

  • num_heads (int) -- number of attention heads

  • backend (str) -- backend

Returns:

None

Return type:

None
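One way to picture the relationship between QKAttention, TokenQKAttention and ChannelQKAttention is a base class with a read-only qka_type property and two subclasses that fix that argument. The sketch below uses hypothetical stand-in classes (the `*Stub` names) and does not reproduce SpikingJelly's code.

```python
class QKAttentionStub:
    """Stand-in base class with a read-only qka_type."""

    def __init__(self, dim, num_heads=8, qka_type="token", backend="torch"):
        if qka_type not in ("token", "channel"):
            raise ValueError(f"unknown qka_type: {qka_type!r}")
        self.dim = dim
        self.num_heads = num_heads
        self.backend = backend
        self._qka_type = qka_type

    @property
    def qka_type(self):
        # No setter is defined, so assigning to qka_type raises AttributeError.
        return self._qka_type


class TokenQKAttentionStub(QKAttentionStub):
    """Fixes qka_type to "token"."""

    def __init__(self, dim, num_heads=8, backend="torch"):
        super().__init__(dim, num_heads, qka_type="token", backend=backend)


class ChannelQKAttentionStub(QKAttentionStub):
    """Fixes qka_type to "channel"."""

    def __init__(self, dim, num_heads=8, backend="torch"):
        super().__init__(dim, num_heads, qka_type="channel", backend=backend)


layer = ChannelQKAttentionStub(dim=64)
print(layer.qka_type)
try:
    layer.qka_type = "token"
except AttributeError:
    print("qka_type is read-only")
```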