spikingjelly.activation_based.ann2snn package

Step executed after FX tracing and device transfer. The default implementation returns fx_model unchanged. Subclasses can fuse Conv-BN modules or perform other post-tracing preprocessing here; any training/eval mode that affects FX tracing should be set in before_trace().

Parameters:

converter (FXConverter) -- Converter that executes this recipe.
fx_model (GraphModule) -- GraphModule after tracing and device transfer.

Returns:

GraphModule used by later steps.

Return type:

GraphModule

before_trace(converter, ann)

Step executed before FX tracing. The default implementation returns ann unchanged. Subclasses can set training/eval mode or perform model preparation that must happen before tracing.

Parameters:

converter (FXConverter) -- Converter that executes this recipe.
ann (Module) -- Original ANN to be traced.

Returns:

ANN used by FX tracing.

Return type:

Module

calibrate(converter, fx_model)

Run calibration data. The default implementation does not iterate over the dataloader and returns fx_model unchanged. Subclasses that need calibration decide for themselves whether to use torch.no_grad(), how to parse batches, and how to update the inserted observers or hooks.

Parameters:

converter (FXConverter) -- Converter that executes this recipe.
fx_model (GraphModule) -- Current GraphModule.

Returns:

GraphModule used by later steps.

Return type:

GraphModule

finalize(converter, fx_model)

Final step before returning the converted model. The default implementation returns fx_model unchanged. Subclasses can perform final graph linting, clean up temporary modules, restore state, or wrap the torch.nn.Module that is finally returned.

Parameters:

converter (FXConverter) -- Converter that executes this recipe.
fx_model (GraphModule) -- Current GraphModule.

Returns:

Final converted model.

Return type:

Module

insert_observers(converter, fx_model)

Insert calibration observers or hooks. The default implementation inserts nothing and returns fx_model unchanged. Recipes that need calibration data can mutate the FX graph here.

Parameters:

converter (FXConverter) -- Converter that executes this recipe.
fx_model (GraphModule) -- Current GraphModule.

Returns:

GraphModule used by later steps.

Return type:

GraphModule

replace(converter, fx_model)

Perform the core replacement step, such as replacing activations with spiking neurons or replacing ANN modules with TD operators. The default implementation returns fx_model unchanged.

Parameters:

converter (FXConverter) -- Converter that executes this recipe.
fx_model (GraphModule) -- Current GraphModule.

Returns:

GraphModule after replacement.

Return type:

GraphModule

validate(converter)

Validate this recipe's prerequisites. The default implementation checks nothing. FXConverter / Converter calls this method once at the beginning of each conversion; subclasses should not perform graph conversion here.

Parameters: converter (FXConverter) -- Converter that executes this recipe.
Return type: None
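
The hooks documented above can be read as one pipeline. The following is a minimal sketch of that reading, not the converter's actual code: the call order is inferred from the descriptions (validate first, finalize last, observers inserted before calibration, replacement after calibration), and RecordingRecipe, run_fx_pipeline and the trace callable are hypothetical stand-ins that use plain Python objects instead of nn.Module and GraphModule.

```python
from typing import Any, Callable, List


class RecordingRecipe:
    """Hypothetical stand-in exposing the documented hook names; not the library class."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def validate(self, converter: Any) -> None:
        self.calls.append("validate")

    def before_trace(self, converter: Any, ann: Any) -> Any:
        self.calls.append("before_trace")
        return ann

    def after_trace(self, converter: Any, fx_model: Any) -> Any:
        self.calls.append("after_trace")
        return fx_model

    def insert_observers(self, converter: Any, fx_model: Any) -> Any:
        self.calls.append("insert_observers")
        return fx_model

    def calibrate(self, converter: Any, fx_model: Any) -> Any:
        self.calls.append("calibrate")
        return fx_model

    def replace(self, converter: Any, fx_model: Any) -> Any:
        self.calls.append("replace")
        return fx_model

    def finalize(self, converter: Any, fx_model: Any) -> Any:
        self.calls.append("finalize")
        return fx_model


def run_fx_pipeline(recipe: Any, converter: Any, ann: Any,
                    trace: Callable[[Any], Any]) -> Any:
    """Assumed ordering of the lifecycle hooks; each hook returns the object used next."""
    recipe.validate(converter)
    ann = recipe.before_trace(converter, ann)
    fx_model = trace(ann)  # stand-in for FX tracing plus device transfer
    fx_model = recipe.after_trace(converter, fx_model)
    fx_model = recipe.insert_observers(converter, fx_model)
    fx_model = recipe.calibrate(converter, fx_model)
    fx_model = recipe.replace(converter, fx_model)
    return recipe.finalize(converter, fx_model)
```

Because every hook returns the object that the next step consumes, a subclass can swap or wrap the model at any stage without the driver needing to know about it.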

class spikingjelly.activation_based.ann2snn.recipes.ModuleConversionRecipe

Bases: object

Base class for direct nn.Module tree conversion recipes. This path does not run FX tracing; ModuleConverter only calls validate() and convert_module(). It is intended for conversions such as SpikeZIP that replace submodules in a module tree without rewriting an FX graph. This base class has no before_trace, after_trace, insert_observers, calibrate, replace or finalize lifecycle steps.

convert_module(converter, ann)

Execute direct module-tree conversion. The default implementation returns ann unchanged. Implementations must return a torch.nn.Module instance.

Parameters:

converter (ModuleConverter) -- Module converter that executes this recipe.
ann (Module) -- Original ANN or QANN to convert.

Returns:

Converted model.

Return type:

Module

validate(converter)

Validate prerequisites for a module-tree recipe. The default implementation checks nothing.

Parameters: converter (ModuleConverter) -- Module converter that executes this recipe.
Return type: None
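
To make the module-tree path concrete, here is a small sketch of replacing submodules by walking a tree of named children, which is the kind of work convert_module() is described as doing. Everything in it is hypothetical: ToyModule, ToyAttention, ToySpikingAttention and convert_tree are plain-Python stand-ins, not torch.nn.Module or library classes.

```python
from typing import Callable, Dict


class ToyModule:
    """Minimal stand-in for a module with named children (not torch.nn.Module)."""

    def __init__(self, **children: "ToyModule") -> None:
        self.children: Dict[str, "ToyModule"] = dict(children)


class ToyAttention(ToyModule):
    """Stand-in for a quantized attention block that should be replaced."""


class ToySpikingAttention(ToyModule):
    """Stand-in for the replacement; it keeps the children of the source block."""

    def __init__(self, source: ToyModule) -> None:
        super().__init__(**source.children)


def convert_tree(module: ToyModule,
                 should_replace: Callable[[ToyModule], bool],
                 build: Callable[[ToyModule], ToyModule]) -> ToyModule:
    """Replace matching children in place; recurse only into children that are kept."""
    for name, child in list(module.children.items()):
        if should_replace(child):
            module.children[name] = build(child)
        else:
            convert_tree(child, should_replace, build)
    return module
```

The walk never builds or rewrites a graph, which is the distinction the base class draws from the FX path.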

class spikingjelly.activation_based.ann2snn.recipes.LocalThresholdBalancingRecipe(dataloader, time_steps=64, mode='99.9%', channel_dim=1, threshold_candidates=(0.5, 0.75, 1.0, 1.25, 1.5), fuse_flag=True, eps=1e-06)

Bases: FXConversionRecipe

Construct a training-free local-threshold-balancing (LTB) ANN2SNN conversion recipe. It uses calibration data only to choose channel-wise thresholds on the SNN side for ReLU outputs; it neither trains nor modifies the parameters of the input ANN.

Reference: Bu T, Li M, Yu Z. Inference-Scale Complexity in ANN-SNN Conversion for High-Performance and Low-Power Applications. arXiv:2409.03368, 2024. Accepted by CVPR 2025.

Parameters:

dataloader (Iterable) -- Calibration dataloader.
time_steps (int) -- SNN simulation steps, retained as recipe configuration metadata.
mode (str or float) -- Retained for configuration validation in the style of RateCodingRecipe.
channel_dim (int) -- Channel dimension of ReLU outputs.
threshold_candidates (Tuple[float, ...]) -- Retained for compatibility with earlier experimental APIs. The current implementation follows the original LTB per-batch threshold-balancing update.
fuse_flag (bool) -- Whether to fuse Conv-BN modules.
eps (float) -- Numeric lower bound.

after_trace(converter, fx_model)

Parameters:

converter (Converter)
fx_model (fx.GraphModule)

Return type:

fx.GraphModule

before_trace(converter, ann)

Parameters:

converter (Converter)
ann (nn.Module)

Return type:

nn.Module

calibrate(converter, fx_model)

Parameters:

converter (Converter)
fx_model (fx.GraphModule)

Return type:

fx.GraphModule

insert_observers(converter, fx_model)

Parameters:

converter (Converter)
fx_model (fx.GraphModule)

Return type:

fx.GraphModule

replace(converter, fx_model)

Parameters:

converter (Converter)
fx_model (fx.GraphModule)

Return type:

fx.GraphModule

validate(converter)

Parameters: converter (Converter)
Return type: None

class spikingjelly.activation_based.ann2snn.recipes.RateCodingRecipe(dataloader, mode='Max', momentum=0.1, fuse_flag=True, channel_wise=False, channel_dim=1, pre_spike_maxpool=False, half_threshold=False, eps=1e-06, rules=None, neuron_factory=None, threshold_optimizer=None)

Bases: FXConversionRecipe

Construct a traditional rate-coding ReLU-to-IFNode conversion recipe. This recipe owns the rate-coding algorithm parameters and performs Conv-BN fusion, VoltageHook calibration and neuron replacement.

Parameters:

dataloader (Iterable) -- Calibration dataloader. Each batch can be a single-input tensor, an (input, target)-style tuple/list, or a dict with input-like keys such as "input", "image", or "images". The default recipe only supports single-input calibration; multi-input models should be handled through a custom recipe.
mode (str or float) -- VoltageHook statistics mode. Supports "Max", percentile strings such as "99.9%", and float scaling with 0 < mode <= 1.
momentum (float) -- VoltageHook momentum.
fuse_flag (bool) -- Whether to fuse Conv-BN modules.
channel_wise (bool) -- If True, collect robust activation scales per channel and use channel-wise thresholds. Defaults to False to preserve the original layer-wise behaviour.
channel_dim (int) -- Channel dimension used when channel_wise=True.
pre_spike_maxpool (bool) -- When channel_wise=True and a ReLU -> MaxPool2d pattern is matched, whether to place MaxPool2d before the spiking neuron.
half_threshold (bool) -- When channel_wise=True, whether to initialize the membrane potential at half the threshold.
eps (float) -- Numeric lower bound for thresholds when channel_wise=True.
rules (Optional[List[ActivationRule]]) -- Activation conversion rules. Defaults to [ReLURule()].
neuron_factory (Optional[NeuronFactory]) -- Spiking-neuron factory.
threshold_optimizer (Optional[ThresholdOptimizer]) -- Threshold optimizer.
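
The three batch shapes accepted by dataloader can be illustrated with a small helper. This is a sketch under stated assumptions rather than the recipe's own parsing code: extract_calibration_input and INPUT_KEYS are hypothetical names, the key lookup order is a guess, and the text only says the keys are "such as" these three, so the real list may be longer.

```python
from typing import Any

# Keys named in the parameter description; the lookup order is an assumption.
INPUT_KEYS = ("input", "image", "images")


def extract_calibration_input(batch: Any) -> Any:
    """Pick the single model input out of one calibration batch."""
    if isinstance(batch, dict):
        for key in INPUT_KEYS:
            if key in batch:
                return batch[key]
        raise KeyError(f"no input-like key found; expected one of {INPUT_KEYS}")
    if isinstance(batch, (tuple, list)):
        if not batch:
            raise ValueError("empty batch")
        return batch[0]  # (input, target)-style batch: the input comes first
    return batch  # a bare tensor is already the input
```

A model that needs several inputs per forward call does not fit this shape, which is why the text points such models to a custom recipe.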

after_trace(converter, fx_model)

Parameters:

converter (Converter)
fx_model (fx.GraphModule)

Return type:

fx.GraphModule

before_trace(converter, ann)

Parameters:

converter (Converter)
ann (nn.Module)

Return type:

nn.Module

calibrate(converter, fx_model)

Parameters:

converter (Converter)
fx_model (fx.GraphModule)

Return type:

fx.GraphModule

insert_observers(converter, fx_model)

Parameters:

converter (Converter)
fx_model (fx.GraphModule)

Return type:

fx.GraphModule

replace(converter, fx_model)

Replace calibrated activation-hook node pairs with rate-coding SNN subgraphs, and move the result to the current conversion device.

Parameters:

converter (Converter) -- Converter that executes this recipe.
fx_model (GraphModule) -- GraphModule with inserted and calibrated VoltageHook modules.

Returns:

GraphModule after replacement.

Return type:

GraphModule

validate(converter)

Parameters: converter (Converter)
Return type: None

class spikingjelly.activation_based.ann2snn.recipes.SpikeZIPTFQANNRecipe(time_steps=200, model_family='roberta', strict=True)

Bases: ModuleConversionRecipe

SpikeZIP-TF QANN-to-SNN recipe. This recipe does not quantize or post-train an ANN; it only replaces the modules of an already SpikeZIP-compatible QANN, in place, with transparent SNN modules. The current version supports "roberta" and "vit" quantized models.

RoBERTa-style self-attention modules must expose query, key and value linear layers, num_attention_heads, attention_head_size, all_head_size and dropout, plus query_quan, key_quan, value_quan, attn_quan and after_attn_quan quantizers with s, sym, pos_max, neg_min and level attributes. ViT-style attention modules must expose qkv, proj, quan_q, quan_k, quan_v, attn_quan, after_attn_quan, quan_proj, num_heads, head_dim, scale, attn_drop and proj_drop.

RoBERTa attention masks are added only at the first timestep; afterwards the accumulated state of the temporal-difference softmax carries the effect of the static mask. If a quantizer does not expose level explicitly, the recipe infers it as pos_max - neg_min + 1.

Parameters:

time_steps (int) -- Timestep metadata stored on the converted model. Users must still explicitly construct single-step loops or multi-step input sequences; the recipe does not encode inputs or control step_mode internally. It should be no smaller than the QANN quantization level; otherwise some bias terms or residual membrane charge may not be fully emitted.
model_family (str) -- Model family. Supported values are "roberta" and "vit".
strict (bool) -- Must be True. The parameter is reserved so that the supported boundary can be explicitly relaxed in the future.
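
The level fallback and the time_steps recommendation above can be written out as a short check. This is only a sketch: infer_level and time_steps_cover_level are hypothetical helpers, the quantizers are plain SimpleNamespace objects standing in for real quantizer modules, and treating level=None the same as a missing attribute is an assumption.

```python
from types import SimpleNamespace
from typing import Any, Iterable


def infer_level(quantizer: Any) -> int:
    """Use the explicit level if present, else fall back to pos_max - neg_min + 1."""
    level = getattr(quantizer, "level", None)
    if level is not None:
        return int(level)
    return int(quantizer.pos_max - quantizer.neg_min + 1)


def time_steps_cover_level(time_steps: int, quantizers: Iterable[Any]) -> bool:
    """True when time_steps is no smaller than every quantizer's level."""
    return all(time_steps >= infer_level(q) for q in quantizers)


# A signed 4-bit style range without an explicit level, and one explicit level.
q_implicit = SimpleNamespace(pos_max=7, neg_min=-8)
q_explicit = SimpleNamespace(pos_max=7, neg_min=-8, level=32)
```

With these stand-ins, the implicit quantizer resolves to 16 levels, so a time_steps value of 16 would satisfy it but not the explicit 32-level quantizer.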

convert_module(converter, ann)

Parameters:

converter (ModuleConverter)
ann (nn.Module)

Return type:

nn.Module

validate(converter)

Parameters: converter (ModuleConverter)
Return type: None

class spikingjelly.activation_based.ann2snn.recipes.STATransformerRecipe(dataloader=None, time_steps=32, mode='equivalent', threshold_mode='mse', threshold_scale=1.0, spike_linear=None, spike_conv2d=None, spike_classifier=False, momentum=0.1, num_calibration_batches=None, show_progress=False, eps=1e-06)

Bases: FXConversionRecipe

Implement a training-free Transformer conversion recipe based on Spatio-Temporal Approximation (STA) [1]. The recipe replaces Transformer operators such as Linear, Conv2d, LayerNorm, GELU and MultiheadAttention with step-mode modules that support STA differential temporal propagation. Conversion traces and calibrates the original ANN in eval mode; Converter restores the original ANN's training flags after conversion finishes.

mode="equivalent" is the default online cumulative-difference baseline: Linear, Conv2d, LayerNorm, GELU, MultiheadAttention and FX tensor constants stay equivalent, timestep by timestep, to the cumulative output of the original ANN. This mode establishes the Transformer graph shape and the model-level acceptance baseline; it does not claim to be fully spike-driven.

mode="spiking_encoder" inserts calibrated stateful spike encoders after LayerNorm, GELU and MultiheadAttention outputs while keeping the main affine projections online-equivalent. mode="spiking_affine" calibrates thresholds for the selected Linear/Conv2d modules and replaces them with stateful bipolar IF / burst affine modules; LayerNorm, GELU and MultiheadAttention remain online cumulative-difference floating-point modules. The current step-mode-aligned backend does not yet support mode="spiking_affine", spike_linear=True or spike_conv2d=True; these configurations raise an explicit error. time_steps is used by threshold search and by the multi-step expansion of graph constants, so it is part of the conversion recipe rather than only a parameter of an external evaluation loop.

The converted model is a plain nn.Module / fx.GraphModule. Users call spikingjelly.activation_based.functional.set_step_mode() to recursively configure the internal step-mode modules. With step_mode="s", users call the converted model once per timestep themselves. With step_mode="m", the model consumes a sequence tensor whose dimension 0 is time and returns an output sequence. The final accumulated readout is performed explicitly by the user, for example by summing over the time dimension.

Parameters:

dataloader (Optional[Iterable]) -- Calibration dataloader. Each batch can be a single-input tensor, an (input, target)-style tuple/list, or a kwargs dict passed to the model. mode="equivalent" skips calibration by default and can use None; explicitly enabling spike_linear or spike_conv2d still requires a dataloader.
time_steps (int) -- Number of STA internal inference timesteps. It is also used by threshold search.
mode (str) -- Conversion mode. Supported values are "equivalent", "spiking_encoder" and "spiking_affine".
threshold_mode (str) -- Threshold statistics mode. Supported values are "mse" and "max".
threshold_scale (float) -- Positive scale factor applied to calibrated thresholds.
spike_linear (Optional[bool]) -- Whether to replace Linear with spiking affine modules. If None, it is enabled when mode="spiking_affine".
spike_conv2d (Optional[bool]) -- Whether to replace Conv2d with spiking affine modules. If None, it is disabled.
spike_classifier (bool) -- Whether to also convert the classifier head when spike_linear=True.
momentum (float) -- Momentum used by threshold observers.
num_calibration_batches (Optional[int]) -- Maximum number of calibration batches; None uses the full dataloader.
show_progress (bool) -- Whether to show a calibration progress bar.
eps (float) -- Numeric lower bound for thresholds.

Raises:

ValueError -- If validation finds an unsupported conversion mode, an unsupported threshold mode, a non-positive timestep count, a non-positive scale factor, an invalid momentum, an invalid calibration batch limit, an unsupported mode combination, or a boolean option of the wrong type.
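
One way to picture the online cumulative-difference idea, and the difference between single-step and multi-step calls, is a scalar toy module that emits the change in f(running input sum) at each timestep, so that summing its outputs over time recovers f of the summed input. This is an interpretation of the description above, not the library's modules: CumulativeDifference, gelu, single_step and multi_step are hypothetical names, and plain floats and lists stand in for tensors.

```python
import math
from typing import Callable, List


def gelu(x: float) -> float:
    """Exact (erf-based) GELU on a scalar."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


class CumulativeDifference:
    """Toy stateful module: outputs f(sum of inputs so far) minus the previous f value."""

    def __init__(self, fn: Callable[[float], float]) -> None:
        self.fn = fn
        self.reset()

    def reset(self) -> None:
        self.input_sum = 0.0
        self.prev_output = 0.0

    def single_step(self, x_t: float) -> float:
        # Analogue of step_mode="s": the caller invokes this once per timestep.
        self.input_sum += x_t
        current = self.fn(self.input_sum)
        delta = current - self.prev_output
        self.prev_output = current
        return delta

    def multi_step(self, x_seq: List[float]) -> List[float]:
        # Analogue of step_mode="m": the first dimension of the input is time.
        return [self.single_step(x_t) for x_t in x_seq]


x_seq = [0.4, -0.1, 0.7, 0.2]
module = CumulativeDifference(gelu)
y_seq = module.multi_step(x_seq)
readout = sum(y_seq)  # explicit accumulated readout over the time dimension
```

The per-step outputs telescope, so readout matches gelu(sum(x_seq)) up to floating-point rounding, and the state must be reset before the module is reused on a new sequence.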

before_trace(converter, ann)

Parameters:

converter (Converter)
ann (nn.Module)

Return type:

nn.Module

calibrate(converter, fx_model)

Parameters:

converter (Converter)
fx_model (fx.GraphModule)

Return type:

fx.GraphModule

finalize(converter, fx_model)

Parameters:

converter (Converter)
fx_model (fx.GraphModule)

Return type:

nn.Module

insert_observers(converter, fx_model)

Parameters:

converter (Converter)
fx_model (fx.GraphModule)

Return type:

fx.GraphModule

replace(converter, fx_model)

Parameters:

converter (Converter)
fx_model (fx.GraphModule)

Return type:

fx.GraphModule

validate(converter)

Parameters: converter (Converter)
Return type: None

class spikingjelly.activation_based.ann2snn.recipes.TransformerTDEquivalentRecipe(time_steps=None)

Bases: FXConversionRecipe

Transformer TD-equivalent operator replacement recipe. This recipe does not insert observers, does not run dataloader calibration, and does not force a change of the model's train/eval mode. It only replaces the currently supported ANN core modules and a narrow subset of attention patterns with TD-equivalent operators.

Parameters: time_steps (int | None)

finalize(converter, fx_model)

Parameters:

converter (Converter)
fx_model (fx.GraphModule)

Return type:

nn.Module

replace(converter, fx_model)

Replace the currently supported Transformer core modules, SDPA calls and a narrow subset of MultiheadAttention calls with TD-equivalent operators. This step does not insert observers or run rate-coding calibration.

Parameters:

converter (Converter) -- Converter that executes this recipe.
fx_model (GraphModule) -- Traced GraphModule.

Returns:

GraphModule after replacement.

Return type:

GraphModule

Raises:

ValueError -- If an attention call or configuration is outside the currently supported subset.

validate(converter)

Parameters: converter (Converter)
Return type: None

Rate-Coding Rules, Factories, and Thresholds

These modules support the ReLU-to-spiking-neuron path used by rate-coding recipes, primarily RateCodingRecipe and LocalThresholdBalancingRecipe. ActivationRule / ReLURule match FX activation nodes, insert calibration hooks, and replace calibrated activations with spiking-neuron subgraphs. HookFactory, NeuronFactory, and ThresholdOptimizer are the matching calibration and construction utilities.

Transformer TD-equivalent, STA Transformer, and SpikeZIP conversions do not use this graph-rule interface. They implement their own recipe-specific operator or module replacement logic.

class spikingjelly.activation_based.ann2snn.rules.ActivationRule(*args, **kwargs)

Bases: Protocol

Protocol for activation-to-neuron conversion rules. Implement this protocol to plug a new ANN→SNN algorithm into the converter. A rule must:

decide whether it handles a given fx.Node via match();
insert a calibration hook after the node via insert_hooks();
enumerate the (activation_node, hook_node) pairs to replace via find_replacements();
replace the activation and hook pair with a spiking-neuron structure via replace_with_neurons().

match(node, modules)

Return True if this rule handles the given graph node.

Parameters:

node (fx.Node) -- The fx.Node to check.
modules (Dict[str, nn.Module]) -- Module-name dictionary obtained from fx.GraphModule.named_modules().

Returns:

True if this rule handles the node.

Return type:

bool

insert_hooks(fx_model, node, hook_factory, hook_counts_per_prefix)

Insert a calibration hook created by hook_factory after node and register the new node inside fx_model. hook_counts_per_prefix is used to generate unique hook target names when multiple hooks are inserted.

Parameters:

fx_model (fx.GraphModule) -- The GraphModule to modify.
node (fx.Node) -- The fx.Node after which the hook is inserted.
hook_factory (HookFactory) -- Hook factory used to build the calibration hook.
hook_counts_per_prefix (Dict[str, int]) -- Per-prefix counters used to build unique hook target names.

Returns:

The newly inserted hook node.

Return type:

fx.Node

find_replacements(fx_model, modules)

Iterate over fx_model and yield the (activation_node, hook_node) pairs to replace. Rules for non-standard graph patterns should override this method with their own traversal.

Parameters:

fx_model (fx.GraphModule) -- GraphModule with calibration hooks already inserted.
modules (Dict[str, nn.Module]) -- Module-name dictionary obtained from fx.GraphModule.named_modules().

Returns:

Iterator of (activation_node, hook_node) pairs.

Return type:

Iterator[Tuple[fx.Node, fx.Node]]

replace_with_neurons(fx_model, activation_node, hook_node, neuron_factory, threshold_optimizer)

Replace the activation and hook pair with the corresponding spiking-neuron structure. The threshold is computed by threshold_optimizer from the hook's calibration data; the neuron is built by neuron_factory.

Parameters:

fx_model (fx.GraphModule) -- The GraphModule to modify.
activation_node (fx.Node) -- The activation node.
hook_node (fx.Node) -- The calibration hook node.
neuron_factory (NeuronFactory) -- Spiking-neuron factory.
threshold_optimizer (ThresholdOptimizer) -- Threshold optimizer.

Return type:

None
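
Because ActivationRule is a Protocol, a custom rule only needs to provide the four methods; it does not have to inherit from anything. The sketch below mirrors the documented method names with typing.Protocol. ActivationRuleSketch and NoOpRule are hypothetical stand-ins, the argument types are loosened to Any so the example runs without torch, and the runtime check only confirms that the methods exist, not that their signatures match.

```python
from typing import Any, Dict, Iterator, Protocol, Tuple, runtime_checkable


@runtime_checkable
class ActivationRuleSketch(Protocol):
    """Stand-in protocol with the four documented method names."""

    def match(self, node: Any, modules: Dict[str, Any]) -> bool: ...

    def insert_hooks(self, fx_model: Any, node: Any, hook_factory: Any,
                     hook_counts_per_prefix: Dict[str, int]) -> Any: ...

    def find_replacements(self, fx_model: Any,
                          modules: Dict[str, Any]) -> Iterator[Tuple[Any, Any]]: ...

    def replace_with_neurons(self, fx_model: Any, activation_node: Any, hook_node: Any,
                             neuron_factory: Any, threshold_optimizer: Any) -> None: ...


class NoOpRule:
    """A rule that matches nothing; it satisfies the protocol structurally."""

    def match(self, node: Any, modules: Dict[str, Any]) -> bool:
        return False

    def insert_hooks(self, fx_model: Any, node: Any, hook_factory: Any,
                     hook_counts_per_prefix: Dict[str, int]) -> Any:
        raise RuntimeError("NoOpRule never matches, so no hook is inserted")

    def find_replacements(self, fx_model: Any,
                          modules: Dict[str, Any]) -> Iterator[Tuple[Any, Any]]:
        return iter(())

    def replace_with_neurons(self, fx_model: Any, activation_node: Any, hook_node: Any,
                             neuron_factory: Any, threshold_optimizer: Any) -> None:
        return None
```

Here isinstance(NoOpRule(), ActivationRuleSketch) holds even though NoOpRule has no base class other than object.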

class spikingjelly.activation_based.ann2snn.rules.ReLURule

Bases: object

Conversion rule for nn.ReLU modules. It reproduces the original SpikingJelly behaviour: each nn.ReLU is replaced by VoltageScaler(1/s) -> IFNode -> VoltageScaler(s), where s is computed by ThresholdOptimizer from the VoltageHook calibration data.
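
A scalar simulation shows why the VoltageScaler(1/s) -> IFNode -> VoltageScaler(s) chain approximates ReLU when its output is averaged over time. This is a sketch, not the library's modules: rate_coded_relu is a hypothetical function, the same input is fed at every timestep, the neuron uses threshold 1.0 with soft (subtractive) reset as in the default NeuronFactory settings, and firing when the membrane potential is greater than or equal to the threshold is an assumption.

```python
def rate_coded_relu(x: float, s: float, time_steps: int) -> float:
    """Time-averaged output of scale(1/s) -> soft-reset IF neuron -> scale(s)."""
    v = 0.0       # membrane potential
    total = 0.0   # accumulated scaled output
    for _ in range(time_steps):
        v += x / s                        # first scaler divides the input by s
        spike = 1.0 if v >= 1.0 else 0.0  # fire at threshold 1.0 (assumed >=)
        v -= spike                        # soft reset: subtract the threshold
        total += spike * s                # second scaler multiplies the spike by s
    return total / time_steps
```

With s = 2.0 and 8 timesteps, an input of 0.5 produces two spikes and an average of 0.5, a negative input never fires and gives 0.0, and an input of 3.0 fires at every step and saturates at s, so s acts as the clipping level of the approximated ReLU.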

match(node, modules)

Parameters:

node (Node)
modules (Dict[str, Module])

Return type:

bool

insert_hooks(fx_model, node, hook_factory, hook_counts_per_prefix)

Parameters:

fx_model (GraphModule)
node (Node)
hook_factory (HookFactory)
hook_counts_per_prefix (Dict[str, int])

Return type:

Node

find_replacements(fx_model, modules)

Parameters:

fx_model (GraphModule)
modules (Dict[str, Module])

Return type:

Iterator[Tuple[Node, Node]]

replace_with_neurons(fx_model, activation_node, hook_node, neuron_factory, threshold_optimizer)

Parameters:

fx_model (GraphModule)
activation_node (Node)
hook_node (Node)
neuron_factory (NeuronFactory)
threshold_optimizer (ThresholdOptimizer)

Return type:

None

class spikingjelly.activation_based.ann2snn.factories.NeuronFactory(neuron_type=<class 'spikingjelly.activation_based.neuron.integrate_and_fire.IFNode'>, v_threshold=1.0, v_reset=None, **kwargs)

Bases: object

Factory that creates the spiking-neuron modules used to replace ANN activation functions. By default it instantiates spikingjelly.activation_based.neuron.IFNode with v_threshold=1.0 and v_reset=None to preserve the original ANN2SNN behaviour. The default conversion handles the activation scale with VoltageScaler, so the default factory does not write scale into the neuron threshold. Custom factories may derive thresholds or other neuron parameters from scale.

Parameters:

neuron_type (Type[nn.Module]) -- Neuron class to instantiate. Must accept v_threshold and v_reset keyword arguments. Defaults to spikingjelly.activation_based.neuron.IFNode.
v_threshold (float) -- Firing threshold passed to the neuron constructor.
v_reset (Optional[float]) -- Membrane reset value. None means soft reset (subtractive reset). Defaults to None.
kwargs -- Additional keyword arguments forwarded to the neuron constructor.

static neuron_kwargs_reserved_keys()

Return type: set[str]

create(scale)

Instantiate a spiking-neuron module with the configured parameters. scale is the calibrated activation scale of the current layer; the default implementation does not use it directly, but subclasses can derive thresholds or other neuron parameters from it.

Parameters: scale (float) -- Calibration scale for the layer.
Returns: A configured spiking-neuron module.
Return type: nn.Module
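
The difference between the default create(scale) and a subclass that derives the threshold from scale can be sketched with stand-in classes. None of these are library classes: ToyIFNode, ToyNeuronFactory and ScaledThresholdFactory are hypothetical and only mirror the constructor arguments described above.

```python
from typing import Any, Optional, Type


class ToyIFNode:
    """Stand-in neuron that just records its constructor arguments."""

    def __init__(self, v_threshold: float = 1.0, v_reset: Optional[float] = None,
                 **kwargs: Any) -> None:
        self.v_threshold = v_threshold
        self.v_reset = v_reset
        self.extra = kwargs


class ToyNeuronFactory:
    """Mirrors the documented defaults: scale is accepted but not used."""

    def __init__(self, neuron_type: Type[ToyIFNode] = ToyIFNode, v_threshold: float = 1.0,
                 v_reset: Optional[float] = None, **kwargs: Any) -> None:
        self.neuron_type = neuron_type
        self.v_threshold = v_threshold
        self.v_reset = v_reset
        self.kwargs = kwargs

    def create(self, scale: float) -> ToyIFNode:
        return self.neuron_type(v_threshold=self.v_threshold,
                                v_reset=self.v_reset, **self.kwargs)


class ScaledThresholdFactory(ToyNeuronFactory):
    """Hypothetical custom factory: multiplies the base threshold by the layer scale."""

    def create(self, scale: float) -> ToyIFNode:
        return self.neuron_type(v_threshold=self.v_threshold * scale,
                                v_reset=self.v_reset, **self.kwargs)
```

A factory like ScaledThresholdFactory moves the activation scale into the neuron itself, so whether it should be combined with a rule that also wraps the neuron in VoltageScaler is a design decision the sketch leaves open.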

class spikingjelly.activation_based.ann2snn.factories.HookFactory(mode='Max', momentum=0.1)[源代码]#

基类：object

API Language - 中文 | English

中文

用于创建校准阶段使用的 VoltageHook 实例。每个匹配到的激活节点会获得独立的 hook 实例。

参数:

mode (str, float) -- 校准模式，传递给 VoltageHook。"Max" 记录激活最大值； "99.9%" 记录 99.9 分位点；(0, 1] 区间的 float 表示 max * mode。
momentum (float) -- VoltageHook 的 EMA 动量。

English

Factory that creates VoltageHook instances used during calibration. Each matched activation node receives an independent hook instance.

参数:

mode (str, float) -- Calibration mode forwarded to VoltageHook. "Max" records the maximum activation; "99.9%" records the 99.9th percentile; a float in (0, 1] records max * mode.
momentum (float) -- EMA momentum for VoltageHook.
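
The three documented modes can be illustrated with a small PyTorch sketch. The helper calibration_scale below is hypothetical and is not part of SpikingJelly; it only mirrors the description above ("Max", a percentile string, or a float in (0, 1]) and leaves out the EMA update controlled by momentum. Using torch.quantile for the percentile is an assumption of this sketch, not a statement about how VoltageHook computes it.

```python
import torch


def calibration_scale(x: torch.Tensor, mode="Max") -> torch.Tensor:
    """Hypothetical helper: one activation statistic per documented mode."""
    if isinstance(mode, str):
        if mode.endswith("%"):
            # "99.9%" -> 0.999 quantile over all activation values.
            q = float(mode[:-1]) / 100.0
            return torch.quantile(x.detach().flatten().float(), q)
        if mode.lower() == "max":
            return x.detach().max()
        raise ValueError(f"unsupported mode string: {mode!r}")
    if isinstance(mode, float) and 0.0 < mode <= 1.0:
        # A float in (0, 1] scales the maximum activation.
        return x.detach().max() * mode
    raise ValueError(f"unsupported mode: {mode!r}")


# Toy activations 0, 1, ..., 1000 so the statistics are easy to read.
acts = torch.arange(1001, dtype=torch.float32)
scale_max = calibration_scale(acts, "Max")
scale_pct = calibration_scale(acts, "99.9%")
scale_frac = calibration_scale(acts, 0.5)
```

The percentile mode is less sensitive to a single outlier activation than "Max", which is the usual reason to choose it during calibration.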

create()[源代码]#

API Language - 中文 | English

中文

创建一个新的 VoltageHook 实例。

返回:: 配置完成的 VoltageHook。
返回类型:: VoltageHook

English

Create a new VoltageHook instance.

返回:: A configured VoltageHook.
返回类型:: VoltageHook

class spikingjelly.activation_based.ann2snn.threshold.ThresholdOptimizer(strategy='fixed')[源代码]#

基类：object

API Language - 中文 | English

中文

阈值优化器。根据 VoltageHook 在校准阶段记录的 scale 计算当前层的神经元阈值。当前内置策略：

"fixed": 阈值等于校准 scale （默认，等价于 SpikingJelly 原有行为）。

其他策略需通过子类化并重写 compute_threshold() 实现；基类可接受任意策略名，但只有 "fixed" 在基类中真正生效。

参数:: strategy (str) -- 阈值计算策略名称。

English

Threshold optimizer. Computes the neuron threshold for a layer from the scale recorded by VoltageHook during calibration. Built-in strategy:

"fixed": threshold equals the calibrated scale (default, matches the original SpikingJelly behaviour).

Additional strategies should be implemented by subclassing and overriding compute_threshold(). The base class accepts any strategy name but only implements "fixed" itself.

参数:: strategy (str) -- Name of the threshold computation strategy.

compute_threshold(hook)[源代码]#

API Language - 中文 | English

中文

返回当前层对应的脉冲神经元阈值。当前仅在 strategy="fixed" 时直接返回 hook 中记录的 scale；其他策略由子类实现。

参数:

hook (VoltageHook) -- 已完成校准的 VoltageHook，其 scale 属性保存激活范围统计量。

返回:

神经元阈值。

返回类型:

float

抛出:

NotImplementedError -- 当 strategy 不是已实现策略时抛出。
AttributeError -- 当 hook 不包含 scale 属性时抛出。

English

Return the spiking-neuron threshold for the layer represented by hook. With strategy="fixed" this returns the scale stored in the hook; other strategies should be implemented by subclasses.

参数:

hook (VoltageHook) -- A calibrated VoltageHook whose scale attribute holds the activation range statistic.

返回:

The neuron threshold.

返回类型:

float

抛出:

NotImplementedError -- If strategy is not implemented.
AttributeError -- If hook does not expose a scale attribute.
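
The subclassing pattern described above can be sketched with stand-in classes. Everything in this block is hypothetical: ToyThresholdOptimizer imitates the documented behaviour of the base class, ScaledThresholdOptimizer and its "scaled" strategy are invented for illustration, and the hook is a plain namespace with a scale attribute rather than a real VoltageHook.

```python
from types import SimpleNamespace


class ToyThresholdOptimizer:
    """Hypothetical stand-in that imitates the documented base-class behaviour."""

    def __init__(self, strategy="fixed"):
        # Any strategy name is accepted at construction time.
        self.strategy = strategy

    def compute_threshold(self, hook) -> float:
        if self.strategy == "fixed":
            # Raises AttributeError if the hook has no scale attribute.
            return float(hook.scale)
        raise NotImplementedError(f"strategy {self.strategy!r} is not implemented")


class ScaledThresholdOptimizer(ToyThresholdOptimizer):
    """Hypothetical subclass adding a 'scaled' strategy (illustration only)."""

    def __init__(self, strategy="scaled", factor=0.5):
        super().__init__(strategy)
        self.factor = factor

    def compute_threshold(self, hook) -> float:
        if self.strategy == "scaled":
            return self.factor * float(hook.scale)
        # Fall back to the base-class strategies.
        return super().compute_threshold(hook)


hook = SimpleNamespace(scale=4.0)  # stand-in for a calibrated hook
fixed = ToyThresholdOptimizer().compute_threshold(hook)
scaled = ScaledThresholdOptimizer(factor=0.5).compute_threshold(hook)
```

Delegating to super().compute_threshold() keeps "fixed" available in the subclass and preserves the NotImplementedError for names neither class handles.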

Stateful Operators and Runtime Modules#

Temporal-difference (TD) operators in spikingjelly.activation_based.ann2snn.operators follow stateful SpikingJelly step-mode semantics:

ann_forward(...) runs the ordinary stateless ANN/PyTorch operation and does not read or write module memory.
step_mode="s" / single_step_forward(...) consumes one differential time step, updates the cumulative memory, and returns one differential output.
step_mode="m" / multi_step_forward(...) consumes a complete sequence whose first dimension is time, uses vectorized cumulative-sum / temporal-difference execution where implemented, and leaves the final cumulative memory in place after the call.
Call reset() before starting an independent sequence.

Consequently, single_step_forward is not the ordinary ANN forward path for TD operators, because it reads and updates the cumulative memory. Use ann_forward when comparing a TD module with the source PyTorch module on a single non-temporal input.
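
The relationship between the three paths can be sketched in plain PyTorch. ToyTDGELU below is a hypothetical stand-in, not the library class: it keeps the cumulative input and output as attributes, and its multi-step path assumes reset() was called beforehand, which is a simplification of this sketch.

```python
import torch
import torch.nn.functional as F


class ToyTDGELU:
    """Hypothetical stand-in for a TD operator built around GELU."""

    def __init__(self):
        self.reset()

    def reset(self):
        self.x_cum = None  # cumulative input seen so far
        self.y_cum = None  # GELU of the cumulative input

    def ann_forward(self, x):
        # Stateless path: never touches the memory.
        return F.gelu(x)

    def single_step_forward(self, dx):
        x_cum = dx if self.x_cum is None else self.x_cum + dx
        y_cum = F.gelu(x_cum)
        dy = y_cum if self.y_cum is None else y_cum - self.y_cum
        self.x_cum, self.y_cum = x_cum, y_cum
        return dy

    def multi_step_forward(self, dx_seq):
        # Assumes a fresh state (reset() called before the sequence).
        x_cum = dx_seq.cumsum(dim=0)
        y_cum = F.gelu(x_cum)
        dy_seq = torch.cat([y_cum[:1], y_cum[1:] - y_cum[:-1]], dim=0)
        self.x_cum, self.y_cum = x_cum[-1], y_cum[-1]
        return dy_seq


torch.manual_seed(0)
dx_seq = torch.randn(4, 2, 3)  # time dimension first, T = 4

op = ToyTDGELU()
dy_multi = op.multi_step_forward(dx_seq)

op.reset()  # start an independent pass over the same sequence
dy_single = torch.stack([op.single_step_forward(dx) for dx in dx_seq])

# Only the first step after reset() coincides with the stateless ANN path.
first_step_matches_ann = torch.allclose(dy_single[0], op.ann_forward(dx_seq[0]))
```

Skipping reset() between the two passes would make the second pass continue from the first pass's memory, which is why the reset call is part of the contract.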

class spikingjelly.activation_based.ann2snn.operators.TDModule(step_mode='m')[源代码]#

基类：MemoryModule

API Language

中文

Temporal-difference / sequence-preserving 算子的基类。该类继承 spikingjelly.activation_based.base.MemoryModule，使用 step_mode、memory 和 reset 语义实现 TD 状态传播。

step_mode="s" 时，输入被解释为当前差分时间步，模块更新内部累积 memory 并返回当前差分输出；step_mode="m" 时，输入第 0 维被解释为时间维，模块返回完整差分序列并保留最终 memory。普通 ANN/PyTorch 数值路径由 ann_forward() 提供，且不读写 memory。子类必须实现 ann_forward() 和 multi_step_forward()。子类 __init__ 应调用 super().__init__(step_mode) 初始化步进模式。

参数:: step_mode (str) -- 步进模式，"s" 或 "m"。默认 "m" 保持既有 TD operator 行为。
抛出:: ValueError -- 当 step_mode 非法时，由 StepModule 的 setter 抛出；若子类绕过 setter 写入非法模式，forward 也会抛出。

English

Base class for temporal-difference / sequence-preserving operators. It inherits spikingjelly.activation_based.base.MemoryModule and uses step_mode, memory, and reset semantics for TD state propagation.

With step_mode="s", inputs are interpreted as the current differential time step; the module updates its cumulative memory and returns the current differential output. With step_mode="m", dimension 0 is interpreted as the time dimension; the module returns a full differential sequence and keeps the final memory. The ordinary ANN/PyTorch numeric path is exposed by ann_forward() and does not read or write memory. Subclasses must implement ann_forward() and multi_step_forward(). Subclass __init__ methods should call super().__init__(step_mode) to initialize the step mode.

参数:: step_mode (str) -- Step mode, "s" or "m". The default "m" preserves existing TD operator behavior.
抛出:: ValueError -- Raised by StepModule's setter when step_mode is invalid; forward also raises if a subclass bypasses the setter and writes an invalid mode.

ann_forward(*args, **kwargs)[源代码]#

multi_step_forward(*args, **kwargs)[源代码]#

single_step_forward(*args, **kwargs)[源代码]#

class spikingjelly.activation_based.ann2snn.operators.TDSoftmax(dim=-1, step_mode='m')[源代码]#

基类：TDModule

API Language

中文

Temporal-difference (TD) Softmax 算子。step_mode="m" 时输入必须是完整时间序列，时间维固定为第 0 维，形状为 [T, ...]；模块先对输入在时间维做累积，再沿 dim 计算 torch.softmax，最后返回累积输出在时间维上的差分。 step_mode="s" 时输入被解释为当前差分时间步，模块更新内部累积 memory 并返回当前差分输出；普通 torch.softmax 路径由 ann_forward() 提供。

返回值是浮点差分值，可能包含负值；它不是二值脉冲，也不表示 fully spike-driven Softmax。输出 dtype 与输入 dtype 相同；推荐使用 float32、float16 或 float64 输入。该算子完全由 PyTorch 可微算子组成，对 autograd 透明。

该算子的机制来源于 SpikeZIP-TF: Conversion is All You Need for Transformer-based SNN 中对 Transformer 非线性算子的累积-差分等价转换思路。本文档中的 TD Softmax 只实现张量级算子：在多步模式下它仍调用 torch.softmax，需要完整时间序列输入，不是逐时间步在线算子，也不是面向神经形态硬件的 fully spike-driven Softmax。

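import torch
from spikingjelly.activation_based.ann2snn.operators import TDSoftmax
op = TDSoftmax(dim=-1)  # step_mode defaults to "m"
x_seq = torch.randn(4, 2, 3)  # time dimension first, T = 4
y_seq = op(x_seq)  # differential sequence with the same shape as x_seq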

参数:

dim (int) -- Softmax 归一化维度。step_mode="m" 时不能为第 0 维，因为第 0 维保留为时间维；step_mode="s" 时作用在当前差分时间步的对应维度。
step_mode (str) -- 步进模式，"s" 或 "m"。默认 "m"。

English

Temporal-difference (TD) Softmax operator. With step_mode="m", the input must be a complete time sequence whose time dimension is fixed at dimension 0, with shape [T, ...]. This module first accumulates the input over time, applies torch.softmax along dim to each cumulative input, and returns the temporal difference of the cumulative outputs. With step_mode="s", the input is interpreted as the current differential time step; the module updates its cumulative memory and returns the current differential output. The ordinary torch.softmax path is exposed by ann_forward().

The output contains floating-point differential values and may contain negative values. It is not a binary spike tensor and does not represent a fully spike-driven Softmax. The output dtype matches the input dtype; float32, float16 and float64 inputs are recommended. The operator is composed entirely of differentiable PyTorch operations and is transparent to autograd.

The mechanism follows the cumulative-difference equivalence idea for Transformer nonlinear operators in SpikeZIP-TF: Conversion is All You Need for Transformer-based SNN. This implementation provides only a tensor-level operator: in multi-step mode it still calls torch.softmax, requires a complete time sequence, is not a step-wise online operator, and is not a fully spike-driven Softmax for neuromorphic hardware.

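import torch
from spikingjelly.activation_based.ann2snn.operators import TDSoftmax
op = TDSoftmax(dim=-1)  # step_mode defaults to "m"
x_seq = torch.randn(4, 2, 3)  # time dimension first, T = 4
y_seq = op(x_seq)  # differential sequence with the same shape as x_seq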

参数:

dim (int) -- Softmax normalization dimension. With step_mode="m", it must not be dimension 0, which is reserved as the time dimension. With step_mode="s", it applies to the corresponding dimension of the current differential time step.
step_mode (str) -- Step mode, "s" or "m". The default is "m".

ann_forward(x)[源代码]#

参数:: x (Tensor)
返回类型:: Tensor

multi_step_forward(x_seq)[源代码]#

API Language

中文

对完整时间序列执行 TD Softmax。计算过程为：

\[X_{cum}[t] = \sum_{i=0}^{t} X[i]\]

\[Y_{cum}[t] = \operatorname{Softmax}(X_{cum}[t])\]

\[Y[0] = Y_{cum}[0], \quad Y[t] = Y_{cum}[t] - Y_{cum}[t-1]\]

因此 Y.cumsum(dim=0) 与对 X.cumsum(dim=0) 逐时间步执行 ANN Softmax 的结果一致。输出是浮点差分值，可能为负，不是二值脉冲。当 T = 1 时，Y[0] 直接等于 torch.softmax(X[0], dim=dim)。输出 dtype 与输入 dtype 相同，且该算子对 autograd 透明。

参数:: x_seq (Tensor) -- 输入时间序列，形状为 [T, ...]，且 T > 0。
返回:: TD Softmax 差分序列，形状与 x_seq 相同。
返回类型:: Tensor
抛出:: ValueError -- 若 x_seq 少于 2 维、时间维为空，或 dim 指向时间维。

English

Apply TD Softmax to a complete time sequence:

\[X_{cum}[t] = \sum_{i=0}^{t} X[i]\]

\[Y_{cum}[t] = \operatorname{Softmax}(X_{cum}[t])\]

\[Y[0] = Y_{cum}[0], \quad Y[t] = Y_{cum}[t] - Y_{cum}[t-1]\]

Thus, Y.cumsum(dim=0) matches ANN Softmax applied to X.cumsum(dim=0) at each time step. The output contains floating-point differential values, may be negative, and is not a binary spike tensor. When T = 1, Y[0] is exactly torch.softmax(X[0], dim=dim). The output dtype matches the input dtype, and the operator is transparent to autograd.

参数:: x_seq (Tensor) -- Input time sequence with shape [T, ...] and T > 0.
返回:: TD Softmax differential sequence with the same shape as x_seq.
返回类型:: Tensor
抛出:: ValueError -- If x_seq has fewer than 2 dimensions, the time dimension is empty, or dim refers to the time dimension.
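
The three equations above can be checked with a plain PyTorch reference. The function td_softmax_reference is a hypothetical helper written for this illustration; it is not the library implementation and does not perform the input validation described under the raised exceptions.

```python
import torch


def td_softmax_reference(x_seq: torch.Tensor, dim: int = -1) -> torch.Tensor:
    """Hypothetical reference: difference of softmax over the cumulative input."""
    x_cum = x_seq.cumsum(dim=0)                # X_cum[t]
    y_cum = torch.softmax(x_cum, dim=dim)      # Y_cum[t]
    # Y[0] = Y_cum[0], Y[t] = Y_cum[t] - Y_cum[t-1]
    return torch.cat([y_cum[:1], y_cum[1:] - y_cum[:-1]], dim=0)


torch.manual_seed(0)
x_seq = torch.randn(4, 2, 3)
y_seq = td_softmax_reference(x_seq, dim=-1)

recovered = y_seq.cumsum(dim=0)
expected = torch.softmax(x_seq.cumsum(dim=0), dim=-1)

# Each Y_cum[t] sums to 1 along dim, so Y[0] sums to 1 and later steps sum to 0.
step_sums = y_seq.sum(dim=-1)
```

Because every later step sums to zero along the softmax dimension, those steps necessarily mix positive and negative entries, which is why the output cannot be read as a spike tensor.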

extra_repr()[源代码]#

返回类型:: str

class spikingjelly.activation_based.ann2snn.operators.TDLayerNorm(normalized_shape, eps=1e-05, elementwise_affine=True, bias=True, device=None, dtype=None, step_mode='m')[源代码]#

基类：TDModule

API Language

中文

Temporal-difference (TD) LayerNorm 算子。step_mode="m" 时输入必须是完整时间序列，时间维固定为第 0 维，形状为 [T, ...]；模块先对输入在时间维做累积，再对每个累积输入执行 torch.nn.functional.layer_norm()，最后返回累积输出在时间维上的差分。step_mode="s" 时输入被解释为当前差分时间步，模块更新内部累积 memory 并返回当前差分输出；普通 LayerNorm 路径由 ann_forward() 提供。

返回值是浮点差分值，可能包含负值；它不是二值脉冲，也不表示 fully spike-driven LayerNorm。输出 dtype 与输入 dtype 相同；推荐使用 float32、float16 或 float64 输入。该算子完全由 PyTorch 可微算子组成，对 autograd 透明。该算子是 stateful TD MemoryModule；重复处理独立序列前应调用 reset。

该算子的机制来源于 SpikeZIP-TF: Conversion is All You Need for Transformer-based SNN 中对 Transformer 非线性算子的累积-差分等价转换思路。本文档中的 TD LayerNorm 只实现张量级算子：在多步模式下它仍调用 torch.nn.functional.layer_norm()，需要完整时间序列输入，不是逐时间步在线算子，也不是面向神经形态硬件的 fully spike-driven LayerNorm。

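import torch
from spikingjelly.activation_based.ann2snn.operators import TDLayerNorm
op = TDLayerNorm(normalized_shape=3)  # normalizes the trailing dimension of size 3
x_seq = torch.randn(4, 2, 3)  # time dimension first, T = 4
y_seq = op(x_seq)  # differential sequence with the same shape as x_seq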

参数:

normalized_shape (int or list[int] or Size) -- 输入尾部需要归一化的形状，与 torch.nn.LayerNorm 的 normalized_shape 语义一致。
eps (float) -- 加到方差上的数值稳定项。
elementwise_affine (bool) -- 若为 True，使用可学习的逐元素仿射参数。
bias (bool) -- 若 elementwise_affine 和 bias 均为 True，使用可学习 bias 参数。若 elementwise_affine 为 False，则忽略 bias。
device (device or str or None) -- 参数初始化设备。
dtype (dtype or None) -- 参数初始化 dtype。
step_mode (str) -- 步进模式，"s" 或 "m"。默认 "m"。

English

Temporal-difference (TD) LayerNorm operator. With step_mode="m", the input must be a complete time sequence whose time dimension is fixed at dimension 0, with shape [T, ...]. This module first accumulates the input over time, applies torch.nn.functional.layer_norm() to each cumulative input, and returns the temporal difference of the cumulative outputs. With step_mode="s", the input is interpreted as the current differential time step; the module updates its cumulative memory and returns the current differential output. The ordinary LayerNorm path is exposed by ann_forward().

The output contains floating-point differential values and may contain negative values. It is not a binary spike tensor and does not represent a fully spike-driven LayerNorm. The output dtype matches the input dtype; float32, float16 and float64 inputs are recommended. The operator is composed entirely of differentiable PyTorch operations and is transparent to autograd. The operator is a stateful TD MemoryModule; call reset before processing an independent sequence.

The mechanism follows the cumulative-difference equivalence idea for Transformer nonlinear operators in SpikeZIP-TF: Conversion is All You Need for Transformer-based SNN. This implementation provides only a tensor-level operator: in multi-step mode it still calls torch.nn.functional.layer_norm(), requires a complete time sequence, is not a step-wise online operator, and is not a fully spike-driven LayerNorm for neuromorphic hardware.

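import torch
from spikingjelly.activation_based.ann2snn.operators import TDLayerNorm
op = TDLayerNorm(normalized_shape=3)  # normalizes the trailing dimension of size 3
x_seq = torch.randn(4, 2, 3)  # time dimension first, T = 4
y_seq = op(x_seq)  # differential sequence with the same shape as x_seq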

参数:

normalized_shape (int or list[int] or Size) -- Input trailing shape to normalize, with the same semantics as normalized_shape in torch.nn.LayerNorm.
eps (float) -- Value added to the variance for numerical stability.
elementwise_affine (bool) -- If True, use learnable per-element affine parameters.
bias (bool) -- If both elementwise_affine and bias are True, use a learnable bias parameter. If elementwise_affine is False, bias is ignored.
device (device or str or None) -- Device used to initialize parameters.
dtype (dtype or None) -- Dtype used to initialize parameters.
step_mode (str) -- Step mode, "s" or "m". The default is "m".

reset_parameters()[源代码]#

返回类型:: None

ann_forward(x)[源代码]#

参数:: x (Tensor)
返回类型:: Tensor

multi_step_forward(x_seq)[源代码]#

API Language

中文

对完整时间序列执行 TD LayerNorm。计算过程为：

\[X_{cum}[t] = \sum_{i=0}^{t} X[i]\]

\[Y_{cum}[t] = \operatorname{LayerNorm}(X_{cum}[t])\]

\[Y[0] = Y_{cum}[0], \quad Y[t] = Y_{cum}[t] - Y_{cum}[t-1]\]

因此 Y.cumsum(dim=0) 与对 X.cumsum(dim=0) 逐时间步执行 ANN LayerNorm 的结果一致。输出是浮点差分值，可能为负，不是二值脉冲。当 T = 1 时，Y[0] 直接等于对 X[0] 执行 LayerNorm 的结果。输出 dtype 与输入 dtype 相同，且该算子对 autograd 透明。

参数:: x_seq (Tensor) -- 输入时间序列，形状为 [T, ...]，且 T > 0，尾部形状必须匹配 normalized_shape。
返回:: TD LayerNorm 差分序列，形状与 x_seq 相同。
返回类型:: Tensor
抛出:: ValueError -- 若 x_seq 少于 2 维、时间维为空或尾部形状不匹配。

English

Apply TD LayerNorm to a complete time sequence:

\[X_{cum}[t] = \sum_{i=0}^{t} X[i]\]

\[Y_{cum}[t] = \operatorname{LayerNorm}(X_{cum}[t])\]

\[Y[0] = Y_{cum}[0], \quad Y[t] = Y_{cum}[t] - Y_{cum}[t-1]\]

Thus, Y.cumsum(dim=0) matches ANN LayerNorm applied to X.cumsum(dim=0) at each time step. The output contains floating-point differential values, may be negative, and is not a binary spike tensor. When T = 1, Y[0] is exactly LayerNorm applied to X[0]. The output dtype matches the input dtype, and the operator is transparent to autograd.

参数:: x_seq (Tensor) -- Input time sequence with shape [T, ...] and T > 0. The trailing shape must match normalized_shape.
返回:: TD LayerNorm differential sequence with the same shape as x_seq.
返回类型:: Tensor
抛出:: ValueError -- If x_seq has fewer than 2 dimensions, the time dimension is empty, or the trailing shape does not match.
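
A plain PyTorch sketch makes it visible why the cumulative-difference form is needed. The helper td_layer_norm_reference is hypothetical and only follows the equations above; the comparison against a naive per-step LayerNorm is added for illustration.

```python
import torch
import torch.nn.functional as F


def td_layer_norm_reference(x_seq, normalized_shape, weight=None, bias=None, eps=1e-5):
    """Hypothetical reference: difference of LayerNorm over the cumulative input."""
    y_cum = F.layer_norm(x_seq.cumsum(dim=0), normalized_shape, weight, bias, eps)
    return torch.cat([y_cum[:1], y_cum[1:] - y_cum[:-1]], dim=0)


torch.manual_seed(0)
x_seq = torch.randn(4, 2, 3)
weight, bias = torch.rand(3) + 0.5, torch.randn(3)

y_seq = td_layer_norm_reference(x_seq, (3,), weight, bias)

# Target: ANN LayerNorm applied to the cumulative input at every time step.
target = F.layer_norm(x_seq.cumsum(dim=0), (3,), weight, bias)

# Naive alternative: LayerNorm on each differential step, then accumulate.
naive = F.layer_norm(x_seq, (3,), weight, bias).cumsum(dim=0)

td_error = (y_seq.cumsum(dim=0) - target).abs().max()
naive_error = (naive - target).abs().max()
```

The TD error stays at floating-point rounding level, while the naive variant drifts because LayerNorm is nonlinear and its bias is added again at every step.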

extra_repr()[源代码]#

返回类型:: str

class spikingjelly.activation_based.ann2snn.operators.TDGELU(approximate='none', step_mode='m')[源代码]#

基类：TDModule

API Language

中文

Temporal-difference (TD) GELU（Gaussian Error Linear Unit）算子。 step_mode="m" 时输入必须是完整时间序列，时间维固定为第 0 维，形状为 [T, ...]；模块先对输入在时间维做累积，再对每个累积输入执行 torch.nn.functional.gelu()，最后返回累积输出在时间维上的差分。step_mode="s" 时输入被解释为当前差分时间步，模块更新内部累积 memory 并返回当前差分输出；普通 GELU 路径由 ann_forward() 提供。

返回值是浮点差分值，可能包含负值；它不是二值脉冲，也不表示 fully spike-driven GELU。输出 dtype 与输入 dtype 相同；推荐使用 float32、float16、bfloat16 或 float64 输入。该算子完全由 PyTorch 可微算子组成，对 autograd 透明。该算子是 stateful TD MemoryModule；重复处理独立序列前应调用 reset。该算子仅依赖 torch.nn.functional.gelu()，支持 CPU 与 CUDA，后端与 torch 一致，无 CuPy / Triton 专用路径。

该算子的机制来源于 SpikeZIP-TF: Conversion is All You Need for Transformer-based SNN 中对 Transformer 非线性算子的累积-差分等价转换思路。本文档中的 TD GELU 只实现张量级算子：在多步模式下它仍调用 torch.nn.functional.gelu()，需要完整时间序列输入，不是逐时间步在线算子，也不是面向神经形态硬件的 fully spike-driven GELU。

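import torch
from spikingjelly.activation_based.ann2snn.operators import TDGELU
op = TDGELU(approximate="none")  # step_mode defaults to "m"
x_seq = torch.randn(4, 2, 3)  # time dimension first, T = 4
y_seq = op(x_seq)  # differential sequence with the same shape as x_seq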

参数:

approximate (Literal["none", "tanh"]) -- GELU 近似模式，与 torch.nn.GELU 的 approximate 语义一致。
step_mode (str) -- 步进模式，"s" 或 "m"。默认 "m"。

抛出:

ValueError -- 若 approximate 不是 "none" 或 "tanh"。

English

Temporal-difference (TD) GELU (Gaussian Error Linear Unit) operator. With step_mode="m", the input must be a complete time sequence whose time dimension is fixed at dimension 0, with shape [T, ...]. This module first accumulates the input over time, applies torch.nn.functional.gelu() to each cumulative input, and returns the temporal difference of the cumulative outputs. With step_mode="s", the input is interpreted as the current differential time step; the module updates its cumulative memory and returns the current differential output. The ordinary GELU path is exposed by ann_forward().

The output contains floating-point differential values and may contain negative values. It is not a binary spike tensor and does not represent a fully spike-driven GELU. The output dtype matches the input dtype; float32, float16, bfloat16 and float64 inputs are recommended. The operator is composed entirely of differentiable PyTorch operations and is transparent to autograd. The operator is a stateful TD MemoryModule; call reset() before processing an independent sequence. It depends only on torch.nn.functional.gelu(), supports CPU and CUDA, follows the torch backend behavior, and has no CuPy- or Triton-specific path.

The mechanism follows the cumulative-difference equivalence idea for Transformer nonlinear operators in SpikeZIP-TF: Conversion is All You Need for Transformer-based SNN. This implementation provides only a tensor-level operator: in multi-step mode it still calls torch.nn.functional.gelu(), requires a complete time sequence, is not a step-wise online operator, and is not a fully spike-driven GELU for neuromorphic hardware.

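import torch
from spikingjelly.activation_based.ann2snn.operators import TDGELU
op = TDGELU(approximate="none")  # step_mode defaults to "m"
x_seq = torch.randn(4, 2, 3)  # time dimension first, T = 4
y_seq = op(x_seq)  # differential sequence with the same shape as x_seq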

参数:

approximate (Literal["none", "tanh"]) -- GELU approximation mode, with the same semantics as approximate in torch.nn.GELU.
step_mode (str) -- Step mode, "s" or "m". The default is "m".

抛出:

ValueError -- If approximate is not "none" or "tanh".

ann_forward(x)[源代码]#

参数:: x (Tensor)
返回类型:: Tensor

multi_step_forward(x_seq)[源代码]#

API Language

中文

对完整时间序列执行 TD GELU。计算过程为：

\[X_{cum}[t] = \sum_{i=0}^{t} X[i]\]

\[Y_{cum}[t] = \operatorname{GELU}(X_{cum}[t])\]

\[Y[0] = Y_{cum}[0], \quad Y[t] = Y_{cum}[t] - Y_{cum}[t-1]\]

因此 Y.cumsum(dim=0) 与对 X.cumsum(dim=0) 逐时间步执行 ANN GELU 的结果一致。输出是浮点差分值，可能为负，不是二值脉冲。当 T = 1 时，Y[0] 直接等于对 X[0] 执行 GELU 的结果。输出 dtype 与输入 dtype 相同，且该算子对 autograd 透明。

参数:: x_seq (Tensor) -- 输入时间序列，形状为 [T, ...]，且 T > 0。
返回:: TD GELU 差分序列，形状与 x_seq 相同。
返回类型:: Tensor
抛出:: ValueError -- 若 x_seq 少于 2 维或时间维为空。

English

Apply TD GELU to a complete time sequence:

\[X_{cum}[t] = \sum_{i=0}^{t} X[i]\]

\[Y_{cum}[t] = \operatorname{GELU}(X_{cum}[t])\]

\[Y[0] = Y_{cum}[0], \quad Y[t] = Y_{cum}[t] - Y_{cum}[t-1]\]

Thus, Y.cumsum(dim=0) matches ANN GELU applied to X.cumsum(dim=0) at each time step. The output contains floating-point differential values, may be negative, and is not a binary spike tensor. When T = 1, Y[0] is exactly GELU applied to X[0]. The output dtype matches the input dtype, and the operator is transparent to autograd.

参数:: x_seq (Tensor) -- Input time sequence with shape [T, ...] and T > 0.
返回:: TD GELU differential sequence with the same shape as x_seq.
返回类型:: Tensor
抛出:: ValueError -- If x_seq has fewer than 2 dimensions or the time dimension is empty.
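
The autograd remark can be illustrated with a plain PyTorch reference. The helper td_gelu_reference is hypothetical and follows the equations above; the approximate="tanh" setting is chosen only to show that the argument is passed through to torch.nn.functional.gelu().

```python
import torch
import torch.nn.functional as F


def td_gelu_reference(x_seq: torch.Tensor, approximate: str = "none") -> torch.Tensor:
    """Hypothetical reference: difference of GELU over the cumulative input."""
    y_cum = F.gelu(x_seq.cumsum(dim=0), approximate=approximate)
    return torch.cat([y_cum[:1], y_cum[1:] - y_cum[:-1]], dim=0)


torch.manual_seed(0)
x_seq = torch.randn(4, 2, 3, requires_grad=True)
y_seq = td_gelu_reference(x_seq, approximate="tanh")

# Summing the differences over time telescopes to GELU of the total input.
total = y_seq.sum(dim=0)
telescoped = F.gelu(x_seq.sum(dim=0), approximate="tanh")

# Gradients flow through cumsum, gelu, and the difference without custom code.
total.sum().backward()
grad = x_seq.grad
```

Since the summed output depends on the input only through its total over time, every time step receives the same gradient in this particular loss.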

extra_repr()[源代码]#

返回类型:: str

class spikingjelly.activation_based.ann2snn.operators.TDLinear(in_features, out_features, bias=True, device=None, dtype=None, step_mode='m')[源代码]#

基类：TDModule

API Language: 中文 | English

中文

Temporal-difference (TD) Linear 算子。step_mode="m" 时输入必须是完整时间序列，时间维固定为第 0 维，形状为 [T, ..., in_features]；模块返回 sequence-preserving affine 差分序列，使 Y.cumsum(dim=0) 等于对 X.cumsum(dim=0) 逐时间步执行 torch.nn.functional.linear()。step_mode="s" 时输入被解释为当前差分时间步，模块更新内部累积 memory 并返回当前差分输出；普通 Linear 路径由 ann_forward() 提供。

返回值是浮点差分值，可能包含负值；它不是二值脉冲，也不表示 fully spike-driven Linear。输出 dtype 与 PyTorch Linear 一致；推荐使用 float32、float16、bfloat16 或 float64 输入。该算子完全由 PyTorch 可微算子组成，对 autograd 透明。该算子是 stateful TD MemoryModule；重复处理独立序列前应调用 reset。该算子仅依赖 PyTorch Linear，支持 CPU 与 CUDA，后端与 torch 一致，无 CuPy / Triton 专用路径。

该算子用于处理带 bias 的 affine projection。普通 torch.nn.Linear 直接作用在 TD 差分序列上会在时间累积后得到 T * bias；TD Linear 使累计输出保持 W @ x_cum + bias。当 bias=False 时，该算子退化为普通逐时间步 Linear；当 bias=True 时，bias 只在第 0 个时间步进入差分序列。

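import torch
from spikingjelly.activation_based.ann2snn.operators import TDLinear
op = TDLinear(3, 5)  # in_features=3, out_features=5
x_seq = torch.randn(4, 2, 3)  # [T, ..., in_features] with T = 4
y_seq = op(x_seq)  # differential sequence with shape [4, 2, 5]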

参数:

in_features (int) -- 输入特征数。
out_features (int) -- 输出特征数。
bias (bool) -- 若为 True，使用可学习 bias 参数。
device (device or str or None) -- 参数初始化设备。
dtype (dtype or None) -- 参数初始化 dtype。
step_mode (str) -- 步进模式，"s" 或 "m"。默认 "m"。

English

Temporal-difference (TD) Linear operator. With step_mode="m", the input must be a complete time sequence whose time dimension is fixed at dimension 0, with shape [T, ..., in_features]. This module returns a sequence-preserving affine differential sequence such that Y.cumsum(dim=0) matches torch.nn.functional.linear() applied to X.cumsum(dim=0) at every time step. With step_mode="s", the input is interpreted as the current differential time step; the module updates its cumulative memory and returns the current differential output. The ordinary Linear path is exposed by ann_forward().

The output contains floating-point differential values and may contain negative values. It is not a binary spike tensor and does not represent a fully spike-driven Linear. The output dtype follows PyTorch Linear; float32, float16, bfloat16 and float64 inputs are recommended. The operator is composed entirely of differentiable PyTorch operations and is transparent to autograd. The operator is a stateful TD MemoryModule; call reset() before processing an independent sequence. It depends only on PyTorch Linear, supports CPU and CUDA, follows the torch backend behavior, and has no CuPy- or Triton-specific path.

This operator handles affine projections with bias. Applying ordinary torch.nn.Linear directly to a TD differential sequence would add the bias at every time step, so the accumulated output would contain T * bias after T steps. TD Linear applies Linear to the cumulative input and then differences the cumulative output, preserving W @ x_cum + bias. When bias=False, this operator reduces to ordinary per-time-step Linear. When bias=True, the bias appears only at the first time step of the differential sequence.

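import torch
from spikingjelly.activation_based.ann2snn.operators import TDLinear
op = TDLinear(3, 5)  # in_features=3, out_features=5
x_seq = torch.randn(4, 2, 3)  # [T, ..., in_features] with T = 4
y_seq = op(x_seq)  # differential sequence with shape [4, 2, 5]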

参数:

in_features (int) -- Number of input features.
out_features (int) -- Number of output features.
bias (bool) -- If True, use a learnable bias parameter.
device (device or str or None) -- Device used to initialize parameters.
dtype (dtype or None) -- Dtype used to initialize parameters.
step_mode (str) -- Step mode, "s" or "m". The default is "m".

reset_parameters()[源代码]#

返回类型:: None

ann_forward(x)[源代码]#

参数:: x (Tensor)
返回类型:: Tensor

multi_step_forward(x_seq)[源代码]#

API Language: 中文 | English

中文

对完整时间序列执行 TD Linear。计算过程为：

\[X_{cum}[t] = \sum_{i=0}^{t} X[i]\]

\[Y_{cum}[t] = X_{cum}[t] W^T + b\]

\[Y[0] = Y_{cum}[0], \quad Y[t] = Y_{cum}[t] - Y_{cum}[t-1]\]

因此 Y.cumsum(dim=0) 与对 X.cumsum(dim=0) 逐时间步执行 ANN Linear 的结果一致。若 bias 为 None，该计算等价于直接对 X 逐时间步执行 Linear；若存在 bias，bias 只出现在 Y[0] 中，避免累计后得到 T * bias。输出是浮点差分值，可能为负，不是二值脉冲。当 T = 1 时，Y[0] 直接等于对 X[0] 执行 Linear 的结果。输出 dtype 与 PyTorch Linear 一致，且该算子对 autograd 透明。

参数:: x_seq (Tensor) -- 输入时间序列，形状为 [T, ..., in_features]，且 T > 0。
返回:: TD Linear 差分序列，形状为 [T, ..., out_features]。
返回类型:: Tensor
抛出:: ValueError -- 若 x_seq 少于 2 维或时间维为空。

English

Apply TD Linear to a complete time sequence:

\[X_{cum}[t] = \sum_{i=0}^{t} X[i]\]

\[Y_{cum}[t] = X_{cum}[t] W^T + b\]

\[Y[0] = Y_{cum}[0], \quad Y[t] = Y_{cum}[t] - Y_{cum}[t-1]\]

Thus, Y.cumsum(dim=0) matches ANN Linear applied to X.cumsum(dim=0) at each time step. If bias is None, this is equivalent to applying Linear to X at each time step directly. If a bias exists, the bias appears only in Y[0], avoiding T * bias after accumulation. The output contains floating-point differential values, may be negative, and is not a binary spike tensor. When T = 1, Y[0] is exactly Linear applied to X[0]. The output dtype follows PyTorch Linear, and the operator is transparent to autograd.

参数:: x_seq (Tensor) -- Input time sequence with shape [T, ..., in_features] and T > 0.
返回:: TD Linear differential sequence with shape [T, ..., out_features].
返回类型:: Tensor
抛出:: ValueError -- If x_seq has fewer than 2 dimensions or the time dimension is empty.
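
The bias handling can be verified in plain PyTorch. Both helpers below are hypothetical: td_linear_reference follows the equations literally, and td_linear_closed_form uses the equivalent shortcut of a bias-free Linear per step with the bias added only at step 0. Which of the two the library uses internally is not stated here.

```python
import torch
import torch.nn.functional as F


def td_linear_reference(x_seq, weight, bias=None):
    """Hypothetical reference: difference of Linear over the cumulative input."""
    y_cum = F.linear(x_seq.cumsum(dim=0), weight, bias)
    return torch.cat([y_cum[:1], y_cum[1:] - y_cum[:-1]], dim=0)


def td_linear_closed_form(x_seq, weight, bias=None):
    """Equivalent form: bias-free Linear per step, bias added at step 0 only."""
    y_seq = F.linear(x_seq, weight)
    if bias is not None:
        y_seq = torch.cat([y_seq[:1] + bias, y_seq[1:]], dim=0)
    return y_seq


torch.manual_seed(0)
x_seq = torch.randn(4, 2, 3)  # [T, N, in_features]
weight, bias = torch.randn(5, 3), torch.randn(5)

y_ref = td_linear_reference(x_seq, weight, bias)
y_closed = td_linear_closed_form(x_seq, weight, bias)

# Naive per-step Linear with bias accumulates one extra bias per later step.
naive_cum = F.linear(x_seq, weight, bias).cumsum(dim=0)
extra_bias = naive_cum[-1] - y_ref.cumsum(dim=0)[-1]  # (T - 1) * bias
```

With T = 4 the naive accumulation ends up three biases too large, which is the T * bias effect the operator is designed to avoid.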

extra_repr()[源代码]#

返回类型:: str

class spikingjelly.activation_based.ann2snn.operators.TDConv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros', device=None, dtype=None, step_mode='m')[源代码]#

基类：TDModule

API Language: 中文 | English

中文

Temporal-difference (TD) Conv2d 算子。step_mode="m" 时输入必须是完整时间序列，形状为 [T, N, C, H, W]；返回的浮点差分序列满足 Y.cumsum(dim=0) 等于对 X.cumsum(dim=0) 逐时间步执行 torch.nn.functional.conv2d()。当存在 bias 时，bias 只出现在第 0 个差分时间步，避免累计后得到 T * bias。step_mode="s" 时输入被解释为当前差分时间步，模块更新内部累积 memory 并返回当前差分输出；普通 Conv2d 路径由 ann_forward() 提供。

输出是浮点差分值，可能包含负值；它不是二值脉冲，也不表示 fully spike-driven Conv2d。构造参数对齐 torch.nn.Conv2d 的 2D convolution 参数，支持 padding="same" 和 padding="valid"。

参数:

in_channels (int) -- 输入通道数。
out_channels (int) -- 输出通道数。
kernel_size (int or Tuple[int, int]) -- 卷积核大小。
stride (int or Tuple[int, int]) -- 卷积步幅。
padding (str | int | Tuple[int, int]) -- padding 参数，可为整数、tuple、"same" 或 "valid"。
dilation (int | Tuple[int, int]) -- dilation 参数。
groups (int) -- 分组卷积组数。
bias (bool) -- 是否使用 learnable bias。
padding_mode (str) -- padding 模式。
device (device | str | None) -- 参数初始化设备。
dtype (dtype | None) -- 参数初始化 dtype。
step_mode (str) -- 步进模式，"s" 或 "m"。默认 "m"。

English

Temporal-difference (TD) Conv2d operator. With step_mode="m", the input must be a complete time sequence with shape [T, N, C, H, W]. The module returns a floating-point differential sequence Y such that Y.cumsum(dim=0) matches torch.nn.functional.conv2d() applied to X.cumsum(dim=0) at each time step. When a bias is present, it appears only in Y[0], which avoids accumulating T * bias. With step_mode="s", the input is interpreted as the current differential time step; the module updates its cumulative memory and returns the current differential output. The ordinary Conv2d path is exposed by ann_forward().

The output may contain negative floating-point differential values. It is not a binary spike tensor and does not represent a fully spike-driven Conv2d. Constructor arguments mirror the supported 2D convolution arguments of torch.nn.Conv2d, including padding="same" and padding="valid".

参数:

in_channels (int) -- Number of input channels.
out_channels (int) -- Number of output channels.
kernel_size (int or Tuple[int, int]) -- Convolution kernel size.
stride (int or Tuple[int, int]) -- Convolution stride.
padding (str | int | Tuple[int, int]) -- Padding argument, which can be an int, tuple, "same" or "valid".
dilation (int | Tuple[int, int]) -- Convolution dilation.
groups (int) -- Number of convolution groups.
bias (bool) -- If True, use a learnable bias parameter.
padding_mode (str) -- Padding mode.
device (device | str | None) -- Device used to initialize parameters.
dtype (dtype | None) -- Dtype used to initialize parameters.
step_mode (str) -- Step mode, "s" or "m". The default is "m".

reset_parameters()[源代码]#

返回类型:: None

ann_forward(x)[源代码]#

参数:: x (Tensor)
返回类型:: Tensor

multi_step_forward(x_seq)[源代码]#

参数:: x_seq (Tensor)
返回类型:: Tensor
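
A plain PyTorch sketch of the multi-step behaviour described for TDConv2d, under the assumption that it follows the same cumulative-difference pattern as the other TD operators. The helper td_conv2d_reference is hypothetical; folding the time dimension into the batch dimension is a choice of this sketch, not a statement about the library internals.

```python
import torch
import torch.nn.functional as F


def td_conv2d_reference(x_seq, weight, bias=None, **conv_kwargs):
    """Hypothetical reference for an input of shape [T, N, C, H, W]."""
    x_cum = x_seq.cumsum(dim=0)
    t, n = x_cum.shape[:2]
    # F.conv2d expects [batch, C, H, W], so fold time into the batch dimension.
    y_cum = F.conv2d(x_cum.flatten(0, 1), weight, bias, **conv_kwargs)
    y_cum = y_cum.reshape(t, n, *y_cum.shape[1:])
    return torch.cat([y_cum[:1], y_cum[1:] - y_cum[:-1]], dim=0)


torch.manual_seed(0)
x_seq = torch.randn(4, 2, 3, 8, 8)
weight, bias = torch.randn(5, 3, 3, 3), torch.randn(5)

y_seq = td_conv2d_reference(x_seq, weight, bias, padding="same")
y_nobias = td_conv2d_reference(x_seq, weight, None, padding="same")

# Contribution of the bias at each differential time step.
bias_map = y_seq - y_nobias
```

The bias contribution equals the bias at step 0 and vanishes (up to rounding) at later steps, matching the statement that the bias appears only in Y[0].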

extra_repr()[源代码]#

返回类型:: str

class spikingjelly.activation_based.ann2snn.operators.SNNMatrixOperator(step_mode='m')[源代码]#

基类：TDModule

API Language

中文

Sequence-preserving SNN 矩阵乘法算子。step_mode="m" 时输入必须是两条完整时间序列，时间维固定为第 0 维，形状分别为 [T, ..., M, N] 和 [T, ..., N, P]；模块先分别对两条输入在时间维做累积，再执行 torch.matmul()，最后返回累积输出在时间维上的差分。step_mode="s" 时输入被解释为当前差分时间步，模块更新内部累积 memory 并返回当前差分输出；普通 matmul 路径由 ann_forward() 提供。

该算子满足 Y.cumsum(dim=0) == torch.matmul(A.cumsum(dim=0), B.cumsum(dim=0))。因而它保留 cross-time terms，例如 A[0] @ B[1] 与 A[1] @ B[0]；这不同于逐时间步执行 A[t] @ B[t]。该算子是 LAS SNNMatrixOperator prefix recurrence 的 sequence-preserving 张量级形式，但不会在内部自动 sum(0)。

输出是浮点差分值，可能包含负值；它不是二值脉冲，也不表示 fully spike-driven matrix multiplication。dtype、device 与 broadcast 语义遵循 torch.matmul()。该算子是 stateful TD MemoryModule；重复处理独立序列前应调用 reset。

参数:: step_mode (str) -- 步进模式，"s" 或 "m"。默认 "m"。

English

Sequence-preserving SNN matrix multiplication operator. With step_mode="m", the inputs must be two complete time sequences whose time dimension is fixed at dimension 0, with shapes [T, ..., M, N] and [T, ..., N, P]. This module accumulates both inputs over time, applies torch.matmul(), and returns the temporal difference of the cumulative outputs. With step_mode="s", the inputs are interpreted as the current differential time step; the module updates its cumulative memory and returns the current differential output. The ordinary matmul path is exposed by ann_forward().

The operator satisfies Y.cumsum(dim=0) == torch.matmul(A.cumsum(dim=0), B.cumsum(dim=0)). Therefore it preserves cross-time terms such as A[0] @ B[1] and A[1] @ B[0]; it is not equivalent to applying A[t] @ B[t] at each time step independently. It is the sequence-preserving tensor form of the LAS SNNMatrixOperator prefix recurrence, but it does not implicitly call sum(0).

The output contains floating-point differential values and may contain negative values. It is not a binary spike tensor and does not represent fully spike-driven matrix multiplication. Dtype, device and broadcasting semantics follow torch.matmul(). The operator is a stateful TD MemoryModule; call reset before processing an independent sequence.

参数:: step_mode (str) -- Step mode, "s" or "m". The default is "m".

ann_forward(a, b)[源代码]#

参数:

a (Tensor)
b (Tensor)

返回类型:

Tensor

multi_step_forward(a_seq, b_seq)[源代码]#

API Language

中文

对两条完整时间序列执行 sequence-preserving SNN 矩阵乘法：

\[A_{cum}[t] = \sum_{i=0}^{t} A[i]\]

\[B_{cum}[t] = \sum_{i=0}^{t} B[i]\]

\[Y_{cum}[t] = A_{cum}[t] B_{cum}[t]\]

\[Y[0] = Y_{cum}[0], \quad Y[t] = Y_{cum}[t] - Y_{cum}[t-1]\]

参数:

a_seq (Tensor) -- 左输入序列，形状为 [T, ..., M, N]，且 T > 0。
b_seq (Tensor) -- 右输入序列，形状为 [T, ..., N, P]，且 T > 0。

返回:

差分输出序列，形状为 [T, ..., M, P]。

返回类型:

Tensor

抛出:

ValueError -- 若输入少于 3 维、时间维为空或时间长度不一致。

English

Apply sequence-preserving SNN matrix multiplication to two complete time sequences:

\[A_{cum}[t] = \sum_{i=0}^{t} A[i]\]

\[B_{cum}[t] = \sum_{i=0}^{t} B[i]\]

\[Y_{cum}[t] = A_{cum}[t] B_{cum}[t]\]

\[Y[0] = Y_{cum}[0], \quad Y[t] = Y_{cum}[t] - Y_{cum}[t-1]\]

参数:

a_seq (Tensor) -- Left input sequence with shape [T, ..., M, N] and T > 0.
b_seq (Tensor) -- Right input sequence with shape [T, ..., N, P] and T > 0.

返回:

Differential output sequence with shape [T, ..., M, P].

返回类型:

Tensor

抛出:

ValueError -- If an input has fewer than 3 dimensions, the time dimension is empty, or the time lengths differ.
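
The cross-time terms mentioned in the class description can be made explicit with a plain PyTorch reference. The helper td_matmul_reference is hypothetical and follows the equations above without the input validation.

```python
import torch


def td_matmul_reference(a_seq: torch.Tensor, b_seq: torch.Tensor) -> torch.Tensor:
    """Hypothetical reference: difference of matmul over both cumulative inputs."""
    y_cum = torch.matmul(a_seq.cumsum(dim=0), b_seq.cumsum(dim=0))
    return torch.cat([y_cum[:1], y_cum[1:] - y_cum[:-1]], dim=0)


torch.manual_seed(0)
a_seq = torch.randn(3, 2, 4, 5)  # [T, batch, M, N]
b_seq = torch.randn(3, 2, 5, 6)  # [T, batch, N, P]
y_seq = td_matmul_reference(a_seq, b_seq)

# Expanding (A[0] + A[1]) @ (B[0] + B[1]) - A[0] @ B[0] gives the step-1 output.
cross = a_seq[0] @ b_seq[1] + a_seq[1] @ b_seq[0] + a_seq[1] @ b_seq[1]

# The per-step product alone misses the two cross-time terms.
per_step = a_seq[1] @ b_seq[1]
```

Here y_seq[1] agrees with cross and differs from per_step, which is the distinction from applying A[t] @ B[t] independently at each time step.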

class spikingjelly.activation_based.ann2snn.operators.SNNElementWiseProduct(step_mode='m')[源代码]#

基类：TDModule

API Language

中文

Sequence-preserving SNN 逐元素乘法算子。step_mode="m" 时输入必须是两条完整时间序列，时间维固定为第 0 维，形状为 [T, ...]，非时间维遵循 PyTorch broadcast 规则；模块先分别对两条输入在时间维做累积，再执行逐元素乘法，最后返回累积输出在时间维上的差分。 step_mode="s" 时输入被解释为当前差分时间步，模块更新内部累积 memory 并返回当前差分输出；普通逐元素乘法路径由 ann_forward() 提供。

该算子满足 Y.cumsum(dim=0) == A.cumsum(dim=0) * B.cumsum(dim=0)。它是 LAS SNNMACOperator 核心乘法-累积语义的 sequence-preserving 张量级形式，但不会在内部自动 sum(0)；需要单步聚合时由调用方显式完成。

输出是浮点差分值，可能包含负值；它不是二值脉冲，也不表示 fully spike-driven multiply-accumulate。dtype、device 与 broadcast 语义遵循 PyTorch 逐元素乘法。该算子是 stateful TD MemoryModule；重复处理独立序列前应调用 reset。

参数:: step_mode (str) -- 步进模式，"s" 或 "m"。默认 "m"。

English

Sequence-preserving SNN element-wise product operator. With step_mode="m", the inputs must be two complete time sequences whose time dimension is fixed at dimension 0, with shape [T, ...]. Non-time dimensions follow PyTorch broadcasting rules. This module accumulates both inputs over time, applies element-wise multiplication, and returns the temporal difference of the cumulative outputs. With step_mode="s", the inputs are interpreted as the current differential time step; the module updates its cumulative memory and returns the current differential output. The ordinary element-wise multiplication path is exposed by ann_forward().

The operator satisfies Y.cumsum(dim=0) == A.cumsum(dim=0) * B.cumsum(dim=0). It is the sequence-preserving tensor form of the core multiply-accumulate semantics in LAS SNNMACOperator, but it does not implicitly call sum(0); callers should aggregate explicitly when a single-step output is required.

The output contains floating-point differential values and may contain negative values. It is not a binary spike tensor and does not represent fully spike-driven multiply-accumulate. Dtype, device and broadcasting semantics follow PyTorch element-wise multiplication. The operator is a stateful TD MemoryModule; call reset before processing an independent sequence.
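The single-step semantics described above can be sketched in plain Python, with scalars standing in for tensors. The class name `ElementWiseProductSketch`, its `step` method, and its attributes are hypothetical and only illustrate the cumulative-memory idea; this is not the library's implementation.

```python
class ElementWiseProductSketch:
    """Hypothetical scalar stand-in for the step_mode="s" behaviour."""

    def __init__(self):
        self.reset()

    def reset(self):
        # Clear the cumulative memory before an independent sequence.
        self.a_cum = 0.0
        self.b_cum = 0.0
        self.y_cum = 0.0

    def step(self, a, b):
        # a and b are the differential inputs of the current time step.
        self.a_cum += a
        self.b_cum += b
        y_cum_new = self.a_cum * self.b_cum
        y = y_cum_new - self.y_cum  # differential output, may be negative
        self.y_cum = y_cum_new
        return y


op = ElementWiseProductSketch()
a_seq = [1.0, 2.0, -1.0]
b_seq = [0.5, -1.0, 2.0]
y_seq = [op.step(a, b) for a, b in zip(a_seq, b_seq)]

# Summing the differential outputs recovers the product of the summed inputs.
total = sum(y_seq)
expected = sum(a_seq) * sum(b_seq)
```

Skipping `reset()` between two unrelated sequences would carry `a_cum`, `b_cum`, and `y_cum` over into the second sequence, which is why the text asks for a reset first.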

参数:

step_mode (str) -- Step mode, "s" or "m". The default is "m".

ann_forward(a, b)

参数:

a (Tensor)
b (Tensor)

返回类型:

multi_step_forward(a_seq, b_seq)

API Language

中文

对两条完整时间序列执行 sequence-preserving SNN 逐元素乘法：

\[A_{cum}[t] = \sum_{i=0}^{t} A[i]\]

\[B_{cum}[t] = \sum_{i=0}^{t} B[i]\]

\[Y_{cum}[t] = A_{cum}[t] \odot B_{cum}[t]\]

\[Y[0] = Y_{cum}[0], \quad Y[t] = Y_{cum}[t] - Y_{cum}[t-1]\]

参数:

a_seq (Tensor) -- 左输入序列，形状为 [T, ...]，且 T > 0。
b_seq (Tensor) -- 右输入序列，形状为 [T, ...]，且 T > 0。

返回:

差分输出序列，形状由 PyTorch broadcast 规则决定。

返回类型:

抛出:

ValueError -- 若输入少于 2 维、时间维为空或时间长度不一致。

English

Apply sequence-preserving SNN element-wise product to two complete time sequences:

\[A_{cum}[t] = \sum_{i=0}^{t} A[i]\]

\[B_{cum}[t] = \sum_{i=0}^{t} B[i]\]

\[Y_{cum}[t] = A_{cum}[t] \odot B_{cum}[t]\]

\[Y[0] = Y_{cum}[0], \quad Y[t] = Y_{cum}[t] - Y_{cum}[t-1]\]
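The four equations above can be traced on a short scalar sequence. The following is a minimal sketch, assuming plain Python floats in place of tensors; the helper name `td_elementwise_product` is hypothetical and broadcasting is not modelled.

```python
from itertools import accumulate


def td_elementwise_product(a_seq, b_seq):
    """Cumulate, multiply, then difference along the time axis (index 0)."""
    if len(a_seq) == 0 or len(a_seq) != len(b_seq):
        raise ValueError("time dimension must be non-empty and equal for both inputs")
    a_cum = list(accumulate(a_seq))                  # A_cum[t]
    b_cum = list(accumulate(b_seq))                  # B_cum[t]
    y_cum = [a * b for a, b in zip(a_cum, b_cum)]    # Y_cum[t]
    # Y[0] = Y_cum[0]; Y[t] = Y_cum[t] - Y_cum[t-1]
    return [y_cum[0]] + [y_cum[t] - y_cum[t - 1] for t in range(1, len(y_cum))]


a_seq = [1.0, 0.0, 2.0, -1.0]
b_seq = [2.0, 1.0, 0.0, 1.0]
y_seq = td_elementwise_product(a_seq, b_seq)

# Cumulating the output reproduces the product of the cumulated inputs.
y_cum = list(accumulate(y_seq))
ref = [a * b for a, b in zip(accumulate(a_seq), accumulate(b_seq))]
```

Here the last output step is negative even though the cumulative product stays positive, which matches the note that the output is a differential value rather than a spike.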

参数:

a_seq (Tensor) -- Left input sequence with shape [T, ...] and T > 0.
b_seq (Tensor) -- Right input sequence with shape [T, ...] and T > 0.

返回:

Differential output sequence whose shape follows PyTorch broadcasting rules.

返回类型:

抛出:

ValueError -- If an input has fewer than 2 dimensions, the time dimension is empty, or the time lengths differ.

class spikingjelly.activation_based.ann2snn.operators.TDScaledDotProductAttention(is_causal=False, scale=None, step_mode='m')

基类：TDModule

API Language

中文

Temporal-difference (TD) scaled dot-product attention 算子。 step_mode="m" 时输入必须是完整时间序列，时间维固定为第 0 维。 query_seq 的形状为 [T, ..., L, E]，key_seq 的形状为 [T, ..., S, E]，value_seq 的形状为 [T, ..., S, Ev]；模块先分别对 query、key、value 在时间维做累积，再调用 torch.nn.functional.scaled_dot_product_attention()，最后返回累积输出在时间维上的差分。step_mode="s" 时输入被解释为当前差分时间步，模块更新内部累积 memory 并返回当前差分输出；普通 SDPA 路径由 ann_forward() 提供。

返回值是浮点差分值，可能包含负值；它不是二值脉冲，也不表示 fully spike-driven attention。dtype、device 与 mask broadcast 语义遵循 torch.nn.functional.scaled_dot_product_attention()；推荐使用 float32、float16、bfloat16 或 float64 输入。该算子完全由 PyTorch 可微算子组成，对 autograd 透明。该算子是 stateful TD MemoryModule；重复处理独立序列前应调用 reset。该算子仅依赖 PyTorch SDPA，支持 CPU 与 CUDA，后端与 torch 一致，无 CuPy / Triton 专用路径。

该算子的机制来源于 SpikeZIP-TF: Conversion is All You Need for Transformer-based SNN 中对 Transformer 算子的累积-差分等价转换思路。本文档中的 TD scaled dot-product attention 只实现张量级最小 primitive：在多步模式下它仍调用 PyTorch SDPA，需要完整时间序列输入，不是逐时间步在线算子，也不是面向神经形态硬件的 fully spike-driven attention。本实现固定 dropout_p=0.0，且不暴露 enable_gqa。组合 TD Transformer block 时，普通带 bias 的 torch.nn.Linear 不能直接作用在差分序列上，因为累计后 bias 会被重复累加；应使用 bias=False 或专门的 TD Linear。

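import torch
from spikingjelly.activation_based.ann2snn.operators import TDScaledDotProductAttention

op = TDScaledDotProductAttention()
q_seq = torch.randn(4, 2, 3, 8)  # [T, ..., L, E]
k_seq = torch.randn(4, 2, 5, 8)  # [T, ..., S, E]
v_seq = torch.randn(4, 2, 5, 6)  # [T, ..., S, Ev]
y_seq = op(q_seq, k_seq, v_seq)  # [T, ..., L, Ev] = [4, 2, 3, 6]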

参数:

is_causal (bool) -- 是否应用 causal attention mask。若为 True， forward 中不能同时传入 attn_mask。
scale (Optional[float]) -- attention scale。若为 None，使用 PyTorch SDPA 默认值。
step_mode (str) -- 步进模式，"s" 或 "m"。默认 "m"。

English

Temporal-difference (TD) scaled dot-product attention operator. With step_mode="m", the inputs must be complete time sequences whose time dimension is fixed at dimension 0. query_seq has shape [T, ..., L, E], key_seq has shape [T, ..., S, E], and value_seq has shape [T, ..., S, Ev]. This module first accumulates query, key, and value over time, calls torch.nn.functional.scaled_dot_product_attention(), and returns the temporal difference of the cumulative outputs. With step_mode="s", the inputs are interpreted as the current differential time step; the module updates its cumulative memory and returns the current differential output. The ordinary SDPA path is exposed by ann_forward().

The output contains floating-point differential values and may contain negative values. It is not a binary spike tensor and does not represent fully spike-driven attention. Dtype, device, and mask broadcasting follow torch.nn.functional.scaled_dot_product_attention(); float32, float16, bfloat16 and float64 inputs are recommended. The operator is composed entirely of differentiable PyTorch operations and is transparent to autograd. The operator is a stateful TD MemoryModule; call reset before processing an independent sequence. It only depends on PyTorch SDPA, supports CPU and CUDA, follows the torch backend behavior, and has no CuPy / Triton specific path.

The mechanism follows the cumulative-difference equivalence idea for Transformer operators in SpikeZIP-TF: Conversion is All You Need for Transformer-based SNN. This implementation provides only a tensor-level minimal primitive: in multi-step mode it still calls PyTorch SDPA, requires a complete time sequence, is not a step-wise online operator, and is not fully spike-driven attention for neuromorphic hardware. This implementation fixes dropout_p=0.0 and does not expose enable_gqa. When composing TD Transformer blocks, ordinary torch.nn.Linear layers with bias must not be applied directly to differential sequences, because the bias would be accumulated repeatedly; use bias=False or a dedicated TD Linear.

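import torch
from spikingjelly.activation_based.ann2snn.operators import TDScaledDotProductAttention

op = TDScaledDotProductAttention()
q_seq = torch.randn(4, 2, 3, 8)  # [T, ..., L, E]
k_seq = torch.randn(4, 2, 5, 8)  # [T, ..., S, E]
v_seq = torch.randn(4, 2, 5, 6)  # [T, ..., S, Ev]
y_seq = op(q_seq, k_seq, v_seq)  # differential output, [T, ..., L, Ev] = [4, 2, 3, 6]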
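The warning about bias in linear layers can be checked with scalars. This is a minimal sketch, assuming a hypothetical one-dimensional linear layer with weight `w` and bias `b`; it does not use torch.nn.Linear and does not show how a dedicated TD Linear is implemented.

```python
from itertools import accumulate

w, b = 2.0, 0.5  # hypothetical scalar weight and bias


def linear(x, bias=True):
    return w * x + (b if bias else 0.0)


x_cum = [1.0, 3.0, 4.0]  # accumulated activations seen by the ANN layer
# Differential sequence whose cumulative sum is x_cum.
x_diff = [x_cum[0]] + [x_cum[t] - x_cum[t - 1] for t in range(1, len(x_cum))]

target = [linear(x) for x in x_cum]  # what the ANN layer computes
# Applying the biased layer per differential step adds b once per step.
with_bias = list(accumulate(linear(x) for x in x_diff))
drift = [y - t for y, t in zip(with_bias, target)]

# Without bias, the layer is linear, so cumulating commutes with it.
no_bias = list(accumulate(linear(x, bias=False) for x in x_diff))
bias_free_target = [linear(x, bias=False) for x in x_cum]
```

The drift grows by `b` at every step after the first, while the bias-free layer reproduces its ANN counterpart exactly.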

参数:

is_causal (bool) -- Whether to apply causal attention masking. If True, attn_mask must not be passed to forward.
scale (Optional[float]) -- Attention scale. If None, use the PyTorch SDPA default.
step_mode (str) -- Step mode, "s" or "m". The default is "m".

ann_forward(query, key, value, attn_mask=None)

参数:

query (Tensor)
key (Tensor)
value (Tensor)
attn_mask (Tensor | None)

返回类型:

single_step_forward(query, key, value, attn_mask=None)

参数:

query (Tensor)
key (Tensor)
value (Tensor)
attn_mask (Tensor | None)

返回类型:

multi_step_forward(query_seq, key_seq, value_seq, attn_mask=None)

API Language

中文

对完整 query、key、value 时间序列执行 TD scaled dot-product attention。计算过程为：

\[Q_{cum}[t] = \sum_{i=0}^{t} Q[i], \quad K_{cum}[t] = \sum_{i=0}^{t} K[i], \quad V_{cum}[t] = \sum_{i=0}^{t} V[i]\]

\[Y_{cum}[t] = \operatorname{SDPA}(Q_{cum}[t], K_{cum}[t], V_{cum}[t])\]

\[Y[0] = Y_{cum}[0], \quad Y[t] = Y_{cum}[t] - Y_{cum}[t-1]\]

因此 Y.cumsum(dim=0) 与对累积 query、key、value 逐时间步执行 ANN SDPA 的结果一致。输出是浮点差分值，可能为负，不是二值脉冲。Y[0] 始终等于对第一步 query、key、value 执行 SDPA 的结果，因此当 T = 1 时该算子退化为普通 SDPA。输出 dtype 与 PyTorch SDPA 一致，且该算子对 autograd 透明。

参数:

query_seq (Tensor) -- query 时间序列，形状为 [T, ..., L, E]，且 T > 0。
key_seq (Tensor) -- key 时间序列，形状为 [T, ..., S, E]，且时间维长度必须与 query_seq 相同。
value_seq (Tensor) -- value 时间序列，形状为 [T, ..., S, Ev]，且时间维长度必须与 query_seq 相同。
attn_mask (Tensor or None) -- attention mask，broadcast 语义与 PyTorch SDPA 一致。

返回:

TD scaled dot-product attention 差分序列，形状为 [T, ..., L, Ev]。

返回类型:

抛出:

ValueError -- 若任一输入少于 3 维、时间维为空、三者时间维长度不一致，或 is_causal=True 时同时传入 attn_mask。

English

Apply TD scaled dot-product attention to complete query, key, and value time sequences:

\[Q_{cum}[t] = \sum_{i=0}^{t} Q[i], \quad K_{cum}[t] = \sum_{i=0}^{t} K[i], \quad V_{cum}[t] = \sum_{i=0}^{t} V[i]\]

\[Y_{cum}[t] = \operatorname{SDPA}(Q_{cum}[t], K_{cum}[t], V_{cum}[t])\]

\[Y[0] = Y_{cum}[0], \quad Y[t] = Y_{cum}[t] - Y_{cum}[t-1]\]

Thus, Y.cumsum(dim=0) matches ANN SDPA applied to the cumulative query, key, and value at each time step. The output contains floating-point differential values, may be negative, and is not a binary spike tensor. Y[0] always equals SDPA applied to the first query, key, and value step, so with T = 1 the operator reduces to ordinary SDPA. The output dtype follows PyTorch SDPA, and the operator is transparent to autograd.
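The cumulate, attend, difference pipeline can be sketched without PyTorch. The following is an illustrative stand-in that uses nested lists for matrices, no mask, no causal option, and the usual default scale of 1/sqrt(E); the helper names `sdpa`, `time_cumsum`, and `td_sdpa` are hypothetical and this is not the library's code.

```python
import math


def sdpa(q, k, v):
    """softmax(q k^T / sqrt(E)) v for list-of-lists matrices."""
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for q_row in q:
        scores = [scale * sum(a * b for a, b in zip(q_row, k_row)) for k_row in k]
        m = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        out.append([sum(w / z * v_row[j] for w, v_row in zip(weights, v))
                    for j in range(len(v[0]))])
    return out


def mat_add(x, y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]


def mat_sub(x, y):
    return [[a - b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]


def time_cumsum(seq):
    """Running sum of matrices along the time axis (list index)."""
    out = [seq[0]]
    for x in seq[1:]:
        out.append(mat_add(out[-1], x))
    return out


def td_sdpa(q_seq, k_seq, v_seq):
    if len(q_seq) == 0 or not (len(q_seq) == len(k_seq) == len(v_seq)):
        raise ValueError("time dimension must be non-empty and equal for q, k, v")
    q_cum, k_cum, v_cum = time_cumsum(q_seq), time_cumsum(k_seq), time_cumsum(v_seq)
    y_cum = [sdpa(q, k, v) for q, k, v in zip(q_cum, k_cum, v_cum)]
    # Y[0] = Y_cum[0]; Y[t] = Y_cum[t] - Y_cum[t-1]
    return [y_cum[0]] + [mat_sub(y_cum[t], y_cum[t - 1]) for t in range(1, len(y_cum))]


# T=2, L=1, S=2, E=2, Ev=1
q_seq = [[[1.0, 0.0]], [[0.0, 1.0]]]
k_seq = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 0.0], [1.0, 0.0]]]
v_seq = [[[1.0], [2.0]], [[1.0], [-1.0]]]

y_seq = td_sdpa(q_seq, k_seq, v_seq)
recovered = time_cumsum(y_seq)  # should match SDPA on the cumulated inputs
reference = [sdpa(q, k, v) for q, k, v in
             zip(time_cumsum(q_seq), time_cumsum(k_seq), time_cumsum(v_seq))]
```

Comparing `recovered` with `reference` up to floating-point rounding is a quick way to check the equivalence stated above on small inputs.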

参数:

query_seq (Tensor) -- Query time sequence with shape [T, ..., L, E] and T > 0.
key_seq (Tensor) -- Key time sequence with shape [T, ..., S, E]. Its time dimension length must match query_seq.
value_seq (Tensor) -- Value time sequence with shape [T, ..., S, Ev]. Its time dimension length must match query_seq.
attn_mask (Tensor or None) -- Attention mask with the same broadcast semantics as PyTorch SDPA.

返回:

TD scaled dot-product attention differential sequence with shape [T, ..., L, Ev].

返回类型: