spikingjelly.activation_based.cuda_kernel.auto_cuda package

spikingjelly.activation_based.cuda_kernel.auto_cuda.base.wrap_with_comment(code, comment)

Parameters:

code (str)
comment (str)

spikingjelly.activation_based.cuda_kernel.auto_cuda.base.startswiths(x, prefixes)

Parameters:

x (str)
prefixes (tuple)

class spikingjelly.activation_based.cuda_kernel.auto_cuda.base.CKernel(kernel_name)

Bases: object

Base wrapper for custom CUDA kernels. It stores the kernel's parameter table (cparams), the reserved C variable names (reserved_cnames), and the code segments that are concatenated into the kernel source (declaration/head/core/tail).

Parameters: kernel_name (str) -- CUDA kernel name

check_attributes(**kwargs)

Check whether the attribute values given in kwargs match the current attributes of this object.

Parameters: kwargs (dict) -- Attribute key-value pairs to check
Returns: True if all attributes match; otherwise False
Return type: bool
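
The comparison described above can be pictured with a small stand-in class. This is only a sketch: ToyKernel and its attributes are hypothetical, and treating a missing attribute as a mismatch is an assumption rather than documented behavior.

```python
class ToyKernel:
    """Hypothetical stand-in, not the spikingjelly class."""

    def __init__(self, kernel_name, dtype):
        self.kernel_name = kernel_name
        self.dtype = dtype

    def check_attributes(self, **kwargs):
        # Assumption: an attribute that does not exist counts as a mismatch.
        missing = object()
        for name, expected in kwargs.items():
            if getattr(self, name, missing) != expected:
                return False
        return True


kernel = ToyKernel("lif_forward", "float")
same = kernel.check_attributes(kernel_name="lif_forward", dtype="float")
different = kernel.check_attributes(dtype="half2")
```

A check of this kind is typically useful for deciding whether an already built kernel object can be reused for a new configuration.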

property core

set_contiguous(py_dict)

Make the torch.Tensor/cupy.ndarray values in py_dict contiguous in memory. Raise an error for any other value type.

Parameters: py_dict (dict) -- Kernel argument dictionary

get_device(py_dict)

Traverse py_dict and return the CUDA device id of the first tensor-like value.

Parameters: py_dict (dict) -- Kernel argument dictionary
Returns: CUDA device id
Return type: int
Raises: ValueError -- If no tensor-like value is found

check_device(device, py_dict)

Validate that all tensor-like values in py_dict are on the CUDA device given by device.

Parameters:

device (int) -- Target CUDA device id
py_dict (dict) -- Kernel argument dictionary

check_keys(py_dict)

Check whether the keys of py_dict exactly match the keys of self.cparams. Raise an error on mismatch.

Parameters: py_dict (dict) -- Kernel argument dictionary
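
A minimal sketch of such a key-set comparison, written as a free function for illustration. The function body, the error type (ValueError), and the message format are assumptions; the documentation only states that an error is raised on mismatch.

```python
def check_keys(cparams, py_dict):
    """Raise if the runtime keys differ from the declared parameter names."""
    declared = set(cparams)
    given = set(py_dict)
    if declared != given:
        missing = sorted(declared - given)
        unexpected = sorted(given - declared)
        # Assumption: ValueError; the source only says "an error".
        raise ValueError(f"missing: {missing}, unexpected: {unexpected}")


cparams = {"numel": "const int &", "x": "const float *"}
check_keys(cparams, {"x": [1.0], "numel": 1})  # same key set, passes silently

try:
    check_keys(cparams, {"x": [1.0]})
    message = ""
except ValueError as err:
    message = str(err)
```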

check_ctypes(py_dict)

Validate that the dtypes of the runtime values in py_dict match the CUDA C parameter types declared in self.cparams (e.g., float/half2/int).

Parameters: py_dict (dict) -- Kernel argument dictionary

check_half2(py_dict)

Extension hook that subclasses implement as needed to check half2-related arguments.

Parameters: py_dict (dict) -- Kernel argument dictionary

get_ptrs(py_dict)

Collect the underlying pointers (or equivalent objects) of the tensor-like values in py_dict and return them as a tuple, following the parameter order obtained after sorting by key.

Parameters: py_dict (dict) -- Kernel argument dictionary
Returns: Tuple of argument pointers
Return type: tuple

__call__(grid, block, py_dict, *args_1, **kwargs)

Execute the CUDA kernel. Before the launch, this method checks device consistency, makes the arguments contiguous, checks ctypes compatibility, and verifies that the keys match self.cparams.

Parameters:

grid (tuple) -- CUDA grid configuration
block (tuple) -- CUDA block configuration
py_dict (dict) -- Runtime argument dictionary whose keys correspond one-to-one to self.cparams

add_param(ctype, cname)

Add one CUDA parameter declaration to self.cparams.

Parameters:

ctype (str) -- CUDA parameter type string
cname (str) -- CUDA parameter name

Raises:

ValueError -- If the name already exists or conflicts with a reserved name
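
The two failure cases can be illustrated with a toy parameter table. This is a sketch under assumptions: ToyParamTable is hypothetical, and storing parameters as a {cname: ctype} dictionary with "numel" predefined and "index" reserved is assumed for the example only.

```python
class ToyParamTable:
    """Hypothetical stand-in; not the spikingjelly implementation."""

    def __init__(self):
        # Assumption: parameters are stored as {cname: ctype}.
        self.cparams = {"numel": "const int &"}
        self.reserved_cnames = ["index"]

    def add_param(self, ctype, cname):
        if cname in self.cparams:
            raise ValueError(f"{cname} is already a kernel parameter")
        if cname in self.reserved_cnames:
            raise ValueError(f"{cname} is a reserved name")
        self.cparams[cname] = ctype


table = ToyParamTable()
table.add_param(ctype="const float *", cname="x")

errors = []
for bad_name in ("x", "index"):  # a duplicate, then a reserved name
    try:
        table.add_param(ctype="float *", cname=bad_name)
    except ValueError as err:
        errors.append(str(err))
```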

property declaration

property head

property tail

property full_codes

Return the full CUDA source string assembled from declaration/head/core/tail.

Returns: Full CUDA source code
Return type: str
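
To make the assembly concrete, the sketch below joins four text segments into one source string. The helper names (build_declaration, assemble), the exact declaration layout, and the sample parameter table are all hypothetical; the real properties may format the text differently.

```python
def build_declaration(kernel_name, cparams):
    # Hypothetical layout: one "ctype cname" pair per parameter, comma separated.
    params = ", ".join(f"{ctype} {cname}" for cname, ctype in cparams.items())
    return f'extern "C" __global__\nvoid {kernel_name}({params})\n'


def assemble(declaration, head, core, tail):
    # Concatenate the four segments in the order named in the description.
    return declaration + head + core + tail


cparams = {"numel": "const int &", "x": "const float *", "y": "float *"}
source = assemble(
    build_declaration("double_it", cparams),
    "{\n",
    "    // core: the kernel body would go here\n",
    "}\n",
)
```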

class spikingjelly.activation_based.cuda_kernel.auto_cuda.base.CKernel1D(*args, **kwargs)

Bases: CKernel

1D (element-wise) CUDA kernel wrapper inherited from CKernel. It adds the numel parameter by default and reserves index as the name of the thread index variable.

Parameters: kernel_name (str) -- CUDA kernel name (forwarded to the base class via *args, **kwargs)

property head

Return the 1D kernel head code, which computes the thread index and opens the index < numel bounds check.

Returns: CUDA head code snippet
Return type: str

property tail

Return the 1D kernel tail code, which closes the blocks opened in head.

Returns: CUDA tail code snippet
Return type: str
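
The relationship between head and tail can be sketched with two string constants. The CUDA text below is an assumed shape consistent with the description (thread-index computation plus an index < numel guard); HEAD_1D, TAIL_1D, and wrap_core are hypothetical names, not the library's actual output.

```python
# Assumed head: opens the kernel body and the bounds-check block.
HEAD_1D = (
    "{\n"
    "    const int index = blockIdx.x * blockDim.x + threadIdx.x;\n"
    "    if (index < numel)\n"
    "    {\n"
)

# Assumed tail: closes exactly the two blocks opened by the head.
TAIL_1D = (
    "    }\n"
    "}\n"
)


def wrap_core(core):
    """Place the element-wise core between head and tail."""
    return HEAD_1D + core + TAIL_1D


body = wrap_core("        y[index] = x[index] * 2.0f;\n")
```

Because the tail only closes what the head opened, the core can be written as if it were already inside the guarded block.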

check_half2(py_dict)

Validate that every half/float16 tensor in py_dict has an even number of elements, as required by half2 memory access.

Parameters: py_dict (dict) -- Kernel argument dictionary

Note

CKernel1D.__call__() pads odd-length half tensors before the kernel launch, so manual padding is usually unnecessary.

__call__(grid, block, py_dict, *args_1, **kwargs)

Execute the 1D CUDA kernel. For half/float16 tensors with an odd number of elements, this method pads one trailing element before delegating to the base-class call, then removes the padding and restores the original shape and length.

Parameters:

grid (tuple) -- CUDA grid configuration
block (tuple) -- CUDA block configuration
py_dict (dict) -- Runtime argument dictionary whose keys correspond to self.cparams
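
The pad, launch, and trim sequence can be sketched with plain Python lists standing in for half tensors. This is an illustration only: call_with_half2_padding and toy_kernel are hypothetical, and the zero filler value is an assumption.

```python
def call_with_half2_padding(values, kernel):
    """Pad an odd-length list to even length, run kernel, then trim the result."""
    original_length = len(values)
    padded = list(values)
    if original_length % 2 == 1:
        padded.append(0.0)  # assumption: the filler value is arbitrary, 0 here
    result = kernel(padded)
    return result[:original_length]


seen_lengths = []


def toy_kernel(buffer):
    # Stands in for a half2 kernel that needs an even number of elements.
    seen_lengths.append(len(buffer))
    return [2.0 * v for v in buffer]


out = call_with_half2_padding([1.0, 2.0, 3.0], toy_kernel)
```

The caller sees a result with the original length of three, while the stand-in kernel only ever receives an even-length buffer.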

simple_call(**kwargs)

A convenience wrapper around CKernel1D.__call__(). It infers the device and numel from kwargs, computes the number of blocks from the configured thread count, and launches the kernel.

Parameters: kwargs (dict) -- Kernel argument mapping (numel is inferred automatically and need not be passed)
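
A common way to derive the block count from numel and a fixed thread count is ceiling division, sketched below. blocks_for is a hypothetical helper and 512 is an illustrative thread count, not necessarily the library's configured value.

```python
def blocks_for(numel, threads):
    """Smallest block count such that blocks * threads >= numel."""
    return (numel + threads - 1) // threads


numel = 1000
threads = 512  # illustrative value, not necessarily the configured default
blocks = blocks_for(numel, threads)

launched = blocks * threads
# Only threads whose index passes an "index < numel" guard do real work.
active = sum(1 for index in range(launched) if index < numel)
```

With these numbers, two blocks are launched and the surplus threads in the last block are masked out by the bounds check.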

class spikingjelly.activation_based.cuda_kernel.auto_cuda.base.CKernel2D(kernel_name, reverse=False)

Bases: CKernel

A 2D CUDA kernel wrapper derived from CKernel. It provides two built-in parameters: numel (total element count) and N (number of elements per time step). Any 2D tensor is interpreted as [T, N], where T is the sequence length.

reverse controls the direction of the temporal loop: True uses for(int t = numel - N + index; t >= 0; t -= dt); False uses for(int t = index; t < numel; t += dt).

Parameters:

kernel_name (str) -- CUDA kernel name
reverse (bool) -- Whether to traverse the time dimension in reverse
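
The two loop forms can be traced in plain Python to see which flat offsets a single thread visits. The sketch assumes dt equals N (one row of the [T, N] layout per step), which the text above does not state; time_indices is a hypothetical helper.

```python
def time_indices(index, numel, N, reverse=False):
    """Flat offsets visited by the thread that owns column `index`."""
    dt = N  # assumption: the stride is one time step, i.e. N elements
    visited = []
    if reverse:
        t = numel - N + index
        while t >= 0:
            visited.append(t)
            t -= dt
    else:
        t = index
        while t < numel:
            visited.append(t)
            t += dt
    return visited


T, N = 3, 4
forward = time_indices(index=1, numel=T * N, N=N)
backward = time_indices(index=1, numel=T * N, N=N, reverse=True)
```

Under this assumption each thread walks its own column through all T time steps, front to back or back to front.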

property pre_core

property post_core

check_shape(py_dict)

Validate tensor dimensionality in py_dict. All torch.Tensor and cupy.ndarray values must have ndim <= 2.

Parameters: py_dict (dict) -- Kernel argument dictionary

check_half2(py_dict)

Check whether the half-precision tensors in py_dict satisfy the half2 alignment requirement (even length). For torch.half and np.float16 values, 1D tensors require an even numel and 2D tensors require an even shape[1].

Parameters: py_dict (dict) -- Kernel argument dictionary

Note

CKernel2D.__call__() automatically pads odd-sized half tensors before launch. This method only states and validates the constraint.
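
The shape rule can be written as a small predicate over shape tuples. half2_shape_ok is a hypothetical helper that mirrors the stated rule and is not part of the library.

```python
def half2_shape_ok(shape):
    """Apply the stated rule: 1D needs even numel, 2D needs even shape[1]."""
    if len(shape) == 1:
        return shape[0] % 2 == 0
    if len(shape) == 2:
        return shape[1] % 2 == 0
    raise ValueError(f"expected a 1D or 2D shape, got {shape}")


results = {
    (6,): half2_shape_ok((6,)),
    (5,): half2_shape_ok((5,)),
    (3, 4): half2_shape_ok((3, 4)),
    (4, 3): half2_shape_ok((4, 3)),
}
```

Note that for a 2D tensor only the per-step width matters: a shape of (3, 4) passes even though T is odd, while (4, 3) fails.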

__call__(grid, block, py_dict, *args_1, **kwargs)

Execute the 2D CUDA kernel. *args_1 and **kwargs are forwarded directly to the cupy.RawKernel call.

The keys of py_dict must match the parameter names in self.cparams one-to-one, and the values must be torch.Tensor or cupy.ndarray objects with compatible dtypes. Key order is arbitrary because the arguments are reordered internally to follow the formal parameter order.

Parameters:

grid (tuple) -- CUDA grid configuration
block (tuple) -- CUDA block configuration
py_dict (dict) -- Kernel argument dictionary

Note

Tensor inputs must be 1D or 2D. Odd-sized half-precision tensors are padded before launch; after execution the padding is removed and the original shape is restored.
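
Why key order does not matter can be seen from a sketch that rebuilds the positional argument tuple from the formal parameter table. align_args, the sample parameter table, and the placeholder pointer string are hypothetical.

```python
def align_args(cparams, py_dict):
    """Return the values of py_dict ordered like the formal parameters."""
    return tuple(py_dict[cname] for cname in cparams)


# Formal parameters in declaration order (illustrative names and types).
cparams = {"N": "const int &", "numel": "const int &", "x_seq": "const float *"}

# Runtime arguments supplied in a different order.
py_dict = {"x_seq": "<pointer to x_seq>", "numel": 12, "N": 4}

args = align_args(cparams, py_dict)
```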

property head

property tail

simple_call(**kwargs)

A convenience wrapper around CKernel2D.__call__(). It infers the device, numel, and N from kwargs, computes the CUDA thread and block counts from the configuration, and then launches the kernel.

Parameters: kwargs (dict) -- Kernel argument mapping (numel and N are inferred and need not be passed)
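
The inference step can be sketched with shape tuples in place of tensors. Everything here is an assumption for illustration: the helper names, taking the first 2D shape as the reference, and sizing the launch to cover N columns are not confirmed by the text above.

```python
def infer_numel_and_N(shapes):
    """Take the first 2D shape [T, N] and derive numel = T * N and N."""
    for shape in shapes:
        if len(shape) == 2:
            T, N = shape
            return T * N, N
    raise ValueError("no 2D shape found")


def blocks_for(count, threads):
    """Ceiling division: enough blocks to cover `count` work items."""
    return (count + threads - 1) // threads


numel, N = infer_numel_and_N([(4,), (8, 100)])
threads = 64  # illustrative value
# Assumption: one thread per column, so the launch only needs to cover N.
blocks = blocks_for(N, threads)
```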

class spikingjelly.activation_based.cuda_kernel.auto_cuda.base.CodeTyper(indent_num)

Bases: object

A helper for formatting and assembling CUDA code with indentation. Code can be appended piece by piece, and the accumulated text is stored in self.codes.

Parameters: indent_num (int) -- Number of spaces used for the initial indentation

append(codes)

Append a CUDA code snippet to self.codes. The method splits the input on ; and writes each statement with the current indentation, handling the block boundary tokens { and } separately.

Parameters: codes (str) -- CUDA code snippet to append
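
A minimal sketch of the split-and-indent behavior described above. ToyCodeTyper is a hypothetical re-creation; details such as stripping whitespace, dropping newlines, and starting self.codes with a newline are assumptions.

```python
class ToyCodeTyper:
    """Hypothetical re-creation of the described behavior."""

    def __init__(self, indent_num):
        self.indent = " " * indent_num
        self.codes = "\n"

    def append(self, codes):
        for statement in codes.replace("\n", "").split(";"):
            statement = statement.strip()
            if not statement:
                continue
            if statement in ("{", "}"):
                # Block boundaries are written without a trailing semicolon.
                self.codes += f"{self.indent}{statement}\n"
            else:
                self.codes += f"{self.indent}{statement};\n"


typer = ToyCodeTyper(indent_num=4)
typer.append("float a = 1.0f; float b = a + 1.0f;")
typer.append("{")
typer.append("}")
```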

class spikingjelly.activation_based.cuda_kernel.auto_cuda.base.CodeBlock(env)

Bases: object

A context-manager utility for CodeTyper that inserts a {...} block and adjusts the indentation automatically. It is useful for composing multi-line CUDA logic that uses intermediate variables.

Parameters: env (CodeTyper) -- Target code-typing environment
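
The context-manager idea can be sketched as follows. ToyTyper and ToyBlock are hypothetical stand-ins, and the four-space nesting step is an assumption; the point is only that entering the block emits "{" and deepens the indent, while leaving it restores the indent and emits "}".

```python
class ToyTyper:
    """Minimal hypothetical stand-in for a CodeTyper-like object."""

    def __init__(self, indent_num):
        self.indent = " " * indent_num
        self.codes = ""

    def append(self, line):
        self.codes += f"{self.indent}{line}\n"


class ToyBlock:
    """Hypothetical context manager: emits braces and shifts the indent."""

    STEP = "    "  # assumption: one nesting level is four spaces

    def __init__(self, env):
        self.env = env

    def __enter__(self):
        self.env.append("{")
        self.env.indent += self.STEP
        return self.env

    def __exit__(self, exc_type, exc_value, traceback):
        self.env.indent = self.env.indent[: -len(self.STEP)]
        self.env.append("}")
        return False  # do not swallow exceptions


typer = ToyTyper(0)
with ToyBlock(typer) as env:
    env.append("const float tmp = x[index] * 2.0f;")
    env.append("y[index] = tmp + 1.0f;")
```

Scoping the intermediate variable tmp inside its own braces keeps it from clashing with names used elsewhere in the generated kernel.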

spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.wrap_return_codes(y, codes)

Parameters:

y (str | None)
codes (str)

spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.float2half2(y, x)

Parameters:

y (str | None)
x (str)

spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.constant(y, x, dtype)

Parameters:

y (str | None)
x (float)
dtype (str)

spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.abs(y, x, dtype)

Parameters:

y (str | None)
x (str)
dtype (str)

spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.power(z, x, y, dtype)

Parameters:

z (str | None)
x (str)
y (str)
dtype (str)

spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.if_else(z, x, y, mask, dtype)

Parameters:

z (str | None)
x (str)
y (str)
mask (str)
dtype (str)

spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.if_else_else(w, x, y, z, mask_x, mask_y, dtype)

Parameters:

w (str | None)
x (str)
y (str)
z (str)
mask_x (str)
mask_y (str)
dtype (str)

spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.greater_equal(z, x, y, dtype)

Parameters:

z (str | None)
x (str)
y (str)
dtype (str)

spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.greater_than(z, x, y, dtype)

Parameters:

z (str | None)
x (str)
y (str)
dtype (str)

spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.minimal(z, x, y, dtype)

Parameters:

z (str | None)
x (str)
y (str)
dtype (str)

spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.maximum(z, x, y, dtype)

Parameters:

z (str | None)
x (str)
y (str)
dtype (str)

spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.add(z, x, y, dtype)

Parameters:

z (str | None)
x (str)
y (str)
dtype (str)

spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.sub(z, x, y, dtype)

Parameters:

z (str | None)
x (str)
y (str)
dtype (str)

spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.mul(z, x, y, dtype)

Parameters:

z (str | None)
x (str)
y (str)
dtype (str)

spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.div(z, x, y, dtype)

Parameters:

z (str | None)
x (str)
y (str)
dtype (str)

spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.neg(y, x, dtype)

Parameters:

y (str | None)
x (str)
dtype (str)

spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.heaviside(y, x, dtype)

Parameters:

y (str | None)
x (str)
dtype (str)

spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.exp(y, x, dtype)

Parameters:

y (str | None)
x (str)
dtype (str)

spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.sigmoid(y, x, alpha, dtype)

Parameters:

y (str | None)
x (str)
alpha (float)
dtype (str)

spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.sigmoid_backward(y, x, alpha, dtype)

Parameters:

y (str)
x (str)
alpha (float)
dtype (str)

spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.atan_backward(y, x, alpha, dtype)

Parameters:

y (str)
x (str)
alpha (float)
dtype (str)

spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.piecewise_leaky_relu_backward(y, x, w, c, dtype)

Parameters:

y (str)
x (str)
w (float)
c (float)
dtype (str)

spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.s2nn_backward(y, x, alpha, beta, dtype)

Parameters:

y (str)
x (str)
alpha (float)
beta (float)
dtype (str)

spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.q_pseudo_spike_backward(y, x, alpha, dtype)

Parameters:

y (str)
x (str)
alpha (float)
dtype (str)

spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.leaky_k_relu_backward(y, x, leak, k, dtype)

Parameters:

y (str)
x (str)
leak (float)
k (float)
dtype (str)

spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.fake_numerical_gradient_backward(y, x, alpha, dtype)

Parameters:

y (str)
x (str)
alpha (float)
dtype (str)

spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.log_tailed_relu_backward(y, x, alpha, dtype)

Parameters:

y (str)
x (str)
alpha (float)
dtype (str)