spikingjelly.activation_based.cuda_kernel.auto_cuda package

spikingjelly.activation_based.cuda_kernel.auto_cuda.base.wrap_with_comment(code: str, comment: str)[source]
spikingjelly.activation_based.cuda_kernel.auto_cuda.base.startswiths(x: str, prefixes: tuple)[source]
class spikingjelly.activation_based.cuda_kernel.auto_cuda.base.CKernel(kernel_name: str)[source]

Bases: object

Base wrapper for custom CUDA kernels. It maintains the kernel's formal parameter table (cparams), the reserved C variable names (reserved_cnames), and the code segments that are concatenated into the kernel source (declaration/head/core/tail).

Parameters:

kernel_name (str) -- CUDA kernel name

check_attributes(**kwargs)[source]

Check whether the attribute values given in kwargs match the current attributes of this object.

Parameters:

kwargs (dict) -- Attribute key-value pairs to check

Returns:

True if all attributes match; otherwise False

Return type:

bool
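
As a rough illustration of the comparison described above, the sketch below uses a hypothetical stand-in class (FakeKernel is not part of SpikingJelly) and compares each keyword against the attribute of the same name. The real method may differ in detail, for example in how it treats attributes that do not exist.

```python
class FakeKernel:
    """Hypothetical stand-in for a kernel object; not the SpikingJelly class."""

    def __init__(self, kernel_name: str, reverse: bool = False):
        self.kernel_name = kernel_name
        self.reverse = reverse

    def check_attributes(self, **kwargs) -> bool:
        # Every key must name an attribute whose value equals the given value.
        # In this sketch a missing attribute raises AttributeError.
        return all(getattr(self, key) == value for key, value in kwargs.items())


kernel = FakeKernel("demo_kernel", reverse=True)
print(kernel.check_attributes(kernel_name="demo_kernel", reverse=True))  # True
print(kernel.check_attributes(reverse=False))  # False
```

A check of this kind is typically useful for deciding whether a cached kernel object can be reused for a new configuration.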

property core
set_contiguous(py_dict: dict)[source]

Make every torch.Tensor/cupy.ndarray value in py_dict contiguous in memory. An error is raised for values of any other type.

Parameters:

py_dict (dict) -- Kernel argument dictionary

get_device(py_dict: dict) int[source]

Traverse py_dict and return the CUDA device id of the first tensor-like value.

Parameters:

py_dict (dict) -- Kernel argument dictionary

Returns:

CUDA device id

Return type:

int

Raises:

ValueError -- If py_dict contains no tensor-like value

check_device(device: int, py_dict: dict)[source]

Check that all tensor-like values in py_dict are located on the CUDA device specified by device.

Parameters:
  • device (int) -- Target CUDA device id

  • py_dict (dict) -- Kernel argument dictionary

check_keys(py_dict: dict)[source]

Check whether the key set of py_dict exactly matches the key set of self.cparams. An error is raised on mismatch.

Parameters:

py_dict (dict) -- Kernel argument dictionary
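
The key-set comparison can be sketched in plain Python. The function below is a free-standing illustration rather than the library method: the exception type (ValueError), the error message, and the ctype strings in the example table are assumptions.

```python
def check_keys(py_dict: dict, cparams: dict) -> None:
    """Sketch: raise ValueError unless both dictionaries have the same key set."""
    if py_dict.keys() != cparams.keys():
        # Symmetric difference: keys that appear on one side only.
        mismatched = sorted(py_dict.keys() ^ cparams.keys())
        raise ValueError(f"mismatched keys: {mismatched}")


# Illustrative parameter table (cname -> ctype); the ctype strings are assumptions.
cparams = {"numel": "const int &", "x": "const float *", "y": "float *"}

# Same key set in a different order: passes silently.
check_keys({"y": None, "x": None, "numel": None}, cparams)

try:
    check_keys({"x": None, "numel": None}, cparams)
except ValueError as err:
    message = str(err)
    print(message)  # mismatched keys: ['y']
```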

check_ctypes(py_dict: dict)[source]

Check that the data type of each value in py_dict matches the CUDA C parameter type declared for it in self.cparams (e.g., float/half2/int).

Parameters:

py_dict (dict) -- Kernel argument dictionary

check_half2(py_dict: dict)[source]

Extension hook that subclasses implement as needed to validate half2 arguments.

Parameters:

py_dict (dict) -- Kernel argument dictionary

get_ptrs(py_dict: dict)[source]

Collect the underlying pointer (or an equivalent object) of each tensor-like value in py_dict and return them as a tuple, in the parameter order obtained after sorting by key.

Parameters:

py_dict (dict) -- Kernel argument dictionary

Returns:

Tuple of argument pointers

Return type:

tuple

__call__(grid: tuple, block: tuple, py_dict: dict, *args_1, **kwargs)[source]

Execute the CUDA kernel. Before the launch, this method checks device consistency, makes the arguments contiguous, checks argument types, and validates the key set against self.cparams.

Parameters:
  • grid (tuple) -- CUDA grid configuration

  • block (tuple) -- CUDA block configuration

  • py_dict (dict) -- Runtime argument dictionary whose keys correspond one-to-one to self.cparams

add_param(ctype: str, cname: str)[source]

Add one CUDA formal parameter declaration to self.cparams.

Parameters:
  • ctype (str) -- CUDA parameter type string

  • cname (str) -- CUDA parameter name

Raises:

ValueError -- If the name already exists or conflicts with a reserved name
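
The two failure cases listed above can be illustrated with a small registry. ParamTable is a hypothetical class written for this sketch, not the library's CKernel; only the rule "duplicate or reserved names raise ValueError" is taken from the description, and the error messages are invented for the example.

```python
class ParamTable:
    """Hypothetical parameter registry mirroring the rules described for add_param."""

    def __init__(self, reserved_cnames):
        self.cparams = {}  # cname -> ctype, kept in insertion order
        self.reserved_cnames = list(reserved_cnames)

    def add_param(self, ctype: str, cname: str) -> None:
        if cname in self.cparams:
            raise ValueError(f"{cname} is already a parameter")
        if cname in self.reserved_cnames:
            raise ValueError(f"{cname} is a reserved name")
        self.cparams[cname] = ctype


table = ParamTable(reserved_cnames=["index"])
table.add_param(ctype="const float *", cname="x")
table.add_param(ctype="float *", cname="y")

errors = []
for name in ("x", "index"):  # a duplicate name, then a reserved name
    try:
        table.add_param(ctype="float *", cname=name)
    except ValueError as err:
        errors.append(str(err))
print(errors)
```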

property declaration
property head
property tail
property full_codes

Return the full CUDA source string assembled from declaration/head/core/tail.

Returns:

Full CUDA source code

Return type:

str

class spikingjelly.activation_based.cuda_kernel.auto_cuda.base.CKernel1D(*args, **kwargs)[source]

Bases: CKernel

1D (element-wise) CUDA kernel wrapper derived from CKernel. It adds the numel parameter by default and reserves index as the name of the thread index variable.

Parameters:

kernel_name (str) -- CUDA kernel name (forwarded to the base class via *args, **kwargs)

property head

Return the head code of the 1D kernel, which contains the thread-index computation and the index < numel bounds guard.

Returns:

CUDA head code snippet

Return type:

str

property tail

Return the tail code of the 1D kernel, which closes the code blocks opened in head.

Returns:

CUDA tail code snippet

Return type:

str

check_half2(py_dict: dict)[source]

Check that every half/float16 tensor in py_dict has an even number of elements, as required by half2 memory access.

Parameters:

py_dict (dict) -- Kernel argument dictionary

Note

CKernel1D.__call__() pads odd-length half tensors automatically before the kernel launch, so manual padding is usually unnecessary.

__call__(grid: tuple, block: tuple, py_dict: dict, *args_1, **kwargs)[source]

Execute the 1D CUDA kernel. For half/float16 tensors with an odd number of elements, this method pads one trailing element before delegating to the base class call, then removes the padding and restores the original shape and length.

Parameters:
  • grid (tuple) -- CUDA grid configuration

  • block (tuple) -- CUDA block configuration

  • py_dict (dict) -- Runtime argument dictionary whose keys correspond to self.cparams
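
The pad, launch, unpad sequence can be sketched without a GPU. The example below uses plain Python lists in place of half tensors and a hypothetical pairwise function in place of the CUDA kernel. Both helper names are invented for illustration, and the padding value 0.0 is an assumption, since the description does not state which value is used.

```python
def call_with_padding(launch, values):
    """Pad an odd-length sequence to even length, launch, then drop the padding."""
    original_length = len(values)
    padded = list(values)
    if original_length % 2 != 0:
        padded.append(0.0)  # assumption: the pad value is 0.0
    result = launch(padded)  # the "kernel" always sees an even element count
    return result[:original_length]  # restore the original length


def double_in_pairs(values):
    """Stand-in for a half2 kernel: consumes two elements per step."""
    if len(values) % 2 != 0:
        raise ValueError("pairwise processing needs an even number of elements")
    out = []
    for i in range(0, len(values), 2):
        out.extend([values[i] * 2, values[i + 1] * 2])
    return out


print(call_with_padding(double_in_pairs, [1.0, 2.0, 3.0]))  # [2.0, 4.0, 6.0]
```

Calling double_in_pairs directly on the three-element list would raise ValueError, which is the situation the automatic padding is meant to avoid.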

simple_call(**kwargs)[source]

A convenience wrapper around CKernel1D.__call__(). It infers the device and numel from kwargs, computes the number of blocks from the configured thread count, and launches the kernel.

Parameters:

kwargs (dict) -- Kernel argument mapping (numel does not need to be provided; it is inferred automatically)
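
A minimal sketch of the launch arithmetic, assuming the usual ceiling-division rule for the block count and a 1D thread index of the form block index times block size plus thread index. The helper names and the thread counts are illustrative; the library's own configuration values are not shown in this section.

```python
def cal_blocks(numel: int, threads: int) -> int:
    """Ceiling division: the smallest block count whose threads cover numel."""
    return (numel + threads - 1) // threads


def simulate_1d_launch(numel: int, threads: int):
    """Return the indices that pass the index < numel guard, in launch order."""
    visited = []
    for block_idx in range(cal_blocks(numel, threads)):
        for thread_idx in range(threads):
            index = block_idx * threads + thread_idx
            if index < numel:  # guard for the partially filled last block
                visited.append(index)
    return visited


print(cal_blocks(numel=1000, threads=128))  # 8
print(simulate_1d_launch(numel=5, threads=4))  # [0, 1, 2, 3, 4]
```

With 5 elements and 4 threads per block, two blocks are launched and the guard discards the three surplus threads of the second block.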

class spikingjelly.activation_based.cuda_kernel.auto_cuda.base.CKernel2D(kernel_name: str, reverse: bool = False)[source]

Bases: CKernel

2D CUDA kernel wrapper derived from CKernel. It provides two built-in parameters: numel (total element count) and N (number of elements per time step). Any 2D tensor is interpreted as [T, N], where T is the sequence length and N is the number of elements in a single time step.

reverse controls the direction of the loop over the time dimension: True uses for(int t = numel - N + index; t >= 0; t -= dt); False uses for(int t = index; t < numel; t += dt).

Parameters:
  • kernel_name (str) -- CUDA kernel name

  • reverse (bool) -- Whether to traverse the time dimension in reverse
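
The two loop forms can be traced in plain Python to see which flat offsets a single thread visits. This sketch assumes dt equals N (one row of the [T, N] layout per step), which the text above does not state explicitly; time_loop_offsets is a hypothetical helper.

```python
def time_loop_offsets(index: int, T: int, N: int, reverse: bool):
    """Flat offsets visited by one thread over a [T, N] tensor (assumes dt == N)."""
    numel = T * N
    dt = N  # assumption: the stride between consecutive time steps is one row
    offsets = []
    if reverse:
        # Mirrors: for(int t = numel - N + index; t >= 0; t -= dt)
        t = numel - N + index
        while t >= 0:
            offsets.append(t)
            t -= dt
    else:
        # Mirrors: for(int t = index; t < numel; t += dt)
        t = index
        while t < numel:
            offsets.append(t)
            t += dt
    return offsets


print(time_loop_offsets(index=1, T=3, N=4, reverse=False))  # [1, 5, 9]
print(time_loop_offsets(index=1, T=3, N=4, reverse=True))   # [9, 5, 1]
```

Under this assumption each thread owns one column of the [T, N] tensor and walks it forward or backward in time, which suits computations that carry state from one time step to the next.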

property pre_core
property post_core
check_shape(py_dict: dict)[source]

Check the dimensionality of the tensors in py_dict. All torch.Tensor and cupy.ndarray values must have ndim <= 2.

Parameters:

py_dict (dict) -- Kernel argument dictionary

check_half2(py_dict: dict)[source]

Check whether the half-precision tensors in py_dict satisfy the half2 alignment requirement (even length). For torch.half and np.float16 values, 1D tensors must have an even numel and 2D tensors must have an even shape[1].

Parameters:

py_dict (dict) -- Kernel argument dictionary

Note

CKernel2D.__call__() pads odd-sized half-precision tensors automatically before the launch. This method only enforces and validates the constraint.
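
The even-length rule can be written down directly on shape tuples. The helper below is a hypothetical illustration that returns a boolean; whether the library method returns a value or raises is not stated in this section.

```python
def half2_compatible(shape: tuple) -> bool:
    """Even-length rule from the description: numel for 1D, shape[1] for 2D."""
    if len(shape) == 1:
        return shape[0] % 2 == 0
    if len(shape) == 2:
        return shape[1] % 2 == 0
    raise ValueError(f"expected a 1D or 2D shape, got {shape}")


print(half2_compatible((6,)))    # True
print(half2_compatible((3, 4)))  # True: T may be odd, only N must be even
print(half2_compatible((4, 3)))  # False
```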

__call__(grid: tuple, block: tuple, py_dict: dict, *args_1, **kwargs)[source]

Execute the 2D CUDA kernel. *args_1 and **kwargs are forwarded directly to the cupy.RawKernel call.

The keys of py_dict must correspond one-to-one to the formal parameter names in self.cparams, and the values must be torch.Tensor/cupy.ndarray objects with matching dtypes. Key order is arbitrary because the arguments are reordered internally to follow the formal parameter order.

Parameters:
  • grid (tuple) -- CUDA grid configuration

  • block (tuple) -- CUDA block configuration

  • py_dict (dict) -- Kernel argument dictionary

Note

Tensors in py_dict must be 1D or 2D. Odd-sized half-precision tensors are padded before the launch, then unpadded and restored to their original shape afterward.
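
The reordering step can be pictured as follows. This is a sketch under the assumption that the formal parameter order is the key order of a cparams-like dictionary; order_arguments and the placeholder values are hypothetical, and strings stand in for tensors.

```python
def order_arguments(py_dict: dict, cparams: dict) -> tuple:
    """Return the values of py_dict arranged in the key order of cparams."""
    if py_dict.keys() != cparams.keys():
        raise ValueError("py_dict keys must match cparams keys")
    return tuple(py_dict[cname] for cname in cparams)


# Illustrative parameter table (cname -> ctype); the ctype strings are assumptions.
cparams = {
    "numel": "const int &",
    "N": "const int &",
    "x_seq": "const float *",
    "y_seq": "float *",
}
# The caller may list the arguments in any order.
py_dict = {"y_seq": "<y>", "N": 4, "x_seq": "<x>", "numel": 12}
print(order_arguments(py_dict, cparams))  # (12, 4, '<x>', '<y>')
```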

property head
property tail
simple_call(**kwargs)[source]

A convenience wrapper around CKernel2D.__call__(). It infers the device, numel, and N from kwargs, computes the CUDA threads and blocks from the configuration, and then launches the kernel.

Parameters:

kwargs (dict) -- Kernel argument mapping (numel and N do not need to be provided; they are inferred)

class spikingjelly.activation_based.cuda_kernel.auto_cuda.base.CodeTyper(indent_num: int)[source]

Bases: object

A helper for indenting and assembling CUDA code. It maintains self.codes internally; code can be appended piece by piece and is formatted with the current indentation.

Parameters:

indent_num (int) -- Number of spaces used for the initial indentation

append(codes: str)[source]

Append a CUDA code snippet to self.codes. The method splits the input at ; and writes the statements one by one with the current indentation, while handling block boundary tokens such as { and }.

Parameters:

codes (str) -- CUDA code snippet to append
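
To make the split-and-indent behavior concrete, here is a simplified re-implementation. MiniTyper is a hypothetical class written for this sketch; the real CodeTyper may format its output differently, and this version does not handle ; inside for(...) headers.

```python
class MiniTyper:
    """Hypothetical, simplified indenting code collector; not the library class."""

    def __init__(self, indent_num: int):
        self.indent = " " * indent_num
        self.codes = "\n"

    def append(self, codes: str) -> None:
        # Isolate braces so that each one becomes its own item after the split.
        codes = codes.replace("{", ";{;").replace("}", ";};")
        for statement in codes.split(";"):
            statement = statement.strip()
            if not statement:
                continue
            if statement in ("{", "}"):
                # Block boundary: written without a trailing semicolon.
                self.codes += f"{self.indent}{statement}\n"
            else:
                self.codes += f"{self.indent}{statement};\n"


typer = MiniTyper(indent_num=2)
typer.append("float a = x[index]; y[index] = a * a;")
print(typer.codes)

braces = MiniTyper(indent_num=2)
braces.append("{ int t = 0; }")
print(braces.codes)
```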

class spikingjelly.activation_based.cuda_kernel.auto_cuda.base.CodeBlock(env: CodeTyper)[source]

Bases: object

A context-manager utility for CodeTyper that inserts a {...} block and maintains the indentation automatically. It is useful for organizing multi-line CUDA logic that uses intermediate variables.

Parameters:

env (CodeTyper) -- Target code environment
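
The context-manager pattern described above can be sketched with two small stand-in classes. MiniTyper and MiniBlock are hypothetical and self-contained; the indentation step of 2 spaces and the CUDA statements (including the v_th variable) are illustrative choices, not taken from the library.

```python
class MiniTyper:
    """Hypothetical minimal code collector with a mutable indent level."""

    def __init__(self, indent_num: int):
        self.indent_num = indent_num
        self.lines = []

    def append(self, line: str) -> None:
        self.lines.append(" " * self.indent_num + line)


class MiniBlock:
    """Context manager that wraps appended lines in braces and indents them."""

    def __init__(self, env: MiniTyper, step: int = 2):
        self.env = env
        self.step = step

    def __enter__(self):
        self.env.append("{")
        self.env.indent_num += self.step
        return self.env

    def __exit__(self, exc_type, exc_value, traceback):
        self.env.indent_num -= self.step
        self.env.append("}")
        return False  # do not swallow exceptions


typer = MiniTyper(indent_num=0)
with MiniBlock(typer):
    # tmp is scoped to the braces, so it cannot clash with outer variables.
    typer.append("const float tmp = x[index] - v_th;")
    typer.append("y[index] = tmp >= 0.0f ? 1.0f : 0.0f;")
print("\n".join(typer.lines))
```

Wrapping the statements in their own braces gives the intermediate variable a local C scope, which is why such a helper is convenient when several generated snippets are concatenated.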

spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.wrap_return_codes(y: str | None, codes: str)[source]
spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.float2half2(y: str | None, x: str)[source]
spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.constant(y: str | None, x: float, dtype: str)[source]
spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.abs(y: str | None, x: str, dtype: str)[source]
spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.power(z: str | None, x: str, y: str, dtype: str)[source]
spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.if_else(z: str | None, x: str, y: str, mask: str, dtype: str)[source]
spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.if_else_else(w: str | None, x: str, y: str, z: str, mask_x: str, mask_y: str, dtype: str)[source]
spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.greater_equal(z: str | None, x: str, y: str, dtype: str)[source]
spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.greater_than(z: str | None, x: str, y: str, dtype: str)[source]
spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.minimal(z: str | None, x: str, y: str, dtype: str)[source]
spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.maximum(z: str | None, x: str, y: str, dtype: str)[source]
spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.add(z: str | None, x: str, y: str, dtype: str)[source]
spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.sub(z: str | None, x: str, y: str, dtype: str)[source]
spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.mul(z: str | None, x: str, y: str, dtype: str)[source]
spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.div(z: str | None, x: str, y: str, dtype: str)[source]
spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.neg(y: str | None, x: str, dtype: str)[source]
spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.heaviside(y: str | None, x: str, dtype: str)[source]
spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.exp(y: str | None, x: str, dtype: str)[source]
spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.sigmoid(y: str | None, x: str, alpha: float, dtype: str)[source]
spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.sigmoid_backward(y: str, x: str, alpha: float, dtype: str)[source]
spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.atan_backward(y: str, x: str, alpha: float, dtype: str)[source]
spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.piecewise_leaky_relu_backward(y: str, x: str, w: float, c: float, dtype: str)[source]
spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.s2nn_backward(y: str, x: str, alpha: float, beta: float, dtype: str)[source]
spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.q_pseudo_spike_backward(y: str, x: str, alpha: float, dtype: str)[source]
spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.leaky_k_relu_backward(y: str, x: str, leak: float, k: float, dtype: str)[source]
spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.fake_numerical_gradient_backward(y: str, x: str, alpha: float, dtype: str)[source]
spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.log_tailed_relu_backward(y: str, x: str, alpha: float, dtype: str)[source]
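
The cfunction signatures above take C variable names as strings plus a dtype string, which suggests that each function returns a CUDA C code string. The sketch below illustrates that idea with hypothetical stand-ins (wrap_return and add_sketch are not library functions). The convention that a None target yields a bare expression is an assumption, as the signatures alone do not confirm it.

```python
from typing import Optional


def wrap_return(y: Optional[str], codes: str) -> str:
    """Assumed convention: assign to y when given, else return an expression."""
    if y is None:
        return f"({codes})"
    return f"{y} = {codes};"


def add_sketch(z: Optional[str], x: str, y: str, dtype: str) -> str:
    """Hypothetical generator of CUDA C code for an element-wise addition."""
    if dtype == "float":
        codes = f"{x} + {y}"
    elif dtype == "half2":
        # __hadd2 is the CUDA intrinsic for adding two packed half values.
        codes = f"__hadd2({x}, {y})"
    else:
        raise NotImplementedError(dtype)
    return wrap_return(z, codes)


print(add_sketch("z", "x", "y", "float"))   # z = x + y;
print(add_sketch(None, "a", "b", "half2"))  # (__hadd2(a, b))
```

Returning a parenthesized expression when no target is given would allow generated snippets to be nested inside one another, for example as an operand of another generated expression.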