spikingjelly.activation_based.cuda_kernel.auto_cuda package
- spikingjelly.activation_based.cuda_kernel.auto_cuda.base.wrap_with_comment(code: str, comment: str)
- class spikingjelly.activation_based.cuda_kernel.auto_cuda.base.CKernel(kernel_name: str)
Bases: object
Base wrapper for custom CUDA kernels. It stores the kernel parameter table (cparams), the reserved C variable names (reserved_cnames), and the code segments that are concatenated into the kernel source (declaration/head/core/tail).
- Parameters:
kernel_name (str) -- CUDA kernel name
- check_attributes(**kwargs)
Check whether the attribute values given in kwargs match the current attributes of this object.
- property core
- set_contiguous(py_dict: dict)
Make every torch.Tensor / cupy.ndarray value in py_dict contiguous in memory. Raise an error if a value of any other type is found.
- Parameters:
py_dict (dict) -- Kernel argument dictionary
- get_device(py_dict: dict) -> int
Traverse py_dict and return the CUDA device id of the first tensor-like value.
- Parameters:
py_dict (dict) -- Kernel argument dictionary
- Returns:
CUDA device id
- Return type:
int
- Raises:
ValueError -- If py_dict contains no tensor-like value
- check_device(device: int, py_dict: dict)
Check that all tensor-like values in py_dict are located on the CUDA device given by device.
- check_keys(py_dict: dict)
Check whether the key set of py_dict exactly matches the key set of self.cparams. Raise an error on mismatch.
- Parameters:
py_dict (dict) -- Kernel argument dictionary
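The key check described above can be pictured with a minimal pure-Python sketch. The function name check_keys_sketch, the error message, and the choice of ValueError are assumptions made for illustration; the text above only says that an error is raised on mismatch.

```python
def check_keys_sketch(py_dict: dict, cparams: dict) -> None:
    """Raise ValueError unless both mappings have exactly the same keys."""
    if py_dict.keys() != cparams.keys():
        # Dictionary key views behave like sets, so set difference works here.
        missing = sorted(cparams.keys() - py_dict.keys())
        extra = sorted(py_dict.keys() - cparams.keys())
        raise ValueError(f"key mismatch: missing={missing}, extra={extra}")


# Hypothetical parameter table: C name -> C type.
cparams = {"numel": "const int &", "x": "const float *", "y": "float *"}

# Passes: the key sets are equal, and key order does not matter.
check_keys_sketch({"y": 0, "x": 0, "numel": 4}, cparams)
```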
- check_ctypes(py_dict: dict)
Check that the data type of each value in py_dict matches the CUDA C parameter type declared in self.cparams (e.g., float/half2/int).
- Parameters:
py_dict (dict) -- Kernel argument dictionary
- check_half2(py_dict: dict)
Extension hook that subclasses implement as needed to perform half2-related checks.
- Parameters:
py_dict (dict) -- Kernel argument dictionary
- get_ptrs(py_dict: dict)
Collect the underlying pointers (or equivalent objects) of the tensor-like values in py_dict and return them as a tuple, in the parameter order obtained after sorting the keys.
- __call__(grid: tuple, block: tuple, py_dict: dict, *args_1, **kwargs)
Execute the CUDA kernel. Before the launch, this method checks device consistency, makes the arguments contiguous, checks that the argument types are compatible with the declared C types, and checks that the keys match self.cparams.
- add_param(ctype: str, cname: str)
Add one CUDA parameter declaration to self.cparams.
- Raises:
ValueError -- If the name already exists or conflicts with a reserved name
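As a rough illustration of the bookkeeping described above, the following sketch keeps a parameter table and a list of reserved names and rejects duplicates. ParamTableSketch is a hypothetical stand-in, not the library class, and the error messages are invented.

```python
class ParamTableSketch:
    """Hypothetical stand-in for the cparams / reserved_cnames bookkeeping."""

    def __init__(self, reserved=()):
        self.cparams = {}                      # C name -> C type
        self.reserved_cnames = list(reserved)  # names a kernel body uses itself

    def add_param(self, ctype: str, cname: str) -> None:
        if cname in self.cparams:
            raise ValueError(f"{cname} is already declared")
        if cname in self.reserved_cnames:
            raise ValueError(f"{cname} is a reserved name")
        self.cparams[cname] = ctype


table = ParamTableSketch(reserved=("index",))
table.add_param("const int &", "numel")
table.add_param("const float *", "x")
```

Calling table.add_param again with "numel" or with the reserved name "index" raises ValueError in this sketch.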
- property declaration
- property head
- property tail
- class spikingjelly.activation_based.cuda_kernel.auto_cuda.base.CKernel1D(*args, **kwargs)
Bases: CKernel
1D (element-wise) CUDA kernel wrapper that inherits from CKernel. By default it adds the numel parameter and reserves index as the name of the thread index variable.
- Parameters:
kernel_name (str) -- CUDA kernel name (forwarded to the base class via *args, **kwargs)
- property head
Return the head code of the 1D kernel, which contains the thread-index computation and the index < numel bounds guard.
- Returns:
CUDA head code snippet
- property tail
Return the tail code of the 1D kernel, which closes the blocks opened in head.
- Returns:
CUDA tail code snippet
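To make the head/tail pairing concrete, here is a sketch of what such snippets could look like when held as Python strings. The exact text of HEAD_SKETCH and TAIL_SKETCH, the example core line, and the kernel signature are assumptions for illustration; only the thread-index computation and the index < numel guard come from the description above.

```python
# Assumed head: open the kernel body, compute the global thread index, and
# guard against threads that fall beyond the last element.
HEAD_SKETCH = """
{
    const int index = blockIdx.x * blockDim.x + threadIdx.x;
    if (index < numel)
    {
"""

# Assumed tail: close the two blocks opened by the head.
TAIL_SKETCH = """
    }
}
"""


def assemble_sketch(declaration: str, core: str) -> str:
    """Concatenate declaration, head, core and tail into one source string."""
    return declaration + HEAD_SKETCH + core + TAIL_SKETCH


source = assemble_sketch(
    'extern "C" __global__ void demo(const int & numel, const float * x, float * y)',
    "        y[index] = x[index] * 2.0f;",
)
```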
- check_half2(py_dict: dict)
Check that every half/float16 tensor in py_dict has an even number of elements, as required by half2 memory access.
- Parameters:
py_dict (dict) -- Kernel argument dictionary
Note
CKernel1D.__call__() pads odd-length half tensors before the kernel launch, so manual padding is usually unnecessary.
- __call__(grid: tuple, block: tuple, py_dict: dict, *args_1, **kwargs)
Execute the 1D CUDA kernel. For half/float16 tensors with an odd number of elements, this method pads one trailing element before delegating to the base-class call, then removes the padding and restores the original shape and length.
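The pad, run, unpad sequence described above can be sketched in pure Python with lists standing in for half tensors. The helper names and the padding value 0.0 are assumptions; the text only states that one trailing element is added and later removed.

```python
def pad_to_even(values: list) -> tuple:
    """Return (padded_copy, original_length); add one element if the length is odd."""
    n = len(values)
    if n % 2 != 0:
        return values + [0.0], n  # 0.0 is an assumed padding value
    return list(values), n


def run_on_pairs(values: list, pair_fn) -> list:
    """Process elements two at a time (as half2 would), then drop the padding."""
    padded, n = pad_to_even(values)
    out = []
    for i in range(0, len(padded), 2):
        out.extend(pair_fn(padded[i], padded[i + 1]))
    return out[:n]  # restore the original length


doubled = run_on_pairs([1.0, 2.0, 3.0], lambda a, b: (a * 2, b * 2))
```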
- simple_call(**kwargs)
A convenience wrapper around CKernel1D.__call__(). It infers the device and numel from kwargs, computes the number of blocks from the configured number of threads, and launches the kernel.
- Parameters:
kwargs (dict) -- Kernel argument mapping (numel is inferred automatically and need not be provided)
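The block count for a 1D launch is usually a ceiling division of the element count by the threads per block. The sketch below shows that arithmetic; cal_blocks_sketch and the value 256 are assumptions, since the actual thread count comes from the library configuration.

```python
def cal_blocks_sketch(numel: int, threads: int) -> int:
    """Smallest number of blocks such that blocks * threads >= numel."""
    return (numel + threads - 1) // threads


threads = 256  # hypothetical threads-per-block value
blocks = cal_blocks_sketch(1000, threads)  # 4 blocks cover 1000 elements
```

The last block is only partly used in this example, which is why a 1D kernel head needs the index < numel guard.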
- class spikingjelly.activation_based.cuda_kernel.auto_cuda.base.CKernel2D(kernel_name: str, reverse: bool = False)
Bases: CKernel
2D CUDA kernel wrapper derived from CKernel. It provides two built-in parameters: numel (total number of elements) and N (number of elements per time step). Any 2D tensor is interpreted as [T, N], where T is the sequence length and N is the number of elements in one step.
reverse controls the direction of the loop over the time dimension. If True, the loop is for(int t = numel - N + index; t >= 0; t -= dt). If False, the loop is for(int t = index; t < numel; t += dt).
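The two loop forms can be traced in Python to see which flat offsets one thread visits. This sketch assumes dt equals N (one row of the [T, N] layout per step), which the description above does not state explicitly; time_indices is a hypothetical helper.

```python
def time_indices(index: int, numel: int, N: int, reverse: bool = False) -> list:
    """Flat offsets visited by the thread that handles column `index`."""
    dt = N  # assumption: stepping by one time step means stepping by N elements
    if reverse:
        # mirrors: for(int t = numel - N + index; t >= 0; t -= dt)
        return list(range(numel - N + index, -1, -dt))
    # mirrors: for(int t = index; t < numel; t += dt)
    return list(range(index, numel, dt))


T, N = 3, 4
forward = time_indices(1, T * N, N)                 # [1, 5, 9]
backward = time_indices(1, T * N, N, reverse=True)  # [9, 5, 1]
```

Each thread therefore walks one column of the [T, N] tensor, front to back or back to front.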
- property pre_core
- property post_core
- check_shape(py_dict: dict)
Check the dimensionality of the tensors in py_dict. Every torch.Tensor and cupy.ndarray value must have ndim <= 2.
- Parameters:
py_dict (dict) -- Kernel argument dictionary
- check_half2(py_dict: dict)
Check whether the half-precision tensors in py_dict satisfy the half2 alignment requirement (even length). For torch.half and np.float16 values, a 1D tensor must have an even numel and a 2D tensor must have an even shape[1].
- Parameters:
py_dict (dict) -- Kernel argument dictionary
Note
CKernel2D.__call__() automatically pads odd-sized half-precision tensors before the launch. This method only expresses and validates the constraint.
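The evenness rule above depends only on the tensor shape, so it can be sketched on plain shape tuples. half2_shape_ok is a hypothetical helper, and raising ValueError for other ranks is an assumption.

```python
def half2_shape_ok(shape: tuple) -> bool:
    """True if a half-precision tensor of this shape can be read as half2 pairs."""
    if len(shape) == 1:
        return shape[0] % 2 == 0  # 1D: numel must be even
    if len(shape) == 2:
        return shape[1] % 2 == 0  # 2D [T, N]: N must be even
    raise ValueError(f"expected a 1D or 2D shape, got {shape}")


checks = [half2_shape_ok(s) for s in [(8,), (7,), (5, 6), (4, 3)]]
```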
- __call__(grid: tuple, block: tuple, py_dict: dict, *args_1, **kwargs)
Execute the 2D CUDA kernel. *args_1 and **kwargs are forwarded directly to the cupy.RawKernel call.
Keys in py_dict must match the parameter names in self.cparams one-to-one, and values must be torch.Tensor or cupy.ndarray objects with matching data types. Key order is arbitrary because the arguments are reordered internally to follow the formal parameter order.
Note
Tensors in py_dict must be 1D or 2D. Odd-sized half-precision tensors are padded before the launch; afterwards the padding is removed and the original shape is restored.
- property head
- property tail
- simple_call(**kwargs)
A convenience wrapper around CKernel2D.__call__(). It infers the device, numel, and N from kwargs, computes the CUDA threads and blocks from the configuration, and launches the kernel.
- Parameters:
kwargs (dict) -- Kernel argument mapping (numel and N are inferred and need not be provided)
- class spikingjelly.activation_based.cuda_kernel.auto_cuda.base.CodeTyper(indent_num: int)
Bases: object
A helper for indenting and assembling CUDA code. The accumulated code text is stored in self.codes; snippets can be appended piece by piece and are formatted with the current indentation.
- Parameters:
indent_num (int) -- Number of spaces used for the initial indentation
- append(codes: str)
Append a CUDA code snippet to self.codes. The method splits the input on ; and writes each statement with the current indentation, while handling the block boundary tokens { and }.
- Parameters:
codes (str) -- CUDA code snippet to append
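The split-and-indent behaviour described above can be approximated by the simplified stand-in below. TyperSketch is hypothetical and is not the library class; in particular it assumes that { and } arrive as separate statements and makes no attempt to handle control-flow headers such as if (...).

```python
class TyperSketch:
    """Simplified stand-in for a code typer that indents CUDA statements."""

    def __init__(self, indent_num: int):
        self.indent = " " * indent_num
        self.codes = "\n"

    def append(self, codes: str) -> None:
        # Split on ';' and write each non-empty statement on its own line.
        for stmt in codes.replace("\n", "").split(";"):
            stmt = stmt.strip()
            if not stmt:
                continue
            if stmt in ("{", "}"):
                self.codes += f"{self.indent}{stmt}\n"   # braces get no ';'
            else:
                self.codes += f"{self.indent}{stmt};\n"  # restore the ';'


typer = TyperSketch(indent_num=4)
typer.append("{")
typer.append("float a = x; y = a * a;")
typer.append("}")
```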
- class spikingjelly.activation_based.cuda_kernel.auto_cuda.base.CodeBlock(env: CodeTyper)
Bases: object
A context-manager utility for CodeTyper that inserts {...} blocks and maintains the indentation automatically. It is useful for composing multi-line CUDA logic that uses intermediate variables.
- Parameters:
env (CodeTyper) -- Target code-typing environment
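A context manager of this kind can be sketched in a few lines. EnvSketch and BlockSketch are hypothetical stand-ins, and the four-space indentation step is an assumption; the description above only says that braces are inserted and indentation is maintained.

```python
class EnvSketch:
    """Minimal code environment: a list of lines plus the current indentation."""

    def __init__(self, indent_num: int = 0):
        self.indent = " " * indent_num
        self.lines = []

    def append(self, line: str) -> None:
        self.lines.append(self.indent + line)


class BlockSketch:
    """Emit '{' on entry and '}' on exit, indenting everything in between."""

    STEP = "    "  # assumed indentation step

    def __init__(self, env: EnvSketch):
        self.env = env

    def __enter__(self):
        self.env.append("{")
        self.env.indent += self.STEP
        return self.env

    def __exit__(self, exc_type, exc, tb):
        self.env.indent = self.env.indent[: -len(self.STEP)]
        self.env.append("}")


env = EnvSketch()
with BlockSketch(env):
    env.append("const float tmp = x * x;")  # tmp is scoped to this block
    env.append("y = tmp + 1.0f;")
```

Because the braces open a new C scope, intermediate names such as tmp cannot collide with variables declared elsewhere in the generated kernel.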
- spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.wrap_return_codes(y: str | None, codes: str)
- spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.float2half2(y: str | None, x: str)
- spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.constant(y: str | None, x: float, dtype: str)
- spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.abs(y: str | None, x: str, dtype: str)
- spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.power(z: str | None, x: str, y: str, dtype: str)
- spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.if_else(z: str | None, x: str, y: str, mask: str, dtype: str)
- spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.if_else_else(w: str | None, x: str, y: str, z: str, mask_x: str, mask_y: str, dtype: str)
- spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.greater_equal(z: str | None, x: str, y: str, dtype: str)
- spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.greater_than(z: str | None, x: str, y: str, dtype: str)
- spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.minimal(z: str | None, x: str, y: str, dtype: str)
- spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.maximum(z: str | None, x: str, y: str, dtype: str)
- spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.add(z: str | None, x: str, y: str, dtype: str)
- spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.sub(z: str | None, x: str, y: str, dtype: str)
- spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.mul(z: str | None, x: str, y: str, dtype: str)
- spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.div(z: str | None, x: str, y: str, dtype: str)
- spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.neg(y: str | None, x: str, dtype: str)
- spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.heaviside(y: str | None, x: str, dtype: str)
- spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.exp(y: str | None, x: str, dtype: str)
- spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.sigmoid(y: str | None, x: str, alpha: float, dtype: str)
- spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.sigmoid_backward(y: str, x: str, alpha: float, dtype: str)
- spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.atan_backward(y: str, x: str, alpha: float, dtype: str)
- spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.piecewise_leaky_relu_backward(y: str, x: str, w: float, c: float, dtype: str)
- spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.s2nn_backward(y: str, x: str, alpha: float, beta: float, dtype: str)
- spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.q_pseudo_spike_backward(y: str, x: str, alpha: float, dtype: str)
- spikingjelly.activation_based.cuda_kernel.auto_cuda.cfunction.leaky_k_relu_backward(y: str, x: str, leak: float, k: float, dtype: str)
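The listing above gives only signatures. One plausible reading, offered here as an assumption rather than documented behaviour, is that each function returns a CUDA code string: the str arguments are C variable names or expressions, dtype selects the float or half2 form, and a None output name yields a bare expression instead of an assignment. The sketch below illustrates that reading with hypothetical stand-ins; the only real CUDA name used is the half2 addition intrinsic __hadd2.

```python
def wrap_return_sketch(y, expr: str) -> str:
    """Return an assignment statement, or a parenthesised expression if y is None."""
    return f"({expr})" if y is None else f"{y} = {expr};"


def add_sketch(z, x: str, y: str, dtype: str) -> str:
    """Generate CUDA code for x + y in the requested data type."""
    if dtype == "float":
        expr = f"{x} + {y}"
    elif dtype == "half2":
        expr = f"__hadd2({x}, {y})"  # element-wise add of two packed halves
    else:
        raise NotImplementedError(dtype)
    return wrap_return_sketch(z, expr)


stmt = add_sketch("z", "x", "y", "float")    # 'z = x + y;'
nested = add_sketch(None, "a", "b", "half2")  # '(__hadd2(a, b))'
```

Under this reading, the expression form lets one call be nested as an argument of another, while the statement form can be appended directly to a kernel body.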