spikingjelly.activation_based.auto_cuda package
Module contents
- class spikingjelly.activation_based.auto_cuda.base.CKernel(kernel_name: str)[source]
Bases: object
- Parameters:
kernel_name (str) – the name of the kernel
The base Python class for simplifying the use of custom CUDA kernels.
Some critical attributes:
- cparams:
a dict that stores parameter names and types.
- reserved_cnames:
a list of reserved variable names, which cannot be reused to name other variables.
Here is an example:
from spikingjelly.activation_based.auto_cuda import base

example_ck = base.CKernel(kernel_name='example_ck')
print(example_ck.full_codes)
The outputs are:
#include <cuda_fp16.h>
extern "C" __global__
void example_ck(
)
{}
A CKernel is composed of four parts: declaration, head, core, and tail. When setting logging level <= DEBUG, some debug information will be added to the CUDA code or printed, and we can check where each part is. Here is an example:

import logging
logging.basicConfig(level=logging.DEBUG)
from spikingjelly.activation_based.auto_cuda import base

example_ck = base.CKernel(kernel_name='example_ck')
print(example_ck.full_codes)
The outputs are:
//------declaration start------

#include <cuda_fp16.h>
extern "C" __global__
void example_ck(
)

//------declaration end--------

//------head start------

{

//------head end--------

//------core start------

//------core end--------

//------tail start------

}

//------tail end--------
In most cases, CKernel is used as a base class. Refer to CKernel1D and CKernel2D for more details.
- check_attributes(**kwargs)[source]
- Parameters:
kwargs (dict) – a dict of attributes
- Returns:
whether every value kwargs[key] is identical to self.__getattribute__(key)
- Return type:
bool
This function can be used to check whether a CKernel has been changed, by checking whether any of its attributes has changed.
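For instance, a cached kernel can be reused only when its configuration has not changed. A minimal sketch, assuming the constructor argument is stored as the attribute kernel_name:

from spikingjelly.activation_based.auto_cuda import base

example_ck = base.CKernel(kernel_name='example_ck')
# compare the given attributes with the kernel's current attributes
print(example_ck.check_attributes(kernel_name='example_ck'))  # expected: True
print(example_ck.check_attributes(kernel_name='other_name'))  # expected: False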
- property core
- set_contiguous(py_dict: dict)[source]
- Parameters:
py_dict (dict) – a dict whose values are torch.Tensor or cupy.ndarray
Check that all values in py_dict are torch.Tensor or cupy.ndarray and are contiguous. If not, this function will raise an error.
- get_device(py_dict: dict) -> int [source]
- Parameters:
py_dict (dict) – a dict
Traverse the dict and return the device id of the first torch.Tensor encountered. If there is no torch.Tensor in py_dict, this function will raise an error.
- check_device(device: int, py_dict: dict)[source]
Check that the device id of each torch.Tensor or cupy.ndarray in py_dict is identical to device. If not, this function will raise an error.
- check_keys(py_dict: dict)[source]
- Parameters:
py_dict (dict) – a dict
Check that the keys of py_dict are identical to the keys of self.cparams. If not, this function will raise an error.
- check_ctypes(py_dict: dict)[source]
- Parameters:
py_dict (dict) – a dict
Check that each value in py_dict has the corresponding ctype in self.cparams, which includes:
torch.float or np.float32 —— 'const float' or 'float'
torch.half or np.float16 —— 'const half2' or 'half2'
np.int_ —— 'const int' or 'int'
If not, this function will raise an error.
- get_ptrs(py_dict: dict)[source]
Get the address of the first element of each torch.Tensor or cupy.ndarray in py_dict.
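Taken together, these helpers are the checks that __call__ performs before launching the kernel. A minimal sketch of calling them directly, using the default cparams of CKernel1D (the variable names are illustrative):

import cupy
import torch
from spikingjelly.activation_based import cuda_utils
from spikingjelly.activation_based.auto_cuda import base

kernel = base.CKernel1D(kernel_name='example_ck')
kernel.add_param(ctype='const float *', cname='x')   # cparams is now {'numel': ..., 'x': ...}

x = torch.rand([8], device='cuda:0')
with cuda_utils.DeviceEnvironment(device=x.get_device()):
    numel = cupy.asarray(x.numel())
py_dict = {'numel': numel, 'x': x}

kernel.check_keys(py_dict)              # keys must match self.cparams
kernel.set_contiguous(py_dict)          # raises if any tensor/array is not contiguous
device = kernel.get_device(py_dict)     # device id of the first torch.Tensor found
kernel.check_device(device, py_dict)    # every tensor/array must be on this device
print(kernel.get_ptrs(py_dict))         # raw pointers that will be passed to cupy.RawKernel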
- __call__(grid: tuple, block: tuple, py_dict: dict, *args_1, **kwargs)[source]
- Parameters:
grid (tuple) – the grid number of the CUDA kernel
block (tuple) – the block number of the CUDA kernel
py_dict (dict) – the dict that contains parameters for the CUDA kernel
Execute the CUDA kernel. *args_1, **kwargs are used as *args_1, **kwargs in cupy.RawKernel.
py_dict should contain key: value where key is the CUDA kernel function param name and value is the variable. This dict should be in one-to-one correspondence with self.cparams.
For example, if self.cparams is

{
    'numel': 'const int &',
    'x': 'const float *',
    'y': 'const float *'
}

then py_dict should be

{
    'numel': numel,
    'x': x,
    'y': y
}

where numel, x, y should be torch.Tensor or cupy.ndarray with the corresponding data type, e.g., x in py_dict should have data type torch.float because x in self.cparams has the value 'const float *'.
The order of the keys is arbitrary because this function will sort keys to align formal and actual parameters.
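As a minimal sketch of such a call (assuming kernel is a CKernel subclass whose cparams match the example above; the sizes are illustrative):

import cupy
import torch
from spikingjelly.activation_based import cuda_utils

device = 'cuda:0'
x = torch.rand([16], device=device)
y = torch.zeros_like(x)

threads = 1024
blocks = cuda_utils.cal_blocks(x.numel(), threads)
with cuda_utils.DeviceEnvironment(device=x.get_device()):
    numel = cupy.asarray(x.numel())

# the keys may be given in any order; __call__ sorts them to match the kernel signature
kernel((blocks, ), (threads, ), {'y': y, 'x': x, 'numel': numel})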
- add_param(ctype: str, cname: str)[source]
- Parameters:
ctype (str) – the C type of the param, e.g., 'const float *'
cname (str) – the name of the param
Add a param to self.cparams.
Note
When calling self.__call__, the params in the CUDA kernel are sorted by dictionary order. Thus, the user does not need to call add_param in any specific order.
Here is an example:
from spikingjelly.activation_based.auto_cuda import base

example_ck = base.CKernel(kernel_name='example_ck')
print('origin:')
print(example_ck.full_codes)
example_ck.add_param(ctype='const float*', cname='x')
example_ck.add_param(ctype='const float*', cname='y')
example_ck.add_param(ctype='float', cname='z')
print('after:')
print(example_ck.full_codes)
origin:

#include <cuda_fp16.h>
extern "C" __global__
void example_ck(
const int & numel
)

after:

#include <cuda_fp16.h>
extern "C" __global__
void example_ck(
const int & numel, const float* x, const float* y, float z
)
- property declaration
- property head
- property tail
- property full_codes
- Returns:
the full CUDA codes
- Return type:
str
- class spikingjelly.activation_based.auto_cuda.base.CKernel1D(*args, **kwargs)[source]
Bases: CKernel
- Parameters:
kernel_name (str) – the name of the kernel
The 1D (element-wise) CUDA kernel, which is extended from CKernel. All input/output tensors will be regarded as 1D tensors.
Some critical attributes:
- cparams:
A dict that stores parameter names and types. The default value is {'numel': 'const int &'}. numel represents the number of elements for the element-wise operation, which is also the number of CUDA threads.
- reserved_cnames:
A list of reserved variable names, which cannot be reused to name other variables. The default value is ['index']. index represents the index of the element, which is also the CUDA thread index.
Now let us check what the empty 1d kernel looks like:
from spikingjelly.activation_based.auto_cuda import base

temp_kernel = base.CKernel1D(kernel_name='temp_kernel')
print(temp_kernel.full_codes)
The outputs are:
#include <cuda_fp16.h>
extern "C" __global__
void temp_kernel(
const int & numel
)
{
    const int index = blockIdx.x * blockDim.x + threadIdx.x;
    if (index < numel)
    {

    }
}
By setting the logging level, we can check each part of the kernel:
import logging
logging.basicConfig(level=logging.DEBUG)
from spikingjelly.activation_based.auto_cuda import base

temp_kernel = base.CKernel1D(kernel_name='temp_kernel')
print(temp_kernel.full_codes)
The outputs are:
//------declaration start------

#include <cuda_fp16.h>
extern "C" __global__
void temp_kernel(
const int & numel
)

//------declaration end--------

//------head start------

{
    const int index = blockIdx.x * blockDim.x + threadIdx.x;
    if (index < numel)
    {

//------head end--------

//------core start------

//------core end--------

//------tail start------

    }
}

//------tail end--------
self.core can be specified by the user. For example, if we want to write a heaviside kernel, we can implement it easily with the CUDA code y[index] = x[index] >= 0.0f ? 1.0f: 0.0f; and add two params x, y, which are the input and the output.
Here is the example:
from spikingjelly.activation_based.auto_cuda import base

c_heaviside = base.CKernel1D(kernel_name='heaviside')
c_heaviside.add_param(ctype='const float *', cname='x')
c_heaviside.add_param(ctype='float *', cname='y')
c_heaviside.core = '''
    y[index] = x[index] >= 0.0f ? 1.0f: 0.0f;
'''
print(c_heaviside.full_codes)
The outputs are:
#include <cuda_fp16.h>
extern "C" __global__
void heaviside(
const int & numel, const float * x, float * y
)
{
    const int index = blockIdx.x * blockDim.x + threadIdx.x;
    if (index < numel)
    {
        y[index] = x[index] >= 0.0f ? 1.0f: 0.0f;
    }
}
Here is an example of how to execute the kernel:
import cupy
import torch
from spikingjelly.activation_based import cuda_utils

# c_heaviside is the CKernel1D defined in the previous example
device = 'cuda:0'

x = torch.rand([4, 4], device=device) - 0.5
y = torch.zeros_like(x)

numel = x.numel()
threads = 1024
blocks = cuda_utils.cal_blocks(numel, threads)
print('x=')
print(x)

with cuda_utils.DeviceEnvironment(device=x.get_device()):
    numel = cupy.asarray(numel)
    py_dict = {
        'numel': numel,
        'x': x,
        'y': y
    }
    c_heaviside((blocks, ), (threads, ), py_dict)

print('y=')
print(y)
The outputs are:
x=
tensor([[-0.0423, -0.1383, -0.0238,  0.1018],
        [ 0.3422,  0.1449, -0.2938, -0.1858],
        [-0.3503,  0.0004, -0.4274, -0.2012],
        [-0.0227,  0.2229, -0.0776,  0.2687]], device='cuda:0')
y=
tensor([[0., 0., 0., 1.],
        [1., 1., 0., 0.],
        [0., 1., 0., 0.],
        [0., 1., 0., 1.]], device='cuda:0')
- property head
- property tail
- check_half2(py_dict: dict)[source]
- Parameters:
py_dict (dict) – a dict
Check the values in py_dict. If a value is a torch.Tensor with value.dtype == torch.half or a cupy.ndarray with value.dtype == np.float16, this function will check whether its number of elements is even.
We require the numel to be even when using the half dtype because half2 is used in the CUDA kernel.
Note
CKernel1D.__call__ will pad half tensors to an even numel before executing the kernel. Thus, the user does not need to worry about padding.
- __call__(grid: tuple, block: tuple, py_dict: dict, *args_1, **kwargs)[source]
- Parameters:
grid (tuple) – the grid number of the CUDA kernel
block (tuple) – the block number of the CUDA kernel
py_dict (dict) – the dict that contains parameters for the CUDA kernel
Execute the CUDA kernel. *args_1, **kwargs are used as *args_1, **kwargs in cupy.RawKernel.
py_dict should contain key: value where key is the CUDA kernel function param name and value is the variable. This dict should be in one-to-one correspondence with self.cparams.
For example, if self.cparams is

{
    'numel': 'const int &',
    'x': 'const float *',
    'y': 'const float *'
}

then py_dict should be

{
    'numel': numel,
    'x': x,
    'y': y
}

where numel, x, y should be torch.Tensor or cupy.ndarray with the corresponding data type, e.g., x in py_dict should have data type torch.float because x in self.cparams has the value 'const float *'.
The order of the keys is arbitrary because this function will sort keys to align formal and actual parameters.
Note
All tensors in py_dict will be regarded as 1D.
Note
Any tensor x in py_dict with data type torch.half or np.float16 but an odd numel will be flattened and padded as x = [x, x[-1]] before executing the CUDA kernel. After execution, the padded values in x will be removed, and x will be reshaped to its original shape.
- simple_call(**kwargs)[source]
- Parameters:
kwargs (dict) – the dict that contains parameters for the CUDA kernel
The simplified calling function, which is simplified from the standard calling function CKernel1D.__call__.
Compared with CKernel1D.__call__, the device, numel, and the numbers of CUDA threads and blocks are calculated automatically from the tensors in kwargs.
Here is the example:
import torch
from spikingjelly.activation_based import cuda_utils
from spikingjelly.activation_based.auto_cuda import base

c_heaviside = base.CKernel1D(kernel_name='heaviside')
c_heaviside.add_param(ctype='const float *', cname='x')
c_heaviside.add_param(ctype='float *', cname='y')
c_heaviside.core = '''
    y[index] = x[index] >= 0.0f ? 1.0f: 0.0f;
'''

device = 'cuda:0'
x = torch.rand([4, 4], device=device) - 0.5
y = torch.zeros_like(x)

print('x=')
print(x)
c_heaviside.simple_call(x=x, y=y)
print('y=')
print(y)
The outputs are:
x=
tensor([[-0.1706,  0.2063, -0.2077,  0.3335],
        [-0.0180, -0.2429,  0.3488,  0.1146],
        [ 0.0362,  0.1584,  0.4828, -0.1389],
        [-0.2684,  0.1898,  0.0560,  0.2058]], device='cuda:0')
y=
tensor([[0., 1., 0., 1.],
        [0., 0., 1., 1.],
        [1., 1., 1., 0.],
        [0., 1., 1., 1.]], device='cuda:0')
- class spikingjelly.activation_based.auto_cuda.base.CKernel2D(kernel_name: str, reverse: bool = False)[source]
Bases: CKernel
- Parameters:
kernel_name (str) – the name of the kernel
reverse (bool) – if True, the kernel iterates over time-steps in reverse order
The 2D CUDA kernel, which is extended from CKernel. All input/output tensors should have no more than 2 dimensions. All 2D tensors will be regarded as shape = [T, N], where T is the sequence length and N is the number of elements of the data at one time-step.
Some critical attributes:
- cparams:
A dict that stores parameter names and types. The default value is {'numel': 'const int &', 'N': 'const int &'}.
N: the number of elements of the sequence data at one time-step (the numel of the 1-th dimension)
numel: the numel of elements in the input/output tensors, which is T * N
- reserved_cnames:
A list of reserved variable names, which cannot be reused to name other variables. The default value is ['index', 'dt', 't'].
index: the index in the 1-th dimension, which is also the CUDA thread index
t: the index in the 0-th dimension
dt: used in the CUDA kernel as the time-step stride. When x[t_py][j] in Python code is identical to x[t] in CUDA code, then x[t_py + 1][j] in Python code is identical to x[t + dt] in CUDA code.
Now let us check what the empty 2d kernel looks like:
from spikingjelly.activation_based.auto_cuda import base

temp_kernel = base.CKernel2D(kernel_name='temp_kernel')
print(temp_kernel.full_codes)
The outputs are:
#include <cuda_fp16.h>
extern "C" __global__
void temp_kernel(
const int & numel, const int & N
)
{
    const int index = blockIdx.x * blockDim.x + threadIdx.x;
    if (index < N)
    {
        const int dt = N;
        for(int t = index; t < numel; t += dt)
        {

        }
    }
}
By setting the logging level, we can check each part of the kernel:
import logging
logging.basicConfig(level=logging.DEBUG)
from spikingjelly.activation_based.auto_cuda import base

temp_kernel = base.CKernel2D(kernel_name='temp_kernel')
print(temp_kernel.full_codes)
The outputs are:
//------declaration start------

#include <cuda_fp16.h>
extern "C" __global__
void temp_kernel(
const int & numel, const int & N
)

//------declaration end--------

//------head start------

{
    const int index = blockIdx.x * blockDim.x + threadIdx.x;
    if (index < N)
    {
        const int dt = N;

        //------pre_core start------
        //------pre_core end--------

        for(int t = index; t < numel; t += dt)
        {

//------head end--------

//------core start------

//------core end--------

//------tail start------

        }

        //------post_core start------
        //------post_core end--------

    }
}

//------tail end--------
self.pre_core, self.post_core, self.core can be specified by the user.
Here is an example of how to implement the cumsum operation:

import torch
import cupy
from spikingjelly.activation_based.auto_cuda import base
from spikingjelly.activation_based import cuda_utils

cumsum = base.CKernel2D(kernel_name='cumsum')
cumsum.add_param(ctype='const float *', cname='x')
cumsum.add_param(ctype='float *', cname='y')

cumsum.core = '''
    if (t - dt < 0)
    {
        y[t] = x[t];
    }
    else
    {
        y[t] = x[t] + y[t - dt];
    }
'''
print(cumsum.full_codes)

T = 4
N = 3
device = 'cuda:0'

x = torch.randint(low=0, high=4, size=[T, N], device=device).float()
y = torch.zeros_like(x)

threads = 1024
blocks = cuda_utils.cal_blocks(N, threads)

with cuda_utils.DeviceEnvironment(device=x.get_device()):
    numel = cupy.asarray(T * N)
    N = cupy.asarray(N)
    py_dict = {
        'N': N,
        'numel': numel,
        'x': x,
        'y': y
    }
    cumsum((blocks, ), (threads, ), py_dict)

print('x=')
print(x)
print('y=')
print(y)
The outputs are:
#include <cuda_fp16.h>
extern "C" __global__
void cumsum(
const int & numel, const int & N, const float * x, float * y
)
{
    const int index = blockIdx.x * blockDim.x + threadIdx.x;
    if (index < N)
    {
        const int dt = N;
        for(int t = index; t < numel; t += dt)
        {
            if (t - dt < 0)
            {
                y[t] = x[t];
            }
            else
            {
                y[t] = x[t] + y[t - dt];
            }
        }
    }
}
x=
tensor([[3., 0., 2.],
        [2., 0., 0.],
        [2., 3., 2.],
        [2., 1., 0.]], device='cuda:0')
y=
tensor([[3., 0., 2.],
        [5., 0., 2.],
        [7., 3., 4.],
        [9., 4., 4.]], device='cuda:0')
- property pre_core
- property post_core
- check_half2(py_dict: dict)[source]
- Parameters:
py_dict (dict) – a dict
Check the values in py_dict. If a value is a torch.Tensor with value.dtype == torch.half or a cupy.ndarray with value.dtype == np.float16, this function will check whether its number of elements is even.
If a tensor x is 1D, it will be padded when x.numel() % 2 != 0. If a tensor x is 2D, it will be padded when x.shape[1] % 2 != 0.
We require the numel to be even when using the half dtype because half2 is used in the CUDA kernel.
Note
CKernel2D.__call__ will pad half tensors to an even numel before executing the kernel. Thus, the user does not need to worry about padding.
- __call__(grid: tuple, block: tuple, py_dict: dict, *args_1, **kwargs)[source]
- Parameters:
grid (tuple) – the grid number of the CUDA kernel
block (tuple) – the block number of the CUDA kernel
py_dict (dict) – the dict that contains parameters for the CUDA kernel
Execute the CUDA kernel. *args_1, **kwargs are used as *args_1, **kwargs in cupy.RawKernel.
py_dict should contain key: value where key is the CUDA kernel function param name and value is the variable. This dict should be in one-to-one correspondence with self.cparams.
For example, if self.cparams is

{
    'numel': 'const int &',
    'x': 'const float *',
    'y': 'const float *'
}

then py_dict should be

{
    'numel': numel,
    'x': x,
    'y': y
}

where numel, x, y should be torch.Tensor or cupy.ndarray with the corresponding data type, e.g., x in py_dict should have data type torch.float because x in self.cparams has the value 'const float *'.
The order of the keys is arbitrary because this function will sort keys to align formal and actual parameters.
Note
All tensors in py_dict should be 1D or 2D.
Note
Any 1D tensor x in py_dict with data type torch.half or np.float16 but an odd numel will be flattened and padded as x = [x, x[-1]] before executing the CUDA kernel.
Any 2D tensor x with shape [T, N] in py_dict with data type torch.half or np.float16 but an odd N will be padded as x = [x, x[:, -1]], whose shape is [T, N + 1].
After execution, the padded values in x will be removed, and x will be reshaped to its original shape.
- property head
- property tail
- simple_call(**kwargs)[source]
- Parameters:
kwargs (dict) – the dict that contains parameters for the CUDA kernel
The simplified calling function, which is simplified from the standard calling function CKernel2D.__call__.
Compared with CKernel2D.__call__, the device, N, numel, and the numbers of CUDA threads and blocks are calculated automatically from the tensors in kwargs.
Here is the example:
import torch
import cupy
from spikingjelly.activation_based.auto_cuda import base
from spikingjelly.activation_based import cuda_utils

cumsum = base.CKernel2D(kernel_name='cumsum')
cumsum.add_param(ctype='const float *', cname='x')
cumsum.add_param(ctype='float *', cname='y')

cumsum.core = '''
    if (t - dt < 0)
    {
        y[t] = x[t];
    }
    else
    {
        y[t] = x[t] + y[t - dt];
    }
'''

T = 4
N = 3
device = 'cuda:0'

x = torch.randint(low=0, high=4, size=[T, N], device=device).float()
y = torch.zeros_like(x)

cumsum.simple_call(x=x, y=y)
print('x=')
print(x)
print('y=')
print(y)
The outputs are:
x=
tensor([[0., 2., 1.],
        [1., 3., 1.],
        [2., 2., 0.],
        [2., 0., 1.]], device='cuda:0')
y=
tensor([[0., 2., 1.],
        [1., 5., 2.],
        [3., 7., 2.],
        [5., 7., 3.]], device='cuda:0')
- class spikingjelly.activation_based.auto_cuda.base.CodeTyper(indent_num: int)[source]
Bases: object
- Parameters:
indent_num (int) – the number of indents
A CUDA code formatter that adds indents. The full code can be accessed via self.codes.
Here is an example:
from spikingjelly.activation_based.auto_cuda import base, cfunction

code0 = cfunction.if_else(z='z', x='x', y='y', mask='mask', dtype='float')
code1 = cfunction.sigmoid_backward(y='y', x='x', alpha=2., dtype='float')

codes = ''
codes += code0
codes += code1

print('// Without CodeTyper:')
print('// ------------------')
print(codes)
print('// ------------------')

ctyper = base.CodeTyper(4)
ctyper.append(code0)
ctyper.append(code1)
print('// With CodeTyper:')
print('// ------------------')
print(ctyper.codes)
print('// ------------------')
// Without CodeTyper:
// ------------------
z = x * mask + y * (1.0f - mask);const float sigmoid_backward__sigmoid_ax = 1.0f / (1.0f + expf(- (2.0f) * x));
y = (1.0f - sigmoid_backward__sigmoid_ax) * sigmoid_backward__sigmoid_ax * (2.0f);
// ------------------
// With CodeTyper:
// ------------------

    z = x * mask + y * (1.0f - mask);
    const float sigmoid_backward__sigmoid_ax = 1.0f / (1.0f + expf(- (2.0f) * x));
    y = (1.0f - sigmoid_backward__sigmoid_ax) * sigmoid_backward__sigmoid_ax * (2.0f);

// ------------------
- class spikingjelly.activation_based.auto_cuda.base.CodeBlock(env: CodeTyper)[source]
Bases: object
- Parameters:
env (CodeTyper) – a CodeTyper
A tool for adding a CUDA code block in CodeTyper.codes. It is helpful when we want to compute with intermediate variables.
Here is an example:
from spikingjelly.activation_based.auto_cuda import base

ctyper = base.CodeTyper(4)
with base.CodeBlock(ctyper):
    ctyper.append('// swap x and y')
    ctyper.append('float temp_var = x;')
    ctyper.append('x = y;')
    ctyper.append('y = temp_var;')
print(ctyper.codes)
The outputs are:
    {
        // swap x and y;
        float temp_var = x;
        x = y;
        y = temp_var;
    }
- spikingjelly.activation_based.auto_cuda.cfunction.wrap_return_codes(y: Optional[str], codes: str)[source]
- spikingjelly.activation_based.auto_cuda.cfunction.constant(y: Optional[str], x: float, dtype: str)[source]
- spikingjelly.activation_based.auto_cuda.cfunction.power(z: Optional[str], x: str, y: str, dtype: str)[source]
- spikingjelly.activation_based.auto_cuda.cfunction.if_else(z: Optional[str], x: str, y: str, mask: str, dtype: str)[source]
- spikingjelly.activation_based.auto_cuda.cfunction.if_else_else(w: Optional[str], x: str, y: str, z: str, mask_x: str, mask_y: str, dtype: str)[source]
- spikingjelly.activation_based.auto_cuda.cfunction.greater_equal(z: Optional[str], x: str, y: str, dtype: str)[source]
- spikingjelly.activation_based.auto_cuda.cfunction.greater_than(z: Optional[str], x: str, y: str, dtype: str)[source]
- spikingjelly.activation_based.auto_cuda.cfunction.minimal(z: Optional[str], x: str, y: str, dtype: str)[source]
- spikingjelly.activation_based.auto_cuda.cfunction.maximum(z: Optional[str], x: str, y: str, dtype: str)[source]
- spikingjelly.activation_based.auto_cuda.cfunction.add(z: Optional[str], x: str, y: str, dtype: str)[source]
- spikingjelly.activation_based.auto_cuda.cfunction.sub(z: Optional[str], x: str, y: str, dtype: str)[source]
- spikingjelly.activation_based.auto_cuda.cfunction.mul(z: Optional[str], x: str, y: str, dtype: str)[source]
- spikingjelly.activation_based.auto_cuda.cfunction.div(z: Optional[str], x: str, y: str, dtype: str)[source]
- spikingjelly.activation_based.auto_cuda.cfunction.heaviside(y: Optional[str], x: str, dtype: str)[source]
- spikingjelly.activation_based.auto_cuda.cfunction.sigmoid(y: Optional[str], x: str, alpha: float, dtype: str)[source]
- spikingjelly.activation_based.auto_cuda.cfunction.sigmoid_backward(y: str, x: str, alpha: float, dtype: str)[source]
- spikingjelly.activation_based.auto_cuda.cfunction.atan_backward(y: str, x: str, alpha: float, dtype: str)[source]
- spikingjelly.activation_based.auto_cuda.cfunction.piecewise_leaky_relu_backward(y: str, x: str, w: float, c: float, dtype: str)[source]
- spikingjelly.activation_based.auto_cuda.cfunction.s2nn_backward(y: str, x: str, alpha: float, beta: float, dtype: str)[source]
- spikingjelly.activation_based.auto_cuda.cfunction.q_pseudo_spike_backward(y: str, x: str, alpha: float, dtype: str)[source]
- spikingjelly.activation_based.auto_cuda.cfunction.leaky_k_relu_backward(y: str, x: str, leak: float, k: float, dtype: str)[source]
- spikingjelly.activation_based.auto_cuda.cfunction.fake_numerical_gradient_backward(y: str, x: str, alpha: float, dtype: str)[source]
- spikingjelly.activation_based.auto_cuda.cfunction.log_tailed_relu_backward(y: str, x: str, alpha: float, dtype: str)[source]
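These functions generate CUDA code fragments as Python strings, which can be assembled into a kernel's core, e.g., via a CodeTyper. A small illustrative sketch (the generated strings follow the pattern shown in the CodeTyper example above):

from spikingjelly.activation_based.auto_cuda import base, cfunction

# element-wise CUDA statements, returned as strings
print(cfunction.heaviside(y='spike[index]', x='v[index]', dtype='float'))
print(cfunction.add(z='h[index]', x='x[index]', y='v[index]', dtype='float'))

# assemble generated statements into an indented code block
ctyper = base.CodeTyper(4)
ctyper.append(cfunction.sub(z='v[index]', x='h[index]', y='spike[index]', dtype='float'))
print(ctyper.codes)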
- spikingjelly.activation_based.auto_cuda.neuron_kernel.neuronal_hard_reset(v_next: str, h: str, spike: str, v_reset: str, dtype: str = 'float')[source]
- spikingjelly.activation_based.auto_cuda.neuron_kernel.neuronal_soft_reset(v_next: str, h: str, spike: str, v_th: str, dtype: str = 'float')[source]
- spikingjelly.activation_based.auto_cuda.neuron_kernel.neuronal_fire(spike: str, v: str, v_th: str, dtype: str = 'float')[source]
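These three functions return the CUDA code strings for one time-step of the neuronal fire and reset equations. A brief illustrative sketch (the argument strings are placeholders for kernel variable names):

from spikingjelly.activation_based.auto_cuda import neuron_kernel

# S[t] = Heaviside(H[t] - V_th)
print(neuron_kernel.neuronal_fire(spike='spike_seq[t]', v='h_seq[t]', v_th='v_th', dtype='float'))

# hard reset: V[t] = H[t] * (1 - S[t]) + V_reset * S[t]
print(neuron_kernel.neuronal_hard_reset(v_next='v_v_seq[t + dt]', h='h_seq[t]',
                                        spike='spike_seq[t]', v_reset='v_reset', dtype='float'))

# soft reset: V[t] = H[t] - V_th * S[t]
print(neuron_kernel.neuronal_soft_reset(v_next='v_v_seq[t + dt]', h='h_seq[t]',
                                        spike='spike_seq[t]', v_th='v_th', dtype='float'))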
- class spikingjelly.activation_based.auto_cuda.neuron_kernel.NeuronFPTTKernel(hard_reset: bool, dtype: str)[source]
Bases: CKernel2D
- neuronal_charge() -> str [source]
- Returns:
CUDA code
- Return type:
str
Returns the CUDA code for calculating \(H[t] = f(X[t], V[t-1], ...)\).
This function should define how h_seq[t] is calculated from x_seq[t], v_v_seq[t], and other params if the neuron needs them.
For example, the IF neuron defines this function as:

def neuronal_charge(self) -> str:
    # note that v_v_seq[t] is v_seq[t - dt]
    return cfunction.add(z='h_seq[t]', x='x_seq[t]', y='v_v_seq[t]', dtype=self.dtype)
- property core
- class spikingjelly.activation_based.auto_cuda.neuron_kernel.NeuronBPTTKernel(surrogate_function: Callable, hard_reset: bool, detach_reset: bool, dtype: str)[source]
Bases: CKernel2D
- property pre_core
- property post_core
- grad_h_next_to_v() -> str [source]
- Returns:
CUDA code
- Return type:
str
Returns the CUDA code for calculating \(\frac{\mathrm{d} H[t+1]}{\mathrm{d} V[t]}\).
This function should define how grad_h_next_to_v is calculated. Note that grad_h_next_to_v has not been declared. Thus, this function should also declare grad_h_next_to_v.
For example, the IF neuron defines this function as:

def grad_h_next_to_v(self) -> str:
    return cfunction.constant(y=f'const {self.dtype} grad_h_next_to_v', x=1., dtype=self.dtype)
- grad_h_to_x() -> str [source]
- Returns:
CUDA code
- Return type:
str
Returns the CUDA code for calculating \(\frac{\mathrm{d} H[t]}{\mathrm{d} X[t]}\).
This function should define how grad_h_to_x is calculated. Note that grad_h_to_x has not been declared. Thus, this function should also declare grad_h_to_x.
For example, the IF neuron defines this function as:

def grad_h_to_x(self) -> str:
    return cfunction.constant(y=f'const {self.dtype} grad_h_to_x', x=1., dtype=self.dtype)
- property core
- class spikingjelly.activation_based.auto_cuda.neuron_kernel.IFNodeFPTTKernel(hard_reset: bool, dtype: str)[source]
- class spikingjelly.activation_based.auto_cuda.neuron_kernel.IFNodeBPTTKernel(surrogate_function: Callable, hard_reset: bool, detach_reset: bool, dtype: str)[source]
- spikingjelly.activation_based.auto_cuda.neuron_kernel.scalar_to_cupy(py_dict: dict, ref: str = 'x_seq')[source]
- spikingjelly.activation_based.auto_cuda.neuron_kernel.new_tensors(news: tuple, py_dict: dict, ref: str = 'x_seq')[source]
- class spikingjelly.activation_based.auto_cuda.neuron_kernel.NeuronATGFBase[source]
Bases: object
- static pre_forward(py_dict: dict)[source]
- Parameters:
py_dict (dict) – a dict built from the neuron's forward autograd function. It should at least contain x_seq, v_init, v_reset
- Returns:
requires_grad, blocks, threads, py_dict
- requires_grad: bool
if any tensor in py_dict requires grad, then requires_grad = True; else requires_grad = False
- blocks: int
CUDA param used in calling the CUDA kernel
- threads: int
CUDA param used in calling the CUDA kernel. The default value is spikingjelly.configure.cuda_threads
- py_dict: dict
Compared with the input py_dict, the returned py_dict will:
convert all float/int scalars in py_dict to cupy.ndarray
add h_seq, spike_seq, v_v_seq to py_dict. h_seq, spike_seq are zero tensors with the same shape as x_seq. v_v_seq is concatenated from v_init and v_seq, where v_seq is a zero tensor with the same shape as x_seq
add N, numel to py_dict. Note that x_seq.shape = [T, N] and numel = T * N. A specific case is x_seq.dtype == torch.half, where N = math.ceil(N / 2) and numel = N * x_seq.shape[0]. Note that N, numel in the returned py_dict are cupy.ndarray
- Return type:
tuple
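A minimal sketch of calling pre_forward directly; the key set (x_seq, v_init, v_th, v_reset) follows the IF neuron and is illustrative:

import torch
from spikingjelly.activation_based.auto_cuda import neuron_kernel

T, N = 8, 64
device = 'cuda:0'
x_seq = torch.rand([T, N], device=device, requires_grad=True)
v_init = torch.zeros([N], device=device)

py_dict = {
    'x_seq': x_seq,
    'v_init': v_init,
    'v_th': 1.,
    'v_reset': 0.,
}
requires_grad, blocks, threads, py_dict = neuron_kernel.NeuronATGFBase.pre_forward(py_dict)

# the float scalars have been converted to cupy.ndarray, and h_seq, spike_seq,
# v_v_seq, N, numel have been added to the returned dict
print(requires_grad, blocks, threads)
print(sorted(py_dict.keys()))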
- static ctx_save(ctx, requires_grad: bool, *args, **kwargs)[source]
- Parameters:
ctx – ctx in torch.autograd.Function
requires_grad (bool) – whether any tensor in the forward params requires grad
args – tensors that need to be saved by ctx.save_for_backward
kwargs – items that need to be saved by ctx.xx = xx
Saves *args, **kwargs in ctx by ctx.save_for_backward(*args) and ctx.xx = xx for all xx in kwargs.items().
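A sketch of using ctx_save inside a custom torch.autograd.Function; the function itself is hypothetical and only demonstrates where ctx_save fits:

import torch
from spikingjelly.activation_based.auto_cuda import neuron_kernel

class ExampleATGF(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x_seq: torch.Tensor, v_th: float):
        spike_seq = (x_seq >= v_th).to(x_seq)
        # tensors go to ctx.save_for_backward, keyword items become ctx attributes
        neuron_kernel.NeuronATGFBase.ctx_save(ctx, x_seq.requires_grad, x_seq, v_th=v_th)
        return spike_seq

    @staticmethod
    def backward(ctx, grad_spike_seq: torch.Tensor):
        x_seq, = ctx.saved_tensors
        # a placeholder surrogate gradient, not the library's actual backward
        grad_x_seq = grad_spike_seq * torch.ones_like(x_seq)
        return grad_x_seq, None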
- static pre_backward(ctx, grad_spike_seq: Tensor, grad_v_seq: Tensor)[source]
- Parameters:
ctx – ctx in torch.autograd.Function
grad_spike_seq (torch.Tensor) – gradients of spike_seq
grad_v_seq (torch.Tensor) – gradients of v_seq
- Returns:
backward_kernel, blocks, threads, py_dict
- backward_kernel: NeuronBPTTKernel
the CUDA kernel used for backward. It should be provided in ctx.backward_kernel
- blocks: int
CUDA param used in calling the CUDA kernel. It should be provided in ctx.blocks
- threads: int
CUDA param used in calling the CUDA kernel. It should be provided in ctx.threads
- Return type:
tuple
- class spikingjelly.activation_based.auto_cuda.neuron_kernel.IFNodeATGF(*args, **kwargs)[source]
Bases: Function
- static forward(ctx, x_seq: Tensor, v_init: Tensor, v_th: float, v_reset: Optional[float], forward_kernel: IFNodeFPTTKernel, backward_kernel: IFNodeBPTTKernel)[source]
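A sketch of invoking IFNodeATGF directly with the kernels above. It assumes that surrogate.Sigmoid() provides the cuda_codes callable expected by IFNodeBPTTKernel and that apply returns (spike_seq, v_seq), following how the neuron module uses this function:

import torch
from spikingjelly.activation_based import surrogate
from spikingjelly.activation_based.auto_cuda import neuron_kernel

T, N = 8, 64
device = 'cuda:0'
x_seq = torch.rand([T, N], device=device, requires_grad=True)
v_init = torch.zeros([N], device=device)

forward_kernel = neuron_kernel.IFNodeFPTTKernel(hard_reset=True, dtype='float')
backward_kernel = neuron_kernel.IFNodeBPTTKernel(
    surrogate_function=surrogate.Sigmoid().cuda_codes,
    hard_reset=True, detach_reset=False, dtype='float')

spike_seq, v_seq = neuron_kernel.IFNodeATGF.apply(
    x_seq, v_init, 1., 0., forward_kernel, backward_kernel)

(spike_seq.sum() + v_seq.sum()).backward()
print(spike_seq.shape, v_seq.shape, x_seq.grad.shape)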
- class spikingjelly.activation_based.auto_cuda.neuron_kernel.LIFNodeFPTTKernel(decay_input: bool, hard_reset: bool, dtype: str)[source]
- class spikingjelly.activation_based.auto_cuda.neuron_kernel.LIFNodeBPTTKernel(decay_input: bool, surrogate_function: Callable, hard_reset: bool, detach_reset: bool, dtype: str)[source]
- class spikingjelly.activation_based.auto_cuda.neuron_kernel.LIFNodeATGF(*args, **kwargs)[source]
Bases: Function
- static forward(ctx, x_seq: Tensor, v_init: Tensor, v_th: float, v_reset: Optional[float], decay: float, forward_kernel: LIFNodeFPTTKernel, backward_kernel: LIFNodeBPTTKernel)[source]
- class spikingjelly.activation_based.auto_cuda.neuron_kernel.ParametricLIFNodeFPTTKernel(decay_input: bool, hard_reset: bool, dtype: str)[source]
- class spikingjelly.activation_based.auto_cuda.neuron_kernel.ParametricLIFNodeBPTTKernel(decay_input: bool, surrogate_function: Callable, hard_reset: bool, detach_reset: bool, dtype: str)[source]
- property head
- property pre_core
- property core
- property tail
- class spikingjelly.activation_based.auto_cuda.neuron_kernel.ParametricLIFNodeATGF(*args, **kwargs)[source]
Bases: Function
- static forward(ctx, x_seq: Tensor, v_init: Tensor, v_th: float, v_reset: Optional[float], decay: Tensor, forward_kernel: ParametricLIFNodeFPTTKernel, backward_kernel: ParametricLIFNodeBPTTKernel)[source]