spikingjelly.activation_based.rnn package
Module contents
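- spikingjelly.activation_based.rnn.bidirectional_rnn_cell_forward(cell: Module, cell_reverse: Module, x: Tensor, states: Tensor, states_reverse: Tensor)[source]
- Parameters
cell (nn.Module) – the forward RNN cell, whose input is the sequence in forward order
cell_reverse (nn.Module) – the reverse RNN cell, whose input is the sequence in reverse order
x (torch.Tensor) – the input with shape = [T, batch_size, input_size]
states (torch.Tensor) – the initial state of the forward RNN cell. If the RNN cell has only a single hidden state, shape = [batch_size, hidden_size]; otherwise shape = [states_num, batch_size, hidden_size]
states_reverse – the initial state of the reverse RNN cell. If the RNN cell has only a single hidden state, shape = [batch_size, hidden_size]; otherwise shape = [states_num, batch_size, hidden_size]
- Returns
y, ss, ss_r
- y: torch.Tensor
the output with shape = [T, batch_size, 2 * hidden_size]. y[t] is the concatenation of the forward cell's output at time t and the reverse cell's output at time T - t - 1
- ss: torch.Tensor
same shape as states; the state of the forward cell at time T - 1
- ss_r: torch.Tensor
same shape as states_reverse; the state of the reverse cell at time 0
Runs a single forward RNN cell and a single reverse RNN cell along the time dimension, and returns the outputs together with the final states of both cells.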
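The pairing of forward step t with reverse step T - t - 1 can be easier to follow on a toy example. The sketch below is not the library code: it uses plain Python lists instead of tensors, omits the batch dimension, and `running_sum_cell` is a hypothetical stand-in for an RNN cell with a single hidden state.

```python
def bidirectional_cell_forward(cell, cell_reverse, x, state, state_reverse):
    """Toy version of the bidirectional loop using plain Python lists.

    `cell(x_t, s)` must return the next state, which also serves as the
    output (the single-hidden-state case).
    """
    T = len(x)
    out, out_r = [], []
    s, s_r = state, state_reverse
    for t in range(T):
        s = cell(x[t], s)                      # forward cell reads x[0], x[1], ...
        s_r = cell_reverse(x[T - t - 1], s_r)  # reverse cell reads x[T-1], x[T-2], ...
        out.append(s)
        out_r.append(s_r)
    # y[t] concatenates forward step t with reverse step T - t - 1;
    # both outputs were produced right after the cells consumed x[t].
    y = [out[t] + out_r[T - t - 1] for t in range(T)]
    return y, s, s_r


def running_sum_cell(x_t, s):
    # Hypothetical stand-in for an RNN cell: next state = state + input.
    return [s_i + x_i for s_i, x_i in zip(s, x_t)]


x = [[1.0], [2.0], [3.0]]  # T = 3, one feature
y, ss, ss_r = bidirectional_cell_forward(
    running_sum_cell, running_sum_cell, x, [0.0], [0.0])
print(y)         # [[1.0, 6.0], [3.0, 5.0], [6.0, 3.0]]
print(ss, ss_r)  # [6.0] [6.0]
```

Each row of `y` has twice the hidden size, matching the documented shape [T, batch_size, 2 * hidden_size].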
- class spikingjelly.activation_based.rnn.SpikingRNNCellBase(input_size: int, hidden_size: int, bias=True)[source]
Bases: Module
The base class of Spiking RNN Cell.
Note
All the weights and biases are initialized from \(\mathcal{U}(-\sqrt{k}, \sqrt{k})\), where \(k = \frac{1}{\text{hidden_size}}\).
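As a rough illustration of the initialization rule in the note, the sketch below samples a weight matrix from \(\mathcal{U}(-\sqrt{k}, \sqrt{k})\) using only the Python standard library. The helper `uniform_init` and the name `w_ih` are made up for this example and do not correspond to library functions or attributes.

```python
import math
import random


def uniform_init(rows, cols, hidden_size, rng):
    """Sample a rows x cols matrix from U(-sqrt(k), sqrt(k)), k = 1 / hidden_size."""
    bound = math.sqrt(1.0 / hidden_size)
    return [[rng.uniform(-bound, bound) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
hidden_size, input_size = 4, 3
w_ih = uniform_init(hidden_size, input_size, hidden_size, rng)

bound = math.sqrt(1.0 / hidden_size)  # 0.5 when hidden_size = 4
print(bound)
print(all(abs(w) <= bound for row in w_ih for w in row))
```

The bound shrinks as `hidden_size` grows, so larger layers start from smaller weights.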
- class spikingjelly.activation_based.rnn.SpikingRNNBase(input_size, hidden_size, num_layers, bias=True, dropout_p=0, invariant_dropout_mask=False, bidirectional=False, *args, **kwargs)[source]
Bases: Module
The base class of a multi-layer spiking RNN.
- Parameters
input_size (int) – The number of expected features in the input x
hidden_size (int) – The number of features in the hidden state h
num_layers (int) – Number of recurrent layers. E.g., setting num_layers=2 stacks two RNNs to form a stacked RNN, with the second RNN taking the outputs of the first RNN as its input and computing the final results
bias (bool) – If False, the layer does not use the bias weights b_ih and b_hh. Default: True
dropout_p (float) – If non-zero, introduces a Dropout layer on the outputs of each RNN layer except the last layer, with dropout probability equal to dropout_p. Default: 0
invariant_dropout_mask (bool) – If False, use the naive Dropout; if True, use the SNN-specific dropout whose mask does not change across time steps (see Dropout for more information). Default: False
bidirectional (bool) – If True, becomes a bidirectional RNN. Default: False
args – additional arguments for the sub-class
kwargs – additional arguments for the sub-class
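The difference selected by invariant_dropout_mask can be sketched as follows. This is only an illustration of the idea (one mask shared over all time steps versus a fresh mask per step), written in plain Python under the assumption of standard inverted-dropout scaling; it is not the library's Dropout layer, and `make_mask` and `dropout_sequence` are hypothetical helpers.

```python
import random


def make_mask(n, p, rng):
    """Bernoulli keep-mask with inverted-dropout scaling 1 / (1 - p)."""
    return [0.0 if rng.random() < p else 1.0 / (1.0 - p) for _ in range(n)]


def dropout_sequence(x_seq, p, invariant, rng):
    """Apply dropout to a [T][features] sequence.

    invariant=False: draw a fresh mask at every time step (ordinary dropout).
    invariant=True:  draw one mask and reuse it for all time steps.
    """
    n = len(x_seq[0])
    shared = make_mask(n, p, rng) if invariant else None
    out = []
    for x_t in x_seq:
        mask = shared if invariant else make_mask(n, p, rng)
        out.append([m * v for m, v in zip(mask, x_t)])
    return out


rng = random.Random(0)
x_seq = [[1.0] * 8 for _ in range(5)]  # T = 5 steps of all-ones input
y = dropout_sequence(x_seq, p=0.5, invariant=True, rng=rng)
# With a shared mask, the same units are dropped at every time step.
print(all(y_t == y[0] for y_t in y))  # True
```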
- create_cells(*args, **kwargs)[source]
- Parameters
args – additional arguments for the sub-class
kwargs – additional arguments for the sub-class
- Returns
If self.bidirectional == True, returns one stacked RNN for the forward direction and one stacked RNN for the reverse direction; otherwise returns a single stacked RNN
- Return type
nn.Sequential
- static base_cell()[source]
- Returns
The base cell of this RNN. E.g., for SpikingLSTM this function returns SpikingLSTMCell
- Return type
nn.Module
- static states_num()[source]
- Returns
The number of state variables. E.g., for SpikingLSTM the outputs are h and c, so this function returns 2; for SpikingGRU the output is h, so this function returns 1
- Return type
int
- forward(x: Tensor, states=None)[source]
- Parameters
x (torch.Tensor) – shape = [T, batch_size, input_size], tensor containing the features of the input sequence
states (torch.Tensor or tuple) – a single tensor when self.states_num() is 1, otherwise a tuple with self.states_num() tensors. shape = [num_layers * num_directions, batch, hidden_size] for all tensors, containing the self.states_num() initial states for each element in the batch. If the RNN is bidirectional, num_directions is 2; otherwise it is 1
- Returns
output, output_states
- output: torch.Tensor
shape = [T, batch, num_directions * hidden_size], tensor containing the output features from the last layer of the RNN, for each t
- output_states: torch.Tensor or tuple
a single tensor when self.states_num() is 1, otherwise a tuple with self.states_num() tensors. shape = [num_layers * num_directions, batch, hidden_size] for all tensors, containing the self.states_num() states for t = T - 1
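The shape rules for forward can be worked through with a small helper. The function `rnn_shapes` below is hypothetical and only restates the documented arithmetic with tuples; it does not call the library.

```python
def rnn_shapes(T, batch, hidden_size, num_layers, bidirectional, states_num):
    """Expected shapes of `output` and `output_states` per the rules above."""
    num_directions = 2 if bidirectional else 1
    output = (T, batch, num_directions * hidden_size)
    state = (num_layers * num_directions, batch, hidden_size)
    # One tensor for a single state, otherwise a tuple of states_num tensors.
    if states_num == 1:
        output_states = state
    else:
        output_states = tuple(state for _ in range(states_num))
    return output, output_states


# A 2-layer bidirectional RNN with two states (h and c, as in an LSTM).
shapes = rnn_shapes(T=6, batch=2, hidden_size=4, num_layers=2,
                    bidirectional=True, states_num=2)
print(shapes)  # ((6, 2, 8), ((4, 2, 4), (4, 2, 4)))
```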
- class spikingjelly.activation_based.rnn.SpikingLSTMCell(input_size: int, hidden_size: int, bias=True, surrogate_function1=Erf(alpha=2.0, spiking=True), surrogate_function2=None)[source]
A spiking long short-term memory (LSTM) cell, first proposed in Long Short-Term Memory Spiking Networks and Their Applications.
\[\begin{split}i &= \Theta(W_{ii} x + b_{ii} + W_{hi} h + b_{hi}) \\ f &= \Theta(W_{if} x + b_{if} + W_{hf} h + b_{hf}) \\ g &= \Theta(W_{ig} x + b_{ig} + W_{hg} h + b_{hg}) \\ o &= \Theta(W_{io} x + b_{io} + W_{ho} h + b_{ho}) \\ c' &= f * c + i * g \\ h' &= o * c'\end{split}\]
where \(\Theta\) is the Heaviside step function (the spiking function), and \(*\) is the Hadamard (element-wise) product.
- Parameters
input_size (int) – The number of expected features in the input x
hidden_size (int) – The number of features in the hidden state h
bias (bool) – If False, the layer does not use the bias weights b_ih and b_hh. Default: True
surrogate_function1 (spikingjelly.activation_based.surrogate.SurrogateFunctionBase) – surrogate function that replaces the gradient of the spiking function during back-propagation; used for generating i, f, o
surrogate_function2 (None or spikingjelly.activation_based.surrogate.SurrogateFunctionBase) – surrogate function that replaces the gradient of the spiking function during back-propagation; used for generating g. If None, the surrogate function for generating g is set to surrogate_function1. Default: None
Note
All the weights and biases are initialized from \(\mathcal{U}(-\sqrt{k}, \sqrt{k})\), where \(k = \frac{1}{\text{hidden_size}}\).
Examples:
    import torch
    from spikingjelly.activation_based import rnn

    T = 6
    batch_size = 2
    input_size = 3
    hidden_size = 4
    cell = rnn.SpikingLSTMCell(input_size, hidden_size)
    # Random input sequence with shape [T, batch_size, input_size]
    x_seq = torch.randn(T, batch_size, input_size) * 50
    h = torch.randn(batch_size, hidden_size)
    c = torch.randn(batch_size, hidden_size)
    output = []
    for t in range(T):
        # Feed one time step and carry (h, c) over to the next step
        h, c = cell(x_seq[t], (h, c))
        output.append(h)
    print(output)
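To make the gate equations concrete, here is a minimal sketch of one cell step for a single sample, written with plain Python lists rather than tensors. It is not the library implementation: the helpers `heaviside`, `affine`, and `spiking_lstm_step` are hypothetical, the weights are hand-picked toy values, and treating a pre-activation of exactly 0 as a spike is an assumption of this sketch.

```python
def heaviside(v):
    """Theta: emit a spike (1.0) when the pre-activation is non-negative (assumed convention)."""
    return 1.0 if v >= 0.0 else 0.0


def affine(W, b, v):
    """Compute W v + b for a list-of-rows matrix W."""
    return [sum(w * v_j for w, v_j in zip(row, v)) + b_i for row, b_i in zip(W, b)]


def spiking_lstm_step(x, h, c, params):
    """One step of the cell equations for a single sample.

    params maps each gate name in "ifgo" to (W_input, b_input, W_hidden, b_hidden).
    """
    gates = {}
    for name in "ifgo":
        W_x, b_x, W_h, b_h = params[name]
        pre = [a + r for a, r in zip(affine(W_x, b_x, x), affine(W_h, b_h, h))]
        gates[name] = [heaviside(p) for p in pre]
    # c' = f * c + i * g
    c_next = [f * c_k + i * g
              for f, c_k, i, g in zip(gates["f"], c, gates["i"], gates["g"])]
    # h' = o * c'
    h_next = [o * c_k for o, c_k in zip(gates["o"], c_next)]
    return h_next, c_next


# Toy parameters for input_size = 1, hidden_size = 2 (not trained weights).
zero_h = [[0.0, 0.0], [0.0, 0.0]]
params = {
    "i": ([[1.0], [1.0]], [0.0, 0.0], zero_h, [0.0, 0.0]),
    "f": ([[1.0], [-1.0]], [0.0, 0.0], zero_h, [0.0, 0.0]),
    "g": ([[1.0], [1.0]], [0.0, 0.0], zero_h, [0.0, 0.0]),
    "o": ([[1.0], [1.0]], [0.0, 0.0], zero_h, [0.0, 0.0]),
}
h, c = spiking_lstm_step([2.0], [0.0, 0.0], [1.0, 1.0], params)
print(h, c)  # [2.0, 1.0] [2.0, 1.0]
```

In the second hidden unit the forget gate does not fire, so the old cell value is discarded and only the new contribution i * g remains.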
- forward(x: Tensor, hc=None)[source]
- Parameters
x (torch.Tensor) – the input tensor with shape = [batch_size, input_size]
hc (tuple or None) – (h_0, c_0)
- h_0: torch.Tensor
shape = [batch_size, hidden_size], tensor containing the initial hidden state for each element in the batch
- c_0: torch.Tensor
shape = [batch_size, hidden_size], tensor containing the initial cell state for each element in the batch
If (h_0, c_0) is not provided, both h_0 and c_0 default to zero
- Returns
(h_1, c_1)
- h_1: torch.Tensor
shape = [batch_size, hidden_size], tensor containing the next hidden state for each element in the batch
- c_1: torch.Tensor
shape = [batch_size, hidden_size], tensor containing the next cell state for each element in the batch
- Return type
tuple
- class spikingjelly.activation_based.rnn.SpikingLSTM(input_size, hidden_size, num_layers, bias=True, dropout_p=0, invariant_dropout_mask=False, bidirectional=False, surrogate_function1=Erf(alpha=2.0, spiking=True), surrogate_function2=None)[source]
The spiking multi-layer long short-term memory (LSTM), first proposed in Long Short-Term Memory Spiking Networks and Their Applications.
For each element in the input sequence, each layer computes the following function:
\[\begin{split}i_{t} &= \Theta(W_{ii} x_{t} + b_{ii} + W_{hi} h_{t-1} + b_{hi}) \\ f_{t} &= \Theta(W_{if} x_{t} + b_{if} + W_{hf} h_{t-1} + b_{hf}) \\ g_{t} &= \Theta(W_{ig} x_{t} + b_{ig} + W_{hg} h_{t-1} + b_{hg}) \\ o_{t} &= \Theta(W_{io} x_{t} + b_{io} + W_{ho} h_{t-1} + b_{ho}) \\ c_{t} &= f_{t} * c_{t-1} + i_{t} * g_{t} \\ h_{t} &= o_{t} * c_{t}\end{split}\]
where \(h_t\) is the hidden state at time t, \(c_t\) is the cell state at time t, \(x_t\) is the input at time t, \(h_{t-1}\) is the hidden state of the layer at time t-1 or the initial hidden state at time 0, and \(i_t\), \(f_t\), \(g_t\), \(o_t\) are the input, forget, cell, and output gates, respectively. \(\Theta\) is the Heaviside step function (the spiking function), and \(*\) is the Hadamard (element-wise) product.
- Parameters
input_size (int) – The number of expected features in the input x
hidden_size (int) – The number of features in the hidden state h
num_layers (int) – Number of recurrent layers. E.g., setting num_layers=2 stacks two LSTMs to form a stacked RNN, with the second LSTM taking the outputs of the first LSTM as its input and computing the final results
bias (bool) – If False, the layer does not use the bias weights b_ih and b_hh. Default: True
dropout_p (float) – If non-zero, introduces a Dropout layer on the outputs of each RNN layer except the last layer, with dropout probability equal to dropout_p. Default: 0
invariant_dropout_mask (bool) – If False, use the naive Dropout; if True, use the SNN-specific dropout whose mask does not change across time steps (see Dropout for more information). Default: False
bidirectional (bool) – If True, becomes a bidirectional LSTM. Default: False
surrogate_function1 (spikingjelly.activation_based.surrogate.SurrogateFunctionBase) – surrogate function that replaces the gradient of the spiking function during back-propagation; used for generating i, f, o
surrogate_function2 (None or spikingjelly.activation_based.surrogate.SurrogateFunctionBase) – surrogate function that replaces the gradient of the spiking function during back-propagation; used for generating g. If None, the surrogate function for generating g is set to surrogate_function1. Default: None
- class spikingjelly.activation_based.rnn.SpikingVanillaRNNCell(input_size: int, hidden_size: int, bias=True, surrogate_function=Erf(alpha=2.0, spiking=True))[source]
- class spikingjelly.activation_based.rnn.SpikingVanillaRNN(input_size, hidden_size, num_layers, bias=True, dropout_p=0, invariant_dropout_mask=False, bidirectional=False, surrogate_function=Erf(alpha=2.0, spiking=True))[source]
- class spikingjelly.activation_based.rnn.SpikingGRUCell(input_size: int, hidden_size: int, bias=True, surrogate_function1=Erf(alpha=2.0, spiking=True), surrogate_function2=None)[source]