spikingjelly.activation_based.functional.loss 源代码

import torch
import torch.nn.functional as F
from torch import Tensor

__all__ = [
    "kernel_dot_product",
    "spike_similar_loss",
    "temporal_efficient_training_cross_entropy",
]



[文档]
def kernel_dot_product(x: Tensor, y: Tensor, kernel="linear", *args):
    r"""
    **API Language** - :ref:`中文 <kernel_dot_product-cn>` | :ref:`English <kernel_dot_product-en>`

    ----

    .. _kernel_dot_product-cn:

    * **中文**

    计算批量数据 ``x`` 和 ``y`` 在核空间的内积。记2个M维tensor分别为 :math:`\boldsymbol{x_{i}}` 和 :math:`\boldsymbol{y_{j}}` ， ``kernel`` 定义了不同形式的内积：

    - ``linear`` ，线性内积， :math:`\kappa(\boldsymbol{x_{i}}, \boldsymbol{y_{j}}) = \boldsymbol{x_{i}}^{T}\boldsymbol{y_{j}}`。

    - ``polynomial`` ，多项式内积， :math:`\kappa(\boldsymbol{x_{i}}, \boldsymbol{y_{j}}) = (\boldsymbol{x_{i}}^{T}\boldsymbol{y_{j}})^{d}`，其中 :math:`d = args[0]`。

    - ``sigmoid`` ，Sigmoid内积， :math:`\kappa(\boldsymbol{x_{i}}, \boldsymbol{y_{j}}) = \mathrm{sigmoid}(\alpha \boldsymbol{x_{i}}^{T}\boldsymbol{y_{j}})`，其中 :math:`\alpha = args[0]`。

    - ``gaussian`` ，高斯内积， :math:`\kappa(\boldsymbol{x_{i}}, \boldsymbol{y_{j}}) = \mathrm{exp}(- \frac{||\boldsymbol{x_{i}} - \boldsymbol{y_{j}}||^{2}}{2\sigma^{2}})` ，其中 :math:`\sigma = args[0]`。

    :param x: shape=[N, M]的tensor，看作是N个M维向量
    :type x: torch.Tensor

    :param y: shape=[N, M]的tensor，看作是N个M维向量
    :type y: torch.Tensor

    :param kernel: 计算内积时所使用的核函数
    :type kernel: str

    :param args: 用于计算内积的额外的参数

    :return: ret, shape=[N, N]的tensor， ``ret[i][j]`` 表示 ``x[i]`` 和 ``y[j]`` 的内积
    :rtype: torch.Tensor

    :raises ValueError: 当 ``kernel`` 为 ``"gaussian"`` 且 ``sigma <= 0`` 时

    ----

    .. _kernel_dot_product-en:

    * **English**

    Calculate inner product of ``x`` and ``y`` in kernel space. These 2 M-dim tensors are denoted by :math:`\boldsymbol{x_{i}}` and :math:`\boldsymbol{y_{j}}`. ``kernel`` determine the kind of inner product:

    - ``linear`` -- Linear kernel, :math:`\kappa(\boldsymbol{x_{i}}, \boldsymbol{y_{j}}) = \boldsymbol{x_{i}}^{T}\boldsymbol{y_{j}}`.

    - ``polynomial`` -- Polynomial kernel, :math:`\kappa(\boldsymbol{x_{i}}, \boldsymbol{y_{j}}) = (\boldsymbol{x_{i}}^{T}\boldsymbol{y_{j}})^{d}`, where :math:`d = args[0]`.

    - ``sigmoid`` -- Sigmoid kernel, :math:`\kappa(\boldsymbol{x_{i}}, \boldsymbol{y_{j}}) = \mathrm{sigmoid}(\alpha \boldsymbol{x_{i}}^{T}\boldsymbol{y_{j}})`, where :math:`\alpha = args[0]`.

    - ``gaussian`` -- Gaussian kernel, :math:`\kappa(\boldsymbol{x_{i}}, \boldsymbol{y_{j}}) = \mathrm{exp}(- \frac{||\boldsymbol{x_{i}} - \boldsymbol{y_{j}}||^{2}}{2\sigma^{2}})`, where :math:`\sigma = args[0]`.

    :param x: Tensor of shape=[N, M]
    :type x: torch.Tensor

    :param y: Tensor of shape=[N, M]
    :type y: torch.Tensor

    :param kernel: Type of kernel function used when calculating inner products.
    :type kernel: str

    :param args: Extra parameters for inner product

    :return: ret, Tensor of shape=[N, N], ``ret[i][j]`` is inner product of ``x[i]`` and ``y[j]``.
    :rtype: torch.Tensor

    :raises ValueError: if ``kernel`` is ``"gaussian"`` and ``sigma <= 0``
    """
    if kernel == "linear":
        return x.mm(y.t())
    elif kernel == "polynomial":
        d = args[0]
        return x.mm(y.t()).pow(d)
    elif kernel == "sigmoid":
        alpha = args[0]
        return torch.sigmoid(alpha * x.mm(y.t()))
    elif kernel == "gaussian":
        sigma = args[0]
        if sigma <= 0:
            raise ValueError(
                f"sigma must be positive for the Gaussian kernel, but got {sigma}"
            )
        N = x.shape[0]
        x2 = x.square().sum(dim=1)  # shape=[N]
        y2 = y.square().sum(dim=1)  # shape=[N]
        xy = x.mm(y.t())  # shape=[N, N]
        d_xy = x2.unsqueeze(1).repeat(1, N) + y2.unsqueeze(0).repeat(N, 1) - 2 * xy
        # d_xy[i][j]的元素是x[i]的平方和，加上y[j]的平方和，减去2倍的sum_{k} x[i][k]y[j][k]，因此
        # d_xy[i][j]就是x[i]和y[j]相减，平方，求和
        return torch.exp(-d_xy / (2 * sigma * sigma))
    else:
        raise NotImplementedError




[文档]
def spike_similar_loss(
    spikes: Tensor, labels: Tensor, kernel_type="linear", loss_type="mse", *args
):
    r"""
    **API Language** - :ref:`中文 <spike_similar_loss-cn>` | :ref:`English <spike_similar_loss-en>`

    ----

    .. _spike_similar_loss-cn:

    * **中文**

    将N个数据输入到输出层有M个神经元的SNN，运行T步，得到shape=[N, M, T]的脉冲。这N个数据的标签为shape=[N, C]的 ``labels``。

    用shape=[N, N]的矩阵 ``sim`` 表示 **实际相似度矩阵**， ``sim[i][j] == 1`` 表示
    数据i与数据j相似，反之亦然。若 ``labels[i]`` 与 ``labels[j]`` 共享至少同一个标签，则
    认为他们相似，否则不相似。

    用shape=[N, N]的矩阵 ``sim_p`` 表示 **输出相似度矩阵** ， ``sim_p[i][j]`` 的取值
    为0到1，值越大表示数据i与数据j的脉冲越相似。

    使用内积来衡量两个脉冲之间的相似性， ``kernel_type`` 是计算内积时，所使用的核函数种类：

    - ``linear`` ，线性内积， :math:`\kappa(\boldsymbol{x_{i}}, \boldsymbol{y_{j}}) = \boldsymbol{x_{i}}^{T}\boldsymbol{y_{j}}`。

    - ``sigmoid`` ，Sigmoid内积， :math:`\kappa(\boldsymbol{x_{i}}, \boldsymbol{y_{j}}) = \mathrm{sigmoid}(\alpha \boldsymbol{x_{i}}^{T}\boldsymbol{y_{j}})`，其中 :math:`\alpha = args[0]`。

    - ``gaussian`` ，高斯内积，:math:`\kappa(\boldsymbol{x_{i}}, \boldsymbol{y_{j}}) = \mathrm{exp}(- \frac{||\boldsymbol{x_{i}} - \boldsymbol{y_{j}}||^{2}}{2\sigma^{2}})`，其中 :math:`\sigma = args[0]`。

    当使用Sigmoid或高斯内积时，内积的取值范围均在[0, 1]之间；而使用线性内积时，为了保证内积取值仍然在[0, 1]之间，会进行归一化，按照 :math:`\text{sim}_p[i][j]=\frac{\kappa(\boldsymbol{x_{i}}, \boldsymbol{y_{j}})}{||\boldsymbol{x_{i}}|| · ||\boldsymbol{y_{j}}||}`。

    对于相似的数据，根据输入的 ``loss_type`` ，返回度量 ``sim`` 与 ``sim_p`` 差异的损失：

    - ``mse`` -- 返回sim与sim_p的均方误差（也就是l2误差）。

    - ``l1`` -- 返回sim与sim_p的l1误差。

    - ``bce`` -- 返回sim与sim_p的二值交叉熵误差。

    .. note::
        脉冲向量稀疏、离散，最好先使用高斯核进行平滑，然后再计算相似度。

    :param spikes: shape=[N, M, T]，N个数据生成的脉冲
    :type spikes: torch.Tensor

    :param labels: shape=[N, C]，N个数据的标签， ``labels[i][k] == 1`` 表示数据i属于
        第k类，反之亦然，允许多标签
    :type labels: torch.Tensor

    :param kernel_type: 使用内积来衡量两个脉冲之间的相似性， ``kernel_type`` 是计算内积时，
        所使用的核函数种类
    :type kernel_type: str

    :param loss_type: 返回哪种损失，可以为'mse', 'l1', 'bce'
    :type loss_type: str

    :param args: 用于计算内积的额外参数

    :return: shape=[1]的tensor，相似损失
    :rtype: torch.Tensor

    ----

    .. _spike_similar_loss-en:

    * **English**

    A SNN consisting M neurons will receive a batch of N input data in each timestep (from 0 to T-1) and output a spike tensor of shape=[N, M, T]. The label is a tensor of shape=[N, C].

    The **groundtruth similarity matrix** ``sim`` has a shape of [N, N]. ``sim[i][j] == 1`` indicates that input i is similar to input j and vice versa. If and only if ``labels[i]`` and ``labels[j]`` have at least one common label, they are viewed as similar.

    The **output similarity matrix** ``sim_p`` has a shape of [N, N]. The value of ``sim_p[i][j]`` ranges from 0 to 1, represents the similarity between output spike from both input i and input j.

    The similarity is measured by inner product of two spikes. ``kernel_type`` is the type of kernel function when calculating inner product:

    - ``linear``, Linear kernel, :math:`\kappa(\boldsymbol{x_{i}}, \boldsymbol{y_{j}}) = \boldsymbol{x_{i}}^{T}\boldsymbol{y_{j}}`.

    - ``sigmoid``, Sigmoid kernel, :math:`\kappa(\boldsymbol{x_{i}}, \boldsymbol{y_{j}}) = \mathrm{sigmoid}(\alpha \boldsymbol{x_{i}}^{T}\boldsymbol{y_{j}})`, where :math:`\alpha = args[0]`.

    - ``gaussian``, Gaussian kernel，:math:`\kappa(\boldsymbol{x_{i}}, \boldsymbol{y_{j}}) = \mathrm{exp}(- \frac{||\boldsymbol{x_{i}} - \boldsymbol{y_{j}}||^{2}}{2\sigma^{2}})`, where :math:`\sigma = args[0]`.

    When Sigmoid or Gaussian kernel is applied, the inner product naturally lies in :math:`[0, 1]`. To make the value consistent when using linear kernel, the result will be normalized as: :math:`\text{sim}_p[i][j]=\frac{\kappa(\boldsymbol{x_{i}}, \boldsymbol{y_{j}})}{||\boldsymbol{x_{i}}|| · ||\boldsymbol{y_{j}}||}`.

    For similar data, return the specified discrepancy loss between ``sim`` and ``sim_p`` according to ``loss_type``.

    - ``mse`` -- Return the Mean-Square Error (squared L2 norm) between sim and sim_p.

    - ``l1`` -- Return the L1 error between sim and sim_p.

    - ``bce`` -- Return the Binary Cross Entropy between sim and sim_p.

    .. admonition:: Note
        :class: note

        Since spike vectors are usually discrete and sparse, it would be better to apply Gaussian filter first to smooth the vectors before calculating similarities.

    :param spikes: shape=[N, M, T], output spikes corresponding to a batch of N inputs
    :type spikes: torch.Tensor

    :param labels: shape=[N, C], labels of inputs, ``labels[i][k] == 1`` means
        the i-th input belongs to the k-th category and vice versa.
        Multi-label input is allowed.
    :type labels: torch.Tensor

    :param kernel_type: Type of kernel function used when calculating inner
        products. The inner product is the similarity measure of two spikes.
    :type kernel_type: str

    :param loss_type: Type of loss returned. Can be: 'mse', 'l1', 'bce'
    :type loss_type: str

    :param args: Extra parameters for inner product

    :return: shape=[1], similarity loss
    :rtype: torch.Tensor
    """
    spikes = spikes.flatten(start_dim=1)

    sim_p = kernel_dot_product(spikes, spikes, kernel_type, *args)

    if kernel_type == "linear":
        spikes_len = spikes.norm(p=2, dim=1, keepdim=True)
        sim_p = sim_p / ((spikes_len.mm(spikes_len.t())) + 1e-8)

    labels = labels.float()
    sim = labels.mm(labels.t()).clamp_max(
        1
    )  # labels.mm(labels.t())[i][j]位置的元素表现输入数据i和数据数据j有多少个相同的标签
    # 将大于1的元素设置为1，因为共享至少同一个标签，就认为他们相似

    if loss_type == "mse":
        return F.mse_loss(sim_p, sim)
    elif loss_type == "l1":
        return F.l1_loss(sim_p, sim)
    elif loss_type == "bce":
        return F.binary_cross_entropy(sim_p, sim)
    else:
        raise NotImplementedError




[文档]
def temporal_efficient_training_cross_entropy(x_seq: Tensor, target: Tensor):
    """
    **API Language** - :ref:`中文 <temporal_efficient_training_cross_entropy-cn>` | :ref:`English <temporal_efficient_training_cross_entropy-en>`

    ----

    .. _temporal_efficient_training_cross_entropy-cn:

    * **中文**

    Temporal efficient training (TET) 交叉熵损失, 是每个时间步的交叉熵损失的平均。

    .. note::

        TET交叉熵是 `Temporal Efficient Training of Spiking Neural Network via Gradient Re-weighting <https://openreview.net/forum?id=_XNtisL32jv>`_ 一文提出的。

    :param x_seq: ``shape=[T, N, C, *]`` 的预测值，其中 ``C`` 是类别总数
    :type x_seq: torch.Tensor

    :param target: ``shape=[N]`` 的真实值，其中 ``target[i]`` 是真实类别
    :type target: torch.Tensor

    :return: 损失值
    :rtype: torch.Tensor

    ----

    .. _temporal_efficient_training_cross_entropy-en:

    * **English**

    The temporal efficient training (TET) cross entropy, which is the mean of cross entropy of each time-step.

    .. admonition:: Note
        :class: note

        The TET cross entropy is proposed by `Temporal Efficient Training of Spiking Neural Network via Gradient Re-weighting <https://openreview.net/forum?id=_XNtisL32jv>`_.

    :param x_seq: the predicted value with ``shape=[T, N, C, *]``, where ``C`` is the number of classes
    :type x_seq: torch.Tensor

    :param target: the ground truth tensor with ``shape=[N]``, where ``target[i]`` is the label
    :type target: torch.Tensor

    :return: the temporal efficient training cross entropy
    :rtype: torch.Tensor

    ----

    * **示例代码 | Example**

    .. code-block:: python

        def tet_ce_for_loop_version(x_seq: torch.Tensor, target: torch.LongTensor):
            loss = 0.0
            for t in range(x_seq.shape[0]):
                loss += F.cross_entropy(x_seq[t], target)
            return loss / x_seq.shape[0]


        T = 8
        N = 4
        C = 10
        x_seq = torch.rand([T, N, C])
        target = torch.randint(low=0, high=C - 1, size=[N])
        print(
            f"max error = {(tet_ce_for_loop_version(x_seq, target) - temporal_efficient_training_cross_entropy(x_seq, target)).abs().max()}"
        )
        # max error < 1e-6
    """
    # x_seq.shape = [T, N, C, *]
    x_seq = x_seq.transpose(0, 1).transpose(1, 2)  # [N, C, T, *]
    T = x_seq.shape[2]
    if x_seq.dim() == 3:
        # x_seq.shape = [N, C, T]
        # target.shape = [N]
        target = target.unsqueeze(1).repeat(1, T)  # [N, T]
    else:
        # x_seq.shape = [N, C, T, d1, d2, ..., dk]
        # target.shape = [N, d1, d2, ..., dk]
        rep_shape = [1, T]
        rep_shape.extend([1] * (x_seq.dim() - 3))
        target = target.unsqueeze(1).repeat(rep_shape)

    loss = F.cross_entropy(x_seq, target)
    return loss