spikingjelly.activation_based.distributed package

This subpackage provides experimental distributed-training helpers for multi-step SNNs in spikingjelly.activation_based. It builds on torch.distributed, DTensor, tensor parallelism, and FSDP2.

Distributed Helpers

SNNDistributedConfig

High-level configuration for DTensor-ready SNN distribution.

SNNDistributedAnalysis

Capability analysis for stateful modules and tensor-parallel candidates.

ensure_distributed_initialized

Initialize torch.distributed when needed.

build_device_mesh

Build a DeviceMesh for tensor/data parallelism.

configure_snn_distributed

The main low-level entry for DTensor-ready SNN distribution.

configure_cifar10dvs_vgg_distributed

Convenience helper for CIFAR10DVSVGG with DP / TP.

configure_cifar10dvs_vgg_fsdp2

Convenience helper for CIFAR10DVSVGG with FSDP2 / FSDP2+TP.

materialize_dtensor_output

Convert a DTensor output back to a regular tensor when needed.

Distributed training support module with tensor-parallel and data-parallel utilities.

class spikingjelly.activation_based.distributed.DistributedFeatureSet(allow_experimental_conv_tp: bool = False, allow_experimental_spikformer_tp: bool = False, allow_pipeline: bool = True, allow_zero_optimizer: bool = True)

Bases: object

allow_experimental_conv_tp: bool = False
allow_experimental_spikformer_tp: bool = False
allow_pipeline: bool = True
allow_zero_optimizer: bool = True
class spikingjelly.activation_based.distributed.SNNDistributedPlan(mode: str, objective: str, topology: SNNDistributedTopology, model_family: str, backend: str, batch_size: int, optimizer_strategy: str, memopt_level: int, rationale: Tuple[str, ...], notes: Tuple[str, ...], tensor_parallel_roots: Optional[Tuple[str, ...]] = None, mesh_shape: Optional[Tuple[int, ...]] = None, tp_mesh_dim: int = 0, dp_mesh_dim: Optional[int] = None, pp_microbatches: Optional[int] = None, pp_schedule: str = '1f1b', pp_virtual_stages: int = 1, pp_layout: Optional[Tuple[int, ...]] = None, pp_delay_wgrad: bool = False, experimental_features: DistributedFeatureSet = DistributedFeatureSet(allow_experimental_conv_tp=False, allow_experimental_spikformer_tp=False, allow_pipeline=True, allow_zero_optimizer=True))

Bases: object

dp_mesh_dim: int | None = None
experimental_features: DistributedFeatureSet = DistributedFeatureSet(allow_experimental_conv_tp=False, allow_experimental_spikformer_tp=False, allow_pipeline=True, allow_zero_optimizer=True)
mesh_shape: Tuple[int, ...] | None = None
pp_delay_wgrad: bool = False
pp_layout: Tuple[int, ...] | None = None
pp_microbatches: int | None = None
pp_schedule: str = '1f1b'
pp_virtual_stages: int = 1
tensor_parallel_roots: Tuple[str, ...] | None = None
tp_mesh_dim: int = 0
mode: str
objective: str
topology: SNNDistributedTopology
model_family: str
backend: str
batch_size: int
optimizer_strategy: str
memopt_level: int
rationale: Tuple[str, ...]
notes: Tuple[str, ...]
class spikingjelly.activation_based.distributed.SNNDistributedAnalysis(memory_module_names: Tuple[str, ...], tensor_parallel_candidate_names: Tuple[str, ...], unsupported_tensor_parallel_names: Tuple[str, ...], notes: Tuple[str, ...], tensor_parallel_roots: Tuple[str, ...] | None = None)

Bases: object

SNN distributed training analyzer. Analyzes the model structure and recommends a parallelism strategy.

tensor_parallel_roots: Tuple[str, ...] | None = None
memory_module_names: Tuple[str, ...]
tensor_parallel_candidate_names: Tuple[str, ...]
unsupported_tensor_parallel_names: Tuple[str, ...]
notes: Tuple[str, ...]
class spikingjelly.activation_based.distributed.SNNDistributedRuntime(kind: str, model: nn.Module, mesh: Optional[object], analysis: Optional[SNNDistributedAnalysis], plan: Optional[SNNDistributedPlan] = None, mode: str = 'none', pipeline_runtime: Optional[SNNPipelineRuntime] = None)

Bases: object

build_optimizer(optimizer_cls=torch.optim.Adam, lr: float = 0.001, weight_decay: float = 0.0, **kwargs)
forward_loss(criterion, images: Tensor, labels: Tensor)
classmethod from_legacy(*, kind: str, model: Module, mesh: object | None, analysis: SNNDistributedAnalysis | None, mode: str, pipeline_runtime: SNNPipelineRuntime | None = None) -> SNNDistributedRuntime
mode: str = 'none'
pipeline_runtime: SNNPipelineRuntime | None = None
plan: SNNDistributedPlan | None = None
prepare_classification_output(outputs, labels: Tensor, *, return_metadata: bool = False) -> Tuple[Tensor, Tensor] | PreparedModelOutput
prepare_dataloader(*, dataset, batch_size: int, shuffle: bool, num_workers: int, drop_last: bool, pin_memory: bool = True) -> DataLoader
static reduce_classification_output(outputs: Tensor, labels: Tensor) -> Tuple[Tensor, Tensor]
reset_state()
kind: str
model: Module
mesh: object | None
analysis: SNNDistributedAnalysis | None
class spikingjelly.activation_based.distributed.SNNDistributedTopology(world_size: int, dims: Mapping[str, int])

Bases: object

classmethod from_mapping(dims: Mapping[str, int], *, world_size: int | None = None) -> SNNDistributedTopology
property mesh_shape: Tuple[int, ...]
property ordered_dim_names: Tuple[str, ...]
world_size: int
dims: Mapping[str, int]
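
The relationship between dims, world_size, mesh_shape and ordered_dim_names can be illustrated with a small stand-in class. This is a minimal sketch and not the library's implementation: the class name TopologySketch, the dimension names "dp" and "tp", the rule that the dimension sizes must multiply to world_size, and the use of mapping insertion order for the mesh are all assumptions made for illustration.

```python
from dataclasses import dataclass
from math import prod
from typing import Mapping, Optional, Tuple


@dataclass(frozen=True)
class TopologySketch:
    """Illustrative stand-in for a named-dimension topology (hypothetical)."""

    world_size: int
    dims: Mapping[str, int]

    @classmethod
    def from_mapping(
        cls, dims: Mapping[str, int], *, world_size: Optional[int] = None
    ) -> "TopologySketch":
        sizes = dict(dims)
        if any(size < 1 for size in sizes.values()):
            raise ValueError("every mesh dimension must be >= 1")
        total = prod(sizes.values())
        # Assumption: an omitted world_size is inferred from the dimensions.
        if world_size is None:
            world_size = total
        # Assumption: the dimension sizes must multiply to exactly world_size.
        if total != world_size:
            raise ValueError(
                f"dims multiply to {total}, but world_size is {world_size}"
            )
        return cls(world_size=world_size, dims=sizes)

    @property
    def ordered_dim_names(self) -> Tuple[str, ...]:
        # Assumption: the mapping's insertion order defines the mesh order.
        return tuple(self.dims.keys())

    @property
    def mesh_shape(self) -> Tuple[int, ...]:
        return tuple(self.dims[name] for name in self.ordered_dim_names)


# Hypothetical layout: 2-way data parallel times 4-way tensor parallel.
topology = TopologySketch.from_mapping({"dp": 2, "tp": 4})
```

Under these assumptions, topology.mesh_shape is (2, 4) and topology.world_size is 8, while a mapping whose sizes do not multiply to an explicitly passed world_size is rejected.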
class spikingjelly.activation_based.distributed.TensorShardMemoryModule(source: MemoryModule, shard_dim: int, logical_dim_size: int | None = None, process_group=None)

Bases: MemoryModule

Base memory module supporting tensor-parallel sharding.

Parameters:
  • source (base.MemoryModule) -- Source MemoryModule

  • shard_dim (int) -- Dimension along which to shard

  • logical_dim_size (Optional[int]) -- Logical dimension size, used to validate that the sharding is correct

  • process_group (Any) -- Distributed process group

extra_repr() -> str
forward(x: Tensor)
reset()
property store_v_seq
property supported_backends
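
The role of logical_dim_size as a validation aid can be pictured with a small check. This is only a sketch under stated assumptions: the helper name check_shard_shape, the group_size argument, and the rule that the logical dimension is split evenly across the process group are hypothetical and are not taken from the library.

```python
from typing import Sequence


def check_shard_shape(
    local_shape: Sequence[int],
    shard_dim: int,
    logical_dim_size: int,
    group_size: int,
) -> int:
    """Return the expected local size along shard_dim, or raise ValueError.

    Hypothetical helper: assumes the logical dimension is divided evenly
    across group_size ranks.
    """
    if group_size < 1:
        raise ValueError("group_size must be >= 1")
    if logical_dim_size % group_size != 0:
        raise ValueError(
            f"logical size {logical_dim_size} is not divisible by {group_size}"
        )
    expected = logical_dim_size // group_size
    actual = local_shape[shard_dim]
    if actual != expected:
        raise ValueError(
            f"local size {actual} along dim {shard_dim} != expected {expected}"
        )
    return expected


# A [T, N, C] state with C = 256 sharded over 4 ranks gives 64 channels per rank.
local_channels = check_shard_shape((4, 8, 64), shard_dim=2,
                                   logical_dim_size=256, group_size=4)
```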
spikingjelly.activation_based.distributed.analyze(model: Module, *, model_family: str | None = None, roots: Sequence[str] | None = None) -> SNNDistributedAnalysis

spikingjelly.activation_based.distributed.apply(*, model: Module, plan: SNNDistributedPlan, device_type: str = 'cuda', device_mesh=None) -> SNNDistributedRuntime

spikingjelly.activation_based.distributed.apply_pipeline_stage_memopt(runtime: SNNPipelineRuntime, *, memopt_level: int, compress_x: bool = False, stage_budget_ratio: float = 0.5, use_plan_cache: bool = True) -> Tuple[SNNPipelineRuntime, float, bool]

spikingjelly.activation_based.distributed.build_snn_optimizer(module: Module, mode: str, lr: float, weight_decay: float = 0.0, optimizer_sharding: str = 'none', foreach: bool | None = None, optimizer_cls=torch.optim.Adam, **optimizer_kwargs)

spikingjelly.activation_based.distributed.build_device_mesh(device_type: str = 'cuda', mesh_shape: Tuple[int, ...] | None = None, mesh_dim_names: Tuple[str, ...] | None = None) -> DeviceMesh

spikingjelly.activation_based.distributed.enable_tp_communication_debug(enabled: bool = True) -> None

spikingjelly.activation_based.distributed.ensure_distributed_initialized(backend: str | None = None, init_method: str | None = None, rank: int | None = None, world_size: int | None = None) -> bool

spikingjelly.activation_based.distributed.get_tp_communication_debug_stats() -> Dict[str, int]

spikingjelly.activation_based.distributed.plan(*, analysis: SNNDistributedAnalysis, objective: str, topology: Mapping[str, int] | SNNDistributedTopology, backend: str, batch_size: int, model_family: str | None = None, mode: str | None = None, features: DistributedFeatureSet | None = None) -> SNNDistributedPlan

spikingjelly.activation_based.distributed.recommended_pipeline_microbatches(batch_size: int, num_stages: int) -> int

Recommend the number of microbatches for pipeline parallelism.
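
The page does not state which rule the function applies. As a hedged illustration of what such a recommendation can look like, the sketch below aims for a small multiple of the stage count (more microbatches than stages generally shrinks the pipeline bubble) and then backs off to a count that divides the batch evenly. The function name recommend_microbatches_sketch, the factor parameter, and the divisibility rule are assumptions, not the library's behavior.

```python
def recommend_microbatches_sketch(
    batch_size: int, num_stages: int, factor: int = 2
) -> int:
    """Hypothetical heuristic: largest divisor of batch_size that does not
    exceed factor * num_stages."""
    if batch_size < 1 or num_stages < 1 or factor < 1:
        raise ValueError("batch_size, num_stages and factor must be >= 1")
    # Never ask for more microbatches than there are samples.
    candidate = min(batch_size, factor * num_stages)
    # Walk down until the batch splits into equally sized microbatches;
    # this always terminates because 1 divides every batch size.
    while batch_size % candidate != 0:
        candidate -= 1
    return candidate


microbatches = recommend_microbatches_sketch(batch_size=32, num_stages=4)
```

With batch_size=32 and num_stages=4 this sketch returns 8, and with batch_size=30 it falls back to 6, the largest divisor of 30 that is at most 8.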

spikingjelly.activation_based.distributed.recommend_snn_distributed_strategy(model: str, world_size: int, prefer: str, batch_size: int, backend: str = 'inductor', zero_redundancy_optimizer_available: bool | None = None, pipelining_available: bool | None = None, fsdp2_available: bool | None = None, tensor_parallel_available: bool | None = None) -> SNNDistributedRecommendation

Recommend an SNN distributed training strategy.

spikingjelly.activation_based.distributed.recommend_pipeline_memopt_stages(stage_costs: Sequence[float], stage_budget_ratio: float = 0.5) -> Tuple[int, ...]

spikingjelly.activation_based.distributed.reset_tp_communication_debug_stats() -> None

spikingjelly.activation_based.distributed.resolve_data_parallel_partition(device_mesh: DeviceMesh | None, dp_mesh_dim: int | None, sharded_by_data_parallel: bool) -> Tuple[int, int]

spikingjelly.activation_based.distributed.resolve_tensor_parallel_group_size(device_mesh: DeviceMesh | None, tp_mesh_dim: int, tensor_parallel_enabled: bool) -> int

spikingjelly.activation_based.distributed.unwrap_parallel_module(module: Module) -> Module
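
Judging from the signatures of analyze, plan and apply listed on this page, the three functions appear to chain together: plan consumes an SNNDistributedAnalysis, and apply consumes an SNNDistributedPlan. The following is a minimal sketch of that chain, not a documented recipe. The wrapper name build_distributed_runtime is hypothetical, and the accepted values for objective, topology and backend are not listed here, so they are left as caller-supplied arguments. Whether the process group must be initialized first (for example with ensure_distributed_initialized) is also an assumption to verify.

```python
from typing import Any, Mapping


def build_distributed_runtime(
    model: Any,
    *,
    objective: str,
    topology: Mapping[str, int],
    backend: str,
    batch_size: int,
    device_type: str = "cuda",
) -> Any:
    """Hypothetical wrapper chaining analyze -> plan -> apply."""
    # Imported lazily so the sketch can be defined without SpikingJelly installed.
    from spikingjelly.activation_based import distributed as sj_dist

    # Step 1: inspect the model for stateful modules and TP candidates.
    analysis = sj_dist.analyze(model)
    # Step 2: turn the analysis plus user constraints into a plan.
    dist_plan = sj_dist.plan(
        analysis=analysis,
        objective=objective,
        topology=topology,
        backend=backend,
        batch_size=batch_size,
    )
    # Step 3: apply the plan; per the signature above this returns an
    # SNNDistributedRuntime.
    return sj_dist.apply(model=model, plan=dist_plan, device_type=device_type)
```

The returned SNNDistributedRuntime exposes build_optimizer, prepare_dataloader, forward_loss and reset_state, which suggests it is meant to drive the training loop, although this page does not document their return values.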