LSFEnvironment

class pytorch_lightning.plugins.environments.LSFEnvironment[source]

Bases: pytorch_lightning.plugins.environments.cluster_environment.ClusterEnvironment

An environment for running on clusters managed by the LSF resource manager.

This ClusterEnvironment expects the job to have been launched with the Job Step Manager, i.e. jsrun.

This plugin expects the following environment variables:

LSB_JOBID

The LSF-assigned job ID.

LSB_DJOB_RANKFILE

The OpenMPI-compatible rank file for the LSF job.

JSM_NAMESPACE_LOCAL_RANK

The node-local rank for the task. This environment variable is set by jsrun.

JSM_NAMESPACE_SIZE

The world size for the task. This environment variable is set by jsrun.

JSM_NAMESPACE_RANK

The global rank for the task. This environment variable is set by jsrun.
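
As an illustration of how the variables listed above fit together, the following minimal sketch reads all five into one record. The `LSFJobInfo` class and the `read_lsf_job_info` function are hypothetical names introduced for this example and are not part of PyTorch Lightning. The sketch assumes the three `JSM_NAMESPACE_*` values are decimal integers.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class LSFJobInfo:
    """Hypothetical container for the five variables (not a Lightning class)."""

    job_id: str
    rank_file: str
    local_rank: int
    world_size: int
    global_rank: int


def read_lsf_job_info(env: Mapping[str, str] = os.environ) -> LSFJobInfo:
    # A KeyError here means a variable is missing, for example because the
    # script was not launched through jsrun.
    return LSFJobInfo(
        job_id=env["LSB_JOBID"],
        rank_file=env["LSB_DJOB_RANKFILE"],
        local_rank=int(env["JSM_NAMESPACE_LOCAL_RANK"]),
        world_size=int(env["JSM_NAMESPACE_SIZE"]),
        global_rank=int(env["JSM_NAMESPACE_RANK"]),
    )
```

Passing a plain dictionary instead of `os.environ` makes the helper easy to try outside an LSF allocation.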

global_rank()[source]

The global rank is read from the environment variable JSM_NAMESPACE_RANK.

Return type

int

static is_using_lsf()[source]

Returns True if the current process was launched using the jsrun command.

Return type

bool
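
One way to picture such a check is sketched below. It assumes that the presence of all five expected environment variables is a sufficient signal that the process was started by jsrun. The library's actual test may differ, and `looks_like_lsf` is a hypothetical name used only for illustration.

```python
import os
from typing import Mapping

# The variables this plugin is documented to expect.
REQUIRED_VARS = (
    "LSB_JOBID",
    "LSB_DJOB_RANKFILE",
    "JSM_NAMESPACE_LOCAL_RANK",
    "JSM_NAMESPACE_SIZE",
    "JSM_NAMESPACE_RANK",
)


def looks_like_lsf(env: Mapping[str, str] = os.environ) -> bool:
    """Return True only if every expected variable is present (assumed heuristic)."""
    return all(name in env for name in REQUIRED_VARS)
```

Requiring every variable, rather than any single one, reduces false positives on machines where only the `LSB_*` variables are set, such as a batch script that does not use jsrun.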

local_rank()[source]

The local rank is read from the environment variable JSM_NAMESPACE_LOCAL_RANK.

Return type

int

master_address()[source]

The main address is read from an OpenMPI host rank file in the environment variable LSB_DJOB_RANKFILE.

Return type

str
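
The lookup can be sketched as follows, under two assumptions that this page does not confirm: the rank file lists one hostname per line, and the main address is the first host in that list. The `skip_first` flag is a hypothetical option for clusters where the first entry is a launch node rather than a compute node. Both function names are illustrative.

```python
import os
from typing import List, Mapping


def read_rank_file_hosts(path: str) -> List[str]:
    """Read hostnames from a rank file, one per line (assumed format)."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]


def main_address_from_rank_file(
    env: Mapping[str, str] = os.environ, skip_first: bool = False
) -> str:
    hosts = read_rank_file_hosts(env["LSB_DJOB_RANKFILE"])
    if skip_first:
        # Assumption: the first entry may be a launch node, not a compute node.
        hosts = hosts[1:]
    if not hosts:
        raise ValueError("No usable hosts found in LSB_DJOB_RANKFILE")
    return hosts[0]
```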

master_port()[source]

The main port is calculated from the LSF job ID.

Return type

int
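
This page does not state the exact formula, so the sketch below shows only one plausible mapping: fold the job ID into a fixed window of unprivileged ports. The `base` and `span` defaults and the name `port_from_job_id` are assumptions made for illustration, not the library's values.

```python
def port_from_job_id(job_id: str, base: int = 10000, span: int = 1000) -> int:
    """Map a numeric LSF job ID to a port in [base, base + span) (assumed scheme)."""
    return base + int(job_id) % span
```

Because every task in a job sees the same LSB_JOBID, each one derives the same port without any communication. With the assumed defaults, job ID 123456 maps to port 10456.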

node_rank()[source]

The node rank is determined by the position of the current hostname in the OpenMPI host rank file stored in LSB_DJOB_RANKFILE.

Return type

int
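
A minimal sketch of the position lookup follows, assuming the rank file repeats a hostname once per task slot, so duplicates must be collapsed before indexing. `node_rank_from_hosts` is a hypothetical helper that takes the host list directly, so it can be tried without a real rank file.

```python
import socket
from typing import Iterable, Optional


def node_rank_from_hosts(hosts: Iterable[str], hostname: Optional[str] = None) -> int:
    """Return the index of `hostname` among the distinct hosts, in file order."""
    if hostname is None:
        hostname = socket.gethostname()
    # dict.fromkeys drops duplicates while preserving first-seen order.
    unique_hosts = list(dict.fromkeys(hosts))
    try:
        return unique_hosts.index(hostname)
    except ValueError:
        raise ValueError(f"Host {hostname!r} not found in rank file") from None
```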

world_size()[source]

The world size is read from the environment variable JSM_NAMESPACE_SIZE.

Return type

int

property creates_processes_externally: bool

LSF creates subprocesses, i.e., PyTorch Lightning does not need to spawn them.

Return type

bool