LightningEnvironment

class pytorch_lightning.plugins.environments.LightningEnvironment

Bases: pytorch_lightning.plugins.environments.cluster_environment.ClusterEnvironment

The default environment used by Lightning for a single node or an unmanaged (free) cluster.

The Lightning environment can operate in two modes:

  1. The user launches only the main process with python train.py ... and sets no additional environment variables. Lightning then spawns new worker processes for distributed training on the current node.

  2. The user launches all processes manually or with utilities like torch.distributed.launch. The appropriate environment variables must be set, at minimum LOCAL_RANK.

If the master address and port are not provided, the default environment will choose them automatically. It is recommended to use this default environment for single-node distributed training as it provides a convenient way to launch the training script.
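The automatic choice of master address and port can be sketched with the standard library alone. This is a minimal illustration, not Lightning's implementation: the helper names `find_free_port` and `resolve_master` are hypothetical, and the sketch assumes the conventional `MASTER_ADDR` and `MASTER_PORT` environment variables with a loopback fallback.

```python
import socket


def find_free_port() -> int:
    """Ask the OS for a currently unused TCP port (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Binding to port 0 lets the OS pick any free port.
        sock.bind(("127.0.0.1", 0))
        return sock.getsockname()[1]


def resolve_master(env: dict) -> tuple:
    """Return (address, port), falling back to defaults when unset.

    Assumption: MASTER_ADDR / MASTER_PORT are the variable names in use,
    and 127.0.0.1 is an acceptable default for single-node training.
    """
    address = env.get("MASTER_ADDR", "127.0.0.1")
    port = int(env["MASTER_PORT"]) if "MASTER_PORT" in env else find_free_port()
    return address, port
```

In a real run, `env` would be `os.environ`. The port is only known to be free at the moment of the check, so another process could still claim it before training starts.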

creates_children()

Returns whether the cluster creates the processes or not. If at least LOCAL_RANK is available as an environment variable, Lightning assumes the user acts as the process launcher/job scheduler and Lightning will not launch new processes.

Return type

bool
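The check described for creates_children() amounts to testing whether LOCAL_RANK is present. A minimal sketch, assuming a plain mapping stands in for `os.environ`; the function name `user_launched_processes` is hypothetical and not part of Lightning:

```python
def user_launched_processes(env: dict) -> bool:
    """Return True when LOCAL_RANK is set.

    A set LOCAL_RANK signals that an external launcher (or the user)
    already created the processes, so no new ones should be spawned.
    """
    return "LOCAL_RANK" in env
```

Passing `{"LOCAL_RANK": "0"}` corresponds to mode 2 (processes launched externally), while an empty mapping corresponds to mode 1 (Lightning spawns the workers).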

global_rank()

The rank (index) of the currently running process across all nodes and devices.

Return type

int

local_rank()

The rank (index) of the currently running process inside of the current node.

Return type

int

master_address()

The master address through which all processes connect and communicate.

Return type

str

master_port()

An open and configured port on the master node through which all processes communicate.

Return type

int

node_rank()

The rank (index) of the node on which the current process runs.

Return type

int

teardown()

Clean up any state set after execution finishes.

Return type

None

world_size()

The number of processes across all devices and nodes.

Return type

int
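How the node rank, local rank, global rank, and world size relate can be illustrated with simple arithmetic. The sketch assumes every node runs the same number of processes (`procs_per_node`, a hypothetical parameter) and that ranks are assigned node by node; the documentation above states neither assumption, and the function names are illustrative only.

```python
def compute_global_rank(node_rank: int, local_rank: int, procs_per_node: int) -> int:
    """Index of a process across all nodes, assuming node-by-node numbering."""
    return node_rank * procs_per_node + local_rank


def compute_world_size(num_nodes: int, procs_per_node: int) -> int:
    """Total number of processes, assuming identical nodes."""
    return num_nodes * procs_per_node


# Example: 2 nodes with 4 processes each.
ranks = [
    compute_global_rank(node, local, procs_per_node=4)
    for node in range(2)
    for local in range(4)
]
```

Under these assumptions the global ranks cover exactly the range from 0 to the world size minus one, with no gaps or duplicates.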