Train on the cloud (intermediate)

Audience: Anyone looking to train a model on the cloud in the background


What is background training?

Background training runs your training job on a cloud machine without requiring you to stay connected to or interact with that machine. As the model trains, you can monitor its progress via TensorBoard or an experiment manager of your choice.


0: Install lightning-grid

First, navigate to https://platform.grid.ai to create a free account.

Next, install lightning-grid and log in:

pip install lightning-grid
grid login

# Login successful. Welcome to Grid.

1: Create a dataset

Create a datastore, which optimizes your datasets for training at scale on the cloud. Datastores can be created from many sources, such as .zip and .tar links, local files or folders, and S3 buckets.

Let’s create a datastore from this .zip file:

grid datastore create https://pl-flash-data.s3.amazonaws.com/tinycifar5.zip --name cifar5

Now your dataset is ready to be used for training on the cloud!

Note

In some research workflows, the model script also downloads the dataset. This is fine if the dataset is only a few GB; otherwise, we recommend creating a datastore.
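For the small-dataset case described in this note, a script can skip the download whenever the data is already present, for example when a datastore is mounted at the given path. The following is a minimal sketch under stated assumptions: `ensure_dataset` is a hypothetical helper (not part of lightning-grid), the archive is assumed to be a .zip file, and the temporary file name `dataset.zip` is arbitrary.

```python
import urllib.request
import zipfile
from pathlib import Path


def ensure_dataset(data_dir, url, fetch=urllib.request.urlretrieve):
    """Return data_dir as a Path, downloading and unpacking `url` only if needed.

    Hypothetical helper for illustration; it is not part of lightning-grid.
    `fetch` is injectable so the download step can be replaced in tests.
    """
    target = Path(data_dir)

    # A non-empty directory (e.g. a mounted datastore) is used as-is.
    if target.is_dir() and any(target.iterdir()):
        return target

    # Otherwise download the archive into the target directory and unpack it.
    target.mkdir(parents=True, exist_ok=True)
    archive = target / "dataset.zip"  # arbitrary temporary file name
    fetch(url, str(archive))
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
    archive.unlink()  # remove the archive once its contents are extracted
    return target
```

A script could call, for instance, `ensure_dataset("./data", "https://pl-flash-data.s3.amazonaws.com/tinycifar5.zip")` before building its data loaders. For larger datasets, the datastore approach above avoids repeating the download on every run.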


2: Choose the model to run

You can run any Python script in the background. For this example, we’ll use a simple classifier.

Clone the code to your machine:

git clone https://github.com/williamFalcon/cifar5-simple.git
cd cifar5-simple

Note

Code repositories can be as complicated as needed. This is just a simple demo.


3: Run on the cloud

To run this model on the cloud with the attached datastore, use the grid run command:

grid run --datastore_name cifar5 cifar5.py --data_dir /datastores/cifar5

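The grid run command has two parts: the [run args], which configure the run itself (here, --datastore_name cifar5), and the [file args], which are passed to your script (here, --data_dir /datastores/cifar5):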

grid run [run args] file.py [file args]
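On the script side, the [file args] arrive as ordinary command-line arguments. The contents of cifar5.py are not shown on this page, so the following is only a hypothetical sketch of how a training script might accept the --data_dir flag used above, using Python's standard argparse module. The function names and the default value ./data are assumptions for illustration.

```python
import argparse


def parse_args(argv=None):
    """Parse the [file args] that follow the script name in `grid run`.

    Hypothetical sketch; the real cifar5.py may define its arguments differently.
    """
    parser = argparse.ArgumentParser(description="Hypothetical training script")
    parser.add_argument(
        "--data_dir",
        type=str,
        default="./data",  # assumed local fallback when no datastore is attached
        help="Directory containing the training data",
    )
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    # A real script would build its dataset and model from args.data_dir here.
    print(f"Reading training data from {args.data_dir}")
    return args
```

Calling `main()` under an `if __name__ == "__main__":` guard lets the same script run locally with the default path and on the cloud with `--data_dir /datastores/cifar5`.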

4: Monitor and manage

Now that your model is running in the background, you can monitor and manage it from the Grid web app (https://platform.grid.ai).

You can also monitor its progress on the command line:

grid status

Cost

Lightning (via lightning-grid) provides the community with access to cloud machines for free. However, you must buy credits on lightning-grid, which are used to pay the cloud providers on your behalf.

If you want to run on your own AWS account and pay the cloud provider directly, please contact our on-prem team: onprem@pytorchlightning.ai


© Copyright 2018-2022, Lightning AI et al. Revision dbb5ca8d.