View the runnable example on GitHub
Convert PyTorch Training Loop to Use TorchNano
📚 Related Reading
If you have already defined a PyTorch training loop function that takes a model, optimizers, and dataloaders as parameters, you can refer to this guide to use the @nano decorator, which is a simpler way to gain acceleration from BigDL-Nano.
The TorchNano API integrates multiple optimizations to accelerate custom PyTorch training loops. As a pure PyTorch user, you only need to make a few changes to your existing code to use TorchNano.
📝 Note
Before starting your PyTorch application, it is highly recommended to run source bigdl-nano-init to set several environment variables based on your current hardware. Empirically, these variables bring a significant performance increase for most PyTorch training workloads.
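As a quick sanity check after sourcing the script, you can inspect the environment from Python. The sketch below is illustrative only: the variable names in ASSUMED_VARS are assumptions (variables commonly tuned for CPU training), not a confirmed list of what bigdl-nano-init sets, and report_env is a hypothetical helper.

```python
import os

# Assumed variable names, chosen for illustration only; the exact set
# written by bigdl-nano-init is not listed in this guide.
ASSUMED_VARS = ("OMP_NUM_THREADS", "KMP_AFFINITY", "LD_PRELOAD")


def report_env(names=ASSUMED_VARS, environ=None):
    """Return a {name: value-or-None} mapping for the given variables."""
    environ = os.environ if environ is None else environ
    return {name: environ.get(name) for name in names}


if __name__ == "__main__":
    for name, value in report_env().items():
        print(f"{name} = {value if value is not None else '<not set>'}")
```

Run it from the same shell session in which you sourced the script, since sourced variables are only visible to processes started from that shell.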
PyTorch Training Loops Example
Suppose you would like to fine-tune a ResNet-18 model (pretrained on the ImageNet dataset) on the OxfordIIITPet dataset. You may create the datasets and the model, and define your training loop, as follows:
[ ]:
import torch
from tqdm import tqdm

def train_loops():
    model = MyPytorchModule()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=5e-4)
    loss_func = torch.nn.CrossEntropyLoss()
    train_loader = create_train_dataloader()
    num_epochs = 5
    for epoch in range(num_epochs):
        model.train()
        train_loss, num = 0, 0
        with tqdm(train_loader, unit="batch") as tepoch:
            for data, target in tepoch:
                tepoch.set_description(f"Epoch {epoch}")
                optimizer.zero_grad()
                output = model(data)
                loss = loss_func(output, target)
                loss.backward()
                optimizer.step()
                # Convert to a Python float so no tensors are accumulated
                loss_value = loss.item()
                train_loss += loss_value
                num += 1
                tepoch.set_postfix(loss=loss_value)
        print(f'Train Epoch: {epoch}, avg_loss: {train_loss / num}')
The definition of MyPytorchModule and create_train_dataloader can be found in the runnable example.
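If you want to try the loop without downloading the dataset, the following is a minimal stand-in sketch. These are not the definitions from the runnable example: it assumes a tiny linear classifier in place of the pretrained ResNet-18 and random tensors in place of OxfordIIITPet images. The 37 output classes match the number of categories in OxfordIIITPet, while the image size, sample count, and batch size are arbitrary placeholders.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

NUM_CLASSES = 37  # OxfordIIITPet has 37 pet categories


class MyPytorchModule(nn.Module):
    """Hypothetical stand-in: a tiny classifier, not a pretrained ResNet-18."""

    def __init__(self, image_size=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(3 * image_size * image_size, NUM_CLASSES),
        )

    def forward(self, x):
        return self.net(x)


def create_train_dataloader(num_samples=64, image_size=32, batch_size=16):
    """Hypothetical stand-in: random images and labels, not OxfordIIITPet."""
    images = torch.randn(num_samples, 3, image_size, image_size)
    labels = torch.randint(0, NUM_CLASSES, (num_samples,))
    dataset = TensorDataset(images, labels)
    return DataLoader(dataset, batch_size=batch_size, shuffle=True)
```

With these stand-ins the loss values are meaningless, but the training loop executes end to end, which is enough to check the conversion steps.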
Convert to TorchNano
There are 5 simple steps to convert your PyTorch code to use TorchNano:
1. Import TorchNano
2. Subclass TorchNano and override its train method
3. Move the code for your custom training loops inside the TorchNano’s train method
4. Call TorchNano’s setup method to set up model, optimizer(s), and dataloader(s) for accelerated training
5. Replace loss.backward() with self.backward(loss)
[ ]:
# Step 1. import TorchNano
import torch
from tqdm import tqdm
from bigdl.nano.pytorch import TorchNano

# Step 2. subclass TorchNano and override its train method
class MyNano(TorchNano):
    def train(self):
        # Step 3. Move the code for your custom training loops inside the train method
        model = MyPytorchModule()
        optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=5e-4)
        loss_func = torch.nn.CrossEntropyLoss()
        train_loader = create_train_dataloader()

        # Step 4. call setup method to set up model, optimizer(s),
        # and dataloader(s) for accelerated training
        model, optimizer, train_loader = self.setup(model, optimizer, train_loader)
        num_epochs = 5
        for epoch in range(num_epochs):
            model.train()
            train_loss, num = 0, 0
            with tqdm(train_loader, unit="batch") as tepoch:
                for data, target in tepoch:
                    tepoch.set_description(f"Epoch {epoch}")
                    optimizer.zero_grad()
                    output = model(data)
                    loss = loss_func(output, target)
                    # Step 5. Replace loss.backward() with self.backward(loss)
                    self.backward(loss)
                    optimizer.step()
                    # Convert to a Python float so no tensors are accumulated
                    loss_value = loss.item()
                    train_loss += loss_value
                    num += 1
                    tepoch.set_postfix(loss=loss_value)
            print(f'Train Epoch: {epoch}, avg_loss: {train_loss / num}')
📝 Note
To make sure that the converted TorchNano still has a functional training loop, there are some requirements:
There should be one and only one instance of torch.nn.Module as model in the training loop.
There should be at least one instance of torch.optim.Optimizer as optimizer in the training loop.
There should be at least one instance of torch.utils.data.DataLoader as dataloader in the training loop.
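The requirements above can be expressed as a small pre-flight check on the objects you intend to pass to setup. This is a sketch of a hypothetical helper, check_setup_args, written for illustration; it is not part of BigDL-Nano and does not describe how TorchNano validates its inputs internally.

```python
import torch
from torch.utils.data import DataLoader


def check_setup_args(*objects):
    """Hypothetical helper: validate the objects intended for setup().

    Returns the counts of (models, optimizers, dataloaders) found.
    """
    # Note: loss functions such as torch.nn.CrossEntropyLoss are also
    # nn.Module instances, so pass only the objects meant for setup().
    models = [o for o in objects if isinstance(o, torch.nn.Module)]
    optimizers = [o for o in objects if isinstance(o, torch.optim.Optimizer)]
    dataloaders = [o for o in objects if isinstance(o, DataLoader)]

    if len(models) != 1:
        raise ValueError(f"expected exactly one model, found {len(models)}")
    if not optimizers:
        raise ValueError("expected at least one optimizer")
    if not dataloaders:
        raise ValueError("expected at least one dataloader")
    return len(models), len(optimizers), len(dataloaders)
```

For example, calling check_setup_args(model, optimizer, train_loader) before self.setup(model, optimizer, train_loader) would raise a ValueError early if one of the three pieces were missing.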
You could then do the training by instantiating MyNano and calling its train method:
[ ]:
MyNano().train()
📝 Note
Due to the optimized environment variables set by source bigdl-nano-init, you could already experience some training acceleration after converting your PyTorch code to use TorchNano. For more optimizations provided by TorchNano, you can refer to the Related Readings.
📚 Related Readings
How to accelerate a PyTorch application on training workloads through Intel® Extension for PyTorch*
How to accelerate a PyTorch application on training workloads through multiple instances
How to use the channels last memory format in your PyTorch application for training
How to conduct BFloat16 Mixed Precision training in your PyTorch application