Running speaker embedding training on multiple GPUs on a single node #13

Hello,
Thanks for sharing the PyTorch code for embedding training.
If we look at pytorch_xvectors/pytorch_run.sh:
  CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch --nproc_per_node=1 \
  train_xent.py exp/xvector_nnet_1a/egs/
**From the line above, it seems the DNN is trained on a single GPU. Is it possible to train on multiple GPUs?**
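
To make the question concrete, here is a minimal sketch of the kind of single-node launch I have in mind, with one process per visible GPU. `build_launch` is a hypothetical helper of mine, not something from the repo, and I have not verified that train_xent.py behaves correctly when started with more than one process:

```python
import os
import sys


def build_launch(gpu_ids, script, egs_dir):
    """Return (env, argv) for a single-node launch, one process per GPU.

    Hypothetical helper for illustration; not part of pytorch_xvectors.
    """
    if not gpu_ids:
        raise ValueError("need at least one GPU id")
    env = dict(os.environ)
    # Expose only the chosen GPUs to the launched processes.
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(g) for g in gpu_ids)
    argv = [
        sys.executable, "-m", "torch.distributed.launch",
        f"--nproc_per_node={len(gpu_ids)}",  # one worker process per GPU
        script, egs_dir,
    ]
    return env, argv


env, argv = build_launch([0, 1, 2, 3], "train_xent.py", "exp/xvector_nnet_1a/egs/")
print("CUDA_VISIBLE_DEVICES=" + env["CUDA_VISIBLE_DEVICES"], " ".join(argv[1:]))
```

The pair could then be passed to `subprocess.run(argv, env=env, check=True)`. This only changes how many processes are started; whether each process picks up its own GPU depends on how the training script uses its local rank.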

Further, if we look at the train_utils.py script:
def prepareModel(args):
    elif args.trainingMode == 'init':
        net.to(device)
        net = torch.nn.parallel.DistributedDataParallel(net,
                                                     device_ids=[0],
                                                     output_device=0)
        if torch.cuda.device_count() > 1:
            print("Using ", torch.cuda.device_count(), "GPUs!")
            net = nn.DataParallel(net)

**Why are both torch.nn.parallel.DistributedDataParallel and nn.DataParallel used here?
When I tried to train, only a single GPU was used. How does the code need to be modified to train on multiple GPUs?**
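
For comparison, the pattern I would have expected is one process per GPU, with each process wrapping the model in DistributedDataParallel using its own local rank instead of a fixed `device_ids=[0]`, and without the extra nn.DataParallel wrap. Below is a minimal sketch under that assumption. `wrap_for_ddp` is a hypothetical helper, not code from the repo, and the demo at the bottom is only a single-process smoke test using the gloo backend and a file-based rendezvous so that it runs without multiple GPUs:

```python
import os
import tempfile

import torch
import torch.distributed as dist
from torch import nn
from torch.nn.parallel import DistributedDataParallel as DDP


def wrap_for_ddp(net, local_rank):
    """Wrap net for one-process-per-GPU training (hypothetical helper).

    Assumes the default process group has already been initialized.
    """
    if torch.cuda.is_available():
        # Bind this process to its own GPU instead of always using GPU 0.
        torch.cuda.set_device(local_rank)
        net = net.to(torch.device("cuda", local_rank))
        return DDP(net, device_ids=[local_rank], output_device=local_rank)
    # No GPU available: keep the module on CPU (device_ids must stay unset).
    return DDP(net)


# Single-process smoke test: world_size=1, rank=0, file-based rendezvous.
store_path = os.path.join(tempfile.mkdtemp(), "ddp_store")
dist.init_process_group(
    "gloo", init_method="file://" + store_path, rank=0, world_size=1
)
wrapped = wrap_for_ddp(nn.Linear(4, 3), local_rank=0)
device = next(wrapped.parameters()).device
out = wrapped(torch.randn(2, 4, device=device))
print(type(wrapped).__name__, tuple(out.shape))
dist.destroy_process_group()
```

In a real multi-GPU run the local rank would come from the launcher (the `--local_rank` argument passed by torch.distributed.launch, or the `LOCAL_RANK` environment variable set by torchrun), and the backend would normally be nccl rather than gloo. I am not sure how this maps onto the existing prepareModel logic, which is why I am asking.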

I look forward to hearing from you.


Thanks.

K. Ahilan

        
   
