Note on PyTorch

Notes taken while learning PyTorch.

Setting up PyTorch

  1. Install CUDA

    • Confirm that the GPU supports CUDA

    • Install the CUDA Toolkit

    • Verify the installed version

      nvcc -V
  2. Install PyTorch

    • The official install command may fail to download:

      conda install pytorch torchvision torchaudio cudatoolkit=11.0 -c pytorch
    • To avoid slow PyTorch downloads, fetch the matching package directly from Anaconda Cloud and install it locally:

      conda install --use-local pytorch-1.7.1-py3.8_cuda110_cudnn8_0.tar.bz2
    • Verify the installed version

      import torch
      print(torch.__version__)

Using the GPU

The following function prints GPU information and returns the device and device count; see How to check if pytorch is using the GPU?.

import torch

def get_device():
    print('torch.cuda.is_available():', torch.cuda.is_available())
    device_count = torch.cuda.device_count()
    print('torch.cuda.device_count():', device_count)
    device_idxes = list(range(device_count))
    print('device_idxes:', device_idxes)
    devices = [torch.cuda.device(_) for _ in device_idxes]
    print('devices:', devices)
    device_names = [torch.cuda.get_device_name(_) for _ in device_idxes]
    print('device_names:', device_names, '\n')

    if device_count > 0:  # these calls raise on CPU-only machines
        current_device = torch.cuda.current_device()
        print('torch.cuda.current_device():', current_device)
        print('torch.cuda.device(current_device):', devices[current_device])
        print('torch.cuda.get_device_name(current_device):', device_names[current_device], '\n')

    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    print('Using device:', device, '\n')

    if device.type == 'cuda':
        print('Memory Usage:')
        print('Allocated:', round(torch.cuda.memory_allocated(0) / 1024 ** 3, 1), 'GB')
        print('Cached:   ', round(torch.cuda.memory_reserved(0) / 1024 ** 3, 1), 'GB')

    return device, device_count

Call .to(device) on the model and on the inputs; see Porting PyTorch code from CPU to GPU.
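
A minimal sketch of the pattern (the nn.Linear model is only a placeholder):

import torch
import torch.nn as nn

device, device_count = get_device()  # helper defined above
model = nn.Linear(10, 2).to(device)  # move the parameters to `device`
x = torch.randn(4, 10).to(device)    # inputs must live on the same device
y = model(x)                         # computes on the GPU when available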

To use multiple GPUs, see How to use multiple GPUs in pytorch?.
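
One common approach, as a sketch (torch.nn.parallel.DistributedDataParallel is the recommended alternative for serious multi-GPU training):

import torch
import torch.nn as nn

model = nn.Linear(10, 2)
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)  # replicates the model and splits each batch across GPUs
model.to('cuda')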

Use watch -n 2 nvidia-smi to monitor the utilization of all GPUs; see How to check if pytorch is using the GPU?

Installing Multiple CUDA Versions

If installation fails with the error cuda you already have a newer version of the nvidia frameview sdk installed, uninstall the following software in order, after which the installation can continue:

  1. PhysX
  2. NVIDIA GeForce Experience
  3. NVIDIA FrameView SDK

See Windows 10 CUDA installation failure solved and CUDA installation problem.
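
With several CUDA toolkits installed side by side, it helps to confirm which one the PyTorch binary was built against (a quick check; torch.version.cuda need not match what nvcc -V reports, and is None on CPU-only builds):

import torch

print(torch.version.cuda)              # CUDA version of this PyTorch build
print(torch.backends.cudnn.version())  # bundled cuDNN version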

In-Place Operation

>>> x = torch.rand(1)
>>> y = torch.rand(1)
>>> x
tensor([0.2738])
>>> id(x)
140736259305336
>>> x = x + y # Normal operation
>>> id(x)
140726604827672 # New location
>>> x += y
>>> id(x)
140726604827672 # Existing location used (in-place)

Deepali Patel, What is in-place operation?
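
PyTorch also names explicit in-place variants of most operations with a trailing underscore; a small sketch of the same id() check, assuming any recent PyTorch:

import torch

x = torch.rand(1)
y = torch.rand(1)
before = id(x)
x.add_(y)               # in-place addition: writes into x's existing storage
assert id(x) == before  # same object, same memory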

Which is faster?

.expand().clone() or .repeat()?

Keep in mind though that if you plan on changing this expanded tensor inplace, you will need to use .clone() on it before so that it actually is a full tensor (with memory for each element). But even .expand().clone() should be faster than .repeat() I think.

albanD, Torch.repeat and torch.expand which to use?
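
To make the difference concrete, a sketch (expand() returns a stride-0 view, repeat() allocates a full copy; the sizes here are arbitrary):

import torch

a = torch.arange(3).unsqueeze(0)  # shape (1, 3)
b = a.expand(1000, -1)            # view: no copy, stride 0 along dim 0
c = a.repeat(1000, 1)             # real copy holding 1000 * 3 elements
d = b.clone()                     # materializes the expanded view
print(b.stride(), c.stride())     # (0, 1) vs. (3, 1)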

.unsqueeze(dim=1).expand(-1, 2).clone().view(-1) or .repeat_interleave(2)?

import torch

a = torch.arange(3)  # tensor([0, 1, 2])
a.unsqueeze(dim=1).expand(-1, 2).clone().view(-1)  # tensor([0, 0, 1, 1, 2, 2])
a.repeat_interleave(2)  # tensor([0, 0, 1, 1, 2, 2])
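
A rough way to compare the two on a given machine, a sketch using torch.utils.benchmark (timings vary with tensor size and device, so measure your own case):

import torch
from torch.utils.benchmark import Timer

a = torch.arange(100000)
t1 = Timer(stmt='a.unsqueeze(dim=1).expand(-1, 2).clone().view(-1)', globals={'a': a})
t2 = Timer(stmt='a.repeat_interleave(2)', globals={'a': a})
print(t1.timeit(100))  # expand + clone path
print(t2.timeit(100))  # repeat_interleave path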