Machine Learning 3 HW (Hung-yi Lee)

Homework 1

These are brief notes on the PyTorch tutorial given by the TAs of Hung-yi Lee's course.

Dataset & Dataloader

  • A Dataset is the structure that stores the data.

  • A DataLoader groups the data into batches, serving it one mini-batch at a time, and can load the batches in parallel.

    dataset = MyDataset(file)
    dataloader = DataLoader(dataset, batch_size, shuffle=True)

    Set shuffle=True for training and shuffle=False for testing.

  • How to build a Dataset and a DataLoader

    from torch.utils.data import Dataset, DataLoader

    class MyDataset(Dataset):
        # read data and preprocess it
        def __init__(self, file):
            self.data = ...

        # return one sample at a time
        def __getitem__(self, index):
            return self.data[index]

        # return the size of the dataset
        def __len__(self):
            return len(self.data)
    • The DataLoader repeatedly calls the Dataset's __getitem__(i) and groups the returned samples into a mini-batch of size batch_size (see the sketch below).

      image-20250728230559384
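
    • A minimal runnable sketch of this, using a made-up ToyDataset holding ten scalar samples, just for illustration:

      import torch
      from torch.utils.data import Dataset, DataLoader

      class ToyDataset(Dataset):
          def __init__(self):
              self.data = torch.arange(10, dtype=torch.float32)

          def __getitem__(self, index):
              return self.data[index]

          def __len__(self):
              return len(self.data)

      dataset = ToyDataset()
      dataloader = DataLoader(dataset, batch_size=4, shuffle=True)
      for batch in dataloader:
          print(batch.shape)   # torch.Size([4]) for the first two batches, torch.Size([2]) for the last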

Tensors

  • In short, a tensor is a high-dimensional array.

    image-20250728231044647

  • Check a tensor's shape with the attribute tensor.shape (it is an attribute, not a method call).

    image-20250728231239106

  • Creating tensors

    • They can be created directly from a list or a numpy array

      x = torch.tensor([[1., -1.], [1., -1.]])
      x = torch.from_numpy(np.array([[1., -1.], [1., -1.]]))
      '''
      tensor([[ 1., -1.],
              [ 1., -1.]])
      '''
    • Creating tensors of all zeros or all ones

      x = torch.zeros([2, 2])   # shape
      '''
      tensor([[0., 0.],
              [0., 0.]])
      '''
      x = torch.ones([1, 2, 5])
  • Common operations (a combined sketch follows the transpose example below)

    • Addition

      • z = x + y
    • Subtraction

      • z = x - y
    • Power

      • y = x.pow(2)
    • Summation

      • y = x.sum()
    • Mean

      • y = x.mean()
    • Transpose

      >>> x = torch.zeros([2, 3])
      >>> x.shape
      torch.Size([2, 3])
      >>> x
      tensor([[0., 0., 0.],
              [0., 0., 0.]])
      >>> x = x.transpose(0, 1)   # swap dim 0 and dim 1
      >>> x.shape
      torch.Size([3, 2])
      >>> x
      tensor([[0., 0.],
              [0., 0.],
              [0., 0.]])
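
    • A small combined sketch of the operations above, using made-up tensors:

      import torch

      x = torch.ones([2, 3])
      y = torch.full([2, 3], 2.)   # a 2x3 tensor filled with the value 2

      z = x + y       # element-wise addition
      z = x - y       # element-wise subtraction
      p = x.pow(2)    # element-wise square
      s = x.sum()     # sum over all elements -> tensor(6.)
      m = x.mean()    # mean over all elements -> tensor(1.)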
  • Adding or removing a dimension (unsqueeze / squeeze) does not lose any information; it only changes the shape

    >>> x = torch.zeros([1, 2, 3])
    >>> x.shape
    torch.Size([1, 2, 3])
    >>> x = x.squeeze(0)   # remove dim 0 (its length is 1)
    >>> x.shape
    torch.Size([2, 3])
    >>> x.shape
    torch.Size([2, 3])
    >>> x = x.unsqueeze(1)   # insert a new dimension at position dim = 1
    >>> x.shape
    torch.Size([2, 1, 3])
  • Concatenating multiple tensors along a dimension with torch.cat (see the sketch below)

    image-20250728233610507
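
    • A short sketch of torch.cat, assuming three tensors that agree on every dimension except dim 1:

      import torch

      x = torch.zeros([2, 1, 3])
      y = torch.zeros([2, 3, 3])
      z = torch.zeros([2, 2, 3])

      w = torch.cat([x, y, z], dim=1)   # concatenate along dim 1
      print(w.shape)                    # torch.Size([2, 6, 3])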

  • Data types (a conversion sketch follows the table)

    | Data type              | dtype       | tensor            |
    | ---------------------- | ----------- | ----------------- |
    | 32-bit floating point  | torch.float | torch.FloatTensor |
    | 64-bit signed integer  | torch.long  | torch.LongTensor  |
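
    • A quick sketch of specifying and converting a tensor's dtype:

      import torch

      x = torch.tensor([1, 2, 3], dtype=torch.float)   # 32-bit floating point
      y = torch.tensor([1, 2, 3], dtype=torch.long)    # 64-bit signed integer
      z = x.long()                                     # convert float -> long
      print(x.dtype, y.dtype, z.dtype)                 # torch.float32 torch.int64 torch.int64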

  • Selecting a device (a common idiom is sketched below)

    torch.cuda.is_available()   # check whether a GPU is available
    x = x.to('cpu')
    x = x.to('cuda')
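
    • A common idiom (a sketch, not from the original notes) is to pick the device once and reuse it:

      import torch

      device = 'cuda' if torch.cuda.is_available() else 'cpu'
      x = torch.zeros([2, 3]).to(device)   # lands on the GPU if one is available, otherwise the CPU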
  • Gradient calculation

    • The slide below shows an example; a code sketch follows it.

      image-20250728234225896
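
    • A minimal sketch of the idea: mark a tensor with requires_grad=True, build a scalar from it, call backward(), and read the gradient from .grad.

      import torch

      x = torch.tensor([[1., 0.], [-1., 1.]], requires_grad=True)
      z = x.pow(2).sum()   # z = sum of x_ij^2, a scalar
      z.backward()         # backpropagation: compute dz/dx
      print(x.grad)        # tensor([[ 2.,  0.], [-2.,  2.]]), i.e. 2 * x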

Building a model

  • network layers

    • For example, a fully-connected layer, nn.Linear(in_features, out_features), only needs the size of the last input dimension and the desired output dimension

      image-20250728234932223

    • There are also activation layers such as nn.Sigmoid() and nn.ReLU()

  • Building your own neural network

    • A very simple example (a code sketch is given after the slides):

      image-20250728235100395

      • An equivalent way to write it

        image-20250728235227843
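
    • A sketch of such a model (the layer sizes 10 → 32 → 1 are assumed here just for illustration), wrapping the layers in nn.Sequential:

      import torch.nn as nn

      class MyModel(nn.Module):
          def __init__(self):
              super().__init__()
              self.net = nn.Sequential(
                  nn.Linear(10, 32),
                  nn.Sigmoid(),
                  nn.Linear(32, 1),
              )

          def forward(self, x):
              return self.net(x)

      The equivalent formulation keeps each layer as a separate attribute and applies them one after another inside forward().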

Computing the loss

  • Common choices
    • MSE, for regression tasks: criterion = nn.MSELoss()
    • Cross entropy, for classification tasks: criterion = nn.CrossEntropyLoss()
  • Computing the loss is then simple: loss = criterion(model_output, labels) (see the sketch below)
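
  • A tiny sketch of the MSE case, with made-up predictions and labels:

    import torch
    import torch.nn as nn

    criterion = nn.MSELoss()
    model_output = torch.tensor([0.5, 1.0, 2.0])
    labels = torch.tensor([1.0, 1.0, 1.0])
    loss = criterion(model_output, labels)
    print(loss)   # ((-0.5)**2 + 0**2 + 1**2) / 3 ≈ 0.4167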

Choosing an optimizer

  • The homework only uses gradient descent:
    Stochastic Gradient Descent (SGD): torch.optim.SGD(model.parameters(), lr, momentum=0)

  • for every batch of data:

    1. call optimizer.zero_grad() to reset gradients of model parameters.
    2. call loss.backward() to backpropagate gradients of prediction loss.
    3. call optimizer.step() to adjust model parameters.

image-20250729000214474

The complete workflow

dataset = MyDataset(file)                             # read data via MyDataset
tr_set = DataLoader(dataset, 16, shuffle=True)        # put dataset into DataLoader
model = MyModel().to(device)                          # construct model and move to device (cpu/cuda)
criterion = nn.MSELoss()                              # set loss function
optimizer = torch.optim.SGD(model.parameters(), 0.1)  # set optimizer

# training
for epoch in range(n_epochs):                         # iterate n_epochs
    model.train()                                     # set model to train mode
    for x, y in tr_set:                               # iterate through the dataloader
        optimizer.zero_grad()                         # reset gradients to zero
        x, y = x.to(device), y.to(device)             # move data to device (cpu/cuda)
        pred = model(x)                               # forward pass (compute output)
        loss = criterion(pred, y)                     # compute loss
        loss.backward()                               # compute gradients (backpropagation)
        optimizer.step()                              # update model with optimizer

# testing (on the validation set dv_set)
model.eval()                                          # set model to evaluation mode
total_loss = 0
for x, y in dv_set:                                   # iterate through the dataloader
    x, y = x.to(device), y.to(device)                 # move data to device (cpu/cuda)
    with torch.no_grad():                             # disable gradient calculation
        pred = model(x)                               # forward pass (compute output)
        loss = criterion(pred, y)                     # compute loss
    total_loss += loss.cpu().item() * len(x)          # accumulate loss
avg_loss = total_loss / len(dv_set.dataset)           # compute averaged loss

# predicting
model.eval()
preds = []
for x in tt_set:                                      # iterate through the dataloader
    x = x.to(device)
    with torch.no_grad():                             # disable gradient calculation
        pred = model(x)                               # forward pass (compute output)
        preds.append(pred.cpu())                      # collect predictions

# save
torch.save(model.state_dict(), path)
# load
ckpt = torch.load(path)
model.load_state_dict(ckpt)

Homework 1

I did not submit to Kaggle for a score; I just wrote a quick version.

https://github.com/qshen0629/machine-learing-HW/tree/main/ml2022spring-hw1