Machine Learning 3 HW (Hung-yi Lee)

Homework 1

These are brief notes on the PyTorch tutorial given by the TAs of Hung-yi Lee's course.

Dataset & Dataloader

  • A Dataset is the structure that stores the data.

  • A DataLoader groups the data into batches, serving it one mini-batch at a time, and can load the batches in parallel.

    dataset = MyDataset(file)
    dataloader = DataLoader(dataset, batch_size, shuffle=True)

    Set shuffle=True for training and shuffle=False for testing.

  • How to build a Dataset and a DataLoader

    from torch.utils.data import Dataset, DataLoader

    class MyDataset(Dataset):
        # read data and preprocess it
        def __init__(self, file):
            self.data = ...

        # return one sample at a time
        def __getitem__(self, index):
            return self.data[index]

        # return the size of the dataset
        def __len__(self):
            return len(self.data)
    • The DataLoader repeatedly calls the Dataset's __getitem__(i) and groups the returned samples into a mini-batch of size batch_size (see the sketch below).

      image-20250728230559384
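
    • A minimal runnable sketch of this, using a made-up ToyDataset holding ten scalar samples, just for illustration:

      import torch
      from torch.utils.data import Dataset, DataLoader

      class ToyDataset(Dataset):
          def __init__(self):
              self.data = torch.arange(10, dtype=torch.float32)

          def __getitem__(self, index):
              return self.data[index]

          def __len__(self):
              return len(self.data)

      dataset = ToyDataset()
      dataloader = DataLoader(dataset, batch_size=4, shuffle=True)
      for batch in dataloader:
          print(batch.shape)   # torch.Size([4]) for the first two batches, torch.Size([2]) for the last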

Tensors

  • In short, a tensor is a high-dimensional array.

    image-20250728231044647

  • Check a tensor's shape with the attribute tensor.shape (it is an attribute, not a method call).

    image-20250728231239106

  • Creating tensors

    • They can be created directly from a list or a numpy array

      x = torch.tensor([[1., -1.], [1., -1.]])
      x = torch.from_numpy(np.array([[1., -1.], [1., -1.]]))
      '''
      tensor([[ 1., -1.],
              [ 1., -1.]])
      '''
    • Creating tensors of all zeros or all ones

      x = torch.zeros([2, 2])   # shape
      '''
      tensor([[0., 0.],
              [0., 0.]])
      '''
      x = torch.ones([1, 2, 5])
  • Common operations (a combined sketch follows the transpose example below)

    • Addition

      • z = x + y
    • Subtraction

      • z = x - y
    • Power

      • y = x.pow(2)
    • Summation

      • y = x.sum()
    • Mean

      • y = x.mean()
    • Transpose

      >>> x = torch.zeros([2, 3])
      >>> x.shape
      torch.Size([2, 3])
      >>> x
      tensor([[0., 0., 0.],
              [0., 0., 0.]])
      >>> x = x.transpose(0, 1)   # swap dim 0 and dim 1
      >>> x.shape
      torch.Size([3, 2])
      >>> x
      tensor([[0., 0.],
              [0., 0.],
              [0., 0.]])
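
    • A small combined sketch of the operations above, using made-up tensors:

      import torch

      x = torch.ones([2, 3])
      y = torch.full([2, 3], 2.)   # a 2x3 tensor filled with the value 2

      z = x + y       # element-wise addition
      z = x - y       # element-wise subtraction
      p = x.pow(2)    # element-wise square
      s = x.sum()     # sum over all elements -> tensor(6.)
      m = x.mean()    # mean over all elements -> tensor(1.)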
  • Adding or removing a dimension (unsqueeze / squeeze) does not lose any information; it only changes the shape

    >>> x = torch.zeros([1, 2, 3])
    >>> x.shape
    torch.Size([1, 2, 3])
    >>> x = x.squeeze(0)   # remove dim 0 (its length is 1)
    >>> x.shape
    torch.Size([2, 3])
    >>> x.shape
    torch.Size([2, 3])
    >>> x = x.unsqueeze(1)   # insert a new dimension at position dim = 1
    >>> x.shape
    torch.Size([2, 1, 3])
  • Concatenating multiple tensors along a dimension with torch.cat (see the sketch below)

    image-20250728233610507
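
    • A short sketch of torch.cat, assuming three tensors that agree on every dimension except dim 1:

      import torch

      x = torch.zeros([2, 1, 3])
      y = torch.zeros([2, 3, 3])
      z = torch.zeros([2, 2, 3])

      w = torch.cat([x, y, z], dim=1)   # concatenate along dim 1
      print(w.shape)                    # torch.Size([2, 6, 3])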

  • Data types (a conversion sketch follows the table)

    | Data type              | dtype       | tensor            |
    | ---------------------- | ----------- | ----------------- |
    | 32-bit floating point  | torch.float | torch.FloatTensor |
    | 64-bit signed integer  | torch.long  | torch.LongTensor  |
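
    • A quick sketch of specifying and converting a tensor's dtype:

      import torch

      x = torch.tensor([1, 2, 3], dtype=torch.float)   # 32-bit floating point
      y = torch.tensor([1, 2, 3], dtype=torch.long)    # 64-bit signed integer
      z = x.long()                                     # convert float -> long
      print(x.dtype, y.dtype, z.dtype)                 # torch.float32 torch.int64 torch.int64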

  • Selecting a device (a common idiom is sketched below)

    torch.cuda.is_available()   # check whether a GPU is available
    x = x.to('cpu')
    x = x.to('cuda')
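
    • A common idiom (a sketch, not from the original notes) is to pick the device once and reuse it:

      import torch

      device = 'cuda' if torch.cuda.is_available() else 'cpu'
      x = torch.zeros([2, 3]).to(device)   # lands on the GPU if one is available, otherwise the CPU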
  • Gradient calculation

    • The slide below shows an example; a code sketch follows it.

      image-20250728234225896
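
    • A minimal sketch of the idea: mark a tensor with requires_grad=True, build a scalar from it, call backward(), and read the gradient from .grad.

      import torch

      x = torch.tensor([[1., 0.], [-1., 1.]], requires_grad=True)
      z = x.pow(2).sum()   # z = sum of x_ij^2, a scalar
      z.backward()         # backpropagation: compute dz/dx
      print(x.grad)        # tensor([[ 2.,  0.], [-2.,  2.]]), i.e. 2 * x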

Building a model

  • network layers

    • For example, a fully-connected layer, nn.Linear(in_features, out_features), only needs the size of the last input dimension and the desired output dimension

      image-20250728234932223

    • There are also activation layers such as nn.Sigmoid() and nn.ReLU()

  • Building your own neural network

    • A very simple example (a code sketch is given after the slides):

      image-20250728235100395

      • An equivalent way to write it

        image-20250728235227843
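
    • A sketch of such a model (the layer sizes 10 → 32 → 1 are assumed here just for illustration), wrapping the layers in nn.Sequential:

      import torch.nn as nn

      class MyModel(nn.Module):
          def __init__(self):
              super().__init__()
              self.net = nn.Sequential(
                  nn.Linear(10, 32),
                  nn.Sigmoid(),
                  nn.Linear(32, 1),
              )

          def forward(self, x):
              return self.net(x)

      The equivalent formulation keeps each layer as a separate attribute and applies them one after another inside forward().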

Computing the loss

  • Common choices
    • MSE, for regression tasks: criterion = nn.MSELoss()
    • Cross entropy, for classification tasks: criterion = nn.CrossEntropyLoss()
  • Computing the loss is then simple: loss = criterion(model_output, labels) (see the sketch below)
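
  • A tiny sketch of the MSE case, with made-up predictions and labels:

    import torch
    import torch.nn as nn

    criterion = nn.MSELoss()
    model_output = torch.tensor([0.5, 1.0, 2.0])
    labels = torch.tensor([1.0, 1.0, 1.0])
    loss = criterion(model_output, labels)
    print(loss)   # ((-0.5)**2 + 0**2 + 1**2) / 3 ≈ 0.4167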

Choosing an optimizer

  • The homework only uses gradient descent:
    Stochastic Gradient Descent (SGD): torch.optim.SGD(model.parameters(), lr, momentum=0)

  • for every batch of data:

    1. call optimizer.zero_grad() to reset gradients of model parameters.
    2. call loss.backward() to backpropagate gradients of prediction loss.
    3. call optimizer.step() to adjust model parameters.

image-20250729000214474

The complete workflow

dataset = MyDataset(file)                             # read data via MyDataset
tr_set = DataLoader(dataset, 16, shuffle=True)        # put dataset into DataLoader
model = MyModel().to(device)                          # construct model and move to device (cpu/cuda)
criterion = nn.MSELoss()                              # set loss function
optimizer = torch.optim.SGD(model.parameters(), 0.1)  # set optimizer

# training
for epoch in range(n_epochs):                         # iterate n_epochs
    model.train()                                     # set model to train mode
    for x, y in tr_set:                               # iterate through the dataloader
        optimizer.zero_grad()                         # reset gradients to zero
        x, y = x.to(device), y.to(device)             # move data to device (cpu/cuda)
        pred = model(x)                               # forward pass (compute output)
        loss = criterion(pred, y)                     # compute loss
        loss.backward()                               # compute gradients (backpropagation)
        optimizer.step()                              # update model with optimizer

# testing (on the validation set dv_set)
model.eval()                                          # set model to evaluation mode
total_loss = 0
for x, y in dv_set:                                   # iterate through the dataloader
    x, y = x.to(device), y.to(device)                 # move data to device (cpu/cuda)
    with torch.no_grad():                             # disable gradient calculation
        pred = model(x)                               # forward pass (compute output)
        loss = criterion(pred, y)                     # compute loss
    total_loss += loss.cpu().item() * len(x)          # accumulate loss
avg_loss = total_loss / len(dv_set.dataset)           # compute averaged loss

# predicting
model.eval()
preds = []
for x in tt_set:                                      # iterate through the dataloader
    x = x.to(device)
    with torch.no_grad():                             # disable gradient calculation
        pred = model(x)                               # forward pass (compute output)
        preds.append(pred.cpu())                      # collect predictions

# save
torch.save(model.state_dict(), path)
# load
ckpt = torch.load(path)
model.load_state_dict(ckpt)

Homework 1

I did not submit to Kaggle for a score; I just wrote a quick version.

https://github.com/qshen0629/machine-learing-HW/tree/main/ml2022spring-hw1