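Introduction

PyTorch is an open-source deep learning library for Python, based on Torch, used in applications such as computer vision and natural language processing. It provides strong GPU acceleration and builds its computation graph dynamically at run time (define-by-run), whereas frameworks such as TensorFlow 1.x were originally designed around static graphs. PyTorch offers two high-level features:

Tensor computation (like NumPy) with strong GPU acceleration;
Deep neural networks built on an automatic differentiation system.

This article uses PyTorch's automatic differentiation together with stochastic gradient descent (SGD) to solve a system of nonlinear equations.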
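Before the full problem, the idea can be seen on a one-unknown toy case. The sketch below is not from the article: the equation y = w * x, the true value 3.0, the learning rate, and the step count are all assumptions chosen for illustration. It shows autograd computing the gradient of a squared-error loss and torch.optim.SGD applying the update.

```python
import torch

torch.manual_seed(0)

# Toy problem (hypothetical, for illustration only): recover w in y = w * x.
x = torch.randn(200)
y = 3.0 * x

# The unknown is stored as a trainable parameter, starting from 0.
w = torch.nn.Parameter(torch.zeros(1))
optimizer = torch.optim.SGD([w], lr=0.1)

for step in range(200):
    optimizer.zero_grad()                # clear the gradient from the previous step
    loss = torch.mean((w * x - y) ** 2)  # mean squared residual of the equation
    loss.backward()                      # autograd fills w.grad
    optimizer.step()                     # w <- w - lr * w.grad

w_est = w.item()
print("estimated w:", w_est)
```

The rest of the article follows the same pattern, with 18 unknowns instead of one and mini-batches instead of the full data set at each step.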

Problem Definition

y_1 = a_1 \cdot x_1 + a_2 \cdot x_2

y_2 = a_3 \cdot x_3 + a_4 \cdot x_2 \cdot x_4 + a_5 \cdot x_2 \cdot x_5

y_3 = a_6 \cdot x_6 + a_7 \cdot x_2 \cdot x_7 + a_8 \cdot x_2 \cdot x_8

y_4 = a_9 \cdot x_9 + a_{10} \cdot x_2 \cdot x_{10} + a_{11} \cdot x_2 \cdot x_2 + a_{12} \cdot x_2 \cdot x_5

y_5 = a_{13} \cdot x_{11} + a_{14} \cdot x_{2} \cdot x_7 +a_{15} \cdot x_2 \cdot x_8

y_6 = a_{16} \cdot x_{12} + a_{17} \cdot x_{2} \cdot x_4 +a_{18} \cdot x_2 \cdot x_5

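The formulas above define a system of equations whose unknowns are $a_1$ through $a_{18}$. The system is nonlinear in the inputs $x$ (it contains product terms such as $x_2 \cdot x_4$), although each equation is linear in the unknowns $a_i$. Traditional (or heuristic) solution methods are not covered here; this article only discusses solving the system with PyTorch.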

Dependencies

import torch
import numpy

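Generating Training Data

Assume the true values of the unknowns are $a_i = i - 1$, that is, $0, 1, \dots, 17$ (this is what range(18) produces in the code below). The training data consists of $x$, a 12-dimensional known input, and $y$, a 6-dimensional known output. The following code generates 1000 samples for training.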

import numpy

n_sample = 1000
x = numpy.random.randn(n_sample, 12)  # inputs drawn from a standard normal distribution
a = numpy.arange(18)  # ground-truth unknowns: 0, 1, ..., 17

y1 = a[0] * x[:,0] + a[1] * x[:,1]
y1 = y1.reshape(n_sample, 1)
y2 = a[2] * x[:,2] + a[3] * x[:,1]* x[:,3] + a[4] * x[:,1]* x[:,4]
y2 = y2.reshape(n_sample, 1)
y3 = a[5] * x[:,5] + a[6] * x[:,1]* x[:,6] + a[7] * x[:,1]* x[:,7]
y3 = y3.reshape(n_sample, 1)
y4 = a[8] * x[:,8] + a[9] * x[:,1]* x[:,9] + a[10] * x[:,1]* x[:,1] + a[11] * x[:,1]* x[:,4]
y4 = y4.reshape(n_sample, 1)
y5 = a[12] * x[:,10] + a[13] * x[:,1]* x[:,6] + a[14] * x[:,1]* x[:,7]
y5 = y5.reshape(n_sample, 1)
y6 = a[15] * x[:,11] + a[16] * x[:,1]* x[:,3] + a[17] * x[:,1]* x[:,4]
y6 = y6.reshape(n_sample, 1)

y = numpy.concatenate((y1, y2, y3, y4, y5, y6), axis=1)

numpy.save('x.npy', x)
numpy.save('y.npy', y)

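The generated data is saved to the files x.npy and y.npy.

Building the Solver Model

There are 18 unknowns, $a_1$ through $a_{18}$, so the model's parameter vector self.param has 18 dimensions and is initialized randomly.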

class Solver(torch.nn.Module):
    def __init__(self, param_dim=18):
        super(Solver, self).__init__()
        self.param_dim = param_dim
        self.param = torch.nn.Parameter(torch.randn(param_dim))

    def forward(self, x):
        y1 = self.param[0] * x[:,0] + self.param[1] * x[:,1]
        y1 = y1.unsqueeze(1)
        y2 = self.param[2] * x[:,2] + self.param[3] * x[:,1]* x[:,3] + self.param[4] * x[:,1]* x[:,4]
        y2 = y2.unsqueeze(1)
        y3 = self.param[5] * x[:,5] + self.param[6] * x[:,1]* x[:,6] + self.param[7] * x[:,1]* x[:,7]
        y3 = y3.unsqueeze(1)
        y4 = self.param[8] * x[:,8] + self.param[9] * x[:,1]* x[:,9] + self.param[10] * x[:,1]* x[:,1] + self.param[11] * x[:,1]* x[:,4]
        y4 = y4.unsqueeze(1)
        y5 = self.param[12] * x[:,10] + self.param[13] * x[:,1]* x[:,6] + self.param[14] * x[:,1]* x[:,7]
        y5 = y5.unsqueeze(1)
        y6 = self.param[15] * x[:,11] + self.param[16] * x[:,1]* x[:,3] + self.param[17] * x[:,1]* x[:,4]
        y6 = y6.unsqueeze(1)
        return torch.cat((y1, y2, y3, y4, y5, y6), dim=1)
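The six hand-written expressions in forward() can also be driven by a lookup table, which makes the index pattern easier to audit. This is only a sketch of an alternative layout, not the article's model: the names TERMS and TableSolver are hypothetical, and the 0-based indices are transcribed from the forward() above.

```python
import torch

# One list per output y1..y6. Each entry is
# (parameter index, columns of x that are multiplied together), all 0-based.
TERMS = [
    [(0, [0]), (1, [1])],
    [(2, [2]), (3, [1, 3]), (4, [1, 4])],
    [(5, [5]), (6, [1, 6]), (7, [1, 7])],
    [(8, [8]), (9, [1, 9]), (10, [1, 1]), (11, [1, 4])],
    [(12, [10]), (13, [1, 6]), (14, [1, 7])],
    [(15, [11]), (16, [1, 3]), (17, [1, 4])],
]


class TableSolver(torch.nn.Module):
    """Hypothetical table-driven variant of Solver."""

    def __init__(self, param_dim=18):
        super().__init__()
        self.param = torch.nn.Parameter(torch.randn(param_dim))

    def forward(self, x):
        outputs = []
        for terms in TERMS:
            # Each term is a_p times the product of the listed columns of x.
            y = sum(self.param[p] * x[:, cols].prod(dim=1) for p, cols in terms)
            outputs.append(y)
        return torch.stack(outputs, dim=1)  # shape: (batch, 6)


model = TableSolver()
x = torch.randn(5, 12)
out = model(x)
print(out.shape)
```

Either layout exposes the same 18 trainable values through model.param, so the training code does not need to change.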

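Building the Training Dataset

A custom dataset in PyTorch must override two member functions, __len__ and __getitem__. The details are not repeated here; consult the PyTorch documentation if anything is unclear.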

class SDataset(torch.utils.data.Dataset):
    def __init__(self, x_path, y_path):
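        # Cast to float32 so the data matches the dtype of the model parameters.
        self.x = numpy.load(x_path).astype(numpy.float32)
        self.y = numpy.load(y_path).astype(numpy.float32)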

    def __len__(self):
        return self.x.shape[0]
    
    def __getitem__(self, idx):
        return self.x[idx], self.y[idx]
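To see what a DataLoader yields from such a dataset, the sketch below uses a hypothetical in-memory variant (ArrayDataset, not part of the article) so that it runs without x.npy and y.npy. The sample count of 40 is an arbitrary choice, and the float32 cast is an assumption made here to match the default dtype of the model parameters.

```python
import numpy
import torch


class ArrayDataset(torch.utils.data.Dataset):
    """Hypothetical in-memory variant of SDataset."""

    def __init__(self, x, y):
        self.x = x.astype(numpy.float32)
        self.y = y.astype(numpy.float32)

    def __len__(self):
        return self.x.shape[0]

    def __getitem__(self, idx):
        return self.x[idx], self.y[idx]


x = numpy.random.randn(40, 12)
y = numpy.random.randn(40, 6)
loader = torch.utils.data.DataLoader(ArrayDataset(x, y), batch_size=16, shuffle=True)

# The default collate function stacks the NumPy rows into torch tensors.
batch_shapes = [(tuple(xb.shape), tuple(yb.shape)) for xb, yb in loader]
first_x, first_y = next(iter(loader))
print(batch_shapes, first_x.dtype)
```

With 40 samples and batch_size=16 the last batch is smaller than the others, which is worth keeping in mind when reading per-batch loss values.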

Building the Training Loop

def train(model, train_loader, optimizer, epoch):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        optimizer.zero_grad()
        output = model(data)
        loss = torch.nn.MSELoss()(output, target)
        loss.backward()
        optimizer.step()
        if batch_idx % 10 == 0:  # print the loss every 10 batches
            print('Train Epoch: {} [{}/{} ]\tLoss: {:.6f}'.format(
                epoch, batch_idx, len(train_loader), loss.item()))
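For reference, the quantity minimized by train() can be written out. With torch.nn.MSELoss at its default reduction ('mean'), the loss of a batch of $B$ samples is averaged over all $6B$ outputs, and with plain SGD (no momentum) each optimizer.step() moves the parameters against the gradient. The symbols $B$, $\hat{y}$ (model output), and $\eta$ (learning rate) are notation introduced here, not taken from the article.

```latex
L(a) = \frac{1}{6B} \sum_{n=1}^{B} \sum_{k=1}^{6} \left( \hat{y}_{n,k}(a) - y_{n,k} \right)^2,
\qquad
a \leftarrow a - \eta \, \nabla_a L(a)
```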

Solving

Epochs = 300
model = Solver() # create the model
optimizer = torch.optim.SGD(model.parameters(), lr=0.001) # build the SGD optimizer
dataset = SDataset('x.npy', 'y.npy') # load the dataset
train_loader = torch.utils.data.DataLoader(dataset, batch_size=16, shuffle=True)

for epoch in range(Epochs):
    print('Epoch: {}'.format(epoch))
    train(model, train_loader, optimizer, epoch)
    torch.save(model, 'model.pth') # save the model after each epoch
    print("A:",model.param.data.numpy()) # print the current solution

Results

A: [-2.3117074e-05  9.9803221e-01  1.9947073e+00  2.9986596e+00
  3.9973042e+00  4.9946051e+00  5.9927812e+00  6.9894891e+00
  7.9793091e+00  8.9938631e+00  1.0000929e+01  1.0992265e+01
  1.1979302e+01  1.2986021e+01  1.3973579e+01  1.4958998e+01
  1.5995734e+01  1.6989164e+01]
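A quick way to judge this output is to compare it with the values used to generate the data, which are 0 through 17 because the generation script builds a from range(18). The sketch below simply copies the printed vector; the variable names are chosen here for illustration.

```python
import numpy

# Solution vector copied from the printed output above.
solution = numpy.array([
    -2.3117074e-05, 9.9803221e-01, 1.9947073e+00, 2.9986596e+00,
    3.9973042e+00, 4.9946051e+00, 5.9927812e+00, 6.9894891e+00,
    7.9793091e+00, 8.9938631e+00, 1.0000929e+01, 1.0992265e+01,
    1.1979302e+01, 1.2986021e+01, 1.3973579e+01, 1.4958998e+01,
    1.5995734e+01, 1.6989164e+01,
])

expected = numpy.arange(18)              # ground truth used to generate y
abs_err = numpy.abs(solution - expected)
max_err = float(abs_err.max())           # largest deviation from the truth
worst = int(abs_err.argmax())            # 0-based index of that parameter

print("max abs error:", max_err, "at index", worst)
```

Every recovered value lies within 0.05 of its true value.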
