[PyTorch Programming] Installing and Getting Started with PyTorch-Ignite v0.4.8


PyTorch-Ignite is a high-level library for PyTorch, much as Keras is for TensorFlow. Its official website is:

In short, the library makes it easier to train, evaluate, and deploy deep learning models written in PyTorch.

PyTorch-Ignite depends on PyTorch. Installation involves the following steps:
1. Create a Python environment:

conda create -n py36_ignite_048 python=3.6
conda activate py36_ignite_048

2. Install PyTorch:
The following command installs PyTorch 1.9.0 built for CUDA 10.2:

pip install torch==1.9.0+cu102 torchvision==0.10.0+cu102 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html

3. Install PyTorch-Ignite:

pip install pytorch-ignite

4. Install Jupyter Notebook (for convenient testing):

pip install notebook

5. Install TensorFlow (to make TensorBoard available; installing just the `tensorboard` package would also suffice):

pip install tensorflow

Start Jupyter Notebook:

conda activate py36_ignite_048
jupyter notebook --port=1234 # specify the port

The `--port=1234` flag above sets the port for the remote Jupyter Notebook service. Most servers keep their ports behind a firewall, so ask the administrator which ports are open to you; the administrator can check with a firewall tool such as `ufw`.
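Before launching, you can check from Python whether a candidate port is already taken on the machine. The sketch below uses only the standard `socket` module and the example port 1234 from above; note it only detects a local listener, it cannot tell you whether the firewall allows the port.

```python
import socket

def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing is currently listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        # connect_ex returns 0 when a connection succeeds, i.e. the port is busy
        return s.connect_ex((host, port)) != 0

print(port_is_free(1234))
```

If this prints `False`, pick another port before starting Jupyter.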

Checking versions

Create a new notebook, add a first cell, paste the following code, and run it:

import torch
print(torch.__file__) # installation path
print(torch.__version__) # version number
print(torch.cuda.is_available()) # whether CUDA is available

If it prints something like the following, PyTorch is installed correctly:

/home/XXXXXX/anaconda3/envs/py36_ignite_048/lib/python3.6/site-packages/torch/__init__.py
1.9.0+cu102
True

Add another cell to check ignite:

import ignite
print(ignite.__file__)  # installation path
print(ignite.__version__) # version number

If it prints the following, the installation succeeded:

/home/XXXXXX/anaconda3/envs/py36_ignite_048/lib/python3.6/site-packages/ignite/__init__.py
0.4.8

Code skeleton

The typical structure of an Ignite training script looks like this (`Net`, `get_data_loaders`, `train_batch_size`, `val_batch_size`, and `log_interval` are placeholders you define yourself):

from ignite.engine import Events, create_supervised_trainer, create_supervised_evaluator
from ignite.metrics import Accuracy, Loss

model = Net()
train_loader, val_loader = get_data_loaders(train_batch_size, val_batch_size)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.8)
criterion = nn.NLLLoss()

trainer = create_supervised_trainer(model, optimizer, criterion)

val_metrics = {
    "accuracy": Accuracy(),
    "nll": Loss(criterion)
}
evaluator = create_supervised_evaluator(model, metrics=val_metrics)

@trainer.on(Events.ITERATION_COMPLETED(every=log_interval))
def log_training_loss(trainer):
    print(f"Epoch[{trainer.state.epoch}] Loss: {trainer.state.output:.2f}")

@trainer.on(Events.EPOCH_COMPLETED)
def log_training_results(trainer):
    evaluator.run(train_loader)
    metrics = evaluator.state.metrics
    print(f"Training Results - Epoch: {trainer.state.epoch}  Avg accuracy: {metrics['accuracy']:.2f} Avg loss: {metrics['nll']:.2f}")

@trainer.on(Events.EPOCH_COMPLETED)
def log_validation_results(trainer):
    evaluator.run(val_loader)
    metrics = evaluator.state.metrics
    print(f"Validation Results - Epoch: {trainer.state.epoch}  Avg accuracy: {metrics['accuracy']:.2f} Avg loss: {metrics['nll']:.2f}")

trainer.run(train_loader, max_epochs=100)
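Conceptually, an Ignite `Engine` is just an event-driven loop over the data: it calls your process function on every batch and fires the handlers registered with `@trainer.on(...)`. The toy re-implementation below is not Ignite's real API, only a self-contained sketch of that idea:

```python
import types

class ToyEngine:
    """Minimal sketch of Ignite's Engine: run a process function over
    batches and fire registered handlers at certain events."""
    def __init__(self, process_fn):
        self.process_fn = process_fn
        self.handlers = {"ITERATION_COMPLETED": [], "EPOCH_COMPLETED": []}
        self.state = types.SimpleNamespace(epoch=0, iteration=0, output=None)

    def on(self, event):
        def register(handler):
            self.handlers[event].append(handler)
            return handler
        return register

    def run(self, data, max_epochs=1):
        for epoch in range(1, max_epochs + 1):
            self.state.epoch = epoch
            for batch in data:
                self.state.iteration += 1
                self.state.output = self.process_fn(self, batch)
                for h in self.handlers["ITERATION_COMPLETED"]:
                    h(self)
            for h in self.handlers["EPOCH_COMPLETED"]:
                h(self)
        return self.state

# A "training step" that just doubles the batch value
engine = ToyEngine(lambda eng, batch: batch * 2)

@engine.on("EPOCH_COMPLETED")
def report(eng):
    print(f"epoch {eng.state.epoch} done, last output {eng.state.output}")

engine.run([1, 2, 3], max_epochs=2)
```

The real `Engine` adds much more (event filters such as `every=`, termination, exception handling), but the handler-registration pattern is the same.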

Worked example

import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision.datasets import MNIST
from torchvision.models import resnet18
from torchvision.transforms import Compose, Normalize, ToTensor

from ignite.engine import Engine, Events, create_supervised_trainer, create_supervised_evaluator
from ignite.metrics import Accuracy, Loss
from ignite.handlers import ModelCheckpoint
from ignite.contrib.handlers import TensorboardLogger, global_step_from_engine

(Importing `ignite.contrib.handlers` may print a harmless `TqdmWarning: IProgress not found` message from tqdm; it can be ignored.)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)

On a machine with a GPU this prints:

cuda

This part is exactly the same as in plain PyTorch, with no changes at all.

# Define the model class
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # Changed the output layer to output 10 classes instead of 1000 classes
        self.model = resnet18(num_classes=10)
        # Changed the input layer to take grayscale MNIST images instead of RGB images
        self.model.conv1 = nn.Conv2d(
            1, 64, kernel_size=3, padding=1, bias=False
        )
    def forward(self, x):
        return self.model(x)

# Instantiate the Net class
model = Net().to(device)
# Print the model
print(model)
Net(
  (model): ResNet(
    (conv1): Conv2d(1, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu): ReLU(inplace=True)
    (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
    (layer1): Sequential(
      (0): BasicBlock(
        (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (1): BasicBlock(
        (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (layer2): Sequential(
      (0): BasicBlock(
        (conv1): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (downsample): Sequential(
          (0): Conv2d(64, 128, kernel_size=(1, 1), stride=(2, 2), bias=False)
          (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
      )
      (1): BasicBlock(
        (conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (layer3): Sequential(
      (0): BasicBlock(
        (conv1): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (downsample): Sequential(
          (0): Conv2d(128, 256, kernel_size=(1, 1), stride=(2, 2), bias=False)
          (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
      )
      (1): BasicBlock(
        (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (layer4): Sequential(
      (0): BasicBlock(
        (conv1): Conv2d(256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (downsample): Sequential(
          (0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
          (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
      )
      (1): BasicBlock(
        (conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (avgpool): AdaptiveAvgPool2d(output_size=(1, 1))
    (fc): Linear(in_features=512, out_features=10, bias=True)
  )
)

Download the dataset into the same directory as this file; this part is also identical to plain PyTorch:

data_transform = Compose([ToTensor(), Normalize((0.1307,), (0.3081,))])

train_loader = DataLoader(
    MNIST(download=True, root=".", transform=data_transform, train=True), batch_size=128, shuffle=True
)

val_loader = DataLoader(
    MNIST(download=True, root=".", transform=data_transform, train=False), batch_size=256, shuffle=False
)
(The first run may print a harmless torchvision UserWarning about a non-writeable NumPy array while the raw MNIST files are parsed.)
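A quick sanity check on the loader sizes: MNIST has 60,000 training and 10,000 test images, so with the batch sizes chosen above each training epoch contains ceil(60000/128) iterations. This matches the training log further below, where epoch boundaries fall just after iterations 400, 900, 1400, and so on.

```python
import math

# Dataset sizes are fixed for MNIST; batch sizes are the ones used in the
# DataLoaders above.
train_iters_per_epoch = math.ceil(60000 / 128)
val_iters_per_epoch = math.ceil(10000 / 256)
print(train_iters_per_epoch)  # -> 469
print(val_iters_per_epoch)    # -> 40
```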

This part, too, is identical to plain PyTorch:

# Optimizer and learning rate
optimizer = torch.optim.RMSprop(model.parameters(), lr=0.005)
# Loss function
criterion = nn.CrossEntropyLoss()

This is the part where Ignite encapsulates the traditional training loop at a high level.

# Metrics dictionary; Accuracy() and Loss(criterion) are classes from ignite.metrics
val_metrics = {
    "accuracy": Accuracy(),
    "loss": Loss(criterion)
}

# Custom training step
def train_step(engine, batch):
    model.train()
    optimizer.zero_grad()
    x, y = batch[0].to(device), batch[1].to(device)
    y_pred = model(x)
    loss = criterion(y_pred, y)
    loss.backward()
    optimizer.step()
    return loss.item()

# trainer, train_evaluator and val_evaluator below are all Engine instances
trainer = Engine(train_step) # training loop

# Custom validation step
def validation_step(engine, batch):
    model.eval()
    with torch.no_grad():
        x, y = batch[0].to(device), batch[1].to(device)
        y_pred = model(x)
        return y_pred, y

train_evaluator = Engine(validation_step) # evaluation loop over the training set
val_evaluator = Engine(validation_step) # evaluation loop over the validation set

# Attach every metric in the dictionary to both evaluators
for evaluator in (train_evaluator, val_evaluator):
    for name, metric in val_metrics.items():
        metric.attach(evaluator, name)

log_interval = 100

@trainer.on(Events.ITERATION_COMPLETED(every=log_interval))
def log_training_loss(engine):
    print(f"Epoch[{engine.state.epoch}], Iter[{engine.state.iteration}] Loss: {engine.state.output:.2f}")

@trainer.on(Events.EPOCH_COMPLETED)
def log_training_results(trainer):
    train_evaluator.run(train_loader)
    metrics = train_evaluator.state.metrics
    print(f"Training Results - Epoch[{trainer.state.epoch}] Avg accuracy: {metrics['accuracy']:.2f} Avg loss: {metrics['loss']:.2f}")

@trainer.on(Events.EPOCH_COMPLETED)
def log_validation_results(trainer):
    val_evaluator.run(val_loader)
    metrics = val_evaluator.state.metrics
    print(f"Validation Results - Epoch[{trainer.state.epoch}] Avg accuracy: {metrics['accuracy']:.2f} Avg loss: {metrics['loss']:.2f}")

# The score function returns the current value of any metric defined in val_metrics
def score_function(engine):
    return engine.state.metrics["accuracy"]

# Checkpoint to store n_saved best models wrt score function
model_checkpoint = ModelCheckpoint(
    "checkpoint",
    n_saved=2,
    filename_prefix="best",
    score_function=score_function,
    score_name="accuracy",
    global_step_transform=global_step_from_engine(trainer), # helps fetch the trainer's state
)

# Save the model after every epoch of val_evaluator is completed
val_evaluator.add_event_handler(Events.COMPLETED, model_checkpoint, {"model": model})
<ignite.engine.events.RemovableEventHandle at 0x7ffa680bc198>
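With the settings above, ModelCheckpoint writes files such as `checkpoint/best_model_5_accuracy=0.9912.pt` (the pattern follows from `filename_prefix` and `score_name`; treat the exact format as an assumption for your ignite version). A small helper can pick the best saved file by the accuracy encoded in its name; restoring it is then `model.load_state_dict(torch.load(best_path))`.

```python
import re
from pathlib import Path

def best_checkpoint(paths):
    """Pick the checkpoint path whose filename encodes the highest accuracy.

    Assumes filenames like 'best_model_<step>_accuracy=<value>.pt', the
    pattern the ModelCheckpoint configuration above appears to produce.
    """
    def score(path):
        m = re.search(r"accuracy=(\d+\.\d+)", Path(path).name)
        return float(m.group(1)) if m else float("-inf")
    return max(paths, key=score)

files = [
    "checkpoint/best_model_3_accuracy=0.9858.pt",
    "checkpoint/best_model_5_accuracy=0.9912.pt",
]
print(best_checkpoint(files))  # -> checkpoint/best_model_5_accuracy=0.9912.pt
# model.load_state_dict(torch.load(best_checkpoint(files)))
```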

Viewing training with TensorBoard

# Create a TensorboardLogger
tb_logger = TensorboardLogger(log_dir="tb-logger")

# Attach handler to plot trainer's loss every 100 iterations
tb_logger.attach_output_handler(
    trainer,
    event_name=Events.ITERATION_COMPLETED(every=100),
    tag="training",
    output_transform=lambda loss: {"batch_loss": loss},
)

# Attach handler for plotting both evaluators' metrics after every epoch completes
for tag, evaluator in [("training", train_evaluator), ("validation", val_evaluator)]:
    tb_logger.attach_output_handler(
        evaluator,
        event_name=Events.EPOCH_COMPLETED,
        tag=tag,
        metric_names="all",
        global_step_transform=global_step_from_engine(trainer),
    )

trainer.run(train_loader, max_epochs=5)

tb_logger.close()
(The run may also print a harmless UserWarning from torch.nn.functional about named tensors being experimental.)

Epoch[1], Iter[100] Loss: 0.12
Epoch[1], Iter[200] Loss: 0.08
Epoch[1], Iter[300] Loss: 0.09
Epoch[1], Iter[400] Loss: 0.10
Training Results - Epoch[1] Avg accuracy: 0.86 Avg loss: 0.53
Validation Results - Epoch[1] Avg accuracy: 0.86 Avg loss: 0.56
Epoch[2], Iter[500] Loss: 0.07
Epoch[2], Iter[600] Loss: 0.03
Epoch[2], Iter[700] Loss: 0.04
Epoch[2], Iter[800] Loss: 0.02
Epoch[2], Iter[900] Loss: 0.16
Training Results - Epoch[2] Avg accuracy: 0.99 Avg loss: 0.04
Validation Results - Epoch[2] Avg accuracy: 0.99 Avg loss: 0.04
Epoch[3], Iter[1000] Loss: 0.05
Epoch[3], Iter[1100] Loss: 0.07
Epoch[3], Iter[1200] Loss: 0.01
Epoch[3], Iter[1300] Loss: 0.06
Epoch[3], Iter[1400] Loss: 0.04
Training Results - Epoch[3] Avg accuracy: 0.99 Avg loss: 0.03
Validation Results - Epoch[3] Avg accuracy: 0.99 Avg loss: 0.04
Epoch[4], Iter[1500] Loss: 0.04
Epoch[4], Iter[1600] Loss: 0.07
Epoch[4], Iter[1700] Loss: 0.03
Epoch[4], Iter[1800] Loss: 0.03
Training Results - Epoch[4] Avg accuracy: 0.99 Avg loss: 0.03
Validation Results - Epoch[4] Avg accuracy: 0.99 Avg loss: 0.04
Epoch[5], Iter[1900] Loss: 0.04
Epoch[5], Iter[2000] Loss: 0.07
Epoch[5], Iter[2100] Loss: 0.04
Epoch[5], Iter[2200] Loss: 0.03
Epoch[5], Iter[2300] Loss: 0.04
Training Results - Epoch[5] Avg accuracy: 0.99 Avg loss: 0.02
Validation Results - Epoch[5] Avg accuracy: 0.99 Avg loss: 0.03
To browse the logs, start TensorBoard from a terminal:

conda activate py36_ignite_048
cd [directory containing your code]
tensorboard --logdir=./tb-logger --bind_all --port=6666

The `--port=6666` flag above sets the port number. Note that some browsers (Chrome, for example) refuse to connect to port 6666 as an unsafe port, so choose a different open port if the page fails to load.

Original: https://www.cnblogs.com/chenyirong/p/16342368.html
Author: 华工陈艺荣
Title: 【Pytorch编程】Pytorch-Ignite v0.4.8的安装以及简单使用


