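Let me start with the question, in the hope that someone more experienced can point me in the right direction:
When I run the mnist demo, the GPU is much slower than the CPU. Why?
My environment:
CPU: Intel i9, 8 cores / 16 threads
Memory: 64 GB
GPU: AMD Radeon Pro 5500M
Sample code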
import tensorflow as tf


def run():
    # Load MNIST and scale pixel values to [0, 1]
    mnist = tf.keras.datasets.mnist
    (x_train, y_train), (x_test, y_test) = mnist.load_data()
    x_train, x_test = x_train / 255.0, x_test / 255.0
    print(len(x_train), len(y_train), len(x_test), len(y_test))

    # A small fully connected classifier
    model = tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation='softmax')
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    model.fit(x_train, y_train, epochs=5)
    model.evaluate(x_test, y_test, verbose=2)


if __name__ == '__main__':
    devices = tf.config.list_physical_devices()
    print(devices)
    # Run the same training once on the CPU, then once on the GPU
    with tf.device("cpu:0"):
        print('start with cpu')
        run()
    with tf.device("gpu:0"):
        print('start with gpu')
        run()
Sample output
[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'), PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
2022-01-07 17:49:09.013880: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Metal device set to: AMD Radeon Pro 5500M
systemMemory: 64.00 GB
maxCacheSize: 3.99 GB
2022-01-07 17:49:09.014762: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2022-01-07 17:49:09.015263: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: )
start with cpu
60000 60000 10000 10000
Epoch 1/5
1875/1875 [==============================] - 2s 836us/step - loss: 0.2982 - accuracy: 0.9135
Epoch 2/5
1875/1875 [==============================] - 1s 795us/step - loss: 0.1431 - accuracy: 0.9569
Epoch 3/5
1875/1875 [==============================] - 2s 806us/step - loss: 0.1059 - accuracy: 0.9679
Epoch 4/5
1875/1875 [==============================] - 2s 1ms/step - loss: 0.0884 - accuracy: 0.9732
Epoch 5/5
1875/1875 [==============================] - 2s 848us/step - loss: 0.0744 - accuracy: 0.9764
313/313 - 0s - loss: 0.0780 - accuracy: 0.9770 - 254ms/epoch - 810us/step
start with gpu
60000 60000 10000 10000
Epoch 1/5
2022-01-07 17:49:18.964441: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
1875/1875 [==============================] - 13s 7ms/step - loss: 0.2918 - accuracy: 0.9146
Epoch 2/5
1875/1875 [==============================] - 12s 6ms/step - loss: 0.1402 - accuracy: 0.9591
Epoch 3/5
1875/1875 [==============================] - 12s 6ms/step - loss: 0.1055 - accuracy: 0.9683
Epoch 4/5
1875/1875 [==============================] - 12s 6ms/step - loss: 0.0851 - accuracy: 0.9741
Epoch 5/5
1875/1875 [==============================] - 12s 6ms/step - loss: 0.0703 - accuracy: 0.9781
2022-01-07 17:50:19.926187: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
313/313 - 1s - loss: 0.0784 - accuracy: 0.9759 - 1s/epoch - 4ms/step
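The logs above give enough numbers for a rough back-of-the-envelope check. The sketch below assumes Keras's default `batch_size` of 32 (the demo never sets it), which matches the 1875 steps per epoch in the output, and it uses approximate per-step times read off the logs (about 0.8 ms on CPU, about 6 ms on GPU). One possible reading, which nothing in this post confirms, is that a fixed per-step cost dominates for such a small model, so fewer and larger batches might narrow the gap. The batch size of 1024 is purely illustrative.

```python
import math

TRAIN_SAMPLES = 60000  # size of the MNIST training set, as printed by the demo


def steps_per_epoch(batch_size: int) -> int:
    """Number of training steps per epoch for a given batch size."""
    return math.ceil(TRAIN_SAMPLES / batch_size)


def epoch_seconds(batch_size: int, ms_per_step: float) -> float:
    """Estimated epoch time if every step costs ms_per_step milliseconds."""
    return steps_per_epoch(batch_size) * ms_per_step / 1000.0


# Approximate per-step times read from the logs above (assumed constants).
CPU_MS_PER_STEP = 0.8
GPU_MS_PER_STEP = 6.0

if __name__ == "__main__":
    print("steps at batch 32:", steps_per_epoch(32))
    print("estimated CPU epoch (s):", epoch_seconds(32, CPU_MS_PER_STEP))
    print("estimated GPU epoch (s):", epoch_seconds(32, GPU_MS_PER_STEP))
    # Hypothetical larger batch: far fewer steps per epoch.
    print("steps at batch 1024:", steps_per_epoch(1024))
```

To test the idea, pass `batch_size=...` to `model.fit` and compare the epoch times on both devices. Whether the GPU catches up is something to measure, not assume.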
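Environment setup
If you are comfortable reading English, see the original reference: Tensorflow Plugin - Metal - Apple Developer
Make sure Python is version 3.8. If it is not, install it with brew.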
# Check the Python version
python3 -V
# If it is not 3.8, install it
brew install python@3.8
# Create a virtual environment
python3 -m venv ~/tensorflow-metal
source ~/tensorflow-metal/bin/activate
python -m pip install -U pip
# Install tensorflow-macos
SYSTEM_VERSION_COMPAT=0 python -m pip install tensorflow-macos
python -m pip install tensorflow-metal
# Now you can run the demo above
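Before running the demo, it can help to confirm that the activated virtual environment really is Python 3.8 and that TensorFlow can see a GPU. The following is a minimal sketch. The helper names `python_is_38` and `package_installed` are my own and not part of any library. Only `tf.config.list_physical_devices` comes from TensorFlow itself.

```python
import importlib.util
import sys


def python_is_38(version_info=sys.version_info) -> bool:
    """True when the interpreter is Python 3.8.x, the version this post targets."""
    return tuple(version_info[:2]) == (3, 8)


def package_installed(name: str) -> bool:
    """True when a top-level module can be found, without importing it."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    print("Python 3.8:", python_is_38())
    print("tensorflow importable:", package_installed("tensorflow"))
    if package_installed("tensorflow"):
        import tensorflow as tf
        # An empty list here would suggest that no GPU device was registered.
        print(tf.config.list_physical_devices("GPU"))
```

Run it with the `python` from the activated environment, so that it checks the same interpreter the demo will use.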
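Pitfall 1: I originally tried installing through anaconda, but it kept failing with the error below no matter what I did, and I wasted a lot of time on it.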
PSGraph adamUpdateWithLearningRateTensor:beta1Tensor:beta2Tensor:epsilonTensor:beta1PowerTensor:beta2PowerTensor:valuesTensor:momentumTensor:velocityTensor:maximumVelocityTensor:gradientTensor:name:]: unrecognized selector sent to instance
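If you are unsure whether the `python` on your PATH still comes from an anaconda environment, a quick heuristic check like the one below may help. It relies on the assumption that conda environments keep a `conda-meta` directory under their prefix. The function name `looks_like_conda` is hypothetical, and the check says nothing about what actually causes the error above.

```python
import os
import sys


def looks_like_conda(prefix: str = sys.prefix) -> bool:
    """Heuristic: conda environments keep a conda-meta directory under their prefix."""
    return os.path.isdir(os.path.join(prefix, "conda-meta"))


if __name__ == "__main__":
    print("interpreter prefix:", sys.prefix)
    print("looks like conda:", looks_like_conda())
```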
Python environment details
```bash
❯ pip list
Package Version
```
Original: https://blog.csdn.net/weixin_45919616/article/details/122369508
Author: 一条老萌新
Title: Installing a TensorFlow 2.7 environment on Mac, with AMD GPU support