Installing CUDA 12.8, cuDNN, Anaconda, and PyTorch in WSL2 Ubuntu and Verifying the Installation

Aadmin · 11 Mar

https://blog.csdn.net/u014451778/article/details/146075238

Aadmin · 11 Mar

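2025/03/11: reinstalling everything by following this post.
I don't need the earlier Windows-side installation steps, so I start directly from
"Part 2: Installing CUDA, cuDNN, and Anaconda in the WSL2 Ubuntu system"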

Aadmin · 11 Mar

1. Installing CUDA

cd /mnt/download
apt-get install build-essential
wget https://developer.download.nvidia.com/compute/cuda/12.8.1/local_installers/cuda_12.8.1_570.124.06_linux.run
sh cuda_12.8.1_570.124.06_linux.run

For my current environment, the file listed under
CUDA Toolkit 12.8 Update 1 Downloads
is
cuda_12.8.1_570.124.06_linux.run
and it is about 5 GB.
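With a download this large, it seemed worth checking free disk space first. The following is a minimal sketch; the helper name has_free_space and the 12 GiB threshold in the usage comment are my own assumptions (room for the installer plus the unpacked toolkit), not figures from NVIDIA.

```shell
# has_free_space DIR REQUIRED_KB
# Succeeds when the filesystem holding DIR has at least REQUIRED_KB
# kilobytes available. Uses POSIX-format df output (-P) so the
# "Available" value is always the 4th column of the 2nd line.
has_free_space() {
    avail_kb=$(df -Pk "$1" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge "$2" ]
}

# Example use (hypothetical threshold: 12 GiB = 12582912 KB):
#   has_free_space /mnt/download 12582912 || echo "not enough space"
```

If the download gets interrupted, wget -c with the same URL resumes the partial file instead of starting over.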

Running the sh command produced the following error:
root@study:/mnt/download# sh cuda_12.8.1_570.124.06_linux.run
Installation failed. See log at /var/log/cuda-installer.log for details.
Following
https://blog.csdn.net/wr1997/article/details/106909423
I disabled nouveau.
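For reference, the commonly documented way to disable nouveau on Ubuntu is a modprobe blacklist file. The sketch below assumes that approach; the linked post may differ in details, and the function name write_nouveau_blacklist and its optional directory argument are my own additions so it can be tried against a scratch directory first.

```shell
# write_nouveau_blacklist [DIR]
# Writes blacklist-nouveau.conf into DIR (default: /etc/modprobe.d).
# Writing to the default location requires root.
write_nouveau_blacklist() {
    conf_dir=${1:-/etc/modprobe.d}
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0' \
        > "$conf_dir/blacklist-nouveau.conf"
}

# After writing to /etc/modprobe.d as root, rebuild the initramfs and reboot:
#   update-initramfs -u
#   reboot
# Afterwards, "lsmod | grep nouveau" should print nothing.
```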

After disabling it and rebooting, the main display showed nothing at all, so the only way in was over ssh.

The installer failed again:
root@study:/mnt/download# sh cuda_12.8.1_570.124.06_linux.run
sh: 1: dkms: not found
Installation failed. See log at /var/log/cuda-installer.log for details.
No hesitation, install dkms first:
apt-get install dkms
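To avoid hitting missing tools one failed run at a time, a small pre-flight check can help. This is only a sketch: missing_tools is a hypothetical helper, and the tool list in the usage comment (gcc and make from build-essential, plus dkms) is my guess at what the runfile needs, based on the errors above.

```shell
# missing_tools TOOL...
# Prints the name of every TOOL that is not found on PATH, one per line.
# Prints nothing when all tools are present.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example use before launching the installer:
#   missing_tools gcc make dkms wget
```

Any name it prints can then be installed with apt-get before running the .run file.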

During installation I selected nvidia-fs, which produced the following error:
mofed is not installed

I ran the installer again without selecting nvidia-fs. The result:

root@study:/mnt/download# sh cuda_12.8.1_570.124.06_linux.run

= Summary =

Driver: Installed
Toolkit: Installed in /usr/local/cuda-12.8/

Please make sure that

PATH includes /usr/local/cuda-12.8/bin

LD_LIBRARY_PATH includes /usr/local/cuda-12.8/lib64, or, add /usr/local/cuda-12.8/lib64 to /etc/ld.so.conf and run ldconfig as root

To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-12.8/bin
To uninstall the NVIDIA Driver, run nvidia-uninstall
Logfile is /var/log/cuda-installer.log

Verify:

root@study:/mnt/download# nvidia-smi
Tue Mar 11 09:41:08 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.124.06 Driver Version: 570.124.06 CUDA Version: 12.8 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3060 Off | 00000000:01:00.0 Off | N/A |
| 30% 29C P0 33W / 170W | 1MiB / 12288MiB | 2% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| No running processes found |
+-----------------------------------------------------------------------------------------+

The driver appears to have installed successfully.
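For a shorter, scriptable version of the same check, nvidia-smi has a query mode that prints only selected fields. A minimal sketch; the wrapper name gpu_summary is my own, and the guard just avoids a confusing error when the driver is not installed.

```shell
# gpu_summary
# Prints one CSV line per GPU: name, driver version, total memory.
# Returns 1 when nvidia-smi is not on PATH.
gpu_summary() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found" >&2
        return 1
    fi
    nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv,noheader
}

# Example use:
#   gpu_summary
```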

Aadmin · 11 Mar

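The original post has a section,
"After installation, edit the environment variables",
which I felt was unnecessary.
At the moment, running
nvcc -V
shows: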

root@study:/mnt/download# nvcc -V
Command 'nvcc' not found, but can be installed with:
apt install nvidia-cuda-toolkit

Aadmin · 11 Mar

Look again at this part of the installer output from above:

Driver: Installed
Toolkit: Installed in /usr/local/cuda-12.8/

Please make sure that

PATH includes /usr/local/cuda-12.8/bin

LD_LIBRARY_PATH includes /usr/local/cuda-12.8/lib64, or, add /usr/local/cuda-12.8/lib64 to /etc/ld.so.conf and run ldconfig as root

To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-12.8/bin
To uninstall the NVIDIA Driver, run nvidia-uninstall
Logfile is /var/log/cuda-installer.log

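So the PATH definitions are needed after all.

Edit the ~/.bashrc file:
nano ~/.bashrc
Add the following:
export PATH=/usr/local/cuda-12.8/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-12.8/lib64:$LD_LIBRARY_PATH
Press Ctrl + X, then Y to confirm saving, and finally Enter to exit.
After saving the file, run the following command so the variables take effect:
source ~/.bashrc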
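Since I may end up repeating this install, here is a sketch of doing the same edit without nano and without adding duplicate lines on a second run. The helper name append_once is hypothetical; the two export lines are the ones the installer summary asks for.

```shell
# append_once FILE LINE
# Appends LINE to FILE only if FILE does not already contain that exact line.
# grep -F treats LINE as a fixed string, -x requires a whole-line match.
append_once() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Example use (single quotes keep $PATH unexpanded until .bashrc is sourced):
#   append_once ~/.bashrc 'export PATH=/usr/local/cuda-12.8/bin:$PATH'
#   append_once ~/.bashrc 'export LD_LIBRARY_PATH=/usr/local/cuda-12.8/lib64:$LD_LIBRARY_PATH'
#   . ~/.bashrc
```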

Now the result is:

root@study:/mnt/download# nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2025 NVIDIA Corporation
Built on Fri_Feb_21_20:23:50_PST_2025
Cuda compilation tools, release 12.8, V12.8.93
Build cuda_12.8.r12.8/compiler.35583870_0

At this point CUDA is installed and verified.
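If the toolkit version needs to be checked from a script, the release number can be pulled out of the nvcc -V output. A minimal sketch, assuming the "release X.Y," wording shown above stays the same; nvcc_release is a hypothetical helper name.

```shell
# nvcc_release
# Reads "nvcc -V" output on stdin and prints the release number,
# e.g. the X.Y from a line like "... release X.Y, VX.Y.Z".
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\),.*/\1/p'
}

# Example use:
#   nvcc -V | nvcc_release
```

Fed the output shown above, this prints 12.8.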

Aadmin · 11 Mar

Next I reached this step:
"Driver installer (recommended)"
I ran
apt-get install -y nvidia-open
and the result was:

root@study:/mnt/download# apt-get install -y nvidia-open
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
E: Unable to locate package nvidia-open

Following this post
https://blog.csdn.net/guilutian0541/article/details/119928323
I added the PPA mirror source.
It still didn't work.
Giving up!

Aadmin · 11 Mar

2. Installing cuDNN
The author of the original post only ran
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get -y install cudnn
these four commands,
but this page
https://developer.nvidia.com/cudnn-downloads?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=24.04&target_type=deb_network
lists one more command underneath:
sudo apt-get -y install cudnn-cuda-12

There was no need to run that last command; the unversioned cudnn package above had already installed it.
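To confirm which cuDNN version actually landed, the version macros in cudnn_version.h can be read directly. This is a sketch under assumptions: the helper name cudnn_version_from_header is mine, and the header location varies by package, so I look it up with find rather than hard-coding a path.

```shell
# cudnn_version_from_header FILE
# Prints MAJOR.MINOR.PATCH from the CUDNN_MAJOR, CUDNN_MINOR and
# CUDNN_PATCHLEVEL #define lines in a cudnn_version.h file.
cudnn_version_from_header() {
    awk '$1 == "#define" && $2 == "CUDNN_MAJOR"      {maj = $3}
         $1 == "#define" && $2 == "CUDNN_MINOR"      {min = $3}
         $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" {pat = $3}
         END {print maj "." min "." pat}' "$1"
}

# Example use (header path is looked up, not assumed):
#   hdr=$(find /usr/include -name cudnn_version.h | head -n 1)
#   cudnn_version_from_header "$hdr"
```

Running dpkg -l | grep cudnn is another quick way to see which cuDNN packages apt pulled in.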

Aadmin · 11 Mar

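3. Installing Anaconda
I ran the installation from
/mnt/download
following the original post,
and there were no errors.

Check the Conda version:
conda --version
(base) root@study:~# conda --version
conda 24.9.2

I'm skipping creating and activating a new environment for now and moving on.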

Aadmin · 11 Mar

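4. Installing PyTorch and verifying the CUDA 12.8, cuDNN, Anaconda, and PyTorch installation

CUDA 12.8 is not yet supported for installation via the conda command.
Use the official pip command to install the Preview (Nightly) build:
pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu128

Tsk.
It seems to be downloading quite a lot. I really should have run this from the download directory. Nothing to do but wait.

Last night I went to bed and shut the machine down without noticing.
Continuing the installation this morning.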
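Once the pip install finishes, something like the following should cover the verification part. It is a sketch, not what the original post ran: verify_torch is a hypothetical wrapper, and it only reports what PyTorch itself sees (its CUDA build, whether a GPU is usable, and the cuDNN version it loads).

```shell
# verify_torch
# Runs a short Python check in the current environment.
# Returns non-zero when python3 or torch is missing.
verify_torch() {
    if ! command -v python3 >/dev/null 2>&1; then
        echo "python3 not found" >&2
        return 1
    fi
    python3 - <<'EOF'
import sys
try:
    import torch
except ImportError:
    sys.exit("torch is not installed in this environment")
print("torch:", torch.__version__)
print("built for CUDA:", torch.version.cuda)
print("CUDA available:", torch.cuda.is_available())
print("cuDNN version:", torch.backends.cudnn.version())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
EOF
}

# Example use, from the same (base) conda environment pip installed into:
#   verify_torch
```

If "CUDA available" comes back False while nvidia-smi works, the first things I would check are whether the cu128 nightly wheel was really the one installed and whether the shell is in the right conda environment.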