
GPU Configuration

Background

  • GPU: RTX 4060 Ti
  • OS: Linux (Ubuntu 22.04)
  • Already installed: NVIDIA driver ✅
nvidia-smi
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.133.07             Driver Version: 570.133.07     CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4060 Ti     Off |   00000000:01:00.0  On |                  N/A |
|  0%   34C    P8              3W /  165W |      34MiB /  16380MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

✅ What this shows:

  • The driver is working
  • The GPU is detected
  • The driver supports CUDA up to version 12.8
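The "CUDA Version" field in the nvidia-smi header is the highest CUDA version the installed driver supports; it does not mean a CUDA Toolkit is installed. As a minimal sketch, the two version numbers can be pulled out of that header with a regular expression. The function name `parse_smi_header` is hypothetical, and the sample line is copied from the output above.

```python
import re

# Header line copied from the nvidia-smi output above.
SAMPLE_HEADER = "| NVIDIA-SMI 570.133.07             Driver Version: 570.133.07     CUDA Version: 12.8     |"

def parse_smi_header(text):
    """Return (driver_version, max_cuda_version) found in nvidia-smi output, or None."""
    match = re.search(r"Driver Version:\s*([\d.]+)\s+CUDA Version:\s*([\d.]+)", text)
    if match is None:
        return None
    return match.group(1), match.group(2)

print(parse_smi_header(SAMPLE_HEADER))  # prints ('570.133.07', '12.8')
```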

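Core Concepts & Key Takeaways

Component | Role | Recommendation
NVIDIA Driver | Hardware driver for the GPU (base dependency) | Required; no need to reinstall if a working driver is already present
CUDA Toolkit | Development toolchain (nvcc, CUDA libraries) | Install only when you need to develop or compile CUDA programs; otherwise it can usually be skipped
PyTorch (cuXXX) | Bundles its own CUDA runtime | Works out of the box; no separate CUDA Toolkit required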
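To see which of the three layers in the table are present on a machine, a rough check is to look for their entry points. The sketch below assumes the driver is detectable by `nvidia-smi` being on PATH, the toolkit by `nvcc` being on PATH, and PyTorch by being importable. The function name `detect_components` is hypothetical, and a missing `nvcc` may only mean that PATH has not been configured yet.

```python
import importlib.util
import shutil

def detect_components():
    """Report which of the three layers from the table are visible on this machine."""
    return {
        "driver (nvidia-smi on PATH)": shutil.which("nvidia-smi") is not None,
        "toolkit (nvcc on PATH)": shutil.which("nvcc") is not None,
        "pytorch (importable)": importlib.util.find_spec("torch") is not None,
    }

for name, present in detect_components().items():
    print(f"{name}: {'yes' if present else 'no'}")
```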


Installing the CUDA Toolkit

To match the CUDA version that the PyTorch build depends on, CUDA Toolkit 12.8 is recommended.

wget https://developer.download.nvidia.com/compute/cuda/12.8.1/local_installers/cuda_12.8.1_570.124.06_linux.run
sudo sh cuda_12.8.1_570.124.06_linux.run

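Note

An up-to-date driver (570.133.07) is already installed and working, so there is no need to uninstall it.

✅ Correct action: when the installer shows the following prompt, use the arrow keys to move to Continue, then press Enter.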

Existing package manager installation of the driver found. It is strongly
recommended that you remove this before continuing.

Abort
Continue

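Continuing leaves the existing driver in place. Once the Driver component is also deselected on the next screen, the installer skips the driver and installs only the CUDA Toolkit itself (the nvcc compiler and libraries).

❗ Key point: a driver is already installed, so the Driver component, which is selected by default, must be deselected!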

CUDA Installer
- [ ] Driver
    [ ] 570.124.06
+ [X] CUDA Toolkit 12.8
  [X] CUDA Demo Suite 12.8
  [X] CUDA Documentation 12.8
- [ ] Kernel Objects
    [ ] nvidia-fs
  Options
  Install

Component | Recommended selection
Driver | ❌ Deselect
CUDA Toolkit 12.8 | ✅ Keep selected
Demo Suite / Docs | ✅ Optional
Kernel Objects | ❌ Leave unselected

Installation complete

===========
= Summary =
===========

Driver:   Not Selected
Toolkit:  Installed in /usr/local/cuda-12.8/

Please make sure that
-   PATH includes /usr/local/cuda-12.8/bin
-   LD_LIBRARY_PATH includes /usr/local/cuda-12.8/lib64, or, add /usr/local/cuda-12.8/lib64 to /etc/ld.so.conf and run ldconfig as root

To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-12.8/bin
***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 570.00 is required for CUDA 12.8 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
    sudo <CudaInstaller>.run --silent --driver

Logfile is /var/log/cuda-installer.log
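The "Incomplete installation" warning is expected when the Driver component is deselected. What matters is that the driver already on the system meets the stated minimum of 570.00. The following sketch compares the two version strings numerically rather than as text; the helper names are hypothetical, and the values come from the nvidia-smi output and the installer warning above.

```python
def version_tuple(version):
    """Turn '570.133.07' into (570, 133, 7) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))

def driver_is_sufficient(installed, required_minimum):
    """True when the installed driver version is at least the required minimum."""
    return version_tuple(installed) >= version_tuple(required_minimum)

# Installed driver from nvidia-smi, minimum from the installer warning.
print(driver_is_sufficient("570.133.07", "570.00"))  # prints True
```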

Configure environment variables

echo 'export PATH=/usr/local/cuda-12.8/bin:$PATH' >> ~/.zshrc
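# ${VAR:+:${VAR}} appends the old value only if it is set, avoiding a trailing ':' (which would add the current directory to the search path)
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-12.8/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}' >> ~/.zshrc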
source ~/.zshrc
nvcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2025 NVIDIA Corporation
Built on Fri_Feb_21_20:23:50_PST_2025
Cuda compilation tools, release 12.8, V12.8.93
Build cuda_12.8.r12.8/compiler.35583870_0
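To confirm from a script that the toolkit on PATH is the expected release, the "release" line of the nvcc output can be parsed. This is a sketch only: `parse_nvcc_release` is a hypothetical helper, and the sample text is copied from the output above instead of being read from a live command.

```python
import re

# Last lines of the nvcc --version output shown above.
SAMPLE_NVCC_OUTPUT = """Cuda compilation tools, release 12.8, V12.8.93
Build cuda_12.8.r12.8/compiler.35583870_0"""

def parse_nvcc_release(text):
    """Return (release, full_version), e.g. ('12.8', '12.8.93'), or None if not found."""
    match = re.search(r"release\s+([\d.]+),\s*V([\d.]+)", text)
    return (match.group(1), match.group(2)) if match else None

print(parse_nvcc_release(SAMPLE_NVCC_OUTPUT))  # prints ('12.8', '12.8.93')
```

On a real system, the text passed in could come from `subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout`.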

Installing PyTorch

Install PyTorch with pip from the cu128 wheel index. The command does not pin a version; here it resolved to 2.7.1.

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128

Verify that the GPU is available:

python3 -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available()); print(torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'CUDA not available')"
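The one-liner above is compact but hard to read and fails with a traceback if PyTorch is not installed. A more readable variant, offered as a sketch, collects the same fields into a dictionary. The function name `gpu_report` and the dictionary keys are hypothetical; the `torch` calls are the same ones the one-liner uses.

```python
def gpu_report():
    """Collect the same fields as the one-liner, tolerating a missing PyTorch install."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False, "cuda_available": False}

    available = torch.cuda.is_available()
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        "cuda_runtime": torch.version.cuda,  # None on CPU-only builds
        "cuda_available": available,
        "device_name": torch.cuda.get_device_name(0) if available else None,
    }

for key, value in gpu_report().items():
    print(f"{key}: {value}")
```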

Final state

Item | Status
PyTorch version | 2.7.1+cu128
CUDA runtime | 12.8
GPU available | ✅ Yes
GPU device | NVIDIA GeForce RTX 4060 Ti
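The `+cu128` suffix in the PyTorch version string identifies the CUDA runtime the wheel was built against. The sketch below splits such a string into the release and a dotted CUDA version. The helper name is hypothetical, and it assumes the last digit of the tag is the minor version (cu128 means 12.8, cu118 means 11.8).

```python
def split_torch_version(version):
    """Split '2.7.1+cu128' into ('2.7.1', '12.8'); the CUDA part is None without a cu tag."""
    release, _, local = version.partition("+")
    digits = local[2:] if local.startswith("cu") else ""
    if len(digits) < 2 or not digits.isdigit():
        return release, None
    # Assumption: the last digit is the minor version, e.g. cu128 -> 12.8.
    return release, f"{digits[:-1]}.{digits[-1]}"

print(split_torch_version("2.7.1+cu128"))  # prints ('2.7.1', '12.8')
```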

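The NVIDIA driver, CUDA Toolkit, and PyTorch versions are now consistent, so the GPU can be used for accelerated training and inference, and for compiling CUDA-dependent libraries such as FAISS.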