nvidia-smi Failed to initialize NVML: Driver/library version mismatch NVML library version: 535.247

Alternative title: "Help I upgraded my Debian/Ubuntu Linux server and now my expensive GPU does not work, at all, why is that? Also I don't want to reboot. How can I fix this without rebooting?"

Faced with the following error:

$ nvidia-smi

Failed to initialize NVML: Driver/library version mismatch
NVML library version: 535.247

Short answer: during your routine installation of updates/patches (apt-get upgrade etc.) you may have inadvertently upgraded your NVIDIA kernel module drivers in the process. The kernel module that is still loaded is now older than the one on disk (and older than the NVML user-space library), and nvidia-smi is warning you about the mismatch to protect you.
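If you want to confirm the mismatch for yourself, you can compare the version of the kernel module that is currently loaded against the version of the module now installed on disk. A quick sanity check, assuming the stale module is still loaded:

# Version of the nvidia kernel module currently loaded
cat /proc/driver/nvidia/version

# Version of the nvidia kernel module now installed on disk
modinfo nvidia | grep ^version

If the two versions differ, you've confirmed the diagnosis.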

How do I fix it?

Verify your hardware is actually OK using lspci:

lspci | grep -i nvidia

If the card shows up in the output (e.g. a line starting 01:00.0...), the hardware is alive but the software isn't talking to it.
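The exact device string depends on your card; on a box with a Tesla V100 it might look roughly like this (illustrative output only, the bus address and revision will differ on your machine):

82:00.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 PCIe 16GB] (rev a1)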

You could simply reboot, after which the newer nvidia kernel modules will be loaded automatically. If you can't (or don't want to) reboot right now, you still have an option, described below.
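Before unloading anything, it's worth checking which nvidia modules are actually loaded and their use counts, since the exact set varies between driver versions and between headless and desktop installs:

lsmod | grep nvidia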

First attempt to unload the nvidia kernel modules:

sudo rmmod nvidia_uvm
sudo rmmod nvidia_drm
sudo rmmod nvidia_modeset
sudo rmmod nvidia

You'll likely see errors such as rmmod: ERROR: Module nvidia_uvm is in use. To diagnose which processes are currently using the various modules, run lsof /dev/nvidia* and, after due consideration, kill the processes that come back. By definition this article assumes you can't afford to reboot the server, but that you can afford to, and are willing to, terminate any and all processes currently using the GPU.
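As a rough sketch (the PIDs and process names will obviously differ on your machine, and <PID> below is just a placeholder):

# List processes holding the GPU device files open
sudo lsof /dev/nvidia*

# Review the list, then terminate each offending process by PID
sudo kill <PID>

If lsof isn't installed, sudo fuser -v /dev/nvidia* gives similar information.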

With those processes killed, you'll be able to rmmod the modules:

sudo rmmod nvidia_uvm
sudo rmmod nvidia_drm
sudo rmmod nvidia_modeset
sudo rmmod nvidia
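If everything unloaded cleanly, listing the loaded modules again should now come back empty:

lsmod | grep nvidia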

Then finally, load the nvidia kernel module back in, which this time will be the newer (updated) kernel module:

sudo modprobe nvidia
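Depending on what was loaded before, you may also want to bring the companion modules back explicitly. On many systems nvidia_uvm is loaded on demand when a CUDA application starts, but reloading the modules manually does no harm (skip nvidia_modeset/nvidia_drm on a headless box if they weren't loaded before):

sudo modprobe nvidia_uvm
sudo modprobe nvidia_modeset
sudo modprobe nvidia_drm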

...and verify your GPU is happy and running again with nvidia-smi, e.g.:

nvidia-smi
Fri Jan 16 20:01:06 2026       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.247.01             Driver Version: 535.247.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla V100-PCIE-16GB           Off | 00000000:82:00.0 Off |                    0 |
| N/A   35C    P0              36W / 250W |      0MiB / 16384MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+