CUDA error: no kernel image is available for execution on the device

2026-03-24 11:55 Status: processing

🚨 Error Message

```
torch.AcceleratorError: CUDA error: no kernel image is available for execution on the device
Search for `cudaErrorNoKernelImageForDevice' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information.
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

  File "E:\ComfyUI_windows_portable\ComfyUI\execution.py", line 525, in execute
    output_data, output_ui, has_subgraph, has_pending_tasks = await get_output_data(prompt_id, unique_id, obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)
  File "E:\ComfyUI_windows_portable\ComfyUI\execution.py", line 334, in get_output_data
    return_values = await _async_map_node_over_list(prompt_id, unique_id, obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)
  File "E:\ComfyUI_windows_portable\ComfyUI\execution.py", line 308, in _async_map_node_over_list
    await process_inputs(input_dict, i)
  File "E:\ComfyUI_windows_portable\ComfyUI\execution.py", line 296, in process_inputs
    result = f(**inputs)
  File "E:\ComfyUI_windows_portable\ComfyUI\nodes.py", line 80, in encode
    return (clip.encode_from_tokens_scheduled(tokens), )
  File "E:\ComfyUI_windows_portable\ComfyUI\comfy\sd.py", line 313, in encode_from_tokens_scheduled
    pooled_dict = self.encode_from_tokens(tokens, return_pooled=return_pooled, return_dict=True)
  File "E:\ComfyUI_windows_portable\ComfyUI\comfy\sd.py", line 377, in encode_from_tokens
    o = self.cond_stage_model.encode_token_weights(tokens)
  File "E:\ComfyUI_windows_portable\ComfyUI\comfy\sd1_clip.py", line 737, in encode_token_weights
    out = getattr(self, self.clip).encode_token_weights(token_weight_pairs)
  File "E:\ComfyUI_windows_portable\ComfyUI\comfy\sd1_clip.py", line 45, in encode_token_weights
    o = self.encode(to_encode)
  File "E:\ComfyUI_windows_portable\ComfyUI\comfy\sd1_clip.py", line 306, in encode
    return self(tokens)
  File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1776, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1787, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\ComfyUI_windows_portable\ComfyUI\comfy\sd1_clip.py", line 266, in forward
    embeds, attention_mask, num_tokens, embeds_info = self.process_tokens(tokens, device)
  File "E:\ComfyUI_windows_portable\ComfyUI\comfy\sd1_clip.py", line 213, in process_tokens
    tokens_embed = self.transformer.get_input_embeddings()(tokens_embed, out_dtype=torch.float32)
  File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1776, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1787, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\ComfyUI_windows_portable\ComfyUI\comfy\ops.py", line 633, in forward
    return self.forward_comfy_cast_weights(*args, **kwargs)
  File "E:\ComfyUI_windows_portable\ComfyUI\comfy\ops.py", line 625, in forward_comfy_cast_weights
    x = torch.nn.functional.embedding(input, weight, self.padding_idx, self.max_norm, self.norm_type, self.scale_grad_by_freq, self.sparse).to(dtype=output_dtype)
  File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\functional.py", line 2567, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
```

🤖 AI Solution

Error Analysis Report

1. What the Error Means

Error: torch.AcceleratorError: CUDA error: no kernel image is available for execution on the device
In plain terms: the installed PyTorch build contains no compiled CUDA kernel image that can run on this GPU.

2. Root Cause Analysis

  • Root cause: the CUDA kernel images precompiled into this PyTorch build do not match the GPU's compute capability (the GPU is too old or too new for the build, or the CUDA version is incompatible).
  • Exact location: the failure is triggered by a CUDA kernel launch inside torch.nn.functional.embedding, for which the device has no matching executable image (the last frame in the stack is torch.embedding).
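The mismatch can be confirmed directly. This is a diagnostic sketch (it assumes only that PyTorch is importable) comparing the GPU's compute capability against the kernel architectures the installed build was compiled for:

```python
# If the GPU's "sm_XY" tag is missing from the build's arch list,
# this install cannot launch kernels on that GPU -- exactly the
# "no kernel image is available" condition.
import torch

print("PyTorch:", torch.__version__, "| CUDA build:", torch.version.cuda)
arch_list = torch.cuda.get_arch_list()  # e.g. ['sm_70', 'sm_80', ...]
print("Compiled for:", arch_list)

if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability(0)
    tag = f"sm_{major}{minor}"
    print("GPU compute capability:", tag)
    if tag not in arch_list:
        print("Mismatch: this build ships no kernels for this GPU.")
else:
    print("No CUDA device visible to PyTorch.")
```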
3. Solutions


#### Solution 1: Force CPU execution (quick sanity check)

Add the --cpu flag to the ComfyUI launch command (Windows portable example; if the .bat does not forward arguments, edit it so main.py receives --cpu):

```bash
E:\ComfyUI_windows_portable\run_nvidia_gpu.bat --cpu
```

#### Solution 2: Select the CPU device in code (if you are editing the source)

Note that torch.device("cpu") on its own only creates a device object; nothing moves until it is passed to .to(). Add something like the following where the model is set up (e.g. ComfyUI's nodes.py or the launch script):

```python
import torch

device = torch.device("cpu")  # force CPU
model = model.to(device)      # the model must be moved explicitly
```

Input tensors must be moved the same way, e.g. tokens = tokens.to(device).
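Rather than hard-coding CPU, a fallback pattern can probe the GPU with a real kernel launch and drop to CPU on failure. This is a sketch, not ComfyUI's own device logic; it is useful because torch.cuda.is_available() can return True even when no kernel image matches the device:

```python
import torch

def pick_device() -> torch.device:
    """Return a CUDA device only if a real kernel launch succeeds."""
    if torch.cuda.is_available():
        try:
            # An actual kernel launch; raises on an arch mismatch,
            # unlike is_available(), which only checks the driver.
            # (torch.AcceleratorError subclasses RuntimeError.)
            _ = torch.zeros(1, device="cuda") + 1
            return torch.device("cuda")
        except RuntimeError:
            pass
    return torch.device("cpu")

device = pick_device()
print("Using:", device)  # then: model = model.to(device)
```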


#### Solution 3: Install a PyTorch build that matches the GPU (permanent fix)

  • Find the GPU's compute capability: run torch.cuda.get_device_capability() from an existing install, or look the card up in the [NVIDIA compute capability table](https://developer.nvidia.com/cuda-gpus). (Note: the "CUDA Version" shown by nvidia-smi is the driver's maximum supported CUDA version, not the compute capability.)
  • Install a matching build via the commands from the [PyTorch version selector](https://pytorch.org/get-started/locally/), for example a CUDA 11.7 build:

```bash
pip install torch==2.0.1+cu117 torchvision==0.15.2+cu117 --extra-index-url https://download.pytorch.org/whl/cu117
```
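After reinstalling, the fix can be verified with a small end-to-end operation on the GPU. A sketch (the matmul will raise the same "no kernel image" error if the build still does not match the device):

```python
import torch

print(torch.__version__, "CUDA build:", torch.version.cuda)
if torch.cuda.is_available():
    x = torch.randn(4, 4, device="cuda")
    y = (x @ x).sum()  # real kernel launch; raises on a mismatch
    print("GPU OK:", y.item())
else:
    print("CUDA not available; check the driver and the install.")
```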

4. Prevention

  • Check first: before installing PyTorch, confirm the GPU's compute capability (via torch.cuda.get_device_capability() from an existing install, or the NVIDIA table) and make sure the chosen build supports it.
  • Pin versions: record the environment with pip freeze > requirements.txt so automatic upgrades cannot introduce an incompatible build.
  • Recommended tools: nvidia-smi (GPU and driver status) and the [PyTorch version selector](https://pytorch.org/get-started/locally/) (generates the exact install command).

---
Note: for further debugging, set CUDA_LAUNCH_BLOCKING=1 as the error message suggests (set CUDA_LAUNCH_BLOCKING=1 in a Windows cmd window, or export CUDA_LAUNCH_BLOCKING=1 on Linux) to get an accurate stack trace.
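CUDA_LAUNCH_BLOCKING must be in the environment before the CUDA context is created, so the safest place to set it from Python is before torch is imported. A sketch for ad-hoc debugging (launching from a shell with the variable already set works the same way):

```python
# With blocking launches, the Python stack trace points at the kernel
# that actually failed instead of a later, unrelated API call.
import os
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

import torch  # imported only after the variable is set
# ... now run the failing workflow; the reported stack is accurate
```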