CUDA error: no kernel image is available for execution on the device

2026-03-24 11:55 Status: processing

🚨 Error Message

```
torch.AcceleratorError: CUDA error: no kernel image is available for execution on the device
Search for `cudaErrorNoKernelImageForDevice' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information.
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

  File "E:\ComfyUI_windows_portable\ComfyUI\execution.py", line 525, in execute
    output_data, output_ui, has_subgraph, has_pending_tasks = await get_output_data(prompt_id, unique_id, obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)
  File "E:\ComfyUI_windows_portable\ComfyUI\execution.py", line 334, in get_output_data
    return_values = await _async_map_node_over_list(prompt_id, unique_id, obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)
  File "E:\ComfyUI_windows_portable\ComfyUI\execution.py", line 308, in _async_map_node_over_list
    await process_inputs(input_dict, i)
  File "E:\ComfyUI_windows_portable\ComfyUI\execution.py", line 296, in process_inputs
    result = f(**inputs)
  File "E:\ComfyUI_windows_portable\ComfyUI\nodes.py", line 80, in encode
    return (clip.encode_from_tokens_scheduled(tokens), )
  File "E:\ComfyUI_windows_portable\ComfyUI\comfy\sd.py", line 313, in encode_from_tokens_scheduled
    pooled_dict = self.encode_from_tokens(tokens, return_pooled=return_pooled, return_dict=True)
  File "E:\ComfyUI_windows_portable\ComfyUI\comfy\sd.py", line 377, in encode_from_tokens
    o = self.cond_stage_model.encode_token_weights(tokens)
  File "E:\ComfyUI_windows_portable\ComfyUI\comfy\sd1_clip.py", line 737, in encode_token_weights
    out = getattr(self, self.clip).encode_token_weights(token_weight_pairs)
  File "E:\ComfyUI_windows_portable\ComfyUI\comfy\sd1_clip.py", line 45, in encode_token_weights
    o = self.encode(to_encode)
  File "E:\ComfyUI_windows_portable\ComfyUI\comfy\sd1_clip.py", line 306, in encode
    return self(tokens)
  File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1776, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1787, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\ComfyUI_windows_portable\ComfyUI\comfy\sd1_clip.py", line 266, in forward
    embeds, attention_mask, num_tokens, embeds_info = self.process_tokens(tokens, device)
  File "E:\ComfyUI_windows_portable\ComfyUI\comfy\sd1_clip.py", line 213, in process_tokens
    tokens_embed = self.transformer.get_input_embeddings()(tokens_embed, out_dtype=torch.float32)
  File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1776, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1787, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\ComfyUI_windows_portable\ComfyUI\comfy\ops.py", line 633, in forward
    return self.forward_comfy_cast_weights(*args, **kwargs)
  File "E:\ComfyUI_windows_portable\ComfyUI\comfy\ops.py", line 625, in forward_comfy_cast_weights
    x = torch.nn.functional.embedding(input, weight, self.padding_idx, self.max_norm, self.norm_type, self.scale_grad_by_freq, self.sparse).to(dtype=output_dtype)
  File "E:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\functional.py", line 2567, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
```

🤖 AI Solution

Error Analysis Report

1. What the Error Means

Error: torch.AcceleratorError: CUDA error: no kernel image is available for execution on the device
In plain terms: the installed PyTorch build contains no compiled CUDA kernel image that can run on this GPU.

2. Root Cause Analysis

  • Root cause: the CUDA kernel images precompiled into this PyTorch build do not match the GPU's compute capability (the GPU is too old or too new for the build, or the CUDA version is incompatible).
  • Exact location: the failure is triggered by a CUDA kernel launch inside torch.nn.functional.embedding, for which the device has no matching executable image (the last frame in the stack is torch.embedding).
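The mismatch can be confirmed directly. This is a diagnostic sketch (it assumes only that PyTorch is importable) comparing the GPU's compute capability against the kernel architectures the installed build was compiled for:

```python
# If the GPU's "sm_XY" tag is missing from the build's arch list,
# this install cannot launch kernels on that GPU -- exactly the
# "no kernel image is available" condition.
import torch

print("PyTorch:", torch.__version__, "| CUDA build:", torch.version.cuda)
arch_list = torch.cuda.get_arch_list()  # e.g. ['sm_70', 'sm_80', ...]
print("Compiled for:", arch_list)

if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability(0)
    tag = f"sm_{major}{minor}"
    print("GPU compute capability:", tag)
    if tag not in arch_list:
        print("Mismatch: this build ships no kernels for this GPU.")
else:
    print("No CUDA device visible to PyTorch.")
```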
3. Solutions


#### Solution 1: Force CPU execution (quick sanity check)

Add the --cpu flag to the ComfyUI launch command (Windows portable example; if the .bat does not forward arguments, edit it so main.py receives --cpu):

```bash
E:\ComfyUI_windows_portable\run_nvidia_gpu.bat --cpu
```

#### Solution 2: Select the CPU device in code (if you are editing the source)

Note that torch.device("cpu") on its own only creates a device object; nothing moves until it is passed to .to(). Add something like the following where the model is set up (e.g. ComfyUI's nodes.py or the launch script):

```python
import torch

device = torch.device("cpu")  # force CPU
model = model.to(device)      # the model must be moved explicitly
```

Input tensors must be moved the same way, e.g. tokens = tokens.to(device).
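Rather than hard-coding CPU, a fallback pattern can probe the GPU with a real kernel launch and drop to CPU on failure. This is a sketch, not ComfyUI's own device logic; it is useful because torch.cuda.is_available() can return True even when no kernel image matches the device:

```python
import torch

def pick_device() -> torch.device:
    """Return a CUDA device only if a real kernel launch succeeds."""
    if torch.cuda.is_available():
        try:
            # An actual kernel launch; raises on an arch mismatch,
            # unlike is_available(), which only checks the driver.
            # (torch.AcceleratorError subclasses RuntimeError.)
            _ = torch.zeros(1, device="cuda") + 1
            return torch.device("cuda")
        except RuntimeError:
            pass
    return torch.device("cpu")

device = pick_device()
print("Using:", device)  # then: model = model.to(device)
```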


#### Solution 3: Install a PyTorch build that matches the GPU (permanent fix)

  • Find the GPU's compute capability: run torch.cuda.get_device_capability() from an existing install, or look the card up in the [NVIDIA compute capability table](https://developer.nvidia.com/cuda-gpus). (Note: the "CUDA Version" shown by nvidia-smi is the driver's maximum supported CUDA version, not the compute capability.)
  • Install a matching build via the commands from the [PyTorch version selector](https://pytorch.org/get-started/locally/), for example a CUDA 11.7 build:

```bash
pip install torch==2.0.1+cu117 torchvision==0.15.2+cu117 --extra-index-url https://download.pytorch.org/whl/cu117
```
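After reinstalling, the fix can be verified with a small end-to-end operation on the GPU. A sketch (the matmul will raise the same "no kernel image" error if the build still does not match the device):

```python
import torch

print(torch.__version__, "CUDA build:", torch.version.cuda)
if torch.cuda.is_available():
    x = torch.randn(4, 4, device="cuda")
    y = (x @ x).sum()  # real kernel launch; raises on a mismatch
    print("GPU OK:", y.item())
else:
    print("CUDA not available; check the driver and the install.")
```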

4. Prevention

  • Check first: before installing PyTorch, confirm the GPU's compute capability (via torch.cuda.get_device_capability() from an existing install, or the NVIDIA table) and make sure the chosen build supports it.
  • Pin versions: record the environment with pip freeze > requirements.txt so automatic upgrades cannot introduce an incompatible build.
  • Recommended tools: nvidia-smi (GPU and driver status) and the [PyTorch version selector](https://pytorch.org/get-started/locally/) (generates the exact install command).

---
Note: for further debugging, set CUDA_LAUNCH_BLOCKING=1 as the error message suggests (set CUDA_LAUNCH_BLOCKING=1 in a Windows cmd window, or export CUDA_LAUNCH_BLOCKING=1 on Linux) to get an accurate stack trace.
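CUDA_LAUNCH_BLOCKING must be in the environment before the CUDA context is created, so the safest place to set it from Python is before torch is imported. A sketch for ad-hoc debugging (launching from a shell with the variable already set works the same way):

```python
# With blocking launches, the Python stack trace points at the kernel
# that actually failed instead of a later, unrelated API call.
import os
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

import torch  # imported only after the variable is set
# ... now run the failing workflow; the reported stack is accurate
```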