path: root/modules/devices.py
Age         Commit message                                              Author
2024-01-09  rearrange if-statements for cpu  (Kohaku-Blueleaf)
2024-01-09  Apply the correct behavior of precision='full'  (Kohaku-Blueleaf)
2024-01-09  Revert "Apply correct inference precision implementation"  (Kohaku-Blueleaf)
This reverts commit e00365962b17550a42235d1fbe2ad2c7cc4b8961.
2024-01-09  Apply correct inference precision implementation  (Kohaku-Blueleaf)
2024-01-09  linting and debugs  (Kohaku-Blueleaf)
2024-01-09  Fix bugs when arg dtype doesn't match  (KohakuBlueleaf)
2024-01-09  improve efficiency and support more devices  (Kohaku-Blueleaf)
2023-12-31  change import statements for #14478  (AUTOMATIC1111)
2023-12-31  Add utility to inspect a model's parameters (to get dtype/device)  (Aarni Koskela)
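A minimal sketch of such a utility; the helper name and the lack of error handling here are assumptions, not the commit's exact code:

    import torch

    def get_param(model: torch.nn.Module) -> torch.nn.Parameter:
        """Return the first parameter of a module, so its dtype and
        device can be read without assuming any attribute names."""
        return next(model.parameters())

    # usage: p = get_param(model); print(p.dtype, p.device)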
2023-12-03  Merge branch 'dev' into test-fp8  (Kohaku-Blueleaf)
2023-12-02  Merge pull request #14171 from Nuullll/ipex  (AUTOMATIC1111)
Initial IPEX support for Intel Arc GPU
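A hedged sketch of how XPU availability can be probed once IPEX is installed; the flag and function names are illustrative, not the actual code:

    import torch

    try:
        # importing IPEX registers the "xpu" backend with PyTorch
        import intel_extension_for_pytorch as ipex  # noqa: F401
        has_xpu = hasattr(torch, "xpu") and torch.xpu.is_available()
    except ImportError:
        has_xpu = False

    def get_xpu_device_string() -> str:
        return "xpu" if has_xpu else "cpu"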
2023-12-02  Merge branch 'dev' into test-fp8  (Kohaku-Blueleaf)
2023-12-02  Merge pull request #14131 from read-0nly/patch-1  (AUTOMATIC1111)
Update devices.py - Make 'use-cpu all' actually apply to 'all'
2023-12-02  Disable ipex autocast due to its bad perf  (Nuullll)
2023-11-30  Initial IPEX support  (Nuullll)
2023-11-27  Update devices.py  (obsol)
Fixes an issue where "--use-cpu all" correctly makes SD run on the CPU but leaves ControlNet (and presumably other extensions) pointed at the GPU, causing a crash in ControlNet due to a device mismatch between SD and CN. https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/14097
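A minimal sketch of the kind of lookup this implies; the real function reads the parsed command-line options, so this simplified signature is an assumption:

    import torch

    cpu = torch.device("cpu")

    def get_device_for(task: str, use_cpu: set, default: torch.device) -> torch.device:
        # "all" must force every task onto the CPU, not only the tasks
        # that were listed explicitly on the command line
        if task in use_cpu or "all" in use_cpu:
            return cpu
        return default

    # get_device_for("controlnet", {"all"}, torch.device("cuda")) -> cpu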
2023-11-19  Better naming  (Kohaku-Blueleaf)
2023-11-19  Use options instead of cmd_args  (Kohaku-Blueleaf)
2023-10-28  Add MPS manual cast  (KohakuBlueleaf)
2023-10-28  ManualCast for 10/16 series gpu  (Kohaku-Blueleaf)
2023-10-24  Add CPU fp8 support  (Kohaku-Blueleaf)
Since norm layers need fp32, I only convert the linear operation layers (conv2d/linear). The TE also has some PyTorch functions that don't support bf16 amp on the CPU, so I add a condition to indicate whether the autocast is for the UNet.
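A hedged sketch of the selective conversion described here, assuming PyTorch 2.1+ for the float8 dtype; the real manual-cast machinery also upcasts these weights back at compute time, which is omitted:

    import torch

    def convert_linear_ops_to_fp8(model: torch.nn.Module) -> None:
        # norm layers stay in higher precision; only the "linear
        # operation" layers have their weights stored as fp8
        for module in model.modules():
            if isinstance(module, (torch.nn.Linear, torch.nn.Conv2d)):
                module.to(torch.float8_e4m3fn)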
2023-09-09  fix for crash when running #12924 without --device-id  (AUTOMATIC1111)
2023-08-31  More accurate check for enabling cuDNN benchmark on 16XX cards  (catboxanon)
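A sketch of what a more accurate check might look like; the exact strings and conditions in the commit are not reproduced here, so treat the name match as an assumption:

    import torch

    def enable_benchmark_for_gtx_16xx(device: torch.device) -> None:
        if not torch.cuda.is_available():
            return
        # GTX 16XX cards report compute capability (7, 5), but so do
        # RTX 20XX cards, so the device name must be checked as well
        if (torch.cuda.get_device_capability(device) == (7, 5)
                and torch.cuda.get_device_name(device).startswith("NVIDIA GeForce GTX 16")):
            torch.backends.cudnn.benchmark = True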
2023-08-09  split shared.py into multiple files; should resolve all circular reference import errors related to shared.py  (AUTOMATIC1111)
2023-08-09  rework RNG to use generators instead of generating noises beforehand  (AUTOMATIC1111)
2023-08-03  rework torchsde._brownian.brownian_interval replacement to use devices.randn_local and respect the NV setting  (AUTOMATIC1111)
2023-08-03  add NV option for Random number generator source setting, which allows generating the same pictures on CPU/AMD/Mac as on NVIDIA video cards  (AUTOMATIC1111)
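A minimal sketch of a randn_local-style helper, under the assumption that the RNG source is an ordinary torch device; the actual NV option instead emulates NVIDIA's CUDA RNG on other hardware, which is beyond this sketch:

    import torch

    def randn_local(seed: int, shape, source: str = "cpu", target: str = "cuda"):
        # generate on the RNG source device, so a given seed produces the
        # same noise no matter which device runs the model afterwards
        generator = torch.Generator(device=source).manual_seed(int(seed))
        return torch.randn(shape, generator=generator, device=source).to(target)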
2023-07-11  Fix MPS cache cleanup  (Aarni Koskela)
Importing torch does not import torch.mps, so the call failed.
2023-07-08  added torch.mps.empty_cache() to torch_gc()  (AUTOMATIC1111)
changed a bunch of places that use torch.cuda.empty_cache() to use torch_gc() instead
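A hedged sketch of a torch_gc() with both cache cleanups; the explicit torch.mps import reflects the fix above, and the exact set of calls in the real function may differ:

    import gc
    import torch

    def torch_gc():
        gc.collect()  # drop Python-level references first
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
            torch.cuda.ipc_collect()
        if getattr(torch.backends, "mps", None) and torch.backends.mps.is_available():
            import torch.mps  # "import torch" alone does not pull this in
            torch.mps.empty_cache()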
2023-06-05  Remove a bunch of unused/vestigial code  (Aarni Koskela)
As found by Vulture and some eyes
2023-05-21  run basic torch calculation at startup in parallel to reduce the performance impact of first generation  (AUTOMATIC)
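A sketch of this warm-up pattern; the layer sizes and the use of a daemon thread are illustrative assumptions:

    import threading
    import torch

    def first_time_calculation(device="cpu", dtype=torch.float32):
        # a tiny Linear and Conv2d are enough to trigger lazy kernel and
        # library initialization before the first real generation
        linear = torch.nn.Linear(1, 1).to(device, dtype)
        linear(torch.zeros(1, 1).to(device, dtype))
        conv2d = torch.nn.Conv2d(1, 1, (3, 3)).to(device, dtype)
        conv2d(torch.zeros(1, 1, 3, 3).to(device, dtype))

    # run in the background so startup itself is not blocked
    threading.Thread(target=first_time_calculation, daemon=True).start()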
2023-05-10  ruff auto fixes  (AUTOMATIC)
2023-04-29  rename CPU RNG to RNG source in settings, add infotext and parameters copypaste support to RNG source  (AUTOMATIC)
2023-04-18  Option to use CPU for random number generation.  (Deciare)
Makes a given manual seed generate the same images across different platforms, independently of the GPU architecture in use. Fixes #9613.
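A minimal sketch of the option-driven choice, assuming a plain string setting; generating on the CPU trades a little speed for cross-platform reproducibility:

    import torch

    def create_generator(seed: int, rng_source: str, device: torch.device) -> torch.Generator:
        # "cpu": the same seed gives the same noise on any machine;
        # "gpu": noise depends on the GPU architecture the seed ran on
        gen_device = torch.device("cpu") if rng_source == "cpu" else device
        return torch.Generator(device=gen_device).manual_seed(int(seed))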
2023-02-01  Refactor Mac specific code to a separate file  (brkirch)
Move most Mac related code to a separate file, don't even load it unless web UI is run under macOS.
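The guarded import this describes, as a sketch; modules.mac_specific is the file named by the commit, while the has_mps wrapper shown here is an assumption:

    import sys

    if sys.platform == "darwin":
        # only imported (and its MPS patches applied) on macOS
        from modules import mac_specific

    def has_mps() -> bool:
        return sys.platform == "darwin" and mac_specific.has_mps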
2023-02-01  Refactor MPS fixes to CondFunc  (brkirch)
2023-02-01  MPS fix is still needed :(  (brkirch)
Apparently I did not test with large enough images to trigger the bug with torch.narrow on MPS.
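A stand-in sketch for the kind of monkeypatch the CondFunc refactor wraps; whether the real fix materializes the result with clone() exactly like this is an assumption:

    import torch

    # materialize the narrowed view with clone(); on MPS, downstream
    # kernels could otherwise crash on large narrowed views
    _orig_narrow = torch.narrow
    torch.narrow = lambda *args, **kwargs: _orig_narrow(*args, **kwargs).clone()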
2023-01-28  Merge pull request #7309 from brkirch/fix-embeddings  (AUTOMATIC1111)
Fix embeddings, upscalers, and refactor `--upcast-sampling`
2023-01-28  Remove MPS fix no longer needed for PyTorch  (brkirch)
The torch.narrow fix was required for nightly PyTorch builds for a while to prevent a hard crash, but newer nightly builds don't have this issue.
2023-01-28  Refactor conditional casting, fix upscalers  (brkirch)
2023-01-27  clarify the option to disable NaN check.  (AUTOMATIC)
2023-01-27  remove the need to place configs near models  (AUTOMATIC)
2023-01-25  Add UI setting for upcasting attention to float32  (brkirch)
Adds "Upcast cross attention layer to float32" option in Stable Diffusion settings. This allows for generating images using SD 2.1 models without --no-half or xFormers. In order to make upcasting cross attention layer optimizations possible it is necessary to indent several sections of code in sd_hijack_optimizations.py so that a context manager can be used to disable autocast. Also, even though Stable Diffusion (and Diffusers) only upcast q and k, unfortunately my findings were that most of the cross attention layer optimizations could not function unless v is upcast also.
2023-01-25  Add option for float32 sampling with float16 UNet  (brkirch)
This also handles type casting so that ROCm and MPS torch devices work correctly without --no-half. One cast is required for deepbooru in deepbooru_model.py, and some explicit casting is required for img2img and inpainting. depth_model can't be converted to float16 or it won't work correctly on some systems (it's known to have issues on MPS), so in sd_models.py model.depth_model is removed for the model.half() call.
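A sketch of conditional casting helpers in the spirit of this commit; the flag and function names here are assumptions about the shape of the real code:

    import torch

    dtype_unet = torch.float16
    unet_needs_upcast = True  # float32 sampling around a float16 UNet

    def cond_cast_unet(t: torch.Tensor) -> torch.Tensor:
        # cast sampler tensors down to the UNet's dtype on the way in
        return t.to(dtype_unet) if unet_needs_upcast else t

    def cond_cast_float(t: torch.Tensor) -> torch.Tensor:
        # cast UNet outputs back up to float32 on the way out
        return t.float() if unet_needs_upcast else t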
2023-01-19  Merge pull request #6922 from brkirch/cumsum-fix  (AUTOMATIC1111)
Improve cumsum fix for MPS
2023-01-17  Fix cumsum for MPS in newer torch  (brkirch)
The prior fix assumed that testing int16 was enough to determine whether a fix is needed, but a recent cumsum fix has int16 working while bool is still broken.
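A simplified sketch of a cumsum workaround of this shape; the real fix probes at startup which dtypes are broken, while this version hard-codes the bool case described above:

    import torch

    def cumsum_fix(input, cumsum_func, *args, **kwargs):
        if input.device.type == "mps" and input.dtype == torch.bool:
            # route the broken dtype through a working integer dtype
            return cumsum_func(input.to(torch.int32), *args, **kwargs).to(torch.int64)
        return cumsum_func(input, *args, **kwargs)

    # usage: cumsum_fix(mask, torch.cumsum, dim=-1)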
2023-01-17  disable the new NaN check for the CI  (AUTOMATIC)
2023-01-16  Add a check and explanation for a tensor with all NaNs.  (AUTOMATIC)
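A sketch close to what such a check might look like; the exact messages and the option that disables it (per the CI commit above) are paraphrased:

    import torch

    class NansException(Exception):
        pass

    def test_for_nans(x: torch.Tensor, where: str) -> None:
        if not torch.all(torch.isnan(x)).item():
            return
        # an all-NaN tensor usually means a precision problem, e.g. a
        # model that needs float32 being run in float16
        if where == "unet":
            raise NansException("A tensor with all NaNs was produced in Unet; try --no-half.")
        raise NansException(f"A tensor with all NaNs was produced in {where}.")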
2023-01-05  Add support for PyTorch nightly and local builds  (brkirch)
2022-12-17  Add numpy fix for MPS on PyTorch 1.12.1  (brkirch)
When saving training results with torch.save(), an exception is thrown: "RuntimeError: Can't call numpy() on Tensor that requires grad. Use tensor.detach().numpy() instead." So for MPS, check if Tensor.requires_grad and detach() if necessary.
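A stand-in sketch of this fix as a plain monkeypatch; the real code likely routes it through the CondFunc helper mentioned earlier:

    import torch

    _orig_numpy = torch.Tensor.numpy

    def _numpy_detached(self, *args, **kwargs):
        # detach first so torch.save() paths that call .numpy() on a
        # grad-requiring Tensor no longer raise on MPS
        return _orig_numpy(self.detach() if self.requires_grad else self, *args, **kwargs)

    torch.Tensor.numpy = _numpy_detached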