path: root/modules/sd_hijack_optimizations.py
Age | Commit message | Author
2023-08-02 | update doggettx cross attention optimization to not use an unreasonable amount of memory in some edge cases -- suggestion by MorkTheOrk | AUTOMATIC1111
2023-07-13 | get attention optimizations to work | AUTOMATIC1111
2023-07-12 | SDXL support | AUTOMATIC1111
2023-06-07 | Merge pull request #11066 from aljungberg/patch-1 | AUTOMATIC1111
Fix upcast attention dtype error.
2023-06-06 | Fix upcast attention dtype error. | Alexander Ljungberg
Without this fix, enabling the "Upcast cross attention layer to float32" option while also using `--opt-sdp-attention` breaks generation with an error:
```
  File "/ext3/automatic1111/stable-diffusion-webui/modules/sd_hijack_optimizations.py", line 612, in sdp_attnblock_forward
    out = torch.nn.functional.scaled_dot_product_attention(q, k, v, dropout_p=0.0, is_causal=False)
RuntimeError: Expected query, key, and value to have the same dtype, but got query.dtype: float key.dtype: float and value.dtype: c10::Half instead.
```
The fix is to make sure to upcast the value tensor too.
2023-06-04 | Merge pull request #10990 from vkage/sd_hijack_optimizations_bugfix | AUTOMATIC1111
torch.cuda.is_available() check for SdOptimizationXformers
2023-06-04 | fix the broken line for #10990 | AUTOMATIC
2023-06-03 | torch.cuda.is_available() check for SdOptimizationXformers | Vivek K. Vasishtha
2023-06-01 | revert default cross attention optimization to Doggettx | AUTOMATIC
make --disable-opt-split-attention command line option work again
2023-06-01 | revert default cross attention optimization to Doggettx | AUTOMATIC
make --disable-opt-split-attention command line option work again
2023-05-31 | rename print_error to report, use it with together with package name | AUTOMATIC
2023-05-29 | Add & use modules.errors.print_error where currently printing exception info by hand | Aarni Koskela
2023-05-21 | Add a couple `from __future__ import annotations`es for Py3.9 compat | Aarni Koskela
2023-05-19 | Apply suggestions from code review | AUTOMATIC1111
Co-authored-by: Aarni Koskela <akx@iki.fi>
2023-05-19 | fix linter issues | AUTOMATIC
2023-05-18 | make it possible for scripts to add cross attention optimizations | AUTOMATIC
add UI selection for cross attention optimization
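The commit message above does not show how scripts register an optimization, so the following is only a minimal sketch of one way such a registry could work. Every name in it (`Optimization`, `on_list_optimizations`, `list_optimizations`, `choose`, the priorities, and the "Automatic" selection value) is hypothetical and is not taken from the webui code.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Optimization:
    """Hypothetical record describing one cross attention optimization."""
    name: str
    priority: int                      # higher wins when selection is automatic
    is_available: Callable[[], bool]   # e.g. a hardware or library check


# Hypothetical hook list: each script registers a callback that may append entries.
_list_callbacks: List[Callable[[List[Optimization]], None]] = []


def on_list_optimizations(callback: Callable[[List[Optimization]], None]) -> None:
    """Let a script register a callback that adds its own optimizations."""
    _list_callbacks.append(callback)


def list_optimizations(builtin: List[Optimization]) -> List[Optimization]:
    """Collect built-in and script-provided entries, drop unavailable ones,
    and sort the rest by descending priority."""
    found = list(builtin)
    for callback in _list_callbacks:
        callback(found)
    available = [opt for opt in found if opt.is_available()]
    return sorted(available, key=lambda opt: opt.priority, reverse=True)


def choose(optimizations: List[Optimization], selection: str = "Automatic") -> Optional[Optimization]:
    """Resolve a UI selection: 'Automatic' picks the highest-priority entry,
    any other value must match an available entry by name."""
    if selection == "Automatic":
        return optimizations[0] if optimizations else None
    for opt in optimizations:
        if opt.name == selection:
            return opt
    return None


# Example: two built-ins (one unavailable) plus one added by a script.
builtin = [
    Optimization("xformers", 100, lambda: False),
    Optimization("doggettx", 90, lambda: True),
]
on_list_optimizations(lambda found: found.append(Optimization("my-script-attn", 50, lambda: True)))
opts = list_optimizations(builtin)
```

In this sketch an unavailable entry never reaches the selection step, so a UI dropdown built from `opts` would only offer choices that can actually be applied.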
2023-05-11 | Autofix Ruff W (not W605) (mostly whitespace) | Aarni Koskela
2023-05-10 | ruff auto fixes | AUTOMATIC
2023-05-10 | autofixes from ruff | AUTOMATIC
2023-05-08 | Fix for Unet NaNs | brkirch
2023-03-24 | Update sd_hijack_optimizations.py | FNSpd
2023-03-21 | Update sd_hijack_optimizations.py | FNSpd
2023-03-10 | sdp_attnblock_forward hijack | Pam
2023-03-10 | argument to disable memory efficient for sdp | Pam
2023-03-07 | scaled dot product attention | Pam
2023-01-25 | Add UI setting for upcasting attention to float32 | brkirch
Adds "Upcast cross attention layer to float32" option in Stable Diffusion settings. This allows for generating images using SD 2.1 models without --no-half or xFormers. In order to make upcasting cross attention layer optimizations possible it is necessary to indent several sections of code in sd_hijack_optimizations.py so that a context manager can be used to disable autocast. Also, even though Stable Diffusion (and Diffusers) only upcast q and k, unfortunately my findings were that most of the cross attention layer optimizations could not function unless v is upcast also.
2023-01-23 | better support for xformers flash attention on older versions of torch | AUTOMATIC
2023-01-21 | add --xformers-flash-attention option & impl | Takuma Mori
2023-01-21 | extra networks UI | AUTOMATIC
rework of hypernets: rather than via settings, hypernets are added directly to prompt as <hypernet:name:weight>
2023-01-06 | Added license | brkirch
2023-01-06 | Change sub-quad chunk threshold to use percentage | brkirch
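The log entry above says only that the sub-quadratic chunk threshold became a percentage. As a rough illustration of what a percentage-based threshold could look like, the sketch below compares the size of the full query-key product against a share of available memory. The function names, the parameters, and the idea of measuring against "available bytes" are assumptions made for this example, not details confirmed by the commit.

```python
def qk_matmul_size_bytes(batch_x_heads: int, q_tokens: int, k_tokens: int,
                         bytes_per_token: int) -> int:
    """Bytes needed to hold the full (unchunked) query-key attention matrix."""
    return batch_x_heads * q_tokens * k_tokens * bytes_per_token


def should_chunk(matmul_bytes: int, threshold_percent: int, available_bytes: int) -> bool:
    """Hypothetical rule: chunk only when the attention matrix would exceed
    threshold_percent of the memory assumed to be available."""
    threshold_bytes = available_bytes * threshold_percent // 100
    return matmul_bytes > threshold_bytes


# Example: 16 batch*heads, 4096 query and key tokens, 2 bytes per element (float16).
size = qk_matmul_size_bytes(16, 4096, 4096, 2)   # 512 MiB
plenty = should_chunk(size, 70, 8 * 2**30)       # 70% of 8 GiB
tight = should_chunk(size, 25, 2**30)            # 25% of 1 GiB
```

Expressing the threshold as a percentage lets the same setting adapt to devices with very different amounts of memory, which a fixed byte count cannot do.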
2023-01-06 | Add Birch-san's sub-quadratic attention implementation | brkirch
2022-12-20 | Use other MPS optimization for large q.shape[0] * q.shape[1] | brkirch
Check if q.shape[0] * q.shape[1] is 2**18 or larger and use the lower memory usage MPS optimization if it is. This should prevent most crashes that were occurring at certain resolutions (e.g. 1024x1024, 2048x512, 512x2048). Also included is a change to check slice_size and prevent it from being divisible by 4096 which also results in a crash. Otherwise a crash can occur at 1024x512 or 512x1024 resolution.
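The two checks described in this commit message can be sketched in plain Python as follows. The `2**18` limit and the 4096 divisibility rule come from the message itself. The function names are hypothetical, and stepping the slice size down by one is an assumed way of avoiding a multiple of 4096, since the message does not say how the value is adjusted.

```python
LARGE_Q_THRESHOLD = 2**18   # from the commit message
CRASHING_MULTIPLE = 4096    # slice sizes divisible by this reportedly crash


def needs_low_memory_path(q_shape) -> bool:
    """True when q.shape[0] * q.shape[1] is 2**18 or larger, i.e. when the
    lower memory usage MPS optimization should be used."""
    return q_shape[0] * q_shape[1] >= LARGE_Q_THRESHOLD


def safe_slice_size(slice_size: int) -> int:
    """Return a slice size that is not divisible by 4096.

    Assumption: decrementing by one is enough to avoid the crash; the result
    is clamped to at least 1 so it stays a usable slice size.
    """
    if slice_size % CRASHING_MULTIPLE == 0:
        return max(slice_size - 1, 1)
    return slice_size


# Example shapes given as (batch * heads, tokens, channels).
large = needs_low_memory_path((16, 16384, 40))   # 16 * 16384 == 2**18
small = needs_low_memory_path((16, 4096, 40))    # 16 * 4096 == 2**16
```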
2022-12-10 | cleanup some unneeded imports for hijack files | AUTOMATIC
2022-12-10 | do not replace entire unet for the resolution hack | AUTOMATIC
2022-11-23 | Patch UNet Forward to support resolutions that are not multiples of 64 | Billy Cao
Also modifed the UI to no longer step in 64
2022-10-19 | Remove wrong self reference in CUDA support for invokeai | Cheka
2022-10-18 | Update sd_hijack_optimizations.py | C43H66N12O12S2
2022-10-18 | readd xformers attnblock | C43H66N12O12S2
2022-10-18 | delete xformers attnblock | C43H66N12O12S2
2022-10-11 | Use apply_hypernetwork function | brkirch
2022-10-11 | Add InvokeAI and lstein to credits, add back CUDA support | brkirch
2022-10-11 | Add check for psutil | brkirch
2022-10-11 | Add cross-attention optimization from InvokeAI | brkirch
* Add cross-attention optimization from InvokeAI (~30% speed improvement on MPS)
* Add command line option for it
* Make it default when CUDA is unavailable
2022-10-11 | rename hypernetwork dir to hypernetworks to prevent clash with an old filename that people who use zip instead of git clone will have | AUTOMATIC
2022-10-11 | fixes related to merge | AUTOMATIC
2022-10-11 | replace duplicate code with a function | AUTOMATIC
2022-10-10 | remove functorch | C43H66N12O12S2
2022-10-09 | Fix VRAM Issue by only loading in hypernetwork when selected in settings | Fampai
2022-10-08 | make --force-enable-xformers work without needing --xformers | AUTOMATIC