path: root/modules/sd_hijack_optimizations.py
Age  Commit message  Author
2023-08-13  Make sub-quadratic the default for MPS  (brkirch)
2023-08-13  Use fixed size for sub-quadratic chunking on MPS  (brkirch)
Even if this causes chunks to be much smaller, performance isn't significantly impacted. This will usually reduce memory usage but should also help with poor performance when free memory is low.
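As a rough sketch of the idea (the function name, the fixed constant, and the byte-budget formula below are illustrative assumptions, not the repository's actual code), chunk-size selection on MPS might simply return a constant instead of deriving one from available memory:

```python
import torch


def pick_query_chunk_size(q: torch.Tensor, chunk_threshold_bytes=None) -> int:
    """Choose a query chunk size for sub-quadratic attention (illustrative only).

    On MPS a fixed chunk size is returned, mirroring the commit above; elsewhere
    the size is derived from a byte budget. Names and constants are hypothetical.
    """
    if q.device.type == "mps":
        return 512  # fixed chunk size: smaller chunks barely hurt speed on MPS
    if chunk_threshold_bytes is None:
        return q.shape[1]  # no chunking: attend over all query tokens at once
    # budget-based chunking: keep one chunk of attention scores under the threshold
    bytes_per_query_row = q.shape[0] * q.shape[1] * q.element_size()
    return max(1, min(q.shape[1], chunk_threshold_bytes // bytes_per_query_row))
```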
2023-08-02  update doggettx cross attention optimization to not use an unreasonable amount of memory in some edge cases -- suggestion by MorkTheOrk  (AUTOMATIC1111)
2023-07-13  get attention optimizations to work  (AUTOMATIC1111)
2023-07-12  SDXL support  (AUTOMATIC1111)
2023-06-07  Merge pull request #11066 from aljungberg/patch-1  (AUTOMATIC1111)
Fix upcast attention dtype error.
2023-06-06  Fix upcast attention dtype error.  (Alexander Ljungberg)
Without this fix, enabling the "Upcast cross attention layer to float32" option while also using `--opt-sdp-attention` breaks generation with an error:
```
  File "/ext3/automatic1111/stable-diffusion-webui/modules/sd_hijack_optimizations.py", line 612, in sdp_attnblock_forward
    out = torch.nn.functional.scaled_dot_product_attention(q, k, v, dropout_p=0.0, is_causal=False)
RuntimeError: Expected query, key, and value to have the same dtype, but got query.dtype: float key.dtype: float and value.dtype: c10::Half instead.
```
The fix is to make sure to upcast the value tensor too.
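The shape of the fix, as a hedged standalone sketch (the real change lives in sdp_attnblock_forward; the function below is only illustrative): upcast q, k and v together so scaled_dot_product_attention sees a single dtype.

```python
import torch


def sdp_attnblock_upcast_sketch(q, k, v, upcast_attn=True):
    # Illustrative stand-in for the fixed code path, not the repository's code:
    # q, k *and* v are upcast to float32, then the result is cast back.
    dtype = q.dtype
    if upcast_attn:
        q, k, v = q.float(), k.float(), v.float()  # value tensor included in the upcast
    out = torch.nn.functional.scaled_dot_product_attention(
        q, k, v, dropout_p=0.0, is_causal=False
    )
    return out.to(dtype)  # restore the original dtype for the rest of the network
```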
2023-06-04  Merge pull request #10990 from vkage/sd_hijack_optimizations_bugfix  (AUTOMATIC1111)
torch.cuda.is_available() check for SdOptimizationXformers
2023-06-04  fix the broken line for #10990  (AUTOMATIC)
2023-06-03  torch.cuda.is_available() check for SdOptimizationXformers  (Vivek K. Vasishtha)
2023-06-01  revert default cross attention optimization to Doggettx  (AUTOMATIC)
make --disable-opt-split-attention command line option work again
2023-05-31  rename print_error to report, use it together with package name  (AUTOMATIC)
2023-05-29  Add & use modules.errors.print_error where currently printing exception info by hand  (Aarni Koskela)
2023-05-21  Add a couple of `from __future__ import annotations` imports for Py3.9 compat  (Aarni Koskela)
2023-05-19  Apply suggestions from code review  (AUTOMATIC1111)
Co-authored-by: Aarni Koskela <akx@iki.fi>
2023-05-19  fix linter issues  (AUTOMATIC)
2023-05-18  make it possible for scripts to add cross attention optimizations  (AUTOMATIC)
add UI selection for cross attention optimization
2023-05-11  Autofix Ruff W (not W605) (mostly whitespace)  (Aarni Koskela)
2023-05-10  ruff auto fixes  (AUTOMATIC)
2023-05-10  autofixes from ruff  (AUTOMATIC)
2023-05-08  Fix for Unet NaNs  (brkirch)
2023-03-24  Update sd_hijack_optimizations.py  (FNSpd)
2023-03-21  Update sd_hijack_optimizations.py  (FNSpd)
2023-03-10  sdp_attnblock_forward hijack  (Pam)
2023-03-10  argument to disable memory efficient for sdp  (Pam)
2023-03-07  scaled dot product attention  (Pam)
2023-01-25  Add UI setting for upcasting attention to float32  (brkirch)
Adds "Upcast cross attention layer to float32" option in Stable Diffusion settings. This allows for generating images using SD 2.1 models without --no-half or xFormers. In order to make upcasting cross attention layer optimizations possible it is necessary to indent several sections of code in sd_hijack_optimizations.py so that a context manager can be used to disable autocast. Also, even though Stable Diffusion (and Diffusers) only upcast q and k, unfortunately my findings were that most of the cross attention layer optimizations could not function unless v is upcast also.
2023-01-23  better support for xformers flash attention on older versions of torch  (AUTOMATIC)
2023-01-21  add --xformers-flash-attention option & impl  (Takuma Mori)
2023-01-21  extra networks UI  (AUTOMATIC)
rework of hypernets: rather than via settings, hypernets are added directly to prompt as <hypernet:name:weight>
2023-01-06  Added license  (brkirch)
2023-01-06  Change sub-quad chunk threshold to use percentage  (brkirch)
2023-01-06  Add Birch-san's sub-quadratic attention implementation  (brkirch)
2022-12-20  Use other MPS optimization for large q.shape[0] * q.shape[1]  (brkirch)
Check if q.shape[0] * q.shape[1] is 2**18 or larger and use the lower memory usage MPS optimization if it is. This should prevent most crashes that were occurring at certain resolutions (e.g. 1024x1024, 2048x512, 512x2048). Also included is a change to check slice_size and prevent it from being divisible by 4096, since that also results in a crash; otherwise a crash can occur at 1024x512 or 512x1024 resolution.
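The checks described above, condensed into a hedged sketch (the function name and return values are assumptions, not the actual implementation):

```python
import torch


def mps_attention_path_sketch(q: torch.Tensor, slice_size: int):
    # Illustrative only: very large shapes fall back to the lower-memory MPS
    # optimization, and a slice size divisible by 4096 is nudged down by one.
    use_lower_memory_path = q.shape[0] * q.shape[1] >= 2 ** 18
    if slice_size % 4096 == 0:
        slice_size -= 1  # keep slice_size from being divisible by 4096
    return use_lower_memory_path, slice_size
```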
2022-12-10  cleanup some unneeded imports for hijack files  (AUTOMATIC)
2022-12-10  do not replace entire unet for the resolution hack  (AUTOMATIC)
2022-11-23  Patch UNet Forward to support resolutions that are not multiples of 64  (Billy Cao)
Also modified the UI so it no longer steps in increments of 64.
2022-10-19  Remove wrong self reference in CUDA support for invokeai  (Cheka)
2022-10-18  Update sd_hijack_optimizations.py  (C43H66N12O12S2)
2022-10-18  readd xformers attnblock  (C43H66N12O12S2)
2022-10-18  delete xformers attnblock  (C43H66N12O12S2)
2022-10-11  Use apply_hypernetwork function  (brkirch)
2022-10-11  Add InvokeAI and lstein to credits, add back CUDA support  (brkirch)
2022-10-11  Add check for psutil  (brkirch)
2022-10-11  Add cross-attention optimization from InvokeAI  (brkirch)
* Add cross-attention optimization from InvokeAI (~30% speed improvement on MPS)
* Add command line option for it
* Make it default when CUDA is unavailable
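The selection behaviour described in the last bullet, as a hypothetical sketch (the flag name and return strings are illustrative, not the actual option handling):

```python
import torch


def select_cross_attention_optimization(opt_split_attention_invokeai=False):
    # Hypothetical selection logic: prefer the InvokeAI optimization when
    # explicitly requested or when CUDA is unavailable (e.g. MPS or CPU).
    if opt_split_attention_invokeai or not torch.cuda.is_available():
        return "InvokeAI cross-attention optimization"
    return "default cross-attention optimization"
```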
2022-10-11  rename hypernetwork dir to hypernetworks to prevent clash with an old filename that people who use zip instead of git clone will have  (AUTOMATIC)
2022-10-11  fixes related to merge  (AUTOMATIC)
2022-10-11  replace duplicate code with a function  (AUTOMATIC)
2022-10-10  remove functorch  (C43H66N12O12S2)