In PyTorch, floating-point tensors are created in FP32 precision by default. The TF32 data format was introduced for Nvidia Ampere and later CUDA devices; it enables faster matrix multiplications and convolutions at the cost of slightly less accurate computation [5]. These higher-precision settings are mainly a concern during training, and it is rare that networks need this much numerical accuracy for inference.
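For reference, enabling TF32 in PyTorch comes down to setting two backend flags. The following is a minimal sketch of how this is typically done; the flags shown are standard PyTorch settings and only take effect on Ampere or newer CUDA devices:

import torch

# Allow TF32 tensor cores for FP32 matrix multiplications
torch.backends.cuda.matmul.allow_tf32 = True
# Allow TF32 for cuDNN convolutions
torch.backends.cudnn.allow_tf32 = True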
Instead of using the TF32 and FP32 data types, we can load and run the Stable Diffusion model weights in float16 or bfloat16 precision to reduce VRAM usage and improve speed. But what are the differences between float16 and bfloat16, and which one should we use?
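As a sketch of what this looks like in practice, assuming the Hugging Face diffusers library and the runwayml/stable-diffusion-v1-5 checkpoint as an illustrative example, the model weights can be loaded directly in half precision via the torch_dtype argument:

import torch
from diffusers import StableDiffusionPipeline

# Load the pipeline weights in float16 instead of the default FP32;
# torch.bfloat16 can be substituted on hardware that supports it.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

image = pipe("a photo of an astronaut riding a horse").images[0]

Loading the weights in a 16-bit format roughly halves the VRAM needed to hold the model compared with FP32.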
bfloat16 and float16 are both 16-bit floating-point data formats, but they allocate their bits differently and have some important differences: