Name

    INTEL_shader_atomic_float_minmax

Name Strings

    GL_INTEL_shader_atomic_float_minmax

Contact

    Ian Romanick (ian . d . romanick 'at' intel . com)

Contributors


Status

    In progress

Version

    Last Modified Date: 06/22/2018
    Revision: 4

Number

    TBD

Dependencies

    OpenGL 4.2, OpenGL ES 3.1, ARB_shader_storage_buffer_object, or
    ARB_compute_shader is required.

    This extension is written against version 4.60 of the OpenGL Shading
    Language Specification.

Overview

    This extension provides GLSL built-in functions allowing shaders to
    perform atomic read-modify-write operations to floating-point buffer
    variables and shared variables. Minimum, maximum, exchange, and
    compare-and-swap are enabled.


New Procedures and Functions

    None.

New Tokens

    None.

IP Status

    None.

Modifications to the OpenGL Shading Language Specification, Version 4.60

    Including the following line in a shader can be used to control the
    language features described in this extension:

        #extension GL_INTEL_shader_atomic_float_minmax : <behavior>

    where <behavior> is as specified in section 3.3.

    New preprocessor #defines are added to the OpenGL Shading Language:

        #define GL_INTEL_shader_atomic_float_minmax 1

Additions to Chapter 8 of the OpenGL Shading Language Specification
(Built-in Functions)

    Modify Section 8.11, "Atomic Memory Functions"

    (add a new row after the existing "atomicMin" table row, p. 179)

        float atomicMin(inout float mem, float data)

        Computes a new value by taking the minimum of the value of data and
        the contents of mem. If one of these is an IEEE signaling NaN (i.e.,
        a NaN with the most-significant bit of the mantissa cleared), it is
        always considered smaller. If one of these is an IEEE quiet NaN
        (i.e., a NaN with the most-significant bit of the mantissa set), it is
        always considered larger.
        If both are IEEE quiet NaNs or both are IEEE signaling NaNs, the
        result of the comparison is undefined.

    (add a new row after the existing "atomicMax" table row, p. 179)

        float atomicMax(inout float mem, float data)

        Computes a new value by taking the maximum of the value of data and
        the contents of mem. If one of these is an IEEE signaling NaN (i.e.,
        a NaN with the most-significant bit of the mantissa cleared), it is
        always considered larger. If one of these is an IEEE quiet NaN (i.e.,
        a NaN with the most-significant bit of the mantissa set), it is always
        considered smaller. If both are IEEE quiet NaNs or both are IEEE
        signaling NaNs, the result of the comparison is undefined.

    (add to "atomicExchange" table cell, p. 180)

        float atomicExchange(inout float mem, float data)

    (add to "atomicCompSwap" table cell, p. 180)

        float atomicCompSwap(inout float mem, float compare, float data)

Interactions with OpenGL 4.6 and ARB_gl_spirv

    If OpenGL 4.6 or ARB_gl_spirv is supported, then
    SPV_INTEL_shader_atomic_float_minmax must also be supported.

    The AtomicFloatMinmaxINTEL capability is available whenever the OpenGL or
    OpenGL ES implementation supports INTEL_shader_atomic_float_minmax.

Issues

    1) Why call this extension INTEL_shader_atomic_float_minmax?

    RESOLVED: Several other extensions already set the precedent of
    VENDOR_shader_atomic_float and VENDOR_shader_atomic_float64 for extensions
    that enable floating-point atomic operations. Using that as a base for
    the name seems logical.

    There already exists NV_shader_atomic_float, but the two extensions have
    nearly zero overlap in functionality. NV_shader_atomic_float adds
    atomicAdd and image atomic operations that currently shipping Intel GPUs
    do not support. Calling this extension INTEL_shader_atomic_float would
    likely have been confusing.

    Adding something to describe the actual functions added by this extension
    seemed reasonable.
    INTEL_shader_atomic_float_compare was considered, but that name was
    deemed to be not properly descriptive. Calling this extension
    INTEL_shader_atomic_float_min_max_exchange_compswap is right out.

    2) What atomic operations should we support for floating-point targets?

    RESOLVED. Exchange, min, max, and compare-swap make sense, and these are
    all supported by the hardware. Future extensions may add other functions.

    For buffer variables and shared variables it is not possible to bit-cast
    the memory location in GLSL, so existing integer operations, such as
    atomicOr, cannot be used. However, the underlying hardware implementation
    can do this by treating the memory as an integer. It would be possible to
    implement atomicNegate using this technique with atomicXor. It is unclear
    whether this provides any actual utility.

    3) What should be said about the NaN behavior?

    RESOLVED. There are several aspects of NaN behavior that should be
    documented in this extension. However, some of this behavior varies based
    on NaN concepts that do not exist in the GLSL specification.

      * atomicCompSwap performs the comparison as the floating-point equality
        operator (==). That is, if either 'mem' or 'compare' is NaN, the
        comparison result is always false.

      * atomicMin and atomicMax implement the IEEE specification with respect to
        NaN. IEEE considers two different kinds of NaN: signaling NaN and quiet
        NaN. A quiet NaN has the most significant bit of the mantissa set, and
        a signaling NaN does not. This concept does not exist in SPIR-V,
        Vulkan, or OpenGL. Let qNaN denote a quiet NaN and sNaN denote a
        signaling NaN. atomicMin and atomicMax specifically implement

        - fmin(qNaN, x) = fmin(x, qNaN) = fmax(qNaN, x) = fmax(x, qNaN) = x
        - fmin(sNaN, x) = fmin(x, sNaN) = fmax(sNaN, x) = fmax(x, sNaN) = sNaN
        - fmin(sNaN, qNaN) = fmin(qNaN, sNaN) = fmax(sNaN, qNaN) =
          fmax(qNaN, sNaN) = sNaN
        - fmin(sNaN, sNaN) = sNaN.
          This specification does not define which of the two arguments
          is stored.
        - fmax(sNaN, sNaN) = sNaN. This specification does not define which of
          the two arguments is stored.
        - fmin(qNaN, qNaN) = qNaN. This specification does not define which of
          the two arguments is stored.
        - fmax(qNaN, qNaN) = qNaN. This specification does not define which of
          the two arguments is stored.

        Further details are in the Skylake Programmer's Reference Manuals,
        available at
        https://01.org/linuxgraphics/documentation/hardware-specification-prms.

    4) What about atomicMin and atomicMax with (+0.0, -0.0) or (-0.0, +0.0)
    arguments?

    RESOLVED. atomicMin should store -0.0, and atomicMax should store +0.0.
    Due to a known issue in shipping Skylake GPUs, the incorrectly signed 0 is
    stored. This behavior may change in later GPUs.

Revision History

    Rev  Date        Author    Changes
    ---  ----------  --------  ---------------------------------------------
      1  04/19/2018  idr       Initial version
      2  05/05/2018  idr       Describe interactions with the capabilities
                               added by SPV_INTEL_shader_atomic_float_minmax.
      3  05/29/2018  idr       Remove mention of 64-bit float support.
      4  06/22/2018  idr       Resolve issue #2.
                               Add issue #3 (regarding NaN behavior).
                               Add issue #4 (regarding atomicMin(-0, +0)).
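
Reference Model (non-normative)

    The NaN and signed-zero rules in issues 3 and 4 can be hard to follow in
    prose, so the C sketch below models which bit pattern ends up in memory.
    It is an illustration only, not the driver or hardware implementation.
    Assumptions: the model is not atomic; the names model_minmax and
    model_compswap are hypothetical; where this specification leaves the
    result undefined (two NaNs of the same kind) the model makes an arbitrary
    choice; and it models the intended signed-zero result from issue 4, not
    the shipping Skylake behavior. Values are passed as raw 32-bit patterns
    so that a signaling NaN is never quieted on its way through the model.
    model_compswap returns the new contents of mem, not the value the GLSL
    built-in would return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define F32_EXP_MASK  0x7f800000u
#define F32_MANT_MASK 0x007fffffu
#define F32_QUIET_BIT 0x00400000u /* most-significant mantissa bit */
#define F32_SIGN_BIT  0x80000000u

/* True for any NaN: exponent all ones and a non-zero mantissa. */
static bool is_nan(uint32_t bits)
{
    return (bits & F32_EXP_MASK) == F32_EXP_MASK &&
           (bits & F32_MANT_MASK) != 0;
}

/* True for a signaling NaN: a NaN with the quiet bit clear. */
static bool is_snan(uint32_t bits)
{
    return is_nan(bits) && (bits & F32_QUIET_BIT) == 0;
}

/* Reinterpret a 32-bit pattern as a float (only used for non-NaN values). */
static float as_float(uint32_t bits)
{
    float f;
    assert(sizeof f == sizeof bits);
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Hypothetical model: bit pattern stored by atomicMin (want_min == true)
 * or atomicMax (want_min == false). */
static uint32_t model_minmax(uint32_t mem, uint32_t data, bool want_min)
{
    /* A signaling NaN always wins. For two sNaNs the choice is undefined
     * by the spec; this model arbitrarily keeps mem. */
    if (is_snan(mem))
        return mem;
    if (is_snan(data))
        return data;

    /* A quiet NaN loses to anything else. For two qNaNs the choice is
     * undefined by the spec; this model arbitrarily stores data. */
    if (is_nan(mem))
        return data;
    if (is_nan(data))
        return mem;

    float m = as_float(mem);
    float d = as_float(data);
    if (m == d) {
        /* Only +0.0 and -0.0 compare equal with different bits.
         * Issue 4: min should store -0.0, max should store +0.0. */
        bool mem_negative = (mem & F32_SIGN_BIT) != 0;
        return (mem_negative == want_min) ? mem : data;
    }
    return ((m < d) == want_min) ? mem : data;
}

/* Hypothetical model: new contents of mem after atomicCompSwap. The
 * comparison is floating-point ==, so a NaN on either side never matches.
 * Under plain == semantics, +0.0 and -0.0 compare equal. */
static uint32_t model_compswap(uint32_t mem, uint32_t compare, uint32_t data)
{
    if (is_nan(mem) || is_nan(compare))
        return mem;
    return (as_float(mem) == as_float(compare)) ? data : mem;
}
```

    For example, model_minmax(0x7fc00000u, 0x3f800000u, true) yields
    0x3f800000u: a quiet NaN in memory is replaced by 1.0, matching
    fmin(qNaN, x) = x in issue 3.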