Name

    MESA_shader_integer_functions

Name Strings

    GL_MESA_shader_integer_functions

Contact

    Ian Romanick <[email protected]>

Contributors

    All the contributors of GL_ARB_gpu_shader5

Status

    Supported by all GLSL 1.30 capable drivers in Mesa 12.1 and later

Version

    Version 3, March 31, 2017

Number

    OpenGL Extension #495

Dependencies

    This extension is written against the OpenGL 3.2 (Compatibility Profile)
    Specification.

    This extension is written against Version 1.50 (Revision 09) of the OpenGL
    Shading Language Specification.

    GLSL 1.30 (OpenGL) or GLSL ES 3.00 (OpenGL ES) is required.

    This extension interacts with ARB_gpu_shader5.

    This extension interacts with ARB_gpu_shader_fp64.

    This extension interacts with NV_gpu_shader5.

Overview

    GL_ARB_gpu_shader5 extends GLSL in a number of useful ways.  Much of this
    added functionality requires significant hardware support.  There are many
    aspects, however, that can be easily implemented on any GPU with "real"
    integer support (as opposed to simulating integers using floating point
    calculations).

    This extension provides a set of new features to the OpenGL Shading
    Language to support capabilities of these GPUs, extending the
    capabilities of version 1.30 of the OpenGL Shading Language and version
    3.00 of the OpenGL ES Shading Language.  Shaders using the new
    functionality provided by this extension should enable this
    functionality via the construct

      #extension GL_MESA_shader_integer_functions : require   (or enable)

    This extension provides a variety of new features for all shader types,
    including:

      * support for implicitly converting signed integer types to unsigned
        types, as well as more general implicit conversion and function
        overloading infrastructure to support new data types introduced by
        other extensions;

      * new built-in functions supporting:

        * splitting a floating-point number into a significand and exponent
          (frexp), or building a floating-point number from a significand and
          exponent (ldexp);

        * integer bitfield manipulation, including functions to find the
          position of the most or least significant set bit, count the number
          of one bits, and bitfield insertion, extraction, and reversal;

        * extended integer precision math, including add with carry, subtract
          with borrow, and extended multiplication.

    The resulting extension is a strict subset of GL_ARB_gpu_shader5.

IP Status

    No known IP claims.

New Procedures and Functions

    None

New Tokens

    None

Additions to Chapter 2 of the OpenGL 3.2 (Compatibility Profile) Specification
(OpenGL Operation)

    None.

Additions to Chapter 3 of the OpenGL 3.2 (Compatibility Profile) Specification
(Rasterization)

    None.

Additions to Chapter 4 of the OpenGL 3.2 (Compatibility Profile) Specification
(Per-Fragment Operations and the Frame Buffer)

    None.

Additions to Chapter 5 of the OpenGL 3.2 (Compatibility Profile) Specification
(Special Functions)

    None.

Additions to Chapter 6 of the OpenGL 3.2 (Compatibility Profile) Specification
(State and State Requests)

    None.

Additions to Appendix A of the OpenGL 3.2 (Compatibility Profile)
Specification (Invariance)

    None.

Additions to the AGL/GLX/WGL Specifications

    None.

Modifications to The OpenGL Shading Language Specification, Version 1.50
(Revision 09)

    Including the following line in a shader can be used to control the
    language features described in this extension:

      #extension GL_MESA_shader_integer_functions : <behavior>

    where <behavior> is as specified in section 3.3.

    New preprocessor #defines are added to the OpenGL Shading Language:

      #define GL_MESA_shader_integer_functions        1


    Modify Section 4.1.10, Implicit Conversions, p. 27

    (modify table of implicit conversions)

                                Can be implicitly
        Type of expression        converted to
        ---------------------   -----------------
        int                     uint, float
        ivec2                   uvec2, vec2
        ivec3                   uvec3, vec3
        ivec4                   uvec4, vec4

        uint                    float
        uvec2                   vec2
        uvec3                   vec3
        uvec4                   vec4

    (modify second paragraph of the section) No implicit conversions are
    provided to convert from unsigned to signed integer types or from
    floating-point to integer types.  There are no implicit array or structure
    conversions.

    (insert before the final paragraph of the section) When performing
    implicit conversion for binary operators, there may be multiple data types
    to which the two operands can be converted.  For example, when adding an
    int value to a uint value, both values can be implicitly converted to uint
    and float.  In such cases, a floating-point type is chosen if either
    operand has a floating-point type.  Otherwise, an unsigned integer type is
    chosen if either operand has an unsigned integer type.  Otherwise, a
    signed integer type is chosen.
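
    The selection order for binary operators can be sketched as a small C
    model.  This is only an illustration of the rule stated above, not
    compiler code; the enum and function names are hypothetical.

```c
/* Hypothetical scalar type tags; the names are illustrative only. */
enum base_type { TYPE_INT, TYPE_UINT, TYPE_FLOAT };

/* Pick the common type for a binary operator whose operand types differ,
 * in the order given above: float wins, then uint, then int. */
enum base_type common_operand_type(enum base_type a, enum base_type b)
{
    if (a == TYPE_FLOAT || b == TYPE_FLOAT)
        return TYPE_FLOAT;
    if (a == TYPE_UINT || b == TYPE_UINT)
        return TYPE_UINT;
    return TYPE_INT;
}
```

    With this model, an int operand combined with a uint operand yields
    TYPE_UINT, matching the int + uint example in the paragraph above.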
    

    Modify Section 5.9, Expressions, p. 57

    (modify bulleted list as follows, adding support for implicit conversion
    between signed and unsigned types)

    Expressions in the shading language are built from the following:

    * Constants of type bool, int, int64_t, uint, uint64_t, float, all vector
      types, and all matrix types.

    ...

    * The operator modulus (%) operates on signed or unsigned integer scalars
      or vectors.  If the fundamental types of the operands do not match, the
      conversions from Section 4.1.10 "Implicit Conversions" are applied to
      produce matching types.  ...


    Modify Section 6.1, Function Definitions, p. 63

    (modify description of overloading, beginning at the top of p. 64)

     Function names can be overloaded.  The same function name can be used for
     multiple functions, as long as the parameter types differ.  If a function
     name is declared twice with the same parameter types, then the return
     types and all qualifiers must also match, and it is the same function
     being declared.  For example,

       vec4 f(in vec4 x, out vec4  y);   // (A)
       vec4 f(in vec4 x, out uvec4 y);   // (B) okay, different argument type
       vec4 f(in ivec4 x, out uvec4 y);  // (C) okay, different argument type

       int  f(in vec4 x, out ivec4 y);  // error, only return type differs
       vec4 f(in vec4 x, in  vec4  y);  // error, only qualifier differs
       vec4 f(const in vec4 x, out vec4 y);  // error, only qualifier differs

     When function calls are resolved, an exact type match for all the
     arguments is sought.  If an exact match is found, all other functions are
     ignored, and the exact match is used.  If no exact match is found, then
     the implicit conversions in Section 4.1.10 (Implicit Conversions) will be
     applied to find a match.  Mismatched types on input parameters (in or
     inout or default) must have a conversion from the calling argument type
     to the formal parameter type.  Mismatched types on output parameters (out
     or inout) must have a conversion from the formal parameter type to the
     calling argument type.

     If implicit conversions can be used to find more than one matching
     function, a single best-matching function is sought.  To determine a best
     match, the conversions between calling argument and formal parameter
     types are compared for each function argument and pair of matching
     functions.  After these comparisons are performed, each pair of matching
     functions is compared.  A function definition A is considered a better
     match than function definition B if:

       * for at least one function argument, the conversion for that argument
         in A is better than the corresponding conversion in B; and

       * there is no function argument for which the conversion in B is better
         than the corresponding conversion in A.

     If a single function definition is considered a better match than every
     other matching function definition, it will be used.  Otherwise, a
     semantic error occurs and the shader will fail to compile.

     To determine whether the conversion for a single argument in one match is
     better than that for another match, the following rules are applied, in
     order:

       1. An exact match is better than a match involving any implicit
          conversion.

       2. A match involving an implicit conversion from float to double is
          better than a match involving any other implicit conversion.

       3. A match involving an implicit conversion from either int or uint to
          float is better than a match involving an implicit conversion from
          either int or uint to double.

     If none of the rules above apply to a particular pair of conversions,
     neither conversion is considered better than the other.

     For the function prototypes (A), (B), and (C) above, the following
     examples show how the rules apply to different sets of calling argument
     types:

       f(vec4, vec4);        // exact match of vec4 f(in vec4 x, out vec4 y)
       f(vec4, uvec4);       // exact match of vec4 f(in vec4 x, out uvec4 y)
       f(vec4, ivec4);       // matched to vec4 f(in vec4 x, out vec4 y)
                             //   (C) not relevant, can't convert vec4 to
                             //   ivec4.  (A) better than (B) for 2nd
                             //   argument (rule 2), same on first argument.
       f(ivec4, vec4);       // NOT matched.  All three match by implicit
                             //   conversion.  (C) is better than (A) and (B)
                             //   on the first argument.  (A) is better than
                             //   (B) and (C).
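
     The "better match" comparison can be sketched in C as follows.  This is
     a simplified model, not the compiler's algorithm: it assumes that only
     rule 1 (exact match versus implicit conversion) is in play, because
     rules 2 and 3 involve "double", which this extension does not add.  All
     names are hypothetical.

```c
#include <stddef.h>

/* Per-argument conversion quality.  Only rule 1 is modelled: an exact match
 * beats any implicit conversion. */
enum conv { CONV_EXACT = 0, CONV_IMPLICIT = 1 };

/* Returns 1 if candidate a is a better match than candidate b: better on at
 * least one argument and worse on none. */
int better_match(const enum conv *a, const enum conv *b, size_t nargs)
{
    int better_somewhere = 0;
    for (size_t i = 0; i < nargs; i++) {
        if (a[i] < b[i])
            better_somewhere = 1;
        else if (a[i] > b[i])
            return 0;
    }
    return better_somewhere;
}

/* cands holds ncands rows of nargs entries.  Returns the index of the single
 * candidate that is better than every other one, or -1 if the call is
 * ambiguous (a compile error in the shading language). */
int pick_best(const enum conv *cands, size_t ncands, size_t nargs)
{
    for (size_t i = 0; i < ncands; i++) {
        int beats_all = 1;
        for (size_t j = 0; j < ncands; j++) {
            if (j != i &&
                !better_match(cands + i * nargs, cands + j * nargs, nargs))
                beats_all = 0;
        }
        if (beats_all)
            return (int)i;
    }
    return -1;
}
```

     Two candidates that each win on a different argument produce -1 here,
     which corresponds to the "NOT matched" outcome in the last example
     above.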


    Modify Section 8.3, Common Functions, p. 84

    (add support for single-precision frexp and ldexp functions)

    Syntax:

      genType frexp(genType x, out genIType exp);
      genType ldexp(genType x, in genIType exp);

    The function frexp() splits each single-precision floating-point number in
    <x> into a binary significand, a floating-point number in the range [0.5,
    1.0), and an integral exponent of two, such that:

      x = significand * 2 ^ exponent

    The significand is returned by the function; the exponent is returned in
    the parameter <exp>.  For a floating-point value of zero, the significand
    and exponent are both zero.  For a floating-point value that is an
    infinity or is not a number, the results of frexp() are undefined.

    If the input <x> is a vector, this operation is performed in a
    component-wise manner; the value returned by the function and the value
    written to <exp> are vectors with the same number of components as <x>.

    The function ldexp() builds a single-precision floating-point number from
    each significand component in <x> and the corresponding integral exponent
    of two in <exp>, returning:

      significand * 2 ^ exponent

    If this product is too large to be represented as a single-precision
    floating-point value, the result is considered undefined.

    If the input <x> is a vector, this operation is performed in a
    component-wise manner; the value passed in <exp> and returned by the
    function are vectors with the same number of components as <x>.
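
    The scalar behaviour of frexp() and ldexp() can be illustrated with a C
    model that works directly on the bits of a float.  This is a sketch under
    stated assumptions, not a driver implementation: it assumes IEEE-754
    binary32 floats, leaves denormals, infinities, and NaNs unmodelled, and
    the ldexp model requires an exponent in [-126, 127].  The function names
    are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Model of scalar frexp(): returns a significand in [0.5, 1.0) (with the
 * sign of x) and writes the power-of-two exponent to *exp_out. */
float model_frexp(float x, int *exp_out)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    uint32_t biased = (bits >> 23) & 0xFFu;

    /* Zero: significand and exponent are both zero. */
    if ((bits & 0x7FFFFFFFu) == 0u) {
        *exp_out = 0;
        return x;
    }
    /* Denormals, infinities, and NaNs are not modelled in this sketch. */
    if (biased == 0u || biased == 0xFFu) {
        *exp_out = 0;
        return x;
    }
    /* A biased exponent of 126 encodes values in [0.5, 1.0). */
    *exp_out = (int)biased - 126;
    bits = (bits & 0x807FFFFFu) | (126u << 23);
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Model of scalar ldexp(): multiplies x by 2^e.  Assumes -126 <= e <= 127
 * so that 2^e is itself a normal float. */
float model_ldexp(float x, int e)
{
    uint32_t bits = (uint32_t)(e + 127) << 23;
    float scale;
    memcpy(&scale, &bits, sizeof scale);
    return x * scale;
}
```

    For example, model_frexp(8.0f, &e) returns 0.5f with e set to 4, and
    model_ldexp(0.5f, 4) rebuilds 8.0f.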


    (add support for new integer built-in functions)

    Syntax:

      genIType bitfieldExtract(genIType value, int offset, int bits);
      genUType bitfieldExtract(genUType value, int offset, int bits);

      genIType bitfieldInsert(genIType base, genIType insert, int offset,
                              int bits);
      genUType bitfieldInsert(genUType base, genUType insert, int offset,
                              int bits);

      genIType bitfieldReverse(genIType value);
      genUType bitfieldReverse(genUType value);

      genIType bitCount(genIType value);
      genIType bitCount(genUType value);

      genIType findLSB(genIType value);
      genIType findLSB(genUType value);

      genIType findMSB(genIType value);
      genIType findMSB(genUType value);

    The function bitfieldExtract() extracts bits <offset> through
    <offset>+<bits>-1 from each component in <value>, returning them in the
    least significant bits of the corresponding component of the result.  For
    unsigned data types, the most significant bits of the result will be set
    to zero.  For signed data types, the most significant bits will be set to
    the value of bit <offset>+<bits>-1.  If <bits> is zero, the result will be
    zero.  The result will be undefined if <offset> or <bits> is negative, or
    if the sum of <offset> and <bits> is greater than the number of bits used
    to store the operand.  Note that for vector versions of bitfieldExtract(),
    a single pair of <offset> and <bits> values is shared for all components.

    The function bitfieldInsert() inserts the <bits> least significant bits of
    each component of <insert> into the corresponding component of <base>.
    The result will have bits numbered <offset> through <offset>+<bits>-1
    taken from bits 0 through <bits>-1 of <insert>, and all other bits taken
    directly from the corresponding bits of <base>.  If <bits> is zero, the
    result will simply be <base>.  The result will be undefined if <offset> or
    <bits> is negative, or if the sum of <offset> and <bits> is greater than
    the number of bits used to store the operand.  Note that for vector
    versions of bitfieldInsert(), a single pair of <offset> and <bits> values
    is shared for all components.

    The function bitfieldReverse() reverses the bits of <value>.  The bit
    numbered <n> of the result will be taken from bit (<bits>-1)-<n> of
    <value>, where <bits> is the total number of bits used to represent
    <value>.
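
    A scalar C model of the three bitfield functions above may help make the
    bit numbering concrete.  It is a sketch, not a driver implementation: the
    names are hypothetical, operands are assumed to be 32 bits wide, and the
    cases the text leaves undefined simply return zero (extract) or <base>
    (insert) here.

```c
#include <stdint.h>

/* Mask with the low `bits` bits set, for 1 <= bits <= 32. */
uint32_t low_mask(int bits)
{
    return bits >= 32 ? 0xFFFFFFFFu : (1u << bits) - 1u;
}

/* 1 if <offset>/<bits> describe a field inside a 32-bit operand. */
int field_is_valid(int offset, int bits)
{
    return offset >= 0 && bits >= 0 && offset + bits <= 32;
}

uint32_t model_bitfield_extract_u(uint32_t value, int offset, int bits)
{
    /* bits == 0 gives zero; invalid ranges are undefined in the spec. */
    if (!field_is_valid(offset, bits) || bits == 0)
        return 0u;
    return (value >> offset) & low_mask(bits);
}

int32_t model_bitfield_extract_i(int32_t value, int offset, int bits)
{
    if (!field_is_valid(offset, bits) || bits == 0)
        return 0;
    uint32_t field = model_bitfield_extract_u((uint32_t)value, offset, bits);
    uint32_t sign = 1u << (bits - 1);
    /* Sign-extend from the top bit of the field.  The conversion back to
     * int32_t assumes a two's complement platform. */
    return (int32_t)((field ^ sign) - sign);
}

uint32_t model_bitfield_insert(uint32_t base, uint32_t insert,
                               int offset, int bits)
{
    if (!field_is_valid(offset, bits) || bits == 0)
        return base;
    uint32_t mask = low_mask(bits) << offset;
    return (base & ~mask) | ((insert << offset) & mask);
}

uint32_t model_bitfield_reverse(uint32_t value)
{
    uint32_t result = 0u;
    for (int n = 0; n < 32; n++) {
        /* Bit n of the result comes from bit 31-n of the input. */
        if (value & (1u << (31 - n)))
            result |= 1u << n;
    }
    return result;
}
```

    Extracting four bits at offset 4 from 0xF0 gives 15 in the unsigned
    model and -1 in the signed model, since the top bit of the field is
    replicated into the upper bits of the result.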

    The function bitCount() returns the number of one bits in the binary
    representation of <value>.

    The function findLSB() returns the bit number of the least significant one
    bit in the binary representation of <value>.  If <value> is zero, -1 will
    be returned.

    The function findMSB() returns the bit number of the most significant bit
    in the binary representation of <value>.  For positive integers, the
    result will be the bit number of the most significant one bit.  For
    negative integers, the result will be the bit number of the most
    significant zero bit.  For a <value> of zero or negative one, -1 will be
    returned.
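
    The counting and searching functions can be modelled in C with plain
    loops.  As before, this is an illustrative sketch with hypothetical
    names and 32-bit operands, written for clarity rather than speed.

```c
#include <stdint.h>

/* Number of one bits in value. */
int model_bit_count(uint32_t value)
{
    int count = 0;
    while (value != 0u) {
        count += (int)(value & 1u);
        value >>= 1;
    }
    return count;
}

/* Bit number of the least significant one bit, or -1 for zero. */
int model_find_lsb(uint32_t value)
{
    for (int n = 0; n < 32; n++) {
        if (value & (1u << n))
            return n;
    }
    return -1;
}

/* Unsigned findMSB: bit number of the most significant one bit, or -1. */
int model_find_msb_u(uint32_t value)
{
    for (int n = 31; n >= 0; n--) {
        if (value & (1u << n))
            return n;
    }
    return -1;
}

/* Signed findMSB: for negative inputs, search for the most significant
 * zero bit by inverting the bits first. */
int model_find_msb_i(int32_t value)
{
    uint32_t u = (uint32_t)value;
    return model_find_msb_u(value < 0 ? ~u : u);
}
```

    Inverting a negative input turns "most significant zero bit" into "most
    significant one bit", which also makes -1 map to the zero case and
    return -1.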


    (support for unsigned integer add/subtract with carry-out)

    Syntax:

      genUType uaddCarry(genUType x, genUType y, out genUType carry);
      genUType usubBorrow(genUType x, genUType y, out genUType borrow);

    The function uaddCarry() adds 32-bit unsigned integers or vectors <x> and
    <y>, returning the sum modulo 2^32.  The value <carry> is set to zero if
    the sum was less than 2^32, or one otherwise.

    The function usubBorrow() subtracts the 32-bit unsigned integer or vector
    <y> from <x>, returning the difference if non-negative or 2^32 plus the
    difference, otherwise.  The value <borrow> is set to zero if x >= y, or
    one otherwise.
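
    A scalar C model of uaddCarry() and usubBorrow(), together with one
    possible use, is sketched below.  The names, including the 64-bit
    addition helper, are hypothetical and only illustrate the semantics
    described above.

```c
#include <stdint.h>

/* Sum modulo 2^32; *carry is 1 if the true sum did not fit in 32 bits. */
uint32_t model_uadd_carry(uint32_t x, uint32_t y, uint32_t *carry)
{
    uint32_t sum = x + y;            /* unsigned arithmetic wraps mod 2^32 */
    *carry = (sum < x) ? 1u : 0u;    /* wrap-around happened iff sum < x  */
    return sum;
}

/* Difference modulo 2^32; *borrow is 1 if y is larger than x. */
uint32_t model_usub_borrow(uint32_t x, uint32_t y, uint32_t *borrow)
{
    *borrow = (x >= y) ? 0u : 1u;
    return x - y;
}

/* Example use: add two 64-bit values held as (lo, hi) pairs of 32-bit
 * words, propagating the carry from the low word into the high word. */
void add64(uint32_t a_lo, uint32_t a_hi, uint32_t b_lo, uint32_t b_hi,
           uint32_t *lo, uint32_t *hi)
{
    uint32_t carry;
    *lo = model_uadd_carry(a_lo, b_lo, &carry);
    *hi = a_hi + b_hi + carry;
}
```

    Adding 1 to 0xFFFFFFFF returns 0 with a carry of 1, and add64() uses
    that carry to produce the pair (lo = 0, hi = 1).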


    (support for signed and unsigned multiplies, with 32-bit inputs and a
     64-bit result spanning two 32-bit outputs)

    Syntax:

      void umulExtended(genUType x, genUType y, out genUType msb,
                        out genUType lsb);
      void imulExtended(genIType x, genIType y, out genIType msb,
                        out genIType lsb);

    The functions umulExtended() and imulExtended() multiply 32-bit unsigned
    or signed integers or vectors <x> and <y>, producing a 64-bit result.  The
    32 least significant bits are returned in <lsb>; the 32 most significant
    bits are returned in <msb>.
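
    On a host with 64-bit integers the scalar semantics can be modelled
    directly in C.  This sketch uses hypothetical names and relies on a
    64-bit intermediate that shaders targeted by this extension do not
    have; it only shows which halves of the product land in <msb> and <lsb>.

```c
#include <stdint.h>

/* Unsigned 32x32 -> 64 multiply, split into high and low words. */
void model_umul_extended(uint32_t x, uint32_t y,
                         uint32_t *msb, uint32_t *lsb)
{
    uint64_t product = (uint64_t)x * (uint64_t)y;
    *msb = (uint32_t)(product >> 32);
    *lsb = (uint32_t)product;
}

/* Signed 32x32 -> 64 multiply.  The product is reinterpreted as unsigned
 * before splitting; converting the halves back to int32_t assumes a two's
 * complement platform. */
void model_imul_extended(int32_t x, int32_t y,
                         int32_t *msb, int32_t *lsb)
{
    uint64_t product = (uint64_t)((int64_t)x * (int64_t)y);
    *msb = (int32_t)(uint32_t)(product >> 32);
    *lsb = (int32_t)(uint32_t)product;
}
```

    Multiplying 0xFFFFFFFF by itself yields msb = 0xFFFFFFFE and lsb = 1 in
    the unsigned model; multiplying -2 by 3 yields msb = -1 and lsb = -6 in
    the signed model.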


GLX Protocol

    None.

Dependencies on ARB_gpu_shader_fp64

    This extension, ARB_gpu_shader_fp64, and NV_gpu_shader5 all modify the set
    of implicit conversions supported in the OpenGL Shading Language.  If more
    than one of these extensions is supported, an expression of one type may
    be converted to another type if that conversion is allowed by any of these
    specifications.

    If ARB_gpu_shader_fp64 or a similar extension introducing new data types
    is not supported, the function overloading rule in the GLSL specification
    preferring promotion of an input parameter from a smaller type to a larger
    type is never applicable, as all data types are of the same size.  That
    rule and the example referring to "double" should be removed.


Dependencies on NV_gpu_shader5

    This extension, ARB_gpu_shader_fp64, and NV_gpu_shader5 all modify the set
    of implicit conversions supported in the OpenGL Shading Language.  If more
    than one of these extensions is supported, an expression of one type may
    be converted to another type if that conversion is allowed by any of these
    specifications.

    If NV_gpu_shader5 is supported, integer data types are supported with four
    different precisions (8-, 16-, 32-, and 64-bit) and floating-point data
    types are supported with three different precisions (16-, 32-, and
    64-bit).  The extension adds the following rule for output parameters,
    which is similar to the one present in this extension for input
    parameters:

       5. If the formal parameters in both matches are output parameters, a
          conversion from a type with a larger number of bits per component is
          better than a conversion from a type with a smaller number of bits
          per component.  For example, a conversion from an "int16_t" formal
          parameter type to "int" is better than one from an "int8_t" formal
          parameter type to "int".

    Such a rule is not provided in this extension because there is no
    combination of types in this extension and ARB_gpu_shader_fp64 where this
    rule has any effect.


Errors

    None


New State

    None

New Implementation Dependent State

    None

Issues

    (1) What should this extension be called?

      UNRESOLVED.  This extension borrows from GL_ARB_gpu_shader5, so creating
      some sort of a play on that name would be viable.  However, nothing in
      this extension should require SM5 hardware, so such a name would be a
      little misleading and weird.

      Since the primary purpose is to add integer related functions from
      GL_ARB_gpu_shader5, call this extension GL_MESA_shader_integer_functions
      for now.

    (2) Why is some of the formatting in this extension weird?

      RESOLVED: This extension is formatted to minimize the differences (as
      reported by 'diff --side-by-side -W180') with the GL_ARB_gpu_shader5
      specification.

    (3) Should ldexp and frexp be included?

      RESOLVED: Yes.  Few GPUs have native instructions to implement these
      functions.  These are generally implemented using existing GLSL built-in
      functions and the other functions provided by this extension.

    (4) Should umulExtended and imulExtended be included?

      RESOLVED: Yes.  These functions should be implementable on any GPU that
      can support the rest of this extension, but the implementation may be
      complex.  The implementation on a GPU that only supports 32bit x 32bit =
      32bit multiplication would be quite expensive.  However, many GPUs
      (including OpenGL 4.0 GPUs that already support this function) have a
      32bit x 16bit = 48bit multiplier.  The implementation there is only
      trivially more expensive than regular 32bit multiplication.
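
      To give a sense of the cost on hardware limited to 32bit x 32bit =
      32bit multiplication, one possible lowering splits each operand into
      16-bit halves and combines four partial products.  The C sketch below
      is an assumption made for illustration, with a hypothetical function
      name; it is not taken from this specification or from any particular
      driver.

```c
#include <stdint.h>

/* 32x32 -> 64 unsigned multiply using only 32-bit arithmetic.  Each partial
 * product of two 16-bit halves fits in 32 bits without overflow. */
void umul_extended_16bit_parts(uint32_t x, uint32_t y,
                               uint32_t *msb, uint32_t *lsb)
{
    uint32_t x_lo = x & 0xFFFFu, x_hi = x >> 16;
    uint32_t y_lo = y & 0xFFFFu, y_hi = y >> 16;

    uint32_t ll = x_lo * y_lo;   /* contributes to bits  0..31 */
    uint32_t lh = x_lo * y_hi;   /* contributes to bits 16..47 */
    uint32_t hl = x_hi * y_lo;   /* contributes to bits 16..47 */
    uint32_t hh = x_hi * y_hi;   /* contributes to bits 32..63 */

    /* Sum everything that lands in bits 16..31; the part of this sum above
     * 16 bits is the carry into the high word. */
    uint32_t mid = (ll >> 16) + (lh & 0xFFFFu) + (hl & 0xFFFFu);

    *lsb = (ll & 0xFFFFu) | (mid << 16);
    *msb = hh + (lh >> 16) + (hl >> 16) + (mid >> 16);
}
```

      This version needs four multiplies plus several shifts, masks, and
      adds per component, which is consistent with the "quite expensive"
      assessment above.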

    (5) Should the pack and unpack functions be included?

      RESOLVED: No.  These functions are already available via
      GL_ARB_shading_language_packing.

    (6) Should the "BitsTo" functions be included?

      RESOLVED: No.  These functions are already available via
      GL_ARB_shader_bit_encoding.

Revision History

    Rev.      Date     Author    Changes
    ----  -----------  --------  -----------------------------------------
     3    31-Mar-2017  Jon Leech Add ES support (OpenGL-Registry/issues/3)
     2     7-Jul-2016  idr       Fix typo in #extension line
     1    20-Jun-2016  idr       Initial version based on GL_ARB_gpu_shader5.
