This provides some background on the design of the generated headers.  We
started out trying to generate bit fields but it evolved into the pack
functions because of a few limitations:

  1) Bit fields still generate terrible code today. Even with modern
     optimizing compilers you get multiple load+mask+store operations
     to the same dword in memory as you set individual bits. The
     compiler also has to generate code to mask out overflowing values
     (for example, if you assign 200 to a 2 bit field). Our driver
     never writes overflowing values so that's not needed. On the
     other hand, most compilers recognize that the template struct we
     use is a temporary variable and copy propagate the individual
     fields and do amazing constant folding.  You should take a look
     at the code that gets generated when you compile in release mode
     with optimizations.

  2) For some types we need to have overlapping bit fields. For
     example, some values are 64 byte aligned 32 bit offsets. The
     lower 5 bits of the offset are always zero, so the hw packs in a
     few misc bits in the lower 5 bits there. Other times a field can
     be either a u32 or a float. I tried to do this with overlapping
     anonymous unions and it became a big mess. Also, when using
     initializers, you can only initialize one union member so this
     just doesn't work with our approach.

     The pack functions on the other hand allow us a great deal of
     flexibility in how we combine things. In the case of overlapping
     fields (the u32 and float case), if we only set one of them in
     the pack function, the compiler will recognize that the other is
     initialized to 0 and optimize out the code to or it in.

  3) Bit fields (and certainly overlapping anonymous unions of bit
     fields) aren't generally stable across compilers in how they're
     laid out and aligned. Our pack functions let us control exactly
     how things get packed, using only simple and unambiguous bitwise
     shifting and or'ing that works on any compiler.
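As a rough illustration of the points above, here is a minimal sketch
of a template struct and its pack function. It is not the generated
code: the struct, its fields, and the helper names are all invented for
this example. It shows an aligned offset sharing a dword with a few low
bits, and a dword that can be set either as an integer or as a float.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical template struct; every name here is made up for illustration. */
struct example_state {
   uint32_t base_offset;  /* dword 0: 64 byte aligned offset, low bits zero */
   uint32_t misc_flags;   /* dword 0: bits 4:0, sharing the offset's low bits */
   uint32_t raw_value;    /* dword 1 as an integer... */
   float    float_value;  /* ...or as a float; callers set only one of them */
};

/* Reinterpret a float as its 32 bit pattern without aliasing problems. */
static inline uint32_t
example_float_bits(float f)
{
   uint32_t u;
   memcpy(&u, &f, sizeof(u));
   return u;
}

static inline void
example_state_pack(uint32_t *dw, const struct example_state *t)
{
   /* The offset must leave the shared low bits free. */
   assert((t->base_offset & 0x3f) == 0);

   /* Overlapping fields are simply OR'ed into the same dword. */
   dw[0] = t->base_offset | t->misc_flags;
   dw[1] = t->raw_value | example_float_bits(t->float_value);
}
```

A caller would fill the struct with a designated initializer, setting
only the fields it cares about, and pass it to example_state_pack().
The unset half of an overlapping pair stays zero, so its OR contributes
nothing and an optimizing compiler can drop it.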

Once we have the pack function it allows us to hook in various
transformations and validation as we go from template struct to dwords
in memory:

  1) Validation: As I said above, our driver isn't supposed to write
     overflowing values to the fields, but we've of course had lots of
     cases where we make mistakes and write overflowing values. With
     the pack function, we can actually assert on that and catch it at
     runtime.  Bit fields would just silently truncate.

  2) Type conversions: sometimes it's just a matter of writing a
     float to a u32, but we also convert from bool to bits and from
     floats to fixed point integers.

  3) Relocations: whenever we have a pointer from one buffer to
     another (for example a pointer from the metadata for a texture
     to the raw texture data), we have to tell the kernel about it so
     it can adjust the pointer to point to the final location. That
     means we have to do extra work to record and annotate the dword
     location that holds the pointer. With bit fields, we'd have to
     call a function to do this, but with the pack function we
     generate code in the pack function to do this for us. That's a
     lot less error prone and less work.
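The three hooks above could look roughly like the sketch below. Again,
this is only an illustration under assumptions: the helper names, the
sampler struct, its bit positions, and the relocation list are all
hypothetical and do not describe the real generated headers or the
kernel interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Validation: place v into bits [start, end], asserting that it fits. */
static inline uint32_t
example_uint(uint32_t v, unsigned start, unsigned end)
{
   const unsigned width = end - start + 1;
   if (width < 32)
      assert(v < (UINT32_C(1) << width));
   return v << start;
}

/* Type conversion: non-negative float to unsigned fixed point,
 * rounded to nearest, with frac_bits fractional bits. */
static inline uint32_t
example_ufixed(float v, unsigned start, unsigned end, unsigned frac_bits)
{
   assert(v >= 0.0f);
   const uint32_t fixed = (uint32_t)(v * (float)(1u << frac_bits) + 0.5f);
   return example_uint(fixed, start, end);
}

/* Hypothetical relocation bookkeeping: which dword points into which buffer. */
struct example_reloc {
   unsigned dword_index;
   uint32_t target_buffer;
   uint32_t delta;
};

struct example_reloc_list {
   struct example_reloc entries[8];
   unsigned count;
};

/* Relocation: record the pointer's location, return the value to write now. */
static inline uint32_t
example_address(struct example_reloc_list *relocs, unsigned dword_index,
                uint32_t target_buffer, uint32_t delta)
{
   assert(relocs->count < 8);
   relocs->entries[relocs->count++] =
      (struct example_reloc){ dword_index, target_buffer, delta };
   return delta;
}

/* Hypothetical template struct for a two dword packet. */
struct example_sampler {
   bool     enable;          /* dword 0, bit 0 */
   uint32_t filter;          /* dword 0, bits 2:1 */
   float    lod_bias;        /* dword 0, bits 15:8, unsigned 4.4 fixed point */
   uint32_t texture_buffer;  /* dword 1: buffer the pointer refers to */
   uint32_t texture_offset;  /* dword 1: offset within that buffer */
};

static inline void
example_sampler_pack(struct example_reloc_list *relocs, uint32_t *dw,
                     const struct example_sampler *t)
{
   dw[0] = example_uint(t->enable, 0, 0) |
           example_uint(t->filter, 1, 2) |
           example_ufixed(t->lod_bias, 8, 15, 4);
   dw[1] = example_address(relocs, 1, t->texture_buffer, t->texture_offset);
}
```

Because the relocation is recorded inside the pack function, a caller
cannot forget it: filling in texture_buffer and texture_offset is
enough. In this sketch, writing filter = 5 would trip the assert in a
debug build instead of being silently truncated.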

Keeping genxml files tidy:

   In order to spot differences easily between generations, we keep
   genxml files sorted. You can trigger the sort by running:

      $ cd src/intel/genxml; ./gen_sort_tags.py

   gen_sort_tags.py is the script that sorts genxml files using the
   following rules:

      1) Tags are grouped in the following order: <enum>, <struct>,
         <instruction>, <register>

      2) <field> tags are sorted by the value of their start attribute

      3) <struct> tags are sorted by dependency so that other scripts
         have everything properly ordered.