110 src->cat6.type 1 1 xxxxxxxxx 00 00000
LoaD Global
{SY}{JP}{NAME}.{TYPE} {TYPE_HALF}{DST}, g[{SRC1}{OFF}], {SIZE}
0 extract_reg_iim(src->srcs[1]) extract_reg_uim(src->srcs[2])
LoaD Global
{SY}{JP}{NAME}.{TYPE} {TYPE_HALF}{DST}, g[{SRC1}+({SRC2}{OFF})<<{SRC2_BYTE_SHIFT}], {SIZE}
{SY}{JP}{NAME}.{TYPE} {TYPE_HALF}{DST}, g[{SRC1}+{SRC2}<<{SRC2_BYTE_SHIFT}{OFF}<<2], {SIZE}
{SRC2_ADD_DWORD_SHIFT} > 0 0 1 {SRC2_ADD_DWORD_SHIFT} + 2 src->srcs[1] extract_reg_uim(src->srcs[2]) extract_reg_uim(src->srcs[3]) extract_reg_uim(src->srcs[4]) x xxxxxxxx 1x x 00011 1
STore Global
{SY}{JP}{NAME}.{TYPE} g[{SRC1}{OFF}], {TYPE_HALF}{SRC3}, {SIZE}
({OFF_HI} << 8) | {OFF_LO} 0 extract_reg_iim(src->srcs[1]) & 0xff extract_reg_iim(src->srcs[1]) >> 8 src->srcs[2] extract_reg_uim(src->srcs[3])
STore Global
{SY}{JP}{NAME}.{TYPE} g[{SRC1}+({SRC2}{OFF})<<{DST_BYTE_SHIFT}], {TYPE_HALF}{SRC3}, {SIZE}
{SY}{JP}{NAME}.{TYPE} g[{SRC1}+{SRC2}<<{DST_BYTE_SHIFT}{OFF}<<2], {TYPE_HALF}{SRC3}, {SIZE}
{SRC2_ADD_DWORD_SHIFT} > 0 {SRC2_ADD_DWORD_SHIFT} + 2 0 1 src->srcs[1] extract_reg_uim(src->srcs[2]) extract_reg_uim(src->srcs[3]) src->srcs[4] extract_reg_uim(src->srcs[5]) 1 x 1 xxxxxxxxx xx extract_reg_uim(src->srcs[1]) src->srcs[0] extract_reg_uim(src->srcs[2])
LoaD Local
{SY}{JP}{NAME}.{TYPE} {DST}, l[{SRC}{OFF}], {SIZE}
00001
LoaD Private
{SY}{JP}{NAME}.{TYPE} {DST}, p[{SRC}{OFF}], {SIZE}
00010
LoaD Local (variant used for passing data between geom stages)
{SY}{JP}{NAME}.{TYPE} {DST}, l[{SRC}{OFF}], {SIZE}
01010
LoaD Local Varying - read directly from varying storage
{SY}{JP}{NAME}.{TYPE} {DST}, l[{OFF}], {SIZE}
0 xxxxxxxx 11 xxxxxxxxx xx 11111 extract_reg_uim(src->srcs[1]) extract_reg_uim(src->srcs[0]) ({OFF_HI} << 8) | {OFF_LO} xxxxxxxxx 1 1 xx src->cat6.dst_offset >> 8 src->cat6.dst_offset & 0xff src->srcs[1] src->srcs[0] extract_reg_uim(src->srcs[2])
STore Local
{SY}{JP}{NAME}.{TYPE} l[{DST}{OFF}], {SRC}, {SIZE}
x 00100
STore Private
{SY}{JP}{NAME}.{TYPE} p[{DST}{OFF}], {SRC}, {SIZE}
0 00101
STore Local (variant used for passing data between geom stages)
{SY}{JP}{NAME}.{TYPE} l[{DST}{OFF}], {SRC}, {SIZE}
x 01011 {OFFSET} 0 a1.x{OFFSET} 1
Encoding for the stc destination, which can be constant or have an offset of a1.x.
extract_reg_uim(src->srcs[0])
STore Const - used in the shader prolog (between shps and shpe) to store "uniform folded" values into the CONST file.
NOTE: the TYPE field actually seems to be set to different values (e.g. f32 vs u32), but it seems that only the size (16-bit vs 32-bit) matters. Setting a 16-bit type (f16, u16, or s16) doesn't cause any promotion to 32-bit; instead, the 16-bit sources are stored one after the other, starting with the low half of the constant. So e.g. "stc.f16 c[1], hr0.x, 1" copies hr0.x to the bottom half of c0.y. There seems to be no way to set just the upper half. In any case, the blob seems to only use the 32-bit versions. The blob disassembly doesn't include the type, but we still display it so that we can preserve the different values the blob sets when round-tripping.
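The 16-bit stc behaviour described in the note above can be pictured with a small C model. This is only a sketch under stated assumptions: the const file is modeled as an array of 32-bit words (one per component, so c0.y is word 1), consecutive 16-bit sources are assumed to alternate between the low and high halves of successive words, and the half that is not written is assumed to stay unchanged. The name `stc16_model` is hypothetical and does not exist in the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: the const file as 32-bit words, one per component
 * (c0.x = word 0, c0.y = word 1, ...). */
static void stc16_model(uint32_t *const_file, unsigned dst,
                        const uint16_t *src, unsigned size)
{
    assert(const_file != NULL && src != NULL);

    for (unsigned i = 0; i < size; i++) {
        /* Assumption: sources are packed back to back, low half first. */
        unsigned word = dst + i / 2;
        unsigned shift = (i % 2) * 16;
        uint32_t mask = (uint32_t)0xffff << shift;

        /* Assumption: the half that is not written is left untouched. */
        const_file[word] = (const_file[word] & ~mask) |
                           ((uint32_t)src[i] << shift);
    }
}
```

With this model, a call with `dst = 1` and `size = 1` mirrors "stc.f16 c[1], hr0.x, 1": only the bottom half of word 1 (c0.y) changes.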
NOTE: this (stc) encoding conflicts with stgb from earlier gens.
{SY}{JP}{NAME}.{TYPE} c[{DST}], {SRC}, {SIZE}
x xxxxxxxxxxxxxx 1 xxxxx xxxxxxxx xx 11100 src src->srcs[1] src->cat6.iim_val
{SY}{JP}{NAME}.{TYPE}.{D}d {DST}, g[{SSBO}]
x xxxxxxxx x xx xxxxxxxx x x xxxxxxxx 0 x 01111 src->cat6.d - 1 src->srcs[0] !!(src->srcs[0]->flags & IR3_REG_IMMED) x src->cat6.d - 1 src src->cat6.iim_val - 1 src->srcs[0] !!(src->srcs[0]->flags & IR3_REG_IMMED)
{SY}{JP}{NAME}.{TYPED}.{D}d.{TYPE}.{TYPE_SIZE} {DST}, g[{SSBO}], {SRC1}, {SRC2}
xxxxxxxx x src->srcs[1] !!(src->srcs[1]->flags & IR3_REG_IMMED) src->srcs[2] !!(src->srcs[2]->flags & IR3_REG_IMMED) 11011 1 11011 x 00110 1
{SY}{JP}{NAME}.{TYPED}.{D}d.{TYPE}.{TYPE_SIZE} g[{SSBO}], {SRC1}, {SRC2}, {SRC3}
xxxxxxxxx src->srcs[1] !!(src->srcs[1]->flags & IR3_REG_IMMED) src->srcs[2] !!(src->srcs[2]->flags & IR3_REG_IMMED) src->srcs[3] !!(src->srcs[3]->flags & IR3_REG_IMMED) 0 1 11100 11101 11100 11101
Base for atomic instructions (I think mostly a4xx+, as a3xx didn't have real image/ssbo; it was all just global). Still used as of a6xx for local.
NOTE that the existing disasm and asm parser expect atomic inc/dec to still have an extra src. For now, match that.
{SY}{JP}{NAME}.{TYPED}.{D}d.{TYPE}.{TYPE_SIZE}.l {DST}, l[{SRC1}], {SRC2}
x src src->cat6.d - 1 src->cat6.iim_val - 1 extract_cat6_SRC(src, 0) !!(extract_cat6_SRC(src, 0)->flags & IR3_REG_IMMED) extract_cat6_SRC(src, 1) !!(extract_cat6_SRC(src, 1)->flags & IR3_REG_IMMED) 1 00000000 00000000 0 0 10000 10001 10010 10011 10100 10101 10110 10111 11000 11001 11010
Pre-a6xx atomics for Image/SSBO
{SY}{JP}{NAME}.{TYPED}.{D}d.{TYPE}.{TYPE_SIZE}.g {DST}, g[{SSBO}], {SRC1}, {SRC2}, {SRC3}
1 src->srcs[0] !!(src->srcs[0]->flags & IR3_REG_IMMED) extract_cat6_SRC(src, 2) !!(extract_cat6_SRC(src, 2)->flags & IR3_REG_IMMED) 0 10000 10001 10010 10011 10100 10101 10110 10111 11000 11001 11010 1 10000 10001 10010 10011 10100 10101 10110 10111 11000 11001 11010
a6xx+ global atomics which take iova in SRC1
{SY}{JP}{NAME}.{TYPED}.{D}d.{TYPE}.{TYPE_SIZE}.g {DST}, {SRC1}, {SRC2}
1 00000000 00000000 1 0 10000 10001 10010 10011 10100 10101 10110 10111 11000 11001 11010
Base for new instruction encoding that started being used with a6xx for instructions supporting bindless mode.
00 0 00000 extract_cat6_DESC_MODE(src) src->cat6.iim_val - 1 !!(src->flags & IR3_INSTR_B) src x x 011110 1xx x11 !!(src->srcs[1]->flags & IR3_REG_IMMED) src->srcs[1] src->srcs[0]
ldc.k copies a series of UBO values to constants. In other words, it acts the same as a series of ldc followed by stc. It is also similar to a CP_LOAD_STATE with a UBO source, but executed in the shader. Like CP_LOAD_STATE, the UBO offset and the const file offset must each be a multiple of 4 vec4s, but it can load any number of vec4s. The UBO descriptor and offset are the same as for a normal ldc. The const file offset is specified in a1.x, in units of components, and the number of vec4s to copy is specified in LOAD_SIZE.
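The ldc.k description above amounts to an aligned block copy from a UBO into the const file. The following C sketch models it under explicit assumptions: both the UBO and the const file are plain arrays of 32-bit components, the UBO offset is taken in units of vec4s (the text only says it matches a normal ldc, so this unit is a guess), and misaligned or out-of-range requests are simply rejected. `ldc_k_model` and its parameters are hypothetical names for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define COMPONENTS_PER_VEC4 4u
#define ALIGN_VEC4S         4u  /* offsets must be multiples of 4 vec4s */

/* Hypothetical model of ldc.k: copy load_size vec4s from the UBO to the
 * const file. a1x is the const file offset in components. */
static bool ldc_k_model(uint32_t *const_file, size_t const_len,
                        const uint32_t *ubo, size_t ubo_len,
                        unsigned ubo_off_vec4, unsigned a1x,
                        unsigned load_size)
{
    assert(const_file != NULL && ubo != NULL);

    /* Both offsets must be a multiple of 4 vec4s (16 components). */
    if (ubo_off_vec4 % ALIGN_VEC4S != 0)
        return false;
    if (a1x % (ALIGN_VEC4S * COMPONENTS_PER_VEC4) != 0)
        return false;

    size_t count = (size_t)load_size * COMPONENTS_PER_VEC4;
    size_t src = (size_t)ubo_off_vec4 * COMPONENTS_PER_VEC4;

    /* Bounds checks are an addition of this model, not taken from the text. */
    if (src + count > ubo_len || (size_t)a1x + count > const_len)
        return false;

    /* Any number of vec4s may be copied; only the offsets are constrained. */
    memcpy(&const_file[a1x], &ubo[src], count * sizeof(uint32_t));
    return true;
}
```

Note that the encoded field stores `iim_val - 1` (LOAD_SIZE_MINUS_ONE), so `load_size` here corresponds to the decoded `{LOAD_SIZE_MINUS_ONE} + 1`.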
{SY}{JP}ldc.{LOAD_SIZE}.k.{MODE}{BASE} c[a1.x], {SRC1}, {SRC2}
xx 11 src->cat6.iim_val - 1
LoaD Constant - UBO load
{SY}{JP}{NAME}.offset{OFFSET}.{TYPE_SIZE}.{MODE}{BASE} {DST}, {SRC1}, {SRC2}
10 src->cat6.d
GET Shader Processor ID?
{SY}{JP}{NAME}.{TYPE} {DST}
0 xx x 100100 x1xx xxxxxxxx xxxxxxxx 1x
GET Wavefront ID
{SY}{JP}{NAME}.{TYPE} {DST}
0 xx x 100101 x1xx xxxxxxxx xxxxxxxx 1x
GET Fiber ID (gl_SubgroupID)
{SY}{JP}{NAME}.{TYPE} {DST}
0 xx x 100110 11xx xxxxxxxx xxxxxxxx 1x
RESourceINFO - returns image/ssbo dimensions (3 components)
{SY}{JP}{NAME}.{TYPED}.{D}d.{TYPE}.{TYPE_SIZE}.{MODE}{BASE} {DST}, {SSBO}
0 001111 0110 xxxxxxxx 1x src->cat6.d - 1 src src->srcs[0] src->srcs[1]
IBO (i.e. Image/SSBO) instructions
{SY}{JP}{NAME}.{TYPED}.{D}d.{TYPE}.{TYPE_SIZE}.{MODE}{BASE} {TYPE_HALF}{SRC1}, {SRC2}, {SSBO}
0110 src src->cat6.d - 1 src->srcs[0] src->srcs[2] src->srcs[1]
STore IBo
0 011101 10
LoaD IBo
x 000110 10 src->dsts[0] x 010000 11 x 010001 11 x 010010 11 x 010101 11 x 010110 11 x 010111 11 x 011000 11 x 011001 11 x 011010 11 {D_MINUS_ONE} + 1 {TYPE_SIZE_MINUS_ONE} + 1 {LOAD_SIZE_MINUS_ONE} + 1 {TYPED} typed untyped src->cat6.typed {BINDLESS} .base{BASE} src->cat6.base
Source value that can be either immed or gpr
{SRC_IM} {IMMED} r{GPR}.{SWIZ} src->num >> 2 src->num & 0x3 extract_reg_iim(src) {MODE} == 0
Source mode for "new" a6xx+ instruction encodings
Immediate index.
Index from a uniform register (i.e. does not depend on flow control)
Index from a non-uniform register (i.e. potentially depends on flow control)
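Several small arithmetic conventions recur in the expressions above: an offset split into `OFF_HI`/`OFF_LO` and rejoined as `({OFF_HI} << 8) | {OFF_LO}`, a register number split into `r{GPR}.{SWIZ}` via `num >> 2` and `num & 0x3`, and count fields stored as "value minus one". The C sketch below shows these round trips in isolation. All function and struct names are hypothetical, offsets are assumed to be non-negative, and the mapping of swizzle values 0..3 to `x`, `y`, `z`, `w` is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct split_offset {
    unsigned hi;  /* OFF_HI: bits above the low byte */
    unsigned lo;  /* OFF_LO: low 8 bits */
};

/* Encode side: mirrors "offset >> 8" and "offset & 0xff". */
static struct split_offset split_off(unsigned off)
{
    struct split_offset s = { off >> 8, off & 0xff };
    return s;
}

/* Decode side: mirrors "({OFF_HI} << 8) | {OFF_LO}". */
static unsigned join_off(struct split_offset s)
{
    return (s.hi << 8) | s.lo;
}

/* Format a register number as r{GPR}.{SWIZ}; the xyzw mapping is assumed. */
static int format_gpr(char *buf, size_t len, unsigned num)
{
    static const char swiz[] = "xyzw";
    return snprintf(buf, len, "r%u.%c", num >> 2, swiz[num & 0x3]);
}

/* "Minus one" fields such as D_MINUS_ONE or LOAD_SIZE_MINUS_ONE. */
static unsigned encode_minus_one(unsigned value)
{
    assert(value >= 1);  /* a count of zero is not representable */
    return value - 1;
}

static unsigned decode_minus_one(unsigned field)
{
    return field + 1;
}
```

For example, register number 5 formats as `r1.y` under the assumed swizzle mapping, and `join_off(split_off(off))` returns `off` for any non-negative offset that fits the combined field.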