Lines Matching defs:that

135 fs_inst::fs_inst(const fs_inst &that)
137 memcpy((void*)this, &that, sizeof(that));
139 this->src = new fs_reg[MAX2(that.sources, 3)];
141 for (unsigned i = 0; i < that.sources; i++)
142 this->src[i] = that.src[i];
175 * components starting from that.
178 * and a portion done using fs_reg::offset, which means that if you have
181 * later notice that those loads are all the same and eliminate the
326 * instruction that is its last use. For a single instruction, the
332 * - Virtual opcodes that translate to multiple instructions in the
352 * that one of the instructions will read from a channel corresponding
615 * things that are unsupported in SIMD16+ mode, so the compiler can skip the
635 * Returns true if the instruction has a flag that means it won't
639 * when a write to a variable screens off any preceding values that were in
983 /* Return the subset of flag registers that an instruction could
1062 * Note that this is not the 0 or 1 implied writes in an actual gen
1240 * We can use the fact that bit 15 is the MSB of g0.0:W to accomplish
1407 * FINISHME: One day, we could come up with a way to do this that
1478 * coarse pixel dispatch mode, so report 0 when that's not the case.
1482 * rate while the SPIR-V built-in is the enum value that has the shading
1740 * Build up an array of indices into the urb_setup array that
1875 * This is useful because it means that (a) inputs not used by the
1900 /* We have enough input varyings that the SF/SBE pipeline stage can't
1902 * in an order that matches the output of the previous pipeline stage
1906 /* Re-compute the VUE map here in the case that the one coming from
1976 * setup regs, now that the location of the constants has been chosen.
2025 * rule implies that elements within a 'Width' cannot cross GRF
2028 * So, for registers that are large enough, we have to split the exec
2065 /* Rewrite all ATTR file references to the hw grf that they land in. */
2344 /* We just found an unused register. This means that we are
2426 /* Now that we know how many regular uniforms we'll push, reduce the
2656 /* On Gfx8+, the OR instruction can have a source modifier that
2772 /* It's possible that the selected component will be too large and
2825 * Optimize sample messages that have constant zero values for the trailing
2828 * that aren't sent default to zero anyway. This will cause the dead code
2829 * eliminator to remove the MOV instruction that would otherwise be emitted to
2837 * parameters that have to be provided for some texture types
2880 * Gfx9+ supports "split" SEND messages, which take two payloads that are
2882 * we can split that payload in two. This results in smaller contiguous
2883 * register blocks for us to allocate. But it can help beyond that, too.
2886 * For example, a sampler message often contains a x/y/z coordinate that may
3132 * things that computed the value of all GRFs of the source region. The
3147 * that writes that reg, but it would require smarter
3182 * values that end up in MRFs are shortly before the MRF
3189 * MRF's source GRF that we wanted to rewrite, that stops us.
3204 * compute-to-MRF before that.
3212 /* Found a SEND instruction, which means that there are
3290 /* The optimization below assumes that channel zero is live on thread
3401 /* Now that we have the uniform assigned, go ahead and force it to a vec4. */
3447 /* Clear out the last-write records for MRFs that were overwritten. */
3537 /* Clear the flag for registers that actually got read (as expected). */
3560 * must ensure that there is no destination hazard for the case of ‘write
3569 * same time that both consider ‘r3’ as the target of their final writes.
3587 * we assume that there are no outstanding dependencies on entry to the
3591 /* If we hit control flow, assume that there *are* outstanding
3603 /* We insert our reads as late as possible on the assumption that any
3604 * instruction but a MOV that might have left us an outstanding
3622 /* Clear the flag for registers that actually got read (as expected). */
3667 /* Clear the flag for registers that actually got read (as expected). */
3724 * Note that execution masking for setting up pull constant loads is special:
3725 * the channels that need to be written are unrelated to the current execution
3891 * that into account now.
3934 * If multiplying by an immediate value that fits in 16-bits, do a
3935 * single MUL instruction with that value in the proper location.
3980 * We avoid the shl instruction by realizing that we only want to add
4024 * lowered by the subsequent lower_regioning pass. In this case that
4171 * that access the accumulator implicitly (e.g. MACH). A
4179 * accumulator register that doesn't exist, but on earlier Gfx7
4180 * hardware we need to make sure that the quarter control bits are
4199 /* If the instruction is already in a form that does not need lowering,
4254 /* If src1 is an immediate value that is not NaN, then it can't be
4255 * NaN. In that case, emit CMP because it is much better for cmod
4466 /* Unlike the regular gl_HelperInvocation, that is defined at dispatch,
4481 /* The at() ensures that any code emitted to get the predicate happens
4535 * some common regioning and execution control restrictions that apply to FPU
4562 * which is the one that is going to limit the overall execution size of
4601 * up with writes to 4 registers and a source that reads 2 registers
4602 * and we may still need to lower all the way to SIMD8 in that case.
4654 /* From the IVB PRMs (applies to other devices that don't have the
4666 * it's hardwired to use NibCtrl+1, at least on HSW), which means that
4704 * empirical testing with existing CTS tests show that they pass just fine
4706 * is that conversion MOVs between HF and F are still mixed-float
4709 * lift the restriction if we can ensure that it is safe though, since these
4734 * various payload size restrictions that apply to sampler message
4758 /* Calculate the number of coordinate components that have to be present
4759 * assuming that additional arguments follow the texel coordinates in the
4779 /* Calculate the total number of argument components that need to be passed
4854 /* The Ivybridge/BayTrail WaCMPInstFlagDepClearedEarly workaround says that
4871 /* The Haswell WaForceSIMD8ForBFIInstruction workaround says that we
4932 * shorter return payload would be to use the SIMD8 sampler message that
5135 * Extract the data that would be consumed by the channel group given by
5185 * the results of multiple lowered instructions in order to make sure that
5223 * the temporary as result. Any copy instructions that are required for
5292 * we're sure that both cases can be handled.
5318 * it off here so that we insert the zip instructions in the right
5322 * instructions will end up in the reverse order that we insert them.
5323 * However, certain render target writes require that the low group
5888 * same order that they appear in the brw_barycentric_mode enum. Each
5978 * Note that the GS reads <URB Read Length> HWords for every vertex - so we
6037 * make sure that optimizations set the execution controls explicitly to
6081 * instruction is encountered, and again when the user of that result is
6203 * "It is required that the second block of GRFs does not overlap with the
6250 * ARF NULL is not allowed. Fix that up by allocating a temporary GRF.
6273 /* This workaround is about making sure that any instruction writing
6277 * important is anything that hasn't completed. Usually any SEND
6278 * instruction that has a destination register will be read by something
6281 * instructions that don't have a destination register.
6340 * Find the first instruction in the program that might start a region of
6366 * horizontal predicate that makes sure that their execution is omitted when
6399 /* Note that this doesn't handle BRW_OPCODE_HALT since only
6414 /* Note that the vast majority of NoMask SEND instructions in the
6425 * have no straightforward way to detect that currently, so just
6570 * it inserts dead code that happens to have side effects, and it does
6612 * that we could allocate a larger buffer, and partition it out
7129 * variables so that we catch interpolateAtCentroid() messages too, which
7544 * at the top to select the shader. We've never implemented that.
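
The matched lines 135-142 near the top of this listing come from the fs_inst copy constructor. A rough reassembly is sketched below; only the listed lines are taken from the search output, while the braces, blank lines, and the definitions of fs_inst, fs_reg, and MAX2 are assumed from the surrounding source rather than shown here.

   /* Sketch reassembled from the matches at lines 135-142 above; the
    * structure between the matched lines is assumed, not verbatim.
    */
   fs_inst::fs_inst(const fs_inst &that)
   {
      /* Shallow-copy every field, including the src pointer... */
      memcpy((void*)this, &that, sizeof(that));

      /* ...then give this instruction its own source array (at least 3
       * slots, per the MAX2(that.sources, 3) allocation in the listing)...
       */
      this->src = new fs_reg[MAX2(that.sources, 3)];

      /* ...and deep-copy the sources that are actually in use. */
      for (unsigned i = 0; i < that.sources; i++)
         this->src[i] = that.src[i];
   }

The memcpy duplicates the src pointer along with everything else, so the fresh allocation and element-by-element copy immediately afterwards are what keep the copy from aliasing the original instruction's operand storage.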