Lines Matching defs:that

135 fs_inst::fs_inst(const fs_inst &that)
137 memcpy((void*)this, &that, sizeof(that));
139 this->src = new fs_reg[MAX2(that.sources, 3)];
141 for (unsigned i = 0; i < that.sources; i++)
142 this->src[i] = that.src[i];
175 * components starting from that.
178 * and a portion done using fs_reg::offset, which means that if you have
181 * later notice that those loads are all the same and eliminate the
326 * instruction that is its last use. For a single instruction, the
332 * - Virtual opcodes that translate to multiple instructions in the
352 * that one of the instructions will read from a channel corresponding
615 * things that are unsupported in SIMD16+ mode, so the compiler can skip the
635 * Returns true if the instruction has a flag that means it won't
639 * when a write to a variable screens off any preceding values that were in
983 /* Return the subset of flag registers that an instruction could
1062 * Note that this is not the 0 or 1 implied writes in an actual gen
1240 * We can use the fact that bit 15 is the MSB of g0.0:W to accomplish
1407 * FINISHME: One day, we could come up with a way to do this that
1478 * coarse pixel dispatch mode, so report 0 when that's not the case.
1482 * rate while the SPIR-V built-in is the enum value that has the shading
1740 * Build up an array of indices into the urb_setup array that
1875 * This is useful because it means that (a) inputs not used by the
1900 /* We have enough input varyings that the SF/SBE pipeline stage can't
1902 * in an order that matches the output of the previous pipeline stage
1906 /* Re-compute the VUE map here in the case that the one coming from
1976 * setup regs, now that the location of the constants has been chosen.
2025 * rule implies that elements within a 'Width' cannot cross GRF
2028 * So, for registers that are large enough, we have to split the exec
2065 /* Rewrite all ATTR file references to the hw grf that they land in. */
2344 /* We just found an unused register. This means that we are
2426 /* Now that we know how many regular uniforms we'll push, reduce the
2656 /* On Gfx8+, the OR instruction can have a source modifier that
2772 /* It's possible that the selected component will be too large and
2825 * Optimize sample messages that have constant zero values for the trailing
2828 * that aren't sent default to zero anyway. This will cause the dead code
2829 * eliminator to remove the MOV instruction that would otherwise be emitted to
2837 * parameters that have to be provided for some texture types
2880 * Gfx9+ supports "split" SEND messages, which take two payloads that are
2882 * we can split that payload in two. This results in smaller contiguous
2883 * register blocks for us to allocate. But it can help beyond that, too.
2886 * For example, a sampler message often contains a x/y/z coordinate that may
3132 * things that computed the value of all GRFs of the source region. The
3147 * that writes that reg, but it would require smarter
3182 * values that end up in MRFs are shortly before the MRF
3189 * MRF's source GRF that we wanted to rewrite, that stops us.
3204 * compute-to-MRF before that.
3212 /* Found a SEND instruction, which means that there are
3290 /* The optimization below assumes that channel zero is live on thread
3401 /* Now that we have the uniform assigned, go ahead and force it to a vec4. */
3447 /* Clear out the last-write records for MRFs that were overwritten. */
3537 /* Clear the flag for registers that actually got read (as expected). */
3560 * must ensure that there is no destination hazard for the case of ‘write
3569 * same time that both consider ‘r3’ as the target of their final writes.
3587 * we assume that there are no outstanding dependencies on entry to the
3591 /* If we hit control flow, assume that there *are* outstanding
3603 /* We insert our reads as late as possible on the assumption that any
3604 * instruction but a MOV that might have left us an outstanding
3622 /* Clear the flag for registers that actually got read (as expected). */
3667 /* Clear the flag for registers that actually got read (as expected). */
3724 * Note that execution masking for setting up pull constant loads is special:
3725 * the channels that need to be written are unrelated to the current execution
3891 * that into account now.
3934 * If multiplying by an immediate value that fits in 16-bits, do a
3935 * single MUL instruction with that value in the proper location.
3980 * We avoid the shl instruction by realizing that we only want to add
4024 * lowered by the subsequent lower_regioning pass. In this case that
4171 * that access the accumulator implicitly (e.g. MACH). A
4179 * accumulator register that doesn't exist, but on earlier Gfx7
4180 * hardware we need to make sure that the quarter control bits are
4199 /* If the instruction is already in a form that does not need lowering,
4254 /* If src1 is an immediate value that is not NaN, then it can't be
4255 * NaN. In that case, emit CMP because it is much better for cmod
4466 /* Unlike the regular gl_HelperInvocation, that is defined at dispatch,
4481 /* The at() ensures that any code emitted to get the predicate happens
4535 * some common regioning and execution control restrictions that apply to FPU
4562 * which is the one that is going to limit the overall execution size of
4601 * up with writes to 4 registers and a source that reads 2 registers
4602 * and we may still need to lower all the way to SIMD8 in that case.
4654 /* From the IVB PRMs (applies to other devices that don't have the
4666 * it's hardwired to use NibCtrl+1, at least on HSW), which means that
4704 * empirical testing with existing CTS tests show that they pass just fine
4706 * is that conversion MOVs between HF and F are still mixed-float
4709 * lift the restriction if we can ensure that it is safe though, since these
4734 * various payload size restrictions that apply to sampler message
4758 /* Calculate the number of coordinate components that have to be present
4759 * assuming that additional arguments follow the texel coordinates in the
4779 /* Calculate the total number of argument components that need to be passed
4854 /* The Ivybridge/BayTrail WaCMPInstFlagDepClearedEarly workaround says that
4871 /* The Haswell WaForceSIMD8ForBFIInstruction workaround says that we
4932 * shorter return payload would be to use the SIMD8 sampler message that
5135 * Extract the data that would be consumed by the channel group given by
5185 * the results of multiple lowered instructions in order to make sure that
5223 * the temporary as result. Any copy instructions that are required for
5292 * we're sure that both cases can be handled.
5318 * it off here so that we insert the zip instructions in the right
5322 * instructions will end up in the reverse order that we insert them.
5323 * However, certain render target writes require that the low group
5888 * same order that they appear in the brw_barycentric_mode enum. Each
5978 * Note that the GS reads <URB Read Length> HWords for every vertex - so we
6037 * make sure that optimizations set the execution controls explicitly to
6081 * instruction is encountered, and again when the user of that result is
6203 * "It is required that the second block of GRFs does not overlap with the
6250 * ARF NULL is not allowed. Fix that up by allocating a temporary GRF.
6273 /* This workaround is about making sure that any instruction writing
6277 * important is anything that hasn't completed. Usually any SEND
6278 * instruction that has a destination register will be read by something
6281 * instructions that don't have a destination register.
6340 * Find the first instruction in the program that might start a region of
6366 * horizontal predicate that makes sure that their execution is omitted when
6399 /* Note that this doesn't handle BRW_OPCODE_HALT since only
6414 /* Note that the vast majority of NoMask SEND instructions in the
6425 * have no straightforward way to detect that currently, so just
6570 * it inserts dead code that happens to have side effects, and it does
6612 * that we could allocate a larger buffer, and partition it out
7129 * variables so that we catch interpolateAtCentroid() messages too, which
7544 * at the top to select the shader. We've never implemented that.
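
The matched lines 135-142 near the top of this listing come from the fs_inst copy constructor. A rough reassembly is sketched below; only the listed lines are taken from the search output, while the braces, blank lines, and the definitions of fs_inst, fs_reg, and MAX2 are assumed from the surrounding source rather than shown here.

   /* Sketch reassembled from the matches at lines 135-142 above; the
    * structure between the matched lines is assumed, not verbatim.
    */
   fs_inst::fs_inst(const fs_inst &that)
   {
      /* Shallow-copy every field, including the src pointer... */
      memcpy((void*)this, &that, sizeof(that));

      /* ...then give this instruction its own source array (at least 3
       * slots, per the MAX2(that.sources, 3) allocation in the listing)...
       */
      this->src = new fs_reg[MAX2(that.sources, 3)];

      /* ...and deep-copy the sources that are actually in use. */
      for (unsigned i = 0; i < that.sources; i++)
         this->src[i] = that.src[i];
   }

The memcpy duplicates the src pointer along with everything else, so the fresh allocation and element-by-element copy immediately afterwards are what keep the copy from aliasing the original instruction's operand storage.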