Lines Matching full:aco
89 - DRI_PRIME fails with ACO only radeonsi
119 - [radeonsi] Wargame: Red Dragon /w OpenGL stopped working with ACO
130 - aco: SwizzleInvocationsMaskedAMD behavior is not correct for reads from inactive lanes
177 - aco: Assertion when compiling CP2077 shader
214 - aco, radv Rage 2 menu corruption - bisected
215 - radv, aco: World War Z character texture regression on 7900xtx
228 - Baldurs Gate 3 (DX11) - Graphical corruption on RDNA3 (ACO regression)
499 - aco: Remove is_ssa check
832 - aco: fix nir_op_vec8/16 with 16-bit elements.
833 - aco: Fix some constant patterns in 16-bit vec4 construction with s_pack.
843 - aco: Add WMMA instructions.
844 - aco: Make RA understand WMMA instructions.
1043 - amd/ci: update radv-stoney-aco-fails.txt for depth/stencil clear
1045 - amd/ci: update radv-stoney-aco-fails.txt for depth/stencil resolve
1252 - aco: append p_logical_end after monolithic RT shaders
1253 - aco/insert_exec_mask: set Exact mode after p_discard_if when necessary
1254 - aco: don't optimize cross-lane instructions across p_wqm
1255 - aco: make p_wqm a marker instruction without Operands/Definitions
1256 - aco: don't insert a copy when emitting p_wqm
1257 - aco: insert a single p_end_wqm after the last derivative calculation
1258 - aco/insert_exec_mask: Simplify WQM handling (1/2)
1259 - aco/insert_exec_mask: Simplify WQM handling (2/2)
3015 - aco/gfx11: fix get_gfx11_true16_mask with v_cmp_class_f16
3016 - aco: improve get_gfx11_true16_mask description
3017 - aco: combine a & ~b to bfi(b, 0, a)
3018 - aco/gfx11: use v_cmp_class_f16 with opsel for bitnz/bitz
3019 - aco: fix non constant 16bit bitnz/bitz
3021 - aco: use s_bitreplicate_b64_b32 to set exec to 0xffff0000ffff0000
3023 - aco: always use rtne for fquantize2f16
3028 - aco: fix u2f16 with 32bit input
3029 - aco: combine a | ~b to bfi(b, a, -1)
3030 - aco: use v_cvt_f32_ubyte for signed casts too
3033 - aco: implement some exclusive scans with inclusive scans
3034 - aco/gfx11: don't use bfe for local_invocation_id if the others are always 0
3036 - aco: simplify masked swizzle dpp selection by removing or_mask first
3037 - aco: fix p_extract with v1 dst and s1 operand
3038 - aco: implement 64bit div find_lsb
3040 - aco/optimizer: check if we can use omod before labeling it
3041 - aco/optimizer: copy propagate to output modifier instructions
3042 - aco: remove -0.0 for 32 bit fsign with mul_legacy/omod when denorms are flushed
3045 - aco: assume new generations are unsupported by clrx
3046 - aco: assume newer generation will use GFX11 wait_imm packing
3047 - aco: print final ir instead if printing asm is unsupported
3048 - aco/gfx11: optimize dual source export
3049 - aco/gfx11: apply clamp/omod to vinterp
3050 - aco: support v_fma_f32_dpp as fma_mix
3051 - aco/gfx11: support vinterp as fma_mix
3052 - aco: add missing scc def for SALU quad broadcast
3053 - aco/sched: treat p_dual_src_export_gfx11 like export
3987 - aco: Do not fixup registers if there are no shader calls
3995 - aco/validate: Handle p_wqm like p_parallelcopy
3996 - aco: Use bytes() instead of size() in emit_wqm
3997 - aco: Unify demote and demote_if selection
3999 - aco/lower_to_cssa: Fix typo
4020 - aco/spill: Make sure that offset stays in bounds
5106 - aco,radv: replace tess_input_vertices shader info param
5107 - radeonsi: aco does not pass LS outputs to HS by arg
5108 - radeonsi: extract si_get_prev_stage_nir_shader to be shared with aco
5109 - radeonsi: init aco shader info for merged LS/HS
5115 - radeonsi: aco compile support merged mono shader
5117 - radeonsi: enable aco compile for mono merged LS/HS
5118 - radeonsi: enable aco compile for mono merged ES/GS
5119 - aco: extract aco_compile_shader_part from aco_compile_ps_epilog
5120 - aco: add p_end_with_regs pseudo instruction
5121 - aco: move jump to epilog out of ic_merged_wave_info
5122 - aco: add tcs end regs for epilog usage
5123 - aco: allow tcs with epilog to keep nir store output instruction
5124 - aco: add pending_lds_access option for insert waitcnt
5125 - aco: add tcs epilog generation for radeonsi
5126 - aco: don't emit s_endpgm for tcs with epilog
5127 - aco: skip scratch init when no scratch arg provide
5128 - aco,radeonsi: save const addr to symbol
5130 - aco: use semantic location as io temp index
5133 - radeonsi: share si_get_tcs_out_patch_stride with aco
5134 - radeonsi: fill part mode tcs aco shader info
5140 - radeonsi: part mode standalone tcs support aco compile
5142 - aco: simplify setup_tcs_info
5143 - aco: pass sw_stage when setup_isel_context
5144 - aco: prepare fix_ls_vgpr_init_bug to be used by gl vs prolog
5145 - aco: add vs prolog instruction selection for radeonsi
5146 - aco: add aco compile interface for radeonsi vs prolog
5147 - aco: do not fix_exports when program is prolog
5150 - radeonsi: extract si_get_vs_prolog_args to be shared with aco
5151 - radeonsi: fix aco options has_ls_vgpr_init_bug setup
5152 - radeonsi: add vs prolog aco build
5153 - radeonsi: set vs has prolog aco shader info
5154 - radeonsi: enable aco compile for part mode standalone vs
5155 - aco,radv,radeonsi: rename is_monolithic to merged_shader_compiled_separately
5157 - aco: do not eliminate final exec write when p_end_with_regs block
5158 - aco: remove p_end_with_regs from needs_exact()
5159 - aco: add ps prolog generation for radeonsi
5160 - aco: handle ps outputs from radeonsi
5161 - aco: add create_fs_end_for_epilog for radeonsi
5162 - aco,radv: remove unused ps epilog info fields
5163 - aco,radv: rename ps epilog info inputs to colors
5164 - aco: simplify export_fs_mrt_color
5165 - aco,radv: add radeonsi spec ps epilog code
5166 - aco: compact ps epilog color export for radeonsi
5167 - aco,radv,radeonsi: pass spi ps input ena and addr
5168 - aco: do not fix_exports when program has epilog
5169 - aco: fix assertion fail when program contains empty block
5170 - aco: create exit block for p_end_with_regs to branch to
5171 - aco: wait memory ops done before go to next shader part
5172 - radeonsi: reduce sgpr count for scratch_offset when aco
5175 - radeonsi: extract si_get_ps_prolog_args to be shared with aco
5178 - radeonsi: extract si_get_ps_epilog_args to be shared with aco
5179 - radeonsi: fill aco shader info for ps part
5181 - radeonsi: enable aco compile for part mode ps
5182 - radeonsi: disable disk cache when use aco
5232 - aco: insert s_nop before VGPR deallocation
5238 - aco: summarize register demand after handling branches
5239 - aco: don't create sendmsg(dealloc_vgprs) if scratch is used
5248 - aco: fix p_bpermute_gfx6 with input at non-zero byte
5249 - aco: fix p_bpermute_gfx6's exec save/restore with wave32
5250 - aco: clarify bpermute pseudo opcode names
5251 - aco: add adjust_bpermute_dst helper
5252 - aco/spill: skip p_branch in process_block
5253 - aco/spill: add all live-in to merge block spill candidates
5257 - aco: remove fast path in insert_exec_mask's process_instructions
5258 - aco/optimizer_postRA: check overwritten_subdword in is_overwritten_since()
5259 - aco: check logical_phi_info at p_logical_end when eliminating exec writes
5260 - aco: remove unused p_logical_end check when optimizing branching sequence
5263 - aco: reset prefetch in the correct block after removing the exit
5264 - aco/waitcnt: replace wait_cnt::*_cnt with booleans
5265 - aco/waitcnt: add print helpers
5268 - aco/optimizer_postRA: don't combine DPP across exec on GFX8/9
5269 - aco: don't combine DPP into v_cmpx
5270 - aco: disable zero offset optimization for strict WQM coords
5272 - aco: remove zero offset optimization
5273 - aco: shrink DPP8_instruction
5274 - aco: add fetch_inactive field to DPP instructions
5276 - aco: disable FI for quad/masked swizzle
5277 - aco: fix LdsDirectVMEMHazard WaW with the wrong waitcnt
5278 - aco: only mitigate VcmpxExecWARHazard when necessary
5279 - aco: fix s_setreg hazards
5280 - aco: consider exec_hi in reads_exec()
5281 - aco: resolve all possible hazards at the end of shader parts
5282 - aco/tests: test that hazards are resolved at the end of shader parts
5288 - aco,nir: add export_row_amd intrinsic
5571 - aco: add aco_shader_info::tcs::has_epilog
5572 - aco: add infra for compiling TCS epilogs
5573 - radv,aco: move has_epilog to radv_shader_info
5638 - aco: fix jumping from main TCS to epilog on GFX9+
5639 - aco: adjust TCS epilogs for RADV
5640 - aco: allow SGPRs operands with p_jump_to_epilog
5641 - aco: implement create_tcs_jump_to_epilog()
5648 - aco: rework printing shader stages
5659 - aco: disable shared VGPRs for non-monolithic shaders on GFX9+
5660 - aco: ensure to initialize exec manually for VS as LS on GFX9+
5661 - aco: add support for compiling VS+TCS separately on GFX9+
5664 - aco: ensure to initialize exec manually for non-monolithic {VS,TES}/GS on GFX9+
5665 - aco: add support for compiling {VS,TES}+GS separately on GFX9+
5666 - radv,aco: remove unused clip/cull distances variables
5671 - aco: fix emitting TCS epilogs end on GFX9+
5678 - aco: flag blocks with long-jump as export_end for separate compilation
5679 - aco: adjust fix_exports() for VS/TES as NGG and non-monolithic shaders
5680 - aco: allow separate compilation of NGG shaders
5785 - amd/llvm,aco,radv: implement NGG streamout with GDS_STRMOUT registers on GFX11
6071 - aco: Fix subgroup_id intrinsic on GFX10.3+.
6078 - aco: Remove subgroup_id and num_subgroups intrinsics.
6080 - aco: Refactor select_program to smaller functions.