Lines Matching full:aco
63 - [ACO] [RADV] Flickering squares in some areas in The Last of Us Part 1 (with workaround)
72 - aco: s_load_dword with negative soffset causes GPU hang
83 - aco: missing dependency on generated header
107 - aco: unused vtmp_in_loop
150 - ACO: dEQP-VK.binding_model.descriptor_buffer.multiple.graphics_geom_buffers1_sets3_imm_samplers h…
191 - radv/zink: ACO assert with DOOM2016
760 - aco: Pass correct number of coords to Vega 1D LOD instruction.
989 - aco: fix NIR infinite loops
991 - aco/dominance: set immediate dominator for any BB without predecessors
992 - aco/value_numbering: clear hashmap between disconnected CFGs
993 - aco/dead_code_analysis: don't add artificial uses to p_startpgm
994 - aco/insert_exec_mask: allow for disconnected CFG
995 - aco/spill: allow for disconnected CFG
1003 - aco: add RT stage enums
1004 - aco: don't set private_segment_buffer/scratch_offset on GFX9+
1005 - aco: move rt_dynamic_callable_stack_base_amd to VGPR
1006 - aco: implement load_ray_launch_{id|size}
1007 - aco: create hw_init_scratch() function for p_init_scratch lowering
1008 - aco: implement select_rt_prolog()
1012 - aco: remove aco::rt_stack variable
1024 - aco: split ps_epilog args before exporting them
1025 - aco/ra: adjust_max_used_regs() for fixed Operands
1026 - aco: don't use shared VGPRs for shaders consisting of multiple binaries
1222 - aco: drop leftover variable
1930 - aco: Swap operands for v_and_b32 in RT prolog
1932 - aco: Un-swap addressable VGPRs/SGPRs in RT prolog
1952 - Revert "aco: Combine v_cvt_u32_f32 with insert to v_cvt_pk_u8_f32."
1953 - aco: use s_bfm_64 for constant copies
1954 - aco: use s_pack_ll_b32_b16 for constant copies
1955 - aco: Improve wave64 cycle estimates.
1956 - aco: fix imod/omod for gfx11 VOP3 opcodes
1957 - aco: add mov/cndmask opcodes to does_fp_op_flush_denorms
1958 - aco: don't allow output modifiers for v_cvt_pkrtz_f16_f32
1959 - aco: allow output modifiers for ldexp_f16
1960 - aco: don't list imod/omod support v_fmaak_f32/v_fmamk_f32
1961 - aco: support omod/imod for v_fmac_f16
1962 - aco: remove stale TODOs about v_interp opsel
1963 - aco: new 16bit VOP3 opcodes can use opsel
1964 - aco: Don't use vcmpx with DPP.
1965 - aco: combine a ^ ~b and ~(a ^ b) to v_xnor_b32
1970 - aco: use v_permlane(x)16_b32 for masked swizzle
1971 - aco/gfx11: use dpp_row_xmask and dpp_row_share
1972 - aco: use and swizzle mask in dpp quad perm
1973 - aco/optimizer_postRA: assume all registers are untrackable in loop headers
1975 - aco: mark mad definition as precise if the mul/add were precise
1976 - aco: use v_fma_mix_f32 for v_fma_f32 with 2 fp16 representable, different literals
1978 - aco: treat VINTERP_INREG as VALU
1979 - aco/ir: rework IR to have one common valu instruction struct
1980 - aco/ra: set opsel_hi to zero when converting to VOP2
1981 - aco: validate VALU modifiers
1982 - aco/print_ir: simplify using VALU instruction
1983 - aco/optimizer: simplify using VALU instruction
1984 - aco: remove VOP[123C]P? structs
1985 - aco: add bitfield array helper classes
1986 - aco: use bitfield array helpers for valu modifiers
1987 - aco/assembler/gfx11: simplify 16bit VOP12C promotion to VOP3
1988 - aco/optimizer: don't reallocate instruction when converting to VOP3
1989 - aco: don't reallocate fma{mk,ak,_mix} instruction
1990 - aco: copy abs/neg with assignment
1991 - aco: use integer access for neg_lo/neg_hi
1992 - aco: use array indexing for opsel/opsel_lo/opsel_hi
1993 - aco: access neg/abs as int in usesModifiers
1994 - aco: use bitfield_array for temporary neg/abs/opsel
1996 - aco: remove duplicates from .clang-format
1998 - aco: don't check usesModifiers for pseudo instructions
1999 - aco: fix p_interp_gfx11 comment
2000 - aco: make .clang-format usable with tests
2001 - aco/ir: fix copy paste bug in convert_to_SDWA
2002 - aco/util: override default assignment operator for bitfield helpers
2003 - aco: clean up to_mad_mix
2004 - aco/ra: don't reallocate VOP3 instruction for non-vcc lane mask
2005 - aco/vn: hash opsel for VOP12C
2006 - aco/assembler: support VOP12C opsel
2007 - aco: validate VOP12C opsel
2008 - aco/to_hw_instr: use VOP1 opsel for v_mov_b16
2009 - aco/ra: prepare for VOP12C opsel
2010 - aco/optimizer: preserve opsel when fusing fma
2011 - aco: handle opsel in combine_comparison_ordering
2012 - aco: handle opsel in combine_ordering_test
2013 - aco: handle opsel in combine_constant_comparison_ordering
2014 - aco: update match_op3_for_vop3 for VOP12C opsel
2015 - aco: support v_cvt_f32_f16 with opsel in combine_mad_mix
2016 - aco: support neg(mul)/abs(mul) optimization in more cases
2017 - aco: return true in usesModifiers for VOP12C with opsel
2018 - aco: swap opsel when swapping VOP2/C operands
2019 - aco/ir: copy opsel when converting to DPP
2020 - aco: don't label mul with opsel as abs/neg
2021 - aco/gfx11: allow opsel for VOP12C
2022 - aco/optimizer: use opsel for VOP12C
2023 - aco: keep label_mul/usedef/minmax in apply_extract
2024 - aco/optimizer: remove to_SDWA
2025 - aco: add tests for fma with opsel
2026 - aco: add tests for dpp with opsel
2027 - aco: add tests for swap operand with opsel
2028 - aco: add tests for cmp ordering with opsel
2029 - aco: add test for min/max combining with opsel
2030 - aco/tests: run optimize.mad_mix.input_conv.modifiers on gfx11
2031 - aco: add tests for neg(mul) with opsel
2032 - aco/tests: add missing dependency on generated header
2951 - radv: Force ACO for BVH build shaders
2992 - aco: Remove is_gs_copy_shader
3417 - aco: implement nir_op_unpack_32_4x8
4321 - aco: implement nir_export_amd
4339 - nir,ac/llvm,aco: remove nir_export_primitive_amd
4340 - nir,ac/llvm,aco,radv,radeonsi: remove nir_export_vertex_amd
4341 - aco: remove early_rast wait insert
4349 - aco: only ls and ps use store output now
4350 - aco, radv: Add load_grid_size_from_user_sgpr to aco options.
4351 - aco, radv: Move is_trap_handler_shader to aco info.
4363 - aco: implement float16 nir_op_pack_(s|u)norm_2x16
4380 - ac,aco: move gfx10 ngg prim count zero workaround to nir
4381 - aco: fix nir_f2u64 translation
4392 - radv,aco: use ac_nir_lower_legacy_gs
4393 - aco: restore semantic_can_reorder for GS output stores
4396 - aco: add support for fp32 addition atomics
4399 - aco/tests: fix assembler.gfx11.vop12c_v128 with LLVM 15
4400 - aco/tests: update assembler tests for latest LLVM 16
4402 - aco: set has_color_exports with GPL
4403 - aco: end reduce tmp after control flow, when used within control flow
4404 - aco/tests: add setup_reduce_temp.divergent_if_phi
4405 - aco/spill: always end spill vgpr after control flow
4406 - aco: limit VALUPartialForwardingHazard search
4411 - aco: fix out-of-bounds access when moving s_mem(real)time across SMEM
4412 - aco: don't modify exec in p_interp_gfx11
4413 - aco: don't apply modifiers through DPP to unsupported instructions
4414 - aco: fix pathological case in LdsDirectVALUHazard
4415 - aco: always update orig_names in get_reg_phi()
4417 - aco/tests: add tests for v_fma_f32 with 2 fp16 literals
4418 - aco: make IDSet sparse
4423 - aco/gfx11: fix RT prolog scratch initialization
4424 - aco: set needs_flat_scr=true for RT
4431 - aco: don't optimize s_or_b64(v_cmp_u_f32(a, b), cmp(a, a))
4432 - aco: fix nir_var_shader_out barriers for task shaders
4435 - aco: remove SMEM_instruction::prevent_overflow
4437 - aco: don't move exec reads around exec writes
4438 - aco: don't move exec writes around exec writes
4919 - aco: remove unused aco_shader_info::vb_desc_usage_mask
5267 - aco/optimizer: Add missing v_lshlrev condition to can_apply_extract.
5268 - aco/optimizer: Optimize p_extract + v_mul_u32_u24 to v_mad_u32_u16.
5271 - radv, aco: Add uses_full_subgroups to compute shader info.
5272 - aco: Enable constant exec mask based optimization on compute shaders.
5274 - aco: Remove dynamic VS input loads.
5276 - radv, aco, ac: Implement pack_half_2x16_rtz_split.
5291 - ac: Port ACO's get_fetch_format to ac_get_safe_fetch_size.
5297 - aco: Get rid of redundant load_vmem_mubuf function.
5298 - aco: Don't set scalar offset on buffer load instructions when it's zero.
5299 - aco: Remove MTBUF zero operand.
5301 - aco/optimizer: Change v_cmp with subgroup invocation to constant.
5306 - aco: Generalize vs_inputs to args_pending_vmem.
5307 - aco, radv: Rename aco_*_key to aco_*_info.
5308 - aco, radv: Move PS epilog and VS prolog args to their info structs.
5309 - aco, radv: Don't use radv_shader_args in aco.
5310 - aco: Don't include headers from radv.
5312 - aco: Remove vtx_binding from MUBUF/MTBUF instructions.
5314 - aco: Implement load_typed_buffer_amd.
5318 - aco: Remove VS inputs from visit_load_input.
5319 - aco: Rename visit_load_input to visit_load_fs_input.
5322 - aco, radv: Remove VS IO information from ACO.
5323 - aco: Don't add soffset to swizzled MUBUF base.
5324 - aco: Use zero for MUBUF/MTBUF when soffset is undefined.
5325 - aco: Disable MUBUF/MTBUF offsets when they are zero.
5326 - aco: Always enable idxen for swizzled buffer access on GFX11.
5370 - aco: Consider p_cbranch_nz as divergent branch too.
5371 - aco: Don't remove exec writes that also write other registers.
5372 - aco: Simplify get_phi_operand using Operand::c32_or_c64.
5373 - aco: Don't verify branch exec read when eliminating exec writes.
5374 - aco: Pop branch operands when targets are same in SSA elimination.
5375 - aco: Call dominator_tree before lower_phis.
5376 - aco: Better phi lowering for merge block when else-side is const.
5388 - aco: Fix optimization of v_cmp with subgroup invocation.
5389 - aco: Don't use nir_selection_control in aco_ir.
5390 - aco: Only include nir.h in instruction selection.
5415 - ac, aco, radv: Clarify LDS size on GFX6, and NGG shaders.
5417 - aco: Remove setup_*_variables and add setup_lds_size instead.
5418 - aco, radv: Remove "key" from aco_compiler_options.
5419 - aco, radv: Remove redundant enable_mrt_output_nan_fixup from PS epilog info.
5421 - aco: Disallow constant propagation on SOPP and fixed operands.