Lines Matching full:aco

81 - ACO Unimplemented intrinsic instr
82 - RADV/ACO: assert on per-sample interpolation
116 - ACO ERROR: Temporary never defined or are defined after use
136 - aco: Radeonsi unable to use rusticl
864 - aco/ra: fix kill flags after renaming fixed Operands
865 - aco/ra: assert that the register file is empty after register allocation completed
866 - aco/lower_phis: simplify check for uniform predecessors
867 - aco: introduce aco_opcode::p_boolean_phi
868 - aco/vn: copy-propagate trivial phis
869 - aco/lower_phis: generalize init_state() so that it works with any scalar phis
870 - aco/lower_phis: implement SGPR phi lowering
871 - aco: use SGPR phi lowering for uniform phis in divergent merge blocks
872 - aco: use SGPR phi lowering for all loop header phis
873 - aco: use SGPR phi lowering for all scalar phis
874 - aco/optimizer: remove p_linear_phi handling from optimizer
881 - aco/ra: fix handling of killed operands in compact_relocate_vars()
882 - aco/ra: Fix array access when finding register for subdword variables
883 - aco/ra: refactor get_reg_simple() with increased stride.
884 - aco/ra: move can_write_m0() check into get_reg_specified()
885 - aco/ra: re-use registers from killed operands
886 - aco/ra: change heuristic to first fit
887 - aco/ra: use round robin register allocation
888 - aco/assembler: fix MTBUF opcode encoding on GFX11
889 - aco/assembler: slightly refactor MTBUF assembly for more readability
890 - aco/assembler: fix GFX67 MTBUF opcode encoding
891 - aco/scheduler: remove unused register_demand parameter
892 - aco: move live var information into struct Program
893 - aco/reindex_ssa: replace live_var parameter with boolean
894 - aco: make aco::monotonic_buffer_resource declaration visible for aco::IDSet
895 - aco: use aco::monotonic_allocator for IDSet
904 - aco/scheduler: fix register_demand validation debug code
905 - aco/spill: Unconditionally add 2 SGPRs to live-in demand
906 - aco: calculate register demand per instruction as maximum necessary to execute the instruction
907 - aco: track and use the live-in register demand per basic block
908 - aco: remove get_demand_before()
909 - aco/live_var_analysis: slightly refactor handling of additional register demand for Operand copies
910 - aco/live_var_analysis: ignore dead phis
911 - aco/spill: don't remove spilled phis
912 - aco/ra: use live_in_demand in should_compact_linear_vgprs()
913 - aco: add RegisterDemand member to Instruction
914 - aco/util: skip empty blocks in IDSet::insert(IDSet)
915 - aco/live_var_analysis: refactor using ctx struct
916 - aco/live_var_analysis: ignore phi definition and operand demand at predecessors
917 - aco/live_var_analysis: inline block->register_demand updates
918 - aco/live_var_analysis: remove unused includes
919 - aco/live_var_analysis: use separate allocator for temporary live sets
920 - aco/ra: remove special-casing of p_logical_end
924 - aco: compute live-in variables in addition to live-out variables
925 - aco/ra: use live-in variables directly rather than computing them
926 - aco/spill: use live-in variables directly rather than computing them
927 - aco/cssa: use live-in variables instead of live-out variables
928 - aco/validate: use live-in variables for RA validation
929 - aco/print_ir: print live-in instead of live-out variables
930 - aco: remove live-out variables from IR
931 - aco/spill: Don't add phi definitions to live-in variables
1371 - radv/ci: drop duplicate navi21-aco flakes line
1372 - radv/ci: drop duplicate navi31-aco flakes line
1990 - aco/tests: Insert p_logical_start/end in reduce_temp tests
1991 - aco/spill: Insert p_start_linear_vgpr right after p_logical_end
1995 - aco/spill: Don't spill phis with all-undef operands
1996 - aco: Limit rt stages to 128 vgprs
2013 - aco/tests: don't use undef for descriptors
2014 - aco/tests/post_ra: fix various validation errors
2015 - aco/lower_to_hw: fix v_cvt_pk_u16_u32 instruction format
2016 - aco/lower_to_hw: fix 16bit p_insert on gfx8
2017 - aco/tests: validate before and after post-ra tests
2019 - aco/lower_to_hw: don't use regClass to identify subdword reductions
2020 - aco: add a subdword lowering pass
2021 - aco: add tests for lower_subdword
2022 - aco/ra: remove gfx6/7 subdword paths
2023 - aco/lower_to_hw: remove gfx6/7 subdword paths
2029 - aco/gfx11+: use v_cvt_pk_u8_f32 for 8bit constant copies
2030 - aco/gfx10: use v_add_u16 with literal for constant copies
2031 - aco/tests: simplify small constant copy test
2032 - aco/gfx11+: optimize v_fma_mix throughput
2034 - aco/gfx11: use v_swap_b16
2035 - aco/optimizer: remove ineffective vcc opt
2036 - aco/optimizer: remove ineffective undef opt
2037 - aco: remove perfwarn
2038 - aco: don't pass program to emit_bpermute
2039 - aco/lower_to_hw: add copy_constant_sgpr
2040 - aco: small constant copy optimizations
2041 - aco/lower_to_hw: use copy_constant_sgpr for masks
2042 - aco/lower_to_hw: optimize split 64bit constant copies
2043 - aco/optimizer: use p_create_vector to create mask when a copy can't be used
2046 - aco: optimize branching sequence with p_create_vector exec producer
2051 - aco: rework how affinities for acc operands are determined
2052 - aco: add affinities for possible sopk optimizations
2053 - aco/gfx11+: fix inline constants for v_pk_fmac_f16
2054 - aco: move literal unswizzle opt to RA
2055 - aco/ra: use a switch to check vop2acc instruction support
2056 - aco: move s_add_u32 -> s_addk_i32 optimization fully to ra
2058 - aco: add more anonymous namespaces
2059 - aco: make local functions static in files without anonymous namespace
2062 - aco: implement ford, funord, fneo, fequ, fltu, fgeu
2068 - nir/opt_algebraic: add various unordered/ordered patterns from aco
2069 - aco: remove ordered/unordered optimizations
2070 - aco/ir: remove unused vopc helpers
2072 - aco/ra: fix affinity for s_addk
2073 - aco: fix s_delay_alu with salu and trans dependency
2074 - aco,nir: add dpp16_shift_amd intrinsic
2078 - aco/gfx12: use trans s_delay_alu for pseudo scalar
2079 - aco/gfx12: don't allow vgpr operands for pseudo scalar
2080 - aco/gfx11.5: select s_cvt_[ui]32_f32
2081 - aco/gfx11.5: select s_(ceil|floor|trunc|rndne)
2082 - aco: add aco_opcode::p_s_cvt_f16_f32_rtne
2083 - aco/gfx11.5: select SALU float conversions
2084 - aco/gfx11.5: fix s_fmac acc to definition
2085 - aco/gfx11.5: select SOP2 float instructions
2086 - aco/gfx11.5: select SOPC float instructions
2087 - aco/gfx11.5: select SALU fsat
2088 - aco/gfx11.5: select SALU fsign
2089 - aco/gfx11.5+: allow sgpr dst for trans ops and use pseudo scalar ops on gfx12
2090 - aco/gfx11.5: select SALU fneg/fabs
2091 - aco/gfx11.5: select SALU fquantize2f16
2092 - aco: micro optimize VALU fquantize2f16
2093 - aco: handle clustered uniform reductions correctly
2095 - aco: remove optimize_cmp_subgroup_invocation
2097 - aco/optimizer: update temp_rc when converting to uniform bool alu
2098 - aco/gfx11+: don't use VOP3 v_swap_b16
2100 - aco/gfx10+: set lateKill for sgprs used by wave64 VALU writing a mask
2915 - aco: print s_delay_alu INSTSKIP>3 correctly
3154 - radeonsi: serialize shader disassembly string to fix asm dumps for ACO
3173 - radeonsi: call nir_lower_int64 later to fix ACO failure with Tomb Raider
3232 - radeonsi: use shader_info::use_aco_amd to determine whether to use ACO
3630 - radeonsi,aco: Run ac_nir_lower_global_access pass
3840 - aco/tests: add tests for hidden breaks/continues
3841 - aco/tests: add tests for divergent merge phi with undef
3843 - aco/stats: fix s_waitcnt parsing
3844 - aco/stats: don't use VS counter pre-GFX10
3845 - aco/waitcnt: fix DS/VMEM ordered writes when mixed
3846 - aco: make wait_imm indexable
3847 - aco/waitcnt: add target_info
3848 - aco/waitcnt: refactor for indexable wait_imm
3849 - aco/stats: refactor for indexable wait_imm
3850 - aco: add wait_imm::unpack and wait_imm::max
3852 - aco: form hard clauses in VS prologs
3853 - aco: copy VS prolog constants after loads
3854 - aco: support VS prologs with unaligned access
3855 - aco/util: improve small_vec assertion
3857 - aco: don't count certain pseudo towards VMEM_STORE_CLAUSE_MAX_GRAB_DIST
3858 - aco/tests: support GFX12
3859 - aco: add SFPU/ValuPseudoScalarTrans instr class
3860 - aco: add GFX11.5+ opcodes
3861 - aco: support GFX12 in assembler
3862 - aco/tests: add GFX12 assembler tests
3863 - aco: don't change prefetch mode on GFX11.5+
3864 - aco/gfx12: disable s_cmpk optimization
3865 - aco: add GFX12 wait counters
3866 - aco/waitcnt: support GFX12 in waitcnt pass
3867 - aco/stats: support GFX12 in collect_preasm_stats()
3868 - aco: update VS prolog waitcnt for GFX12
3869 - aco/lower_phis: create loop header phis for non-boolean loop exit phis
3870 - aco: create lcssa phis for continue_or_break loops when necessary
3871 - aco: use scalar phi lowering for lcssa workaround
3872 - aco: remove nir_to_aco
3873 - aco/lower_phis: don't create boolean loop header phis in some situations
3875 - aco: support GFX12 in insert_NOPs
3876 - aco/gfx12: implement subgroup shader clock
3877 - aco/gfx12: implement workgroup barrier
3878 - aco/gfx12: sign-extend s_getpc_b64
3879 - aco/gfx12: don't create v_fmac_legacy_f32
3880 - aco/gfx12: use ttmp9/ttmp7 for workgroup id
3882 - aco/gfx12: remove MIMG vector affinity
3883 - aco/gfx12: decrease max_nsa_vgprs for VSAMPLE
3884 - aco/gfx12: disallow SCC and most constants for BUF SOFFSET
3885 - aco: fix fddx/y with uniform inf/nan input
3888 - aco/gfx12: implement load_subgroup_id
3890 - aco/gfx12: fix s_wait_event immediate
3891 - aco: don't combine vgpr into writelane src0
3892 - aco: implement nir_atomic_op_ordered_add_gfx12_amd
3893 - aco: implement nir_intrinsic_nop_amd and nir_intrinsic_sleep_amd
3897 - aco: remove support for sub-dword push constants
3898 - aco/gfx6: set glc for buffer_store_byte/short
3899 - aco: inline store_vmem_mubuf/emit_single_mubuf_store
3900 - aco: use ac_hw_cache_flags
3901 - aco: use GFX12 scope/temporal-hint
3903 - aco: use ac_get_hw_cache_flags()
3904 - aco: remove some missing label resets
3907 - aco: insert s_nop before discard early exit sendmsg(dealloc_vgpr)
3910 - aco: remove push constants
3911 - aco/insert_exec_mask: ensure top mask is not a temporary at loop exits
3913 - aco: use 1.5x vgprs for gfx1151 and gfx12
3914 - aco: skip continue_or_break LCSSA phis when not needed
3915 - aco: use s_pack_ll_b32_b16 for pack_32_2x16_split
3916 - aco: combine extracts into s_pack_ll_b32_b16
3917 - aco: use s_pack_*_b32_b16 more in p_insert/p_extract lowering
3918 - aco: turn split(vec()) into p_parallelcopy instead of p_create_vector
3919 - aco: add missing isConstant()/isTemp() checks
3920 - aco: fix follow_operand with combined label_extract and label_split
3921 - aco: use alignment information in visit_load_constant()
3922 - aco: fix wmma raw hazard
3923 - aco: replace constant v_bfrev_b32 with v_mov_b32 to create vopd
3924 - aco/gfx11: don't use v_bfrev_b32 with wave64
3931 - aco/gfx11.5: workaround export priority issue
3932 - aco: fix validation of v_s_ opcodes
4135 - aco: add support for remapping color attachments
4208 - aco: use new common helpers for building buffer descriptors
4260 - aco: adjust loading local invocation ID for GS on GFX12
4651 - aco: Add missing nir_builder include.