Lines Matching full:aco

62 - radv/aco: xfb bug
64 - Occasional flicker corruption in Rage 2, e.g. after loading, with ACO on RX 5700 XT
69 - radv,aco: Regression with global atomics with negative offsets
611 - aco: Fix an MSVC warning
833 - aco: emit nir_intrinsic_discard() as p_discard_if()
834 - aco: remove block_kind_discard
835 - aco: make Preserve_WQM independent from block_kind_uses_discard_if
836 - aco: merge block_kind_uses_[demote|discard_if]
837 - aco: optimize discard_if when WQM is not needed afterwards
842 - aco/insert_exec_mask: stay in WQM while helper lanes are still needed
843 - aco: don't propagate WQM for p_as_uniform
844 - aco: don't emit WQM for bool_to_scalar_condition
845 - aco/insert_exec_mask: remove Preserve_WQM flag
846 - aco/insert_exec_mask: remove some unnecessary WQM loop handling code
847 - aco/insert_exec_mask: remove ever_again_needs and Exact_Branch
848 - aco/insert_exec_mask: refactor and simplify get_block_needs()
849 - aco/insert_exec_mask: refactor and remove some unnecessary WQM handling code
850 - aco: relax condition to remove branches in case of few instructions
851 - aco/ra: don't immediately assign a register for p_branch
855 - aco/ra: count constant moves in get_reg_create_vector()
856 - aco/ra: special-case get_reg_for_create_vector_copy()
857 - aco/ra: refactor find_vars() to return a vector
858 - aco/ra: refactor collect_vars() to return a sorted vector
860 - aco/optimizer: fix call to can_use_opsel() in apply_insert()
861 - aco: remove 'high' parameter from can_use_opsel()
862 - aco: use branch definition as scratch register for SSA lowering
863 - aco/ra: fix stride check on subdword parallelcopies for create_vector
864 - aco/optimizer: check recursively if we can eliminate s_and exec
865 - aco/ra: only use VCC if program->needs_vcc == true
866 - aco/ra: create VCC-affinities during RA
867 - aco/ra: omit VCC affinity on VOPC_SDWA for GFX9+
868 - aco: make program->needs_vcc independent of VCC hints
869 - aco: remove occurrences of VCC hint
870 - aco: remove register hints entirely
871 - aco/ra: fix live-range splits of phi definitions
1314 - aco: do not use designated initializers
1378 - radv, aco: Add u_foreach_bit to .clang-format.
1387 - aco: Remove 0 data components from image stores.
1390 - aco: Implement 64bit uadd_sat.
1391 - aco: Implement scalar iadd_sat.
1393 - radv, aco: Packed iadd_sat/uadd_sat.
3032 - aco/tests: add a bunch more building helpers
3033 - aco/tests: implement sub-dword program inputs
3034 - aco: don't combine fneg/fabs of different bit-size
3035 - aco: don't apply omod/clamp of different bit-size
3036 - aco: don't combine add/mul of different bit-size
3037 - aco: fix neg(mul)/abs(mul) optimization with different bit-size
3038 - aco: add test for optimizations with casts
3039 - aco: don't encode src2 for v_writelane_b32_e64
3046 - aco: remove vcc hint from branch definitions
3047 - aco/ra: add get_reg_phi() helper
3048 - aco/ra: fix register allocation of branch definitions
3049 - aco: add validate_instr_defs()
3050 - aco: fix branch definition validation
3051 - aco/tests: add test for branch definition RA
3052 - aco: rework removal of jumps over branches
3053 - aco/insert_exec_mask: fix top-level to-exact with non-global exact mask
3054 - aco/insert_exec_mask: use get_exec_op
3055 - aco/insert_exec_mask: optimize top-level transition to exact before demote
3056 - aco: split and recombine unaligned sgpr inputs
3057 - radv,aco,ac/llvm: fix indirect dispatches on the compute queue on GFX7-10
3058 - aco: fix fp16 opcode definitions
3059 - aco: improve support for v_fma_mix
3060 - aco: refactor selection of mad/fma
3061 - aco: use v_fma_mix to combine mul/add/fma input conversions
3062 - aco: combine add/mul as v_fma_mix into fma
3063 - aco: apply clamp to v_fma_mix
3064 - aco: use v_fma_mix to combine mul/add/fma output conversions
3065 - aco/tests: add v_fma_mix tests
3067 - aco: implement load_{scalar,vector}_arg_amd and load_smem_amd
3075 - radv,aco: lower vulkan_resource_index in NIR
3076 - radv,aco: lower buffer descriptor loads in NIR
3077 - radv,aco: lower texture descriptor loads in NIR
3078 - radv,aco: lower image descriptor loads in NIR
3079 - aco: fix RA validation of 16-bit fma_mix operands
3080 - aco: don't use v_mad_mix on GFX9 if 16-bit denormals must be preserved
3083 - radv,aco: implement 64-bit inline push constants
3086 - aco: use vcc for 64-bit vgpr addition
3087 - aco: use saddr for global access with sgpr address
3088 - aco: don't expand smem/mubuf global loads
3091 - aco: implement _amd global access intrinsics
3092 - aco: increase global_load_params.max_const_offset_plus_one
3094 - aco: remove old global access intrinsics
3098 - aco: fix signedness of DS_instruction::offset0/1
3099 - aco: handle read2st64/write2st64 in optimizer
3100 - aco: implement load_shared2_amd/store_shared2_amd
3105 - aco/ra: fix vgpr_limit
3174 - aco: implement nir_intrinsic_load_vrs_rates_amd
3190 - radv,aco,llvm: lower adjusting vertex alpha in NIR
3199 - radv,aco: do not lower nir_op_pack_{unorm,snorm}_2x16
3201 - aco: implement nir_op_pack_{uint,sint}_2x16
3206 - radv,aco,llvm: lower post shuffle vertex in NIR
3207 - aco: always emit v_cvt_pkrtz_f16_f32 for nir_op_pack_half_2x16_split
3229 - radv,aco: lower color exports in NIR
3289 - aco: fix load_barycentric_at_{sample,offset} on GFX6-7
3387 - aco: Allow 1-byte loads and stores with load/store_buffer_amd
3388 - aco: Fix workgroup_id.y and .z for NV_mesh_shader.
3389 - aco: Fix multiview view index for mesh shaders.
3398 - aco: Add storage class for Task Shader payload.
3399 - aco: Support task_payload with barriers, refactor allowed storage class.
3400 - aco: Support memory modes properly with load/store_buffer_amd.
3406 - aco: Remove superfluous code for mesh shader workgroup ID.
3415 - aco: Fix VOP2 instruction format in visit_tex.