Lines Matching full:aco
48 - New compiler backend "ACO" for RADV (RADV_PERFTEST=aco)
49 - VK_EXT_shader_demote_to_helper_invocation on RADV/ACO.
58 - radv/aco Jedi Fallen Order hair rendering buggy
74 - Rename ACO README to README.md
484 - amd: Build aco only if radv is enabled
782 - aco: Initial commit of independent AMD compiler
783 - radv/aco: Setup alternate path in RADV to support the experimental
784 ACO compiler
786 - radv/aco: enable VK_EXT_shader_demote_to_helper_invocation
788 - aco: only emit waitcnt on loop continues if we there was some load or
793 - radv/aco: Don't lower subtractions
794 - aco: call nir_opt_algebraic_late() exhaustively
796 - aco: re-use existing phi instruction when lowering boolean phis
797 - aco: don't reorder instructions in order to lower boolean phis
798 - aco: don't combine minmax3 if there is a neg or abs modifier in
800 - aco: ensure that uniform booleans are computed in WQM if their uses
802 - aco: refactor value numbering
803 - aco: restrict scheduling depending on max_waves
804 - aco: only skip RAR dependencies if the variable is killed somewhere
805 - aco: add can_reorder flags to load_ubo and load_constant
806 - aco: don't schedule instructions through depending VMEM instructions
807 - aco: Lower to CSSA
808 - aco: improve live variable analysis
809 - aco: remove potential critical edge on loops.
810 - aco: fix live-range splits of phis
811 - aco: fix transitive affinities of spilled variables
812 - aco: don't insert the exec mask into set of live-out variables when
814 - aco: consider loop_exit blocks like merge blocks, even if they have
816 - aco: don't add interferences between spilled phi operands
817 - aco: simplify calculation of target register pressure when spilling
818 - aco: ensure that spilled VGPR reloads are done after p_logical_start
819 - aco: omit linear VGPRs as spill variables
820 - aco: always set scratch_offset in startpgm
821 - aco: implement VGPR spilling
822 - docs/relnotes/new_features.txt: Add note about ACO
823 - aco: fix immediate offset for spills if scratch is used
824 - aco: only use single-dword loads/stores for spilling
825 - aco: fix accidental reordering of instructions when scheduling
826 - aco: workaround Tonga/Iceland hardware bug
827 - aco: fix invalid access on Pseudo_instructions
828 - aco: preserve kill flag on moved operands during RA
829 - aco: don't split live-ranges of linear VGPRs
830 - aco: fix a couple of value numbering issues
2539 - android: aco: fix undefined template 'std::__1::array' build errors
2541 - android: aco: add support for libmesa_aco
2543 - android: aco: fix Lower to CSSA
2554 - aco: Cleanup insert_before_logical_end
2793 - aco: run nir_lower_int64() before nir_lower_idiv()
2794 - aco: implement 64-bit ineg
2795 - aco: fix GFX9 opcode for v_xad_u32
2796 - aco: fix v_subrev_co_u32_e64 opcode
2797 - aco: fix opcode for s_mul_hi_i32
2798 - aco: check for duplicate opcode numbers
2799 - radv/aco: actually disable ACO when unsupported
2800 - aco,radv/aco: get disassembly for release builds if requested
2801 - aco: store printed backend IR in binary
2802 - radv/aco: return a correct name and description for the backend IR
2803 - aco,radv: rename record_llvm_ir/llvm_ir_string to record_ir/ir_string
2804 - aco: don't CSE v_readlane_b32/v_readfirstlane_b32
2805 - aco: CSE readlane/readfirstlane/permute/reduce with the same exec
2807 - aco: set loop_info::has_discard for demotes
2808 - aco: don't remove the loop exec mask in transition_to_Exact()
2809 - radv/aco,aco: set lower_fmod
2811 - aco: fix load_constant with multiple arrays
2814 - aco: move s_andn2_b64 instructions out of the p_discard_if
2815 - aco: enable nir_opt_sink
2816 - aco: Allow literals on VOP3 instructions.
2817 - aco: Assemble opsel in VOP3 instructions.
2818 - aco: workaround GFX10 0x3f branch bug
2819 - aco: pad code with s_code_end on GFX10
2820 - aco: Initial work to avoid GFX10 hazards.
2821 - aco: Use the VOP3-only add/sub GFX10 instructions if needed.
2822 - aco: Have s_waitcnt_vscnt write to NULL.
2823 - radv/aco: disable NGG when ACO is used
2824 - aco/gfx10: fix inline uniform blocks
2825 - aco/gfx10: disable GFX9 1D texture workarounds
2826 - aco: rework scratch resource code
2827 - aco: update print_ir
2830 - aco: don't apply sgprs/constants to read/write lane instructions
2831 - aco: use can_accept_constant in valu_can_accept_literal
2832 - aco: readfirstlane vgpr pointers in convert_pointer_to_64_bit()
2833 - aco: implement divergent vulkan_resource_index
2834 - aco: don't use p_as_uniform for vgpr sampler/image indices
2835 - aco: fix scheduling with s_memtime/s_memrealtime
2836 - aco: don't CSE s_memtime
2837 - aco: emit_split_vector() s_memtime results
2839 - aco: use nir_lower_idiv_precise
2840 - aco: run opt_algebraic in a loop
2841 - aco: small stage corrections
2842 - aco: fix 64-bit p_extract_vector on 32-bit p_create_vector
2843 - aco: create load_lds/store_lds helpers
2844 - aco: fix sparse store_lds()
2845 - aco: properly combine additions into ds_write2_b64/ds_read2_b64
2846 - aco: use ds_read2_b64/ds_write2_b64
2847 - aco: add a few missing checks in value numbering
2848 - aco: keep can_reorder/barrier when combining addition into SMEM
2849 - aco: add missing bld.scc()
2850 - Revert "aco: only emit waitcnt on loop continues if we there was some
2853 - aco: increase accuracy of SGPR limits
2854 - aco: take LDS into account when calculating num_waves
2855 - aco: Fix reductions on GFX10.
2856 - aco: Remove dead code in reduction lowering.
2857 - aco: try to group together VMEM loads of the same resource
2858 - aco: a couple loop handling fixes for GFX10 hazard pass
2859 - aco: rename README to README.md
2860 - aco: fix new_demand calculation for first instructions
2861 - aco: fix shuffle with uniform operands
2862 - aco: fix read_invocation with VGPR lane index
2863 - aco: don't propagate vgprs into v_readlane/v_writelane
2864 - aco: don't combine literals into v_cndmask_b32/v_subb/v_addc
2865 - aco: fix 64-bit fsign with 0
2866 - aco: propagate p_wqm on an image_sample's coordinate p_create_vector
2867 - aco: fix i2i64
2868 - aco: add v_nop in between exec write and VMEM/DS/FLAT
3275 - aco: Set +wavefrontsize64 for LLVM disassembler in GFX10 wave64 mode.
3276 - aco: Add missing GFX10 specific fields and some README notes.
3277 - aco: Support GFX10 SMEM in aco_assembler.
3278 - aco: Support GFX10 VINTRP in aco_assembler.
3279 - aco: Support GFX10 DS in aco_assembler.
3280 - aco: Support GFX10 MUBUF in aco_assembler.
3282 - aco: Link ACO with amd/common.
3283 - aco: Support GFX10 MTBUF in aco_assembler.
3284 - aco: Support GFX10 MIMG and GFX9 D16 in aco_assembler.
3285 - aco: Fix GFX9 FLAT, SCRATCH, GLOBAL instructions, add GFX10 support.
3286 - aco: Support GFX10 EXP in aco_assembler.
3287 - aco: Support GFX10 VOP3 and VOP1 as VOP3 in aco_assembler.
3288 - aco: Set GFX10 DLC bit properly.
3289 - aco: Use ac_get_sampler_dim, delete duplicate code.
3290 - aco: Set GFX10 dimensionality on the instructions that need it.
3291 - aco: Support subvector loops in aco_assembler.
3292 - aco: Fix VS input VGPRs on GFX10.
3293 - aco: Fix s_dcache_wb on GFX10.
3294 - aco: Add extra assertion for number of FS input VGPRs.
3295 - aco: Clean up usages of PhysReg::reg from aco_assembler.
3296 - aco/gfx10: Wait for pending SMEM stores before loads
3297 - aco/gfx10: Fix PS exports for SPI_SHADER_32_AR.
3298 - aco/gfx10: Update constant addresses in fix_branches_gfx10.
3299 - aco/gfx10: Add notes about some GFX10 hazards.
3300 - aco/gfx10: Mitigate VcmpxPermlaneHazard.
3301 - aco/gfx10: Mitigate VcmpxExecWARHazard.
3302 - aco/gfx10: Mitigate SMEMtoVectorWriteHazard.
3303 - aco/gfx10: Mitigate LdsBranchVmemWARHazard.
3304 - aco/gfx10: Fix mitigation of VMEMtoScalarWriteHazard.
3305 - aco: Refactor hazard mitigations, separate pass for GFX10.
3308 - aco: Implement subgroup shuffle in GFX10 wave64 mode.
3309 - aco: Introduce vgpr_limit to keep track of available VGPRs.
3310 - radv: Enable ACO on Navi.