Lines Matching full:aco

87 - ACO doesn't hide lds_param_load latencies
88 - ACO doesn't form a VMEM clause for image stores in one case on GFX11
95 - ACO tests SIGSEGV in debian-vulkan job with LTO enabled
1213 - aco: fix nir_op_pack_32_4x8 handling
1467 - aco/insert_exec_mask: unify exec restore code after divergent control flow
1468 - aco/insert_exec_mask: replace phi for loop restore mask with explicit copies
1469 - aco/insert_exec_mask: only create loop phis for exec mask if necessary
1470 - aco: give spiller more room to assign spilled SGPRs to VGPRs
1474 - aco/insert_exec_mask: Fix unconditional demote at top-level control flow.
1475 - aco/insert_exec_mask: tiny refactor
1476 - aco: always terminate quads if they have been demoted entirely
1477 - aco/insert_exec_mask: Reduce latency when switching to WQM.
1479 - aco: enable WQM if demote is used with maximal reconvergence
1484 - aco: rematerialize constants in every basic block during optimizer
1485 - aco: reorder code and use namespaces in aco_interface.cpp
1486 - aco/util: small_vec few additions
1487 - aco: use small_vec as Block::edge_vec for predecessors and successors
1488 - aco/spill: refactor SSA repairing
1489 - aco/spill: don't allocate extra spill_id for phi operands in add_coupling_code()
1490 - aco/spill: add spills_entry interferences only when necessary
1491 - aco/spill: refactor adding spilled vars into separate function add_to_spills()
1492 - aco/spill: keep live-out variables spilled at branch blocks
1493 - aco/spill: don't prefer to spill phis at merge blocks
1494 - aco/spill: add interferences with variables spilled at loop headers
1495 - aco/spill: avoid re-spilling loop-carried variables in process_block()
1496 - aco/spill: avoid re-spilling loop-carried variables in add_coupling_code()
1497 - aco/spill: keep loop-carried variables spilled at loop headers
1498 - aco/spill: keep loop-carried variables spilled at merge blocks
1499 - aco/spill: select more loop-carried variables to be spilled
1500 - aco/spill: keep loop variables spilled during nested loops
1501 - aco: use instr_class::branch to identify SOPP branches
1502 - aco: remove SOPP_instruction::block member
1503 - aco: unify different SALU types into single struct SALU_instruction
1504 - aco/builder: use accessor functions instead of casting to subtypes
1505 - aco: change return type of create_instruction() to Instruction*
1506 - aco: defer instruction size from aco::Format in create_instruction()
1507 - aco: remove create_instruction() template parameter
1508 - aco: move create_instruction() to aco_ir.cpp
1509 - aco/spill: Fix assertion for nested loops
1510 - aco/spill: pass live_vars to spill_ctx
1511 - aco/spill: compute live-in variables from live-out
1512 - aco/spill: maintain valid live vars at any point
1513 - aco/spill: use live variables instead of next_use_distances in add_coupling_code()
1514 - aco/spill: gather information about average use distances
1515 - aco/spill: use average use distances in process_block()
1516 - aco/spill: use average use distances in init_live_in_vars() for merge blocks
1517 - aco/spill: use average use distances to spill loop variables
1518 - aco/ra: fix kill flags after renaming fixed Operands
2550 - aco/tests: Insert p_logical_start/end in reduce_temp tests
2551 - aco/spill: Insert p_start_linear_vgpr right after p_logical_end
2559 - aco: reassign split vector to SOPC
2560 - aco: stop scheduling at p_logical_end
2562 - aco: implement as_uniform and ballot_relaxed
2567 - aco: remove boolean shuffle isel
2568 - aco: fix printing dpp8
2569 - aco: validate v_permlane opsel correctly
2570 - aco: support v_permlane64_b32
2571 - aco/gfx11: use v_nop to resolve VcmpxPermlaneHazard
2572 - aco/gfx11: resolve VcmpxPermlaneHazard for v_permlane64
2573 - aco: implement rotate
2577 - aco/gfx11+: disable v_pk_fmac_f16_dpp
2578 - aco: add packed fma dpp note to README-ISA
2579 - aco: don't remove branches that skip v_writelane_b32
2580 - aco/print_ir: don't use alloca for input modifiers
2581 - aco: print neg prettier for packed math
2582 - aco: don't print hi() for permlane opsel
2583 - aco: print permlane16 bc/fi
2584 - aco: print exec/vcc_lo/hi for single dword access
2585 - aco/gfx11+: limit hard clauses to 32 instructions
2587 - aco: use fmamk/ak instead of fma with inline constant for more VOPD
2590 - aco: create pseudo instructions with correct struct
2591 - aco/post-ra: rename overwritten_subdword to allow additional uses
2592 - aco/post-ra: assume scc is going to be overwritten by phis at end of blocks
2593 - aco: store if pseudo instr needs scratch reg
2594 - aco/post-ra: track pseudo scratch sgpr/scc clobber
2595 - aco/ssa_elimination: check if pseudo scratch reg overwrites regs used for v_cmpx opt
2596 - aco/builder: improve v_mul_imm for negative imm
2597 - aco/builder: use 24bit mul if low bits of imm are zero
2598 - aco/optimizer: combine v_mul_i32_i24 and add to mad
2599 - aco: avoid full 32bit imul for uniform reduce/scan
2600 - aco: don't combine mul+add_clamp to mad_clamp
2601 - aco/ra: use SDWA for 16bit instructions when the second byte is blocked
2602 - aco/vn: remove instruction hash templates
2603 - aco: use v1 definition for v_interp_p1lv_f16
2604 - aco/assembler: add vintrp high_16bit support
2605 - aco: swap opsel and wait_exp for vinterp
2606 - aco: support high_16bits FS IO
2607 - aco/tests: add assembler tests for interp high_16bits
2608 - aco/gfx9: all non legacy opsel instructions only write 16bits
2609 - aco: use v_interp_p2_f16 opsel
2610 - aco: add ra test for hi v_interp_p2_f16
3627 - aco: Only fix used variables to registers
4041 - radeonsi,aco: remove the VS prolog
4156 - aco: implement aco_is_gpu_supported using switch statement
4157 - aco: add a helper printing shader asm by disassembling via LLVM
4895 - aco: don't use python 3.7+ feature in aco_opcodes.py
4912 - aco: fix labelling of s_not with constant
4913 - aco: add VOPD format
4914 - aco: add VOPD statistic
4915 - aco: refactor schedule_ilp main loop
4916 - aco: implement VOPD scheduler
4917 - aco: enable VOPD scheduler
4918 - aco: fix >8 byte linear vgpr copies
4919 - aco/tests: fix to_hw_instr.swap_linear_vgpr
4920 - aco: refactor create_vopd_instruction
4921 - aco: swap operands to create VOPD instructions
4922 - aco: turn v_mov_b32 into addition to create VOPD instructions
4923 - aco: improve printing of VOPD instructions
4924 - aco/tests: add tests for VOPD operand swapping
4925 - aco/tests: use raw strings in form_hard_clauses.nsa
4927 - aco/ra: don't initialize assigned in initializer list
4928 - aco/ra: fix GFX9- writelane
4929 - aco: don't combine linear and normal VGPR copies
4930 - aco/ra: disable p_start_linear_vgpr allocation hint
4931 - aco: allow p_start_linear_vgpr to use multiple operands
4932 - aco: require linear vgpr uses to be late kill
4933 - aco: only allow linear vgpr kills in top-level blocks
4934 - aco/ra: constify various RegisterFile
4935 - aco/ra: move parallelcopy creation into helper
4936 - aco/ra: change get_reg_bounds() helper
4937 - aco/ra: rework linear VGPR allocation
4938 - aco/ra: disable live range splitting of linear vgprs
4939 - aco/ra: emit linear VGPR parallel copy separately
4940 - aco/tests: add tests for linear VGPR register allocation
4941 - aco: optimize for purely linear VGPR copies
4947 - aco: don't pass constant to is_overwritten_since()
4950 - radv,aco: allow VS prologs to increase VGPR usage
4951 - aco: don't reuse misaligned attribute destination VGPRs in VS prologs
4952 - aco/util: add small_vec
4954 - aco/cssa: reset equal_anc_out if merging fails
4955 - aco/cssa: update comments
4956 - aco: fix GFX6 buffer_load_dwordx4 opcode number
4957 - aco: rename opcode->instruction
4958 - aco: refactor VOPC opcode list
4959 - aco: use single tuple for all opcode numbers
4960 - aco: use op()
4961 - aco: move dot/wmma instructions into VOP3P list
4962 - aco: unify MIMG opcode lists
4963 - aco/gfx11: fix scratch ST mode assembly
4964 - aco: split instruction assembly into functions
4965 - aco: always emit float mode for merged shaders compiled separately
4966 - aco: avoid breaking clauses with waitcnt
4968 - aco: implement mqsad_4x8 and shfr
4974 - aco: remove unreachable merge blocks
4975 - aco: ensure loop exits exist in NIR
4976 - aco: save/reset/combine has_divergent_continue in uniform branches
4977 - nir,aco: add test intrinsics
4978 - aco/tests: add isel test helpers
4979 - aco/tests: add control flow tests
4980 - aco: assume no unreachable blocks
4981 - aco: don't include the clause in VMEM_CLAUSE_MAX_GRAB_DIST
4982 - aco: remove occupancy check in dealloc_vgprs()
4983 - aco/tests: don't assume constructor order
4984 - aco/tests: remove LLVM 11 code
4986 - aco: include LDSDIR in latency/etc stats
4987 - aco: make store clauses more aggressively
4988 - aco: schedule LDSDIR instructions
4989 - aco: schedule LDS instructions
4990 - aco: split vop3p results
4991 - aco/waitcnt: fix DS/VMEM ordered writes when mixed
4992 - aco: create lcssa phis for continue_or_break loops when necessary
5178 - aco: silent checking if clrxdisasm is available
5359 - radv,aco: stop duplicating PS/TCS epilog fields
5411 - radv: advertise extendedDynamicState3AlphaToOneEnable with ACO
5502 - aco: use SPDX-License-Identifier
5810 - aco: Eliminate SCC copies when possible.
5814 - aco: Allow passing constant operand to is_overwritten_since.
5823 - aco: Use common helper for counting tess level components.
5824 - aco: Use tess factors when TCS jumps to epilog.
5827 - radv, aco: Delete now dead TCS epilog code.
5835 - radv, aco: Remove the code that jumped to RADV's TCS epilogs.
5840 - aco: Delete all TCS epilog code.
5872 - aco/optimizer_postRA: Remove a check from SCC no-compare optimization.