//===---------------------------------------------------------------------===//
// Random notes about and ideas for the SystemZ backend.
//===---------------------------------------------------------------------===//

The initial backend is deliberately restricted to z10.  We should add support
for later architectures at some point.

--

If an inline asm ties an i32 "r" result to an i64 input, the input
will be treated as an i32, leaving the upper bits uninitialised.
For example:

define void @f4(i32 *%dst) {
  %val = call i32 asm "blah $0", "=r,0" (i64 103)
  store i32 %val, i32 *%dst
  ret void
}

from CodeGen/SystemZ/asm-09.ll will use LHI rather than LGHI
to load 103.  This seems to be a general target-independent problem.

--

The tuning of the choice between LOAD ADDRESS (LA) and addition in
SystemZISelDAGToDAG.cpp is suspect.  It should be tweaked based on
performance measurements.

--

There is no scheduling support.

--

We don't use the BRANCH ON INDEX instructions.

--

We only use MVC, XC and CLC for constant-length block operations.
We could extend them to variable-length operations too,
using EXECUTE RELATIVE LONG.
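
As a rough sketch of the shape a variable-length copy might take (plain C,
not backend code): a single MVC moves at most 256 bytes, so a run-time
length would be split into full 256-byte blocks plus one final block whose
length is only known at run time.  The names mvc_block and copy_variable
are made up for this note, and memcpy merely stands in for MVC.

```c
#include <stddef.h>
#include <string.h>

/* Largest number of bytes a single MVC can move. */
enum { MVC_MAX = 256 };

/* Hypothetical stand-in for one MVC; len is expected to be 1..256. */
void mvc_block(unsigned char *dst, const unsigned char *src, size_t len)
{
    memcpy(dst, src, len);
}

/* Copy len bytes: full-size blocks first, then one variable-length tail. */
void copy_variable(unsigned char *dst, const unsigned char *src, size_t len)
{
    while (len > MVC_MAX) {
        mvc_block(dst, src, MVC_MAX);   /* constant length: plain MVC */
        dst += MVC_MAX;
        src += MVC_MAX;
        len -= MVC_MAX;
    }
    if (len > 0)
        mvc_block(dst, src, len);       /* run-time length: the EXECUTE case */
}
```

Only the final call has a length that is not a compile-time constant, which
is the step where EXECUTE RELATIVE LONG would come in.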

MVCIN, MVCLE and CLCLE may be worthwhile too.

--

We don't use CUSE or the TRANSLATE family of instructions for string
operations.  The TRANSLATE ones are probably more difficult to exploit.

--

We don't take full advantage of builtins like fabsl because the calling
conventions require f128s to be returned by invisible reference.

--

ADD LOGICAL WITH SIGNED IMMEDIATE could be useful when we need to
produce a carry.  SUBTRACT LOGICAL IMMEDIATE could be useful when we
need to produce a borrow.  (Note that there are no memory forms of
ADD LOGICAL WITH CARRY and SUBTRACT LOGICAL WITH BORROW, so the high
part of 128-bit memory operations would probably need to be done
via a register.)
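
The carry chain these instructions would serve can be sketched in C as
follows.  This is only an illustration of the arithmetic, not backend code;
the u128_parts type and the add128 name are made up for this note.

```c
#include <stdint.h>

/* Hypothetical 128-bit value held as two 64-bit halves. */
typedef struct {
    uint64_t hi;
    uint64_t lo;
} u128_parts;

/* Add two 128-bit values half by half.  The low-half addition is the
   "logical" add that produces a carry; the high-half addition consumes
   it, which is the step that has no memory form. */
u128_parts add128(u128_parts a, u128_parts b)
{
    u128_parts r;
    r.lo = a.lo + b.lo;
    uint64_t carry = r.lo < a.lo;   /* 1 if the low half wrapped around */
    r.hi = a.hi + b.hi + carry;
    return r;
}
```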

--

We don't use ICM or STCM.

--

DAGCombiner doesn't yet fold truncations of extended loads.  Functions like:

    unsigned long f (unsigned long x, unsigned short *y)
    {
      return (x << 32) | *y;
    }

therefore end up as:

        sllg    %r2, %r2, 32
        llgh    %r0, 0(%r3)
        lr      %r2, %r0
        br      %r14

but truncating the load would give:

        sllg    %r2, %r2, 32
        llh     %r2, 0(%r3)
        br      %r14
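
The equivalence behind the shorter sequence can be checked in C.  This is a
sketch under the assumption that unsigned long is 64 bits and unsigned short
is 16 bits, as on SystemZ, so fixed-width types are used; the function names
are made up for this note.

```c
#include <stdint.h>

/* The function from the note, written with fixed-width types. */
uint64_t f_wide(uint64_t x, const uint16_t *y)
{
    return (x << 32) | *y;
}

/* The same result computed the way the shorter sequence works: only the
   low 32 bits of the shifted value are replaced by the zero-extended
   halfword, and the high 32 bits are left alone. */
uint64_t f_insert_low32(uint64_t x, const uint16_t *y)
{
    uint64_t shifted = x << 32;
    uint32_t low = *y;              /* 16-bit to 32-bit zero extension */
    return (shifted & 0xFFFFFFFF00000000ULL) | low;
}
```

Because the shift already clears the low 32 bits, the 64-bit extension of
the load is never observed and a 32-bit load into the low half is enough.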

--

Functions like:

define i64 @f1(i64 %a) {
  %and = and i64 %a, 1
  ret i64 %and
}

ought to be implemented as:

        lhi     %r0, 1
        ngr     %r2, %r0
        br      %r14

but two-address optimizations reverse the order of the AND and force:

        lhi     %r0, 1
        ngr     %r0, %r2
        lgr     %r2, %r0
        br      %r14

CodeGen/SystemZ/and-04.ll has several examples of this.

--

Out-of-range displacements are usually handled by loading the full
address into a register.  In many cases it would be better to create
an anchor point instead.  E.g. for:

define void @f4a(i128 *%aptr, i64 %base) {
  %addr = add i64 %base, 524288
  %bptr = inttoptr i64 %addr to i128 *
  %a = load volatile i128 *%aptr
  %b = load i128 *%bptr
  %add = add i128 %a, %b
  store i128 %add, i128 *%aptr
  ret void
}

(from CodeGen/SystemZ/int-add-08.ll) we load %base+524288 and %base+524296
into separate registers, rather than using %base+524288 as a base for both.
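
A small C sketch of the range check involved, assuming the signed 20-bit
long-displacement range of -524288 to 524287.  The helper names
fits_disp20 and anchor_covers are made up for this note and do not
correspond to anything in the backend.

```c
#include <stdbool.h>
#include <stdint.h>

/* Range of a signed 20-bit displacement. */
#define DISP20_MIN (-524288)
#define DISP20_MAX 524287

/* True if disp can be encoded directly in the instruction. */
bool fits_disp20(int64_t disp)
{
    return disp >= DISP20_MIN && disp <= DISP20_MAX;
}

/* True if every offset (relative to the original base) can be reached
   from a single anchor register holding base + anchor. */
bool anchor_covers(int64_t anchor, const int64_t *offsets, int n)
{
    for (int i = 0; i < n; i++)
        if (!fits_disp20(offsets[i] - anchor))
            return false;
    return true;
}
```

For the example above, neither 524288 nor 524296 fits on its own, but an
anchor at %base+524288 leaves displacements of 0 and 8, so one extra
register would do for both halves.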

--

Dynamic stack allocations round the size to 8 bytes and then allocate
that rounded amount.  It would be simpler to subtract the unrounded
size from the copy of the stack pointer and then align the result.
See CodeGen/SystemZ/alloca-01.ll for an example.
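
The two orderings can be compared with a short C sketch.  It assumes a
downward-growing stack and an incoming stack pointer that is already 8-byte
aligned; the function names are made up for this note.

```c
#include <assert.h>
#include <stdint.h>

/* Current approach: round the size up to 8 bytes, then subtract. */
uint64_t alloc_round_then_subtract(uint64_t sp, uint64_t size)
{
    uint64_t rounded = (size + 7) & ~(uint64_t)7;
    return sp - rounded;
}

/* Suggested approach: subtract the raw size, then align downwards. */
uint64_t alloc_subtract_then_align(uint64_t sp, uint64_t size)
{
    assert(sp % 8 == 0);    /* assumption: incoming sp is 8-byte aligned */
    return (sp - size) & ~(uint64_t)7;
}
```

Under that alignment assumption both functions return the same new stack
pointer, but the second needs one fewer arithmetic step on the size.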

--

If needed, we can support 16-byte atomics using LPQ, STPQ and CDSG.

--

We might want to model all access registers and use them to spill
32-bit values.

--

We might want to use the 'overflow' condition of e.g. AR to support
llvm.sadd.with.overflow.i32 and related intrinsics - the generated code
for signed overflow checks is currently quite bad.  This would improve
the results of using -ftrapv.
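
For reference, this is the kind of source-level check that Clang lowers to
llvm.sadd.with.overflow.i32.  It is a minimal sketch using the GCC/Clang
builtin __builtin_sadd_overflow; the wrapper name checked_add is made up
for this note.

```c
#include <limits.h>
#include <stdbool.h>

/* Returns true if a + b overflows int.  The wrapped sum is stored in
   *result either way. */
bool checked_add(int a, int b, int *result)
{
    return __builtin_sadd_overflow(a, b, result);
}
```

With hardware support, the check would ideally become an add followed by a
branch on the overflow condition rather than a separate comparison sequence.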