Workload descriptor format
==========================

ctx.engine.duration_us.dependency.wait,...
<uint>.<str>.<uint>[-<uint>]|*.<int <= 0>[/<int <= 0>][...].<0|1>,...
B.<uint>
M.<uint>.<str>[|<str>]...
P|S|X.<uint>.<int>
d|p|s|t|q|a|T.<int>,...
b.<uint>.<str>[|<str>].<str>
f

For duration a range can be given from which a random value will be picked
before every submit. Since this and seqno management require CPU access to
objects, care needs to be taken to ensure the submit queue is deep enough that
these operations do not affect the execution speed, unless that is desired.

Additional workload steps are also supported:

 'd' - Adds a delay (in microseconds).
 'p' - Adds a delay relative to the start of the previous loop so that each loop
       starts execution with a given period.
 's' - Synchronises the pipeline to a batch relative to the step.
 't' - Throttle every n batches.
 'q' - Throttle to n max queue depth.
 'f' - Create a sync fence.
 'a' - Advance the previously created sync fence.
 'B' - Turn on context load balancing.
 'b' - Set up engine bonds.
 'M' - Set up engine map.
 'P' - Context priority.
 'S' - Context SSEU configuration.
 'T' - Terminate an infinite batch.
 'X' - Context preemption control.

Engine ids: DEFAULT, RCS, BCS, VCS, VCS1, VCS2, VECS

Example (leading spaces must not be present in the actual file):
----------------------------------------------------------------

  1.VCS1.3000.0.1
  1.RCS.500-1000.-1.0
  1.RCS.3700.0.0
  1.RCS.1000.-2.0
  1.VCS2.2300.-2.0
  1.RCS.4700.-1.0
  1.VCS2.600.-1.1
  p.16000

The above workload described in human language works like this:

  1.   A batch is sent to the VCS1 engine which will be executing for 3ms on the
       GPU and userspace will wait until it is finished before proceeding.
  2-4. Now three batches are sent to RCS with durations of 0.5-1ms (random
       duration range), 3.7ms and 1ms respectively. The first batch has a data
       dependency on the preceding VCS1 batch, and the last of the group depends
       on the first from the group.
  5.   Now a 2.3ms batch is sent to VCS2, with a data dependency on the 3.7ms
       RCS batch.
  6.   This is followed by a 4.7ms RCS batch with a data dependency on the 2.3ms
       VCS2 batch.
  7.   Then a 0.6ms VCS2 batch is sent depending on the previous RCS one. In the
       same step the tool is told to wait for the batch to complete before
       proceeding.
  8.   Finally the tool is told to wait long enough to ensure the next iteration
       starts 16ms after the previous one has started.

When workload descriptors are provided on the command line, commas must be used
instead of new lines.

Multiple dependencies can be given separated by forward slashes.

Example:

  1.VCS1.3000.0.1
  1.RCS.3700.0.0
  1.VCS2.2300.-1/-2.0

In this case the last step has a data dependency on both the first and second
steps.

Batch durations can also be specified as infinite by using '*' in the
duration field. Such batches must be ended by the terminate command ('T'),
otherwise they will cause a GPU hang to be reported.

Sync (fd) fences
----------------

Sync fences are also supported as dependencies.

To use them put a "f<N>" token in the step dependency list. N is in this case
the same relative step offset to the dependee batch, but instead of the data
dependency an output fence will be emitted at the dependee step, and passed in
as a dependency in the current step.

Example:

  1.VCS1.3000.0.0
  1.RCS.500-1000.-1/f-1.0

In this case the second step will have both a data dependency and a sync fence
dependency on the previous step.

Example:

  1.RCS.500-1000.0.0
  1.VCS1.3000.f-1.0
  1.VCS2.3000.f-2.0

VCS1 and VCS2 batches will have a sync fence dependency on the RCS batch.

Example:

  1.RCS.500-1000.0.0
  f
  2.VCS1.3000.f-1.0
  2.VCS2.3000.f-2.0
  1.RCS.500-1000.0.1
  a.-4
  s.-4
  s.-4

VCS1 and VCS2 batches have an input sync fence dependency on the standalone
fence created at the second step. They are submitted ahead of time while still
not runnable. When the second RCS batch completes the standalone fence is
signalled, which allows the two VCS batches to be executed. Finally we wait
until both VCS batches have completed before starting the (optional) next
iteration.

Submit fences
-------------

Submit fences are a type of input fence which are signalled when the originating
batch buffer is submitted to the GPU. (In contrast to normal sync fences, which
are signalled when completed.)

Submit fences have the same syntax as the sync fences, with the lower-case
's' being used to select them. Eg:

  1.RCS.500-1000.0.0
  1.VCS1.3000.s-1.0
  1.VCS2.3000.s-2.0

Here VCS1 and VCS2 batches will only be submitted for execution once the RCS
batch enters the GPU.
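
The dependency field syntax described in the sections above (plain offsets,
"f" sync fences, "s" submit fences, forward slash separators) can be sketched
in Python as follows. This is only an illustration and not the parser used by
the tool. The function names and the kind labels are hypothetical, and treating
an offset of 0 as "no dependency" is inferred from the examples in this
document.

```python
from typing import List, Tuple

# Hypothetical labels for the three dependency kinds described above.
KINDS = {"": "data", "f": "sync_fence", "s": "submit_fence"}


def parse_dependencies(field: str) -> List[Tuple[str, int]]:
    """Split a dependency field such as "-1/f-1" into (kind, offset) pairs."""
    deps = []
    for token in field.split("/"):
        # An optional leading 'f' or 's' selects a sync or submit fence.
        prefix = token[0] if token[:1] in ("f", "s") else ""
        offset = int(token[len(prefix):])
        if offset > 0:
            raise ValueError(f"dependency offset must be <= 0: {token!r}")
        if offset != 0:  # assumption: 0 means "no dependency"
            deps.append((KINDS[prefix], offset))
    return deps


def resolve(step_index: int, deps: List[Tuple[str, int]]) -> List[Tuple[str, int]]:
    """Turn relative offsets into absolute, zero-based step indices."""
    return [(kind, step_index + offset) for kind, offset in deps]


# Third step (index 2) of the submit fence example: "1.VCS2.3000.s-2.0"
print(resolve(2, parse_dependencies("s-2")))
```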

Context priority
----------------

  P.1.-1
  1.RCS.1000.0.0
  P.2.1
  2.BCS.1000.-2.0

Context 1 is marked as low priority (-1) and then a batch buffer is submitted
against it. Context 2 is marked as high priority (1) and then a batch buffer
is submitted against it which depends on the batch from context 1.

The context priority command is executed at workload runtime and is valid until
overridden by another (optional) priority change on the same context. Actual
driver ioctls are executed only if the priority level has changed for the
context.

Context preemption control
--------------------------

  X.1.0
  1.RCS.1000.0.0
  X.1.500
  1.RCS.1000.0.0

Context 1 is marked as having non-preemptable batches and a batch is sent
against it. The same context is then marked to have batches which can be
preempted every 500us and another batch is submitted.

Same as with context priority, context preemption commands are valid until
optionally overridden by another preemption control change on the same context.

Engine maps
-----------

Engine maps are a per context feature which changes the way engine selection is
done in the driver.

Example:

  M.1.VCS1|VCS2

This sets up context 1 with an engine map containing the VCS1 and VCS2 engines.
Submission to this context can now only reference these two engines.

Engine maps can also be defined based on class like VCS.

Example:

  M.1.VCS

This sets up the engine map to all available VCS class engines.
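
As a rough illustration of how an 'M' step could be read, the Python sketch
below splits the engine list and expands a class name into its instances. It is
not the implementation used by the tool. The CLASS_MEMBERS table is a
hypothetical stand-in for "all available VCS class engines", which in reality
depends on the GPU the workload runs on.

```python
from typing import List, Tuple

# Engine ids as listed in this document.
ENGINE_IDS = ["DEFAULT", "RCS", "BCS", "VCS", "VCS1", "VCS2", "VECS"]

# Hypothetical: the instances that the class name "VCS" stands for on a
# machine with two video engines. Real hardware may differ.
CLASS_MEMBERS = {"VCS": ["VCS1", "VCS2"]}


def parse_engine_map(step: str) -> Tuple[int, List[str]]:
    """Parse "M.<ctx>.<engine>[|<engine>]..." into (context, engine list)."""
    fields = step.split(".")
    if len(fields) != 3 or fields[0] != "M":
        raise ValueError(f"not an engine map step: {step!r}")
    ctx = int(fields[1])
    engines: List[str] = []
    for name in fields[2].split("|"):
        if name not in ENGINE_IDS:
            raise ValueError(f"unknown engine id: {name!r}")
        # Expand a class name into its instances, keep other names as given.
        for engine in CLASS_MEMBERS.get(name, [name]):
            if engine not in engines:
                engines.append(engine)
    return ctx, engines


print(parse_engine_map("M.1.VCS1|VCS2"))
print(parse_engine_map("M.1.VCS"))
```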

Context load balancing
----------------------

Context load balancing (aka Virtual Engine) is an i915 feature where the driver
will pick the best engine (most idle) to submit to, given a previously
configured engine map.

Example:

  B.1

This enables load balancing for context number one.

Engine bonds
------------

Engine bonds are extensions on load balanced contexts. They allow expressing
rules of engine selection between two co-operating contexts tied with submit
fences. In other words, the rule expression is telling the driver: "If you pick
this engine for context one, then you have to pick that engine for context two".

Syntax is:
  b.<context>.<engine_list>.<master_engine>

Engine list is a list of one or more sibling engines separated by a pipe
character (eg. "VCS1|VCS2").

There can be multiple bonds tied to the same context.

Example:

  M.1.RCS|VECS
  B.1
  M.2.VCS1|VCS2
  B.2
  b.2.VCS1.RCS
  b.2.VCS2.VECS

This tells the driver that if it picked RCS for context one, it has to pick VCS1
for context two. And if it picked VECS for context one, it has to pick VCS2 for
context two.

If we extend the above example with more workload directives:

  1.DEFAULT.1000.0.0
  2.DEFAULT.1000.s-1.0

We get to a fully functional example where two batch buffers are submitted in a
load balanced fashion, telling the driver they should run simultaneously and
that valid engine pairs are either RCS + VCS1 (for two contexts respectively),
or VECS + VCS2.
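
The valid engine pairs named above can be derived mechanically from the 'b'
steps. The Python sketch below shows one way to do that. It is an illustration
only, the function names are hypothetical, and it does not check the bonds
against the engine maps.

```python
from typing import List, Tuple


def parse_bond(step: str) -> Tuple[int, List[str], str]:
    """Parse "b.<context>.<engine_list>.<master_engine>"."""
    tag, ctx, engine_list, master = step.split(".")
    if tag != "b":
        raise ValueError(f"not a bond step: {step!r}")
    return int(ctx), engine_list.split("|"), master


def valid_pairs(steps: List[str]) -> List[Tuple[str, str]]:
    """Return (master engine, bonded engine) pairs from all 'b' steps."""
    pairs = []
    for step in steps:
        if not step.startswith("b."):
            continue  # skip 'M', 'B' and batch steps
        _ctx, siblings, master = parse_bond(step)
        pairs.extend((master, sibling) for sibling in siblings)
    return pairs


steps = [
    "M.1.RCS|VECS", "B.1",
    "M.2.VCS1|VCS2", "B.2",
    "b.2.VCS1.RCS", "b.2.VCS2.VECS",
]
print(valid_pairs(steps))  # [('RCS', 'VCS1'), ('VECS', 'VCS2')]
```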

This can also be extended using sync fences to improve chances of the first
submission not getting on the hardware after the second one. Second block would
then look like:

  f
  1.DEFAULT.1000.f-1.0
  2.DEFAULT.1000.s-1.0
  a.-3

Context SSEU configuration
--------------------------

  S.1.1
  1.RCS.1000.0.0
  S.2.-1
  2.RCS.1000.0.0

Context 1 is configured to run with one enabled slice (slice mask 1) and a batch
is submitted against it. Context 2 is configured to run with all slices (this is
the default so the command could also be omitted) and a batch is submitted
against it.

This shows the dynamic SSEU reconfiguration cost between two contexts competing
for the render engine.

Slice mask of -1 has a special meaning of "all slices". Otherwise any integer
can be specified as the slice mask, but beware that any value apart from 1 and
-1 can make the workload not portable between different GPUs.
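
The portability remark can be illustrated with a small Python sketch. It
assumes the slice mask is a bitmask in which bit N selects slice N, which this
document does not state explicitly. The function name and the total_slices
parameter are hypothetical. A mask that selects a slice the GPU does not have
is rejected here, which is one way such a workload could fail to carry over to
another GPU.

```python
from typing import List


def enabled_slices(mask: int, total_slices: int) -> List[int]:
    """List the slice indices selected by a slice mask.

    Assumption: bit N of the mask selects slice N, and -1 means all slices.
    """
    if mask == -1:
        return list(range(total_slices))
    if mask <= 0:
        raise ValueError(f"invalid slice mask: {mask}")
    if mask >> total_slices:
        # The mask refers to slices beyond what this GPU provides.
        raise ValueError(f"mask {mask:#x} needs more than {total_slices} slices")
    return [i for i in range(total_slices) if mask & (1 << i)]


print(enabled_slices(1, 3))   # slice mask 1: only the first slice
print(enabled_slices(-1, 3))  # -1: all slices of a three-slice GPU
```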