1## Batch Size 64 Compile false 2 3| Experiment | Warmup_latency (s) | Average_latency (s) | Throughput (samples/sec) | GPU Utilization (%) | 4| ---------- | ------------------ | ------------------- | ------------------------ | ------------------- | 5| original | 5.823 +/- 0.541 | 8.024 +/- 0.508 | 460.474 +/- 29.896 | 47.112 +/- 6.576 | 6| h2d_d2h_threads | 3.971 +/- 0.709 | 9.583 +/- 1.030 | 393.394 +/- 26.110 | 37.550 +/- 1.381 | 7| 2_predict_workers | 3.535 +/- 0.124 | 8.258 +/- 1.188 | 441.640 +/- 85.705 | 38.630 +/- 3.499 | 8| 3_predict_workers | 3.724 +/- 0.312 | 7.614 +/- 0.951 | 467.236 +/- 36.619 | 37.834 +/- 2.418 | 9| 4_predict_workers | 4.346 +/- 0.744 | 8.417 +/- 0.541 | 430.672 +/- 17.979 | 34.461 +/- 3.257 | 10