## Batch Size 1 Compile true

| Experiment | Warmup_latency (s) | Average_latency (s) | Throughput (samples/sec) | GPU Utilization (%) |
| ---------- | ------------------ | ------------------- | ------------------------ | ------------------- |
| original | 13.828 +/- 0.535 | 0.297 +/- 0.034 | 205.657 +/- 14.429 | 15.630 +/- 1.601 |
| h2d_d2h_threads | 12.515 +/- 0.666 | 0.519 +/- 0.107 | 138.126 +/- 21.821 | 12.482 +/- 1.822 |