Platform: NVIDIA CUDA
  Device: Quadro P620
    Driver version  : 455.45.01 (Linux x64)
    Compute units   : 4
    Clock frequency : 1442 MHz

    Global memory bandwidth (GBPS)
      float   : 79.28
      float2  : 82.40
      float4  : 84.20
      float8  : 82.78
      float16 : 55.16

    Single-precision compute (GFLOPS)
      float   : 1357.53
      float2  : 1400.77
      float4  : 1397.70
      float8  : 1390.10
      float16 : 1385.35

    No half precision support! Skipped

    Double-precision compute (GFLOPS)
      double   : 45.19
      double2  : 45.03
      double4  : 44.97
      double8  : 44.82
      double16 : 44.36

    Integer compute (GIOPS)
      int   : 473.37
      int2  : 472.65
      int4  : 473.04
      int8  : 466.72
      int16 : 458.70

    Integer compute Fast 24bit (GIOPS)
      int   : 473.50
      int2  : 473.80
      int4  : 473.05
      int8  : 470.43
      int16 : 468.49

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 11.44
      enqueueReadBuffer               : 10.75
      enqueueWriteBuffer non-blocking : 11.12
      enqueueReadBuffer non-blocking  : 10.50
      enqueueMapBuffer(for read)      : 11.75
        memcpy from mapped ptr        : 15.16
      enqueueUnmap(after write)       : 12.85
        memcpy to mapped ptr          : 15.27

    Kernel launch latency : 3.50 us