Platform: NVIDIA CUDA
  Device: Quadro K620
    Driver version  : 515.76 (Linux x64)
    Compute units   : 3
    Clock frequency : 1124 MHz

    Global memory bandwidth (GBPS)
      float   : 25.41
      float2  : 26.21
      float4  : 26.69
      float8  : 25.73
      float16 : 22.42

    Single-precision compute (GFLOPS)
      float   : 569.12
      float2  : 839.43
      float4  : 856.16
      float8  : 851.06
      float16 : 848.14

    No half precision support! Skipped

    Double-precision compute (GFLOPS)
      double   : 27.47
      double2  : 27.48
      double4  : 27.42
      double8  : 27.32
      double16 : 27.11

    Integer compute (GIOPS)
      int   : 258.66
      int2  : 287.30
      int4  : 289.72
      int8  : 275.19
      int16 : 264.43

    Integer compute Fast 24bit (GIOPS)
      int   : 258.66
      int2  : 287.06
      int4  : 289.57
      int8  : 287.89
      int16 : 286.48

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 5.16
      enqueueReadBuffer               : 5.13
      enqueueWriteBuffer non-blocking : 2.17
      enqueueReadBuffer non-blocking  : 2.73
      enqueueMapBuffer(for read)      : 5.43
        memcpy from mapped ptr        : 4.48
      enqueueUnmap(after write)       : 6.14
        memcpy to mapped ptr          : 4.49

    Kernel launch latency : 6.87 us