Platform: AMD Accelerated Parallel Processing
  Device: gfx906
    Driver version  : 3204.0 (HSA1.1,LC) (Linux x64)
    Compute units   : 60
    Clock frequency : 1725 MHz

    Global memory bandwidth (GBPS)
      float   : 766.24
      float2  : 756.53
      float4  : 740.95
      float8  : 727.71
      float16 : 685.31

    Single-precision compute (GFLOPS)
      float   : 12886.15
      float2  : 12773.94
      float4  : 12636.76
      float8  : 12363.97
      float16 : 12180.00

    Half-precision compute (GFLOPS)
      half   : 6522.77
      half2  : 24971.55
      half4  : 24781.20
      half8  : 24465.16
      half16 : 23955.72

    Double-precision compute (GFLOPS)
      double   : 6350.20
      double2  : 6319.02
      double4  : 6291.70
      double8  : 5880.47
      double16 : 6143.47

    Integer compute (GIOPS)
      int   : 4325.27
      int2  : 4317.88
      int4  : 4307.68
      int8  : 4289.82
      int16 : 4242.46

    Integer compute Fast 24bit (GIOPS)
      int   : 12395.53
      int2  : 12199.22
      int4  : 11631.28
      int8  : 11757.87
      int16 : 11833.97

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 11.86
      enqueueReadBuffer               : 11.53
      enqueueWriteBuffer non-blocking : 11.52
      enqueueReadBuffer non-blocking  : 11.43
      enqueueMapBuffer(for read)      : 192599.44
        memcpy from mapped ptr        : 11.78
      enqueueUnmap(after write)       : 286331.16
        memcpy to mapped ptr          : 11.97

    Kernel launch latency : 11.44 us