Platform: AMD Accelerated Parallel Processing
  Device: gfx906
    Driver version  : 3204.0 (HSA1.1,LC) (Linux x64)
    Compute units   : 60
    Clock frequency : 1700 MHz

    Global memory bandwidth (GBPS)
      float   : 783.04
      float2  : 741.34
      float4  : 723.88
      float8  : 732.36
      float16 : 679.49

    Single-precision compute (GFLOPS)
      float   : 12727.97
      float2  : 12632.55
      float4  : 12403.68
      float8  : 12147.13
      float16 : 11960.99

    Half-precision compute (GFLOPS)
      half   : 6425.83
      half2  : 24459.28
      half4  : 24278.00
      half8  : 23921.18
      half16 : 23455.81

    Double-precision compute (GFLOPS)
      double   : 6206.76
      double2  : 6176.21
      double4  : 6135.32
      double8  : 6107.36
      double16 : 5924.13

    Integer compute (GIOPS)
      int   : 4186.51
      int2  : 4019.41
      int4  : 4003.08
      int8  : 4029.69
      int16 : 3976.25

    Integer compute Fast 24bit (GIOPS)
      int   : 11493.50
      int2  : 10816.38
      int4  : 10109.61
      int8  : 10421.03
      int16 : 10354.31

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 16.91
      enqueueReadBuffer               : 16.85
      enqueueWriteBuffer non-blocking : 16.91
      enqueueReadBuffer non-blocking  : 16.83
      enqueueMapBuffer(for read)      : 128591.83
        memcpy from mapped ptr        : 16.77
      enqueueUnmap(after write)       : 238609.30
        memcpy to mapped ptr          : 16.91

    Kernel launch latency : 14.06 us