Platform: AMD Accelerated Parallel Processing
  Device: gfx906+sram-ecc
    Driver version  : 3137.0 (HSA1.1,LC) (Linux x64)
    Compute units   : 60
    Clock frequency : 1725 MHz

    Global memory bandwidth (GBPS)
      float   : 765.79
      float2  : 655.94
      float4  : 645.82
      float8  : 652.67
      float16 : 582.26

    Single-precision compute (GFLOPS)
      float   : 12710.35
      float2  : 12307.32
      float4  : 12124.76
      float8  : 12007.03
      float16 : 11834.00

    Half-precision compute (GFLOPS)
      half   : 6422.43
      half2  : 23564.34
      half4  : 23395.76
      half8  : 23167.34
      half16 : 22676.43

    Double-precision compute (GFLOPS)
      double   : 5978.52
      double2  : 5953.91
      double4  : 5929.22
      double8  : 5892.56
      double16 : 5814.56

    Integer compute (GIOPS)
      int   : 4238.15
      int2  : 4228.25
      int4  : 4214.90
      int8  : 4198.91
      int16 : 4149.22

    Integer compute Fast 24bit (GIOPS)
      int   : 11816.17
      int2  : 11582.84
      int4  : 11094.79
      int8  : 11323.87
      int16 : 11321.21

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 15.91
      enqueueReadBuffer               : 15.35
      enqueueWriteBuffer non-blocking : 11.95
      enqueueReadBuffer non-blocking  : 12.24
      enqueueMapBuffer(for read)      : 130150.53
        memcpy from mapped ptr        : 15.90
      enqueueUnmap(after write)       : 248264.02
        memcpy to mapped ptr          : 16.02

    Kernel launch latency : 15.64 us