Platform: AMD Accelerated Parallel Processing
  Device: gfx906
    Driver version  : 3204.0 (HSA1.1,LC) (Linux x64)
    Compute units   : 60
    Clock frequency : 1725 MHz

    Global memory bandwidth (GBPS)
      float   : 766.24
      float2  : 756.53
      float4  : 740.95
      float8  : 727.71
      float16 : 685.31

    Single-precision compute (GFLOPS)
      float   : 12886.15
      float2  : 12773.94
      float4  : 12636.76
      float8  : 12363.97
      float16 : 12180.00

    Half-precision compute (GFLOPS)
      half   : 6522.77
      half2  : 24971.55
      half4  : 24781.20
      half8  : 24465.16
      half16 : 23955.72

    Double-precision compute (GFLOPS)
      double   : 6350.20
      double2  : 6319.02
      double4  : 6291.70
      double8  : 5880.47
      double16 : 6143.47

    Integer compute (GIOPS)
      int   : 4325.27
      int2  : 4317.88
      int4  : 4307.68
      int8  : 4289.82
      int16 : 4242.46

    Integer compute Fast 24bit (GIOPS)
      int   : 12395.53
      int2  : 12199.22
      int4  : 11631.28
      int8  : 11757.87
      int16 : 11833.97

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 11.86
      enqueueReadBuffer               : 11.53
      enqueueWriteBuffer non-blocking : 11.52
      enqueueReadBuffer non-blocking  : 11.43
      enqueueMapBuffer(for read)      : 192599.44
        memcpy from mapped ptr        : 11.78
      enqueueUnmap(after write)       : 286331.16
        memcpy to mapped ptr          : 11.97

    Kernel launch latency : 11.44 us
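
To put the single-precision figures in context, the reported compute-unit count and clock can be turned into a rough theoretical peak. The sketch below is only an estimate under stated assumptions: the log does not report lanes per compute unit or how operations are counted, so the 64 lanes per compute unit and 2 FLOPs per fused multiply-add used here are assumptions, and the helper names `peak_gflops` and `efficiency` are hypothetical.

```python
# Rough theoretical FP32 peak for the device reported in the log.
# Assumptions (NOT stated in the log): 64 FP32 lanes per compute unit,
# and one fused multiply-add counted as 2 floating-point operations.

COMPUTE_UNITS = 60             # "Compute units" from the log
CLOCK_GHZ = 1.725              # "Clock frequency : 1725 MHz" from the log
LANES_PER_CU = 64              # assumed
FLOPS_PER_LANE_PER_CLOCK = 2   # assumed (multiply + add)


def peak_gflops(compute_units, clock_ghz,
                lanes=LANES_PER_CU, flops=FLOPS_PER_LANE_PER_CLOCK):
    """Estimated peak throughput in GFLOPS."""
    return compute_units * lanes * flops * clock_ghz


def efficiency(measured_gflops, peak):
    """Fraction of the estimated peak that a measurement reached."""
    return measured_gflops / peak


fp32_peak = peak_gflops(COMPUTE_UNITS, CLOCK_GHZ)
measured_float = 12886.15      # "float" row, single-precision compute

print(f"Estimated FP32 peak: {fp32_peak:.1f} GFLOPS")
print(f"Measured / estimated peak: {efficiency(measured_float, fp32_peak):.1%}")
```

If the assumptions hold, the measured scalar `float` result lands within a few percent of the estimate. The same helpers can be reused for the other rows by passing a different measured value.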