Platform: AMD Accelerated Parallel Processing
  Device: gfx906+sram-ecc
    Driver version  : 3137.0 (HSA1.1,LC) (Linux x64)
    Compute units   : 60
    Clock frequency : 1725 MHz

    Global memory bandwidth (GBPS)
      float   : 765.79
      float2  : 655.94
      float4  : 645.82
      float8  : 652.67
      float16 : 582.26

    Single-precision compute (GFLOPS)
      float   : 12710.35
      float2  : 12307.32
      float4  : 12124.76
      float8  : 12007.03
      float16 : 11834.00

    Half-precision compute (GFLOPS)
      half   : 6422.43
      half2  : 23564.34
      half4  : 23395.76
      half8  : 23167.34
      half16 : 22676.43

    Double-precision compute (GFLOPS)
      double   : 5978.52
      double2  : 5953.91
      double4  : 5929.22
      double8  : 5892.56
      double16 : 5814.56

    Integer compute (GIOPS)
      int   : 4238.15
      int2  : 4228.25
      int4  : 4214.90
      int8  : 4198.91
      int16 : 4149.22

    Integer compute Fast 24bit (GIOPS)
      int   : 11816.17
      int2  : 11582.84
      int4  : 11094.79
      int8  : 11323.87
      int16 : 11321.21

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 15.91
      enqueueReadBuffer               : 15.35
      enqueueWriteBuffer non-blocking : 11.95
      enqueueReadBuffer non-blocking  : 12.24
      enqueueMapBuffer(for read)      : 130150.53
        memcpy from mapped ptr        : 15.90
      enqueueUnmap(after write)       : 248264.02
        memcpy to mapped ptr          : 16.02

    Kernel launch latency : 15.64 us