Platform: AMD Accelerated Parallel Processing
  Device: gfx906
    Driver version  : 3204.0 (HSA1.1,LC) (Linux x64)
    Compute units   : 60
    Clock frequency : 1700 MHz

    Global memory bandwidth (GBPS)
      float   : 783.04
      float2  : 741.34
      float4  : 723.88
      float8  : 732.36
      float16 : 679.49

    Single-precision compute (GFLOPS)
      float   : 12727.97
      float2  : 12632.55
      float4  : 12403.68
      float8  : 12147.13
      float16 : 11960.99

    Half-precision compute (GFLOPS)
      half   : 6425.83
      half2  : 24459.28
      half4  : 24278.00
      half8  : 23921.18
      half16 : 23455.81

    Double-precision compute (GFLOPS)
      double   : 6206.76
      double2  : 6176.21
      double4  : 6135.32
      double8  : 6107.36
      double16 : 5924.13

    Integer compute (GIOPS)
      int   : 4186.51
      int2  : 4019.41
      int4  : 4003.08
      int8  : 4029.69
      int16 : 3976.25

    Integer compute Fast 24bit (GIOPS)
      int   : 11493.50
      int2  : 10816.38
      int4  : 10109.61
      int8  : 10421.03
      int16 : 10354.31

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 16.91
      enqueueReadBuffer               : 16.85
      enqueueWriteBuffer non-blocking : 16.91
      enqueueReadBuffer non-blocking  : 16.83
      enqueueMapBuffer(for read)      : 128591.83
        memcpy from mapped ptr        : 16.77
      enqueueUnmap(after write)       : 238609.30
        memcpy to mapped ptr          : 16.91

    Kernel launch latency : 14.06 us