Platform: NVIDIA CUDA
  Device: Quadro P620
    Driver version  : 455.45.01 (Linux x64)
    Compute units   : 4
    Clock frequency : 1442 MHz

    Global memory bandwidth (GBPS)
      float   : 79.28
      float2  : 82.40
      float4  : 84.20
      float8  : 82.78
      float16 : 55.16

    Single-precision compute (GFLOPS)
      float   : 1357.53
      float2  : 1400.77
      float4  : 1397.70
      float8  : 1390.10
      float16 : 1385.35

    No half precision support! Skipped

    Double-precision compute (GFLOPS)
      double   : 45.19
      double2  : 45.03
      double4  : 44.97
      double8  : 44.82
      double16 : 44.36

    Integer compute (GIOPS)
      int   : 473.37
      int2  : 472.65
      int4  : 473.04
      int8  : 466.72
      int16 : 458.70

    Integer compute Fast 24bit (GIOPS)
      int   : 473.50
      int2  : 473.80
      int4  : 473.05
      int8  : 470.43
      int16 : 468.49

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 11.44
      enqueueReadBuffer               : 10.75
      enqueueWriteBuffer non-blocking : 11.12
      enqueueReadBuffer non-blocking  : 10.50
      enqueueMapBuffer(for read)      : 11.75
        memcpy from mapped ptr        : 15.16
      enqueueUnmap(after write)       : 12.85
        memcpy to mapped ptr          : 15.27

    Kernel launch latency : 3.50 us