Platform: AMD Accelerated Parallel Processing
  Device: Cayman
    Driver version  : 1348.4 (Linux x64)
    Compute units   : 24

    Global memory bandwidth (GBPS)
      float   : 154.77
      float2  : 141.49
      float4  : 82.39
      float8  : 70.60
      float16 : 39.35

    Single-precision compute (GFLOPS)
      float   : 674.44
      float2  : 1345.68
      float4  : 2601.47
      float8  : 2586.69
      float16 : 2573.38

    Double-precision compute (GFLOPS)
      double   : 671.24
      double2  : 671.59
      double4  : 670.93
      double8  : 669.51
      double16 : 666.50

    Integer compute (GIOPS)
      int   : 337.36
      int2  : 448.62
      int4  : 537.68
      int8  : 536.19
      int16 : 533.12

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer         : 3.53
      enqueueReadBuffer          : 4.43
      enqueueMapBuffer(for read) : 152.89
        memcpy from mapped ptr   : 4.40
      enqueueUnmap(after write)  : 1781.26
        memcpy to mapped ptr     : 4.42

    Kernel launch latency : 44.22 us