.. SPDX-License-Identifier: GPL-2.0

=========================================
IAA Compression Accelerator Crypto Driver
=========================================

Tom Zanussi <[email protected]>

The IAA crypto driver supports compression/decompression compatible
with the DEFLATE compression standard described in RFC 1951, which is
the compression/decompression algorithm exported by this module.

The IAA hardware spec can be found here:

  https://cdrdv2.intel.com/v1/dl/getContent/721858

The iaa_crypto driver is designed to work as a layer underneath
higher-level compression devices such as zswap.

Users can select IAA compress/decompress acceleration by specifying
one of the supported IAA compression algorithms in whatever facility
allows compression algorithms to be selected.

For example, a zswap device can select the IAA 'fixed' mode
represented by selecting the 'deflate-iaa' crypto compression
algorithm::

  # echo deflate-iaa > /sys/module/zswap/parameters/compressor

This will tell zswap to use the IAA 'fixed' compression mode for all
compresses and decompresses.

Currently, there is only one compression mode available, 'fixed'
mode.

The 'fixed' compression mode implements the compression scheme
specified by RFC 1951 and is given the crypto algorithm name
'deflate-iaa'.  (Because the IAA hardware has a 4k history-window
limitation, only buffers <= 4k, or that have been compressed using a
<= 4k history window, are technically compliant with the deflate spec,
which allows for a window of up to 32k.  Because of this limitation,
the IAA fixed mode deflate algorithm is given its own algorithm name
rather than simply 'deflate').
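
Before pointing a facility such as zswap at 'deflate-iaa', it can help
to confirm that the algorithm is actually registered with the crypto
subsystem. The following is a minimal sketch, not part of the driver:
``has_crypto_alg`` is a hypothetical helper name, and the sketch assumes
the usual ``field : value`` layout of /proc/crypto, accepting a match on
either the ``name`` or the ``driver`` field.

```shell
#!/bin/sh
# Sketch: report whether an algorithm appears in a /proc/crypto style listing.
# has_crypto_alg is a hypothetical helper; the optional second argument
# defaults to /proc/crypto and exists so the function can be tried on a
# sample file.
has_crypto_alg() {
    alg="$1"
    listing="${2:-/proc/crypto}"
    # Entries are assumed to look like "name : value" or "driver : value";
    # a match on either field counts as "registered".
    awk -v alg="$alg" -F' *: *' '
        ($1 == "name" || $1 == "driver") && $2 == alg { found = 1 }
        END { exit found ? 0 : 1 }
    ' "$listing"
}
```

For example, ``has_crypto_alg deflate-iaa && echo registered`` prints
``registered`` only when a matching entry is present.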


Config options and other setup
==============================

The IAA crypto driver is available via menuconfig using the following
path::

  Cryptographic API -> Hardware crypto devices -> Support for Intel(R) IAA Compression Accelerator

In the configuration file the option is called CONFIG_CRYPTO_DEV_IAA_CRYPTO.

The IAA crypto driver also supports statistics, which are available
via menuconfig using the following path::

  Cryptographic API -> Hardware crypto devices -> Support for Intel(R) IAA Compression -> Enable Intel(R) IAA Compression Accelerator Statistics

In the configuration file the option is called CONFIG_CRYPTO_DEV_IAA_CRYPTO_STATS.

The following config options should also be enabled::

  CONFIG_IRQ_REMAP=y
  CONFIG_INTEL_IOMMU=y
  CONFIG_INTEL_IOMMU_SVM=y
  CONFIG_PCI_ATS=y
  CONFIG_PCI_PRI=y
  CONFIG_PCI_PASID=y
  CONFIG_INTEL_IDXD=m
  CONFIG_INTEL_IDXD_SVM=y

IAA is one of the first Intel accelerator IPs that can work in
conjunction with the Intel IOMMU.  There are multiple modes that exist
for testing.  Based on IOMMU configuration, there are 3 modes::

  - Scalable
  - Legacy
  - No IOMMU


Scalable mode
-------------

Scalable mode supports Shared Virtual Memory (SVM or SVA).  It is
entered when using the kernel boot commandline::

  intel_iommu=on,sm_on

with VT-d turned on in BIOS.

With scalable mode, both shared and dedicated workqueues are available
for use.

For scalable mode, the following BIOS settings should be enabled::

  Socket Configuration > IIO Configuration > Intel VT for Directed I/O (VT-d) > Intel VT for Directed I/O

  Socket Configuration > IIO Configuration > PCIe ENQCMD > ENQCMDS


Legacy mode
-----------

Legacy mode is entered when using the kernel boot commandline::

  intel_iommu=off

or when VT-d is not turned on in BIOS.
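
The boot parameters named so far can be checked on a running system by
looking at the kernel command line. The sketch below is an
illustration only: ``iaa_iommu_mode`` is a hypothetical helper that maps
a command line onto the scalable and legacy mode names used in this
document. It looks only at boot parameters and cannot tell whether VT-d
is enabled in BIOS.

```shell
#!/bin/sh
# Sketch: map a kernel command line onto the mode names used in this document.
# iaa_iommu_mode is a hypothetical helper; with no argument it reads
# /proc/cmdline, otherwise it classifies the string passed in.
iaa_iommu_mode() {
    cmdline="${1:-$(cat /proc/cmdline)}"
    mode="unknown"
    for arg in $cmdline; do
        case "$arg" in
            intel_iommu=on,sm_on) mode="scalable" ;;  # as given under "Scalable mode"
            intel_iommu=off)      mode="legacy" ;;    # as given under "Legacy mode"
        esac
    done
    echo "$mode"
}
```

Calling ``iaa_iommu_mode`` with no argument classifies the running
kernel. A result of ``unknown`` only means that neither parameter was
found on the command line.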

If you have booted into Linux and are not sure whether VT-d is on, run
"dmesg | grep -i dmar".  If you don't see a number of DMAR devices
enumerated, most likely VT-d is not on.

With legacy mode, only dedicated workqueues are available for use.


No IOMMU mode
-------------

No IOMMU mode is entered when using the kernel boot commandline::

  iommu=off

With no IOMMU mode, only dedicated workqueues are available for use.


Usage
=====

accel-config
------------

When loaded, the iaa_crypto driver automatically creates a default
configuration and enables it, and assigns default driver attributes.
If a different configuration or set of driver attributes is required,
the user must first disable the IAA devices and workqueues, reset the
configuration, and then re-register the deflate-iaa algorithm with the
crypto subsystem by removing and reinserting the iaa_crypto module.

The :ref:`iaa_disable_script` in the 'Use Cases'
section below can be used to disable the default configuration.

See :ref:`iaa_default_config` below for details of the default
configuration.

More likely than not, however, and because of the complexity and
configurability of the accelerator devices, the user will want to
configure the device and manually enable the desired devices and
workqueues.

The userspace tool to help do that is called accel-config.  Using
accel-config to configure the device or to load a previously saved
config is highly recommended.  The device can be controlled via sysfs
directly, but this comes with the warning that you should do so ONLY if
you know exactly what you are doing.  The following sections do not
cover the sysfs interface and assume you will be using accel-config.

The :ref:`iaa_sysfs_config` section in the appendix below can be
consulted for the sysfs interface details if interested.

The accel-config tool along with instructions for building it can be
found here:

  https://github.com/intel/idxd-config/#readme

Typical usage
-------------

In order for the iaa_crypto module to actually do any
compression/decompression work on behalf of a facility, one or more
IAA workqueues need to be bound to the iaa_crypto driver.

For instance, here's an example of configuring an IAA workqueue and
binding it to the iaa_crypto driver (note that device names are
specified as 'iax' rather than 'iaa' - this is because upstream still
has the old 'iax' device naming in place) ::

  # configure wq1.0

  accel-config config-wq --group-id=0 --mode=dedicated --type=kernel --priority=10 --name="iaa_crypto" --driver-name="crypto" iax1/wq1.0

  accel-config config-engine iax1/engine1.0 --group-id=0

  # enable IAA device iax1

  accel-config enable-device iax1

  # enable wq1.0 on IAX device iax1

  accel-config enable-wq iax1/wq1.0

Whenever a new workqueue is bound to or unbound from the iaa_crypto
driver, the available workqueues are 'rebalanced' such that work
submitted from a particular CPU is given to the most appropriate
workqueue available.  Current best practice is to configure and bind
at least one workqueue for each IAA device, but as long as there is at
least one workqueue configured and bound to any IAA device in the
system, the iaa_crypto driver will work, albeit most likely not as
efficiently.

The IAA crypto algorithm is operational, and compression and
decompression operations are fully enabled, following the successful
binding of the first IAA workqueue to the iaa_crypto driver.

Similarly, the IAA crypto algorithm is not operational, and compression
and decompression operations are disabled, following the unbinding of
the last IAA workqueue from the iaa_crypto driver.
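
Since the algorithm is only operational while at least one workqueue is
bound, a read-only check of the workqueue state can be handy. The
following is a sketch under stated assumptions: ``count_iaa_crypto_wqs``
is a hypothetical helper, and it treats a workqueue whose ``state`` file
reads ``enabled`` and whose ``driver_name`` file reads ``crypto`` as
bound to iaa_crypto. It uses the /sys/bus/dsa/devices/iaxN/wqN.M layout
that appears elsewhere in this document.

```shell
#!/bin/sh
# Sketch: count workqueues that are enabled and use the "crypto" driver name.
# count_iaa_crypto_wqs is a hypothetical helper; the optional argument is the
# devices directory (default /sys/bus/dsa/devices) so it can be tried on a
# scratch directory tree.
count_iaa_crypto_wqs() {
    root="${1:-/sys/bus/dsa/devices}"
    count=0
    for wq in "$root"/iax*/wq*; do
        # Skip anything that does not expose both attributes
        # (this also covers the case where the glob matched nothing).
        [ -r "$wq/state" ] && [ -r "$wq/driver_name" ] || continue
        state=$(cat "$wq/state")
        drv=$(cat "$wq/driver_name")
        if [ "$state" = "enabled" ] && [ "$drv" = "crypto" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}
```

A result of 0 suggests that no workqueue is currently available to the
iaa_crypto driver. The function only reads sysfs and changes nothing.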

As a result, the IAA crypto algorithms and thus the IAA hardware are
only available when one or more workqueues are bound to the iaa_crypto
driver.

When there are no IAA workqueues bound to the driver, the IAA crypto
algorithms can be unregistered by removing the module.


Driver attributes
-----------------

There are a couple of user-configurable driver attributes that can be
used to configure various modes of operation.  They're listed below,
along with their default values.  To set any of these attributes, echo
the appropriate values to the attribute file located under
/sys/bus/dsa/drivers/crypto/.

The attribute settings at the time the IAA algorithms are registered
are captured in each algorithm's crypto_ctx and used for all compresses
and decompresses when using that algorithm.

The available attributes are:

  - verify_compress

    Toggle compression verification.  If set, each compress will be
    internally decompressed and the contents verified, returning error
    codes if unsuccessful.  This can be toggled with 0/1::

      echo 0 > /sys/bus/dsa/drivers/crypto/verify_compress

    The default setting is '1' - verify all compresses.

  - sync_mode

    Select the mode to be used to wait for completion of each compress
    and decompress operation.

    The crypto async interface support implemented by iaa_crypto
    provides an implementation that satisfies the interface but does
    so in a synchronous manner - it fills and submits the IDXD
    descriptor and then loops around waiting for it to complete before
    returning.  This isn't a problem at the moment, since all existing
    callers (e.g. zswap) wrap any asynchronous callees in a
    synchronous wrapper anyway.

    The iaa_crypto driver does however provide true asynchronous
    support for callers that can make use of it.
    In this mode, it
    fills and submits the IDXD descriptor, then returns immediately
    with -EINPROGRESS.  The caller can then either poll for completion
    itself, which requires specific code in the caller which currently
    nothing in the upstream kernel implements, or go to sleep and wait
    for an interrupt signaling completion.  This latter mode is
    supported by current users in the kernel such as zswap via
    synchronous wrappers.  Although it is supported, this mode is
    significantly slower than the previously mentioned synchronous
    mode that does the polling in the iaa_crypto driver.

    This mode can be enabled by writing 'async_irq' to the sync_mode
    iaa_crypto driver attribute::

      echo async_irq > /sys/bus/dsa/drivers/crypto/sync_mode

    Async mode without interrupts (caller must poll) can be enabled by
    writing 'async' to it (please see Caveat)::

      echo async > /sys/bus/dsa/drivers/crypto/sync_mode

    The mode that does the polling in the iaa_crypto driver can be
    enabled by writing 'sync' to it::

      echo sync > /sys/bus/dsa/drivers/crypto/sync_mode

    The default mode is 'sync'.

    Caveat: since the only mechanism that iaa_crypto currently implements
    for async polling without interrupts is via the 'sync' mode as
    described earlier, writing 'async' to
    '/sys/bus/dsa/drivers/crypto/sync_mode' will internally enable the
    'sync' mode.  This is to ensure correct iaa_crypto behavior until true
    async polling without interrupts is enabled in iaa_crypto.

.. _iaa_default_config:

IAA Default Configuration
-------------------------

When the iaa_crypto driver is loaded, each IAA device has a single
work queue configured for it, with the following attributes::

  mode              "dedicated"
  threshold         0
  size              Total WQ Size from WQCAP
  priority          10
  type              IDXD_WQT_KERNEL
  group             0
  name              "iaa_crypto"
  driver_name       "crypto"

The devices and workqueues are also enabled and therefore the driver
is ready to be used without any additional configuration.

The default driver attributes in effect when the driver is loaded are::

  sync_mode         "sync"
  verify_compress   1

In order to change either the device/work queue or driver attributes,
the enabled devices and workqueues must first be disabled.  In order
to have the new configuration applied to the deflate-iaa crypto
algorithm, it needs to be re-registered by removing and reinserting
the iaa_crypto module.  The :ref:`iaa_disable_script` in the 'Use
Cases' section below can be used to disable the default configuration.

Statistics
==========

If the optional debugfs statistics support is enabled, the IAA crypto
driver will generate statistics which can be accessed in debugfs at::

  # ls -al /sys/kernel/debug/iaa-crypto/
  total 0
  drwxr-xr-x  2 root root 0 Mar  3 07:55 .
  drwx------ 53 root root 0 Mar  3 07:55 ..
  -rw-r--r--  1 root root 0 Mar  3 07:55 global_stats
  -rw-r--r--  1 root root 0 Mar  3 07:55 stats_reset
  -rw-r--r--  1 root root 0 Mar  3 07:55 wq_stats

The global_stats file shows a set of global statistics collected since
the driver has been loaded or reset::

  # cat global_stats
  global stats:
    total_comp_calls: 4300
    total_decomp_calls: 4164
    total_sw_decomp_calls: 0
    total_comp_bytes_out: 5993989
    total_decomp_bytes_in: 5993989
    total_completion_einval_errors: 0
    total_completion_timeout_errors: 0
    total_completion_comp_buf_overflow_errors: 136

The wq_stats file shows per-wq stats, a set for each iaa device and wq
in addition to some global stats::

  # cat wq_stats
  iaa device:
    id: 1
    n_wqs: 1
    comp_calls: 0
    comp_bytes: 0
    decomp_calls: 0
    decomp_bytes: 0
    wqs:
      name: iaa_crypto
      comp_calls: 0
      comp_bytes: 0
      decomp_calls: 0
      decomp_bytes: 0

  iaa device:
    id: 3
    n_wqs: 1
    comp_calls: 0
    comp_bytes: 0
    decomp_calls: 0
    decomp_bytes: 0
    wqs:
      name: iaa_crypto
      comp_calls: 0
      comp_bytes: 0
      decomp_calls: 0
      decomp_bytes: 0

  iaa device:
    id: 5
    n_wqs: 1
    comp_calls: 1360
    comp_bytes: 1999776
    decomp_calls: 0
    decomp_bytes: 0
    wqs:
      name: iaa_crypto
      comp_calls: 1360
      comp_bytes: 1999776
      decomp_calls: 0
      decomp_bytes: 0

  iaa device:
    id: 7
    n_wqs: 1
    comp_calls: 2940
    comp_bytes: 3994213
    decomp_calls: 4164
    decomp_bytes: 5993989
    wqs:
      name: iaa_crypto
      comp_calls: 2940
      comp_bytes: 3994213
      decomp_calls: 4164
      decomp_bytes: 5993989
    ...
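
The counters in global_stats can be combined into simple derived
figures. As a rough sketch, the helper below divides
total_comp_bytes_out by total_comp_calls to estimate the average
compressed size per compress call. ``avg_comp_bytes`` is a hypothetical
name, and reading the counters this way is an assumption based on their
names rather than a documented definition (for instance, whether failed
compresses are counted is not stated here).

```shell
#!/bin/sh
# Sketch: average compressed output bytes per compress call, from global_stats.
# avg_comp_bytes is a hypothetical helper; the optional argument defaults to
# the debugfs file described in this section.
avg_comp_bytes() {
    stats="${1:-/sys/kernel/debug/iaa-crypto/global_stats}"
    awk -F': *' '
        { gsub(/^[ \t]+/, "", $1) }                  # drop leading indentation
        $1 == "total_comp_calls"     { calls = $2 }
        $1 == "total_comp_bytes_out" { bytes = $2 }
        END {
            if (calls > 0) printf "%d\n", bytes / calls   # truncated to an integer
            else print "n/a"
        }
    ' "$stats"
}
```

For the sample global_stats output shown earlier (5993989 bytes over
4300 calls) this prints 1393, i.e. roughly 1.4k of output per compress
call.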

Writing to 'stats_reset' resets all the stats, including the
per-device and per-wq stats::

  # echo 1 > stats_reset
  # cat wq_stats
    global stats:
    total_comp_calls: 0
    total_decomp_calls: 0
    total_comp_bytes_out: 0
    total_decomp_bytes_in: 0
    total_completion_einval_errors: 0
    total_completion_timeout_errors: 0
    total_completion_comp_buf_overflow_errors: 0
    ...


Use cases
=========

Simple zswap test
-----------------

For this example, the kernel should be configured according to the
dedicated mode options described above, and zswap should be enabled as
well::

  CONFIG_ZSWAP=y

This is a simple test that uses the IAA 'deflate-iaa' algorithm as the
compressor for a swap (zswap) device.  It sets up the zswap device and
then uses the memory_madvise program listed below to forcibly swap out
and in a specified number of pages, demonstrating both compress and
decompress.

The zswap test expects the work queues for each IAA device on the
system to be configured properly as a kernel workqueue with a
workqueue driver_name of "crypto".

The first step is to make sure the iaa_crypto module is loaded::

  modprobe iaa_crypto

If the IAA devices and workqueues haven't previously been disabled and
reconfigured, then the default configuration should be in place and no
further IAA configuration is necessary.  See :ref:`iaa_default_config`
above for details of the default configuration.

If the default configuration is in place, you should see the iaa
devices and wq0s enabled::

  # cat /sys/bus/dsa/devices/iax1/state
  enabled
  # cat /sys/bus/dsa/devices/iax1/wq1.0/state
  enabled

To demonstrate that the following steps work as expected, these
commands can be used to enable debug output::

  # echo -n 'module iaa_crypto +p' > /sys/kernel/debug/dynamic_debug/control
  # echo -n 'module idxd +p' > /sys/kernel/debug/dynamic_debug/control

Use the following commands to enable zswap::

  # echo 0 > /sys/module/zswap/parameters/enabled
  # echo 50 > /sys/module/zswap/parameters/max_pool_percent
  # echo deflate-iaa > /sys/module/zswap/parameters/compressor
  # echo zsmalloc > /sys/module/zswap/parameters/zpool
  # echo 1 > /sys/module/zswap/parameters/enabled
  # echo 100 > /proc/sys/vm/swappiness
  # echo never > /sys/kernel/mm/transparent_hugepage/enabled
  # echo 1 > /proc/sys/vm/overcommit_memory

You can now run the zswap workload you want to measure.
For
example, using the memory_madvise code below, the following command
will swap in and out 100 pages::

  ./memory_madvise 100

  Allocating 100 pages to swap in/out
  Swapping out 100 pages
  Swapping in 100 pages
  Swapped out and in 100 pages

You should see something like the following in the dmesg output::

  [  404.202972] idxd 0000:e7:02.0: iaa_comp_acompress: dma_map_sg, src_addr 223925c000, nr_sgs 1, req->src 00000000ee7cb5e6, req->slen 4096, sg_dma_len(sg) 4096
  [  404.202973] idxd 0000:e7:02.0: iaa_comp_acompress: dma_map_sg, dst_addr 21dadf8000, nr_sgs 1, req->dst 000000008d6acea8, req->dlen 4096, sg_dma_len(sg) 8192
  [  404.202975] idxd 0000:e7:02.0: iaa_compress: desc->src1_addr 223925c000, desc->src1_size 4096, desc->dst_addr 21dadf8000, desc->max_dst_size 4096, desc->src2_addr 2203543000, desc->src2_size 1568
  [  404.202981] idxd 0000:e7:02.0: iaa_compress_verify: (verify) desc->src1_addr 21dadf8000, desc->src1_size 228, desc->dst_addr 223925c000, desc->max_dst_size 4096, desc->src2_addr 0, desc->src2_size 0
  ...

Now that basic functionality has been demonstrated, the defaults can
be erased and replaced with a different configuration.  To do that,
first disable zswap::

  # echo lzo > /sys/module/zswap/parameters/compressor
  # swapoff -a
  # echo 0 > /sys/module/zswap/parameters/accept_threshold_percent
  # echo 0 > /sys/module/zswap/parameters/max_pool_percent
  # echo 0 > /sys/module/zswap/parameters/enabled

Then run the :ref:`iaa_disable_script` in the 'Use Cases' section
below to disable the default configuration.

Finally, turn swap back on::

  # swapon -a

Following all that, the IAA device(s) can now be re-configured and
enabled as desired for further testing.  Below is one example.

The zswap test expects the work queues for each IAA device on the
system to be configured properly as a kernel workqueue with a
workqueue driver_name of "crypto".

The below script automatically does that::

  #!/bin/bash

  echo "IAA devices:"
  lspci -d:0cfe
  echo "# IAA devices:"
  lspci -d:0cfe | wc -l

  #
  # count iaa instances
  #
  iaa_dev_id="0cfe"
  num_iaa=$(lspci -d:${iaa_dev_id} | wc -l)
  echo "Found ${num_iaa} IAA instances"

  #
  # disable iaa wqs and devices
  #
  echo "Disable IAA"

  for ((i = 1; i < ${num_iaa} * 2; i += 2)); do
      echo disable wq iax${i}/wq${i}.0
      accel-config disable-wq iax${i}/wq${i}.0
      echo disable iaa iax${i}
      accel-config disable-device iax${i}
  done

  echo "End Disable IAA"

  echo "Reload iaa_crypto module"

  rmmod iaa_crypto
  modprobe iaa_crypto

  echo "End Reload iaa_crypto module"

  #
  # configure iaa wqs and devices
  #
  echo "Configure IAA"
  for ((i = 1; i < ${num_iaa} * 2; i += 2)); do
      accel-config config-wq --group-id=0 --mode=dedicated --wq-size=128 --priority=10 --type=kernel --name="iaa_crypto" --driver-name="crypto" iax${i}/wq${i}.0
      accel-config config-engine iax${i}/engine${i}.0 --group-id=0
  done

  echo "End Configure IAA"

  #
  # enable iaa wqs and devices
  #
  echo "Enable IAA"

  for ((i = 1; i < ${num_iaa} * 2; i += 2)); do
      echo enable iaa iax${i}
      accel-config enable-device iax${i}
      echo enable wq iax${i}/wq${i}.0
      accel-config enable-wq iax${i}/wq${i}.0
  done

  echo "End Enable IAA"

When the workqueues are bound to the iaa_crypto driver, you should
see something similar to the following in dmesg output if you've
enabled debug output (echo -n 'module iaa_crypto +p' >
/sys/kernel/debug/dynamic_debug/control)::

  [   60.752344] idxd 0000:f6:02.0: add_iaa_wq: added wq 000000004068d14d to iaa 00000000c9585ba2, n_wq 1
  [   60.752346] iaa_crypto: rebalance_wq_table: nr_nodes=2, nr_cpus 160, nr_iaa 8, cpus_per_iaa 20
  [   60.752347] iaa_crypto: rebalance_wq_table: iaa=0
  [   60.752349] idxd 0000:6a:02.0: request_iaa_wq: getting wq from iaa_device 0000000042d7bc52 (0)
  [   60.752350] idxd 0000:6a:02.0: request_iaa_wq: returning unused wq 00000000c8bb4452 (0) from iaa device 0000000042d7bc52 (0)
  [   60.752352] iaa_crypto: rebalance_wq_table: assigned wq for cpu=0, node=0 = wq 00000000c8bb4452
  [   60.752354] iaa_crypto: rebalance_wq_table: iaa=0
  [   60.752355] idxd 0000:6a:02.0: request_iaa_wq: getting wq from iaa_device 0000000042d7bc52 (0)
  [   60.752356] idxd 0000:6a:02.0: request_iaa_wq: returning unused wq 00000000c8bb4452 (0) from iaa device 0000000042d7bc52 (0)
  [   60.752358] iaa_crypto: rebalance_wq_table: assigned wq for cpu=1, node=0 = wq 00000000c8bb4452
  [   60.752359] iaa_crypto: rebalance_wq_table: iaa=0
  [   60.752360] idxd 0000:6a:02.0: request_iaa_wq: getting wq from iaa_device 0000000042d7bc52 (0)
  [   60.752361] idxd 0000:6a:02.0: request_iaa_wq: returning unused wq 00000000c8bb4452 (0) from iaa device 0000000042d7bc52 (0)
  [   60.752362] iaa_crypto: rebalance_wq_table: assigned wq for cpu=2, node=0 = wq 00000000c8bb4452
  [   60.752364] iaa_crypto: rebalance_wq_table: iaa=0
  .
  .
  .

Once the workqueues and devices have been enabled, the IAA crypto
algorithms are enabled and available.
When the IAA crypto algorithms
have been successfully enabled, you should see the following dmesg
output::

  [   64.893759] iaa_crypto: iaa_crypto_enable: iaa_crypto now ENABLED

Now run the following zswap-specific setup commands to have zswap use
the 'fixed' compression mode::

  echo 0 > /sys/module/zswap/parameters/enabled
  echo 50 > /sys/module/zswap/parameters/max_pool_percent
  echo deflate-iaa > /sys/module/zswap/parameters/compressor
  echo zsmalloc > /sys/module/zswap/parameters/zpool
  echo 1 > /sys/module/zswap/parameters/enabled

  echo 100 > /proc/sys/vm/swappiness
  echo never > /sys/kernel/mm/transparent_hugepage/enabled
  echo 1 > /proc/sys/vm/overcommit_memory

Finally, you can now run the zswap workload you want to measure.  For
example, using the code below, the following command will swap in and
out 100 pages::

  ./memory_madvise 100

  Allocating 100 pages to swap in/out
  Swapping out 100 pages
  Swapping in 100 pages
  Swapped out and in 100 pages

You should see something like the following in the dmesg output if
you've enabled debug output (echo -n 'module iaa_crypto +p' >
/sys/kernel/debug/dynamic_debug/control)::

  [  404.202972] idxd 0000:e7:02.0: iaa_comp_acompress: dma_map_sg, src_addr 223925c000, nr_sgs 1, req->src 00000000ee7cb5e6, req->slen 4096, sg_dma_len(sg) 4096
  [  404.202973] idxd 0000:e7:02.0: iaa_comp_acompress: dma_map_sg, dst_addr 21dadf8000, nr_sgs 1, req->dst 000000008d6acea8, req->dlen 4096, sg_dma_len(sg) 8192
  [  404.202975] idxd 0000:e7:02.0: iaa_compress: desc->src1_addr 223925c000, desc->src1_size 4096, desc->dst_addr 21dadf8000, desc->max_dst_size 4096, desc->src2_addr 2203543000, desc->src2_size 1568
  [  404.202981] idxd 0000:e7:02.0: iaa_compress_verify: (verify) desc->src1_addr 21dadf8000, desc->src1_size 228, desc->dst_addr 223925c000, desc->max_dst_size 4096, desc->src2_addr 0, desc->src2_size 0
  [  409.203227] idxd 0000:e7:02.0: iaa_comp_adecompress: dma_map_sg, src_addr 21ddd8b100, nr_sgs 1, req->src 0000000084adab64, req->slen 228, sg_dma_len(sg) 228
  [  409.203235] idxd 0000:e7:02.0: iaa_comp_adecompress: dma_map_sg, dst_addr 21ee3dc000, nr_sgs 1, req->dst 000000004e2990d0, req->dlen 4096, sg_dma_len(sg) 4096
  [  409.203239] idxd 0000:e7:02.0: iaa_decompress: desc->src1_addr 21ddd8b100, desc->src1_size 228, desc->dst_addr 21ee3dc000, desc->max_dst_size 4096, desc->src2_addr 0, desc->src2_size 0
  [  409.203254] idxd 0000:e7:02.0: iaa_comp_adecompress: dma_map_sg, src_addr 21ddd8b100, nr_sgs 1, req->src 0000000084adab64, req->slen 228, sg_dma_len(sg) 228
  [  409.203256] idxd 0000:e7:02.0: iaa_comp_adecompress: dma_map_sg, dst_addr 21f1551000, nr_sgs 1, req->dst 000000004e2990d0, req->dlen 4096, sg_dma_len(sg) 4096
  [  409.203257] idxd 0000:e7:02.0: iaa_decompress: desc->src1_addr 21ddd8b100, desc->src1_size 228, desc->dst_addr 21f1551000, desc->max_dst_size 4096, desc->src2_addr 0, desc->src2_size 0

In order to unregister the IAA crypto algorithms, and register new
ones using different parameters, any users of the current algorithm
should be stopped and the IAA workqueues and devices disabled.

In the case of zswap, remove the IAA crypto algorithm as the
compressor and turn off swap (to remove all references to
iaa_crypto)::

  echo lzo > /sys/module/zswap/parameters/compressor
  swapoff -a

  echo 0 > /sys/module/zswap/parameters/accept_threshold_percent
  echo 0 > /sys/module/zswap/parameters/max_pool_percent
  echo 0 > /sys/module/zswap/parameters/enabled

Once zswap is disabled and no longer using iaa_crypto, the IAA wqs and
devices can be disabled.

.. _iaa_disable_script:

IAA disable script
------------------

The below script automatically does that::

  #!/bin/bash

  echo "IAA devices:"
  lspci -d:0cfe
  echo "# IAA devices:"
  lspci -d:0cfe | wc -l

  #
  # count iaa instances
  #
  iaa_dev_id="0cfe"
  num_iaa=$(lspci -d:${iaa_dev_id} | wc -l)
  echo "Found ${num_iaa} IAA instances"

  #
  # disable iaa wqs and devices
  #
  echo "Disable IAA"

  for ((i = 1; i < ${num_iaa} * 2; i += 2)); do
      echo disable wq iax${i}/wq${i}.0
      accel-config disable-wq iax${i}/wq${i}.0
      echo disable iaa iax${i}
      accel-config disable-device iax${i}
  done

  echo "End Disable IAA"

Finally, at this point the iaa_crypto module can be removed, which
will unregister the current IAA crypto algorithms::

  rmmod iaa_crypto


memory_madvise.c (gcc -o memory_madvise memory_madvise.c)::

  #include <stdio.h>
  #include <stdlib.h>
  #include <stdint.h>
  #include <string.h>
  #include <unistd.h>
  #include <sys/mman.h>
  #include <linux/mman.h>

  #ifndef MADV_PAGEOUT
  #define MADV_PAGEOUT    21      /* force pages out immediately */
  #endif

  #define PG_SZ           4096

  int main(int argc, char **argv)
  {
          int i, nr_pages = 1;
          int64_t *dump_ptr;
          char *addr, *a;
          int loop = 1;

          if (argc > 1)
                  nr_pages = atoi(argv[1]);

          printf("Allocating %d pages to swap in/out\n", nr_pages);

          /* allocate pages */
          addr = mmap(NULL, nr_pages * PG_SZ, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0);
          if (addr == MAP_FAILED) {
                  perror("mmap");
                  return 1;
          }
          *addr = 1;

          /* initialize data in page to all '*' chars */
          memset(addr, '*', nr_pages * PG_SZ);

          printf("Swapping out %d pages\n", nr_pages);

          /* Tell kernel to swap it out */
          madvise(addr, nr_pages * PG_SZ, MADV_PAGEOUT);

          while (loop > 0) {
                  /* Wait for swap out to finish */
                  sleep(5);

                  a = addr;

                  printf("Swapping in %d pages\n", nr_pages);

                  /* Access the page ... this will swap it back in again */
                  for (i = 0; i < nr_pages; i++) {
                          if (a[0] != '*') {
                                  printf("Bad data from decompress!!!!!\n");

                                  dump_ptr = (int64_t *)a;
                                  for (int j = 0; j < 100; j++) {
                                          printf("  page %d data: %#llx\n", i, (unsigned long long)*dump_ptr);
                                          dump_ptr++;
                                  }
                          }

                          a += PG_SZ;
                  }

                  loop--;
          }

          printf("Swapped out and in %d pages\n", nr_pages);

          return 0;
  }

Appendix
========

.. _iaa_sysfs_config:

IAA sysfs config interface
--------------------------

Below is a description of the IAA sysfs interface, which as mentioned
in the main document, should only be used if you know exactly what you
are doing.  Even then, there's no compelling reason to use it directly
since accel-config can do everything the sysfs interface can and in
fact accel-config is based on it under the covers.

The 'IAA config path' is /sys/bus/dsa/devices and contains
subdirectories representing each IAA device, workqueue, engine, and
group.  Note that in the sysfs interface, the IAA devices are actually
named using iax e.g. iax1, iax3, etc. (Note that IAA devices are the
odd-numbered devices; the even-numbered devices are DSA devices and
can be ignored for IAA).

The 'IAA device bind path' is /sys/bus/dsa/drivers/idxd/bind and is
the file that is written to enable an IAA device.

The 'IAA workqueue bind path' is /sys/bus/dsa/drivers/crypto/bind and
is the file that is written to enable an IAA workqueue.

Similarly /sys/bus/dsa/drivers/idxd/unbind and
/sys/bus/dsa/drivers/crypto/unbind are used to disable IAA devices and
workqueues.

The basic sequence of commands needed to set up the IAA devices and
workqueues is:

For each device::

  1) Disable any workqueues enabled on the device.
     For example, to
     disable workqueues 0 and 1 on IAA device 3::

       # echo wq3.0 > /sys/bus/dsa/drivers/crypto/unbind
       # echo wq3.1 > /sys/bus/dsa/drivers/crypto/unbind

  2) Disable the device.  For example, to disable IAA device 3::

       # echo iax3 > /sys/bus/dsa/drivers/idxd/unbind

  3) Configure the desired workqueues.  For example, to configure
     workqueue 3 on IAA device 3::

       # echo dedicated > /sys/bus/dsa/devices/iax3/wq3.3/mode
       # echo 128 > /sys/bus/dsa/devices/iax3/wq3.3/size
       # echo 0 > /sys/bus/dsa/devices/iax3/wq3.3/group_id
       # echo 10 > /sys/bus/dsa/devices/iax3/wq3.3/priority
       # echo "kernel" > /sys/bus/dsa/devices/iax3/wq3.3/type
       # echo "iaa_crypto" > /sys/bus/dsa/devices/iax3/wq3.3/name
       # echo "crypto" > /sys/bus/dsa/devices/iax3/wq3.3/driver_name

  4) Enable the device.  For example, to enable IAA device 3::

       # echo iax3 > /sys/bus/dsa/drivers/idxd/bind

  5) Enable the desired workqueues on the device.  For example, to
     enable workqueues 0 and 1 on IAA device 3::

       # echo wq3.0 > /sys/bus/dsa/drivers/crypto/bind
       # echo wq3.1 > /sys/bus/dsa/drivers/crypto/bind
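
The five steps above can be sketched as a small dry-run helper that
prints the sysfs writes for one device and one workqueue instead of
performing them. This is an illustration only:
``print_iaa_sysfs_steps`` is a hypothetical name, and the attribute
values are simply the ones used in the step 3 example. Printing the
commands first fits the warning that the sysfs interface should be used
only with care.

```shell
#!/bin/sh
# Sketch: print (not run) the sysfs writes for one device and one workqueue,
# following the five steps above. print_iaa_sysfs_steps is a hypothetical
# helper; the attribute values mirror the example in step 3.
print_iaa_sysfs_steps() {
    dev="$1"   # device number, e.g. 3 for iax3
    wq="$2"    # workqueue number on that device, e.g. 0 for wq3.0
    devs=/sys/bus/dsa/devices
    drvs=/sys/bus/dsa/drivers
    wqdir="$devs/iax$dev/wq$dev.$wq"

    echo "echo wq$dev.$wq > $drvs/crypto/unbind"   # 1) disable the workqueue
    echo "echo iax$dev > $drvs/idxd/unbind"        # 2) disable the device
    # 3) configure the workqueue, one attribute=value pair per write
    for kv in mode=dedicated size=128 group_id=0 priority=10 \
              type=kernel name=iaa_crypto driver_name=crypto; do
        echo "echo ${kv#*=} > $wqdir/${kv%%=*}"
    done
    echo "echo iax$dev > $drvs/idxd/bind"          # 4) enable the device
    echo "echo wq$dev.$wq > $drvs/crypto/bind"     # 5) enable the workqueue
}
```

For example, ``print_iaa_sysfs_steps 3 0`` prints the eleven writes for
wq3.0 on iax3, which can be reviewed before being run as root.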