qemu

FORK: QEMU emulator
git clone https://git.neptards.moe/neptards/qemu.git

vfio-ap.rst (35036B)


      1 Adjunct Processor (AP) Device
      2 =============================
      3 
      4 .. contents::
      5 
      6 Introduction
      7 ------------
      8 
      9 The IBM Adjunct Processor (AP) Cryptographic Facility is comprised
     10 of three AP instructions and from 1 to 256 PCIe cryptographic adapter cards.
     11 These AP devices provide cryptographic functions to all CPUs assigned to a
     12 linux system running in an IBM Z system LPAR.
     13 
     14 On s390x, AP adapter cards are exposed via the AP bus. This document
     15 describes how those cards may be made available to KVM guests using the
     16 VFIO mediated device framework.
     17 
     18 AP Architectural Overview
     19 -------------------------
     20 
     21 In order to understand the terminology used in the rest of this document, let's
     22 start with some definitions:
     23 
     24 * AP adapter
     25 
     26   An AP adapter is an IBM Z adapter card that can perform cryptographic
     27   functions. There can be from 0 to 256 adapters assigned to an LPAR depending
     28   on the machine model. Adapters assigned to the LPAR in which a linux host is
     29   running will be available to the linux host. Each adapter is identified by a
     30   number from 0 to 255; however, the maximum adapter number allowed is
     31   determined by machine model. When installed, an AP adapter is accessed by
     32   AP instructions executed by any CPU.
     33 
     34 * AP domain
     35 
     36   An adapter is partitioned into domains. Each domain can be thought of as
     37   a set of hardware registers for processing AP instructions. An adapter can
     38   hold up to 256 domains; however, the maximum domain number allowed is
     39   determined by machine model. Each domain is identified by a number from 0 to
     40   255. Domains can be further classified into two types:
     41 
     42     * Usage domains are domains that can be accessed directly to process AP
     43       commands
     44 
     45     * Control domains are domains that are accessed indirectly by AP
     46       commands sent to a usage domain to control or change the domain; for
     47       example, to set a secure private key for the domain.
     48 
     49 * AP Queue
     50 
     51   An AP queue is the means by which an AP command-request message is sent to an
     52   AP usage domain inside a specific AP. An AP queue is identified by a tuple
     53   comprised of an AP adapter ID (APID) and an AP queue index (APQI). The
     54   APQI corresponds to a given usage domain number within the adapter. This tuple
     55   forms an AP Queue Number (APQN) uniquely identifying an AP queue. AP
     56   instructions include a field containing the APQN to identify the AP queue to
     57   which the AP command-request message is to be sent for processing.
     58 
     59 * AP Instructions:
     60 
     61   There are three AP instructions:
     62 
     63   * NQAP: to enqueue an AP command-request message to a queue
     64   * DQAP: to dequeue an AP command-reply message from a queue
     65   * PQAP: to administer the queues
     66 
     67   AP instructions identify the domain that is targeted to process the AP
     68   command; this must be one of the usage domains. An AP command may modify a
     69   domain that is not one of the usage domains, but the modified domain
     70   must be one of the control domains.
     71 
     72 Start Interpretive Execution (SIE) Instruction
     73 ----------------------------------------------
     74 
     75 A KVM guest is started by executing the Start Interpretive Execution (SIE)
     76 instruction. The SIE state description is a control block that contains the
     77 state information for a KVM guest and is supplied as input to the SIE
     78 instruction. The SIE state description contains a satellite control block called
     79 the Crypto Control Block (CRYCB). The CRYCB contains three fields to identify
     80 the adapters, usage domains and control domains assigned to the KVM guest:
     81 
     82 * The AP Mask (APM) field is a bit mask that identifies the AP adapters assigned
     83   to the KVM guest. Each bit in the mask, from left to right, corresponds to
     84   an APID from 0-255. If a bit is set, the corresponding adapter is valid for
     85   use by the KVM guest.
     86 
     87 * The AP Queue Mask (AQM) field is a bit mask identifying the AP usage domains
     88   assigned to the KVM guest. Each bit in the mask, from left to right,
     89   corresponds to an AP queue index (APQI) from 0-255. If a bit is set, the
     90   corresponding queue is valid for use by the KVM guest.
     91 
     92 * The AP Domain Mask (ADM) field is a bit mask that identifies the AP control domains
     93   assigned to the KVM guest. The ADM bit mask controls which domains can be
     94   changed by an AP command-request message sent to a usage domain from the
     95   guest. Each bit in the mask, from left to right, corresponds to a domain from
     96   0-255. If a bit is set, the corresponding domain can be modified by an AP
     97   command-request message sent to a usage domain.
     98 
     99 Recall from the description of an AP queue that AP instructions include
    100 an APQN to identify the AP adapter and AP queue to which an AP command-request
    101 message is to be sent (NQAP and PQAP instructions), or from which a
    102 command-reply message is to be received (DQAP instruction). The validity of an
    103 APQN is defined by the matrix calculated from the APM and AQM; it is the
    104 cross product of all assigned adapter numbers (APM) with all assigned queue
    105 indexes (AQM). For example, if adapters 1 and 2 and usage domains 5 and 6 are
    106 assigned to a guest, the APQNs (1,5), (1,6), (2,5) and (2,6) will be valid for
    107 the guest.
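As a rough illustration of the cross product described above, the following Python sketch enumerates the APQNs that would be valid for a guest. The function name ``valid_apqns`` is made up for this example; the real check is performed by the hardware against the APM and AQM bit masks, not by code like this.

```python
from itertools import product


def valid_apqns(adapters, domains):
    """Return the APQNs (APID, APQI) formed by crossing adapters with usage domains."""
    return set(product(adapters, domains))


# Adapters 1 and 2 with usage domains 5 and 6, as in the text above.
for apid, apqi in sorted(valid_apqns({1, 2}, {5, 6})):
    print(f"({apid},{apqi})")
```

Note that a guest with adapters but no usage domains (or the reverse) ends up with no valid APQN at all, since the cross product is empty.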
    108 
    109 The APQNs can provide secure key functionality - i.e., a private key is stored
    110 on the adapter card for each of its domains - so each APQN must be assigned to
    111 at most one guest or the linux host.
    112 
    113 Example 1: Valid configuration
    114 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    115 
    116 +----------+--------+--------+
    117 |          | Guest1 | Guest2 |
    118 +==========+========+========+
    119 | adapters |  1, 2  |  1, 2  |
    120 +----------+--------+--------+
    121 | domains  |  5, 6  |  7     |
    122 +----------+--------+--------+
    123 
    124 This is valid because the two guests have no APQN in common:
    125 
    126 * Guest1 has APQNs (1,5), (1,6), (2,5) and (2,6);
    127 * Guest2 has APQNs (1,7) and (2,7).
    128 
    129 Example 2: Valid configuration
    130 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    131 
    132 +----------+--------+--------+
    133 |          | Guest1 | Guest2 |
    134 +==========+========+========+
    135 | adapters |  1, 2  |  3, 4  |
    136 +----------+--------+--------+
    137 | domains  |  5, 6  |  5, 6  |
    138 +----------+--------+--------+
    139 
    140 This is also valid because the two guests have no APQN in common:
    141
    142 * Guest1 has APQNs (1,5), (1,6), (2,5) and (2,6);
    143 * Guest2 has APQNs (3,5), (3,6), (4,5) and (4,6).
    144 
    145 Example 3: Invalid configuration
    146 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    147 
    148 +----------+--------+--------+
    149 |          | Guest1 | Guest2 |
    150 +==========+========+========+
    151 | adapters |  1, 2  |  1     |
    152 +----------+--------+--------+
    153 | domains  |  5, 6  |  6, 7  |
    154 +----------+--------+--------+
    155 
    156 This is an invalid configuration because both guests have access to
    157 APQN (1,6).
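The three examples can also be checked mechanically. The sketch below computes each guest's APQN set and intersects the two; an empty intersection corresponds to a valid configuration. The helper names ``apqns`` and ``shared_apqns`` are hypothetical and exist only for this illustration.

```python
from itertools import product


def apqns(adapters, domains):
    """All (APID, APQI) pairs a guest would get from these adapters and domains."""
    return set(product(adapters, domains))


def shared_apqns(guest_a, guest_b):
    """Return the APQNs that two guests, given as (adapters, domains), would share."""
    return apqns(*guest_a) & apqns(*guest_b)


examples = {
    "Example 1": (({1, 2}, {5, 6}), ({1, 2}, {7})),
    "Example 2": (({1, 2}, {5, 6}), ({3, 4}, {5, 6})),
    "Example 3": (({1, 2}, {5, 6}), ({1}, {6, 7})),
}
for name, (guest1, guest2) in examples.items():
    conflict = shared_apqns(guest1, guest2)
    print(name, "valid" if not conflict else f"invalid, shared: {sorted(conflict)}")
```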
    158 
    159 AP Matrix Configuration on Linux Host
    160 -------------------------------------
    161 
    162 A linux system is a guest of the LPAR in which it is running and has access to
    163 the AP resources configured for the LPAR. The LPAR's AP matrix is
    164 configured via its Activation Profile which can be edited on the HMC. When the
    165 linux system is started, the AP bus will detect the AP devices assigned to the
    166 LPAR and create the following in sysfs::
    167 
    168   /sys/bus/ap
    169   ... [devices]
    170   ...... xx.yyyy
    171   ...... ...
    172   ...... cardxx
    173   ...... ...
    174 
    175 Where:
    176 
    177 ``cardxx``
    178   is AP adapter number xx (in hex)
    179 
    180 ``xx.yyyy``
    181   is an APQN with xx specifying the APID and yyyy specifying the APQI
    182 
    183 For example, if AP adapters 5 and 6 and domains 4, 71 (0x47), 171 (0xab) and
    184 255 (0xff) are configured for the LPAR, the sysfs representation on the linux
    185 host system would look like this::
    186 
    187   /sys/bus/ap
    188   ... [devices]
    189   ...... 05.0004
    190   ...... 05.0047
    191   ...... 05.00ab
    192   ...... 05.00ff
    193   ...... 06.0004
    194   ...... 06.0047
    195   ...... 06.00ab
    196   ...... 06.00ff
    197   ...... card05
    198   ...... card06
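The naming scheme in the listing above can be reproduced with simple hex formatting. This is only a sketch of the convention as this document describes it (two hex digits for the APID, four for the APQI); ``card_name`` and ``queue_name`` are illustrative helpers, not part of any kernel or QEMU interface.

```python
def card_name(apid):
    """sysfs name of an AP adapter, e.g. 5 -> 'card05'."""
    return f"card{apid:02x}"


def queue_name(apid, apqi):
    """sysfs name of an AP queue, e.g. (5, 171) -> '05.00ab'."""
    return f"{apid:02x}.{apqi:04x}"


adapters = [5, 6]
domains = [4, 71, 171, 255]
names = [queue_name(a, d) for a in adapters for d in domains]
names += [card_name(a) for a in adapters]
print("\n".join(names))
```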
    199 
    200 A set of default device drivers is also created to control each type of AP
    201 device that can be assigned to the LPAR on which a linux host is running::
    202 
    203   /sys/bus/ap
    204   ... [drivers]
    205   ...... [cex2acard]        for Crypto Express 2/3 accelerator cards
    206   ...... [cex2aqueue]       for AP queues served by Crypto Express 2/3
    207                             accelerator cards
    208   ...... [cex4card]         for Crypto Express 4/5/6 accelerator and coprocessor
    209                             cards
    210   ...... [cex4queue]        for AP queues served by Crypto Express 4/5/6
    211                             accelerator and coprocessor cards
    212   ...... [pcixcccard]       for Crypto Express 2/3 coprocessor cards
    213   ...... [pcixccqueue]      for AP queues served by Crypto Express 2/3
    214                             coprocessor cards
    215 
    216 Binding AP devices to device drivers
    217 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    218 
    219 There are two sysfs files that specify bitmasks marking a subset of the APQN
    220 range as 'usable by the default AP queue device drivers' or 'not usable by the
    221 default device drivers' and thus available for use by the alternate device
    222 driver(s). The sysfs locations of the masks are::
    223 
    224    /sys/bus/ap/apmask
    225    /sys/bus/ap/aqmask
    226 
    227 The ``apmask`` is a 256-bit mask that identifies a set of AP adapter IDs
    228 (APID). Each bit in the mask, from left to right (i.e., from most significant
    229 to least significant bit in big endian order), corresponds to an APID from
    230 0-255. If a bit is set, the APID is marked as usable only by the default AP
    231 queue device drivers; otherwise, the APID is usable by the vfio_ap
    232 device driver.
    233 
    234 The ``aqmask`` is a 256-bit mask that identifies a set of AP queue indexes
    235 (APQI). Each bit in the mask, from left to right (i.e., from most significant
    236 to least significant bit in big endian order), corresponds to an APQI from
    237 0-255. If a bit is set, the APQI is marked as usable only by the default AP
    238 queue device drivers; otherwise, the APQI is usable by the vfio_ap device
    239 driver.
    240 
    241 Take, for example, the following mask::
    242 
    243       0x7dffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
    244 
    245 It indicates:
    246 
    247       IDs 1, 2, 3, 4, 5, and 7-255 (bits set) belong to the default drivers' pool,
    248       and IDs 0 and 6 (bits clear) belong to the vfio_ap device driver's pool.
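To make the left-to-right bit numbering concrete, here is a small Python sketch that decodes such a mask into the two pools. The function ``mask_bits`` is a hypothetical helper written for this explanation; it assumes a 256-bit mask given as a hex string and pads short strings with zeros on the right.

```python
def mask_bits(mask_hex, width=256):
    """Return the bit numbers set in a mask, counting bit 0 as the leftmost bit."""
    digits = mask_hex[2:] if mask_hex.lower().startswith("0x") else mask_hex
    digits = digits.ljust(width // 4, "0")  # pad on the right to the full width
    value = int(digits, 16)
    return [bit for bit in range(width) if value & (1 << (width - 1 - bit))]


mask = "0x7d" + "f" * 62  # the 256-bit example mask from the text
default_pool = mask_bits(mask)
vfio_pool = [bit for bit in range(256) if bit not in set(default_pool)]
print("vfio_ap pool:", vfio_pool)
```

The leading byte ``0x7d`` is ``01111101`` in binary, which is why only IDs 0 and 6 fall outside the default drivers' pool.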
    249 
    250 The APQN of each AP queue device assigned to the linux host is checked by the
    251 AP bus against the set of APQNs derived from the cross product of APIDs
    252 and APQIs marked as usable only by the default AP queue device drivers. If a
    253 match is detected, only the default AP queue device drivers will be probed;
    254 otherwise, the vfio_ap device driver will be probed.
    255 
    256 By default, the two masks are set to reserve all APQNs for use by the default
    257 AP queue device drivers. There are two ways the default masks can be changed:
    258 
    259  1. The sysfs mask files can be edited by echoing a string into the
    260     respective sysfs mask file in one of two formats:
    261 
    262     * An absolute hex string starting with 0x - like "0x12345678" - sets
    263       the mask. If the given string is shorter than the mask, it is padded
    264       with 0s on the right; for example, specifying a mask value of 0x41 is
    265       the same as specifying::
    266 
    267            0x4100000000000000000000000000000000000000000000000000000000000000
    268 
    269       Keep in mind that the mask reads from left to right (i.e., most
    270       significant to least significant bit in big endian order), so the mask
    271       above identifies device numbers 1 and 7 (``01000001``).
    272 
    273       If the string is longer than the mask, the operation is terminated with
    274       an error (EINVAL).
    275 
    276     * Individual bits in the mask can be switched on and off by specifying
    277       each bit number to be switched in a comma separated list. Each bit
    278       number string must be prepended with a plus (``+``) or minus (``-``) to indicate
    279       the corresponding bit is to be switched on (``+``) or off (``-``). Some
    280       valid values are::
    281 
    282            "+0"    switches bit 0 on
    283            "-13"   switches bit 13 off
    284            "+0x41" switches bit 65 on
    285            "-0xff" switches bit 255 off
    286 
    287       The following example::
    288 
    289               +0,-6,+0x47,-0xf0
    290 
    291       Switches bits 0 and 71 (0x47) on
    292       Switches bits 6 and 240 (0xf0) off
    293 
    294       Note that the bits not specified in the list remain as they were before
    295       the operation.
    296 
    297  2. The masks can also be changed at boot time via parameters on the kernel
    298     command line like this::
    299 
    300          ap.apmask=0xffff ap.aqmask=0x40
    301 
    302     This would create the following masks:
    303 
    304     apmask::
    305 
    306             0xffff000000000000000000000000000000000000000000000000000000000000
    307 
    308     aqmask::
    309 
    310             0x4000000000000000000000000000000000000000000000000000000000000000
    311 
    312     Resulting in these two pools::
    313 
    314             default drivers pool:    adapter 0-15, domain 1
    315             alternate drivers pool:  adapter 16-255, domains 0, 2-255
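The two input formats can be modelled in a few lines of Python. This is a sketch of the behaviour described above, not the kernel's parser: ``parse_absolute``, ``apply_relative`` and ``set_bits`` are hypothetical names, and bit numbers in the relative format are parsed with Python's ``int(text, 0)``, which is only an approximation of what the kernel accepts.

```python
WIDTH = 256  # both masks are 256 bits wide


def parse_absolute(text):
    """Parse an absolute '0x...' mask, padding short strings with 0s on the right."""
    digits = text[2:]
    if len(digits) > WIDTH // 4:
        raise ValueError("mask string is longer than the mask")  # EINVAL in sysfs
    return int(digits.ljust(WIDTH // 4, "0"), 16)


def apply_relative(mask, text):
    """Apply a '+n,-m,...' list to a mask; unlisted bits keep their value."""
    for item in text.split(","):
        sign, number = item[0], int(item[1:], 0)
        bit = 1 << (WIDTH - 1 - number)  # bit 0 is the leftmost bit
        if sign == "+":
            mask |= bit
        elif sign == "-":
            mask &= ~bit
        else:
            raise ValueError(f"missing +/- prefix in {item!r}")
    return mask


def set_bits(mask):
    """List the bit numbers that are set, counting from the left."""
    return [n for n in range(WIDTH) if mask & (1 << (WIDTH - 1 - n))]


print(set_bits(parse_absolute("0x41")))    # the padded example mask
print(set_bits(parse_absolute("0xffff")))  # ap.apmask=0xffff
print(set_bits(parse_absolute("0x40")))    # ap.aqmask=0x40
```

Because unlisted bits are preserved, the relative format is the safer choice when only a few adapters or domains need to be moved between pools.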
    316 
    317 Configuring an AP matrix for a linux guest
    318 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    319 
    320 The sysfs interfaces for configuring an AP matrix for a guest are built on the
    321 VFIO mediated device framework. To configure an AP matrix for a guest, a
    322 mediated matrix device must first be created for the ``/sys/devices/vfio_ap/matrix``
    323 device. When the vfio_ap device driver is loaded, it registers with the VFIO
    324 mediated device framework. When the driver registers, the sysfs interfaces for
    325 creating mediated matrix devices are created::
    326 
    327   /sys/devices
    328   ... [vfio_ap]
    329   ......[matrix]
    330   ......... [mdev_supported_types]
    331   ............ [vfio_ap-passthrough]
    332   ............... create
    333   ............... [devices]
    334 
    335 A mediated AP matrix device is created by writing a UUID to the attribute file
    336 named ``create``, for example::
    337 
    338    uuidgen > create
    339 
    340 or
    341 
    342 ::
    343 
    344    echo $uuid > create
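The same step can be scripted. The Python sketch below generates a UUID and writes it to the ``create`` attribute; the default path is assembled from the sysfs tree shown above, and ``create_matrix_mdev`` is a hypothetical helper name. It assumes it is run as root on a host where the vfio_ap driver is loaded; on any other system the write will simply fail.

```python
import uuid
from pathlib import Path

# Path assembled from the sysfs tree shown above.
CREATE_ATTR = Path(
    "/sys/devices/vfio_ap/matrix/mdev_supported_types/vfio_ap-passthrough/create"
)


def create_matrix_mdev(create_attr=CREATE_ATTR):
    """Write a freshly generated UUID to the create attribute and return it."""
    dev_uuid = str(uuid.uuid4())
    Path(create_attr).write_text(dev_uuid + "\n")
    return dev_uuid
```

Keeping the returned UUID is useful because it names the new device directory and is needed later for the ``sysfsdev`` path.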
    345 
    346 When a mediated AP matrix device is created, a sysfs directory named after
    347 the UUID is created in the ``devices`` subdirectory::
    348 
    349   /sys/devices
    350   ... [vfio_ap]
    351   ......[matrix]
    352   ......... [mdev_supported_types]
    353   ............ [vfio_ap-passthrough]
    354   ............... create
    355   ............... [devices]
    356   .................. [$uuid]
    357 
    358 There will also be three sets of attribute files created in the mediated
    359 matrix device's sysfs directory to configure an AP matrix for the
    360 KVM guest::
    361 
    362   /sys/devices
    363   ... [vfio_ap]
    364   ......[matrix]
    365   ......... [mdev_supported_types]
    366   ............ [vfio_ap-passthrough]
    367   ............... create
    368   ............... [devices]
    369   .................. [$uuid]
    370   ..................... assign_adapter
    371   ..................... assign_control_domain
    372   ..................... assign_domain
    373   ..................... matrix
    374   ..................... unassign_adapter
    375   ..................... unassign_control_domain
    376   ..................... unassign_domain
    377 
    378 ``assign_adapter``
    379    To assign an AP adapter to the mediated matrix device, its APID is written
    380    to the ``assign_adapter`` file. This may be done multiple times to assign more
    381    than one adapter. The APID may be specified using conventional semantics
    382    as a decimal, hexadecimal, or octal number. For example, to assign adapters
    383    4, 5 and 16 to a mediated matrix device in decimal, hexadecimal and octal
    384    respectively::
    385 
    386        echo 4 > assign_adapter
    387        echo 0x5 > assign_adapter
    388        echo 020 > assign_adapter
    389 
    390    In order to successfully assign an adapter:
    391 
    392    * The adapter number specified must represent a value from 0 up to the
    393      maximum adapter number allowed by the machine model. If an adapter number
    394      higher than the maximum is specified, the operation will terminate with
    395      an error (ENODEV).
    396 
    397    * All APQNs that can be derived from the adapter ID being assigned and the
    398      IDs of the previously assigned domains must be bound to the vfio_ap device
    399      driver. If no domains have yet been assigned, then there must be at least
    400      one APQN with the specified APID bound to the vfio_ap driver. If no such
    401      APQNs are bound to the driver, the operation will terminate with an
    402      error (EADDRNOTAVAIL).
    403 
    404    * No APQN that can be derived from the adapter ID and the IDs of the
    405      previously assigned domains can be assigned to another mediated matrix
    406      device. If an APQN is assigned to another mediated matrix device, the
    407      operation will terminate with an error (EADDRINUSE).
    408 
    409 ``unassign_adapter``
    410    To unassign an AP adapter, its APID is written to the ``unassign_adapter``
    411    file. This may also be done multiple times to unassign more than one adapter.
    412 
    413 ``assign_domain``
    414    To assign a usage domain, the domain number is written into the
    415    ``assign_domain`` file. This may be done multiple times to assign more than one
    416    usage domain. The domain number is specified using conventional semantics as
    417    a decimal, hexadecimal, or octal number. For example, to assign usage domains
    418    4, 8, and 71 to a mediated matrix device in decimal, hexadecimal and octal
    419    respectively::
    420 
    421       echo 4 > assign_domain
    422       echo 0x8 > assign_domain
    423       echo 0107 > assign_domain
    424 
    425    In order to successfully assign a domain:
    426 
    427    * The domain number specified must represent a value from 0 up to the
    428      maximum domain number allowed by the machine model. If a domain number
    429      higher than the maximum is specified, the operation will terminate with
    430      an error (ENODEV).
    431 
    432    * All APQNs that can be derived from the domain ID being assigned and the IDs
    433      of the previously assigned adapters must be bound to the vfio_ap device
    434      driver. If no adapters have yet been assigned, then there must be at least
    435      one APQN with the specified APQI bound to the vfio_ap driver. If no such
    436      APQNs are bound to the driver, the operation will terminate with an
    437      error (EADDRNOTAVAIL).
    438 
    439    * No APQN that can be derived from the domain ID being assigned and the IDs
    440      of the previously assigned adapters can be assigned to another mediated
    441      matrix device. If an APQN is assigned to another mediated matrix device,
    442      the operation will terminate with an error (EADDRINUSE).
    443 
    444 ``unassign_domain``
    445    To unassign a usage domain, the domain number is written into the
    446    ``unassign_domain`` file. This may be done multiple times to unassign more than
    447    one usage domain.
    448 
    449 ``assign_control_domain``
    450    To assign a control domain, the domain number is written into the
    451    ``assign_control_domain`` file. This may be done multiple times to
    452    assign more than one control domain. The domain number may be specified using
    453    conventional semantics as a decimal, hexadecimal, or octal number. For
    454    example, to assign control domains 4, 8, and 71 to a mediated matrix device
    455    in decimal, hexadecimal and octal respectively::
    456 
    457       echo 4 > assign_control_domain
    458       echo 0x8 > assign_control_domain
    459       echo 0107 > assign_control_domain
    460 
    461    In order to successfully assign a control domain, the domain number
    462    specified must represent a value from 0 up to the maximum domain number
    463    allowed by the machine model. If a control domain number higher than the
    464    maximum is specified, the operation will terminate with an error (ENODEV).
    465 
    466 ``unassign_control_domain``
    467    To unassign a control domain, the domain number is written into the
    468    ``unassign_control_domain`` file. This may be done multiple times to unassign more than
    469    one control domain.
    470 
    471 Note: No changes to the AP matrix will be allowed while a guest using
    472 the mediated matrix device is running. Attempts to assign an adapter,
    473 domain or control domain will be rejected and an error (EBUSY) returned.
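The number formats and the adapter assignment rules above can be summarised in a small model. This Python sketch is an interpretation of the rules as written in this document, not the vfio_ap driver's code; ``parse_id`` and ``check_assign_adapter`` are hypothetical names, and the sets of bound and already-assigned APQNs are supplied by the caller instead of being read from sysfs.

```python
import errno


def parse_id(text):
    """Parse a decimal, hexadecimal (0x...) or octal (leading 0) number."""
    text = text.strip()
    if text.lower().startswith("0x"):
        return int(text, 16)
    if len(text) > 1 and text.startswith("0"):
        return int(text, 8)
    return int(text, 10)


def check_assign_adapter(apid, max_apid, assigned_domains, bound_apqns, other_apqns):
    """Return 0 if the adapter could be assigned, else the errno described above.

    bound_apqns: APQNs bound to the vfio_ap driver.
    other_apqns: APQNs already assigned to other mediated matrix devices.
    """
    if not 0 <= apid <= max_apid:
        return errno.ENODEV
    wanted = {(apid, apqi) for apqi in assigned_domains}
    if wanted:
        # Every derived APQN must be bound to vfio_ap.
        if not wanted <= bound_apqns:
            return errno.EADDRNOTAVAIL
    elif not any(bound_apid == apid for bound_apid, _ in bound_apqns):
        # No domains assigned yet: at least one APQN with this APID must be bound.
        return errno.EADDRNOTAVAIL
    if wanted & other_apqns:
        return errno.EADDRINUSE
    return 0


bound = {(5, 4), (5, 71), (6, 4)}
print(check_assign_adapter(parse_id("0x5"), 255, {4, 71}, bound, set()))
```

The domain case is symmetric: swap the roles of adapters and domains and compare APQIs instead of APIDs.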
    474 
    475 Starting a Linux Guest Configured with an AP Matrix
    476 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    477 
    478 To provide a mediated matrix device for use by a guest, the following option
    479 must be specified on the QEMU command line::
    480 
    481    -device vfio-ap,sysfsdev=$path-to-mdev
    482 
    483 The sysfsdev parameter specifies the path to the mediated matrix device.
    484 There are a number of ways to specify this path::
    485 
    486   /sys/devices/vfio_ap/matrix/$uuid
    487   /sys/bus/mdev/devices/$uuid
    488   /sys/bus/mdev/drivers/vfio_mdev/$uuid
    489   /sys/devices/vfio_ap/matrix/mdev_supported_types/vfio_ap-passthrough/devices/$uuid
    490 
    491 When the linux guest is started, the guest will open the mediated
    492 matrix device's file descriptor to get information about the mediated matrix
    493 device. The ``vfio_ap`` device driver will update the APM, AQM, and ADM fields in
    494 the guest's CRYCB with the adapter, usage domain and control domains assigned
    495 via the mediated matrix device's sysfs attribute files. Programs running on the
    496 linux guest will then:
    497 
    498 1. Have direct access to the APQNs derived from the cross product of the AP
    499    adapter numbers (APID) and queue indexes (APQI) specified in the APM and AQM
    500    fields of the guest's CRYCB respectively. These APQNs identify the AP queues
    501    that are valid for use by the guest; meaning, AP commands can be sent by the
    502    guest to any of these queues for processing.
    503 
    504 2. Have authorization to process AP commands to change a control domain
    505    identified in the ADM field of the guest's CRYCB. The AP command must be sent
    506    to a valid APQN (see 1 above).
    507 
    508 CPU model features:
    509 
    510 Three CPU model features are available for controlling guest access to AP
    511 facilities:
    512 
    513 1. AP facilities feature
    514 
    515    The AP facilities feature indicates that AP facilities are installed on the
    516    guest. This feature will be exposed for use only if the AP facilities
    517    are installed on the host system. The feature is s390-specific and is
    518    represented as a parameter of the -cpu option on the QEMU command line::
    519 
    520       qemu-system-s390x -cpu $model,ap=on|off
    521 
    522    Where:
    523 
    524       ``$model``
    525         is the CPU model defined for the guest (defaults to the model of
    526         the host system if not specified).
    527 
    528       ``ap=on|off``
    529         indicates whether AP facilities are installed (on) or not
    530         (off). The default for CPU models zEC12 or newer
    531         is ``ap=on``. AP facilities must be installed on the guest if a
    532         vfio-ap device (``-device vfio-ap,sysfsdev=$path``) is configured
    533         for the guest, or the guest will fail to start.
    534 
    535 2. Query Configuration Information (QCI) facility
    536 
    537    The QCI facility is used by the AP bus running on the guest to query the
    538    configuration of the AP facilities. This facility will be available
    539    only if the QCI facility is installed on the host system. The feature is
    540    s390-specific and is represented as a parameter of the -cpu option on the
    541    QEMU command line::
    542 
    543       qemu-system-s390x -cpu $model,apqci=on|off
    544 
    545    Where:
    546 
    547       ``$model``
    548         is the CPU model defined for the guest
    549 
    550       ``apqci=on|off``
    551         indicates whether the QCI facility is installed (on) or
    552         not (off). The default for CPU models zEC12 or newer
    553         is ``apqci=on``; for older models, QCI will not be installed.
    554 
    555         If QCI is installed (``apqci=on``) but AP facilities are not
    556         (``ap=off``), an error message will be logged, but the guest
    557         will be allowed to start. It makes no sense to have QCI
    558         installed if the AP facilities are not; this is considered
    559         an invalid configuration.
    560 
    561         If the QCI facility is not installed, APQNs with an APQI
    562         greater than 15 will not be detected by the AP bus
    563         running on the guest.
    564 
    565 3. Adjunct Processor Facilities Test (APFT) facility
    566 
    567    The APFT facility is used by the AP bus running on the guest to test the
    568    AP facilities available for a given AP queue. This facility will be available
    569    only if the APFT facility is installed on the host system. The feature is
    570    s390-specific and is represented as a parameter of the -cpu option on the
    571    QEMU command line::
    572 
    573       qemu-system-s390x -cpu $model,apft=on|off
    574 
    575    Where:
    576 
    577       ``$model``
    578         is the CPU model defined for the guest (defaults to the model of
    579         the host system if not specified).
    580 
    581       ``apft=on|off``
    582         indicates whether the APFT facility is installed (on) or
    583         not (off). The default for CPU models zEC12 and
    584         newer is ``apft=on``; for older models, APFT will not be
    585         installed.
    586 
    587         If APFT is installed (``apft=on``) but AP facilities are not
    588         (``ap=off``), an error message will be logged, but the guest
    589         will be allowed to start. It makes no sense to have APFT
    590         installed if the AP facilities are not; this is considered
    591         an invalid configuration.
    592 
    593         It also makes no sense to turn APFT off because the AP bus
    594         running on the guest will not detect CEX4 and newer devices
    595         without it. Since only CEX4 and newer devices are supported
    596         for guest usage, no AP devices can be made accessible to a
    597         guest started without APFT installed.
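The interplay of the three features can be captured in a short helper that builds the ``-cpu`` value and reports the combinations this section calls invalid or pointless. This is a sketch only: ``cpu_option`` and its ``has_vfio_ap_device`` parameter are invented for illustration, the model name ``z14`` is an arbitrary choice, and QEMU performs its own checks when it starts.

```python
def cpu_option(model, ap=True, apqci=True, apft=True, has_vfio_ap_device=False):
    """Build the value for -cpu and collect the problems described above."""
    if has_vfio_ap_device and not ap:
        raise ValueError("a vfio-ap device requires ap=on; the guest would fail to start")
    problems = []
    if apqci and not ap:
        problems.append("apqci=on without ap=on is an invalid configuration")
    if apft and not ap:
        problems.append("apft=on without ap=on is an invalid configuration")
    if ap and not apft:
        problems.append("apft=off: the guest AP bus will not detect CEX4 and newer devices")
    flags = {"ap": ap, "apqci": apqci, "apft": apft}
    value = ",".join([model] + [f"{k}={'on' if v else 'off'}" for k, v in flags.items()])
    return value, problems


value, problems = cpu_option("z14", has_vfio_ap_device=True)
print("qemu-system-s390x -cpu", value)
for problem in problems:
    print("warning:", problem)
```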
    598 
    599 Hot plug a vfio-ap device into a running guest
    600 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    601 
    602 Only one vfio-ap device can be attached to the virtual machine's ap-bus, so a
    603 vfio-ap device can be hot plugged if and only if no vfio-ap device is attached
    604 to the bus already, whether via the QEMU command line or a prior hot plug
    605 action.
    606 
    607 To hot plug a vfio-ap device, use the QEMU ``device_add`` command::
    608 
    609     (qemu) device_add vfio-ap,sysfsdev="$path-to-mdev",id="$id"
    610 
    611 Where the ``$path-to-mdev`` value specifies the absolute path to a mediated
    612 device to which AP resources to be used by the guest have been assigned.
    613 ``$id`` is the name value for the optional id parameter.
    614 
    615 Note that on Linux guests, the AP devices will be created in the
    616 ``/sys/bus/ap/devices`` directory when the AP bus subsequently performs its periodic
    617 scan, so there may be a short delay before the AP devices are accessible on the
    618 guest.
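Because of this delay, a script running inside the guest may want to wait for the queues to appear before using them. The following Python sketch polls the devices directory until the expected names show up or a timeout expires. ``wait_for_ap_devices`` is a hypothetical helper, and the timeout and polling interval are arbitrary values, not figures taken from the AP bus implementation.

```python
import os
import time


def wait_for_ap_devices(expected, devices_dir="/sys/bus/ap/devices",
                        timeout=60.0, interval=1.0):
    """Return True once all expected names exist in devices_dir, False on timeout."""
    expected = set(expected)
    deadline = time.monotonic() + timeout
    while True:
        try:
            present = set(os.listdir(devices_dir))
        except FileNotFoundError:
            present = set()  # AP bus not available (yet)
        if expected <= present:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

For example, ``wait_for_ap_devices({"card05", "05.0004"})`` would block until adapter 5 and its queue for domain 4 are visible, or return ``False`` after a minute.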
    619 
    620 The command will fail if:
    621 
    622 * A vfio-ap device has already been attached to the virtual machine's ap-bus.
    623 
    624 * The CPU model features for controlling guest access to AP facilities are not
    625   enabled (see 'CPU model features' subsection in the previous section).
    626 
    627 Hot unplug a vfio-ap device from a running guest
    628 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    629 
    630 A vfio-ap device can be unplugged from a running KVM guest if a vfio-ap device
    631 has been attached to the virtual machine's ap-bus via the QEMU command line
    632 or a prior hot plug action.
    633 
    634 To hot unplug a vfio-ap device, use the QEMU ``device_del`` command::
    635 
    636     (qemu) device_del "$id"
    637 
    638 Where ``$id`` is the same id that was specified at device creation.
    639 
    640 On a Linux guest, the AP devices will be removed from the ``/sys/bus/ap/devices``
    641 directory on the guest when the AP bus subsequently performs its periodic scan,
    642 so there may be a short delay before the AP devices are no longer accessible by
    643 the guest.
    644 
    645 The command will fail if the ``$id`` specified on the ``device_del`` command
    646 does not match the id specified when the vfio-ap device was attached to
    647 the virtual machine's ap-bus.
    648 
Example: Configure AP Matrices for Three Linux Guests
-----------------------------------------------------

Let's now provide an example to illustrate how KVM guests may be given
access to AP facilities. For this example, we will show how to configure
three guests such that the output of the ``lszcrypt`` command on each guest
looks like this:

Guest1::

  CARD.DOMAIN TYPE  MODE
  ------------------------------
  05          CEX5C CCA-Coproc
  05.0004     CEX5C CCA-Coproc
  05.00ab     CEX5C CCA-Coproc
  06          CEX5A Accelerator
  06.0004     CEX5A Accelerator
  06.00ab     CEX5A Accelerator

Guest2::

  CARD.DOMAIN TYPE  MODE
  ------------------------------
  05          CEX5C CCA-Coproc
  05.0047     CEX5C CCA-Coproc
  05.00ff     CEX5C CCA-Coproc

Guest3::

  CARD.DOMAIN TYPE  MODE
  ------------------------------
  06          CEX5A Accelerator
  06.0047     CEX5A Accelerator
  06.00ff     CEX5A Accelerator

These are the steps:

1. Install the vfio_ap module on the Linux host. The dependency chain for the
   vfio_ap module is:

   * iommu
   * s390
   * zcrypt
   * vfio
   * vfio_mdev
   * vfio_mdev_device
   * KVM

   To build the vfio_ap module, the kernel build must be configured with the
   following Kconfig elements selected:

   * IOMMU_SUPPORT
   * S390
   * ZCRYPT
   * S390_AP_IOMMU
   * VFIO
   * VFIO_MDEV
   * VFIO_MDEV_DEVICE
   * KVM

   If using ``make menuconfig``, select the following to build the vfio_ap
   module::

     -> Device Drivers
        -> IOMMU Hardware Support
           select S390 AP IOMMU Support
        -> VFIO Non-Privileged userspace driver framework
           -> Mediated device driver framework
              -> VFIO driver for Mediated devices
     -> I/O subsystem
        -> VFIO support for AP devices

2. Secure the AP queues to be used by the three guests so that the host cannot
   access them. To secure the AP queues 05.0004, 05.0047, 05.00ab, 05.00ff,
   06.0004, 06.0047, 06.00ab, and 06.00ff for use by the vfio_ap device driver,
   the corresponding APQNs must be removed from the default queue drivers pool
   as follows::

      echo -5,-6 > /sys/bus/ap/apmask

      echo -4,-0x47,-0xab,-0xff > /sys/bus/ap/aqmask

   This will result in AP queues 05.0004, 05.0047, 05.00ab, 05.00ff, 06.0004,
   06.0047, 06.00ab, and 06.00ff getting bound to the vfio_ap device driver. The
   sysfs directory for the vfio_ap device driver will now contain symbolic links
   to the AP queue devices bound to it::

     /sys/bus/ap
     ... [drivers]
     ...... [vfio_ap]
     ......... [05.0004]
     ......... [05.0047]
     ......... [05.00ab]
     ......... [05.00ff]
     ......... [06.0004]
     ......... [06.0047]
     ......... [06.00ab]
     ......... [06.00ff]

   Keep in mind that only type 10 and newer adapters (i.e., CEX4 and later)
   can be bound to the vfio_ap device driver. Older devices are not supported:
   they will go out of service in the relatively near future, there are few
   older systems on which to test them, and supporting them would needlessly
   complicate the design.

   The administrator, therefore, must take care to secure only AP queues that
   can be bound to the vfio_ap device driver. The device type for a given AP
   queue device can be read from the parent card's sysfs directory. For example,
   to see the hardware type of the queue 05.0004::

     cat /sys/bus/ap/devices/card05/hwtype

   The hwtype must be 10 or higher (CEX4 or newer) in order to be bound to the
   vfio_ap device driver.

3. Create the mediated devices needed to configure the AP matrices for the
   three guests and to provide an interface to the vfio_ap driver for
   use by the guests::

     /sys/devices/vfio_ap/matrix/
     ... [mdev_supported_types]
     ...... [vfio_ap-passthrough] (passthrough mediated matrix device type)
     ......... create
     ......... [devices]

   To create the mediated devices for the three guests, write a UUID to the
   ``create`` file from within the ``vfio_ap-passthrough`` directory::

       uuidgen > create
       uuidgen > create
       uuidgen > create

   or

   ::

       echo $uuid1 > create
       echo $uuid2 > create
       echo $uuid3 > create

   This will create three mediated devices in the [devices] subdirectory, each
   named after the UUID used to create it. We'll call them $uuid1, $uuid2 and
   $uuid3. This is the sysfs directory structure after creation::

     /sys/devices/vfio_ap/matrix/
     ... [mdev_supported_types]
     ...... [vfio_ap-passthrough]
     ......... [devices]
     ............ [$uuid1]
     ............... assign_adapter
     ............... assign_control_domain
     ............... assign_domain
     ............... matrix
     ............... unassign_adapter
     ............... unassign_control_domain
     ............... unassign_domain

     ............ [$uuid2]
     ............... assign_adapter
     ............... assign_control_domain
     ............... assign_domain
     ............... matrix
     ............... unassign_adapter
     ............... unassign_control_domain
     ............... unassign_domain

     ............ [$uuid3]
     ............... assign_adapter
     ............... assign_control_domain
     ............... assign_domain
     ............... matrix
     ............... unassign_adapter
     ............... unassign_control_domain
     ............... unassign_domain

4. The administrator now needs to configure the matrices for the mediated
   devices $uuid1 (for Guest1), $uuid2 (for Guest2) and $uuid3 (for Guest3).

   This is how the matrix is configured for Guest1 (from within the ``$uuid1``
   directory)::

      echo 5 > assign_adapter
      echo 6 > assign_adapter
      echo 4 > assign_domain
      echo 0xab > assign_domain

   Control domains can similarly be assigned using the assign_control_domain
   sysfs file.

   If a mistake is made configuring an adapter, domain or control domain,
   you can use the ``unassign_xxx`` interfaces to unassign the adapter, domain or
   control domain.

   To display the matrix configuration for Guest1::

         cat matrix

   The output will display the APQNs in the format ``xx.yyyy``, where xx is
   the adapter number and yyyy is the domain number. The output for Guest1
   will look like this::

         05.0004
         05.00ab
         06.0004
         06.00ab

   This is how the matrix is configured for Guest2 (from within the ``$uuid2``
   directory)::

      echo 5 > assign_adapter
      echo 0x47 > assign_domain
      echo 0xff > assign_domain

   This is how the matrix is configured for Guest3 (from within the ``$uuid3``
   directory)::

      echo 6 > assign_adapter
      echo 0x47 > assign_domain
      echo 0xff > assign_domain

5. Start Guest1::

     /usr/bin/qemu-system-s390x ... -cpu host,ap=on,apqci=on,apft=on -device vfio-ap,sysfsdev=/sys/devices/vfio_ap/matrix/$uuid1 ...

6. Start Guest2::

     /usr/bin/qemu-system-s390x ... -cpu host,ap=on,apqci=on,apft=on -device vfio-ap,sysfsdev=/sys/devices/vfio_ap/matrix/$uuid2 ...

7. Start Guest3::

     /usr/bin/qemu-system-s390x ... -cpu host,ap=on,apqci=on,apft=on -device vfio-ap,sysfsdev=/sys/devices/vfio_ap/matrix/$uuid3 ...

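Steps 2 through 4 lend themselves to scripting. The sketch below is
illustrative only: the function names ``queue_bindable``, ``apqns`` and
``configure_matrix`` are hypothetical, and the optional directory arguments
exist only so that the helpers can be tried against a scratch directory
instead of the real sysfs tree. It assumes that ``hwtype`` holds a decimal
number, as the comparison against 10 in step 2 suggests::

   # Succeed if the card owning the queue $1 (e.g. 05.0004) has hwtype >= 10.
   # $2: AP devices directory (default /sys/bus/ap/devices).
   # Function bodies run in subshells, so their variables stay local.
   queue_bindable() (
       devdir=${2:-/sys/bus/ap/devices}
       card=${1%%.*}                     # "05.0004" -> "05"
       hwtype=$(cat "$devdir/card$card/hwtype") || exit 2
       [ "$hwtype" -ge 10 ]
   )

   # Print every APQN formed by the adapters in $1 and the domains in $2,
   # in the xx.yyyy format reported by the matrix file.
   apqns() (
       for a in $1; do
           for d in $2; do
               printf '%02x.%04x\n' "$a" "$d"
           done
       done
   )

   # Assign the adapters in $2 and the domains in $3 to the mediated device
   # whose sysfs directory is $1, one write per value as in step 4.
   configure_matrix() (
       mdev=$1
       for a in $2; do
           echo "$a" > "$mdev/assign_adapter" || exit 1
       done
       for d in $3; do
           echo "$d" > "$mdev/assign_domain" || exit 1
       done
   )

For Guest1, ``apqns "5 6" "4 0xab"`` prints the four APQNs shown in the
``cat matrix`` output of step 4, and
``configure_matrix /sys/devices/vfio_ap/matrix/$uuid1 "5 6" "4 0xab"``
performs the four ``echo`` commands of that step.
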
When a guest is shut down, its mediated matrix device may be removed.

Using our example again, to remove the mediated matrix device $uuid1::

   /sys/devices/vfio_ap/matrix/
   ... [mdev_supported_types]
   ...... [vfio_ap-passthrough]
   ......... [devices]
   ............ [$uuid1]
   ............... remove

   echo 1 > remove

This will remove all of the mdev matrix device's sysfs structures including
the mdev device itself. To recreate and reconfigure the mdev matrix device,
all of the steps starting with step 3 will have to be performed again. Note
that the remove will fail if a guest using the mdev is still running.

It is not necessary to remove an mdev matrix device, but one may want to
remove it if no guest will use it during the remaining lifetime of the Linux
host. If the mdev matrix device is removed, one may want to also reconfigure
the pool of adapters and queues reserved for use by the default drivers.

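Reconfiguring the default drivers pool is the inverse of step 2. As a hedged
sketch: ``mask_terms`` and ``restore_default_pool`` are hypothetical helper
names, and the code assumes that ``apmask`` and ``aqmask`` accept ``+`` terms
in the same relative format as the ``-`` terms used in step 2. It should only
be run once no guest is using a mediated device to which the affected queues
are assigned::

   # Join IDs into a relative mask string, e.g. mask_terms - 5 6 gives "-5,-6".
   # Function bodies run in subshells, so their variables stay local.
   mask_terms() (
       sign=$1
       shift
       out=
       for id in "$@"; do
           out="${out:+$out,}$sign$id"
       done
       printf '%s\n' "$out"
   )

   # Return the adapters in $1 and the domains in $2 to the default drivers.
   # $3: AP bus directory (default /sys/bus/ap), overridable for dry runs.
   restore_default_pool() (
       apbus=${3:-/sys/bus/ap}
       # $1 and $2 are left unquoted on purpose so they split into words.
       mask_terms + $1 > "$apbus/apmask" || exit 1
       mask_terms + $2 > "$apbus/aqmask"
   )

For the example above, ``restore_default_pool "5 6" "4 0x47 0xab 0xff"`` would
undo the two ``echo`` commands of step 2.
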
Limitations
-----------

* The KVM/kernel interfaces do not provide a way to prevent an APQN from being
  restored to the default drivers pool while its queue is still assigned to a
  mediated device in use by a guest. It is incumbent upon the administrator to
  ensure that no mediated device to which the APQN is assigned is in use by a
  guest, lest the host be given access to the private data of the AP queue
  device, such as a private key configured specifically for the guest.

* Dynamically assigning AP resources to or unassigning AP resources from a
  mediated matrix device - see `Configuring an AP matrix for a linux guest`_
  section above - while a running guest is using it is currently not supported.

* Live guest migration is not supported for guests using AP devices. If a guest
  is using AP devices, the vfio-ap device configured for the guest must be
  unplugged before migrating the guest (see `Hot unplug a vfio-ap device from a
  running guest`_ section above).
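
The first limitation can be partly mitigated by checking, before an APQN is
returned to the default drivers pool, whether any mediated device still lists
it. The following is a minimal sketch, assuming the sysfs layout shown in
step 3 of the example; ``mdevs_using_apqn`` is a hypothetical helper name.
Note that it only reports assignment: it cannot tell whether a guest is
currently running with that mediated device::

   # Print the name (UUID) of every mediated device whose matrix lists APQN $1.
   # $2: directory holding the mediated devices; the default follows the
   #     sysfs tree shown in step 3 of the example.
   # The body runs in a subshell, so its variables stay local.
   mdevs_using_apqn() (
       apqn=$1
       devices=${2:-/sys/devices/vfio_ap/matrix/mdev_supported_types/vfio_ap-passthrough/devices}
       for m in "$devices"/*/matrix; do
           [ -r "$m" ] || continue
           # -F: treat the APQN as a fixed string; -x: match whole lines only.
           if grep -qxF "$apqn" "$m"; then
               basename "$(dirname "$m")"
           fi
       done
   )

An empty result for ``mdevs_using_apqn 05.0004`` means that no mediated device
has that queue assigned.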