===================================
pSeries family boards (``pseries``)
===================================

The Power machine para-virtualized environment described by the Linux on Power
Architecture Reference ([LoPAR]_) document is called pSeries. This environment
is also known as sPAPR, System p guests, or simply Power Linux guests (although
it is capable of running other operating systems, such as AIX).

Even though pSeries is designed to behave as a guest environment, it is also
capable of acting as a hypervisor OS, providing, in that role, nested
virtualization capabilities.

Supported devices
=================

 * Multiprocessor support for many Power processor generations: POWER7,
   POWER7+, POWER8, POWER8NVL, POWER9, and Power10. Support for POWER5+ exists,
   but its state is unknown.
 * Interrupt controllers: XICS (POWER8) and XIVE (POWER9 and Power10).
 * vPHB PCIe host bridge.
 * vscsi and vnet devices, compatible with the same devices available on a
   PowerVM hypervisor with VIOS managing LPARs.
 * Virtio-based devices.
 * PCIe device pass-through.

Missing devices
===============

 * SPICE support.

Firmware
========

The pSeries platform in QEMU comes with two firmware implementations:

`SLOF <https://github.com/aik/SLOF>`_ (Slimline Open Firmware) is an
implementation of the `IEEE 1275-1994, Standard for Boot (Initialization
Configuration) Firmware: Core Requirements and Practices
<https://standards.ieee.org/standard/1275-1994.html>`_.

SLOF performs bus scanning and PCI resource allocation, and provides the client
interface to boot from block devices and the network.

QEMU includes a prebuilt image of SLOF which is updated when a more recent
version is required.

VOF (Virtual Open Firmware) is a minimalistic firmware to work with
``-machine pseries,x-vof=on``. When enabled, the firmware acts as a slim
shim and QEMU implements parts of the IEEE 1275 Open Firmware interface.

VOF does not have device drivers and does not do PCI resource allocation. It
relies on ``-kernel`` being used with a Linux kernel recent enough (v5.4+)
to do PCI resource assignment itself. It is ideal to use with petitboot.

Booting via ``-kernel`` supports the following:

+-------------------+-------------------+------------------+
| kernel            | pseries,x-vof=off | pseries,x-vof=on |
+===================+===================+==================+
| vmlinux BE        |     ✓             |     ✓            |
+-------------------+-------------------+------------------+
| vmlinux LE        |     ✓             |     ✓            |
+-------------------+-------------------+------------------+
| zImage.pseries BE |     ✓¹            |     ✓¹           |
+-------------------+-------------------+------------------+
| zImage.pseries LE |     ✓             |     ✓            |
+-------------------+-------------------+------------------+

¹ ``kernel-addr=0`` must be set.

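As an illustration of the ``-kernel`` combinations listed above, the following
sketch boots a big-endian ``zImage.pseries`` through VOF. The file name and the
``-nographic`` option are placeholders chosen for this example, and
``kernel-addr=0`` is assumed here to be passed as a ``-machine`` property
alongside ``x-vof=on``; it is only required for the case marked with ¹.

.. code-block:: bash

  # Boot a big-endian zImage.pseries with VOF (hypothetical file name).
  # kernel-addr=0 is only needed for the big-endian zImage.pseries case.
  qemu-system-ppc64 \
      -machine pseries,x-vof=on,kernel-addr=0 \
      -nographic \
      -kernel ./zImage.pseries
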
Build directions
================

.. code-block:: bash

  ./configure --target-list=ppc64-softmmu && make

Running instructions
====================

You can select the pSeries machine type by running QEMU with the following
options:

.. code-block:: bash

  qemu-system-ppc64 -M pseries <other QEMU arguments>

sPAPR devices
=============

The sPAPR specification defines a set of para-virtualized devices, which are
also supported by the pSeries machine in QEMU and can be instantiated with the
``-device`` option:

* ``spapr-vlan``: a virtual network interface.
* ``spapr-vscsi``: a virtual SCSI disk interface.
* ``spapr-rng``: a pseudo-device for passing random number generator data to the
  guest (see the `H_RANDOM hypercall feature
  <https://wiki.qemu.org/Features/HRandomHypercall>`_ for details).
* ``spapr-vty``: a virtual teletype.
* ``spapr-pci-host-bridge``: a PCI host bridge.
* ``tpm-spapr``: a Trusted Platform Module (TPM).
* ``spapr-tpm-proxy``: a TPM proxy.

These are compatible with the devices historically available for use when
running the IBM PowerVM hypervisor with LPARs.

However, since these devices were originally specified with another
hypervisor and non-Linux guests in mind, you should use the virtio counterparts
(virtio-net, virtio-blk/scsi and virtio-rng for instance) where possible,
since they will most probably give you better performance with Linux guests in a
QEMU environment.

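To make the difference concrete, the sketch below attaches one disk and one
network interface first through the sPAPR devices and then through their virtio
counterparts. The image name ``guest.qcow2``, the IDs (``hd0``, ``net0``,
``vscsi0``) and the user-mode network backend are assumptions made for this
example, not requirements of the pSeries machine.

.. code-block:: bash

  # sPAPR para-virtualized disk and network (PowerVM-compatible devices)
  qemu-system-ppc64 -M pseries -nographic \
      -drive file=guest.qcow2,if=none,id=hd0 \
      -device spapr-vscsi,id=vscsi0 \
      -device scsi-hd,drive=hd0,bus=vscsi0.0 \
      -netdev user,id=net0 \
      -device spapr-vlan,netdev=net0

  # The same guest using the virtio counterparts
  qemu-system-ppc64 -M pseries -nographic \
      -drive file=guest.qcow2,if=none,id=hd0 \
      -device virtio-blk-pci,drive=hd0 \
      -netdev user,id=net0 \
      -device virtio-net-pci,netdev=net0
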
The pSeries machine in QEMU is always instantiated with the following devices:

* An NVRAM device (``spapr-nvram``).
* A virtual teletype (``spapr-vty``).
* A PCI host bridge (``spapr-pci-host-bridge``).

Hence, there is no need to add them manually, unless you use the ``-nodefaults``
command line option in QEMU.

In the case of the default ``spapr-nvram`` device, if you want to make the
contents of the NVRAM device persistent, you need to specify a PFLASH
device when starting QEMU: either use
``-drive if=pflash,file=<filename>,format=raw`` to set the default PFLASH
device, or specify one with an ID
(``-drive if=none,file=<filename>,format=raw,id=pfid``) and pass that ID to the
NVRAM device with ``-global spapr-nvram.drive=pfid``.

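A minimal sketch of preparing a backing file for persistent NVRAM is shown
below. The helper name ``make_nvram_image``, the file name ``nvram.img`` and the
64 KiB size are assumptions made for this example; check the size expected by
your QEMU version before relying on it.

.. code-block:: bash

  # Create a zero-filled raw file to back the NVRAM device.
  # The 64 KiB size is an assumption for this example.
  make_nvram_image() {
      dd if=/dev/zero of="$1" bs=1024 count=64 2>/dev/null
  }

  # Usage (shown as comments, not executed here):
  #   make_nvram_image nvram.img
  #   qemu-system-ppc64 -M pseries \
  #       -drive if=none,file=nvram.img,format=raw,id=pfid \
  #       -global spapr-nvram.drive=pfid

Because the file is raw, its contents can be inspected or backed up with
ordinary tools between QEMU runs.
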
sPAPR specification
-------------------

The main source of documentation on the sPAPR standard is the [LoPAR]_ document.
However, documentation specific to QEMU's implementation of the specification
can also be found in the QEMU documentation:

.. toctree::
   :maxdepth: 1

   ../../specs/ppc-spapr-hotplug.rst
   ../../specs/ppc-spapr-hcalls.rst
   ../../specs/ppc-spapr-numa.rst
   ../../specs/ppc-spapr-uv-hcalls.rst
   ../../specs/ppc-spapr-xive.rst

Switching between the KVM-PR and KVM-HV kernel modules
======================================================

Currently, there are two implementations of KVM on Power, ``kvm_hv.ko`` and
``kvm_pr.ko``.

If a host supports both KVM modes, and both KVM kernel modules are loaded, it is
possible to switch between the two modes with the ``kvm-type`` parameter:

* Use ``qemu-system-ppc64 -M pseries,accel=kvm,kvm-type=PR`` to use the
  ``kvm_pr.ko`` kernel module.
* Use ``qemu-system-ppc64 -M pseries,accel=kvm,kvm-type=HV`` to use ``kvm_hv.ko``
  instead.

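Before choosing a ``kvm-type``, it can help to check which of the two modules
is actually loaded on the host. The helper below is a sketch that assumes both
implementations are built as loadable modules and therefore show up as
``kvm_hv`` and ``kvm_pr`` directories under ``/sys/module``. The function name
and its optional directory argument are hypothetical.

.. code-block:: bash

  # Print the kvm-type values whose kernel modules are currently loaded.
  # The optional argument overrides the sysfs module directory
  # (an assumption added here so the helper can be tried without KVM).
  available_kvm_types() {
      moddir="${1:-/sys/module}"
      [ -d "$moddir/kvm_hv" ] && echo HV
      [ -d "$moddir/kvm_pr" ] && echo PR
      return 0
  }

Each value it prints can be passed as ``kvm-type=<value>`` in the ``-M``
option shown in the list above.
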
KVM-PR
------

KVM-PR uses the so-called **PR**\ oblem state of the PPC CPUs to run the guests,
i.e. the virtual machine is run in user mode and all privileged instructions
trap and have to be emulated by the host. That means you can run KVM-PR inside
a pSeries guest (or a PowerVM LPAR for that matter), and that is where it
originated, as historically (prior to POWER7) it was not possible to run Linux
in hypervisor mode on a Power processor (this function was restricted to
PowerVM, the IBM proprietary hypervisor).

Because all privileged instructions are trapped, guests that use a lot of
privileged instructions run quite slowly with KVM-PR. On the other hand, because
of that, this kernel module can run on pretty much any PPC hardware, and is
able to emulate a lot of guest CPUs. This module can even be used to run other
PowerPC guests like an emulated PowerMac.

As KVM-PR can be run inside a pSeries guest, it can also provide nested
virtualization capabilities (i.e. running a guest from within a guest).

Note that, because KVM-HV provides much better execution performance,
maintenance work has been much more focused on it in the past years.
Maintenance for KVM-PR has been minimal.

In order to run KVM-PR guests with POWER9 processors, you need to start
QEMU with the ``kernel_irqchip=off`` command line option.

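As a sketch of the POWER9 case, the command below combines the KVM-PR selection
with ``kernel_irqchip=off``. Passing ``kernel_irqchip=off`` as a ``-machine``
property is an assumption made for this example; depending on the QEMU
version, the option may need to be given elsewhere on the command line.

.. code-block:: bash

  # KVM-PR guest on a POWER9 host, with the in-kernel irqchip disabled
  qemu-system-ppc64 \
      -machine pseries,accel=kvm,kvm-type=PR,kernel_irqchip=off \
      <other QEMU arguments>
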
KVM-HV
------

KVM-HV uses the hypervisor mode of more recent Power processors, which allows
access to the bare metal hardware directly. Although POWER7 had this capability,
it was only starting with POWER8 that this was officially supported by IBM.

Originally, KVM-HV was only available when running on a PowerNV platform (a.k.a.
Power bare metal). Although it runs on a PowerNV platform, it can only be used
to start pSeries guests. As the pSeries guest doesn't have access to the
hypervisor mode of the Power CPU, it wasn't possible to run KVM-HV on a guest.
This limitation has been lifted, and now it is possible to run KVM-HV inside
pSeries guests as well, making nested virtualization possible with KVM-HV.

As KVM-HV has access to privileged instructions, guests that use a lot of these
can run much faster than with KVM-PR. On the other hand, the guest CPU has to be
of the same type as the host CPU this way, e.g. it is not possible to specify an
embedded PPC CPU for the guest with KVM-HV. However, it is at least possible
to run the guest in a backward-compatibility mode of previous CPU
generations, e.g. you can run a POWER7 guest on a POWER8 host by using
``-cpu POWER8,compat=power7`` as a parameter to QEMU.

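Put together with the ``kvm-type`` selection described earlier, a
compatibility-mode invocation might look like the following sketch. It assumes
a POWER8 host with ``kvm_hv.ko`` loaded; the remaining arguments are left as a
placeholder.

.. code-block:: bash

  # On a POWER8 host: run the guest in POWER7 compatibility mode under KVM-HV
  qemu-system-ppc64 \
      -machine pseries,accel=kvm,kvm-type=HV \
      -cpu POWER8,compat=power7 \
      <other QEMU arguments>
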
Modules support
===============

As noted in the sections above, each module can run in a different
environment. The following table shows in which environments each module can
run. As long as you are in a supported environment, you can run KVM-PR or KVM-HV
nested. Combinations not shown in the table are not available.

+--------------+------------+------+-------------------+----------+--------+
| Platform     | Host type  | Bits | Page table format | KVM-HV   | KVM-PR |
+==============+============+======+===================+==========+========+
| PowerNV      | bare metal | 32   | hash              | no       | yes    |
|              |            |      +-------------------+----------+--------+
|              |            |      | radix             | N/A      | N/A    |
|              |            +------+-------------------+----------+--------+
|              |            | 64   | hash              | yes      | yes    |
|              |            |      +-------------------+----------+--------+
|              |            |      | radix             | yes      | no     |
+--------------+------------+------+-------------------+----------+--------+
| pSeries [1]_ | PowerNV    | 32   | hash              | no       | yes    |
|              |            |      +-------------------+----------+--------+
|              |            |      | radix             | N/A      | N/A    |
|              |            +------+-------------------+----------+--------+
|              |            | 64   | hash              | no       | yes    |
|              |            |      +-------------------+----------+--------+
|              |            |      | radix             | yes [2]_ | no     |
|              +------------+------+-------------------+----------+--------+
|              | PowerVM    | 32   | hash              | no       | yes    |
|              |            |      +-------------------+----------+--------+
|              |            |      | radix             | N/A      | N/A    |
|              |            +------+-------------------+----------+--------+
|              |            | 64   | hash              | no       | yes    |
|              |            |      +-------------------+----------+--------+
|              |            |      | radix [3]_        | no       | yes    |
+--------------+------------+------+-------------------+----------+--------+

.. [1] On POWER9 DD2.1 processors, the page table format on the host and guest
   must be the same.

.. [2] KVM-HV cannot run nested on POWER8 machines.

.. [3] Introduced on Power10 machines.

.. _power-papr-protected-execution-facility-pef:

POWER (PAPR) Protected Execution Facility (PEF)
-----------------------------------------------

Protected Execution Facility (PEF), also known as Secure Guest support,
is a feature found on IBM POWER9 and Power10 processors.

If a suitable firmware including an ultravisor is installed, it adds
an extra memory protection mode to the CPU.  The ultravisor manages a
pool of secure memory which cannot be accessed by the hypervisor.

When this feature is enabled in QEMU, a guest can use ultracalls to
enter "secure mode".  This transfers most of its memory to secure
memory, where it cannot be eavesdropped on by a compromised hypervisor.

Launching
^^^^^^^^^

To launch a guest which will be permitted to enter PEF secure mode::

  $ qemu-system-ppc64 \
      -object pef-guest,id=pef0 \
      -machine confidential-guest-support=pef0 \
      ...

Live Migration
^^^^^^^^^^^^^^

Live migration is not yet implemented for PEF guests.  For
consistency, QEMU currently prevents migration if the PEF feature is
enabled, whether or not the guest has actually entered secure mode.

Maintainer contact information
==============================

Cédric Le Goater <clg@kaod.org>

Daniel Henrique Barboza <danielhb413@gmail.com>

.. [LoPAR] `Linux on Power Architecture Reference document (LoPAR) revision
   2.9 <https://openpowerfoundation.org/wp-content/uploads/2020/07/LoPAR-20200812.pdf>`_.