qemu

FORK: QEMU emulator
git clone https://git.neptards.moe/neptards/qemu.git

ppc-xive.rst (8967B)


      1 ================================
      2 POWER9 XIVE interrupt controller
      3 ================================
      4 
      5 The POWER9 processor comes with a new interrupt controller
      6 architecture, called XIVE, which stands for "eXternal Interrupt
      7 Virtualization Engine".
      8 
      9 Compared to the previous architecture, the main characteristics of
     10 XIVE are to support a larger number of interrupt sources and to
     11 deliver interrupts directly to virtual processors without hypervisor
     12 assistance. This removes the context switches required for the
     13 delivery process.
     14 
     15 
     16 XIVE architecture
     17 =================
     18 
     19 The XIVE IC is composed of three sub-engines, each taking care of a
     20 processing layer of external interrupts:
     21 
     22 - Interrupt Virtualization Source Engine (IVSE), or Source Controller
     23   (SC). These are found in PCI PHBs, in the Processor Service
     24   Interface (PSI) host bridge Controller, but also inside the main
     25   controller for the core IPIs and other sub-chips (NX, CAP, NPU) of
     26   the chip/processor. They are configured to feed the IVRE with
     27   events.
     28 - Interrupt Virtualization Routing Engine (IVRE) or Virtualization
     29   Controller (VC). It handles event coalescing and performs interrupt
     30   routing by matching an event source number with an Event
     31   Notification Descriptor (END).
     32 - Interrupt Virtualization Presentation Engine (IVPE) or Presentation
     33   Controller (PC). It maintains the interrupt context state of each
     34   thread and handles the delivery of the external interrupt to the
     35   thread.
     36 
     37 ::
     38 
     39                 XIVE Interrupt Controller
     40                 +------------------------------------+      IPIs
     41                 | +---------+ +---------+ +--------+ |    +-------+
     42                 | |IVRE     | |Common Q | |IVPE    |----> | CORES |
     43                 | |     esb | |         | |        |----> |       |
     44                 | |     eas | |  Bridge | |   tctx |----> |       |
     45                 | |SC   end | |         | |    nvt | |    |       |
     46     +------+    | +---------+ +----+----+ +--------+ |    +-+-+-+-+
     47     | RAM  |    +------------------|-----------------+      | | |
     48     |      |                       |                        | | |
     49     |      |                       |                        | | |
     50     |      |  +--------------------v------------------------v-v-v--+    other
     51     |      <--+                     Power Bus                      +--> chips
     52     |  esb |  +---------+-----------------------+------------------+
     53     |  eas |            |                       |
     54     |  end |         +--|------+                |
     55     |  nvt |       +----+----+ |           +----+----+
     56     +------+       |IVSE     | |           |IVSE     |
     57                    |         | |           |         |
     58                    | PQ-bits | |           | PQ-bits |
     59                    | local   |-+           |  in VC  |
     60                    +---------+             +---------+
     61                       PCIe                 NX,NPU,CAPI
     62 
     63 
     64     PQ-bits: 2-bit source state machine (P:pending Q:queued)
     65     esb: Event State Buffer (Array of PQ bits in an IVSE)
     66     eas: Event Assignment Structure
     67     end: Event Notification Descriptor
     68     nvt: Notification Virtual Target
     69     tctx: Thread interrupt Context registers
     70 
     71 
     72 
     73 XIVE internal tables
     74 --------------------
     75 
     76 Each of the sub-engines uses a set of tables to redirect interrupts
     77 from event sources to CPU threads.
     78 
     79 ::
     80 
     81                                             +-------+
     82     User or O/S                             |  EQ   |
     83         or                          +------>|entries|
     84     Hypervisor                      |       |  ..   |
     85       Memory                        |       +-------+
     86                                     |           ^
     87                                     |           |
     88                +-------------------------------------------------+
     89                                     |           |
     90     Hypervisor      +------+    +---+--+    +---+--+   +------+
     91       Memory        | ESB  |    | EAT  |    | ENDT |   | NVTT |
     92      (skiboot)      +----+-+    +----+-+    +----+-+   +------+
     93                       ^  |        ^  |        ^  |       ^
     94                       |  |        |  |        |  |       |
     95                +-------------------------------------------------+
     96                       |  |        |  |        |  |       |
     97                       |  |        |  |        |  |       |
     98                  +----|--|--------|--|--------|--|-+   +-|-----+    +------+
     99                  |    |  |        |  |        |  | |   | | tctx|    |Thread|
    100      IPI or   ---+    +  v        +  v        +  v |---| +  .. |----->     |
    101     HW events    |                                 |   |       |    |      |
    102                  |             IVRE                |   | IVPE  |    +------+
    103                  +---------------------------------+   +-------+
    104 
    105 
    106 Each IVSE has a 2-bit state machine per source, P for pending and Q
    107 for queued, which gates event triggering. The PQ bits are stored in
    108 an Event State Buffer (ESB) array and can be controlled by MMIOs.
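
The PQ state machine can be sketched in a few lines of Python. This is only
an illustration: the state encodings (P as the high bit) and the transition
rules for ``trigger`` and ``eoi`` are assumptions that are not spelled out in
this document, and the ``EventStateBuffer`` class is hypothetical.

```python
# Sketch of a per-source PQ state machine (assumed encodings and transitions).
PQ_RESET = 0b00    # P=0 Q=0: idle, the next trigger is let through
PQ_OFF = 0b01      # P=0 Q=1: source masked, triggers are dropped
PQ_PENDING = 0b10  # P=1 Q=0: one event let through, EOI not done yet
PQ_QUEUED = 0b11   # P=1 Q=1: another event arrived while pending


class EventStateBuffer:
    """Hypothetical ESB: an array of PQ bits, one 2-bit entry per source."""

    def __init__(self, nr_sources):
        self.pq = [PQ_RESET] * nr_sources

    def trigger(self, srcno):
        """Apply a trigger; return True if the event is let through."""
        state = self.pq[srcno]
        if state == PQ_RESET:
            self.pq[srcno] = PQ_PENDING
            return True
        if state in (PQ_PENDING, PQ_QUEUED):
            # Coalesce any further events into a single queued one.
            self.pq[srcno] = PQ_QUEUED
        return False  # a source that is OFF stays OFF

    def eoi(self, srcno):
        """Apply an EOI; return True if a queued event must be forwarded."""
        state = self.pq[srcno]
        if state == PQ_QUEUED:
            self.pq[srcno] = PQ_PENDING
            return True
        if state == PQ_PENDING:
            self.pq[srcno] = PQ_RESET
        return False
```

In this sketch, a second trigger on a pending source is coalesced, and the
EOI that follows reports that one more event has to be forwarded.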
    109 
    110 If the event is let through, the IVRE looks up the Event Assignment
    111 Structure (EAS) table to find the Event Notification Descriptor (END)
    112 configured for the source. Each END defines a notification path to a
    113 CPU and an in-memory Event Queue (EQ), in which event data is
    114 enqueued for the O/S to pull.
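
As a rough illustration of this lookup, the routing step could be modelled as
below. The ``Eas`` and ``End`` classes and all of their field names are
hypothetical simplifications chosen for the example (the real structures carry
more state), and the tables are plain dictionaries.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Eas:
    """Hypothetical Event Assignment Structure entry."""
    valid: bool
    end_index: int  # which END is configured for this source
    eq_data: int    # data to enqueue in the Event Queue


@dataclass
class End:
    """Hypothetical Event Notification Descriptor."""
    priority: int   # assumed: priority of the notification
    nvt_index: int  # assumed: target of the notification path
    queue: deque = field(default_factory=deque)  # in-memory Event Queue


def route(eat, endt, srcno):
    """Route an event from source `srcno`.

    Returns (nvt_index, priority) for the presenter, or None when no
    valid EAS is configured for the source.
    """
    eas = eat.get(srcno)
    if eas is None or not eas.valid:
        return None
    end = endt[eas.end_index]
    end.queue.append(eas.eq_data)  # EQ data for the O/S to pull
    return end.nvt_index, end.priority
```

For example, with ``eat = {3: Eas(True, 0, 0xBEEF)}`` and
``endt = {0: End(priority=5, nvt_index=7)}``, calling ``route(eat, endt, 3)``
enqueues ``0xBEEF`` and hands ``(7, 5)`` over to the presenter.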
    115 
    116 The IVPE determines if a Notification Virtual Target (NVT) can handle
    117 the event by scanning the thread contexts of the VCPUs dispatched on
    118 the processor HW threads. It maintains the interrupt context state of
    119 each thread in an NVT table.
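
The scan of the thread contexts can be pictured as a simple linear match. In
the sketch below each thread context is a dictionary with hypothetical
``valid`` and ``cam`` keys, where ``cam`` stands for the identifier of the NVT
currently dispatched on the HW thread; the real matching logic is more
involved.

```python
def find_thread(thread_contexts, nvt_index):
    """Return the index of the first HW thread running `nvt_index`.

    `thread_contexts` is a list with one dict per HW thread. Returns None
    when the target is not dispatched on any thread.
    """
    for thread_id, tctx in enumerate(thread_contexts):
        if tctx.get("valid") and tctx.get("cam") == nvt_index:
            return thread_id
    return None
```

When no thread matches, a real implementation has to record the event for
later delivery. That case is left out of this sketch.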
    120 
    121 XIVE thread interrupt context
    122 -----------------------------
    123 
    124 The XIVE presenter can generate four different exceptions to its
    125 HW threads:
    126 
    127 - hypervisor exception
    128 - O/S exception
    129 - Event-Based Branch (user level)
    130 - msgsnd (doorbell)
    131 
    132 Each exception has a state independent from the others, called a
    133 Thread Interrupt Management context. This context is a set of
    134 registers that lets the thread handle priority management and
    135 interrupt acknowledgment, among other things. The most important are:
    136 
    137 - Pending Interrupt Priority Register (PIPR)
    138 - Interrupt Pending Buffer            (IPB)
    139 - Current Processor Priority Register (CPPR)
    140 - Notification Source Register        (NSR)
    141 
    142 TIMA
    143 ~~~~
    144 
    145 The Thread Interrupt Management registers are accessible through a
    146 specific MMIO region, called the Thread Interrupt Management Area
    147 (TIMA), made of four aligned pages, each exposing a different view of
    148 the registers. The first page (page address ending in ``0b00``) gives
    149 access to the entire context and is reserved for the ring 0 view for
    150 the physical thread context. The second (page address ending in
    151 ``0b01``) is for the hypervisor, ring 1 view. The third (page address
    152 ending in ``0b10``) is for the operating system, ring 2 view. The
    153 fourth (page address ending in ``0b11``) is for user level, ring 3 view.
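
The page-to-view mapping amounts to extracting the two low bits of the page
number. The helper below is a minimal sketch of that decoding; the 64 KiB page
size (``TM_PAGE_SHIFT``) is an assumption made for the example, and
``tima_view`` is a hypothetical name.

```python
TM_PAGE_SHIFT = 16  # assumption: 64 KiB pages

RING_VIEWS = (
    "physical thread (ring 0)",
    "hypervisor (ring 1)",
    "operating system (ring 2)",
    "user (ring 3)",
)


def tima_view(addr, page_shift=TM_PAGE_SHIFT):
    """Return (ring, description) for an address inside the TIMA.

    Only the two low bits of the page number matter, which works because
    the four pages are aligned as a group.
    """
    ring = (addr >> page_shift) & 0b11
    return ring, RING_VIEWS[ring]
```

With these assumptions, an access at offset ``0x20010`` from the TIMA base
falls in the third page and gets the operating system view.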
    154 
    155 Interrupt flow from an O/S perspective
    156 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    157 
    158 After event data has been enqueued in the O/S Event Queue, the IVPE
    159 raises the bit corresponding to the priority of the pending interrupt
    160 in the Interrupt Pending Buffer (IPB) register to indicate that an
    161 event is pending in one of the 8 priority queues. The Pending
    162 Interrupt Priority Register (PIPR) is also updated using the IPB. This
    163 register represents the priority of the most favored pending
    164 notification.
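
The relationship between the IPB and the PIPR can be sketched as follows. The
bit layout is an assumption made for the example (priority 0, the most favored
one, is mapped to the most significant bit of an 8-bit IPB), as is the
``0xFF`` value used when nothing is pending. Both function names are
hypothetical.

```python
def priority_to_ipb(priority):
    """Return the IPB bit mask for `priority` (0..7), or 0 if out of range.

    Assumed layout: priority 0 is the most significant bit.
    """
    return 0x80 >> priority if 0 <= priority <= 7 else 0


def ipb_to_pipr(ipb):
    """Return the most favored (numerically lowest) pending priority.

    Returns 0xFF, an assumed "nothing pending" value, when the IPB is 0.
    """
    for priority in range(8):
        if ipb & (0x80 >> priority):
            return priority
    return 0xFF
```

For instance, with priorities 5 and 2 both pending, the IPB holds two bits
and ``ipb_to_pipr`` reports priority 2.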
    165 
    166 The PIPR is then compared to the Current Processor Priority
    167 Register (CPPR). If it is more favored (numerically less than), the
    168 CPU interrupt line is raised and the EO bit of the Notification Source
    169 Register (NSR) is updated to notify the presence of an exception for
    170 the O/S. The O/S acknowledges the interrupt with a special load in the
    171 Thread Interrupt Management Area.
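
A toy model of this comparison is given below. It is a sketch under several
assumptions: the position of the EO bit (``NSR_EO``), the IPB bit layout
(priority 0 in the most significant bit) and, above all, what the
acknowledging load does (here it lowers the line, raises the CPPR to the
accepted priority and clears the matching IPB bit). The ``OsContext`` class
is hypothetical.

```python
NSR_EO = 0x80  # assumption: position of the EO bit in the NSR


def pending_priority(ipb):
    """Most favored pending priority in `ipb`, or 0xFF if none."""
    return next((p for p in range(8) if ipb & (0x80 >> p)), 0xFF)


class OsContext:
    """Hypothetical O/S interrupt management context of one thread."""

    def __init__(self, cppr):
        self.ipb = 0
        self.pipr = 0xFF
        self.cppr = cppr
        self.nsr = 0
        self.irq_line = False

    def notify(self, priority):
        """Record a pending event and raise the line if it is favored."""
        self.ipb |= 0x80 >> priority
        self.pipr = pending_priority(self.ipb)
        if self.pipr < self.cppr:  # more favored: numerically less than
            self.nsr |= NSR_EO
            self.irq_line = True

    def acknowledge(self):
        """Model of the special load (assumed behaviour).

        Returns the accepted priority, or None if nothing was signalled.
        """
        self.irq_line = False
        if not self.nsr & NSR_EO:
            return None
        self.cppr = self.pipr
        self.ipb &= ~(0x80 >> self.cppr)
        self.pipr = pending_priority(self.ipb)
        self.nsr &= ~NSR_EO
        return self.cppr
```

In this model, a thread running with a CPPR of 4 ignores a notification of
priority 6 but is interrupted by one of priority 2.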
    172 
    173 The O/S handles the interrupt and, when done, performs an EOI using an
    174 MMIO operation on the ESB management page of the associated source.
    175 
    176 Overview of the QEMU models for XIVE
    177 ====================================
    178 
    179 The XiveSource models the IVSE in general, internal and external. It
    180 handles the source ESBs and the MMIO interface to control them.
    181 
    182 The XiveNotifier is a small helper interface interconnecting the
    183 XiveSource to the XiveRouter.
    184 
    185 The XiveRouter is an abstract model acting as a combined IVRE and
    186 IVPE. It routes event notifications using the EAS and END tables to
    187 the IVPE sub-engine which does a CAM scan to find a CPU to deliver the
    188 exception. Storage should be provided by the inheriting classes.
    189 
    190 XiveENDSource is a special source object. It exposes the END ESB MMIOs
    191 of the Event Queues which are used for coalescing event notifications
    192 and for escalation. It is not used in the field, only to sync the EQ
    193 cache in OPAL.
    194 
    195 Finally, the XiveTCTX contains the interrupt state context of a thread:
    196 four sets of registers, one for each exception that can be delivered
    197 to a CPU. These contexts are scanned by the IVPE to find a matching VP
    198 when a notification is triggered. It also models the Thread Interrupt
    199 Management Area (TIMA), which exposes the thread context registers to
    200 the CPU for interrupt management.