================================
POWER9 XIVE interrupt controller
================================

The POWER9 processor comes with a new interrupt controller
architecture called XIVE, for "eXternal Interrupt Virtualization
Engine".

Compared to the previous architecture, the main characteristics of
XIVE are support for a larger number of interrupt sources and the
delivery of interrupts directly to virtual processors without
hypervisor assistance. This removes the context switches otherwise
required by the delivery process.


XIVE architecture
=================

The XIVE IC is composed of three sub-engines, each taking care of a
processing layer of external interrupts:

- Interrupt Virtualization Source Engine (IVSE), or Source Controller
  (SC). These are found in PCI PHBs, in the Processor Service
  Interface (PSI) host bridge Controller, but also inside the main
  controller for the core IPIs and other sub-chips (NX, CAP, NPU) of
  the chip/processor. They are configured to feed the IVRE with
  events.
- Interrupt Virtualization Routing Engine (IVRE), or Virtualization
  Controller (VC). It handles event coalescing and performs interrupt
  routing by matching an event source number with an Event
  Notification Descriptor (END).
- Interrupt Virtualization Presentation Engine (IVPE), or Presentation
  Controller (PC). It maintains the interrupt context state of each
  thread and handles the delivery of the external interrupt to the
  thread.


::

              XIVE Interrupt Controller
              +------------------------------------+      IPIs
              | +---------+ +---------+ +--------+ |    +-------+
              | |IVRE     | |Common Q | |IVPE    |----> | CORES |
              | |     esb | |         | |        |----> |       |
              | |     eas | |  Bridge | |   tctx |----> |       |
              | |SC   end | |         | |    nvt | |    |       |
  +------+    | +---------+ +----+----+ +--------+ |    +-+-+-+-+
  | RAM  |    +------------------|-----------------+      | | |
  |      |                       |                        | | |
  |      |                       |                        | | |
  |      |  +--------------------v------------------------v-v-v--+    other
  |      <--+                     Power Bus                      +--> chips
  |  esb |  +---------+-----------------------+------------------+
  |  eas |            |                       |
  |  end |         +--|------+                |
  |  nvt |       +----+----+ |           +----+----+
  +------+       |IVSE     | |           |IVSE     |
                 |         | |           |         |
                 | PQ-bits | |           | PQ-bits |
                 | local   |-+           |  in VC  |
                 +---------+             +---------+
                    PCIe                 NX,NPU,CAPI


  PQ-bits: 2 bits source state machine (P:pending Q:queued)
  esb: Event State Buffer (Array of PQ bits in an IVSE)
  eas: Event Assignment Structure
  end: Event Notification Descriptor
  nvt: Notification Virtual Target
  tctx: Thread interrupt Context registers



XIVE internal tables
--------------------

Each of the sub-engines uses a set of tables to redirect interrupts
from event sources to CPU threads.

::

                                                   +-------+
        User or O/S                                |  EQ   |
            or                             +------>|entries|
        Hypervisor                         |       |  ..   |
          Memory                           |       +-------+
                                           |           ^
                                           |           |
             +-------------------------------------------------+
                                           |           |
        Hypervisor         +------+    +---+--+    +---+--+   +------+
          Memory           | ESB  |    | EAT  |    | ENDT |   | NVTT |
         (skiboot)         +----+-+    +----+-+    +----+-+   +------+
                             ^  |        ^  |        ^  |       ^
                             |  |        |  |        |  |       |
             +-------------------------------------------------+
                             |  |        |  |        |  |       |
                             |  |        |  |        |  |       |
                        +----|--|--------|--|--------|--|-+   +-|-----+     +------+
                        |    |  |        |  |        |  | |   | | tctx|     |Thread|
   IPI or ---+          +    v  +        v  +        v    |---| +  .. |----->      |
  HW events             |                                 |   |       |     |      |
                        |              IVRE               |   | IVPE  |     +------+
                        +---------------------------------+   +-------+


Each IVSE has a 2-bit state machine (P for pending and Q for queued)
for each source, which controls whether events can be triggered. The
PQ bits are stored in an Event State Buffer (ESB) array and can be
controlled by MMIOs.

If the event is let through, the IVRE looks up the Event Assignment
Structure (EAS) table for an Event Notification Descriptor (END)
configured for the source. Each Event Notification Descriptor defines
a notification path to a CPU and an in-memory Event Queue, in which
EQ data is enqueued for the O/S to pull.

The IVPE determines if a Notification Virtual Target (NVT) can handle
the event by scanning the thread contexts of the VCPUs dispatched on
the processor HW threads. It maintains the interrupt context state of
each thread in an NVT table.

XIVE thread interrupt context
-----------------------------

The XIVE presenter can generate four different exceptions to its
HW threads:

- hypervisor exception
- O/S exception
- Event-Based Branch (user level)
- msgsnd (doorbell)

Each exception has a state independent from the others, called a
Thread Interrupt Management context. This context is a set of
registers which lets the thread handle priority management and
interrupt acknowledgment, among other things. The most important ones
are:

- Pending Interrupt Priority Register (PIPR)
- Interrupt Pending Buffer (IPB)
- Current Processor Priority Register (CPPR)
- Notification Source Register (NSR)

TIMA
~~~~

The Thread Interrupt Management registers are accessible through a
specific MMIO region, called the Thread Interrupt Management Area
(TIMA), which consists of four aligned pages, each exposing a
different view of the registers.
The first page (page address ending in ``0b00``) gives access
to the entire context and is reserved for the ring 0 view of the
physical thread context. The second (page address ending in ``0b01``)
is for the hypervisor, ring 1 view. The third (page address ending in
``0b10``) is for the operating system, ring 2 view. The fourth (page
address ending in ``0b11``) is for user level, ring 3 view.

Interrupt flow from an O/S perspective
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

After event data has been enqueued in the O/S Event Queue, the IVPE
raises the bit corresponding to the priority of the pending interrupt
in the Interrupt Pending Buffer (IPB) register to indicate that an
event is pending in one of the 8 priority queues. The Pending
Interrupt Priority Register (PIPR) is also updated using the IPB. This
register represents the priority of the most favored pending
notification.

The PIPR is then compared to the Current Processor Priority
Register (CPPR). If it is more favored (numerically less than), the
CPU interrupt line is raised and the EO bit of the Notification Source
Register (NSR) is updated to notify the presence of an exception for
the O/S. The O/S acknowledges the interrupt with a special load in the
Thread Interrupt Management Area.

The O/S handles the interrupt and, when done, performs an EOI using an
MMIO operation on the ESB management page of the associated source.

Overview of the QEMU models for XIVE
====================================

The XiveSource models the IVSE in general, internal and external. It
handles the source ESBs and the MMIO interface to control them.

The XiveNotifier is a small helper interface interconnecting the
XiveSource to the XiveRouter.

The XiveRouter is an abstract model acting as a combined IVRE and
IVPE.
It routes event notifications using the EAS and END tables to
the IVPE sub-engine, which does a CAM scan to find a CPU to deliver the
exception. Storage should be provided by the inheriting classes.

XiveENDSource is a special source object. It exposes the END ESB MMIOs
of the Event Queues, which are used for coalescing event notifications
and for escalation. It is not used in the field, only to sync the EQ
cache in OPAL.

Finally, the XiveTCTX contains the interrupt state context of a thread:
four sets of registers, one for each exception that can be delivered
to a CPU. These contexts are scanned by the IVPE to find a matching VP
when a notification is triggered. It also models the Thread Interrupt
Management Area (TIMA), which exposes the thread context registers to
the CPU for interrupt management.
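
As a rough illustration of the priority handling that the thread
context models, the sketch below derives the PIPR from the IPB and
compares it with the CPPR, following the O/S interrupt flow described
earlier. The function names are hypothetical and are not the QEMU API.
The IPB bit layout (priority 0 held in the most significant bit) is an
assumption that this document does not state.

```c
#include <stdbool.h>
#include <stdint.h>

/* 8 priority queues: 0 is the most favored, 7 the least favored. */
#define SKETCH_PRIORITY_MAX 7

/*
 * Map a priority to its IPB bit.
 * Assumption: priority 0 uses the most significant bit of the byte.
 * Out-of-range priorities map to no bit at all.
 */
static uint8_t sketch_priority_to_ipb(uint8_t priority)
{
    if (priority > SKETCH_PRIORITY_MAX) {
        return 0;
    }
    return (uint8_t)(1u << (SKETCH_PRIORITY_MAX - priority));
}

/*
 * Derive the PIPR from the IPB: the most favored (numerically lowest)
 * pending priority, or 0xFF when nothing is pending.
 */
static uint8_t sketch_ipb_to_pipr(uint8_t ipb)
{
    for (uint8_t prio = 0; prio <= SKETCH_PRIORITY_MAX; prio++) {
        if (ipb & sketch_priority_to_ipb(prio)) {
            return prio;
        }
    }
    return 0xFF;
}

/*
 * An exception is signalled only when the PIPR is more favored
 * (numerically less) than the CPPR.
 */
static bool sketch_exception_pending(uint8_t ipb, uint8_t cppr)
{
    return sketch_ipb_to_pipr(ipb) < cppr;
}
```

Under this assumed layout, an IPB of ``0x04`` gives a PIPR of 5, so the
exception is signalled only while the CPPR is 6 or higher.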