.. SPDX-License-Identifier: GPL-2.0

===============
Boot Interrupts
===============

:Author: - Sean V Kelley <sean.v.kelley@linux.intel.com>

Overview
========

On PCI Express, interrupts are represented with either MSI or inbound
interrupt messages (Assert_INTx/Deassert_INTx). The integrated IO-APIC in a
given Core IO converts the legacy interrupt messages from PCI Express to
MSI interrupts.  If the IO-APIC is disabled (via the mask bits in the
IO-APIC table entries), the messages are routed to the legacy PCH. This
in-band interrupt mechanism was traditionally necessary for systems that
did not support the IO-APIC and for boot. Intel in the past has used the
term "boot interrupts" to describe this mechanism. Further, the PCI Express
protocol describes this in-band legacy wire-interrupt INTx mechanism for
I/O devices to signal PCI-style level interrupts. The subsequent paragraphs
describe problems with the Core IO handling of INTx message routing to the
PCH and mitigation within BIOS and the OS.


Issue
=====

When in-band legacy INTx messages are forwarded to the PCH, they in turn
trigger a new interrupt for which the OS likely lacks a handler. When an
interrupt repeatedly goes unhandled, the Linux kernel tracks it as a
spurious interrupt. The kernel disables the IRQ once the unhandled count
reaches a specific threshold and reports the error "nobody cared". The
disabled IRQ is then unavailable to any valid device that happens to share
the IRQ line::

  irq 19: nobody cared (try booting with the "irqpoll" option)
  CPU: 0 PID: 2988 Comm: irq/34-nipalk Tainted: 4.14.87-rt49-02410-g4a640ec-dirty #1
  Hardware name: National Instruments NI PXIe-8880/NI PXIe-8880, BIOS 2.1.5f1 01/09/2020
  Call Trace:

  <IRQ>
   ? dump_stack+0x46/0x5e
   ? __report_bad_irq+0x2e/0xb0
   ? note_interrupt+0x242/0x290
   ? nNIKAL100_memoryRead16+0x8/0x10 [nikal]
   ? handle_irq_event_percpu+0x55/0x70
   ? handle_irq_event+0x4f/0x80
   ? handle_fasteoi_irq+0x81/0x180
   ? handle_irq+0x1c/0x30
   ? do_IRQ+0x41/0xd0
   ? common_interrupt+0x84/0x84
  </IRQ>

  handlers:
  irq_default_primary_handler threaded usb_hcd_irq
  Disabling IRQ #19
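The accounting behind the "nobody cared" report can be sketched in user
space. The following is a simplified model, not the kernel code: the
structure, the function names, and the two thresholds (a window of 100000
interrupts, of which more than 99900 must be unhandled) are assumptions made
for illustration and should be checked against kernel/irq/spurious.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed thresholds, for illustration only. */
#define WINDOW_SIZE      100000u  /* interrupts per evaluation window */
#define UNHANDLED_LIMIT   99900u  /* unhandled count that marks the line bad */

/* Hypothetical per-IRQ bookkeeping; not the kernel's struct irq_desc. */
struct irq_model {
	unsigned int count;      /* interrupts seen in the current window */
	unsigned int unhandled;  /* of those, how many no handler claimed */
	bool disabled;           /* set once the line has been shut off */
};

/* Record one interrupt. Returns true if this call disabled the line. */
static bool note_interrupt_model(struct irq_model *d, bool handled)
{
	bool bad;

	assert(d != NULL);
	if (d->disabled)
		return false;
	if (!handled)
		d->unhandled++;
	if (++d->count < WINDOW_SIZE)
		return false;

	/* Window complete: evaluate it, then start a fresh one. */
	bad = d->unhandled > UNHANDLED_LIMIT;
	d->count = 0;
	d->unhandled = 0;
	if (bad)
		d->disabled = true;  /* corresponds to "Disabling IRQ #N" */
	return bad;
}

/* Deliver up to n unhandled interrupts; returns how many were accepted. */
static unsigned int flood_unhandled(struct irq_model *d, unsigned int n)
{
	unsigned int delivered = 0;

	while (delivered < n && !d->disabled) {
		note_interrupt_model(d, false);
		delivered++;
	}
	return delivered;
}
```

In this model a line that receives only unclaimed boot interrupts is
disabled at the end of the first window, while a line on which a reasonable
share of interrupts is claimed by a handler stays enabled.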


Conditions
==========

The use of threaded interrupts is the most likely condition to trigger
this problem today. Threaded interrupts may not be re-enabled after the IRQ
handler wakes the thread. These "one shot" conditions mean that the threaded
interrupt needs to keep the interrupt line masked until the threaded handler
has run. Especially when dealing with high data rate interrupts, the thread
needs to run to completion; otherwise some handlers will end up in stack
overflows since the interrupt of the issuing device is still active.
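The interaction between one-shot masking and boot interrupts can be
illustrated with a small user-space model. It is a sketch under stated
assumptions: every name below is hypothetical, and the model simply treats
each INTx assertion that arrives while the IO-APIC entry is masked as being
forwarded to the PCH, as described in the overview.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one interrupt line used in one-shot mode. */
struct oneshot_line {
	bool masked;           /* IO-APIC entry masked while the thread is pending */
	unsigned int handled;  /* assertions delivered to the real handler */
	unsigned int boot;     /* assertions forwarded to the PCH instead */
};

/* The device signals an interrupt on this line. */
static void device_asserts_intx(struct oneshot_line *l)
{
	assert(l != NULL);
	if (l->masked) {
		/* Masked entry: the message becomes a boot interrupt. */
		l->boot++;
		return;
	}
	/* The hard handler wakes the thread and leaves the line masked. */
	l->handled++;
	l->masked = true;
}

/* The threaded handler has run to completion; unmask the line. */
static void thread_completes(struct oneshot_line *l)
{
	assert(l != NULL);
	l->masked = false;
}
```

Each assertion counted in ``boot`` shows up on the PCH interrupt line, where
no handler claims it, which is what feeds the spurious interrupt accounting
described in the previous section.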

Affected Chipsets
=================

The legacy interrupt forwarding mechanism exists today in a number of
devices including but not limited to chipsets from AMD/ATI, Broadcom, and
Intel. Changes made through the mitigations below have been applied to
drivers/pci/quirks.c.

Starting with ICX there are no longer any IO-APICs in the Core IO's
devices.  IO-APIC is only in the PCH.  Devices connected to the Core IO's
PCIe Root Ports will use native MSI/MSI-X mechanisms.

Mitigations
===========

The mitigations take the form of PCI quirks. The preference has been to
first identify and make use of a means to disable the routing to the PCH.
In such a case a quirk to disable boot interrupt generation can be
added. [1]_

Intel® 6300ESB I/O Controller Hub
  Alternate Base Address Register:
   BIE: Boot Interrupt Enable

	  ==  ===========================
	  0   Boot interrupt is enabled.
	  1   Boot interrupt is disabled.
	  ==  ===========================

Intel® Sandy Bridge through Skylake based Xeon servers:
  Coherent Interface Protocol Interrupt Control
   dis_intx_route2pch/dis_intx_route2ich/dis_intx_route2dmi2:
	  When this bit is set, local INTx messages received from the
	  Intel® Quick Data DMA/PCI Express ports are not routed to legacy
	  PCH - they are either converted into MSI via the integrated IO-APIC
	  (if the IO-APIC mask bit is clear in the appropriate entries)
	  or cause no further action (when the mask bit is set).
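The effect of the dis_intx_route2pch bit can be summarized as a routing
decision with three outcomes. The sketch below is a user-space illustration
only: the enum, the function names, and in particular the bit position are
assumptions and are not taken from the datasheet text quoted above. A real
quirk would read and write the register through PCI configuration space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position of dis_intx_route2pch; for illustration only. */
#define DIS_INTX_ROUTE2PCH (1u << 25)

/* Where a local INTx message ends up (hypothetical names). */
enum intx_dest {
	INTX_TO_MSI,   /* converted to MSI by the integrated IO-APIC */
	INTX_TO_PCH,   /* forwarded to the legacy PCH: a boot interrupt */
	INTX_DROPPED,  /* causes no further action */
};

/* Decide the destination from the control register and the mask bit. */
static enum intx_dest route_intx(uint32_t cipintrc, bool ioapic_entry_masked)
{
	if (!ioapic_entry_masked)
		return INTX_TO_MSI;
	if (cipintrc & DIS_INTX_ROUTE2PCH)
		return INTX_DROPPED;
	return INTX_TO_PCH;
}

/* Read-modify-write step a quirk would perform on the register value. */
static uint32_t disable_boot_interrupts(uint32_t cipintrc)
{
	uint32_t updated = cipintrc | DIS_INTX_ROUTE2PCH;

	assert(updated & DIS_INTX_ROUTE2PCH);
	return updated;
}
```

With the bit clear, a masked IO-APIC entry yields ``INTX_TO_PCH``; after
``disable_boot_interrupts()`` the same situation yields ``INTX_DROPPED``,
and the unmasked case is unaffected.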

In the absence of a way to directly disable the routing, another approach
has been to make use of PCI Interrupt pin to INTx routing tables for
purposes of redirecting the interrupt handler to the rerouted interrupt
line by default.  Therefore, on chipsets where this INTx routing cannot be
disabled, the Linux kernel will reroute the valid interrupt to its legacy
interrupt. This redirection of the handler will prevent the occurrence of
the spurious interrupt detection which would ordinarily disable the IRQ
line due to excessive unhandled counts. [2]_

The config option X86_REROUTE_FOR_BROKEN_BOOT_IRQS exists to enable (or
disable) the redirection of the interrupt handler to the PCH interrupt
line. The option can be overridden by either pci=ioapicreroute or
pci=noioapicreroute. [3]_


More Documentation
==================

There is an overview of the legacy interrupt handling in several datasheets
(6300ESB and 6700PXH below). While largely the same, they provide insight
into how the handling evolved across chipsets.

Example of disabling of the boot interrupt
------------------------------------------

      - Intel® 6300ESB I/O Controller Hub (Document # 300641-004US)
	5.7.3 Boot Interrupt
	https://www.intel.com/content/dam/doc/datasheet/6300esb-io-controller-hub-datasheet.pdf

      - Intel® Xeon® Processor E5-1600/2400/2600/4600 v3 Product Families
	Datasheet - Volume 2: Registers (Document # 330784-003)
	6.6.41 cipintrc Coherent Interface Protocol Interrupt Control
	https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/xeon-e5-v3-datasheet-vol-2.pdf

Example of handler rerouting
----------------------------

      - Intel® 6700PXH 64-bit PCI Hub (Document # 302628)
	2.15.2 PCI Express Legacy INTx Support and Boot Interrupt
	https://www.intel.com/content/dam/doc/datasheet/6700pxh-64-bit-pci-hub-datasheet.pdf


If you have any legacy PCI interrupt questions that aren't answered, email me.

Cheers,
    Sean V Kelley
    sean.v.kelley@linux.intel.com

.. [1] https://lore.kernel.org/r/12131949181903-git-send-email-sassmann@suse.de/
.. [2] https://lore.kernel.org/r/12131949182094-git-send-email-sassmann@suse.de/
.. [3] https://lore.kernel.org/r/487C8EA7.6020205@suse.de/