.. SPDX-License-Identifier: GPL-2.0

===============
Boot Interrupts
===============

:Author: - Sean V Kelley <sean.v.kelley@linux.intel.com>

Overview
========

On PCI Express, interrupts are represented with either MSI or inbound
interrupt messages (Assert_INTx/Deassert_INTx). The integrated IO-APIC in a
given Core IO converts the legacy interrupt messages from PCI Express to
MSI interrupts. If the IO-APIC is disabled (via the mask bits in the
IO-APIC table entries), the messages are routed to the legacy PCH. This
in-band interrupt mechanism was traditionally necessary for systems that
did not support the IO-APIC and for boot. Intel in the past has used the
term "boot interrupts" to describe this mechanism. Further, the PCI Express
protocol describes this in-band legacy wire-interrupt INTx mechanism for
I/O devices to signal PCI-style level interrupts. The subsequent paragraphs
describe problems with the Core IO handling of INTx message routing to the
PCH and mitigation within BIOS and the OS.


Issue
=====

When in-band legacy INTx messages are forwarded to the PCH, they in turn
trigger a new interrupt for which the OS likely lacks a handler.
When an interrupt goes unhandled over time, the Linux kernel tracks it as a
spurious interrupt. Once the count of unhandled interrupts reaches a
specific threshold, the kernel disables the IRQ and reports the error
"nobody cared". The disabled IRQ then prevents valid use by any existing
interrupt that happens to share the IRQ line::

  irq 19: nobody cared (try booting with the "irqpoll" option)
  CPU: 0 PID: 2988 Comm: irq/34-nipalk Tainted: 4.14.87-rt49-02410-g4a640ec-dirty #1
  Hardware name: National Instruments NI PXIe-8880/NI PXIe-8880, BIOS 2.1.5f1 01/09/2020
  Call Trace:

  <IRQ>
   ? dump_stack+0x46/0x5e
   ? __report_bad_irq+0x2e/0xb0
   ? note_interrupt+0x242/0x290
   ? nNIKAL100_memoryRead16+0x8/0x10 [nikal]
   ? handle_irq_event_percpu+0x55/0x70
   ? handle_irq_event+0x4f/0x80
   ? handle_fasteoi_irq+0x81/0x180
   ? handle_irq+0x1c/0x30
   ? do_IRQ+0x41/0xd0
   ? common_interrupt+0x84/0x84
  </IRQ>

  handlers:
  irq_default_primary_handler threaded usb_hcd_irq
  Disabling IRQ #19


Conditions
==========

The use of threaded interrupts is the most likely condition to trigger
this problem today.
Threaded interrupts may not be reenabled after the IRQ handler wakes.
These "one shot" conditions mean that the threaded interrupt needs to keep
the interrupt line masked until the threaded handler has run. Especially
when dealing with high data rate interrupts, the thread needs to run to
completion; otherwise some handlers will end up in stack overflows since
the interrupt of the issuing device is still active.

Affected Chipsets
=================

The legacy interrupt forwarding mechanism exists today in a number of
devices including but not limited to chipsets from AMD/ATI, Broadcom, and
Intel. Changes made through the mitigations below have been applied to
drivers/pci/quirks.c.

Starting with ICX there are no longer any IO-APICs in the Core IO's
devices. The IO-APIC is only in the PCH. Devices connected to the Core IO's
PCIe Root Ports will use native MSI/MSI-X mechanisms.

Mitigations
===========

The mitigations take the form of PCI quirks. The preference has been to
first identify and make use of a means to disable the routing to the PCH.
In such a case a quirk to disable boot interrupt generation can be
added.
[1]_

Intel® 6300ESB I/O Controller Hub
  Alternate Base Address Register:
   BIE: Boot Interrupt Enable

          ==  ===========================
          0   Boot interrupt is enabled.
          1   Boot interrupt is disabled.
          ==  ===========================

Intel® Sandy Bridge through Sky Lake based Xeon servers:
  Coherent Interface Protocol Interrupt Control
   dis_intx_route2pch/dis_intx_route2ich/dis_intx_route2dmi2:
          When this bit is set, local INTx messages received from the
          Intel® Quick Data DMA/PCI Express ports are not routed to legacy
          PCH - they are either converted into MSI via the integrated IO-APIC
          (if the IO-APIC mask bit is clear in the appropriate entries)
          or cause no further action (when mask bit is set).

In the absence of a way to directly disable the routing, another approach
has been to make use of PCI Interrupt pin to INTx routing tables for
purposes of redirecting the interrupt handler to the rerouted interrupt
line by default. Therefore, on chipsets where this INTx routing cannot be
disabled, the Linux kernel will reroute the valid interrupt to its legacy
interrupt. This redirection of the handler prevents the spurious interrupt
detection from triggering, which would ordinarily disable the IRQ line due
to excessive unhandled counts.
[2]_

The config option X86_REROUTE_FOR_BROKEN_BOOT_IRQS exists to enable (or
disable) the redirection of the interrupt handler to the PCH interrupt
line. The option can be overridden by either pci=ioapicreroute or
pci=noioapicreroute. [3]_


More Documentation
==================

There is an overview of the legacy interrupt handling in several datasheets
(6300ESB and 6700PXH below). While largely the same, they provide insight
into how its handling evolved across chipsets.

Example of disabling of the boot interrupt
------------------------------------------

  - Intel® 6300ESB I/O Controller Hub (Document # 300641-004US)
    5.7.3 Boot Interrupt
    https://www.intel.com/content/dam/doc/datasheet/6300esb-io-controller-hub-datasheet.pdf

  - Intel® Xeon® Processor E5-1600/2400/2600/4600 v3 Product Families
    Datasheet - Volume 2: Registers (Document # 330784-003)
    6.6.41 cipintrc Coherent Interface Protocol Interrupt Control
    https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/xeon-e5-v3-datasheet-vol-2.pdf

Example of handler rerouting
----------------------------

  - Intel® 6700PXH 64-bit PCI Hub (Document # 302628)
    2.15.2 PCI Express Legacy INTx Support and
    Boot Interrupt
    https://www.intel.com/content/dam/doc/datasheet/6700pxh-64-bit-pci-hub-datasheet.pdf


If you have any legacy PCI interrupt questions that aren't answered, email me.

Cheers,
    Sean V Kelley
    sean.v.kelley@linux.intel.com

.. [1] https://lore.kernel.org/r/12131949181903-git-send-email-sassmann@suse.de/
.. [2] https://lore.kernel.org/r/12131949182094-git-send-email-sassmann@suse.de/
.. [3] https://lore.kernel.org/r/487C8EA7.6020205@suse.de/