==========================
PCI Bus EEH Error Recovery
==========================

Linas Vepstas <linas@austin.ibm.com>

12 January 2005


Overview:
---------
The IBM POWER-based pSeries and iSeries computers include PCI bus
controller chips that have extended capabilities for detecting and
reporting a large variety of PCI bus error conditions.  These features
go under the name of "EEH", for "Enhanced Error Handling".  The EEH
hardware features allow PCI bus errors to be cleared and a PCI
card to be "rebooted", without also having to reboot the operating
system.

This is in contrast to traditional PCI error handling, where the
PCI chip is wired directly to the CPU, and an error would cause
a CPU machine-check/check-stop condition, halting the CPU entirely.
Another "traditional" technique is to ignore such errors, which
can lead to corruption of both user data and kernel data,
hung/unresponsive adapters, or system crashes/lockups.  Thus,
the idea behind EEH is that the operating system can become more
reliable and robust by protecting it from PCI errors, and giving
the OS the ability to "reboot"/recover individual PCI devices.

Future systems from other vendors, based on the PCI-E specification,
may contain similar features.


Causes of EEH Errors
--------------------
EEH was originally designed to guard against hardware failure, such
as PCI cards dying from heat, humidity, dust, vibration and bad
electrical connections. The vast majority of EEH errors seen in
"real life" are due to either poorly seated PCI cards, or,
unfortunately quite commonly, due to device driver bugs, device firmware
bugs, and sometimes PCI card hardware bugs.

The most common software bug is one that causes the device to
attempt to DMA to a location in system memory that has not been
reserved for DMA access for that card.  Catching this is a powerful
feature, as it prevents what would otherwise have been silent memory
corruption caused by the bad DMA.  A number of device driver
bugs have been found and fixed in this way over the past few
years.  Other possible causes of EEH errors include data or
address line parity errors (for example, due to poor electrical
connectivity caused by a poorly seated card), and PCI-X split-completion
errors (due to software, device firmware, or device PCI hardware bugs).
The vast majority of "true hardware failures" can be cured by
physically removing and re-seating the PCI card.


Detection and Recovery
----------------------
In the following discussion, a generic overview of how to detect
and recover from EEH errors will be presented. This is followed
by an overview of how the current implementation in the Linux
kernel does it.  The actual implementation is subject to change,
and some of the finer points are still being debated.  These
may in turn be swayed if or when other architectures implement
similar functionality.

When a PCI Host Bridge (PHB, the bus controller connecting the
PCI bus to the system CPU electronics complex) detects a PCI error
condition, it will "isolate" the affected PCI card.  Isolation
will block all writes (either to the card from the system, or
from the card to the system), and it will cause all reads to
return all-ff's (0xff, 0xffff, 0xffffffff for 8/16/32-bit reads).
This value was chosen because it is the same value you would
get if the device was physically unplugged from the slot.
This applies to accesses to PCI memory, I/O space, and PCI config
space.  Interrupts, however, will continue to be delivered.
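
As a rough illustration of the all-ff's convention, the following
user-space C sketch tests whether a value read at a given width is
all ones.  The helper name ``eeh_sketch_is_all_ones()`` is hypothetical
and is not part of the kernel.  An all-ones result only suggests
isolation; a healthy device may legitimately return the same value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel API): returns true if a read of
 * `width` bytes (1, 2 or 4) came back as all-ones in the low bytes.
 * Bits above the read width are ignored.
 */
static bool eeh_sketch_is_all_ones(uint32_t val, unsigned int width)
{
    assert(width == 1 || width == 2 || width == 4);

    /* Build 0xff, 0xffff or 0xffffffff without shifting by 32. */
    uint32_t mask = (width == 4) ? 0xffffffffu
                                 : ((1u << (8 * width)) - 1u);

    return (val & mask) == mask;
}
```

A caller would treat a true result as a reason to ask firmware about
the slot state, not as proof that the slot is frozen.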

Detection and recovery are performed with the aid of ppc64
firmware.  The programming interfaces in the Linux kernel
into the firmware are referred to as RTAS (Run-Time Abstraction
Services).  The Linux kernel does not (should not) access
the EEH function in the PCI chipsets directly, primarily because
there are a number of different chipsets out there, each with
different interfaces and quirks. The firmware provides a
uniform abstraction layer that will work with all pSeries
and iSeries hardware (and be forwards-compatible).

If the OS or device driver suspects that a PCI slot has been
EEH-isolated, there is a firmware call it can make to determine if
this is the case. If so, then the device driver should put itself
into a consistent state (given that it won't be able to complete any
pending work) and start recovery of the card.  Recovery normally
would consist of resetting the PCI device (holding the PCI #RST
line high for two seconds), followed by setting up the device
config space (the base address registers (BARs), latency timer,
cache line size, interrupt line, and so on).  This is followed by a
reinitialization of the device driver.  In a worst-case scenario,
the power to the card can be toggled, at least on hot-plug-capable
slots.  In principle, layers far above the device driver probably
do not need to know that the PCI card has been "rebooted" in this
way; ideally, there should be at most a pause in Ethernet/disk/USB
I/O while the card is being reset.

If the card cannot be recovered after three or four resets, the
kernel/device driver should assume the worst-case scenario, that the
card has died completely, and report this error to the sysadmin.
In addition, error messages are reported through RTAS and also through
syslogd (/var/log/messages) to alert the sysadmin of PCI resets.
The correct way to deal with failed adapters is to use the standard
PCI hotplug tools to remove and replace the dead card.
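
The reset-and-retry policy described above can be sketched as a small
user-space simulation.  Everything here is an assumption made for
illustration: the ``sketch_`` names, the simulated card structure, and
the limit of three resets (the text says "three or four").  It is not
the kernel's recovery code.

```c
#include <stdbool.h>

#define SKETCH_MAX_RESETS 3  /* assumption: text says "three or four" */

/* Simulated card, used only to exercise the retry policy. */
struct sketch_card {
    int  resets_needed;  /* resets until the card works; < 0 = never */
    int  resets_done;    /* resets performed so far */
    bool config_valid;   /* true once config space has been restored */
};

/* Step 1: reset the device; this wipes its config space. */
static void sketch_reset(struct sketch_card *c)
{
    c->resets_done++;
    c->config_valid = false;
}

/* Step 2: set up BARs, latency timer, cache line size, and so on. */
static void sketch_restore_config(struct sketch_card *c)
{
    c->config_valid = true;
}

/* Step 3: reinitialize the driver; succeeds only on a working card. */
static bool sketch_reinit_driver(const struct sketch_card *c)
{
    return c->config_valid && c->resets_needed >= 0 &&
           c->resets_done >= c->resets_needed;
}

/* Returns true if recovered, false if the card should be declared dead. */
static bool sketch_recover(struct sketch_card *c)
{
    for (int attempt = 0; attempt < SKETCH_MAX_RESETS; attempt++) {
        sketch_reset(c);
        sketch_restore_config(c);
        if (sketch_reinit_driver(c))
            return true;
    }
    return false;  /* give up and report to the sysadmin */
}
```

Bounding the number of attempts keeps a permanently failed card from
being reset forever; a false return is the point at which the failure
would be reported.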


Current PPC64 Linux EEH Implementation
--------------------------------------
At this time, a generic EEH recovery mechanism has been implemented,
so that individual device drivers do not need to be modified to support
EEH recovery.  This generic mechanism piggy-backs on the PCI hotplug
infrastructure, and percolates events up through the userspace/udev
infrastructure.  Following is a detailed description of how this is
accomplished.

EEH must be enabled in the PHBs very early during the boot process,
and again whenever a PCI slot is hot-plugged. The former is performed by
eeh_init() in arch/powerpc/platforms/pseries/eeh.c, and the latter by
drivers/pci/hotplug/pSeries_pci.c calling into the eeh.c code.
EEH must be enabled before a PCI scan of the device can proceed.
Current Power5 hardware will not work unless EEH is enabled,
although older Power4 can run with it disabled.  Effectively,
EEH can no longer be turned off.  PCI devices *must* be
registered with the EEH code; the EEH code needs to know about
the I/O address ranges of the PCI device in order to detect an
error.  Given an arbitrary address, the routine
pci_get_device_by_addr() will find the PCI device associated
with that address (if any).

The default arch/powerpc/include/asm/io.h macros readb(), inb(), insb(),
etc. include a check to see if the I/O read returned all-0xff's.
If so, these make a call to eeh_dn_check_failure(), which in turn
asks the firmware if the all-ff's value is the sign of a true EEH
error.  If it is not, processing continues as normal.  The grand
total number of these false alarms or "false positives" can be
seen in /proc/ppc64/eeh (subject to change).  Normally, almost
all of these occur during boot, when the PCI bus is scanned, where
a large number of 0xff reads are part of the bus scan procedure.
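
The shape of that check can be sketched in user space as follows.  The
simulated device, the ``sketch_`` function names, and the counter are
all hypothetical; the real macros and eeh_dn_check_failure() live in
the kernel and talk to firmware, which this sketch replaces with a
simple flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simulated device; `frozen` stands in for the firmware's answer. */
struct sketch_dev {
    uint8_t reg;     /* value a healthy device would return */
    bool    frozen;  /* true if the slot is EEH-isolated */
};

/* Count of all-ff reads that turned out not to be EEH errors. */
static unsigned long sketch_false_positives;

/* Stand-in for the firmware query (assumption, not a real RTAS call). */
static bool sketch_firmware_says_frozen(const struct sketch_dev *dev)
{
    return dev->frozen;
}

/*
 * 8-bit read with an EEH-style check.  Only an all-ones value triggers
 * the (expensive) firmware query; every other value takes the fast path.
 */
static uint8_t sketch_readb(const struct sketch_dev *dev, bool *failed)
{
    /* An isolated slot returns all-ones regardless of register content. */
    uint8_t val = dev->frozen ? 0xff : dev->reg;

    *failed = false;
    if (val == 0xff) {
        if (sketch_firmware_says_frozen(dev))
            *failed = true;            /* true EEH error */
        else
            sketch_false_positives++;  /* register really held 0xff */
    }
    return val;
}
```

Checking only on all-ones reads keeps the common path cheap, at the
cost of detecting the error at the next read rather than at the moment
it occurs.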

If a frozen slot is detected, code in
arch/powerpc/platforms/pseries/eeh.c will print a stack trace to
syslog (/var/log/messages).  This stack trace has proven to be very
useful to device-driver authors for finding out at what point the EEH
error was detected, as the error itself usually occurs slightly
beforehand.

Next, it uses the Linux kernel notifier chain/work queue mechanism to
allow any interested parties to find out about the failure.  Device
drivers, or other parts of the kernel, can use
`eeh_register_notifier(struct notifier_block *)` to find out about EEH
events.  The event will include a pointer to the PCI device, the
device node and some state info.  Receivers of the event can "do as
they wish"; the default handler will be described further in this
section.
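
To show the general register-then-notify pattern, here is a minimal
user-space mock of a notifier chain.  It deliberately does not use the
kernel's ``struct notifier_block``; the ``sketch_`` types, the event
fields, and the counting receiver are all assumptions made for
illustration only.

```c
#include <stddef.h>

/* Hypothetical event payload: which slot failed and some state info. */
struct sketch_event {
    const char *slot_name;
    int         state;
};

/* Hypothetical notifier entry, loosely modelled on a callback list. */
struct sketch_notifier {
    int (*call)(struct sketch_notifier *self,
                const struct sketch_event *ev);
    struct sketch_notifier *next;
};

static struct sketch_notifier *sketch_chain;  /* head of the chain */

/* Add a receiver at the head of the chain. */
static void sketch_register_notifier(struct sketch_notifier *nb)
{
    nb->next = sketch_chain;
    sketch_chain = nb;
}

/* Deliver an event to every receiver; returns how many were called. */
static int sketch_notify_all(const struct sketch_event *ev)
{
    int called = 0;

    for (struct sketch_notifier *nb = sketch_chain; nb; nb = nb->next) {
        nb->call(nb, ev);
        called++;
    }
    return called;
}

/* Example receiver: just counts the events it has seen. */
static int sketch_events_seen;

static int sketch_count_events(struct sketch_notifier *self,
                               const struct sketch_event *ev)
{
    (void)self;
    (void)ev;
    sketch_events_seen++;
    return 0;
}
```

The detector does not need to know who is listening; each receiver
decides for itself what, if anything, to do with the event.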

To assist in the recovery of the device, eeh.c exports the
following functions:

rtas_set_slot_reset()
   assert the PCI #RST line for 1/8th of a second
rtas_configure_bridge()
   ask firmware to configure any PCI bridges
   located topologically under the PCI slot.
eeh_save_bars() and eeh_restore_bars()
   save and restore the PCI
   config-space info for a device and any devices under it.


A handler for the EEH notifier_block events is implemented in
drivers/pci/hotplug/pSeries_pci.c, called handle_eeh_events().
It saves the device BARs and then calls rpaphp_unconfig_pci_adapter().
This last call causes the device driver for the card to be stopped,
which causes uevents to go out to user space. This triggers
user-space scripts that might issue commands such as "ifdown eth0"
for Ethernet cards, and so on.  This handler then sleeps for 5 seconds,
hoping to give the user-space scripts enough time to complete.
It then resets the PCI card, reconfigures the device BARs, and
any bridges underneath. It then calls rpaphp_enable_pci_slot(),
which restarts the device driver and triggers more user-space
events (for example, calling "ifup eth0" for Ethernet cards).


Device Shutdown and User-Space Events
-------------------------------------
This section documents what happens when a PCI slot is unconfigured,
focusing on how the device driver gets shut down, and on how the
events get delivered to user-space scripts.

Following is an example sequence of events that cause a device driver
close function to be called during the first phase of an EEH reset.
The following sequence is an example of the pcnet32 device driver::

    rpaphp_unconfig_pci_adapter (struct slot *)  // in rpaphp_pci.c
    {
      calls
      pci_remove_bus_device (struct pci_dev *) // in /drivers/pci/remove.c
      {
        calls
        pci_destroy_dev (struct pci_dev *)
        {
          calls
          device_unregister (&dev->dev) // in /drivers/base/core.c
          {
            calls
            device_del (struct device *)
            {
              calls
              bus_remove_device() // in /drivers/base/bus.c
              {
                calls
                device_release_driver()
                {
                  calls
                  struct device_driver->remove() which is just
                  pci_device_remove()  // in /drivers/pci/pci_driver.c
                  {
                    calls
                    struct pci_driver->remove() which is just
                    pcnet32_remove_one() // in /drivers/net/pcnet32.c
                    {
                      calls
                      unregister_netdev() // in /net/core/dev.c
                      {
                        calls
                        dev_close()  // in /net/core/dev.c
                        {
                           calls dev->stop();
                           which is just pcnet32_close() // in pcnet32.c
                           {
                             which does what you wanted
                             to stop the device
                           }
                        }
                      }
                      which then
                      frees pcnet32 device driver memory
                    }
                  }
                }
              }
            }
          }
        }
      }
    }


In drivers/pci/pci_driver.c,
struct device_driver->remove() is just pci_device_remove()
which calls struct pci_driver->remove() which is pcnet32_remove_one()
which calls unregister_netdev()  (in net/core/dev.c)
which calls dev_close()  (in net/core/dev.c)
which calls dev->stop() which is pcnet32_close()
which then does the appropriate shutdown.

---

Following is the analogous stack trace for events sent to user-space
when the PCI device is unconfigured::

  rpaphp_unconfig_pci_adapter() {              // in rpaphp_pci.c
    calls
    pci_remove_bus_device (struct pci_dev *) { // in /drivers/pci/remove.c
      calls
      pci_destroy_dev (struct pci_dev *) {
        calls
        device_unregister (&dev->dev) {        // in /drivers/base/core.c
          calls
          device_del(struct device * dev) {    // in /drivers/base/core.c
            calls
            kobject_del() {                    // in /lib/kobject.c
              calls
              kobject_uevent() {               // in /lib/kobject.c
                calls
                kset_uevent() {                // in /lib/kobject.c
                  calls
                  kset->uevent_ops->uevent()   // which is really just
                  a call to
                  dev_uevent() {               // in /drivers/base/core.c
                    calls
                    dev->bus->uevent() which is really just a call to
                    pci_uevent () {            // in drivers/pci/hotplug.c
                      which prints device name, etc....
                    }
                  }
                }
                then kobject_uevent() sends a netlink uevent to userspace
                --> userspace uevent
                (during early boot, nobody listens to netlink events and
                kobject_uevent() executes uevent_helper[], which runs the
                event process /sbin/hotplug)
              }
              kobject_del() then calls sysfs_remove_dir(), which would
              trigger any user-space daemon that was watching /sysfs
              to notice the delete event.
            }
          }
        }
      }
    }
  }


Pros and Cons of the Current Design
-----------------------------------
There are several issues with the current EEH software recovery design,
which may be addressed in future revisions.  But first, note that the
big plus of the current design is that no changes need to be made to
individual device drivers, so that the current design throws a wide net.
The biggest negative of the design is that it potentially disturbs
network daemons and file systems that didn't need to be disturbed.

-  A minor complaint is that resetting the network card causes
   user-space back-to-back ifdown/ifup burps that potentially disturb
   network daemons that didn't even need to know that the PCI
   card was being rebooted.

-  A more serious concern is that the same reset, for SCSI devices,
   causes havoc to mounted file systems.  Scripts cannot post-facto
   unmount a file system without flushing pending buffers, but this
   is impossible, because I/O has already been stopped.  Thus,
   ideally, the reset should happen at or below the block layer,
   so that the file systems are not disturbed.

   Reiserfs does not tolerate errors returned from the block device.
   Ext3fs seems to be tolerant, retrying reads/writes until it does
   succeed. Both have been only lightly tested in this scenario.

   The SCSI-generic subsystem already has built-in code for performing
   SCSI device resets, SCSI bus resets, and SCSI host-bus-adapter
   (HBA) resets.  These are cascaded into a chain of attempted
   resets if a SCSI command fails. These are completely hidden
   from the block layer.  It would be very natural to add an EEH
   reset into this chain of events.

-  If a SCSI error occurs for the root device, all is lost unless
   the sysadmin had the foresight to run /bin, /sbin, /etc, /var
   and so on, out of ramdisk/tmpfs.


Conclusions
-----------
There's forward progress ...