=================
Scheduler debugfs
=================

Booting a kernel with CONFIG_SCHED_DEBUG=y gives access to
scheduler-specific debug files under /sys/kernel/debug/sched. Some of
those files are described below.

numa_balancing
==============

The ``numa_balancing`` directory holds the files that control the NUMA
balancing feature.  If the system overhead from the feature is too
high, the rate at which the kernel samples for NUMA hinting faults can
be controlled through the ``scan_period_min_ms``, ``scan_delay_ms``,
``scan_period_max_ms`` and ``scan_size_mb`` files.


scan_period_min_ms, scan_delay_ms, scan_period_max_ms, scan_size_mb
-------------------------------------------------------------------

Automatic NUMA balancing scans a task's address space and unmaps pages to
detect whether pages are properly placed or whether the data should be
migrated to a memory node local to where the task is running.  Every "scan
delay" the task scans the next "scan size" number of pages in its address
space. When the end of the address space is reached, the scanner restarts
from the beginning.
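The wrap-around behaviour described above can be sketched as a small
simulation. This is only an illustration under simplifying assumptions: the
address space is modelled as a flat range of page indices, and the function
name ``scan_windows`` and its parameters are hypothetical, not kernel
interfaces.

```python
def scan_windows(address_space_pages, scan_size_pages, num_scans):
    """Yield the (start, end) page range covered by each successive scan.

    Hypothetical model: pages are indexed 0..address_space_pages-1, each
    scan covers at most scan_size_pages, and the cursor wraps to 0 once
    the end of the address space has been reached.
    """
    if address_space_pages <= 0 or scan_size_pages <= 0:
        raise ValueError("sizes must be positive")
    cursor = 0
    for _ in range(num_scans):
        end = min(cursor + scan_size_pages, address_space_pages)
        yield (cursor, end)
        # Restart from the beginning when the end has been reached.
        cursor = 0 if end == address_space_pages else end


if __name__ == "__main__":
    # Ten pages scanned four at a time: the third scan is cut short at the
    # end of the address space and the fourth starts again from page 0.
    print(list(scan_windows(10, 4, 4)))
```

In this model one "scan delay" elapses between consecutive windows, so a
complete pass over the address space takes as many delays as there are
windows before the cursor wraps.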

In combination, the "scan delay" and "scan size" determine the scan rate.
When "scan delay" decreases, the scan rate increases.  The scan delay, and
hence the scan rate, of every task is adaptive and depends on historical
behaviour. If pages are properly placed then the scan delay increases,
otherwise the scan delay decreases.  The "scan size" is not adaptive, but
the higher the "scan size", the higher the scan rate.
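As a rough worked example, the relationship can be read as "scan size" worth
of memory examined once per "scan delay". The helper names and the numbers
below are illustrative assumptions, not kernel defaults, and the model
ignores the adaptive adjustment of the delay.

```python
def scan_rate_mb_per_s(scan_size_mb, scan_delay_ms):
    """Approximate scan rate, assuming one scan of scan_size_mb per delay."""
    if scan_delay_ms <= 0:
        raise ValueError("scan_delay_ms must be positive")
    return scan_size_mb * 1000.0 / scan_delay_ms


def full_pass_seconds(address_space_mb, scan_size_mb, scan_delay_ms):
    """Approximate time to walk the whole address space once."""
    return address_space_mb / scan_rate_mb_per_s(scan_size_mb, scan_delay_ms)


if __name__ == "__main__":
    # Illustrative values only.
    print(scan_rate_mb_per_s(256, 1000))       # baseline
    print(scan_rate_mb_per_s(256, 500))        # halving the delay
    print(scan_rate_mb_per_s(512, 1000))       # doubling the size
    print(full_pass_seconds(4096, 256, 1000))  # one pass over 4096 MB
```

Under this model, halving the delay and doubling the size have the same
effect on the rate; the difference is that only the delay is adjusted
automatically.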

Higher scan rates incur higher system overhead as page faults must be
trapped and data may potentially have to be migrated. However, the higher
the scan rate, the more quickly a task's memory is migrated to a local node
if the workload pattern changes, which minimises the performance impact of
remote memory accesses. These files control the thresholds for scan delays
and the number of pages scanned.

``scan_period_min_ms`` is the minimum time in milliseconds to scan a
task's virtual memory. It effectively controls the maximum scanning
rate for each task.

``scan_delay_ms`` is the starting "scan delay" used for a task when it
initially forks.

``scan_period_max_ms`` is the maximum time in milliseconds to scan a
task's virtual memory. It effectively controls the minimum scanning
rate for each task.

``scan_size_mb`` is the number of megabytes' worth of pages scanned in
a given scan.
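A minimal sketch of inspecting and changing these files from user space
follows. The directory path is assembled from the locations named in this
document; the helper names are hypothetical, and the sketch assumes that
debugfs is mounted, that the caller has sufficient privileges, and that each
file holds a single decimal integer.

```python
from pathlib import Path

# Path built from the text of this document; adjust if debugfs is mounted
# elsewhere on the system.
NUMA_BALANCING_DIR = Path("/sys/kernel/debug/sched/numa_balancing")

TUNABLES = (
    "scan_period_min_ms",
    "scan_delay_ms",
    "scan_period_max_ms",
    "scan_size_mb",
)


def read_tunables(base_dir=NUMA_BALANCING_DIR):
    """Return {name: int or None}; None means missing or unreadable."""
    values = {}
    for name in TUNABLES:
        try:
            values[name] = int((Path(base_dir) / name).read_text().strip())
        except (OSError, ValueError):
            values[name] = None
    return values


def write_tunable(name, value, base_dir=NUMA_BALANCING_DIR):
    """Write an integer value to one of the files listed in TUNABLES."""
    if name not in TUNABLES:
        raise ValueError(f"unknown tunable: {name}")
    (Path(base_dir) / name).write_text(f"{int(value)}\n")


if __name__ == "__main__":
    for name, value in read_tunables().items():
        print(f"{name}: {value}")
```

The ``base_dir`` parameter exists so that the helpers can be exercised
against an ordinary directory without touching a running system.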