- Documentation for /proc/sys/kernel/* kernel version 2.2.10
- (c) 1998, 1999, Rik van Riel <[email protected]>
- (c) 2009, Shen Feng<[email protected]>
- For general info and legal blurb, please look in README.
- ==============================================================
- This file contains documentation for the sysctl files in
- /proc/sys/kernel/ and is valid for Linux kernel version 2.2.
- The files in this directory can be used to tune and monitor
- miscellaneous and general things in the operation of the Linux
- kernel. Since some of the files _can_ be used to screw up your
- system, it is advisable to read both documentation and source
- before actually making adjustments.
- Currently, these files might (depending on your configuration)
- show up in /proc/sys/kernel:
- - acct
- - acpi_video_flags
- - auto_msgmni
- - bootloader_type [ X86 only ]
- - bootloader_version [ X86 only ]
- - boot_reason [ ARM and ARM64 only ]
- - callhome [ S390 only ]
- - cap_last_cap
- - cold_boot [ ARM and ARM64 only ]
- - core_pattern
- - core_pipe_limit
- - core_uses_pid
- - ctrl-alt-del
- - dmesg_restrict
- - domainname
- - hostname
- - hotplug
- - hardlockup_all_cpu_backtrace
- - hung_task_panic
- - hung_task_check_count
- - hung_task_timeout_secs
- - hung_task_warnings
- - kexec_load_disabled
- - kptr_restrict
- - kstack_depth_to_print [ X86 only ]
- - l2cr [ PPC only ]
- - modprobe ==> Documentation/debugging-modules.txt
- - modules_disabled
- - msg_next_id [ sysv ipc ]
- - msgmax
- - msgmnb
- - msgmni
- - nmi_watchdog
- - osrelease
- - ostype
- - overflowgid
- - overflowuid
- - panic
- - panic_on_oops
- - panic_on_stackoverflow
- - panic_on_unrecovered_nmi
- - panic_on_warn
- - panic_on_rcu_stall
- - perf_cpu_time_max_percent
- - perf_event_paranoid
- - perf_event_max_stack
- - perf_event_max_contexts_per_stack
- - pid_max
- - powersave-nap [ PPC only ]
- - printk
- - printk_delay
- - printk_ratelimit
- - printk_ratelimit_burst
- - pty ==> Documentation/filesystems/devpts.txt
- - randomize_va_space
- - real-root-dev ==> Documentation/initrd.txt
- - reboot-cmd [ SPARC only ]
- - rtsig-max
- - rtsig-nr
- - sem
- - sem_next_id [ sysv ipc ]
- - sg-big-buff [ generic SCSI device (sg) ]
- - shm_next_id [ sysv ipc ]
- - shm_rmid_forced
- - shmall
- - shmmax [ sysv ipc ]
- - shmmni
- - softlockup_all_cpu_backtrace
- - soft_watchdog
- - stop-a [ SPARC only ]
- - sysrq ==> Documentation/sysrq.txt
- - sysctl_writes_strict
- - tainted
- - threads-max
- - unknown_nmi_panic
- - watchdog
- - watchdog_thresh
- - version
- ==============================================================
- acct:
- highwater lowwater frequency
- If BSD-style process accounting is enabled, these values control
- its behaviour. If free space on the filesystem where the log lives
- goes below <lowwater>%, accounting suspends. If free space gets
- above <highwater>%, accounting resumes. <frequency> determines
- how often the amount of free space is checked (value is in
- seconds). Default:
- 4 2 30
- That is, suspend accounting if <= 2% of the space is free; resume it
- once >= 4% is free; consider information about the amount of free
- space valid for 30 seconds.
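- The suspend/resume rule above can be sketched as a small state function.
- This is only an illustration of the thresholds as described here, not the
- kernel's code; the names parse_acct and acct_next_state are hypothetical.

```python
def parse_acct(text):
    """Split the three fields of the acct file: highwater, lowwater, frequency."""
    highwater, lowwater, frequency = (int(field) for field in text.split())
    return highwater, lowwater, frequency


def acct_next_state(active, free_pct, highwater=4, lowwater=2):
    """Return True if accounting is active after one free-space check.

    Assumption: the comparisons follow the "<= 2%" / ">= 4%" wording
    of the default example in the text.
    """
    if active and free_pct <= lowwater:
        return False  # suspend accounting
    if not active and free_pct >= highwater:
        return True   # resume accounting
    return active     # between the marks: keep the current state
```

- The gap between the two marks gives hysteresis: at 3% free, accounting
- stays in whatever state it was already in.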
- ==============================================================
- acpi_video_flags:
- flags
- See Doc*/kernel/power/video.txt. This file allows the video boot
- mode to be set at run time.
- ==============================================================
- auto_msgmni:
- This variable has no effect and may be removed in future kernel
- releases. Reading it always returns 0.
- Up to Linux 3.17, it enabled/disabled automatic recomputing of msgmni
- upon memory add/remove or upon ipc namespace creation/removal.
- Echoing "1" into this file enabled msgmni automatic recomputing.
- Echoing "0" turned it off. auto_msgmni default value was 1.
- ==============================================================
- bootloader_type:
- x86 bootloader identification
- This gives the bootloader type number as indicated by the bootloader,
- shifted left by 4, and OR'd with the low four bits of the bootloader
- version. The reason for this encoding is that this used to match the
- type_of_loader field in the kernel header; the encoding is kept for
- backwards compatibility. That is, if the full bootloader type number
- is 0x15 and the full version number is 0x234, this file will contain
- the value 340 = 0x154.
- See the type_of_loader and ext_loader_type fields in
- Documentation/x86/boot.txt for additional information.
- ==============================================================
- bootloader_version:
- x86 bootloader version
- The complete bootloader version number. In the example above, this
- file will contain the value 564 = 0x234.
- See the type_of_loader and ext_loader_ver fields in
- Documentation/x86/boot.txt for additional information.
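- The encoding used by bootloader_type can be checked with a few lines of
- arithmetic. This is a minimal sketch of the shift-and-OR described above;
- the function names are hypothetical.

```python
def encode_bootloader_type(loader_type, loader_version):
    """Type shifted left by 4, OR'd with the low four bits of the version."""
    return (loader_type << 4) | (loader_version & 0xF)


def decode_bootloader_type(value):
    """Recover the type number and the low four version bits."""
    return value >> 4, value & 0xF


# The worked example from the text: type 0x15, version 0x234.
encoded = encode_bootloader_type(0x15, 0x234)
```

- Only the low four bits of the version survive this encoding, which is why
- the complete version number is exported separately in bootloader_version.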
- ==============================================================
- boot_reason:
- ARM and ARM64 -- reason for device boot
- A single bit will be set in the unsigned integer value to identify the
- reason the device was booted / powered on. The value will be zero if this
- feature is not supported on the ARM device being booted.
- See the power-on-status field definitions in
- Documentation/arm/msm/boot.txt for Qualcomm's family of devices.
- ==============================================================
- callhome:
- Controls the kernel's callhome behavior in case of a kernel panic.
- The s390 hardware allows an operating system to send a notification
- to a service organization (callhome) in case of an operating system panic.
- When the value in this file is 0 (which is the default behavior)
- nothing happens in case of a kernel panic. If this value is set to "1"
- the complete kernel oops message is sent to the IBM customer service
- organization, provided the mainframe the Linux operating system is
- running on has a service contract with IBM.
- ==============================================================
- cap_last_cap:
- Highest valid capability of the running kernel. Exports
- CAP_LAST_CAP from the kernel.
- ==============================================================
- cold_boot:
- ARM and ARM64 -- indicator for system cold boot
- A single bit will be set in the unsigned integer value to identify
- whether the device was booted from a cold or warm state. Zero
- indicates a warm boot and one indicates a cold boot.
- ==============================================================
- core_pattern:
- core_pattern is used to specify a core dumpfile pattern name.
- . max length 128 characters; default value is "core"
- . core_pattern is used as a pattern template for the output filename;
- certain string patterns (beginning with '%') are substituted with
- their actual values.
- . backward compatibility with core_uses_pid:
- If core_pattern does not include "%p" (default does not)
- and core_uses_pid is set, then .PID will be appended to
- the filename.
- . corename format specifiers:
- %<NUL> '%' is dropped
- %% output one '%'
- %p pid
- %P global pid (init PID namespace)
- %i tid
- %I global tid (init PID namespace)
- %u uid (in initial user namespace)
- %g gid (in initial user namespace)
- %d dump mode, matches PR_SET_DUMPABLE and
- /proc/sys/fs/suid_dumpable
- %s signal number
- %t UNIX time of dump
- %h hostname
- %e executable filename (may be shortened)
- %E executable path
- %<OTHER> both are dropped
- . If the first character of the pattern is a '|', the kernel will treat
- the rest of the pattern as a command to run. The core dump will be
- written to the standard input of that program instead of to a file.
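- The substitution rules above can be sketched in user space. The function
- below is a hypothetical model of the template expansion for file name
- patterns only (it does not handle the '|' pipe case), and the values it
- substitutes are supplied by the caller rather than taken from a real
- crashing process.

```python
# Specifiers listed in the table above.
SPECIFIERS = set("pPiIugdsthEe")


def expand_core_pattern(pattern, values, core_uses_pid=False):
    """Expand a core_pattern template using a dict of specifier values."""
    out = []
    pid_in_pattern = False
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch != "%":
            out.append(ch)
            i += 1
            continue
        if i + 1 == len(pattern):
            break  # %<NUL>: a trailing '%' is dropped
        spec = pattern[i + 1]
        if spec == "%":
            out.append("%")  # %% outputs one '%'
        elif spec in SPECIFIERS:
            out.append(str(values.get(spec, "")))
            if spec == "p":
                pid_in_pattern = True
        # %<OTHER>: both characters are dropped
        i += 2
    # Backward compatibility with core_uses_pid.
    if core_uses_pid and not pid_in_pattern:
        out.append("." + str(values["p"]))
    return "".join(out)


# Hypothetical sample values for illustration.
sample = {"p": 1234, "e": "myprog", "s": 11, "h": "darkstar"}
```

- With these sample values, the default pattern "core" stays "core", and
- becomes "core.1234" once core_uses_pid is set.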
- ==============================================================
- core_pipe_limit:
- This sysctl is only applicable when core_pattern is configured to pipe
- core files to a user space helper (when the first character of
- core_pattern is a '|', see above). When collecting cores via a pipe
- to an application, it is occasionally useful for the collecting
- application to gather data about the crashing process from its
- /proc/pid directory. In order to do this safely, the kernel must wait
- for the collecting process to exit, so as not to remove the crashing
- process's proc files prematurely. This in turn creates the
- possibility that a misbehaving userspace collecting process can block
- the reaping of a crashed process simply by never exiting. This sysctl
- defends against that. It defines how many concurrent crashing
- processes may be piped to user space applications in parallel. If
- this value is exceeded, then those crashing processes above that value
- are noted via the kernel log and their cores are skipped. 0 is a
- special value, indicating that unlimited processes may be captured in
- parallel, but that no waiting will take place (i.e. the collecting
- process is not guaranteed access to /proc/<crashing pid>/). This
- value defaults to 0.
- ==============================================================
- core_uses_pid:
- The default coredump filename is "core". By setting
- core_uses_pid to 1, the coredump filename becomes core.PID.
- If core_pattern does not include "%p" (default does not)
- and core_uses_pid is set, then .PID will be appended to
- the filename.
- ==============================================================
- ctrl-alt-del:
- When the value in this file is 0, ctrl-alt-del is trapped and
- sent to the init(1) program to handle a graceful restart.
- When, however, the value is > 0, Linux's reaction to a Vulcan
- Nerve Pinch (tm) will be an immediate reboot, without even
- syncing its dirty buffers.
- Note: when a program (like dosemu) has the keyboard in 'raw'
- mode, the ctrl-alt-del is intercepted by the program before it
- ever reaches the kernel tty layer, and it's up to the program
- to decide what to do with it.
- ==============================================================
- dmesg_restrict:
- This toggle indicates whether unprivileged users are prevented
- from using dmesg(8) to view messages from the kernel's log buffer.
- When dmesg_restrict is set to (0) there are no restrictions. When
- dmesg_restrict is set to (1), users must have CAP_SYSLOG to use
- dmesg(8).
- The kernel config option CONFIG_SECURITY_DMESG_RESTRICT sets the
- default value of dmesg_restrict.
- ==============================================================
- domainname & hostname:
- These files can be used to set the NIS/YP domainname and the
- hostname of your box in exactly the same way as the commands
- domainname and hostname, i.e.:
- # echo "darkstar" > /proc/sys/kernel/hostname
- # echo "mydomain" > /proc/sys/kernel/domainname
- has the same effect as
- # hostname "darkstar"
- # domainname "mydomain"
- Note, however, that the classic darkstar.frop.org has the
- hostname "darkstar" and DNS (Internet Domain Name Server)
- domainname "frop.org", not to be confused with the NIS (Network
- Information Service) or YP (Yellow Pages) domainname. These two
- domain names are in general different. For a detailed discussion
- see the hostname(1) man page.
- ==============================================================
- hardlockup_all_cpu_backtrace:
- This value controls the hard lockup detector behavior when a hard
- lockup condition is detected as to whether or not to gather further
- debug information. If enabled, arch-specific all-CPU stack dumping
- will be initiated.
- 0: do nothing. This is the default behavior.
- 1: on detection capture more debug information.
- ==============================================================
- hotplug:
- Path for the hotplug policy agent.
- Default value is "/sbin/hotplug".
- ==============================================================
- hung_task_panic:
- Controls the kernel's behavior when a hung task is detected.
- This file shows up if CONFIG_DETECT_HUNG_TASK is enabled.
- 0: continue operation. This is the default behavior.
- 1: panic immediately.
- ==============================================================
- hung_task_check_count:
- The upper bound on the number of tasks that are checked.
- This file shows up if CONFIG_DETECT_HUNG_TASK is enabled.
- ==============================================================
- hung_task_timeout_secs:
- Check interval. When a task in D state has not been scheduled
- for more than this many seconds, a warning is reported.
- This file shows up if CONFIG_DETECT_HUNG_TASK is enabled.
- 0: means infinite timeout - no checking done.
- Possible values to set are in range {0..LONG_MAX/HZ}.
- ==============================================================
- hung_task_warnings:
- The maximum number of warnings to report. During a check interval
- if a hung task is detected, this value is decreased by 1.
- When this value reaches 0, no more warnings will be reported.
- This file shows up if CONFIG_DETECT_HUNG_TASK is enabled.
- -1: report an infinite number of warnings.
- ==============================================================
- kexec_load_disabled:
- A toggle indicating if the kexec_load syscall has been disabled. This
- value defaults to 0 (false: kexec_load enabled), but can be set to 1
- (true: kexec_load disabled). Once true, kexec can no longer be used, and
- the toggle cannot be set back to false. This allows a kexec image to be
- loaded before disabling the syscall, allowing a system to set up (and
- later use) an image without it being altered. Generally used together
- with the "modules_disabled" sysctl.
- ==============================================================
- kptr_restrict:
- This toggle indicates whether restrictions are placed on
- exposing kernel addresses via /proc and other interfaces.
- When kptr_restrict is set to (0), the default, there are no restrictions.
- When kptr_restrict is set to (1), kernel pointers printed using the %pK
- format specifier will be replaced with 0's unless the user has CAP_SYSLOG
- and effective user and group ids are equal to the real ids. This is
- because %pK checks are done at read() time rather than open() time, so
- if permissions are elevated between the open() and the read() (e.g. via
- a setuid binary) then %pK will not leak kernel pointers to unprivileged
- users. Note, this is a temporary solution only. The correct long-term
- solution is to do the permission checks at open() time. Consider removing
- world read permissions from files that use %pK, and using dmesg_restrict
- to protect against uses of %pK in dmesg(8) if leaking kernel pointer
- values to unprivileged users is a concern.
- When kptr_restrict is set to (2), kernel pointers printed using
- %pK will be replaced with 0's regardless of privileges.
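- The three settings can be summarised as a decision function. This is a
- hypothetical model of the rules as worded above, not the kernel's
- implementation; the function and parameter names are illustrative.

```python
def pk_shows_address(kptr_restrict, has_cap_syslog, uid, euid, gid, egid):
    """Return True if a %pK pointer would be shown rather than zeroed."""
    if kptr_restrict == 0:
        return True  # no restrictions
    if kptr_restrict == 1:
        # CAP_SYSLOG is required, and the effective ids must equal
        # the real ids (checked at read() time).
        return has_cap_syslog and uid == euid and gid == egid
    return False     # 2: zeroed regardless of privileges
```

- The id comparison is what stops a setuid binary from reading pointers
- through a file descriptor opened by an unprivileged caller.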
- ==============================================================
- kstack_depth_to_print: (X86 only)
- Controls the number of words to print when dumping the raw
- kernel stack.
- ==============================================================
- l2cr: (PPC only)
- This flag controls the L2 cache of G3 processor boards. If
- 0, the cache is disabled. Enabled if nonzero.
- ==============================================================
- modules_disabled:
- A toggle value indicating if modules are allowed to be loaded
- in an otherwise modular kernel. This toggle defaults to off
- (0), but can be set true (1). Once true, modules can be
- neither loaded nor unloaded, and the toggle cannot be set back
- to false. Generally used with the "kexec_load_disabled" toggle.
- ==============================================================
- msg_next_id, sem_next_id, and shm_next_id:
- These three toggles allow the desired id to be specified for the next
- allocated IPC object: message, semaphore or shared memory respectively.
- By default they are equal to -1, which means generic allocation logic.
- Possible values to set are in range {0..INT_MAX}.
- Notes:
- 1) The kernel doesn't guarantee that the new object will have the
- desired id. So it's up to userspace how to handle an object with a
- "wrong" id.
- 2) A toggle with a non-default value will be set back to -1 by the
- kernel after successful IPC object allocation.
- ==============================================================
- nmi_watchdog:
- This parameter can be used to control the NMI watchdog
- (i.e. the hard lockup detector) on x86 systems.
- 0 - disable the hard lockup detector
- 1 - enable the hard lockup detector
- The hard lockup detector monitors each CPU for its ability to respond to
- timer interrupts. The mechanism utilizes CPU performance counter registers
- that are programmed to generate Non-Maskable Interrupts (NMIs) periodically
- while a CPU is busy. Hence, the alternative name 'NMI watchdog'.
- The NMI watchdog is disabled by default if the kernel is running as a guest
- in a KVM virtual machine. This default can be overridden by adding
- nmi_watchdog=1
- to the guest kernel command line (see Documentation/kernel-parameters.txt).
- ==============================================================
- numa_balancing:
- Enables/disables automatic page fault based NUMA memory
- balancing. Memory is moved automatically to nodes
- that access it often.
- On NUMA machines, there is a performance penalty if remote memory is
- accessed by a CPU. When this feature is enabled, the kernel samples
- which task thread is accessing memory
- by periodically unmapping pages and later trapping a page fault. At the
- time of the page fault, it is determined if the data being accessed should
- be migrated to a local memory node.
- The unmapping of pages and trapping faults incur additional overhead that
- ideally is offset by improved memory locality but there is no universal
- guarantee. If the target workload is already bound to NUMA nodes then this
- feature should be disabled. Otherwise, if the system overhead from the
- feature is too high then the rate the kernel samples for NUMA hinting
- faults may be controlled by the numa_balancing_scan_period_min_ms,
- numa_balancing_scan_delay_ms, numa_balancing_scan_period_max_ms,
- numa_balancing_scan_size_mb, and numa_balancing_settle_count sysctls.
- ==============================================================
- numa_balancing_scan_period_min_ms, numa_balancing_scan_delay_ms,
- numa_balancing_scan_period_max_ms, numa_balancing_scan_size_mb:
- Automatic NUMA balancing scans a task's address space and unmaps pages to
- detect if pages are properly placed or if the data should be migrated to a
- memory node local to where the task is running. Every "scan delay" the task
- scans the next "scan size" number of pages in its address space. When the
- end of the address space is reached the scanner restarts from the beginning.
- In combination, the "scan delay" and "scan size" determine the scan rate.
- When "scan delay" decreases, the scan rate increases. The scan delay and
- hence the scan rate of every task is adaptive and depends on historical
- behaviour. If pages are properly placed then the scan delay increases,
- otherwise the scan delay decreases. The "scan size" is not adaptive but
- the higher the "scan size", the higher the scan rate.
- Higher scan rates incur higher system overhead as page faults must be
- trapped and potentially data must be migrated. However, the higher the scan
- rate, the more quickly a task's memory is migrated to a local node if the
- workload pattern changes, which minimises the performance impact of remote
- memory accesses. These sysctls control the thresholds for scan delays and
- the number of pages scanned.
- numa_balancing_scan_period_min_ms is the minimum time in milliseconds to
- scan a task's virtual memory. It effectively controls the maximum scanning
- rate for each task.
- numa_balancing_scan_delay_ms is the starting "scan delay" used for a task
- when it initially forks.
- numa_balancing_scan_period_max_ms is the maximum time in milliseconds to
- scan a task's virtual memory. It effectively controls the minimum scanning
- rate for each task.
- numa_balancing_scan_size_mb is how many megabytes worth of pages are
- scanned for a given scan.
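- As a rough illustration of how these values bound the scan rate, the
- arithmetic can be written out as below. The helper names and the sample
- numbers are assumptions made for the example, not kernel defaults or
- kernel code.

```python
def clamp_scan_period(period_ms, min_ms, max_ms):
    """Keep an adaptive scan period within the configured bounds."""
    return max(min_ms, min(period_ms, max_ms))


def scan_rate_mb_per_s(scan_size_mb, scan_period_ms):
    """Scan rate implied by one scan of scan_size_mb every scan_period_ms."""
    return scan_size_mb * 1000.0 / scan_period_ms


# Illustrative settings: 256 MB per scan, period bounded to 1 s .. 60 s.
fastest = scan_rate_mb_per_s(256, clamp_scan_period(50, 1000, 60000))
slowest = scan_rate_mb_per_s(256, clamp_scan_period(10**6, 1000, 60000))
```

- With these sample settings, a shorter minimum period or a larger scan
- size both raise the upper bound on the scan rate.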
- ==============================================================
- osrelease, ostype & version:
- # cat osrelease
- 2.1.88
- # cat ostype
- Linux
- # cat version
- #5 Wed Feb 25 21:49:24 MET 1998
- The files osrelease and ostype should be clear enough. Version
- needs a little more clarification however. The '#5' means that
- this is the fifth kernel built from this source base and the
- date behind it indicates the time the kernel was built.
- The only way to tune these values is to rebuild the kernel :-)
- ==============================================================
- overflowgid & overflowuid:
- If your architecture did not always support 32-bit UIDs (i.e. arm,
- i386, m68k, sh, and sparc32), a fixed UID and GID will be returned to
- applications that use the old 16-bit UID/GID system calls, if the
- actual UID or GID would exceed 65535.
- These sysctls allow you to change the value of the fixed UID and GID.
- The default is 65534.
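- The effect on the old 16-bit system calls can be sketched as follows.
- The function name is hypothetical; it only models the substitution
- described above.

```python
def to_16bit_id(real_id, overflow_id=65534):
    """Value an old 16-bit UID/GID call would report for real_id."""
    # Ids that fit in 16 bits pass through; larger ones are replaced
    # by the fixed overflow value.
    return real_id if real_id <= 65535 else overflow_id
```

- For example, a process with UID 70000 is reported as 65534 to such a
- caller unless overflowuid has been changed.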
- ==============================================================
- panic:
- The value in this file represents the number of seconds the kernel
- waits before rebooting on a panic. When you use the software watchdog,
- the recommended setting is 60.
- ==============================================================
- panic_on_io_nmi:
- Controls the kernel's behavior when a CPU receives an NMI caused by
- an IO error.
- 0: try to continue operation (default)
- 1: panic immediately. The IO error triggered an NMI. This indicates a
- serious system condition which could result in IO data corruption.
- Rather than continuing, panicking might be a better choice. Some
- servers issue this sort of NMI when the dump button is pushed,
- and you can use this option to take a crash dump.
- ==============================================================
- panic_on_oops:
- Controls the kernel's behaviour when an oops or BUG is encountered.
- 0: try to continue operation
- 1: panic immediately. If the `panic' sysctl is also non-zero then the
- machine will be rebooted.
- ==============================================================
- panic_on_stackoverflow:
- Controls the kernel's behavior when it detects an overflow of the
- kernel, IRQ or exception stacks (user stacks are not covered).
- This file shows up if CONFIG_DEBUG_STACKOVERFLOW is enabled.
- 0: try to continue operation.
- 1: panic immediately.
- ==============================================================
- panic_on_unrecovered_nmi:
- The default Linux behaviour on an NMI caused by either a memory error
- or an unknown source is to continue operation. For many environments
- such as scientific computing it is preferable that the box is taken out
- and the error dealt with than that an uncorrected parity/ECC error gets
- propagated.
- A small number of systems do generate NMIs for bizarre random reasons
- such as power management, so the default is off. This sysctl works like
- the existing panic controls already in that directory.
- ==============================================================
- panic_on_warn:
- Calls panic() in the WARN() path when set to 1. This is useful to avoid
- a kernel rebuild when attempting to kdump at the location of a WARN().
- 0: only WARN(), default behaviour.
- 1: call panic() after printing out WARN() location.
- ==============================================================
- panic_on_rcu_stall:
- When set to 1, calls panic() after RCU stall detection messages. This
- is useful to define the root cause of RCU stalls using a vmcore.
- 0: do not panic() when RCU stall takes place, default behavior.
- 1: panic() after printing RCU stall messages.
- ==============================================================
- perf_cpu_time_max_percent:
- Hints to the kernel how much CPU time it should be allowed to
- use to handle perf sampling events. If the perf subsystem
- is informed that its samples are exceeding this limit, it
- will drop its sampling frequency to attempt to reduce its CPU
- usage.
- Some perf sampling happens in NMIs. If these samples
- unexpectedly take too long to execute, the NMIs can become
- stacked up next to each other so much that nothing else is
- allowed to execute.
- 0: disable the mechanism. Do not monitor or correct perf's
- sampling rate no matter how much CPU time it takes.
- 1-100: attempt to throttle perf's sample rate to this
- percentage of CPU. Note: the kernel calculates an
- "expected" length of each sample event. 100 here means
- 100% of that expected length. Even if this is set to
- 100, you may still see sample throttling if this
- length is exceeded. Set to 0 if you truly do not care
- how much CPU is consumed.
- ==============================================================
- perf_event_paranoid:
- Controls use of the performance events system by unprivileged
- users (without CAP_SYS_ADMIN). The default value is 3 if
- CONFIG_SECURITY_PERF_EVENTS_RESTRICT is set, or 2 otherwise.
- -1: Allow use of (almost) all events by all users
- >=0: Disallow raw tracepoint access by users without CAP_IPC_LOCK
- >=1: Disallow CPU event access by users without CAP_SYS_ADMIN
- >=2: Disallow kernel profiling by users without CAP_SYS_ADMIN
- >=3: Disallow all event access by users without CAP_SYS_ADMIN
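- The levels are cumulative: each value keeps the restrictions of every
- lower value. A minimal sketch of that reading of the table, with a
- hypothetical helper name:

```python
# (threshold, what is disallowed for users lacking the capability
# named in the table above)
RESTRICTIONS = [
    (0, "raw tracepoint access"),
    (1, "CPU event access"),
    (2, "kernel profiling"),
    (3, "all event access"),
]


def restricted_for_unprivileged(level):
    """List what a given perf_event_paranoid level disallows."""
    return [what for threshold, what in RESTRICTIONS if level >= threshold]
```

- A level of -1 yields an empty list, matching "allow use of (almost) all
- events by all users".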
- ==============================================================
- perf_event_max_stack:
- Controls maximum number of stack frames to copy for (attr.sample_type &
- PERF_SAMPLE_CALLCHAIN) configured events, for instance, when using
- 'perf record -g' or 'perf trace --call-graph fp'.
- This can only be done when no events are in use that have callchains
- enabled, otherwise writing to this file will return -EBUSY.
- The default value is 127.
- ==============================================================
- perf_event_max_contexts_per_stack:
- Controls maximum number of stack frame context entries for
- (attr.sample_type & PERF_SAMPLE_CALLCHAIN) configured events, for
- instance, when using 'perf record -g' or 'perf trace --call-graph fp'.
- This can only be done when no events are in use that have callchains
- enabled, otherwise writing to this file will return -EBUSY.
- The default value is 8.
- ==============================================================
- pid_max:
- PID allocation wrap value. When the kernel's next PID value
- reaches this value, it wraps back to a minimum PID value.
- PIDs of value pid_max or larger are not allocated.
- ==============================================================
- ns_last_pid:
- The last pid allocated in the current pid namespace (the one the task
- using this sysctl lives in). When selecting a pid for the next task on
- fork, the kernel tries to allocate a number starting from this one.
- ==============================================================
- powersave-nap: (PPC only)
- If set, Linux-PPC will use the 'nap' mode of powersaving,
- otherwise the 'doze' mode will be used.
- ==============================================================
- printk:
- The four values in printk denote: console_loglevel,
- default_message_loglevel, minimum_console_loglevel and
- default_console_loglevel respectively.
- These values influence printk() behavior when printing or
- logging error messages. See 'man 2 syslog' for more info on
- the different loglevels.
- - console_loglevel: messages with a higher priority than
- this will be printed to the console
- - default_message_loglevel: messages without an explicit priority
- will be printed with this priority
- - minimum_console_loglevel: minimum (highest) value to which
- console_loglevel can be set
- - default_console_loglevel: default value for console_loglevel
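- A short sketch of how the four fields might be read and applied. The
- names below and the sample contents are assumptions for illustration;
- lower numbers mean higher priority, as in syslog(2).

```python
from collections import namedtuple

PrintkLevels = namedtuple(
    "PrintkLevels",
    "console default_message minimum_console default_console",
)


def parse_printk(text):
    """Parse the four whitespace-separated values of the printk file."""
    return PrintkLevels(*(int(field) for field in text.split()))


def reaches_console(levels, msg_level=None):
    """True if a message of msg_level would be printed to the console."""
    if msg_level is None:
        # Messages without an explicit priority get the default one.
        msg_level = levels.default_message
    # Higher priority than console_loglevel means a smaller number.
    return msg_level < levels.console


# Hypothetical file contents.
levels = parse_printk("4\t4\t1\t7")
```

- With these sample values a level-3 message is printed to the console,
- while a message without an explicit priority (level 4) is not.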
- ==============================================================
- printk_delay:
- Delay each printk message by printk_delay milliseconds.
- Values from 0 to 10000 are allowed.
- ==============================================================
- printk_ratelimit:
- Some warning messages are rate limited. printk_ratelimit specifies
- the minimum length of time between these messages (in seconds); by
- default we allow one every 5 seconds.
- A value of 0 will disable rate limiting.
- ==============================================================
- printk_ratelimit_burst:
- While long term we enforce one message per printk_ratelimit
- seconds, we do allow a burst of messages to pass through.
- printk_ratelimit_burst specifies the number of messages we can
- send before ratelimiting kicks in.
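- How the interval and the burst interact can be modelled with a simple
- windowed counter. This is a simplified sketch, not the kernel's
- rate-limiting code; the class name is hypothetical and the burst value
- used below is a sample, not a documented default.

```python
class RateLimiter:
    """Allow up to `burst` messages in each window of `interval` seconds."""

    def __init__(self, interval, burst):
        self.interval = interval
        self.burst = burst
        self.window_start = None
        self.printed = 0

    def allow(self, now):
        """Return True if a message arriving at time `now` may be printed."""
        if self.interval == 0:
            return True  # a value of 0 disables rate limiting
        if self.window_start is None or now - self.window_start >= self.interval:
            # Start a new window and reset the burst budget.
            self.window_start = now
            self.printed = 0
        if self.printed < self.burst:
            self.printed += 1
            return True
        return False


limiter = RateLimiter(interval=5, burst=3)
decisions = [limiter.allow(t) for t in (0, 0, 1, 1, 2)]
```

- In this model the first three messages pass and the next two are
- suppressed until the 5-second window has elapsed.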
- ==============================================================
- printk_devkmsg:
- Control the logging to /dev/kmsg from userspace:
- ratelimit: default, ratelimited
- on: unlimited logging to /dev/kmsg from userspace
- off: logging to /dev/kmsg disabled
- The kernel command line parameter printk.devkmsg= overrides this and is
- a one-time setting until next reboot: once set, it cannot be changed by
- this sysctl interface anymore.
- ==============================================================
- randomize_va_space:
- This option can be used to select the type of process address
- space randomization that is used in the system, for architectures
- that support this feature.
- 0 - Turn the process address space randomization off. This is the
- default for architectures that do not support this feature anyway,
- and kernels that are booted with the "norandmaps" parameter.
- 1 - Make the addresses of mmap base, stack and VDSO page randomized.
- This, among other things, implies that shared libraries will be
- loaded to random addresses. Also for PIE-linked binaries, the
- location of code start is randomized. This is the default if the
- CONFIG_COMPAT_BRK option is enabled.
- 2 - Additionally enable heap randomization. This is the default if
- CONFIG_COMPAT_BRK is disabled.
- There are a few legacy applications out there (such as some ancient
- versions of libc.so.5 from 1996) that assume that brk area starts
- just after the end of the code+bss. These applications break when
- start of the brk area is randomized. There are however no known
- non-legacy applications that would be broken this way, so for most
- systems it is safe to choose full randomization.
- Systems with ancient and/or broken binaries should be configured
- with CONFIG_COMPAT_BRK enabled, which excludes the heap from process
- address space randomization.
- ==============================================================
- reboot-cmd: (Sparc only)
- ??? This seems to be a way to give an argument to the Sparc
- ROM/Flash boot loader. Maybe to tell it what to do after
- rebooting. ???
- ==============================================================
- rtsig-max & rtsig-nr:
- The file rtsig-max can be used to tune the maximum number
- of POSIX realtime (queued) signals that can be outstanding
- in the system.
- rtsig-nr shows the number of RT signals currently queued.
- ==============================================================
- sched_schedstats:
- Enables/disables scheduler statistics. Enabling this feature
- incurs a small amount of overhead in the scheduler but is
- useful for debugging and performance tuning.
- ==============================================================
- sg-big-buff:
- This file shows the size of the generic SCSI (sg) buffer.
- You can't tune it just yet, but you can change it at
- compile time by editing include/scsi/sg.h and changing
- the value of SG_BIG_BUFF.
- There shouldn't be any reason to change this value. If
- you can come up with one, you probably know what you
- are doing anyway :)
- ==============================================================
- shmall:
- This parameter sets the total number of shared memory pages that
- can be used system-wide. Hence, SHMALL should always be at least
- ceil(shmmax/PAGE_SIZE).
- If you are not sure what the default PAGE_SIZE is on your Linux
- system, you can run the following command:
- # getconf PAGE_SIZE
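The lower bound ceil(shmmax/PAGE_SIZE) can be computed with integer arithmetic. A minimal sketch; the function name `min_shmall` is hypothetical.

```python
def min_shmall(shmmax: int, page_size: int) -> int:
    """Smallest shmall (in pages) that can hold one segment of shmmax bytes."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    # Ceiling division without floating point.
    return -(-shmmax // page_size)
```

For a shmmax of 1 GiB and a 4096-byte page, this gives 262144 pages. On a running system the page size can be taken from `getconf PAGE_SIZE` as shown above.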
- ==============================================================
- shmmax:
- This value can be used to query and set the run time limit
- on the maximum shared memory segment size that can be created.
- Shared memory segments up to 1 GB are now supported in the
- kernel. This value defaults to SHMMAX.
- ==============================================================
- shm_rmid_forced:
- Linux lets you set resource limits, including how much memory one
- process can consume, via setrlimit(2). Unfortunately, shared memory
- segments are allowed to exist without association with any process, and
- thus might not be counted against any resource limits. If enabled,
- shared memory segments are automatically destroyed when their attach
- count becomes zero after a detach or a process termination. It will
- also destroy segments that were created, but never attached to, on exit
- from the process. The only use left for IPC_RMID is to immediately
- destroy an unattached segment. Of course, this breaks the way things are
- defined, so some applications might stop working. Note that this
- feature will do you no good unless you also configure your resource
- limits (in particular, RLIMIT_AS and RLIMIT_NPROC). Most systems don't
- need this.
- Note that if you change this from 0 to 1, already created segments
- without users and whose creating process has exited will be destroyed.
- ==============================================================
- sysctl_writes_strict:
- Control how file position affects the behavior of updating sysctl values
- via the /proc/sys interface:
- -1 - Legacy per-write sysctl value handling, with no printk warnings.
- Each write syscall must fully contain the sysctl value to be
- written, and multiple writes on the same sysctl file descriptor
- will rewrite the sysctl value, regardless of file position.
- 0 - Same behavior as above, but warn about processes that perform writes
- to a sysctl file descriptor when the file position is not 0.
- 1 - (default) Respect file position when writing sysctl strings. Multiple
- writes will append to the sysctl value buffer. Anything past the max
- length of the sysctl value buffer will be ignored. Writes to numeric
- sysctl entries must always be at file position 0 and the value must
- be fully contained in the buffer sent in the write syscall.
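The difference between the legacy and strict modes for string sysctls can be sketched as a small model. This is an illustration under assumptions, not the kernel's code: `write_string` is a hypothetical name, `maxlen` is taken to be the maximum number of characters kept, and the warning that mode 0 emits is not modelled.

```python
def write_string(current: str, pos: int, data: str,
                 maxlen: int, strict: bool) -> str:
    """Return the sysctl string value after one write of `data` at `pos`."""
    # Only the text up to the first newline is stored.
    data = data.split("\n", 1)[0]
    if strict:
        # Strict mode respects the file position; a write that starts
        # past the end of the current value is ignored in this model.
        if pos > len(current):
            return current
        start = pos
    else:
        # Legacy modes (-1 and 0) rewrite from the start of the buffer.
        start = 0
    # Anything past the maximum length is dropped.
    return (current[:start] + data)[:maxlen]
```

Writing "abc" at position 0 and then "def" at position 3 yields "abcdef" in strict mode, but "def" in the legacy modes.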
- ==============================================================
- softlockup_all_cpu_backtrace:
- This value controls the soft lockup detector thread's behavior
- when a soft lockup condition is detected as to whether or not
- to gather further debug information. If enabled, each CPU will
- be sent an NMI and instructed to capture a stack trace.
- This feature is only applicable for architectures which support
- NMI.
- 0: do nothing. This is the default behavior.
- 1: on detection capture more debug information.
- ==============================================================
- soft_watchdog:
- This parameter can be used to control the soft lockup detector.
- 0 - disable the soft lockup detector
- 1 - enable the soft lockup detector
- The soft lockup detector monitors CPUs for threads that are hogging the CPUs
- without rescheduling voluntarily, and thus prevent the 'watchdog/N' threads
- from running. The mechanism depends on the CPU's ability to respond to timer
- interrupts, which are needed for the 'watchdog/N' threads to be woken up by
- the watchdog timer function. If a CPU cannot respond to timer interrupts, the
- NMI watchdog - if enabled - can detect a hard lockup condition.
- ==============================================================
- tainted:
- Non-zero if the kernel has been tainted. Numeric values, which
- can be ORed together:
- 1 - A module with a non-GPL license has been loaded; this
- includes modules with no license.
- Set by modutils >= 2.4.9 and module-init-tools.
- 2 - A module was force loaded by insmod -f.
- Set by modutils >= 2.4.9 and module-init-tools.
- 4 - Unsafe SMP processors: SMP with CPUs not designed for SMP.
- 8 - A module was forcibly unloaded from the system by rmmod -f.
- 16 - A hardware machine check error occurred on the system.
- 32 - A bad page was discovered on the system.
- 64 - The user has asked that the system be marked "tainted". This
- could be because they are running software that directly modifies
- the hardware, or for other reasons.
- 128 - The system has died.
- 256 - The ACPI DSDT has been overridden with one supplied by the user
- instead of using the one provided by the hardware.
- 512 - A kernel warning has occurred.
- 1024 - A module from drivers/staging was loaded.
- 2048 - The system is working around a severe firmware bug.
- 4096 - An out-of-tree module has been loaded.
- 8192 - An unsigned module has been loaded in a kernel supporting module
- signature.
- 16384 - A soft lockup has previously occurred on the system.
- 32768 - The kernel has been live patched.
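Because the taint values are ORed together, a value read from this file can be split back into its individual flags. A minimal sketch: `decode_tainted` is a hypothetical helper, and the short labels are abbreviations of the descriptions in the list above.

```python
from typing import List

# Bit value -> short label, following the list above.
TAINT_FLAGS = {
    1: "non-GPL or unlicensed module loaded",
    2: "module force loaded",
    4: "unsafe SMP processors",
    8: "module forcibly unloaded",
    16: "machine check error occurred",
    32: "bad page discovered",
    64: "tainted at user request",
    128: "system has died",
    256: "ACPI DSDT overridden by user",
    512: "kernel warning occurred",
    1024: "staging module loaded",
    2048: "working around severe firmware bug",
    4096: "out-of-tree module loaded",
    8192: "unsigned module loaded",
    16384: "soft lockup occurred",
    32768: "kernel has been live patched",
}


def decode_tainted(value: int) -> List[str]:
    """Return the labels of all taint flags set in `value`."""
    return [label for bit, label in TAINT_FLAGS.items() if value & bit]
```

A value of 4097 (4096 + 1) decodes to the non-GPL module flag and the out-of-tree module flag; a value of 0 decodes to an empty list.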
- ==============================================================
- threads-max:
- This value controls the maximum number of threads that can be created
- using fork().
- During initialization the kernel sets this value such that even if the
- maximum number of threads is created, the thread structures occupy only
- a part (1/8th) of the available RAM pages.
- The minimum value that can be written to threads-max is 20.
- The maximum value that can be written to threads-max is given by the
- constant FUTEX_TID_MASK (0x3fffffff).
- If a value outside of this range is written to threads-max, the write
- fails with EINVAL.
- The value written is checked against the available RAM pages. If the
- thread structures would occupy too much (more than 1/8th) of the
- available RAM pages, threads-max is reduced accordingly.
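The range check and the 1/8th clamp described above can be modelled as follows. This is a sketch, not the kernel's code: the function names are hypothetical, and `pages_per_thread` (the number of pages one thread structure occupies) is an assumed input that the text above does not specify.

```python
MIN_THREADS = 20
FUTEX_TID_MASK = 0x3FFFFFFF


def check_threads_max(value: int) -> int:
    """Reject values outside [20, FUTEX_TID_MASK], as a write would."""
    if not MIN_THREADS <= value <= FUTEX_TID_MASK:
        raise ValueError("EINVAL: threads-max out of range")
    return value


def effective_threads_max(value: int, ram_pages: int,
                          pages_per_thread: int) -> int:
    """Clamp a valid value so thread structures use at most 1/8th of RAM."""
    check_threads_max(value)
    # pages_per_thread is an assumption made for this illustration.
    limit = ram_pages // (8 * pages_per_thread)
    return min(value, limit)
```

With 1,000,000 RAM pages and an assumed 4 pages per thread, a request for 100000 threads is reduced to 31250, while a request for 1000 is kept as is.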
- ==============================================================
- unknown_nmi_panic:
- The value in this file affects the handling of NMIs. When the
- value is non-zero, an unknown NMI is trapped and the kernel panics. At
- that time, kernel debugging information is displayed on the console.
- For example, the NMI switch that most IA32 servers have triggers an
- unknown NMI. If a system hangs, try pressing the NMI switch.
- ==============================================================
- watchdog:
- This parameter can be used to disable or enable the soft lockup detector
- _and_ the NMI watchdog (i.e. the hard lockup detector) at the same time.
- 0 - disable both lockup detectors
- 1 - enable both lockup detectors
- The soft lockup detector and the NMI watchdog can also be disabled or
- enabled individually, using the soft_watchdog and nmi_watchdog parameters.
- If the watchdog parameter is read, for example by executing
- cat /proc/sys/kernel/watchdog
- the output of this command (0 or 1) shows the logical OR of soft_watchdog
- and nmi_watchdog.
- ==============================================================
- watchdog_cpumask:
- This value can be used to control on which cpus the watchdog may run.
- The default cpumask is all possible cores, but if NO_HZ_FULL is
- enabled in the kernel config, and cores are specified with the
- nohz_full= boot argument, those cores are excluded by default.
- Offline cores can be included in this mask, and if the core is later
- brought online, the watchdog will be started based on the mask value.
- Typically this value would only be touched in the nohz_full case
- to re-enable cores that by default were not running the watchdog,
- if a kernel lockup was suspected on those cores.
- The argument value is the standard cpulist format for cpumasks,
- so for example to enable the watchdog on cores 0, 2, 3, and 4 you
- might say:
- echo 0,2-4 > /proc/sys/kernel/watchdog_cpumask
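The cpulist format used in the example above (comma-separated CPU numbers and inclusive ranges) can be expanded into a set of CPU numbers. A minimal sketch: `parse_cpulist` is a hypothetical helper that handles only single numbers and simple "a-b" ranges, not any other syntax the kernel may accept.

```python
from typing import Set


def parse_cpulist(text: str) -> Set[int]:
    """Expand a cpulist string such as "0,2-4" into a set of CPU numbers."""
    cpus: Set[int] = set()
    for part in text.strip().split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-", 1)
            low, high = int(first), int(last)
            if low > high:
                raise ValueError("invalid range: %r" % part)
            # Ranges are inclusive on both ends.
            cpus.update(range(low, high + 1))
        else:
            cpus.add(int(part))
    return cpus
```

Here `parse_cpulist("0,2-4")` gives the set {0, 2, 3, 4}, matching the cores named in the example.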
- ==============================================================
- watchdog_thresh:
- This value can be used to control the frequency of hrtimer and NMI
- events and the soft and hard lockup thresholds. The default threshold
- is 10 seconds.
- The softlockup threshold is (2 * watchdog_thresh). Setting this
- tunable to zero will disable lockup detection altogether.
- ==============================================================
|