Fault recovery
You can set up fault recovery to automatically handle server crashes.

When fault recovery is enabled and the server crashes, the server shuts itself down and then restarts automatically, without any administrator intervention. A fatal error, such as an operating system exception or an internal panic, terminates each Domino® process and releases all associated resources. The startup script detects the situation and restarts the server. If you are using multiple server partitions and a failure occurs in a single partition, only that partition is terminated and restarted.

Domino records crash information in the data directory. When the server restarts, Domino checks whether it is restarting after a crash. If it is, Domino automatically sends an email to the person or group specified in the "Mail Fault Notification to" field. The email contains the time of the crash, the server name, and, if available, the FAULT_RECOVERY.ATT file, which includes additional failure information from an optional cleanup script.

The fault-recovery system is initialized before the Domino Directory can be read. During this initialization, fault-recovery settings are read from the NOTES.INI file. Later, the settings are read from the Domino Directory and saved back to the NOTES.INI file. Any changes to the Domino Directory or the NOTES.INI file take effect when the Domino server is restarted. To prevent Domino from reading the settings from the Domino Directory and then updating the NOTES.INI file, set FaultRecoveryFromIni=1 in the NOTES.INI file.
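
The order in which settings are resolved can be pictured with a small model. The following Python sketch is only an illustration of the behavior described above, not Domino code: the function name resolve_fault_recovery_settings and the key ExampleCleanupTimeout are hypothetical, and only FaultRecoveryFromIni comes from this topic.

```python
def resolve_fault_recovery_settings(ini_settings, directory_settings):
    """Model the settings that end up in NOTES.INI after startup.

    ini_settings:       dict of values read from NOTES.INI at initialization
    directory_settings: dict of values read later from the Domino Directory

    Both dictionaries and every key except FaultRecoveryFromIni are
    hypothetical stand-ins used for illustration only.
    """
    # Initialization always starts from the NOTES.INI values.
    resolved = dict(ini_settings)

    # FaultRecoveryFromIni=1 disables the Domino Directory read and the
    # subsequent write-back, so the NOTES.INI values stay as they are.
    if ini_settings.get("FaultRecoveryFromIni") == "1":
        return resolved

    # Otherwise the Domino Directory values are read later and saved back,
    # replacing any matching NOTES.INI entries.
    resolved.update(directory_settings)
    return resolved


if __name__ == "__main__":
    ini = {"ExampleCleanupTimeout": "300"}        # hypothetical key
    directory = {"ExampleCleanupTimeout": "600"}  # hypothetical key
    print(resolve_fault_recovery_settings(ini, directory))
    ini["FaultRecoveryFromIni"] = "1"
    print(resolve_fault_recovery_settings(ini, directory))
```

In this model, the first call lets the Domino Directory value win, and the second call keeps the NOTES.INI value because FaultRecoveryFromIni=1 is set.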

Operating systems and fault recovery

Because fault recovery runs after an exception has occurred, it cannot rely on Domino's internal facilities. Instead, fault recovery makes heavy use of operating system features.

UNIX™ systems primarily use message queues. Therefore, it is important to configure the operating system so that sufficient message queue resources are available. If you are using multiple Domino server partitions, each partition requires a complete set of resources. Consult your operating system documentation for additional details on configuring message queue parameters.
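
As a starting point for reviewing message queue resources, the following Python sketch reports the current System V message queue limits. It assumes a Linux system, where the kernel exposes these limits as files under /proc/sys/kernel. Other UNIX systems use different mechanisms, and the values that Domino needs are not stated here, so treat the output only as input for the check against your operating system documentation. The function name read_msg_queue_limits is hypothetical.

```python
from pathlib import Path

# Linux kernel parameters for System V message queues:
#   msgmni - maximum number of message queue identifiers
#   msgmax - maximum size of a single message, in bytes
#   msgmnb - default maximum size of a queue, in bytes
MSG_QUEUE_PARAMETERS = ("msgmni", "msgmax", "msgmnb")


def read_msg_queue_limits(base_dir="/proc/sys/kernel"):
    """Return a dict mapping each parameter name to its integer value.

    A value of None means the parameter could not be read, for example
    on a system that does not expose it under base_dir.
    """
    limits = {}
    for name in MSG_QUEUE_PARAMETERS:
        try:
            limits[name] = int((Path(base_dir) / name).read_text().strip())
        except (OSError, ValueError):
            limits[name] = None
    return limits


if __name__ == "__main__":
    for name, value in read_msg_queue_limits().items():
        shown = value if value is not None else "unavailable"
        print(f"{name}: {shown}")
```

Because each partition requires a complete set of resources, compare the reported limits against the combined needs of all partitions on the machine.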

Microsoft™ Windows™ 2003 systems do not require any system resource changes.

Related tasks
Using transaction logging for recovery
Specifying a cleanup script for fault recovery
Enabling fault recovery