System-Level Fault Tolerance (Clustering/Network Load Balancing) - Restoring a Single Node After a Complete Server Failure

When a single node fails, whether because of hardware problems or software corruption that cannot be repaired in a reasonable amount of time, the node must be rebuilt from scratch. After any hardware problems are resolved, the organization can decide which approach to server recovery is best. The two basic approaches to node recovery are outlined next.

Evicting and Rebuilding the Failed Node

This first node recovery process evicts the failed node from the cluster and requires the cluster administrator to rebuild the cluster node from scratch, rejoin the node to the cluster, install any cluster applications, and finally reconfigure the cluster's group failover and failback settings.

To evict and rebuild the failed node, follow these steps:

1. Shut down the failed cluster node.
2. On an available cluster node, log in using a Cluster Administrator account.
3. Click Start, Administrative Tools, Cluster Administrator.
4. If Cluster Administrator does not connect to the cluster or connects to a different cluster, choose File, Open Connection.
5. From the Action drop-down box, choose Open Connection to Cluster. Then, in the Cluster or Server Name drop-down box, type . (period) and click OK to connect.
6. In the left pane of the Cluster Administrator window, right-click the offline cluster node and choose Evict Node.
7. When the node is evicted, close Cluster Administrator and immediately start a backup of the local node's system state. Refer to the previous section "Backing Up the Cluster Node System State" for detailed steps for system state backup.
8. On the failed node, install a clean copy of Windows Server 2003, Enterprise Edition or Datacenter Edition.
9. After it is loaded, configure the server to join the correct domain and configure all local drive letters and network card IP addresses as previously configured on the original cluster node. Then reboot if necessary.
10. Follow the steps to rejoin the cluster as outlined in the previous section, "Adding Additional Nodes to a Cluster."
11. After the node rejoins the cluster, install any cluster applications as outlined in the vendor's installation guide for cluster installation.
12. Configure cluster group failover and failback as necessary and move cluster groups to their preferred node.
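Before rejoining the rebuilt node to the cluster, it can help to compare its settings against a record of the original node's domain, drive letters, and network card IP addresses. The following is a minimal Python sketch of such a comparison. It assumes the settings were written down by hand beforehand; the dictionary layout, the `config_differences` name, and every value shown are hypothetical and do not come from any Windows tool.

```python
def config_differences(recorded, rebuilt):
    """Return human-readable mismatches between two node configurations.

    Both arguments are plain dicts with the hypothetical keys
    "domain", "drive_letters", and "nic_addresses".
    """
    problems = []
    # Domain names are case-insensitive, so compare them lowercased.
    if recorded["domain"].lower() != rebuilt["domain"].lower():
        problems.append(
            f"domain: expected {recorded['domain']}, found {rebuilt['domain']}"
        )
    # Check every drive letter and NIC name that appears on either side.
    for section in ("drive_letters", "nic_addresses"):
        expected, actual = recorded[section], rebuilt[section]
        for key in sorted(set(expected) | set(actual)):
            if expected.get(key) != actual.get(key):
                problems.append(
                    f"{section}[{key}]: expected {expected.get(key)}, "
                    f"found {actual.get(key)}"
                )
    return problems


# Hypothetical record of the original node (illustrative values only).
recorded_node = {
    "domain": "corp.example.com",
    "drive_letters": {"C": "System", "D": "Data"},
    "nic_addresses": {"Public": "192.168.1.11", "Heartbeat": "10.0.0.1"},
}

# Hypothetical settings read off the freshly rebuilt node.
rebuilt_node = {
    "domain": "CORP.example.com",
    "drive_letters": {"C": "System", "E": "Data"},
    "nic_addresses": {"Public": "192.168.1.11", "Heartbeat": "10.0.0.2"},
}

for line in config_differences(recorded_node, rebuilt_node):
    print(line)
```

An empty result means the rebuilt node matches the record. Any reported line points at a setting to correct before continuing with the rejoin steps.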
Restoring the Failed Node Using the ASR Restore

To restore the failed node using the ASR restore, follow these steps:

1. Shut down the failed cluster node.
2. On an available cluster node, log in using a Cluster Administrator account.
3. Click Start, Administrative Tools, Cluster Administrator.
4. If Cluster Administrator does not connect to the cluster or connects to a different cluster, choose File, Open Connection.
5. From the Action drop-down box, choose Open Connection to Cluster. Then, in the Cluster or Server Name drop-down box, type . (period) and click OK to connect.
6. Within each cluster group, disable failback so that the groups do not fail back to a cluster node that is not completely restored.
7. Close Cluster Administrator.
8. Locate the ASR floppy created for the failed node or create the floppy from the files saved in the ASR backup media. For information on creating the ASR floppy from the ASR backup media, refer to Help and Support on any Windows Server 2003 system.
9. Insert the operating system CD in the failed server and start the server.
10. If necessary, when prompted, press F6 to install any third-party storage device drivers. This includes any third-party disk or tape controllers that Windows Server 2003 will not recognize.
11. Press F2 when prompted to perform an automated system recovery.
12. When prompted, insert the ASR floppy disk and press Enter.
13. The operating system installation will proceed by restoring disk volume information and reformatting the volumes associated with the operating system. When this process is complete, restart the server as requested by pressing F3 and then Enter in the next window.
14. After the system restarts, press a key if necessary to restart the CD installation.
15. If necessary, when prompted, press F6 to install any third-party storage device drivers. This includes any third-party disk or tape controllers that Windows Server 2003 will not recognize.
16. Press F2 when prompted to perform an automated system recovery.
17. When prompted, insert the ASR floppy disk and press Enter.
18. This time, the disks can be properly identified and will be formatted, and the system files will be copied to the respective disk volumes. When this process is complete, the ASR restore will automatically reboot the server. Remove the ASR floppy disk from the drive. The graphic-based OS installation will begin.
19. If necessary, specify the network location of the backup media using a UNC path and enter authentication information if prompted. The ASR restore attempts to reconnect to the backup media automatically but cannot do so if the backup media are on a network drive.
20. When the media are located, open the media and click Next. Then finish recovering the remaining ASR data.
21. When the ASR restore is complete, if any local disk data was not restored with the ASR restore, restore all local disks. Click Start, All Programs, Accessories, System Tools, Backup.
22. If this is the first time you've run Backup, it will open in Wizard mode. Choose to run it in Advanced mode by clicking the Advanced Mode hyperlink. After you change to Advanced mode, the window should look like the one in Figure 31.13.
23. Click the Restore Wizard (Advanced) button to start the Restore Wizard.
24. Click Next on the Restore Wizard Welcome screen to continue.
25. On the What To Restore page, select the appropriate cataloged backup media, expand the catalog selection, and check each local drive. Click Next to continue.
26. If the correct tape or file backup media do not appear in this window, cancel the restore process. Then locate and catalog the appropriate media from the Restore Wizard page and return to the restore process from step 23.

Note - Refer to Chapter 33 for information on how to catalog tape and file backup media.

27. On the Completing the Restore Wizard page, click Finish to start the restore. Because you want to restore only what ASR did not, you do not need to make any advanced restore configuration changes.
28. When the restore is complete, reboot the server as prompted.
29. After the reboot is complete, log on to the restored cluster node and check cluster node functionality.
30. If everything is working properly, open Cluster Administrator and configure all cluster group failover and failback settings.
31. Move cluster groups to their preferred node and close Cluster Administrator.
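The procedure above disables failback on every cluster group at the start and reconfigures failback and preferred nodes at the end, so it is worth recording the original settings before changing them. The sketch below shows one possible way to keep that record in Python. It is bookkeeping only and does not talk to the cluster service; the `GroupSettings` class, the function names, and the group and node names are all hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class GroupSettings:
    """Per-group settings worth noting before the restore (hypothetical model)."""
    name: str
    preferred_node: str
    failback_enabled: bool


def disable_failback(groups):
    """Return (saved, working).

    saved maps each group name to its original settings;
    working is a copy of the groups with failback turned off.
    """
    saved = {g.name: g for g in groups}
    working = [replace(g, failback_enabled=False) for g in groups]
    return saved, working


def restore_plan(saved, working, current_owner):
    """List the manual actions needed once the restored node checks out.

    current_owner maps each group name to the node hosting it right now.
    """
    actions = []
    for g in working:
        original = saved[g.name]
        if original.failback_enabled != g.failback_enabled:
            actions.append(f"re-enable failback on {g.name}")
        if current_owner.get(g.name) != original.preferred_node:
            actions.append(f"move {g.name} to {original.preferred_node}")
    return actions


# Hypothetical cluster groups and nodes, for illustration only.
groups = [
    GroupSettings("File Share", "NODE1", True),
    GroupSettings("Print", "NODE2", False),
]
saved, working = disable_failback(groups)

# While NODE1 is being restored, both groups run on NODE2.
for action in restore_plan(saved, working, {"File Share": "NODE2", "Print": "NODE2"}):
    print(action)
```

Keeping the original settings separate from the working copy means the final steps can be checked against what was actually configured before the failure instead of relying on memory.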
This chapter is from Microsoft Windows Server 2003 Unleashed, by Rand Morimoto, et al. (Sams Publishing, 2004, ISBN: 0672326671). Check it out at your favorite bookstore today.