2.  Crash Recovery

2.1.  Preparation Tips

It is a good idea to back up important system files such as /etc/fstab and /etc/lilo.conf once you have logged in using the Tomsrtbt floppy or the RedHat Linux CDROM (Rescue option), as described in the following sections. These copies are very handy in a crash situation or when something happens to the system files.

bash# cp /etc/fstab /etc/fstab.orig
bash# cp /etc/lilo.conf /etc/lilo.conf.orig
bash# cp /etc/hosts /etc/hosts.orig
bash# cp /etc/hosts.allow /etc/hosts.allow.orig
bash# cp /etc/hosts.deny /etc/hosts.deny.orig
bash# cp /etc/inetd.conf /etc/inetd.conf.orig
bash# cp /etc/inittab /etc/inittab.orig
bash# cp /etc/networks /etc/networks.orig
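
The same copies can be made in one pass with a small shell function. This is only a sketch: backup_configs is a hypothetical helper, not a standard command, and its first argument (a root directory) is an assumption added so that the function also works on a hard disk mounted somewhere else, such as /rootpartition. It leaves an existing .orig copy alone, so that the first known-good version is not overwritten.

```shell
# backup_configs ROOT FILE...
# Copy ROOT/etc/FILE to ROOT/etc/FILE.orig for every FILE that exists.
# ROOT is "" for the running system, or a mount point such as /rootpartition.
backup_configs() {
    cfg_root=$1
    shift
    for cfg_name in "$@"; do
        cfg_src="$cfg_root/etc/$cfg_name"
        if [ ! -f "$cfg_src" ]; then
            echo "skipped $cfg_src (no such file)" >&2
        elif [ -e "$cfg_src.orig" ]; then
            echo "kept existing $cfg_src.orig" >&2
        else
            # -p preserves the mode and timestamps of the original
            cp -p "$cfg_src" "$cfg_src.orig" && echo "saved $cfg_src.orig"
        fi
    done
}
```

For the files listed above the call would be: backup_configs "" fstab lilo.conf hosts hosts.allow hosts.deny inetd.conf inittab networks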

2.2.  Using Linux CDROM In Rescue Mode

Most distributions, such as RedHat, SUSE and Debian, provide a CDROM which has a "Rescue" option. To use it, you should set the BIOS of your computer to boot first from the CDROM drive. Usually you enter the BIOS setup during boot (for example with the F8 key; the key differs between machines) and set it to boot first from the CDROM drive, second from the floppy drive and third from the hard disk. Load the Linux CDROM into the CD drive and reboot the system. The Linux distribution will load; at the prompt, select "Rescue Operation". In the rescue operation, mount the hard disks and try to repair them.

# chroot /mnt/sysimage
# df

After doing chroot, the system will look as if you had booted the system from hard disk. You can see all the partitions and you can repair or recover the files.
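
Before running chroot it can help to confirm that the directory really holds a root file system. The following is a minimal sketch: looks_like_root is a hypothetical helper, and the entries it tests for (the directories bin, etc, lib and sbin, plus the file etc/fstab) are an assumed heuristic rather than a formal definition of a root partition.

```shell
# looks_like_root DIR
# Succeed (exit status 0) if DIR contains the directories and the
# etc/fstab file that a Linux root file system normally has.
looks_like_root() {
    candidate=$1
    for sub in bin etc lib sbin; do
        [ -d "$candidate/$sub" ] || return 1
    done
    [ -f "$candidate/etc/fstab" ]
}
```

Call it with the mount point you intend to give to chroot, and run chroot only if the function succeeds.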

2.3.  Quick Steps to recovery

Follow these steps to recover from LILO or system failures.

  1. SCENE 1: If your system does not boot -

    Get the tomsrtbt floppy from "http://www.toms.net/rb" or a MuLinux floppy (see also Section 1.1, “Tiny Floppy Linux”). Boot with the tomsrtbt floppy and use fdisk to find the partitions. Try to recognise the root and boot partitions. Watch out: the /boot files may be on the root partition itself.

    The Linux root partition contains directories such as bin, boot, etc and usr.

    The Linux boot partition contains files such as vmlinuz, boot.b, chain.b and map.

    To find the root partition, do this:

    bash# fdisk /dev/hda
    Command (m for help): m		(Gives you help on commands)
    Command (m for help): p 	(Gives you list of partitions)
    Command (m for help): q
    bash# mkdir /test
    bash# mount /dev/hda1 /test
    bash# ls /test
    You should see root-partition list like this -
    bin   fd    lib   mnt  proc  sbin  usr
    boot  dev   etc   home  lost+found  opt  root  tmp   var

    If this is not a root partition, unmount it with 'umount /test' and try the next partition, /dev/hda2. Keep trying hda3, hda4, hda5 and so on until you find the root partition. If you do not find the root partition on the hda device, repeat the above steps for the other hard disk devices such as hdb, hdc and hdd.

    Next, you should find the /boot, /usr and /var partitions. The disk locations of these partitions are needed to create the new lilo configuration.

    In my case the root partition is /dev/hda4 which is used in the examples below:

    bash# mkdir /rootpartition
    bash# mount /dev/hda4 /rootpartition
    bash# cat /rootpartition/etc/fstab
    	Read the output of fstab and mount partitions as per fstab file, see below -
    bash# mount /dev/hda5 /rootpartition/boot
    bash# mount /dev/hda6 /rootpartition/usr
    bash# mount /dev/hda7 /rootpartition/var
    bash# mount /dev/hda8 /rootpartition/opt
    bash# mount /dev/hda9 /rootpartition/root
    bash# mount /dev/hda10 /rootpartition/home

    In my case, as per the fstab file, hda5 was boot, hda6 was usr, hda7 was var, hda8 was opt, hda9 was root, hda10 was home and hda11 was windows95 (FAT16 partition).

    Edit /etc/fstab (not /rootpartition/etc/fstab) and put (sample code given here) -

    	/dev/hda4  /rootpartition           ext2 defaults 1 1
    	/dev/hda5  /rootpartition/boot      ext2 defaults 1 1
    	/dev/hda6  /rootpartition/usr       ext2 defaults 1 1
    	/dev/hda7  /rootpartition/var       ext2 defaults 1 1
    	/dev/hda8  /rootpartition/opt       ext2 defaults 1 1
    	/dev/hda9  /rootpartition/root      ext2 defaults 1 1
    	/dev/hda10 /rootpartition/home      ext2 defaults 1 1
    	/dev/hda11 /rootpartition/win95part vfat defaults 1 1
    On my computer hda4 holds the linux root partition, hda5 holds the boot partition and
    hda11 holds the windows 95 vfat file system.
    bash# mkdir /rootpartition/win95part
    bash# mount /rootpartition/win95part
    	And repair the problem partitions using fsck or e2fsck commands. Unmount a
    	partition with umount before you check it; do not run fsck on a mounted file system.
    bash# man fsck
    bash# man e2fsck

  2. SCENE 2: If LILO is not working -

    Follow scene 1 above; if that fails, follow these steps. After executing the steps in scene 1, you should already have mounted /rootpartition and created the /etc/fstab file.

    Note: It is very important to understand how the chroot command works below. The /sbin/lilo file which chroot runs is actually located at /rootpartition/sbin/lilo and NOT at /sbin/lilo!! Do not let this confuse you.

    bash# mount -a 
    bash# chroot /rootpartition /sbin/lilo -q
    bash# man chroot
    bash# chroot /rootpartition /sbin/lilo 

    Note: New users of chroot are often confused. If the chroot command complains that it cannot find the /boot/map file, it actually means that it cannot find /rootpartition/boot/map, because you gave /rootpartition as the first argument to chroot and all paths are resolved relative to /rootpartition.

    Alternatively, you can directly use /sbin/lilo instead of chroot. The -r option of lilo actually does chroot. It is very strongly recommended that you use chroot, instead of lilo -r, as it is more convenient and can catch errors more easily.

    bash# man lilo
    bash# /sbin/lilo -r /rootpartition

  3. SCENE 3: If scenes 1 and 2 fail -

    If scenes 1 and 2 fail and you made a boot disk with 'mkbootdisk' (during install, or later; see 'man mkbootdisk'), boot with it and repair your partitions. The mkbootdisk command is in the mkbootdisk*.rpm package, which you must install. Boot disks for Linux/NT/Windows/DOS/Mac are also available at "http://www.bootdisk.com". Another option is to get hold of a Linux installation CDROM. Just about every Linux distribution provides an image of a rescue disk on its CD. Under Linux use "dd if=/cdrom/disks/rescue of=/dev/fd0" to create a rescue floppy disk. Under DOS use rawrite.exe (included on the Linux CD) and then do "rawrite image-name a:".

  4. SCENE 4: If 1, 2 and 3 above fail and you do not have a boot disk -

    If you have another computer running Linux, log in as root and do -

    Note: If you compile your own kernel as a bzImage (for instance, bzImage-2.4.4), you should create a hard link to it named vmlinuz-2.4.4 as follows (note the z in the name vmlinuz; it is not vmlinux). If you do not do this, the mkbootdisk command may fail.

    bash# cd /boot
    bash# ls -l vmlinuz*
    bash# ln /boot/bzImage-2.4.4  /boot/vmlinuz-2.4.4

    Now that you have bzImage and vmlinuz, give these commands -

    bash$ man mkbootdisk
    bash# cp /etc/lilo.conf /etc/lilo-original.conf

    Edit /etc/lilo.conf and put the root partition name you obtained in 'scene 1' above, insert a blank floppy and give (2.2.12-20 is a sample; give the version of your own kernel, for instance 2.4.4 for the hard link above) -

    bash# mkbootdisk --device /dev/fd0 2.2.12-20

    The mkbootdisk command is in the mkbootdisk*.rpm package, which you must install. Make sure you move /etc/lilo-original.conf back to /etc/lilo.conf!! Then take this floppy and go to scene 3.

  5. SCENE 5: This is the worst scenario and hopefully you will never come to this stage. Scenes 1 to 4 will take care of the majority of cases. But in case all of scenes 1, 2, 3 and 4 fail, then -

    Step 1: Boot tomsrtbt (see Section 1.1, “Tiny Floppy Linux”), mount the partitions and back up the root partition to another partition that has enough disk space, with these commands -

    	Edit /etc/fstab and put (sample code given here, you may have to 
    	change as per your disk layout) -
    		/dev/hda4  /rootpartition	ext2 defaults 1 1
    		/dev/hda11 /b1 		vfat defaults 1 1
    bash# mkdir /rootpartition; mount /rootpartition
    bash# mkdir /b1; mount /b1
    bash# cd /
    bash# df
    	And see that there is enough disk space in /b1 to tar up the root partition
    bash# tar cvf /b1/root-hda4.tar   /rootpartition

    Step 2: Insert the Linux CDROM, reboot and install RedHat Linux on /dev/hda4. DO NOT install any extra packages; you need only the root and boot systems and the LILO manager, that is, a very bare minimum. This will also install LILO on the hard disk. Now boot Linux, log in as root and do -

    bash$ man mkbootdisk
    bash# cp /etc/lilo.conf /etc/lilo-original.conf

    Note: You MUST remember to copy lilo-original.conf back to lilo.conf!! Edit /etc/lilo.conf and put the root partition name you obtained in 'scene 1' above, insert a blank floppy and give (2.2.12-20 is a sample; substitute your own kernel version, which 'uname -r' prints) -

    bash# mkbootdisk --device /dev/fd0 2.2.12-20
    bash# cp /etc/lilo-original.conf /etc/lilo.conf

    Test this boot floppy to see that it works, and then restore all the files which you backed up with tar to /b1/root-hda4.tar in step 1 above.
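
The backup and restore in scene 5 can be wrapped in two functions, the first of which refuses to start when the target lacks space. This is a sketch under assumptions: backup_tree and restore_tree are hypothetical names, the size comparison uses the kilobyte figures from du and df and ignores tar's own overhead, and, unlike the tar command in Step 1, the archive stores paths relative to the source directory so that it can be unpacked into any mount point.

```shell
# backup_tree SRC_DIR ARCHIVE
# Archive SRC_DIR into ARCHIVE after checking that the file system
# holding ARCHIVE has at least as much free space as SRC_DIR uses.
backup_tree() {
    src_dir=$1
    archive=$2
    # Space used by the source, in kilobytes.
    need_kb=$(du -sk "$src_dir" | awk '{print $1}')
    # Space available where the archive will be written, in kilobytes
    # (-P prints one line per file system; column 4 is "Available").
    free_kb=$(df -Pk "$(dirname "$archive")" | awk 'NR==2 {print $4}')
    if [ "$need_kb" -gt "$free_kb" ]; then
        echo "not enough space: need ${need_kb}K, have ${free_kb}K" >&2
        return 1
    fi
    # -C stores paths relative to SRC_DIR instead of absolute ones.
    tar -cf "$archive" -C "$src_dir" .
}

# restore_tree ARCHIVE TARGET_DIR
# Unpack ARCHIVE into TARGET_DIR, preserving permissions (-p).
restore_tree() {
    tar -xpf "$1" -C "$2"
}
```

With the layout used above, the calls would be: backup_tree /rootpartition /b1/root-hda4.tar and, after the reinstall, restore_tree /b1/root-hda4.tar /rootpartition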

2.4.  Precautionary measures

You should take the following precautionary measures to tackle problems in the future.

  • You MUST make an emergency boot disk from time to time and whenever you make changes to the partitions. Insert a blank floppy and do this -

    bash$ man mkbootdisk
    The mkbootdisk is in mkbootdisk*.rpm package, you must install this.
    bash$ mkbootdisk --help
    bash# mkbootdisk --device /dev/fd0 2.2.12-20

  • You MUST back up the partition table layout to a floppy and to a hard disk. You should also print it out and paste it on the computer box.

    bash$ su - root
    bash# man fdisk
    bash# fdisk -l /dev/hda > partition_table_backup.txt

    This is very helpful if you need to repartition the hard disk. From the printout, you know where each partition starts. During recovery, after repartitioning and formatting, you can restore the data from the backup.

  • You must keep the tomsrtbt boot floppy handy. Visit "http://www.toms.net/rb" (see also Section 1.1, “Tiny Floppy Linux”).

  • You must keep the Yard rescue and boot floppy disk handy. Visit "http://www.linuxlots.com/~fawcett/yard"

  • Back up the root partition and the /boot directory. Boot the Tomsrtbt floppy (see also Section 1.1, “Tiny Floppy Linux”) and then

    bash# vi /etc/fstab
    And put these lines -
    		/dev/hda1 /a1 ext2 defaults 1 1
    		/dev/hdb1 /b1 vfat defaults 1 1
    In my case hda1 had the linux root partition '/'
    bash# mkdir /a1 /b1
    bash# mount /a1; mount /b1
    bash# cd /
    bash# tar cvf /b1/linux-root-partition-hda1.tar  a1
    bash# tar cvf /b1/linux-boot-partition-hda1.tar  a1/boot
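
A backup is only useful if it can be read back. As a minimal sketch, the hypothetical helper below (verify_archive is not a standard command) asks tar to list an archive and reports how many entries it found. It checks readability only, not that every file you need was included.

```shell
# verify_archive ARCHIVE
# Succeed if tar can list ARCHIVE; print the number of entries found.
verify_archive() {
    tar_file=$1
    if ! entries=$(tar -tf "$tar_file" 2>/dev/null); then
        echo "$tar_file: cannot be read as a tar archive" >&2
        return 1
    fi
    entry_count=$(printf '%s\n' "$entries" | wc -l)
    # $(( )) strips the padding that some wc implementations add.
    echo "$tar_file: $((entry_count)) entries"
}
```

Run it on each tar file right after creating it, for instance on the two archives written to /b1 above.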

2.5.  Removing LILO

You can replace the boot sector with the DOS boot loader by issuing this DOS command at the MS-DOS prompt:

	FDISK  /MBR

where MBR stands for "Master Boot Record".

See also the LILO documentation on Linux at /usr/doc/lilo* for other methods of uninstalling LILO, and 'man lilo'.
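
Before overwriting the boot sector by any method, it may be worth saving a copy of it from Linux. The sketch below uses dd to copy the first 512-byte sector of a disk to a file. The name save_boot_sector and the file name in the usage line are hypothetical, and /dev/hda is assumed to be the disk that carries LILO.

```shell
# save_boot_sector DEVICE OUTFILE
# Copy the first sector (512 bytes: boot code, partition table and
# signature) of DEVICE into OUTFILE.
save_boot_sector() {
    dd if="$1" of="$2" bs=512 count=1 2>/dev/null
}
```

A typical call would be: save_boot_sector /dev/hda /root/mbr-hda.bin. Keep the file on a floppy as well, since a copy stored only on the affected disk is of little use. Writing the whole file back with dd would also write back the partition table as it was when the copy was made, so restore it with care.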

2.6.  Common mistakes

After making changes to /etc/lilo.conf you MUST run lilo for the changes to take effect. Forgetting this is a very common mistake made by new users. Type -

bash# lilo -v -v -v
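
One way to catch this mistake is to compare timestamps. When lilo installs the boot loader it writes its map file (by default /boot/map), so a lilo.conf that is newer than the map file suggests that lilo has not been run since the last edit. The check below is a heuristic sketch: lilo_is_stale is a hypothetical name, the default paths are assumptions that your lilo.conf may override, and the -nt test is a shell extension that bash supports.

```shell
# lilo_is_stale [CONF] [MAP]
# Succeed if MAP is missing or CONF was modified more recently than MAP.
lilo_is_stale() {
    conf_file=${1:-/etc/lilo.conf}
    map_file=${2:-/boot/map}
    [ ! -e "$map_file" ] || [ "$conf_file" -nt "$map_file" ]
}
```

For example: lilo_is_stale && echo "lilo.conf changed, run lilo"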