Replacing a Disk in RAID
Mounting a File System
After booting into Rescue mode, you need to mount the file system manually if this did not happen automatically; the mount command is given below.
If infiltrate-root did not work, one possible reason is that the RAID has not been assembled.
Check whether md devices are present (for example, with lsblk). If there are none, but sda, sdb, and so on have partitions with the “Linux RAID” type, then you need to assemble the RAID.
To access the file system:
- Boot into Rescue.
- Run the following:
mdadm --assemble /dev/md0 /dev/sda2 /dev/sdb2
mdadm --assemble /dev/md1 /dev/sda3 /dev/sdb3
Please note that you need to adapt the device names to match the existing disks. If the arrays are described in /etc/mdadm/mdadm.conf, you can also try mdadm --assemble --scan. See more about mdadm.
When mounting, specify the direct path to the vg-root volume as an argument, for example (the volume group here is named vg0, as in the example below; adapt the name to your system):

mount /dev/mapper/vg0-root /mnt
Example of Disk Replacement
The server has 2 disks: /dev/sda and /dev/sdb. These disks are assembled into software RAID1 using mdadm.
Let’s say one of the disks failed, for example, /dev/sdb.
Removing a Disk From the Array
Please note that before replacing a disk, it is advisable to remove it from the array.
View the array state by running the following:
cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sda3 sdb3
      975628288 blocks super 1.2 [2/2] [UU]
      bitmap: 3/8 pages [12KB], 65536KB chunk

md0 : active raid1 sda2 sdb2
      999872 blocks super 1.2 [2/2] [UU]

unused devices: <none>
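A degraded array shows an underscore in the status brackets (for example [U_] instead of [UU]). As a rough sketch, this check can be scripted; the saved sample below stands in for the real /proc/mdstat:

```shell
# Detect degraded md arrays by looking for "_" in the status brackets.
# A saved sample is used here instead of the live /proc/mdstat.
cat > /tmp/mdstat.sample <<'EOF'
Personalities : [raid1]
md1 : active raid1 sda3 sdb3
      975628288 blocks super 1.2 [2/1] [U_]

md0 : active raid1 sda2 sdb2
      999872 blocks super 1.2 [2/2] [UU]

unused devices: <none>
EOF
# Remember the last seen array name; print it when its status line contains "_"
degraded=$(awk '/^md/ {name=$1} /\[[U_]+\]/ {if ($0 ~ /_/) print name}' /tmp/mdstat.sample)
echo "degraded: $degraded"
```

On a real server, run the same pipeline against /proc/mdstat directly.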
In this case, the array is assembled so that md0 consists of sda2 and sdb2, and md1 consists of sda3 and sdb3.
On this server, md0 is /boot, and md1 holds the LVM volumes for swap and root.
lsblk
NAME            MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
loop0             7:0    0   985M  1 loop
sda               8:0    0 931.5G  0 disk
├─sda1            8:1    0     1M  0 part
├─sda2            8:2    0   977M  0 part
│ └─md0           9:0    0 976.4M  0 raid1
└─sda3            8:3    0 930.6G  0 part
  └─md1           9:1    0 930.4G  0 raid1
    ├─vg0-swap_1 253:0    0   4.8G  0 lvm
    └─vg0-root   253:1    0 925.7G  0 lvm   /
sdb               8:16   0 931.5G  0 disk
├─sdb1            8:17   0     1M  0 part
├─sdb2            8:18   0   977M  0 part
│ └─md0           9:0    0 976.4M  0 raid1
└─sdb3            8:19   0 930.6G  0 part
  └─md1           9:1    0 930.4G  0 raid1
    ├─vg0-swap_1 253:0    0   4.8G  0 lvm
    └─vg0-root   253:1    0 925.7G  0 lvm   /
Remove sdb from all devices:
mdadm /dev/md0 --remove /dev/sdb2
mdadm /dev/md1 --remove /dev/sdb3
If the partitions have not been marked as failed (as in this case), mdadm does not consider the disk failed and keeps using it; the remove command then fails with an error that the device is in use.
In this case, mark the disk as failed before removing it:
mdadm /dev/md0 -f /dev/sdb2
mdadm /dev/md1 -f /dev/sdb3
Run the commands to remove partitions from the array again.
After removing the failed disk from the array, request a disk replacement by creating a ticket that specifies the serial number (s/n) of the failed disk. Whether the replacement requires downtime depends on the server configuration.
Defining the Partition Table (GPT or MBR) and Moving It to the New Disk
After replacing the failed disk, you need to add the new disk to the array. To do this, first determine the partition table type, GPT or MBR; the gdisk utility is used for this.
Install gdisk:
apt-get install gdisk -y
Run the following:
gdisk -l /dev/sda
where /dev/sda is the healthy disk in the RAID.
The output looks something like this for MBR:
Partition table scan:
  MBR: MBR only
  BSD: not present
  APM: not present
  GPT: not present
And something like this for GPT:
Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: present
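The decision can also be scripted by parsing the “Partition table scan” section; a small sketch, using a saved sample instead of running gdisk on a real disk:

```shell
# Decide GPT vs MBR from gdisk's "Partition table scan" section.
# The sample text below stands in for real gdisk -l output.
cat > /tmp/gdisk.sample <<'EOF'
Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: present
EOF
if grep -q 'GPT: present' /tmp/gdisk.sample; then
    table=gpt
else
    table=mbr
fi
echo "partition table: $table"
```

On a live system, lsblk -ndo PTTYPE /dev/sda reports the table type (gpt or dos) directly.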
Before adding the disk to the array, you need to create the same partitions on it as on sda. The procedure depends on the partition table type.
Please note that the disk the layout is copied to is given first, and the disk the layout is copied from is given second. If you swap them, the layout on the healthy disk will be destroyed.
Copying the Partition Layout for GPT:
sgdisk -R /dev/sdb /dev/sda
Assign new random GUIDs to the disk and its partitions:
sgdisk -G /dev/sdb
Please note that here the order is reversed: the disk the layout is copied from is given first, and the disk it is copied to is given second.
Copying the Partition Layout for MBR:
sfdisk -d /dev/sda | sfdisk /dev/sdb
If you cannot see the new partitions in the system, re-read the partition table by running the following:

sfdisk -R /dev/sdb

Note that newer versions of sfdisk may lack the -R option; blockdev --rereadpt /dev/sdb has the same effect.
Adding a Disk to the Array
Once the partitions on /dev/sdb have been created, you can add the disk to the array:
mdadm /dev/md0 -a /dev/sdb2
mdadm /dev/md1 -a /dev/sdb3
After adding the disk to the array, synchronization starts. Its speed depends on the disk size and type (SSD/HDD).
cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sda3 sdb3
      975628288 blocks super 1.2 [2/1] [U_]
      [============>........]  recovery = 64.7% (632091968/975628288) finish=41.1min speed=139092K/sec
      bitmap: 3/8 pages [12KB], 65536KB chunk

md0 : active raid1 sda2 sdb2
      999872 blocks super 1.2 [2/2] [UU]

unused devices: <none>
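You can follow the resync with watch -n 5 cat /proc/mdstat. The percentage itself can also be extracted from the output; a sketch using a saved sample instead of the live file:

```shell
# Extract the resync percentage from /proc/mdstat-style output.
# A saved sample is used here instead of the live /proc/mdstat.
cat > /tmp/recovery.sample <<'EOF'
md1 : active raid1 sda3 sdb3
      [============>........]  recovery = 64.7% (632091968/975628288) finish=41.1min speed=139092K/sec
EOF
pct=$(sed -n 's/.*recovery = \([0-9.]*\)%.*/\1/p' /tmp/recovery.sample)
echo "recovery at ${pct}%"
```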
Installing a Boot Loader
After adding the disk to the array, you need to install a boot loader on it.
If the server is booted into normal mode or into infiltrate-root (which we entered earlier), this can be done by running the following:

grub-install /dev/sdb
If the server is booted into recovery or rescue mode, i.e. from a live CD, the boot loader installation looks like this:
Mount the root file system to /mnt. On this server root is an LVM volume on md1, so:

mount /dev/mapper/vg0-root /mnt
mount /dev/md0 /mnt/boot

If root lives directly on an md device, e.g. /dev/md2, mount that device instead.
Mount /dev, /proc, and /sys:
mount --bind /dev /mnt/dev
mount --bind /proc /mnt/proc
mount --bind /sys /mnt/sys
Chroot into the mounted file system:

chroot /mnt
Install grub on sdb:

grub-install /dev/sdb
Now you can try to boot into normal mode.
Replacing a Failed Disk
You can manually mark a disk in the array as failed using --fail (-f); the two commands below are equivalent:

mdadm /dev/md0 --fail /dev/sda1
mdadm /dev/md0 -f /dev/sda1
You can remove the failed disk using --remove (-r):

mdadm /dev/md0 --remove /dev/sda1
mdadm /dev/md0 -r /dev/sda1
You can add a new disk to the array using --add (-a):

mdadm /dev/md0 --add /dev/sda1
mdadm /dev/md0 -a /dev/sda1
Error while Restoring the Boot Loader after Replacing the Disk in RAID1
If the following error appears while installing grub:
root #grub-install --root-directory=/boot /dev/sda
Could not find device for /boot/boot: not found or not a block device
Run the following:
root #grep -v rootfs /proc/mounts > /etc/mtab