RAID


Linux

mdadm

mdadm is still the way to go when it comes to RAID in Linux.

RAID-1

We have two IDE disks (hdc, hdd) and assume that `/dev/hdc2` and `/dev/hdd2` are the partitions from which we're going to build our array. The next command creates a RAID-1 array and has to be issued ONLY ONCE:

$ mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/hdc2 /dev/hdd2
$ cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 hdc2[0] hdd2[1]
      97739392 blocks [2/2] [UU]

If the /dev/md* device nodes have not been created yet, we could:

  • Fix udev (yeah, right....)
  • Create the missing device files by hand:
for i in 0 1 2 3 4 5; do (mknod -m 0660 /dev/md"$i" b 9 "$i"); done
  • Or use the `--auto` feature of mdadm(8):
mdadm --create /dev/md2 --level 1 --auto=yes --raid-devices=2 /dev/hda1 /dev/hdb1

Now we're going to play around with our device; the commands are mostly self-explanatory, but it's good to have them handy.

$ mdadm --stop /dev/md0
$ cat /proc/mdstat
Personalities : [raid1]
unused devices: <none>
$ cat /etc/mdadm/mdadm.conf
DEVICE partitions
ARRAY /dev/md0 level=raid1 num-devices=2 devices=/dev/hdc2,/dev/hdd2
$ mdadm --assemble /dev/md0
mdadm: /dev/md0 has been started with 2 drives.

Note how --assemble needs no further arguments on the command line to set up the array, since it is already listed in the configuration file.
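
If the array isn't listed there yet, the file can be populated from a scan of the existing superblocks; a minimal sketch, assuming the Debian-style path /etc/mdadm/mdadm.conf:

$ mdadm --detail --scan >> /etc/mdadm/mdadm.conf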

Now we're going to fail, remove and re-add individual devices from/to the array:

$ mdadm /dev/md0 --manage --fail /dev/hdd2
$ cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 hdc2[0] hdd2[2](F)
      97739392 blocks [2/1] [U_]

$ mdadm /dev/md0 --manage --remove /dev/hdd2
$ cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 hdc2[0]
      97739392 blocks [2/1] [U_]

$ mdadm /dev/md0 --manage --add /dev/hdd2
mdadm: re-added /dev/hdd2

$ cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 hdd2[1] hdc2[0]
      97739392 blocks [2/1] [U_]
      [>....................]  recovery =  0.0% (89920/97739392) finish=36.1min speed=44960K/sec

Querying arrays and devices:

$ mdadm --query /dev/md0
/dev/md0: 93.21GiB raid1 2 devices, 0 spares. Use mdadm --detail for more detail.

$ mdadm --query /dev/hdc2
/dev/hdc2: is not an md array
/dev/hdc2: device 0 in 2 device active raid1 /dev/md0.  Use mdadm --examine for more detail.

$ mdadm --examine --scan
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=5f49df56:cebabdc3:c078155d:f7604a8d

$ mdadm --examine /dev/hdd2
/dev/hdd2:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 5f49df56:cebabdc3:c078155d:f7604a8d
  Creation Time : Tue Nov 28 04:58:50 2006
     Raid Level : raid1
    Device Size : 97739392 (93.21 GiB 100.09 GB)
     Array Size : 97739392 (93.21 GiB 100.09 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0

    Update Time : Tue Nov 28 04:58:50 2006
          State : active
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 1fb333d9 - correct
         Events : 0.1


      Number   Major   Minor   RaidDevice State
this     1      22       66        1      active sync   /dev/hdd2
   0     0      22        2        0      active sync   /dev/hdc2
   1     1      22       66        1      active sync   /dev/hdd2

$ mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90.03
  Creation Time : Tue Nov 28 04:58:50 2006
     Raid Level : raid1
     Array Size : 97739392 (93.21 GiB 100.09 GB)
    Device Size : 97739392 (93.21 GiB 100.09 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Tue Nov 28 05:22:13 2006
          State : clean, degraded, recovering
 Active Devices : 1
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 1

 Rebuild Status : 7% complete

           UUID : 5f49df56:cebabdc3:c078155d:f7604a8d
         Events : 0.8

    Number   Major   Minor   RaidDevice State
       0      22        2        0      active sync   /dev/hdc2
       1      22       66        1      spare rebuilding   /dev/hdd2

Adding more active devices to an existing RAID-1

$ mdadm --assemble /dev/md1
mdadm: /dev/md1 has been started with 2 drives.

$ mdadm --add /dev/md1 /dev/hdc5
$ mdadm --add /dev/md1 /dev/hdd5
$ mdadm --grow -n 4 /dev/md1

The last command should trigger the resync. However, we now have a RAID-1 with 4 active devices while the size stays the same: if we had 10 GB with 2 active devices, we now have 10 GB with 4 active devices, but up to 3 drives can now fail and we still have our data. TODO: find out how to actually enlarge the RAID-1.
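
A sketch of what should do the trick once every member partition is actually larger (assuming an ext2/3/4 filesystem on top of the array; untested here):

$ mdadm --grow /dev/md1 --size=max        # grow the array to the largest size the members allow
$ resize2fs /dev/md1                      # then grow the filesystem to match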

dmsetup

The Gentoo forum has a great, though somewhat dated, article[1] on how to create RAID devices with dmsetup instead of mdadm. The syntax is somewhat more cryptic, though; here are a few examples:
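
To decode the cryptic parts: every table line fed to dmsetup has the form "<logical start sector> <length in sectors> <target type> <target arguments...>". A throwaway example (the name demo and the 2048-sector length are arbitrary):

$ echo 0 2048 linear /dev/sdb1 0 | dmsetup create demo     # 2048 sectors of /dev/sdb1, starting at offset 0
$ dmsetup remove demo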

Linear

Create a mapping device on top of a single device. Let's up the ante a bit and specify an offset:

$ parted /dev/sdb u s p | grep -A1 ^Number
Number  Start  End       Size      Type     File system  Flags
1       63s    8388607s  8388545s  primary

$ echo 0 $(expr `blockdev --getsz /dev/sdb` - 63) linear /dev/sdb 63 | dmsetup create test

Of course, we could've just created the mapping device on /dev/sdb1 and omitted the offset:

$ dmsetup remove test
$ echo 0 `blockdev --getsz /dev/sdb1` linear /dev/sdb1 0 | dmsetup create test
$ dmsetup table
test: 0 8388545 linear 8:17 0

RAID-0

Double the capacity, halve the MTBF:

$ echo 0 $(expr `blockdev --getsz /dev/sdb` + `blockdev --getsz /dev/sdc`) striped 2 128 /dev/sdb 0 /dev/sdc 0 | dmsetup create test

Let's see what we have now:

$ blockdev --getsz /dev/sd[bc] /dev/mapper/test
 8388608
 8388608
16777216

RAID-1

Due to the nature of RAID-1, SIZE will be the size of the smallest available device:

$ blockdev --getsz /dev/sd[bc]
8388608
8388608

In our case, both devices have the same size.

$ SIZE=8388608
$ echo 0 $SIZE mirror core 2 128 nosync 2 /dev/sdb 0 /dev/sdc 0 | dmsetup create test
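
To verify the mirror, dmsetup can print the mapping and its status (with nosync, both legs should be reported as in sync right away):

$ dmsetup table test      # shows the mirror mapping we just created
$ dmsetup status test     # shows per-leg health and sync progress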

OpenBSD

Even though OpenBSD advises not to use RAID[2], let's create a software RAID anyway.

As a silly example, we'll create a RAID-1 from two partitions of the same disk:

$ dmesg | grep wd1
wd1 at pciide0 channel 1 drive 0: <VBOX HARDDISK>
wd1: 128-sector PIO, LBA, 10240MB, 20971520 sectors
wd1(pciide0:1:0): using PIO mode 4, Ultra-DMA mode 2

Create a new MBR on the disk:

$ fdisk -i wd1
Do you wish to write new MBR and partition table? [n] y
Writing MBR at offset 0.

$ fdisk wd1
Disk: wd1       geometry: 1305/255/63 [20971520 Sectors]
Offset: 0       Signature: 0xAA55
            Starting         Ending         LBA Info:
 #: id      C   H   S -      C   H   S [       start:        size ]
-------------------------------------------------------------------------------
 0: 00      0   0   0 -      0   0   0 [           0:           0 ] unused
 1: 00      0   0   0 -      0   0   0 [           0:           0 ] unused
 2: 00      0   0   0 -      0   0   0 [           0:           0 ] unused
*3: A6      0   1   2 -   1304 254  63 [          64:    20964761 ] OpenBSD

Create two partitions:

$ disklabel -E wd1
> a
partition: [a]
offset: [64]
size: [20964761] 8388608                           # 4 GB
FS type: [4.2BSD] RAID

> a
partition: [b]
offset: [8388672]
size: [12576153] 8388608                           # 4 GB
FS type: [swap] RAID
> p
OpenBSD area: 64-20964825; size: 20964761; free: 4187545
#                size           offset  fstype [fsize bsize  cpg]
  a:          8388608               64    RAID
  b:          8388608          8388672    RAID
  c:         20971520                0  unused
> w
> q
No label changes.

With that in place, we can create a virtual softraid0 device:

$ bioctl -c 1 -l /dev/wd1a,/dev/wd1b softraid0
softraid0: RAID 1 volume attached as sd0

$ dmesg | tail -2
sd0 at scsibus2 targ 1 lun 0: <OPENBSD, SR RAID 1, 005> SCSI2 0/direct fixed
sd0: 4095MB, 512 bytes/sector, 8388080 sectors

Note: it's always softraid0, even when we create more RAID devices on the system.
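
For example, a second volume on another pair of RAID partitions (the hypothetical /dev/wd2a and /dev/wd2b) would still be created via softraid0 and simply attach as the next free sd device:

$ bioctl -c 1 -l /dev/wd2a,/dev/wd2b softraid0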

Check if our new disk is running:

$ bioctl softraid0
Volume      Status               Size Device  
softraid0 0 Online         4294696960 sd0     RAID1 
          0 Online         4294696960 0:0.0   noencl <wd1a>
          1 Online         4294696960 0:1.0   noencl <wd1b>

Create a new filesystem:

$ newfs -q /dev/rsd0c 
/dev/rsd0c: 4095.7MB in 8388080 sectors of 512 bytes
21 cylinder groups of 202.47MB, 12958 blocks, 25984 inodes each

$ file -Ls  /dev/rsd0c
/dev/rsd0c: Unix Fast File system [v1] (little-endian),[...]

Mount it:

$ mount -t ffs /dev/sd0c /mnt/disk    
$ mount | tail -1
/dev/sd0c on /mnt/disk type ffs (local)

$ df -h /mnt/disk/
Filesystem     Size    Used   Avail Capacity  Mounted on
/dev/sd0c      3.9G    2.0K    3.7G     0%    /mnt/disk

Tear it all down again:

$ umount /mnt/disk/
$ bioctl -d sd0
$ dmesg | tail -1
sd0 detached


References