
Recovering an old Intel RST RAID0 Array

Many years ago, I used a mainboard's integrated Intel Rapid Storage Technology feature to combine two SSDs into a volume larger than any single SSD available back then. I'd like to recover some data from those disks, so let's see what Linux can do.

Imaging the Disks

Whenever you work on data recovery, you really want to do everything on disk images instead of the live disks. It is too easy to mistype a command or run into unexpected software behavior and ruin your chances of recovery by overwriting critical data.

Taking images is luckily trivial; you just need enough space:

# dd status=progress if=/dev/sda of=disk.img bs=1M

In this example, /dev/sda is the disk you want to image, and we set a block size of 1 megabyte. A block size larger than the default 512 bytes can make copying more efficient, depending on the type of disk and how it is attached.
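
Note that dd gives up at the first read error, though. If a disk is already failing, GNU ddrescue is the usual alternative, since it skips and retries bad sectors and records its progress in a map file. A minimal sketch (the map file name is arbitrary):

# ddrescue /dev/sda disk.img disk.map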

I also took an image of the second disk, ending up with two files disk.img and disk2.img.

Exploring the Disks

Tools often expect block devices and not image files, so as a first step, let's create loopback devices for both:

# losetup --find --show disk.img
/dev/loop0
# losetup --find --show disk2.img
/dev/loop1
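
In hindsight, since we never want to write to the images, it would be even safer to create the loop devices read-only, so the kernel refuses writes at the loop level. losetup supports this via --read-only (or -r):

# losetup --find --show --read-only disk.img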

fdisk can already help us determine the order of the disks:

# fdisk -l /dev/loop0
Disk /dev/loop0: 238.47 GiB, 256060514304 bytes, 500118192 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x1234abcd

Device       Boot     Start        End   Sectors   Size Id Type
/dev/loop0p1 *         2048     718847    716800   350M  7 HPFS/NTFS/exFAT
/dev/loop0p2         718848  999301119 998582272 476.2G  7 HPFS/NTFS/exFAT
/dev/loop0p3      999301120 1000222719    921600   450M 27 Hidden NTFS WinRE

# fdisk -l /dev/loop1
Disk /dev/loop1: 238.47 GiB, 256060514304 bytes, 500118192 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

RAID0 works by interleaving chunks of data across the disks. Since the MBR with the partition table is quite small (512 bytes), it fits entirely within the first chunk even at the smallest stripe setting. For this reason, we get a valid partition table on one of the disks but not on the other.
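
We can check this directly by looking for the 0x55AA MBR boot signature at byte offset 510. It is present on the first image; running the same command against disk2.img will generally show whatever striped data happened to land in that spot instead. The output should look roughly like:

# xxd -s 510 -l 2 disk.img
000001fe: 55aa                                     U.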

We can also see that the second partition is actually larger than a single disk, further hinting at a RAID0 situation. It could of course be any sort of RAID technology, but there should be further metadata describing the configuration of the array. A first hint is this:

# blkid /dev/loop0
/dev/loop0: TYPE="isw_raid_member"

which is a string associated with Intel's Rapid Storage Technology (RST) or Matrix Storage Manager (MSM).
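
blkid can tell because Intel stores its metadata near the end of each member disk. Assuming GNU coreutils (for the 1M suffix), we can spot the magic signature by scanning the tail of an image; the output should contain something like:

# tail -c 1M disk.img | strings | grep 'Intel Raid'
Intel Raid ISM Cfg Sig. 1.0.00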

Trying mdadm

mdadm claims to support Intel RST, so let's give it a go:

# mdadm --examine -e imsm /dev/loop0 /dev/loop1
mdadm: /dev/loop0 is not attached to Intel(R) RAID controller.
mdadm: /dev/loop0 is not attached to Intel(R) RAID controller.
mdadm: Failed to retrieve serial for /dev/loop0
mdadm: Failed to load all information sections on /dev/loop0
mdadm: /dev/loop1 is not attached to Intel(R) RAID controller.
mdadm: /dev/loop1 is not attached to Intel(R) RAID controller.
mdadm: Failed to retrieve serial for /dev/loop1
mdadm: Failed to load all information sections on /dev/loop1

This is a bit disappointing: it seems mdadm expects the disks to be attached to an actual Intel RAID controller, but we are looking at image files, not a live array. Consulting the internet reveals that one can set an environment variable to tell mdadm not to expect a controller:

# IMSM_NO_PLATFORM=1 mdadm --examine -e imsm /dev/loop0 /dev/loop1
mdadm: Failed to retrieve serial for /dev/loop0
mdadm: Failed to load all information sections on /dev/loop0
mdadm: Failed to retrieve serial for /dev/loop1
mdadm: Failed to load all information sections on /dev/loop1

This is not really much better, but hey, we got a different error. The internet doesn't seem helpful in this case, so let's go poke a bit at the source code. The code is hosted on kernel.org, but there also seems to be an older GitHub mirror by the former maintainer. Since GitHub has some nicer code browsing features, let's use that.

Let's start by searching for the first error string, "Failed to retrieve serial for", which leads us to the promising-sounding super-intel.c. The context around the line is this:

rv = nvme_get_serial(fd, buf, sizeof(buf));

if (rv)
    rv = scsi_get_serial(fd, buf, sizeof(buf));

if (rv && check_env("IMSM_DEVNAME_AS_SERIAL")) {
    memset(serial, 0, MAX_RAID_SERIAL_LEN);
    fd2devname(fd, (char *) serial);
    return 0;
}

if (rv != 0) {
    if (devname)
        pr_err("Failed to retrieve serial for %s\n",
               devname);
    return rv;
}

So it looks like mdadm is trying to read the serial number of the device it's looking at, which fails for loopback devices. But in the code, we can also see that there's a check for another environment variable to get around this: IMSM_DEVNAME_AS_SERIAL. Let's try that:

# IMSM_DEVNAME_AS_SERIAL=1 IMSM_NO_PLATFORM=1 mdadm --examine -e imsm /dev/loop0 /dev/loop1
/dev/loop0:
        Magic : Intel Raid ISM Cfg Sig.
        Version : 1.0.00
    Orig Family : c241033d
        Family : c241033d
    Generation : 00002613
Creation Time : Unknown
    Attributes : All supported
        UUID : 4de8c4a7:559e81a4:e0687ed8:b47e5ce8
    Checksum : 579a65d2 correct
    MPB Sectors : 1
        Disks : 2
RAID Devices : 1

[root]:
    Subarray : 0
        UUID : f9dd0ad6:7f914762:8b8feecb:32bda46b
    RAID Level : 0
        Members : 2
        Slots : [UU]
    Failed disk : none
    This Slot : ?
    Sector Size : 512
    Array Size : 1000226816 (476.95 GiB 512.12 GB)
Per Dev Size : 500113672 (238.47 GiB 256.06 GB)
Sector Offset : 0
    Num Stripes : 7814272
    Chunk Size : 32 KiB
    Reserved : 0
Migrate State : idle
    Map State : normal
    Dirty State : clean
    RWH Policy : off
    Volume ID : 0

Disk00 Serial : DISKSERIAL1
        State : active
            Id : 00000000
    Usable Size : 500107790 (238.47 GiB 256.06 GB)

Disk01 Serial : DISKSERIAL2
        State : active
            Id : 00010000
    Usable Size : 500107790 (238.47 GiB 256.06 GB)
/dev/loop1:
        Magic : Intel Raid ISM Cfg Sig.
        Version : 1.0.00
    Orig Family : c241033d
        Family : c241033d
    Generation : 00002613
Creation Time : Unknown
    Attributes : All supported
        UUID : 4de8c4a7:559e81a4:e0687ed8:b47e5ce8
    Checksum : 579a65d2 correct
    MPB Sectors : 1
        Disks : 2
RAID Devices : 1

[root]:
    Subarray : 0
        UUID : f9dd0ad6:7f914762:8b8feecb:32bda46b
    RAID Level : 0
        Members : 2
        Slots : [UU]
    Failed disk : none
    This Slot : ?
    Sector Size : 512
    Array Size : 1000226816 (476.95 GiB 512.12 GB)
Per Dev Size : 500113672 (238.47 GiB 256.06 GB)
Sector Offset : 0
    Num Stripes : 7814272
    Chunk Size : 32 KiB
    Reserved : 0
Migrate State : idle
    Map State : normal
    Dirty State : clean
    RWH Policy : off
    Volume ID : 0

Disk00 Serial : DISKSERIAL1
        State : active
            Id : 00000000
    Usable Size : 500107790 (238.47 GiB 256.06 GB)

Disk01 Serial : DISKSERIAL2
        State : active
            Id : 00010000
    Usable Size : 500107790 (238.47 GiB 256.06 GB)

Very nice! All of this information looks correct.

Mounting the Array

Let's try to assemble the array:

# IMSM_DEVNAME_AS_SERIAL=1 IMSM_NO_PLATFORM=1 mdadm --assemble --verbose --metadata=imsm /dev/md99 /dev/loop0 /dev/loop1
mdadm: looking for devices for /dev/md99
mdadm: /dev/loop0 is identified as a member of /dev/md99, slot -1.
mdadm: /dev/loop1 is identified as a member of /dev/md99, slot -1.
mdadm: added /dev/loop1 to /dev/md99 as -1
mdadm: added /dev/loop0 to /dev/md99 as -1
mdadm: Container /dev/md99 has been assembled with 2 drives

Unfortunately, this only assembles a container, which is not the actual array yet. Searching the internet is not really giving me the answer here, and such an abstract topic seems tricky to figure out without a complete study of the mdadm source code.
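
For what it's worth, /proc/mdstat paints the same picture: the two loop devices sit in an inactive external:imsm container with no RAID0 volume on top. The output looks roughly like this (block count elided):

# cat /proc/mdstat
md99 : inactive loop1[1](S) loop0[0](S)
      ... blocks super external:imsm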

But we have another trick up our sleeve: since mdadm --examine gave us all the configuration values, we can use the device mapper directly to set up the striped device ourselves. We know /dev/loop0 comes first and that the chunk size is 32 KiB. The total size is 1000226816. This number looks weird though; weren't we expecting around 500000000000 bytes? It turns out this number is in 512-byte sectors, not bytes (a quick sanity check follows below the list). The good news is that dmsetup also expects sectors, so we don't have to convert it:

# dmsetup create restore --table "0 1000226816 striped 2 64 /dev/loop0 0 /dev/loop1 0"

This creates a device mapper volume called restore (so it will be found at /dev/mapper/restore). The --table argument contains the configuration for the volume. The pieces are:

  • 0 the start sector. We start at the beginning, so 0 it is.
  • 1000226816 the number of sectors of the complete device. Copied from above.
  • striped data is striped as this is a RAID0.
  • 2 number of disks.
  • 64 stripe size in sectors. (32*1024)/512 = 32*2 = 64.
  • /dev/loop0 the first disk.
  • 0 start with the first sector.
  • /dev/loop1 the second disk.
  • 0 start with the first sector.
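
As the promised sanity check, multiplying the sector count by 512 gives back exactly the byte size mdadm reported as 512.12 GB:

# echo $((1000226816 * 512))
512116129792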

We can use kpartx to discover the partitions on the assembled device (so we get /dev/mapper/restore2 like we would with /dev/sda2) and ntfs-3g to mount the filesystem read-only:

# kpartx -a /dev/mapper/restore
# mkdir -p /mnt/restore
# ntfs-3g -o ro /dev/mapper/restore2 /mnt/restore

And there we go, all the data is there, yay!
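
For completeness, everything can be torn down again in reverse order once you are done with the data:

# umount /mnt/restore
# kpartx -d /dev/mapper/restore
# dmsetup remove restore
# losetup -d /dev/loop0 /dev/loop1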

Final Thoughts

While this was a good learning experience, having figured everything out lets you do more targeted searches for the magic keywords. And of course it turns out Andrew Brampton went through pretty much the same journey. Oh well ;)