
Unix & Linux Stack Exchange

Q&A for users of Linux, FreeBSD and other Unix-like operating systems

Latest Questions

1 vote
2 answers
2114 views
Help recovering a raid5 array
A little bit of background first. I store a bunch of data on a Thecus N4200Pro NAS array. I had gotten a report that one of the 4 drives in the array was showing SMART errors, so I swapped out the offending drive (#4) and it got to work rebuilding. About 60% into the rebuild one of the other drives in the array drops out, #1 in this case. Great.. I shut down and try swapping back in the original #4 to see if it will come back up. No dice.

So I shut down and swap #1 & #2 to see if they can recover with the bad drive swapped around, and replace the #4 with the half-rebuilt #4. In hindsight this was bad. I should have shut down after the first one and cloned all the original discs from there. The device boots back up and of course the raid fails to assemble, showing only discs 3 and 4, 4 being marked as a spare.

At this point I shut everything down, pull all the discs and clone them, making sure to keep track of the number order. I put all 4 cloned discs into my Ubuntu 16.04 LTS box in the correct drive order and booted up. All 4 discs show up, and show the partitions in Disks. It shows a raid5 array and a raid1 array as well. The raid1 array is the system info for the NAS, not really concerned with that. The raid5 array is the one I'm interested in with all my data on it, but I can't access anything on it. So time to start digging.

First I ran cat /proc/mdstat to see the arrays:

jake@ubuntu-box:~$ cat /proc/mdstat
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md0 : active raid1 sdd1
      1959884 blocks super 1.0 [4/1] [___U]

md1 : inactive sdd2(S) sdc2(S) sdb2(S) sda2(S)
      3899202560 blocks

unused devices:

OK, it sees two arrays. So we get the details on md1 from: mdadm --detail /dev/md1

jake@ubuntu-box:~$ sudo mdadm --detail /dev/md1
/dev/md1:
        Version : 0.90
     Raid Level : raid0
  Total Devices : 4
Preferred Minor : 0
    Persistence : Superblock is persistent

          State : inactive

           UUID : e7ab07c3:b9ffa9ae:377e3cd3:a8ece374
         Events : 0.14344

    Number   Major   Minor   RaidDevice

       -       8       50        -        /dev/sdd2
       -       8       34        -        /dev/sdc2
       -       8       18        -        /dev/sdb2
       -       8        2        -        /dev/sda2

Hmm, that's odd: it is showing the raid as raid0, which is not the case.
OK, let's check out each individual partition with: mdadm --examine /dev/sdXX

Disc 1

jake@ubuntu-box:~$ sudo mdadm --examine /dev/sda2
/dev/sda2:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : e7ab07c3:b9ffa9ae:377e3cd3:a8ece374
  Creation Time : Thu Aug 18 14:30:36 2011
     Raid Level : raid5
  Used Dev Size : 974800000 (929.64 GiB 998.20 GB)
     Array Size : 2924400000 (2788.93 GiB 2994.59 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 1

    Update Time : Tue Mar 13 14:00:33 2018
          State : clean
 Active Devices : 3
Working Devices : 4
 Failed Devices : 1
  Spare Devices : 1
       Checksum : e52c5f8 - correct
         Events : 20364

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     0       8        2        0      active sync   /dev/sda2

   0     0       8        2        0      active sync   /dev/sda2
   1     1       8       18        1      active sync   /dev/sdb2
   2     2       8       34        2      active sync   /dev/sdc2
   3     3       0        0        3      faulty removed
   4     4       8       50        4      spare   /dev/sdd2

Disc 2

jake@ubuntu-box:~$ sudo mdadm --examine /dev/sdb2
/dev/sdb2:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : e7ab07c3:b9ffa9ae:377e3cd3:a8ece374
  Creation Time : Thu Aug 18 14:30:36 2011
     Raid Level : raid5
  Used Dev Size : 974800000 (929.64 GiB 998.20 GB)
     Array Size : 2924400000 (2788.93 GiB 2994.59 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 1

    Update Time : Tue Mar 13 14:56:30 2018
          State : clean
 Active Devices : 2
Working Devices : 3
 Failed Devices : 1
  Spare Devices : 1
       Checksum : e597e42 - correct
         Events : 238868

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     1       8       18        1      active sync   /dev/sdb2

   0     0       0        0        0      removed
   1     1       8       18        1      active sync   /dev/sdb2
   2     2       8       34        2      active sync   /dev/sdc2
   3     3       0        0        3      faulty removed
   4     4       8       50        4      spare   /dev/sdd2

Disc 3

jake@ubuntu-box:~$ sudo mdadm --examine /dev/sdc2
/dev/sdc2:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : e7ab07c3:b9ffa9ae:377e3cd3:a8ece374
  Creation Time : Thu Aug 18 14:30:36 2011
     Raid Level : raid5
  Used Dev Size : 974800000 (929.64 GiB 998.20 GB)
     Array Size : 2924400000 (2788.93 GiB 2994.59 GB)
   Raid Devices : 4
  Total Devices : 3
Preferred Minor : 1

    Update Time : Tue Mar 13 15:10:07 2018
          State : clean
 Active Devices : 1
Working Devices : 2
 Failed Devices : 2
  Spare Devices : 1
       Checksum : e598570 - correct
         Events : 239374

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     2       8       34        2      active sync   /dev/sdc2

   0     0       0        0        0      removed
   1     1       0        0        1      faulty removed
   2     2       8       34        2      active sync   /dev/sdc2
   3     3       0        0        3      faulty removed
   4     4       8       50        4      spare   /dev/sdd2

and Disc 4

jake@ubuntu-box:~$ sudo mdadm --examine /dev/sdd2
/dev/sdd2:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : e7ab07c3:b9ffa9ae:377e3cd3:a8ece374
  Creation Time : Thu Aug 18 14:30:36 2011
     Raid Level : raid5
  Used Dev Size : 974800000 (929.64 GiB 998.20 GB)
     Array Size : 2924400000 (2788.93 GiB 2994.59 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 1

    Update Time : Tue Mar 13 11:03:10 2018
          State : clean
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0
       Checksum : e526d87 - correct
         Events : 14344

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     3       8       50        3      active sync   /dev/sdd2

   0     0       8        2        0      active sync   /dev/sda2
   1     1       8       18        1      active sync   /dev/sdb2
   2     2       8       34        2      active sync   /dev/sdc2
   3     3       8       50        3      active sync   /dev/sdd2

So the magic numbers and UUID are all good across the set. The event counts are all out of whack because it had tried to rebuild the replaced #4 as a spare instead of just rebuilding #4. Disc 4 has the correct info for the raid, and the correct sequencing, as it was the drive I pulled originally and it didn't get anything re-written.
Discs 1-3 are showing various states of chaos from swapping things around. So, two questions:

1. Why is it showing up as raid0 in the mdadm --detail output?
2. Is it possible to update the info for the first three discs with the info I got from mdadm --examine /dev/sdd2, so that the array is seen as it should be, instead of the mess that I inadvertently made of it? I *think* that if I can find a way to update the info for those partitions or discs, the raid should reassemble correctly and rebuild itself so I can access my data.

Any ideas would be helpful, as I've gotten about as far as I can get trying to figure this out on my own and doing a ton of searching.
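For readers in a similar spot, here is a minimal, non-destructive first step sketched for reference (not part of the original post; it assumes you work only on the clones, which appear as /dev/sda2 through /dev/sdd2 as above): stop the half-assembled inactive array, compare event counters, and let mdadm attempt a forced, read-only assemble before trying anything more invasive.

# Sketch only -- run against the clones, never the original discs.
sudo mdadm --stop /dev/md1                      # release the inactive, mis-detected array
sudo mdadm --examine /dev/sd[abcd]2 | egrep 'sd|Update Time|Events|State'
# --force tolerates moderate event-count mismatches; --readonly keeps mdadm
# from starting a rebuild while you inspect the result:
sudo mdadm --assemble --force --readonly /dev/md1 /dev/sda2 /dev/sdb2 /dev/sdc2 /dev/sdd2
cat /proc/mdstat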
psykokid (11 rep)
Mar 16, 2018, 01:01 AM • Last activity: Jul 14, 2025, 04:07 PM
0 votes
1 answer
820 views
Raid5 mdadm array change size
I have created a raid5 array with 4 disks. Initially I had 3x 3TB and 1x 4TB (because a 3TB drive was unavailable at the time). After some years I have replaced most of these disks and have come to the point where all array disks are now 4TB in size. Still, my mdadm array only uses 3TB of each disk. Is there any way to change the mdadm array size to match the 4TB disk size without losing my data? Thanks for your help!
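A rough sketch of the usual approach, for reference (assuming the array is /dev/md0, that every member now sits on a 4TB disk or on a partition already enlarged to use it, that the filesystem sits directly on the array, and that a backup exists): grow the md device to the new member size first, then grow the filesystem on top.

sudo mdadm --grow /dev/md0 --size=max       # let md use all available space on every member
cat /proc/mdstat                            # wait for the resync this triggers to finish
sudo resize2fs /dev/md0                     # for ext2/3/4; use xfs_growfs for XFS
sudo mdadm --detail /dev/md0 | grep 'Array Size'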
jack (101 rep)
Aug 30, 2020, 09:00 PM • Last activity: Jun 7, 2025, 03:17 PM
2 votes
1 answer
80 views
Reassemble Raid5 Array after disabling TPM
Edit: Both /dev/sdd and /dev/sde are missing superblocks. I assume this cannot be fixed. I am wiping the drives and starting over.

I just finished copying 8TB worth of data to a new raid5 array. I just turned off TPM in my BIOS, and this array was no longer readable. I would like to fix this rather than starting over. I tried to reassemble it, and got this error.

$ sudo mdadm --assemble /dev/md0 /dev/sda /dev/sdb /dev/sdd /dev/sde -f
mdadm: No super block found on /dev/sdd (Expected magic a92b4efc, got 00000000)
mdadm: no RAID superblock on /dev/sdd
mdadm: /dev/sdd has no superblock - assembly aborted

Here's what examining /dev/sdd resulted in.

$ sudo mdadm -E /dev/sdd
/dev/sdd:
   MBR Magic : aa55
Partition :   4294967295 sectors at            1 (type ee)

Here's some more diagnostics:

sudo mdadm --examine /dev/sd*
/dev/sda:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 7844a579:00996056:06c4e1dd:0e70ebcb
           Name : scott-LinuxMint:0  (local to host scott-LinuxMint)
  Creation Time : Thu Jan 2 12:50:26 2025
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 7813772976 sectors (3.64 TiB 4.00 TB)
     Array Size : 11720659392 KiB (10.92 TiB 12.00 TB)
  Used Dev Size : 7813772928 sectors (3.64 TiB 4.00 TB)
    Data Offset : 264192 sectors
   Super Offset : 8 sectors
   Unused Space : before=264112 sectors, after=48 sectors
          State : clean
    Device UUID : 0febcd7e:7581f3c8:7b5962c5:cbddee7c

Internal Bitmap : 8 sectors from superblock
    Update Time : Fri Jan 3 22:05:37 2025
  Bad Block Log : 512 entries available at offset 24 sectors
       Checksum : 852d7efe - correct
         Events : 6116

         Layout : left-symmetric
     Chunk Size : 64K

    Device Role : Active device 0
    Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdb:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 7844a579:00996056:06c4e1dd:0e70ebcb
           Name : scott-LinuxMint:0  (local to host scott-LinuxMint)
  Creation Time : Thu Jan 2 12:50:26 2025
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 7813772976 sectors (3.64 TiB 4.00 TB)
     Array Size : 11720659392 KiB (10.92 TiB 12.00 TB)
  Used Dev Size : 7813772928 sectors (3.64 TiB 4.00 TB)
    Data Offset : 264192 sectors
   Super Offset : 8 sectors
   Unused Space : before=264112 sectors, after=48 sectors
          State : clean
    Device UUID : d2280c55:cf16ae93:aaa5e4a0:71e30dbb

Internal Bitmap : 8 sectors from superblock
    Update Time : Fri Jan 3 22:05:37 2025
  Bad Block Log : 512 entries available at offset 24 sectors
       Checksum : 3fc7a3f1 - correct
         Events : 6116

         Layout : left-symmetric
     Chunk Size : 64K

    Device Role : Active device 1
    Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdc:
   MBR Magic : aa55
Partition :   4294967295 sectors at            1 (type ee)
/dev/sdc1:
   MBR Magic : aa55
Partition :   1836016416 sectors at   1936269394 (type 4f)
Partition :    544437093 sectors at   1917848077 (type 73)
Partition :    544175136 sectors at   1818575915 (type 2b)
Partition :        54974 sectors at   2844524554 (type 61)
/dev/sdd:
   MBR Magic : aa55
Partition :   4294967295 sectors at            1 (type ee)
mdadm: No md superblock detected on /dev/sdd1.
/dev/sde:
   MBR Magic : aa55
Partition :   4294967295 sectors at            1 (type ee)
mdadm: No md superblock detected on /dev/sde1.

And the drive seems healthy.
$sudo smartctl -d ata -a /dev/sdd smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.8.0-51-generic] (local build) Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org === START OF INFORMATION SECTION === Model Family: Seagate Skyhawk Device Model: ST4000VX007-2DT166 Serial Number: ZDH61N4Z LU WWN Device Id: 5 000c50 0b4cf0507 Firmware Version: CV11 User Capacity: 4,000,787,030,016 bytes [4.00 TB] Sector Sizes: 512 bytes logical, 4096 bytes physical Rotation Rate: 5980 rpm Form Factor: 3.5 inches Device is: In smartctl database 7.3/5528 ATA Version is: ACS-3 T13/2161-D revision 5 SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s) Local Time is: Fri Jan 3 23:01:40 2025 EST SMART support is: Available - device has SMART capability. SMART support is: Enabled === START OF READ SMART DATA SECTION === SMART overall-health self-assessment test result: PASSED General SMART Values: Offline data collection status: (0x82) Offline data collection activity was completed without error. Auto Offline Data Collection: Enabled. Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run. Total time to complete Offline data collection: ( 591) seconds. Offline data collection capabilities: (0x7b) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. Offline surface scan supported. Self-test supported. Conveyance Self-test supported. Selective Self-test supported. SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer. Error logging capability: (0x01) Error logging supported. General Purpose Logging supported. Short self-test routine recommended polling time: ( 1) minutes. Extended self-test routine recommended polling time: ( 633) minutes. Conveyance self-test routine recommended polling time: ( 2) minutes. SCT capabilities: (0x50bd) SCT Status supported. SCT Error Recovery Control supported. SCT Feature Control supported. SCT Data Table supported. 
SMART Attributes Data Structure revision number: 10 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x000f 075 064 044 Pre-fail Always - 30305794 3 Spin_Up_Time 0x0003 094 093 000 Pre-fail Always - 0 4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 276 5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 0 7 Seek_Error_Rate 0x000f 095 060 045 Pre-fail Always - 3166340513 9 Power_On_Hours 0x0032 069 069 000 Old_age Always - 27536h+49m+43.964s 10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0 12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 104 184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0 187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0 188 Command_Timeout 0x0032 100 099 000 Old_age Always - 7864440 189 High_Fly_Writes 0x003a 100 100 000 Old_age Always - 0 190 Airflow_Temperature_Cel 0x0022 081 047 040 Old_age Always - 19 (Min/Max 19/19) 191 G-Sense_Error_Rate 0x0032 100 100 000 Old_age Always - 0 192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 117 193 Load_Cycle_Count 0x0032 099 099 000 Old_age Always - 2608 194 Temperature_Celsius 0x0022 019 053 000 Old_age Always - 19 (0 6 0 0 0) 197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0 198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0 199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0 240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 27376h+00m+21.311s 241 Total_LBAs_Written 0x0000 100 253 000 Old_age Offline - 247975821685 242 Total_LBAs_Read 0x0000 100 253 000 Old_age Offline - 124682775664 SMART Error Log Version: 1 No Errors Logged SMART Self-test log structure revision number 1 No self-tests have been logged. [To run self-tests, use: smartctl -t] SMART Selective self-test log data structure revision number 1 SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS 1 0 0 Not_testing 2 0 0 Not_testing 3 0 0 Not_testing 4 0 0 Not_testing 5 0 0 Not_testing Selective self-test flags (0x0): After scanning selected spans, do NOT read-scan remainder of disk. If Selective self-test is pending on power-up, resume after 0 minute delay. The above only provides legacy SMART information - try 'smartctl -x' for more Let me know if you can help. I am very new to this. Edit: added this fdisk test. I do have another unrelated drive, /dev/sdc. $ sudo fdisk -l /dev/sd? The primary GPT table is corrupt, but the backup appears OK, so that will be used. Disk /dev/sda: 3.64 TiB, 4000787030016 bytes, 7814037168 sectors Disk model: ST4000VX007-2DT1 Units: sectors of 1 * 512 = 512 bytes Sector size (logical/physical): 512 bytes / 4096 bytes I/O size (minimum/optimal): 4096 bytes / 4096 bytes Disklabel type: gpt Disk identifier: 0384604C-4E8B-4E0A-8423-2139A918120C Device Start End Sectors Size Type /dev/sda1 2048 7814035455 7814033408 3.6T Linux filesystem The primary GPT table is corrupt, but the backup appears OK, so that will be used. 
Disk /dev/sdb: 3.64 TiB, 4000787030016 bytes, 7814037168 sectors
Disk model: ST4000VX007-2DT1
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: C079AF04-F6C8-4FB3-9E12-FEFCC65D008F

Device     Start        End    Sectors  Size Type
/dev/sdb1   2048 7814035455 7814033408  3.6T Linux filesystem


Disk /dev/sdc: 7.28 TiB, 8001563222016 bytes, 15628053168 sectors
Disk model: HGST HDN728080AL
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 5237C016-4DE9-408A-A37B-F1F59F33776E

Device     Start         End     Sectors  Size Type
/dev/sdc1   2048 15627233279 15627231232  7.3T Microsoft basic data


Disk /dev/sdd: 3.64 TiB, 4000787030016 bytes, 7814037168 sectors
Disk model: ST4000VX007-2DT1
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 56B6E76B-3B41-486B-8857-AD2BEA8D589A

Device     Start        End    Sectors  Size Type
/dev/sdd1   2048 7814035455 7814033408  3.6T Linux filesystem


Disk /dev/sde: 3.64 TiB, 4000787030016 bytes, 7814037168 sectors
Disk model: ST4000VX007-2DT1
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: B306782C-5C23-4C41-A6B9-79AF1FCC6F0E

Device     Start        End    Sectors  Size Type
/dev/sde1   2048 7814035455 7814033408  3.6T Linux filesystem
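One read-only check that may be worth doing before wiping anything (added here for reference, not part of the original post): on sda and sdb the superblocks are version 1.2, which lives 4 KiB from the start of the member device, so you can look directly at that region of sdd and sde to see whether the metadata was actually zeroed. Nothing below writes to disk.

# md v1.2 metadata starts 4096 bytes into the member device; the magic
# a92b4efc is stored little-endian, so an intact superblock shows "fc 4e 2b a9":
sudo hexdump -C -s 4096 -n 64 /dev/sdd
sudo hexdump -C -s 4096 -n 64 /dev/sde
# Compare against a known-good member:
sudo hexdump -C -s 4096 -n 64 /dev/sda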
Scott Mayo (21 rep)
Jan 4, 2025, 04:32 AM • Last activity: Jan 10, 2025, 11:07 PM
1 vote
1 answer
51 views
Is software RAID5 created by mdadm in Debian compatible with OpenBSD softraid?
I have created a software RAID5 using mdadm in Debian Linux. Now I want to switch to OpenBSD, and I am wondering whether I will be able to mount my RAID5 under the new system.
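For context (added here, not part of the question): OpenBSD's softraid uses its own on-disk metadata managed with bioctl(8) and does not assemble Linux md arrays, so the practical route is normally to copy the data off while a Linux system can still read it. A hedged sketch of that step, where /dev/md0, the mount point and the host name obsd-host are placeholders:

# On the Debian side, record what exists and copy the contents elsewhere:
sudo mdadm --detail /dev/md0          # level, members, metadata version
sudo blkid /dev/md0                   # filesystem sitting on the array
rsync -aH /mnt/raid/ user@obsd-host:/backup/raid/   # any transport works; paths are examples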
gio (19 rep)
Jan 3, 2025, 02:08 AM • Last activity: Jan 3, 2025, 01:23 PM
1 vote
1 answer
91 views
Destroyed file system after adding disks to LVM on RAID5?
Just adding a bit of storage capacity on my Openmediavault storage server went wrong. What should I do to recover? Can you lead me? This is what I did:

1. /dev/md2 raid5 set created, went well
2. /dev/md2 pv created, went well
3. /dev/md2 pv included in volume group datavg, went well
4. creating a new file system in the logical volume datalv was not possible - no device available. I now know that only one fs per lv is possible, but was not aware of that when I tried this.
5. lvextend on datalv, went well - datalv now has the full capacity including the new drives
6. creating a new file system in datalv still not possible - no device available.
7. trying fsgrow, allocated the entire capacity in one fell swoop - not what I wanted...
8. lvreduce on datalv to approx original capacity, went well
9. lvcreate data2lv, went well
10. creating the new file system in data2lv (XFS this time), worked

After rebooting the system, the old file system (data) doesn't mount and there is no sign of the new (XFS) one. I was wrong in the assumption that an lv can have more than one file system, so step 4 is when things started to go wrong. Step 5 was wrong too, but maybe reversible. Step 7 was a big mistake, since the file system can not be shrunk back. The rest of the steps were still more wrong, but how can I fix this? Undo steps 8 to 10 and live with just one file system? I figured I should ask here before I mess things up worse. I would be very grateful for some assistance on this,

/C
$ cat /etc/fstab
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
# / was on /dev/sda1 during installation
UUID=be85bbea-6e8b-4710-b1e0-894ad8d34d20 /               ext4    errors=remount-ro 0       1
# swap was on /dev/sda5 during installation
UUID=035fdb6b-e1bb-46c9-81c5-ff28840063c7 none            swap    sw              0       0
# >>> [openmediavault]
/dev/disk/by-label/data        /srv/dev-disk-by-label-data    ext4    defaults,nofail,user_xattr,noexec,usrjquota=aquota.user,grpjquota=aquota.group,jqfmt=vfsv0,acl    0 2
/dev/disk/by-uuid/035fdb6b-e1bb-46c9-81c5-ff28840063c7        /srv/dev-disk-by-uuid-035fdb6b-e1bb-46c9-81c5-ff28840063c7    ext2    defaults,nofail,user_xattr,acl    0 2
/srv/dev-disk-by-label-data/data/        /export/data    none    bind,nofail    0 0
/srv/dev-disk-by-label-data/data/Backups/HomeAssistant/        /export/HAss    none    bind,nofail    0 0
/srv/dev-disk-by-label-data/data/Backups/LinHES/        /export/LinHES_backup    none    bind,nofail    0 0
/srv/dev-disk-by-label-data/data/LinHES/        /export/LinHES    none    bind,nofail    0 0
/srv/dev-disk-by-label-data/data/securitycams/        /export/securitycams    none    bind,nofail    0 0
# .
[   42.306079] systemd: /etc/systemd/system/clamav-onaccess.service:13: Standard output type syslog is obsolete, automatically updating to journal. Please update your unit file, and consider removing the setting altogether.
[   42.306476] systemd: /lib/systemd/system/clamav-freshclam.service:11: Standard output type syslog is obsolete, automatically updating to journal. Please update your unit file, and consider removing the setting altogether.
[   42.307153] systemd: /lib/systemd/system/clamav-daemon.service:12: Standard output type syslog is obsolete, automatically updating to journal. Please update your unit file, and consider removing the setting altogether.
[   42.312038] systemd: Queued start job for default target Graphical Interface.
[   42.330323] systemd: Created slice system-getty.slice.
[   42.330620] systemd: Created slice system-modprobe.slice.
[   42.330854] systemd: Created slice system-postfix.slice.
[   42.331079] systemd: Created slice system-systemd\x2dfsck.slice.
[   42.331305] systemd: Created slice User and Session Slice.
[   42.331368] systemd: Started Dispatch Password Requests to Console Directory Watch.
[   42.331420] systemd: Started Forward Password Requests to Wall Directory Watch.
[   42.331575] systemd: Set up automount Arbitrary Executable File Formats File System Automount Point.
[   42.331608] systemd: Reached target Local Encrypted Volumes.
[   42.331652] systemd: Reached target Paths.
[   42.331680] systemd: Reached target Slices.
[   42.331703] systemd: Reached target System Time Set.
[   42.331781] systemd: Listening on Device-mapper event daemon FIFOs.
[   42.331874] systemd: Listening on LVM2 poll daemon socket.
[   42.336222] systemd: Listening on RPCbind Server Activation Socket.
[   42.336356] systemd: Listening on Syslog Socket.
[   42.336470] systemd: Listening on fsck to fsckd communication Socket.
[   42.336541] systemd: Listening on initctl Compatibility Named Pipe.
[   42.336712] systemd: Listening on Journal Audit Socket.
[   42.336827] systemd: Listening on Journal Socket (/dev/log).
[   42.336946] systemd: Listening on Journal Socket.
[   42.337078] systemd: Listening on Network Service Netlink Socket.
[   42.337484] systemd: Listening on udev Control Socket.
[   42.337575] systemd: Listening on udev Kernel Socket.
[   42.338502] systemd: Mounting Huge Pages File System...
[   42.339406] systemd: Mounting POSIX Message Queue File System...
[   42.340365] systemd: Mounting NFSD configuration filesystem...
[   42.341305] systemd: Mounting RPC Pipe File System...
[   42.342485] systemd: Mounting Kernel Debug File System...
[   42.343458] systemd: Mounting Kernel Trace File System...
[   42.343588] systemd: Condition check resulted in Kernel Module supporting RPCSEC_GSS being skipped.
[   42.343753] systemd: Finished Availability of block devices.
[   42.345380] systemd: Starting Set the console keyboard layout...
[   42.346510] systemd: Starting Create list of static device nodes for the current kernel...
[   42.347606] systemd: Starting Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling...
[   42.348989] systemd: Starting Load Kernel Module configfs...
[   42.350348] systemd: Starting Load Kernel Module drm...
[   42.351388] systemd: Starting Load Kernel Module fuse...
[   42.351481] systemd: Condition check resulted in OpenVSwitch configuration for cleanup being skipped.
[   42.352694] systemd: Condition check resulted in Set Up Additional Binary Formats being skipped.
[   42.352730] systemd: Condition check resulted in File System Check on Root Device being skipped.
[   42.354950] systemd: Starting Journal Service...
[   42.357137] systemd: Starting Load Kernel Modules...
[   42.358386] systemd: Starting Remount Root and Kernel File Systems...
[   42.359555] systemd: Starting Coldplug All udev Devices...
[   42.362006] systemd: Mounted Huge Pages File System.
[   42.362148] systemd: Mounted POSIX Message Queue File System.
[   42.362262] systemd: Mounted Kernel Debug File System.
[   42.362373] systemd: Mounted Kernel Trace File System.
[   42.362841] systemd: Finished Create list of static device nodes for the current kernel.
[   42.363275] systemd: modprobe@configfs.service: Succeeded.
[   42.363694] systemd: Finished Load Kernel Module configfs.
[   42.364942] systemd: Mounting Kernel Configuration File System...
[   42.368675] systemd: Finished Load Kernel Modules.
[   42.368852] systemd: Mounted Kernel Configuration File System.
[   42.370036] systemd: Starting Apply Kernel Variables...
[   42.370843] fuse: init (API version 7.37)
[   42.371752] systemd: modprobe@fuse.service: Succeeded.
[   42.372144] systemd: Finished Load Kernel Module fuse.
[   42.373401] systemd: Mounting FUSE Control File System...
[   42.376648] systemd: Mounted FUSE Control File System.
[   42.383402] systemd: Finished Apply Kernel Variables.
[   42.384015] EXT4-fs (md126): re-mounted. Quota mode: none.
[   42.385366] systemd: Finished Remount Root and Kernel File Systems.
[   42.387332] systemd: Starting Initial Check File System Quotas...
[   42.387952] systemd: Condition check resulted in Rebuild Hardware Database being skipped.
[   42.388047] systemd: Condition check resulted in Platform Persistent Storage Archival being skipped.
[   42.389092] systemd: Starting Load/Save Random Seed...
[   42.390273] systemd: Starting Create System Users...
[   42.406816] systemd: Finished Load/Save Random Seed.
[   42.408709] systemd: Condition check resulted in First Boot Complete being skipped.
[   42.410165] systemd: Finished Create System Users.
[   42.412453] systemd: Starting Create Static Device Nodes in /dev...
[   42.418172] systemd: Finished Initial Check File System Quotas.
[   42.429120] systemd: Finished Create Static Device Nodes in /dev.
[   42.430906] systemd: Starting Rule-based Manager for Device Events and Files...
[   42.436223] RPC: Registered named UNIX socket transport module.
[   42.436227] RPC: Registered udp transport module.
[   42.436228] RPC: Registered tcp transport module.
[   42.436229] RPC: Registered tcp NFSv4.1 backchannel transport module.
[   42.449951] ACPI: bus type drm_connector registered
[   42.450404] systemd: Mounted RPC Pipe File System.
[   42.451406] systemd: Starting pNFS block layout mapping daemon...
[   42.451934] systemd: modprobe@drm.service: Succeeded.
[   42.452283] systemd: Finished Load Kernel Module drm.
[   42.454355] systemd: Started pNFS block layout mapping daemon.
[   42.466920] systemd: Started Rule-based Manager for Device Events and Files.
[   42.468985] systemd: Starting Network Service...
[   42.482992] systemd: Finished Set the console keyboard layout.
[   42.507752] systemd: Finished Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling.
[   42.507870] systemd: Reached target Local File Systems (Pre).
[   42.539579] systemd: Started Journal Service.
[   42.553503] acpi_cpufreq: overriding BIOS provided _PSD data
[   42.561672] sd 3:0:0:0: Attached scsi generic sg0 type 0
[   42.565078] input: Power Button as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0C:00/input/input1
[   42.568531] ACPI: button: Power Button [PWRB]
[   42.568874] sd 3:0:1:0: Attached scsi generic sg1 type 0
[   42.570834] sd 3:0:2:0: Attached scsi generic sg2 type 0
[   42.574606] sd 3:0:3:0: Attached scsi generic sg3 type 0
[   42.576518] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input2
[   42.576551] sd 3:0:4:0: Attached scsi generic sg4 type 0
[   42.577588] sd 3:0:5:0: Attached scsi generic sg5 type 0
[   42.584095] sd 3:0:6:0: Attached scsi generic sg6 type 0
[   42.592174] ACPI: button: Power Button [PWRF]
[   42.595542] sd 3:0:7:0: Attached scsi generic sg7 type 0
[   42.595748] r8169 0000:04:00.0: firmware: direct-loading firmware rtl_nic/rtl8168g-2.fw
[   42.596555] ses 3:0:8:0: Attached scsi generic sg8 type 13
[   42.596684] sd 2:0:0:0: Attached scsi generic sg9 type 0
[   42.599088] sd 4:0:0:0: Attached scsi generic sg10 type 0
[   42.604298] input: PC Speaker as /devices/platform/pcspkr/input/input3
[   42.615393] systemd-journald: Received client request to flush runtime journal.
[   42.619045] sp5100_tco: SP5100/SB800 TCO WatchDog Timer Driver
[   42.621719] Generic FE-GE Realtek PHY r8169-0-400:00: attached PHY driver (mii_bus:phy_addr=r8169-0-400:00, irq=MAC)
[   42.642344] sp5100-tco sp5100-tco: Failed to reserve MMIO or alternate MMIO region
[   42.642395] sp5100-tco: probe of sp5100-tco failed with error -16
[   42.655985] cryptd: max_cpu_qlen set to 1000
[   42.685219] AVX version of gcm_enc/dec engaged.
[   42.685272] AES CTR mode by8 optimization enabled
[   42.800800] input: HDA ATI HDMI HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:01.0/0000:01:05.1/sound/card0/input4
[   42.818231] r8169 0000:04:00.0 enp4s0: Link is Down
[   42.888925] [drm] radeon kernel modesetting enabled.
[   42.889075] radeon 0000:01:05.0: vgaarb: deactivate vga console
[   42.889633] Console: switching to colour dummy device 80x25
[   42.889920] [drm] initializing kernel modesetting (RS780 0x1002:0x9616 0x1458:0xD000 0x00).
[   42.890569] ATOM BIOS: B27732
[   42.890585] radeon 0000:01:05.0: VRAM: 512M 0x00000000C0000000 - 0x00000000DFFFFFFF (512M used)
[   42.890590] radeon 0000:01:05.0: GTT: 512M 0x00000000A0000000 - 0x00000000BFFFFFFF
[   42.890598] [drm] Detected VRAM RAM=512M, BAR=256M
[   42.890600] [drm] RAM width 32bits DDR
[   42.890629] [drm] radeon: 512M of VRAM memory ready
[   42.890632] [drm] radeon: 512M of GTT memory ready.
[   42.890641] [drm] Loading RS780 Microcode
[   42.893910] radeon 0000:01:05.0: firmware: direct-loading firmware radeon/RS780_pfp.bin
[   42.894752] radeon 0000:01:05.0: firmware: direct-loading firmware radeon/RS780_me.bin
[   42.896489] radeon 0000:01:05.0: firmware: direct-loading firmware radeon/R600_rlc.bin
[   42.896503] [drm] radeon: power management initialized
[   42.896507] [drm] GART: num cpu pages 131072, num gpu pages 131072
[   42.915405] [drm] PCIE GART of 512M enabled (table at 0x00000000C0040000).
[   42.915440] radeon 0000:01:05.0: WB enabled
[   42.915445] radeon 0000:01:05.0: fence driver on ring 0 use gpu addr 0x00000000a0000c00
[   42.919478] radeon 0000:01:05.0: radeon: MSI limited to 32-bit
[   42.919495] [drm] radeon: irq initialized.
[   42.950634] [drm] ring test on 0 succeeded in 1 usecs
[   42.950879] [drm] ib test on ring 0 succeeded in 0 usecs
[   42.951901] [drm] Radeon Display Connectors
[   42.951903] [drm] Connector 0:
[   42.951905] [drm]   VGA-1
[   42.951906] [drm]   DDC: 0x7e40 0x7e40 0x7e44 0x7e44 0x7e48 0x7e48 0x7e4c 0x7e4c
[   42.951909] [drm]   Encoders:
[   42.951909] [drm]     CRT1: INTERNAL_KLDSCP_DAC1
[   42.951911] [drm] Connector 1:
[   42.951912] [drm]   DVI-D-1
[   42.951912] [drm]   HPD1
[   42.951913] [drm]   DDC: 0x7e50 0x7e50 0x7e54 0x7e54 0x7e58 0x7e58 0x7e5c 0x7e5c
[   42.951915] [drm]   Encoders:
[   42.951916] [drm]     DFP1: INTERNAL_KLDSCP_LVTMA
[   42.968693] [drm] fb mappable at 0xD0141000
[   42.968699] [drm] vram apper at 0xD0000000
[   42.968701] [drm] size 3145728
[   42.968704] [drm] fb depth is 24
[   42.968705] [drm]    pitch is 4096
[   42.968809] fbcon: radeondrmfb (fb0) is primary device
[   42.993990] Console: switching to colour frame buffer device 128x48
[   42.994982] radeon 0000:01:05.0: [drm] fb0: radeondrmfb frame buffer device
[   43.035372] [drm] Initialized radeon 2.50.0 20080528 for 0000:01:05.0 on minor 0
[   43.064106] EXT4-fs (md1): mounting ext2 file system using the ext4 subsystem
[   43.099664] EXT4-fs (md1): mounted filesystem without journal. Quota mode: none.
[   43.197238] EXT4-fs (dm-0): bad geometry: block count 9522326528 exceeds size of device (2856698880 blocks)
[   43.413872] SVM: TSC scaling supported
[   43.413877] kvm: Nested Virtualization enabled
[   43.413879] SVM: kvm: Nested Paging enabled
[   43.413890] SVM: LBR virtualization supported
[   43.429597] MCE: In-kernel MCE decoding enabled.
[   45.391715] [drm] amdgpu kernel modesetting enabled.
[   45.391813] amdgpu: CRAT table not found
[   45.391816] amdgpu: Virtual CRAT table created for CPU
[   45.391832] amdgpu: Topology: Add CPU node
[   45.737903] r8169 0000:04:00.0 enp4s0: Link is Up - 1Gbps/Full - flow control rx/tx
[   45.737917] IPv6: ADDRCONF(NETDEV_CHANGE): enp4s0: link becomes ready
[   46.720565] NFSD: Using UMH upcall client tracking operations.
[   46.720571] NFSD: starting 90-second grace period (net f0000000)
[  136.827116] EXT4-fs (dm-0): bad geometry: block count 9522326528 exceeds size of device (2856698880 blocks)
[135143.417824] md: data-check of RAID array md1
[135143.489597] md: delaying data-check of md126 until md1 has finished (they share one or more physical units)
[135340.210882] md: md1: data-check done.
[135340.217461] md: data-check of RAID array md126
[135761.486563] md: md126: data-check done.
root@bolivar:/#
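The dmesg line "EXT4-fs (dm-0): bad geometry: block count 9522326528 exceeds size of device (2856698880 blocks)" suggests the old ext4 filesystem still believes it has the size it was grown to in step 7, while the LV underneath was shrunk again in step 8. A read-only comparison is a sensible first step before any fsck or further resizing; the sketch below (not from the original post) uses the datavg/datalv names from the question and writes nothing to disk.

sudo lvs --units b datavg                                  # current LV sizes in bytes
sudo dumpe2fs -h /dev/datavg/datalv | egrep 'Block count|Block size'
# If block count * block size is larger than the LV, growing the LV back to at
# least that size (lvextend) is the usual way to make the data reachable again;
# avoid fsck or resize2fs while the two numbers disagree.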
Chris (37 rep)
Nov 8, 2024, 09:27 AM • Last activity: Nov 8, 2024, 07:30 PM
2 votes
2 answers
6221 views
How do I make a spare device active in a degraded mdadm RAID5
A bit of history to start with. I had a 4 disk RAID5 and one disk failed. I removed it from the array and had it in a degraded state for a while: ``` mdadm --manage /dev/md127 --fail /dev/sde1 --remove /dev/sde1 ``` My data requirement suddenly dropped so I decided to permanently reduce the array to...
A bit of history to start with. I had a 4 disk RAID5 and one disk failed. I removed it from the array and had it in a degraded state for a while:
mdadm --manage /dev/md127 --fail /dev/sde1 --remove /dev/sde1
My data requirement suddenly dropped so I decided to permanently reduce the array to 3 disks. I shrank the file system to much less than the new array size then:
mdadm --grow /dev/md127 --array-size 35156183040 # reduces array size
mdadm --grow --raid-devices=3 /dev/md127 --backup-file /store/4TB_WD/md127.backup # reshape array removing 1 disk.
This has now completed:
cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md127 : active raid5 sdd1 sdc1(S) sdb1
      35156183040 blocks super 1.2 level 5, 64k chunk, algorithm 2 [3/2] [_UU]
      bitmap: 103/131 pages [412KB], 65536KB chunk

unused devices:
but has left me with a 3 disk degraded RAID5 with 2 active disks and one spare:
mdadm -D /dev/md127
/dev/md127:
           Version : 1.2
     Creation Time : Fri Sep  9 22:39:53 2022
        Raid Level : raid5
        Array Size : 35156183040 (32.74 TiB 36.00 TB)
     Used Dev Size : 17578091520 (16.37 TiB 18.00 TB)
      Raid Devices : 3
     Total Devices : 3
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Fri Jan 20 11:12:10 2023
             State : active, degraded
    Active Devices : 2
   Working Devices : 3
    Failed Devices : 0
     Spare Devices : 1

            Layout : left-symmetric
        Chunk Size : 64K

Consistency Policy : bitmap

              Name : oldserver-h.oldserver.lan:127
              UUID : 589dd683:d9945b24:768d9b2b:28441f90
            Events : 555962

    Number   Major   Minor   RaidDevice State
       -       0        0        0      removed
       1       8       49        1      active sync   /dev/sdd1
       2       8       17        2      active sync   /dev/sdb1

       3       8       33        -      spare   /dev/sdc1
How do I make this spare disk active so the array can rebuild to a healthy state? cat /sys/block/md127/md/sync_action shows idle, and echoing repair into it does nothing. As a follow-up, where did I go wrong in the first place? [edit] Adding output of lsblk as requested:
lsblk
NAME             MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda                8:0    0   100G  0 disk
├─sda1             8:1    0     1G  0 part  /boot
└─sda2             8:2    0    99G  0 part
  ├─clearos-root 253:0    0  91.1G  0 lvm   /
  └─clearos-swap 253:1    0   7.9G  0 lvm   [SWAP]
sdb                8:16   0  16.4T  0 disk
└─sdb1             8:17   0  16.4T  0 part
  └─md127          9:127  0  32.8T  0 raid5 /store/RAID_A
sdc                8:32   0  16.4T  0 disk
└─sdc1             8:33   0  16.4T  0 part
  └─md127          9:127  0  32.8T  0 raid5 /store/RAID_A
sdd                8:48   0  16.4T  0 disk
└─sdd1             8:49   0  16.4T  0 part
  └─md127          9:127  0  32.8T  0 raid5 /store/RAID_A
sde                8:64   0   3.7T  0 disk
└─sde1             8:65   0   3.7T  0 part  /store/4TB_WD
sdf                8:80   0 931.5G  0 disk
└─sdf1             8:81   0 931.5G  0 part  /store/1TB1
sdg                8:96   0 931.5G  0 disk
└─sdg1             8:97   0 931.5G  0 part  /store/1TB2
sr0               11:0    1   1.2G  0 rom
[/edit]
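One low-risk thing that sometimes kicks a reluctant spare into rebuilding (a sketch, not a guaranteed fix, using the device names from the question) is to remove it from the array and add it back, then watch whether a recovery starts; checking sync_action for "frozen" is also worthwhile, since a frozen array sits idle exactly like this.

cat /sys/block/md127/md/sync_action        # "frozen" here would explain the idle state
sudo mdadm /dev/md127 --remove /dev/sdc1
sudo mdadm /dev/md127 --add /dev/sdc1
cat /proc/mdstat                           # look for a "recovery =" progress line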
NickH (23 rep)
Jan 20, 2023, 11:33 AM • Last activity: Nov 4, 2024, 06:44 AM
0 votes
2 answers
55 views
Reassembling RAID 5 when 1 disk has been overwritten
I was having trouble with my Ubuntu system that was running a RAID 5 setup with 4x 4TB drives: /dev/sda[1234]. The OS was installed on /dev/sde. So I decided it was time to do a clean install of the OS.. And here is where my stupidity came in.. I accidentally installed Ubuntu on /dev/sda. I've now re-installed to /dev/sde, but I'm having trouble re-assembling the raid.

I had backed up my /etc folder but, again stupidity, I backed it up to my raid, thinking I wouldn't have any issues reassembling it. I know the data on /dev/sda is lost.. I'm just wondering how much I can recover with the other 3 drives.

In general, I get the following error on most attempts. If I change the order of the disks, then I get this error for the first drive in the list
mdadm: no recogniseable superblock on /dev/sdb1
mdadm: /dev/sdb1 has no superblock - assembly aborted
I've tried to assemble a number of different ways but always hitting an issue
sudo mdadm --assemble --scan
sudo mdadm --assemble --force /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
I've thought that maybe I could just assemble with the 3 good disks and then add /dev/sda after, but I get that same error
sudo mdadm --assemble --force /dev/md0 /dev/sdb1 /dev/sdc1 /dev/sdd1
I even thought maybe I could create a new raid 5 with /dev/sda and then add the others in but that didn't seem to work either
sudo mdadm --create /dev/md0 --level=5 --raid-devices=4 /dev/sda1 missing missing missing
sudo mdadm --manage /dev/md0 --re-add /dev/sdb1
sudo mdadm --manage /dev/md0 --add /dev/sdb1
Output of fdisk -l
Disk /dev/sda: 3.64 TiB, 4000787030016 bytes, 7814037168 sectors
Disk model: ST4000DM000-1F21
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: EEFB7529-8BC9-2C46-B529-DFEC565586AE

Device     Start        End    Sectors  Size Type
/dev/sda1   2048 7814037134 7814035087  3.6T Linux filesystem


Disk /dev/sdb: 3.64 TiB, 4000787030016 bytes, 7814037168 sectors
Disk model: ST4000DM000-1F21
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 6559C610-734A-4978-B52D-FB8561D8CB6E

Device     Start        End    Sectors  Size Type
/dev/sdb1   2048 7814035455 7814033408  3.6T Linux RAID


Disk /dev/sdc: 3.64 TiB, 4000787030016 bytes, 7814037168 sectors
Disk model: ST4000DM000-1F21
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 0F3B652A-6AB8-4A98-B91B-F50C1373347A

Device     Start        End    Sectors  Size Type
/dev/sdc1   2048 7814035455 7814033408  3.6T Linux RAID


Disk /dev/sdd: 3.64 TiB, 4000787030016 bytes, 7814037168 sectors
Disk model: ST4000DM000-1F21
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: E50D44A3-67C3-43AD-AD96-D11A66A0DCC8

Device     Start        End    Sectors  Size Type
/dev/sdd1   2048 7814035455 7814033408  3.6T Linux RAID


Disk /dev/sde: 111.79 GiB, 120034123776 bytes, 234441648 sectors
Disk model: OCZ-SOLID3
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 6D5DF417-2864-48FC-83DC-A1E42EF3DE9C

Device       Start       End   Sectors   Size Type
/dev/sde1     2048   2203647   2201600     1G EFI System
/dev/sde2  2203648 234438655 232235008 110.7G Linux filesystem
Output from mdadm --examine /dev/sd*
/dev/sda:
   MBR Magic : aa55
Partition :   4294967295 sectors at            1 (type ee)
/dev/sda1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 1d5b33b8:7c4c31ce:c694e0b0:d0335928
           Name : mediaserver:0  (local to host mediaserver)
  Creation Time : Wed Sep 25 21:14:48 2024
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 7813770895 sectors (3.64 TiB 4.00 TB)
     Array Size : 11720655360 KiB (10.92 TiB 12.00 TB)
  Used Dev Size : 7813770240 sectors (3.64 TiB 4.00 TB)
    Data Offset : 264192 sectors
   Super Offset : 8 sectors
   Unused Space : before=264112 sectors, after=655 sectors
          State : clean
    Device UUID : 15aba39b:06fdb0ba:3b3fe20f:d646646e

Internal Bitmap : 8 sectors from superblock
    Update Time : Wed Sep 25 21:14:48 2024
  Bad Block Log : 512 entries available at offset 24 sectors
       Checksum : 650e8daa - correct
         Events : 0

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 0
   Array State : A... ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdb:
   MBR Magic : aa55
Partition :   4294967295 sectors at            1 (type ee)
mdadm: No md superblock detected on /dev/sdb1.
/dev/sdc:
   MBR Magic : aa55
Partition :   4294967295 sectors at            1 (type ee)
mdadm: No md superblock detected on /dev/sdc1.
/dev/sdd:
   MBR Magic : aa55
Partition :   4294967295 sectors at            1 (type ee)
mdadm: No md superblock detected on /dev/sdd1.
/dev/sde:
   MBR Magic : aa55
Partition :    234441647 sectors at            1 (type ee)
/dev/sde1:
   MBR Magic : aa55
mdadm: No md superblock detected on /dev/sde2.
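Before any further mdadm --create experiments, it is usually worth putting copy-on-write overlays over the three surviving members so that nothing else gets written to them; recovery attempts then target the overlay devices instead of the real partitions. The sketch below is illustrative only (overlay file sizes and paths are arbitrary), not something from the original post.

# Requires the array to be stopped first: sudo mdadm --stop /dev/md0
for d in sdb1 sdc1 sdd1; do
    truncate -s 10G /tmp/overlay-$d.img                  # sparse file, grows as writes land
    loop=$(sudo losetup -f --show /tmp/overlay-$d.img)
    sz=$(sudo blockdev --getsz /dev/$d)
    echo "0 $sz snapshot /dev/$d $loop P 8" | sudo dmsetup create overlay-$d
done
ls /dev/mapper/overlay-*                                 # use these for any create/assemble tests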
ikbenben (1 rep)
Sep 26, 2024, 11:04 AM • Last activity: Sep 28, 2024, 09:34 AM
2 votes
2 answers
284 views
Does RAID always read/write an entire chunk?
Many resources on the internet contain conflicting information regarding read/write logic for RAID chunks. [This](https://unix.stackexchange.com/questions/118302/understanding-the-chunk-size-in-context-of-raid) answer contains the following (seemingly conflicting) pieces of information:

> A 512 KB chunk size doesn't require system to write e.g. 512 KB for every 4 KB write or to read 512 KB of device surface for a 4 KB application read.

> [When reading a 16-KiB block from RAID with a 64-KiB chunk size] the RAID will perform a read/modify/write operation when writing that 4-KiB file/16-KiB block because the RAID's smallest unit of storage is 64-KiB.

On the other hand, [this](https://larryjordan.com/articles/explaining-raid-chunk-size-and-which-to-pick-for-media/) resource contains the following pieces of information:

> For example, if you have a 10 KB text file and the chunk size is 256 KB, then that 10 KB of data is stored in a 256 KB block, with the rest of the block left empty. Conversely, with 16 KB chunks, there is much less wasted space when storing that 10 KB file.

In particular, I have the following questions:

1. When reading/writing some unit of data smaller than the RAID chunk size using a scheme _without_ parity, does this require a read/modify/write operation for the entire chunk, or only the part of the chunk that is modified?
2. When using a RAID scheme _with_ parity, does this change anything in the answer to question 1?
3. As alluded to in the second reference, does writing a unit of data smaller than the RAID chunk somehow leave the rest of the RAID chunk empty? This seems incorrect to me, but I wanted to clarify as this resource quite unambiguously states this.
4. Do any of these answers change depending on the RAID implementation (Linux kernel, hardware RAID, etc.)?

If possible, providing some sort of authoritative reference (some RAID specification, source code, etc.) would be awesome. Thanks in advance!
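A small worked example may help make question 1 concrete; it only illustrates the address arithmetic for an assumed geometry (4-disk RAID5, 64 KiB chunks, so 3 data chunks per stripe) and is not a statement about any particular implementation's I/O path.

# Where does a 4 KiB write at logical offset 5 MiB land?
offset=$((5 * 1024 * 1024))
chunk_size=$((64 * 1024))
data_disks=3
chunk=$(( offset / chunk_size ))        # logical chunk number        -> 80
stripe=$(( chunk / data_disks ))        # stripe number               -> 26
slot=$(( chunk % data_disks ))          # data slot within the stripe -> 2
echo "chunk=$chunk stripe=$stripe slot=$slot"
# The write touches 4 KiB inside exactly one chunk on one member; with parity
# RAID the stripe's parity also has to be updated, but a chunk is a striping
# unit, not an allocation unit, so nothing is padded out to 64 KiB.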
quixotrykd (359 rep)
Jul 8, 2024, 12:49 AM • Last activity: Sep 24, 2024, 08:43 AM
1 vote
0 answers
45 views
Failed Raid5 array with 5 drives - 2 drives removed
# Synopsis

Healthy RAID-5 array had 1 drive removed and quickly reinserted, and it started rebuilding. Then a second drive was removed within 10 minutes. Original drive assignments (sda, sdb etc.) have changed due to further user errors (rebooting/swapping drives). Need advice on next steps.

# Backstory

I am sorry this is so long, but here is the backstory if it helps.

My name is Mike. I am not a daily user of Linux, but I can work my way through things that I need to get done, usually by doing quick searches to remind me of syntax and reading man pages. I thought I could figure this out with time (it's been months), and I now realize this is something I am not comfortable doing without help, since the data is invaluable to my friend's family. He has no other backups of the data since his backup drive also failed and he did not realize it… He just assumed it was working.

To start, this is a QNAP appliance that had a 5 drive RAID 5 array using 8TB drives. He logged in and noticed that a drive was marked unhealthy due to bad blocks, but it was still a member and the array was still working just fine, so he wanted to replace it with a new drive before it got worse. Unfortunately, he pulled out the wrong drive. He quickly realized it was the wrong drive and put it back in, and it started rebuilding on that drive (I saw that in the QNAP logs). Without knowing any better, he pulled out the actual drive he wanted to replace within less than 10 minutes and put in a new drive. He noticed the array was offline and his data was inaccessible, so he put the original drive back in and rebooted the QNAP hoping that would fix it. Obviously, it didn't.

He then called, and I said we do not want to do anything until we back up the data that's on all of the original drives. He just so happened to have a few 12/18TB external drives that I used dd to clone the md /sdX3 partitions to (not all partitions - /sdX). Exact commands I used, plus a note as to which external drive they are on:

dd if=/dev/sda3 of=/share/external/DEV3302_1/2024022_170502-sda3.img (DST:18TB-1)
dd if=/dev/sdf3 of=/share/DiskImages/2024022_164848-sdf3.img (DST:18TB-2)
dd if=/dev/sdb3 of=/share/external/DEV3302_1/2024022_170502-sdb3.img (DST:18TB-1)
dd if=/dev/sdg3 of=/share/external/DEV3305_1/2024022_170502-sdg3.img (DST:12TB)
dd if=/dev/sdd3 of=/share/DiskImages/2024022_170502-sdd3_Spare.img (DST:18TB-2)

These were just quick backups, and due to the age of the drives (5+ years) we figured we would also replace all of the NAS drives with new ones. I then repeated this process with each of the drives, one by one, except I used a process/commands like this: insert a new drive in an empty slot (it got assigned sdh), then

dd if=/dev/sda of=/dev/sdh

Wait 14 hours for it to complete, remove the drive, replace it with another new drive and repeat:

dd if=/dev/sdb of=/dev/sdh

Etc… So, we should have exact copies of the drives. I assumed (I think incorrectly) that we could power off the QNAP, swap the old drives out with the copied drives, and then we could start trying commands like

mdadm -CfR /dev/md1 --assume-clean -l 5 -n 5 -c 512 -e 1.0 /dev/sda3 /dev/sdb3 /dev/sdg3 missing /dev/sdd3

(I am not certain that command is correct even before the next paragraph.) Unfortunately, after swapping the drives we now have two missing drives instead of 1, and the assignments seem to have changed (ex: sda is no longer sda). I figured I must have messed up a dd copy of a drive, so we were going to start the process over on the missing drive.
I tracked which ones were showing/missing, we reinserted the original disks, however now they again have different assignments but it is back to showing only a single missing drive - I am lost. I might be able to figure out the original order by comparing the drive UUID's? But I do not want to touch anything before asking for advice. # Technical description Here is the output of the recommended commands that were supported on the QNAP. [QNAPUser@QNAP ~]$ uname -a Linux QNAP 5.10.60-qnap #1 SMP Mon Feb 19 12:14:12 CST 2024 x86_64 GNU/Linux [QNAPUser@QNAP ~]$ mdadm --version mdadm - v3.3.4 - 3rd August 2015 [QNAPUser@QNAP ~]$ smartctl --xall /dev/sda -sh: smartctl: command not found [QNAPUser@QNAP ~]$ sudo mdadm --examine /dev/sdb /dev/sdb: MBR Magic : aa55 Partition : 4294967295 sectors at 1 (type ee) [QNAPUser@QNAP ~]$ sudo mdadm --examine /dev/sdc /dev/sdc: MBR Magic : aa55 Partition : 4294967295 sectors at 1 (type ee) [QNAPUser@QNAP ~]$ sudo mdadm --examine /dev/sdd /dev/sdd: MBR Magic : aa55 Partition : 4294967295 sectors at 1 (type ee) [QNAPUser@QNAP ~]$ sudo mdadm --examine /dev/sde /dev/sde: MBR Magic : aa55 Partition : 4294967295 sectors at 1 (type ee) [QNAPUser@QNAP ~]$ sudo mdadm --examine /dev/sdf /dev/sdf: MBR Magic : aa55 Partition : 4294967295 sectors at 1 (type ee) [QNAPUser@QNAP ~]$ sudo mdadm --examine /dev/sdg /dev/sdg: MBR Magic : aa55 Partition : 4294967295 sectors at 1 (type ee) [QNAPUser@QNAP ~]$ sudo mdadm --examine /dev/sdh [QNAPUser@QNAP ~]$ sudo mdadm --examine /dev/sdb3 /dev/sdb3: Magic : a92b4efc Version : 1.0 Feature Map : 0x0 Array UUID : 29f7c4cf:b6273e81:34f3f156:1cd1cfe2 Name : 1 Creation Time : Thu Aug 17 13:28:50 2017 Raid Level : raid5 Raid Devices : 5 Avail Dev Size : 15608143240 (7442.54 GiB 7991.37 GB) Array Size : 31216285696 (29770.17 GiB 31965.48 GB) Used Dev Size : 15608142848 (7442.54 GiB 7991.37 GB) Super Offset : 15608143504 sectors Unused Space : before=0 sectors, after=648 sectors State : clean Device UUID : f49eadd1:661a76d3:6ed998ad:3a39f4a9 Update Time : Thu Feb 29 17:05:02 2024 Bad Block Log : 512 entries available at offset -8 sectors Checksum : d61a661f - correct Events : 89359 Layout : left-symmetric Chunk Size : 512K Device Role : Active device 0 Array State : AAAA. ('A' == active, '.' == missing, 'R' == replacing) [QNAPUser@QNAP ~]$ sudo mdadm --examine /dev/sdc3 /dev/sdc3: Magic : a92b4efc Version : 1.0 Feature Map : 0x0 Array UUID : 29f7c4cf:b6273e81:34f3f156:1cd1cfe2 Name : 1 Creation Time : Thu Aug 17 13:28:50 2017 Raid Level : raid5 Raid Devices : 5 Avail Dev Size : 15608143240 (7442.54 GiB 7991.37 GB) Array Size : 31216285696 (29770.17 GiB 31965.48 GB) Used Dev Size : 15608142848 (7442.54 GiB 7991.37 GB) Super Offset : 15608143504 sectors Unused Space : before=0 sectors, after=648 sectors State : clean Device UUID : b50fdcc1:3024551b:e56c1e38:8f9bc7f8 Update Time : Thu Feb 29 17:05:02 2024 Bad Block Log : 512 entries available at offset -8 sectors Checksum : e780d676 - correct Events : 89359 Layout : left-symmetric Chunk Size : 512K Device Role : Active device 1 Array State : AAAA. ('A' == active, '.' 
== missing, 'R' == replacing) [QNAPUser@QNAP ~]$ sudo mdadm --examine /dev/sde3 /dev/sde3: Magic : a92b4efc Version : 1.0 Feature Map : 0x0 Array UUID : 29f7c4cf:b6273e81:34f3f156:1cd1cfe2 Name : 1 Creation Time : Thu Aug 17 13:28:50 2017 Raid Level : raid5 Raid Devices : 5 Avail Dev Size : 15608143240 (7442.54 GiB 7991.37 GB) Array Size : 31216285696 (29770.17 GiB 31965.48 GB) Used Dev Size : 15608142848 (7442.54 GiB 7991.37 GB) Super Offset : 15608143504 sectors Unused Space : before=0 sectors, after=648 sectors State : clean Device UUID : ae2c3578:723041ba:f06efdb1:7df6cbb2 Update Time : Thu Feb 29 17:05:02 2024 Bad Block Log : 512 entries available at offset -8 sectors Checksum : 70a95caf - correct Events : 89359 Layout : left-symmetric Chunk Size : 512K Device Role : spare Array State : AAAA. ('A' == active, '.' == missing, 'R' == replacing) [QNAPUser@QNAP ~]$ sudo mdadm --examine /dev/sdg3 /dev/sdg3: Magic : a92b4efc Version : 1.0 Feature Map : 0x0 Array UUID : 29f7c4cf:b6273e81:34f3f156:1cd1cfe2 Name : 1 Creation Time : Thu Aug 17 13:28:50 2017 Raid Level : raid5 Raid Devices : 5 Avail Dev Size : 15608143240 (7442.54 GiB 7991.37 GB) Array Size : 31216285696 (29770.17 GiB 31965.48 GB) Used Dev Size : 15608142848 (7442.54 GiB 7991.37 GB) Super Offset : 15608143504 sectors Unused Space : before=0 sectors, after=648 sectors State : clean Device UUID : cf03e7e1:2ad22385:41793b2c:4f93666c Update Time : Thu Feb 29 16:38:38 2024 Bad Block Log : 512 entries available at offset -8 sectors Checksum : da1a5378 - correct Events : 80401 Layout : left-symmetric Chunk Size : 512K Device Role : Active device 3 Array State : AAAAA ('A' == active, '.' == missing, 'R' == replacing) [QNAPUser@QNAP ~]$ sudo mdadm --examine /dev/sdh3 /dev/sdh3: Magic : a92b4efc Version : 1.0 Feature Map : 0x0 Array UUID : 29f7c4cf:b6273e81:34f3f156:1cd1cfe2 Name : 1 Creation Time : Thu Aug 17 13:28:50 2017 Raid Level : raid5 Raid Devices : 5 Avail Dev Size : 15608143240 (7442.54 GiB 7991.37 GB) Array Size : 31216285696 (29770.17 GiB 31965.48 GB) Used Dev Size : 15608142848 (7442.54 GiB 7991.37 GB) Super Offset : 15608143504 sectors Unused Space : before=0 sectors, after=648 sectors State : clean Device UUID : a06d8a8d:965b58fe:360c43cd:e252a328 Update Time : Thu Feb 29 17:05:02 2024 Bad Block Log : 512 entries available at offset -8 sectors Checksum : 5b32c26d - correct Events : 89359 Layout : left-symmetric Chunk Size : 512K Device Role : Active device 2 Array State : AAAA. ('A' == active, '.' 
== missing, 'R' == replacing) [QNAPUser@QNAP ~]$ sudo mdadm --detail /dev/md1 (This is the array that is broken)) mdadm: cannot open /dev/md1: No such file or directory [QNAPUser@QNAP ~]$ git clone git://github.com/pturmel/lsdrv.git lsdrv -sh: git: command not found [QNAPUser@QNAP ~]$ cat /proc/mdstat Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] md3 : active raid1 sdd3 17568371520 blocks super 1.0 [1/1] [U] md2 : active raid1 sdf3 7804071616 blocks super 1.0 [1/1] [U] md322 : active raid1 sdd5(S) sdf5(S) sde5(S) sdg5(S) sdh5(S) sdb5 sdc5 6702656 blocks super 1.0 [2/2] [UU] bitmap: 0/1 pages [0KB], 65536KB chunk md256 : active raid1 sdd2(S) sdf2(S) sde2(S) sdg2(S) sdh2(S) sdb2 sdc2 530112 blocks super 1.0 [2/2] [UU] bitmap: 0/1 pages [0KB], 65536KB chunk md13 : active raid1 sde4 sdg4 sdh4 sdb4 sdc4 sdf4 458880 blocks super 1.0 [24/6] [_UUUUUU_________________] bitmap: 1/1 pages [4KB], 65536KB chunk md9 : active raid1 sde1 sdg1 sdh1 sdb1 sdc1 sdf1 530048 blocks super 1.0 [24/6] [_UUUUUU_________________] bitmap: 1/1 pages [4KB], 65536KB chunk unused devices: [QNAPUser@QNAP ~]$ sudo md_checker Welcome to MD superblock checker (v2.0) - have a nice day~ Scanning system... RAID metadata found! UUID: 29f7c4cf:b6273e81:34f3f156:1cd1cfe2 Level: raid5 Devices: 5 Name: md1 Chunk Size: 512K md Version: 1.0 Creation Time: Aug 17 13:28:50 2017 Status: OFFLINE =============================================================================================== Enclosure | Port | Block Dev Name | # | Status | Last Update Time | Events | Array State =============================================================================================== NAS_HOST 8 /dev/sdb3 0 Active Feb 29 17:05:02 2024 89359 AAAA. NAS_HOST 7 /dev/sdc3 1 Active Feb 29 17:05:02 2024 89359 AAAA. NAS_HOST 9 /dev/sdh3 2 Active Feb 29 17:05:02 2024 89359 AAAA. NAS_HOST 10 /dev/sdg3 3 Active Feb 29 16:38:38 2024 80401 AAAAA ---------------------------------- 4 Missing ------------------------------------------- =============================================================================================== md_checker is a QNAP command, so you might not be familiar with it, but the output should be useful. Based on the output above (**specifically the Last Update Time and Events**), I believe that sdg3 was the first drive to be temporarily pulled from the array and was in the process of rebuilding when the second drive was pulled (now showing as "4 Missing"?) . I believe the second drive is now assigned to sde which is showing Device Role : spare. I am basing this on the the fact that the number of events and Last update time of sdb3, sdc3, sdh3 and sde3 are identical. My goal is to do a recovery using copies of the drives, not the original drives in case something happens to make the issue worse. We do not need the array to be "healthy" or writable since we just need to make a copy/backup of the data. What would be the best way to accomplish this? How can I be certain of the command and order to reassemble the array, and what is the least destructive way to assemble it? I would greatly appreciate any advice I can get, since I am just starting to confuse myself and possibly making the issue worse.
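Since dd images of the member partitions already exist, one way to experiment without touching either the original drives or the cloned drives (a sketch added for reference, reusing the image paths quoted earlier; loop device numbers will differ) is to attach the images read-only on a separate Linux machine and let mdadm examine them and attempt a degraded assemble there.

# Attach each image read-only; losetup prints the loop device it picked:
sudo losetup -r -f --show /share/external/DEV3302_1/2024022_170502-sda3.img
sudo losetup -r -f --show /share/external/DEV3302_1/2024022_170502-sdb3.img
sudo losetup -r -f --show /share/DiskImages/2024022_164848-sdf3.img
sudo losetup -r -f --show /share/external/DEV3305_1/2024022_170502-sdg3.img
sudo losetup -r -f --show /share/DiskImages/2024022_170502-sdd3_Spare.img
# Check roles/event counts, then try a read-only assemble:
sudo mdadm --examine /dev/loop[0-4]
sudo mdadm --assemble --force --readonly /dev/md1 /dev/loop[0-4]
# If --force insists on rewriting superblocks it will refuse on read-only loops;
# in that case work on writable overlays or on yet another copy of the images.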
MikeD (11 rep)
Sep 3, 2024, 10:19 AM • Last activity: Sep 3, 2024, 10:21 AM
2 votes
1 answers
46 views
"mdadm --grow" is stuck with faulty devices
I am not the first one with a stuck `mdadm --grow` but I think that mine is a bit different from the others. In my case all devices are *faulty* and the state says *FAILED*: ``` root@linux:~# mdadm --detail /dev/md126 /dev/md126: Version : 1.2 Creation Time : Tue Jan 26 11:57:52 2021 Raid Level : ra...
I am not the first one with a stuck mdadm --grow but I think that mine is a bit different from the others. In my case all devices are *faulty* and the state says *FAILED*:
root@linux:~# mdadm --detail /dev/md126 
/dev/md126:
           Version : 1.2
     Creation Time : Tue Jan 26 11:57:52 2021
        Raid Level : raid5
        Array Size : 1953258496 (1862.77 GiB 2000.14 GB)
     Used Dev Size : 976629248 (931.39 GiB 1000.07 GB)
      Raid Devices : 4
     Total Devices : 4
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Mon Aug 19 14:56:31 2024
             State : active, FAILED, reshaping 
    Active Devices : 0
    Failed Devices : 4
     Spare Devices : 0

            Layout : left-symmetric
        Chunk Size : 512K

Consistency Policy : bitmap

    Reshape Status : 38% complete

    Number   Major   Minor   RaidDevice State
       0       8       17        0      faulty   /dev/sdb1
       1       8       65        1      faulty   /dev/sde1
       3       8       49        2      faulty   /dev/sdd1
       4       8       33        3      faulty   /dev/sdc1
I created the RAID5 with three 1TB SSDs and used it completely for LVM. Yesterday I added a fourth 1TB SSD and ran the following commands:

mdadm --add /dev/md126 /dev/sdc1
mdadm --grow /dev/md126 --raid-devices=4

At first there was no problem. The RAID5 was still active and slowly accessible. About 4 hours later something must have happened. This morning I checked and the mdadm status had not changed, but I lost my RAID5 in LVM: the LVs are still mounted but somehow crippled. With dmesg I get errors like:
[81591.695415] EXT4-fs (dm-5): I/O error while writing superblock
[81591.710467] EXT4-fs error (device dm-5): __ext4_get_inode_loc_noinmem:4617: inode #524289: block 2097184: comm ls: unable to read itable block
[81591.710488] Buffer I/O error on dev dm-5, logical block 0, lost sync page write
[81591.710495] EXT4-fs (dm-5): I/O error while writing superblock
[82806.711267] sd 0:0:0:0: [sdb] tag#8 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK cmd_age=0s
[82806.711279] sd 0:0:0:0: [sdb] tag#8 CDB: ATA command pass through(16) 85 06 2c 00 00 00 00 00 00 00 00 00 00 00 e5 00
[82806.711333] sd 5:0:0:0: [sdc] tag#9 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK cmd_age=0s
[82806.711339] sd 5:0:0:0: [sdc] tag#9 CDB: ATA command pass through(16) 85 06 2c 00 00 00 00 00 00 00 00 00 00 00 e5 00
[82806.711382] sd 4:0:0:0: [sdd] tag#11 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK cmd_age=0s
[82806.711388] sd 4:0:0:0: [sdd] tag#11 CDB: ATA command pass through(16) 85 06 2c 00 00 00 00 00 00 00 00 00 00 00 e5 00
[82806.711431] sd 1:0:0:0: [sde] tag#21 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK cmd_age=0s
[82806.711436] sd 1:0:0:0: [sde] tag#21 CDB: ATA command pass through(16) 85 06 2c 00 00 00 00 00 00 00 00 00 00 00 e5 00
The mdadm --examine --scan command gives no output at all; it just stops. The current content of my mdadm.conf is:
# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

# instruct the monitoring daemon where to send mail alerts
MAILADDR root

# definitions of existing MD arrays
ARRAY /dev/md126 level=raid5 num-devices=3 metadata=1.2 name=horus:0 UUID=b187df52:41d7a47e:98e7fa00:cae9bf67
   devices=/dev/sda1,/dev/sdb1,/dev/sdc1
Looks like the *md superblocks* are gone:
root@horus:~# mdadm --examine /dev/sd[abcde]1
/dev/sda1:
   MBR Magic : aa55
Partition :    234441647 sectors at            1 (type ee)
mdadm: No md superblock detected on /dev/sdb1.
mdadm: No md superblock detected on /dev/sdc1.
mdadm: No md superblock detected on /dev/sdd1.
mdadm: No md superblock detected on /dev/sde1.
Looking at it now, I notice that the devices listed there are not correct; the device names must have changed since 2021. There has always been something strange about my RAID5: it changed from /dev/md0 to /dev/md127 and back once in a while, and now, after adding the 4th disk, it is /dev/md126. The issue started with:
Aug 19 10:18:20 horus kernel: [ 1787.252546] md: reshape of RAID array md126
Aug 19 14:53:31 horus kernel: [18298.539238] ahci 0000:01:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000d address=0xdfd6c000 flags=0x0000]
Aug 19 14:53:31 horus kernel: [18298.835872] ata6.00: exception Emask 0x10 SAct 0xe0000 SErr 0x0 action 0x6 frozen
Aug 19 14:53:31 horus kernel: [18298.835898] ata6.00: irq_stat 0x08000000, interface fatal error
Aug 19 14:53:31 horus kernel: [18298.835914] ata6.00: failed command: WRITE FPDMA QUEUED
Aug 19 14:53:31 horus kernel: [18298.835925] ata6.00: cmd 61/18:88:68:86:ee/00:00:2c:00:00/40 tag 17 ncq dma 12288 out
Aug 19 14:53:31 horus kernel: [18298.835925]          res 40/00:98:80:86:ee/00:00:2c:00:00/40 Emask 0x10 (ATA bus error)
Aug 19 14:53:31 horus kernel: [18298.835962] ata6.00: status: { DRDY }
Aug 19 14:53:31 horus kernel: [18298.835973] ata6.00: failed command: WRITE FPDMA QUEUED
Aug 19 14:53:31 horus kernel: [18298.835985] ata6.00: cmd 61/18:90:80:8a:ee/00:00:2c:00:00/40 tag 18 ncq dma 12288 out
Aug 19 14:53:31 horus kernel: [18298.835985]          res 40/00:98:80:86:ee/00:00:2c:00:00/40 Emask 0x10 (ATA bus error)
Aug 19 14:53:31 horus kernel: [18298.836022] ata6.00: status: { DRDY }
Aug 19 14:53:31 horus kernel: [18298.836034] ata6.00: failed command: WRITE FPDMA QUEUED
Aug 19 14:53:31 horus kernel: [18298.836045] ata6.00: cmd 61/18:98:80:86:ee/00:00:2c:00:00/40 tag 19 ncq dma 12288 out
Aug 19 14:53:31 horus kernel: [18298.836045]          res 40/00:98:80:86:ee/00:00:2c:00:00/40 Emask 0x10 (ATA bus error)
Aug 19 14:53:31 horus kernel: [18298.836082] ata6.00: status: { DRDY }
Aug 19 14:53:31 horus kernel: [18298.836096] ata6: hard resetting link
It looks like the most recently added disk died during the *grow*. Is there a solution for this? Should I stop mdadm and restart the grow? Something else?
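For reference, the non-destructive checks I plan to run first after a reboot or reseat (a sketch only; it assumes smartmontools is installed and that the members are still sd[a-e]):

# do the drives answer at all, and does the md metadata come back,
# possibly under different device names than in mdadm.conf?
lsblk -o NAME,SIZE,TYPE,FSTYPE
for d in /dev/sd[a-e]; do
    echo "== $d =="
    smartctl -H "$d"            # basic health/identify check
    mdadm --examine "${d}1"     # look for an md superblock on the first partition
done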
Marco (915 rep)
Aug 20, 2024, 07:38 AM • Last activity: Aug 20, 2024, 03:23 PM
0 votes
2 answers
1433 views
What is the difference between BTRFS RAID1 and BTRFS RAID5 on 3+ devices?
According to [_"Examining btrfs, Linux’s perpetually half-finished filesystem"_](https://arstechnica.com/gadgets/2021/09/examining-btrfs-linuxs-perpetually-half-finished-filesystem/), BTRFS RAID1 is said to be _"guaranteed redundancy—copies of all blocks will be **saved on two separate devices**"_....
According to [_"Examining btrfs, Linux’s perpetually half-finished filesystem"_](https://arstechnica.com/gadgets/2021/09/examining-btrfs-linuxs-perpetually-half-finished-filesystem/), BTRFS RAID1 is said to be _"guaranteed redundancy—copies of all blocks will be **saved on two separate devices**"_. It goes on to say that with BTRFS on both RAID1 and RAID5 you can have devices of different sizes. You can also have more than 3 devices with both. Assuming you have three disks, what is the difference in btrfs between RAID1 and RAID5? They both protect against failure of one drive in the array.
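One way to see the difference concretely is a small experiment with loop devices as stand-in disks (a sketch; the only thing that changes between the two runs is the data profile passed to mkfs.btrfs):

# three 1 GiB scratch files as stand-in disks
for i in 1 2 3; do truncate -s 1G "/tmp/btrfs$i.img"; done
L1=$(sudo losetup -f --show /tmp/btrfs1.img)
L2=$(sudo losetup -f --show /tmp/btrfs2.img)
L3=$(sudo losetup -f --show /tmp/btrfs3.img)

# raid1: every block lives on exactly two of the devices -> roughly 1.5 GiB usable here
sudo mkfs.btrfs -f -d raid1 -m raid1 "$L1" "$L2" "$L3"
# raid5: data striped across all three with one parity block per stripe -> roughly 2 GiB usable
# sudo mkfs.btrfs -f -d raid5 -m raid1 "$L1" "$L2" "$L3"

sudo btrfs device scan
sudo mount "$L1" /mnt
sudo btrfs filesystem usage /mnt    # compare the reported usable space per profile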
Evan Carroll (34663 rep)
Jul 31, 2023, 04:37 AM • Last activity: Jul 24, 2024, 10:12 PM
0 votes
1 answers
132 views
mdadm RAID5 array became "active, FAILED" and now no longer mounts
Today my RAID5 array status from `mdadm --detail /dev/md0` became `active, FAILED` (I have mobile notifications setup) Most of the files were present but some were missing. This happened before and I solved it by rebooting the machine (I think the status was different tho) No luck this time, now the...
Today my RAID5 array status from mdadm --detail /dev/md0 became active, FAILED (I have mobile notifications set up). Most of the files were present but some were missing. This has happened before and I solved it by rebooting the machine (I think the status was different then, though). No luck this time; now the RAID no longer works. Here are some details about my setup:
mdadm --detail /dev/md0

/dev/md0:
           Version : 1.2
        Raid Level : raid5
     Total Devices : 4
       Persistence : Superblock is persistent

             State : inactive
   Working Devices : 4

              Name : RHomeServer:0  (local to host RHomeServer)
              UUID : d88446ae:f9f5c759:b4ac531c:933f8c62
            Events : 104300

    Number   Major   Minor   RaidDevice

       -       8       64        -        /dev/sde
       -       8       32        -        /dev/sdc
       -       8       48        -        /dev/sdd
       -       8       16        -        /dev/sdb
cat /etc/mdadm/mdadm.conf

ARRAY /dev/md0 metadata=1.2 name=RHomeServer:0 UUID=d88446ae:f9f5c759:b4ac531c:933f8c62
sudo blkid | grep sd

/dev/sdd: UUID="d88446ae-f9f5-c759-b4ac-531c933f8c62" UUID_SUB="ce9975e7-ec99-8630-acad-b3d090287950" LABEL="RHomeServer:0" TYPE="linux_raid_member"
/dev/sdb: UUID="d88446ae-f9f5-c759-b4ac-531c933f8c62" UUID_SUB="f20836f9-8960-0b38-e1c3-ea22cba58014" LABEL="RHomeServer:0" TYPE="linux_raid_member"
/dev/sde: UUID="d88446ae-f9f5-c759-b4ac-531c933f8c62" UUID_SUB="d49ff5be-6b1b-8e8d-26b1-52e90bb05ce2" LABEL="RHomeServer:0" TYPE="linux_raid_member"
/dev/sdc: UUID="d88446ae-f9f5-c759-b4ac-531c933f8c62" UUID_SUB="863075ad-9e12-6cfd-3dbf-35017b1f408d" LABEL="RHomeServer:0" TYPE="linux_raid_member"
mdadm --examine /dev/sd[b-e]

/dev/sdb:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : d88446ae:f9f5c759:b4ac531c:933f8c62
           Name : RHomeServer:0  (local to host RHomeServer)
  Creation Time : Thu Aug 10 16:54:57 2023
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 35156391936 sectors (16.37 TiB 18.00 TB)
     Array Size : 52734587904 KiB (49.11 TiB 54.00 TB)
    Data Offset : 264192 sectors
   Super Offset : 8 sectors
   Unused Space : before=264096 sectors, after=0 sectors
          State : clean
    Device UUID : f20836f9:89600b38:e1c3ea22:cba58014

Internal Bitmap : 8 sectors from superblock
    Update Time : Tue Jul 16 18:13:26 2024
  Bad Block Log : 512 entries available at offset 80 sectors
       Checksum : dd7e994d - correct
         Events : 104300

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : spare
   Array State : .... ('A' == active, '.' == missing, 'R' == replacing)

/dev/sdc:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : d88446ae:f9f5c759:b4ac531c:933f8c62
           Name : RHomeServer:0  (local to host RHomeServer)
  Creation Time : Thu Aug 10 16:54:57 2023
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 35156391936 sectors (16.37 TiB 18.00 TB)
     Array Size : 52734587904 KiB (49.11 TiB 54.00 TB)
    Data Offset : 264192 sectors
   Super Offset : 8 sectors
   Unused Space : before=264096 sectors, after=0 sectors
          State : active
    Device UUID : 863075ad:9e126cfd:3dbf3501:7b1f408d

Internal Bitmap : 8 sectors from superblock
    Update Time : Tue Jul 16 18:13:26 2024
  Bad Block Log : 512 entries available at offset 80 sectors
       Checksum : ae28e804 - correct
         Events : 104300

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : spare
   Array State : .... ('A' == active, '.' == missing, 'R' == replacing)

/dev/sdd:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : d88446ae:f9f5c759:b4ac531c:933f8c62
           Name : RHomeServer:0  (local to host RHomeServer)
  Creation Time : Thu Aug 10 16:54:57 2023
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 35156391936 sectors (16.37 TiB 18.00 TB)
     Array Size : 52734587904 KiB (49.11 TiB 54.00 TB)
    Data Offset : 264192 sectors
   Super Offset : 8 sectors
   Unused Space : before=264096 sectors, after=0 sectors
          State : active
    Device UUID : ce9975e7:ec998630:acadb3d0:90287950

Internal Bitmap : 8 sectors from superblock
    Update Time : Tue Jul 16 18:13:26 2024
  Bad Block Log : 512 entries available at offset 80 sectors
       Checksum : adfad01f - correct
         Events : 104300

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : spare
   Array State : .... ('A' == active, '.' == missing, 'R' == replacing)

/dev/sde:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : d88446ae:f9f5c759:b4ac531c:933f8c62
           Name : RHomeServer:0  (local to host RHomeServer)
  Creation Time : Thu Aug 10 16:54:57 2023
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 35156391936 sectors (16.37 TiB 18.00 TB)
     Array Size : 52734587904 KiB (49.11 TiB 54.00 TB)
    Data Offset : 264192 sectors
   Super Offset : 8 sectors
   Unused Space : before=264096 sectors, after=0 sectors
          State : clean
    Device UUID : d49ff5be:6b1b8e8d:26b152e9:0bb05ce2

Internal Bitmap : 8 sectors from superblock
    Update Time : Tue Jul 16 18:13:26 2024
  Bad Block Log : 512 entries available at offset 80 sectors
       Checksum : 8d04e29c - correct
         Events : 104300

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : spare
   Array State : .... ('A' == active, '.' == missing, 'R' == replacing)
cat /proc/mdstat

Personalities : [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
unused devices:
The drives seem healthy, with no SMART issues. I tried running mdadm --assemble --scan but I get:
mdadm: /dev/md0 assembled from 0 drives and 4 spares - not enough to start the array.

mdadm: failed to RUN_ARRAY /dev/md0: Invalid argument
mdadm: Not enough devices to start the array.
I get exactly the same for mdadm --assemble /dev/md0 /dev/sdb /dev/sdc /dev/sdd /dev/sde, with or without the flags --run and --force. I'm running Ubuntu 24.10, everything up to date including kernel 6.9.9. What can I do to recover my array? I've read about doing
mdadm --stop /dev/md0
mdadm --zero-superblock /dev/sdb /dev/sdc /dev/sdd /dev/sde
mdadm --create --assume-clean /dev/md0 --level=5 --raid-devices=4 /dev/sdb /dev/sdc /dev/sdd /dev/sde
but I'm afraid that doing so could lose my data and also break any further recovery attempt.
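Before trying anything destructive, a sketch of the copy-on-write overlay approach I am considering, so that experiments never write to the member disks themselves (run as root; /overlays is a placeholder path on a filesystem with enough free space for the changed blocks):

mkdir -p /overlays
for d in sdb sdc sdd sde; do
    size=$(blockdev --getsz "/dev/$d")               # member size in 512-byte sectors
    truncate -s $((size * 512)) "/overlays/$d.ovl"   # sparse file with the same nominal size
    loop=$(losetup -f --show "/overlays/$d.ovl")
    echo "0 $size snapshot /dev/$d $loop P 8" | dmsetup create "${d}-ovl"
done

# then experiment against the overlays instead of the real disks, e.g.:
# mdadm --assemble --force /dev/md0 /dev/mapper/sdb-ovl /dev/mapper/sdc-ovl /dev/mapper/sdd-ovl /dev/mapper/sde-ovl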
Radu Ursache (111 rep)
Jul 16, 2024, 04:30 PM • Last activity: Jul 16, 2024, 04:54 PM
4 votes
1 answers
328 views
RAID5 - Mark a disk faulty during reshape
# Context I have a software RAID5 array (mdadm) on 3 disks. Last week, one disk started to get reading issue: ``` # dmesg ata3.00: exception Emask 0x0 SAct 0x30000001 SErr 0x0 action 0x0 ata3.00: irq_stat 0x40000008 ata3.00: failed command: READ FPDMA QUEUED ata3.00: cmd 60/08:e0:40:0b:c6/00:00:a1:0...
# Context

I have a software RAID5 array (mdadm) on 3 disks. Last week, one disk started to show read errors:
# dmesg
ata3.00: exception Emask 0x0 SAct 0x30000001 SErr 0x0 action 0x0
ata3.00: irq_stat 0x40000008
ata3.00: failed command: READ FPDMA QUEUED
ata3.00: cmd 60/08:e0:40:0b:c6/00:00:a1:00:00/40 tag 28 ncq dma 4096 in
         res 41/40:00:40:0b:c6/00:00:a1:00:00/40 Emask 0x409 (media error) 
ata3.00: status: { DRDY ERR }
ata3.00: error: { UNC }
ata3.00: configured for UDMA/133
sd 2:0:0:0: [sdc] tag#28 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=5s
sd 2:0:0:0: [sdc] tag#28 Sense Key : Medium Error [current] 
sd 2:0:0:0: [sdc] tag#28 Add. Sense: Unrecovered read error - auto reallocate failed
sd 2:0:0:0: [sdc] tag#28 CDB: Read(16) 88 00 00 00 00 00 a1 c6 0b 40 00 00 00 08 00 00
I/O error, dev sdc, sector 2714110784 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
ata3: EH complete
So I formatted and added a new device to the array and then grew the array:
# mdadm --add /dev/md0 /dev/sdd1
# mdadm --grow --raid-devices=4 /dev/md0
It seems that wasn't the best idea. Due to the read errors on the faulty disk, the estimated duration of the reshape operation is more or less 6 months (progress after 12 hours shown below):
$ cat /proc/mdstat 
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10] 
md0 : active raid5 sdd1 sdc1 sde1 sdb1
      5860269184 blocks super 1.2 level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
      [>....................]  reshape =  0.2% (6232960/2930134592) finish=265471.5min speed=183K/sec
      bitmap: 4/22 pages [16KB], 65536KB chunk

unused devices:
So many things can happen in the meantime, such as a power failure or a second disk failing. I would love to tell mdadm to stop reading the faulty disk, but it seems that stopping the reshape operation may lead to data loss.

# Questions

1. Should I mark the disk with read errors as faulty while the reshape operation is running?
2. Is there a clever way to speed up the reshape (see the sketch of the knobs I found after this list)?
3. Any other advice?

Thanks a lot for your ideas and help.
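For question 2, the knobs I have found so far (a sketch; the values are examples, not recommendations):

# current resync/reshape speed limits (KiB/s) and the stripe cache size
cat /proc/sys/dev/raid/speed_limit_min /proc/sys/dev/raid/speed_limit_max
cat /sys/block/md0/md/stripe_cache_size

# example values only, run as root; raising these trades CPU and RAM for reshape throughput
echo 50000  > /proc/sys/dev/raid/speed_limit_min
echo 500000 > /proc/sys/dev/raid/speed_limit_max
echo 8192   > /sys/block/md0/md/stripe_cache_size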
BiBzz (43 rep)
Jun 17, 2024, 11:57 AM • Last activity: Jun 17, 2024, 02:55 PM
0 votes
3 answers
549 views
How to extend LVM RAID5
This is the first time I'm posting here; usually, I always find someone who has had the same problem as me. But I'm facing an issue with my RAID5 setup using LVM and could really use some help. Initially, I had a RAID5 array with three 20TB disks, providing a usable partition of 37TB. Everything was...
This is the first time I'm posting here; usually I find someone who has already had the same problem as me, but I'm facing an issue with my RAID5 setup using LVM and could really use some help. Initially, I had a RAID5 array with three 20TB disks, providing a usable partition of 37TB. Everything was working perfectly. Recently, I decided to add an additional disk to increase my storage capacity while maintaining the benefits of RAID5. Here is my current situation:

- The new disk is successfully added to my Volume Group (VG), which now has a total size of 72.76TB.
- I extended my Logical Volume (LV) to 54TB.
- However, I'm unable to increase the size of my partition (see the sketch after the output below).

I'm not sure what I'm missing or doing wrong. And of course, I don't have the ability to make backups. Any advice or guidance would be greatly appreciated! Thanks in advance!
vgs
  VG      #PV #LV #SN Attr   VSize    VFree
  vg-raid   4   1   0 wz--n-   72,76t    0
lvs --all
  LV                  VG      Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  home                hyp1-vg -wi-ao---- 895,50g                                        

  root                hyp1-vg -wi-ao----  23,28g                                        

  swap_1              hyp1-vg -wi-ao---- 976,00m                                        

  tmp                 hyp1-vg -wi-ao----   2500), lowering kernel.perf_event_max_sample_rate to 79750
[11111.774521] Buffer I/O error on dev dm-8, logical block 9766436864, lost async page write
[11111.774613] Buffer I/O error on dev dm-8, logical block 9766961152, lost async page write
[11111.774699] Buffer I/O error on dev dm-8, logical block 9767485440, lost async page write
[11111.774792] Buffer I/O error on dev dm-8, logical block 9768009728, lost async page write
[11111.774882] Buffer I/O error on dev dm-8, logical block 9768534016, lost async page write
[11111.774898] Buffer I/O error on dev dm-8, logical block 9769058304, lost async page write
[11111.775052] Buffer I/O error on dev dm-8, logical block 9769582592, lost async page write
[11111.775151] Buffer I/O error on dev dm-8, logical block 9770106880, lost async page write
[11111.775264] Buffer I/O error on dev dm-8, logical block 9770631168, lost async page write
[11111.775376] Buffer I/O error on dev dm-8, logical block 9771155456, lost async page write
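Coming back to the resize step, a minimal sketch of what I suspect is missing (lv_data is a placeholder for the real LV name, and this assumes the LV carries an ext4 filesystem directly):

# the LV was already extended to 54T above; the filesystem inside it still has to be grown
sudo resize2fs /dev/vg-raid/lv_data     # online grow for ext4 (xfs_growfs for XFS)
df -h                                   # the mounted size should now reflect the new LV size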
Tonnulus (1 rep)
May 21, 2024, 01:19 PM • Last activity: May 23, 2024, 10:40 PM
0 votes
1 answers
31 views
raid 5 mdadm replace with read errors
I have a software RAID 5 with 3 disks. For a few days, sda has been giving errors, and today the new disk arrived. I launched mdadm --replace and the process started correctly, but due to read errors on the disk, the operation is taking a very long time (over a week for about 1TB). Should I be patie...
I have a software RAID 5 with 3 disks. For a few days sda has been giving errors, and today the new disk arrived. I launched mdadm --replace and the process started correctly, but due to read errors on the disk the operation is taking a very long time (over a week for about 1TB). Should I be patient, or can I use --remove on sda and force the recovery to use parity information instead? What do you recommend?
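For reference, the fallback I am considering if the --replace never finishes (a sketch only; md0 and /dev/sda are placeholders, and whether it is wise to do this in the middle of a --replace is exactly what I am asking):

sudo mdadm /dev/md0 --fail /dev/sda       # mark the failing member faulty
sudo mdadm /dev/md0 --remove /dev/sda     # remove it from the array
cat /proc/mdstat                          # the new disk should then rebuild from parity off the two good members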
Gabriel Rolland (1 rep)
Apr 29, 2024, 06:25 PM • Last activity: Apr 30, 2024, 07:54 AM
4 votes
2 answers
3681 views
Missing mdadm raid5 array reassembles as raid0 after powerout
I had RAID5 array of three disks with no spares. There was a power out, and on reboot, the array failed to come back up. In fact, the /dev/md127 device disappeared entirely, and was replaced by an incorrect /dev/md0. It was the only array on the machine. I've tried to reassemble it from the three co...
I had a RAID5 array of three disks with no spares. There was a power outage, and on reboot the array failed to come back up. In fact, the /dev/md127 device disappeared entirely and was replaced by an incorrect /dev/md0. It was the only array on the machine. I've tried to reassemble it from the three component devices, but the assembly keeps creating a raid0 array instead of a raid5. The details of the three disks are

root@bragi ~ # mdadm -E /dev/sdc1
/dev/sdc1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 002fa352:9968adbd:b0efdfea:c60ce290
           Name : bragi:0  (local to host bragi)
  Creation Time : Sun Oct 30 00:10:47 2011
     Raid Level : raid5
   Raid Devices : 3

 Avail Dev Size : 2930269954 (1397.26 GiB 1500.30 GB)
     Array Size : 2930269184 (2794.52 GiB 3000.60 GB)
  Used Dev Size : 2930269184 (1397.26 GiB 1500.30 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=770 sectors
          State : clean
    Device UUID : a8a1b48a:ec28a09c:7aec4559:b839365e

    Update Time : Sat Oct 11 09:20:36 2014
       Checksum : 7b1ad793 - correct
         Events : 15084

         Layout : left-symmetric
     Chunk Size : 512K

    Device Role : Active device 2
    Array State : AAA ('A' == active, '.' == missing, 'R' == replacing)

root@bragi ~ # mdadm -E /dev/sdd1
/dev/sdd1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 002fa352:9968adbd:b0efdfea:c60ce290
           Name : bragi:0  (local to host bragi)
  Creation Time : Sun Oct 30 00:10:47 2011
     Raid Level : raid5
   Raid Devices : 3

 Avail Dev Size : 2930269954 (1397.26 GiB 1500.30 GB)
     Array Size : 2930269184 (2794.52 GiB 3000.60 GB)
  Used Dev Size : 2930269184 (1397.26 GiB 1500.30 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=770 sectors
          State : clean
    Device UUID : 36c08006:d5442799:b028db7c:4d4d33c5

    Update Time : Wed Oct 15 08:09:37 2014
       Checksum : 7e05979e - correct
         Events : 15196

         Layout : left-symmetric
     Chunk Size : 512K

    Device Role : Active device 1
    Array State : .A. ('A' == active, '.' == missing, 'R' == replacing)

root@bragi ~ # mdadm -E /dev/sde1
/dev/sde1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x8
     Array UUID : 002fa352:9968adbd:b0efdfea:c60ce290
           Name : bragi:0  (local to host bragi)
  Creation Time : Sun Oct 30 00:10:47 2011
     Raid Level : raid5
   Raid Devices : 3

 Avail Dev Size : 2930275057 (1397.26 GiB 1500.30 GB)
     Array Size : 2930269184 (2794.52 GiB 3000.60 GB)
  Used Dev Size : 2930269184 (1397.26 GiB 1500.30 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1960 sectors, after=5873 sectors
          State : clean
    Device UUID : b048994d:ffbbd710:8eb365d2:b0868ef0

    Update Time : Wed Oct 15 08:09:37 2014
  Bad Block Log : 512 entries available at offset 72 sectors - bad blocks present.
       Checksum : bdbc6fc4 - correct
         Events : 15196

         Layout : left-symmetric
     Chunk Size : 512K

    Device Role : spare
    Array State : .A. ('A' == active, '.' == missing, 'R' == replacing)

I stopped the old array, then reassembled as follows (blank lines inserted for clarity)

root@bragi ~ # mdadm -S /dev/md0
mdadm: stopped /dev/md0

root@bragi ~ # mdadm -A /dev/md0 /dev/sdd1 /dev/sdc1 /dev/sde1
mdadm: /dev/md0 assembled from 1 drive and 1 spare - not enough to start the array.

root@bragi ~ # cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : inactive sdd1(S) sde1(S) sdc1(S)
      4395407482 blocks super 1.2

unused devices:

root@bragi ~ # mdadm -D /dev/md0
/dev/md0:
        Version : 1.2
     Raid Level : raid0
  Total Devices : 3
    Persistence : Superblock is persistent

          State : inactive

           Name : bragi:0  (local to host bragi)
           UUID : 002fa352:9968adbd:b0efdfea:c60ce290
         Events : 15084

    Number   Major   Minor   RaidDevice

       -       8       33        -        /dev/sdc1
       -       8       49        -        /dev/sdd1
       -       8       65        -        /dev/sde1

root@bragi ~ # mdadm -Q /dev/md0
/dev/md0: is an md device which is not active

Why is this assembling as a raid0 device and not a raid5 device, as the superblocks of the components indicate it should? Is it because /dev/sde1 is marked as spare?

**EDIT:** I tried the following (according to @wurtel's suggestion), with the following results

# mdadm --create -o --assume-clean --level=5 --layout=ls --chunk=512 --raid-devices=3 /dev/md0 missing /dev/sdd1 /dev/sde1
mdadm: /dev/sdd1 appears to contain an ext2fs file system
       size=1465135936K  mtime=Sun Oct 23 13:06:11 2011
mdadm: /dev/sdd1 appears to be part of a raid array:
       level=raid5 devices=3 ctime=Sun Oct 30 00:10:47 2011
mdadm: /dev/sde1 appears to be part of a raid array:
       level=raid5 devices=3 ctime=Sun Oct 30 00:10:47 2011
mdadm: partition table exists on /dev/sde1 but will be lost or
       meaningless after creating array
Continue creating array? no
mdadm: create aborted.
#

So it looks like /dev/sde1 is causing the problem again. I suspect this is because it has been marked as spare. Is there any way I can force its role back to active? In that case I suspect assembling the array might even work.
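For reference, the forced assembly I am thinking of trying next (a sketch only; it uses the two members still recorded as active data devices and deliberately leaves out /dev/sde1, which the metadata marks as a spare):

# stop the half-assembled array, then force-assemble degraded from sdd1 and sdc1;
# --force reconciles the differing event counts, --run starts it with a member missing
mdadm -S /dev/md0
mdadm -A --force --run /dev/md0 /dev/sdd1 /dev/sdc1
cat /proc/mdstat        # hoping for a started, degraded raid5 ([_UU] or similar)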
sirlark (253 rep)
Oct 22, 2014, 07:37 PM • Last activity: Apr 3, 2024, 08:26 AM
3 votes
1 answers
2589 views
Migrating from hardware to software RAID
I have an old PCI-X controller running 8 drives in RAID 5. I'd like to dump the controller and go to software RAID under Ubuntu. Is there a way to do this and retain the data from current array? **EDIT:** (and a slight tangent) The answers below are certainly fine, but here's a bit of added detail i...
I have an old PCI-X controller running 8 drives in RAID 5. I'd like to dump the controller and go to software RAID under Ubuntu. Is there a way to do this and retain the data from current array? **EDIT:** (and a slight tangent) The answers below are certainly fine, but here's a bit of added detail in my specific situation. The hardware raid was being done by an old Promise raid card (don't remember the model number). My whole system went down (dead mobo, most likely) and the old controller was a PCI-X card (not to be confused with PCI-e). I asked the question hoping to salvage my data. What I did was buy another Promise (HighPoint) card, and plug all the drives in and install Ubuntu. I was expecting to have to rebuild the array, but surprisingly enough, the HighPoint card saw the old array and brought it up clean. Moral of the story - it looks like at least Promise controllers store their metadata on the arrays themselves, and appear to have some amount of forward compatibility.
kolosy (133 rep)
Aug 26, 2012, 07:46 PM • Last activity: Mar 7, 2024, 11:21 AM
1 votes
1 answers
148 views
Trouble Mounting RAID
I have a RAID array that I cannot seem to mount after a power failure. The mount error says it cannot find the UUID even though it is to UUID reported by mdadm. mdadm --examine --scan ARRAY /dev/md/0 metadata=1.2 UUID=4a9ed4ce:505da073:afd780ed:3e5d5622 name=nas:0 fstab entry is: UUID=4a9ed4ce:505da...
I have a RAID array that I cannot seem to mount after a power failure. The mount error says it cannot find the UUID even though it is the UUID reported by mdadm.

mdadm --examine --scan
ARRAY /dev/md/0 metadata=1.2 UUID=4a9ed4ce:505da073:afd780ed:3e5d5622 name=nas:0

The fstab entry is:

UUID=4a9ed4ce:505da073:afd780ed:3e5d5622 /md0 ext4 defaults 0 0

Trying to mount the device:

mount /md0
mount: /md0: can't find UUID=4a9ed4ce:505da073:afd780ed:3e5d5622.

My mdadm.conf:

ARRAY /dev/md/nas:0 level=raid5 num-devices=6 metadata=1.2 name=nas:0 UUID=4a9ed4ce:505da073:afd780ed:3e5d5622
   devices=/dev/sda1,/dev/sdb1,/dev/sdc1,/dev/sdd1,/dev/sde1,/dev/sdf1

Can someone help me figure this out? Thanks. As an aside, I don't know if this is an Ubuntu thing, but I find it odd that the device is /dev/md/0 rather than /dev/md0.
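For reference, what I plan to compare next (a sketch; the fstab value below is a placeholder): the UUID= form in fstab refers to the filesystem UUID that blkid prints for the assembled md device, which is not the same thing as the md array UUID that mdadm --examine --scan reports.

sudo blkid /dev/md/nas:0      # or /dev/md127, whatever node the assembled array gets

# fstab should carry the filesystem UUID printed above; placeholder value shown
UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  /md0  ext4  defaults  0  0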
Wt Riker (111 rep)
Aug 24, 2023, 03:53 PM • Last activity: Aug 24, 2023, 06:14 PM
1 votes
0 answers
166 views
LVM on RAID 5: LV does not mount at boot, need fsck
I am using Debian 11. I have a mdadm RAID 5 array of 5 disks (4 active, 1 spare) used as a PV for a VG in which 2 LVs have been created : one for home, one for backup. The home LV mounts at every boot, no problem. The backup LV doesn't, boot process hangs a bit on fsck then fails. I have to go in em...
I am using Debian 11. I have an mdadm RAID 5 array of 5 disks (4 active, 1 spare) used as a PV for a VG in which 2 LVs have been created: one for home, one for backup. The home LV mounts at every boot, no problem. The backup LV doesn't; the boot process hangs a bit on fsck and then fails. I have to go into emergency mode, run fsck manually as root on the backup LV (it seems to try and correct lots of inodes), reboot, and then everything boots fine. I have been looking at the smartctl output for each disk, but all have Reallocated_Sector_Ct and Current_Pending_Sector at 0. Has anybody got any idea? Let me know what outputs you need. Thanks a lot.

Edit: so, yeah, after checking my /etc/fstab, the backup LV was mounted after the swap and the ODD, whereas the home LV was mounted before these. I tried to move the backup LV right after the home LV, before swap and ODD, and the problem disappeared. Problem solved, thank you all for the help.

Edit2: nope, the problem is still here after a regular shutdown. So, as asked, here is dmesg:
[    1.388548] sd 1:0:0:0: [sda] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB)
[    1.388556] sd 1:0:0:0: [sda] Write Protect is off
[    1.388558] sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00
[    1.388578] sd 1:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    1.388603] sd 1:0:0:0: [sda] Preferred minimum I/O size 512 bytes
[    1.388631] sd 2:0:0:0: [sdb] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB)
[    1.388633] sd 2:0:0:0: [sdb] 4096-byte physical blocks
[    1.388643] sd 2:0:0:0: [sdb] Write Protect is off
[    1.388646] sd 2:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[    1.388659] sd 4:0:0:0: [sde] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB)
[    1.388661] sd 2:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    1.388662] sd 4:0:0:0: [sde] 4096-byte physical blocks
[    1.388667] sd 5:0:0:0: [sdd] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB)
[    1.388668] sd 5:0:0:0: [sdd] 4096-byte physical blocks
[    1.388670] sd 3:0:0:0: [sdc] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB)
[    1.388672] sd 3:0:0:0: [sdc] 4096-byte physical blocks
[    1.388672] sd 4:0:0:0: [sde] Write Protect is off
[    1.388674] sd 4:0:0:0: [sde] Mode Sense: 00 3a 00 00
[    1.388676] sd 5:0:0:0: [sdd] Write Protect is off
[    1.388678] sd 5:0:0:0: [sdd] Mode Sense: 00 3a 00 00
[    1.388683] sd 3:0:0:0: [sdc] Write Protect is off
[    1.388684] sd 3:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[    1.388686] sd 2:0:0:0: [sdb] Preferred minimum I/O size 4096 bytes
[    1.388691] sd 4:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    1.388694] sd 5:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    1.388699] sd 3:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    1.388712] sd 4:0:0:0: [sde] Preferred minimum I/O size 4096 bytes
[    1.388714] sd 5:0:0:0: [sdd] Preferred minimum I/O size 4096 bytes
[    1.388725] sd 3:0:0:0: [sdc] Preferred minimum I/O size 4096 bytes
[    1.429357]  sde: sde1
[    1.429601] sd 4:0:0:0: [sde] Attached SCSI disk
[    1.429697]  sdd: sdd1
[    1.429869] sd 5:0:0:0: [sdd] Attached SCSI disk
[    1.430626]  sdb: sdb1
[    1.430853] sd 2:0:0:0: [sdb] Attached SCSI disk
[    1.441422]  sda: sda1
[    1.441590] sd 1:0:0:0: [sda] Attached SCSI disk
[    1.450143]  sdc: sdc1
[    1.450331] sd 3:0:0:0: [sdc] Attached SCSI disk
[    1.531633] sr 0:0:0:0: [sr0] scsi3-mmc drive: 48x/48x writer dvd-ram cd/rw xa/form2 cdda tray
[    1.531644] cdrom: Uniform CD-ROM driver Revision: 3.20
[    1.594613] sr 0:0:0:0: Attached scsi CD-ROM sr0
[    1.868769] md/raid:md0: device sda1 operational as raid disk 3
[    1.868771] md/raid:md0: device sdd1 operational as raid disk 2
[    1.868772] md/raid:md0: device sdc1 operational as raid disk 0
[    1.868772] md/raid:md0: device sde1 operational as raid disk 1
[    1.869107] md/raid:md0: raid level 5 active with 4 out of 4 devices, algorithm 2
[    1.892978] md0: detected capacity change from 0 to 11720288256
[   33.846459] EXT4-fs (dm-0): mounted filesystem with ordered data mode. Quota mode: none.
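Going back to the fstab ordering, a sketch of what I will try next (the VG/LV names and mount points are placeholders): keep a normal fsck pass number on the backup LV but mark it nofail, so a failed check no longer drops the boot into emergency mode.

# /etc/fstab sketch; vg0, home and backup are placeholder names
/dev/mapper/vg0-home    /home    ext4  defaults                                      0  2
/dev/mapper/vg0-backup  /backup  ext4  defaults,nofail,x-systemd.device-timeout=30s  0  2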
Vartaghan (11 rep)
Jun 12, 2023, 01:16 PM • Last activity: Jun 23, 2023, 07:06 AM
0 votes
0 answers
34 views
Missing usable space on RAID5
I have 3 drives, 4TB each, in a RAID5 setup. Usable space is 7.24 TB, all good & checks out. I use 5.5Tb and it reports 600Gb left. Maths does not checkout. I get an external drive and I move 5.5Tb of storage off the RAID5 drives an onto a separate drive and the RAID5 drives report usable space of 7...
I have 3 drives, 4TB each, in a RAID5 setup. Usable space is 7.24 TB, which checks out. I had 5.5TB in use and it reported 600GB left; the maths does not check out. I got an external drive, moved the 5.5TB of storage off the RAID5 drives onto it, and the RAID5 drives then reported 7.24 TB of usable space again. So ~1TB of space was missing on the RAID5 drives and I don't understand why. I'm guessing something got confused when calculating free space, and moving files around forced it to recalculate the remaining space?

At the time...

- I tried using df and it reported free space as 600GB (so ~1TB missing)
- I tried du and it reported total used space as 5.5TB (so there should be 1.74 TB remaining)
- I tried lsof | grep deleted and restarted any processes using deleted files, no difference
- I restarted several times

Is this common on RAID5 setups? Is there something I can do differently if it happens again (a sketch of what I plan to check is below)?

Info on the system in use...

Device: TNAS F5-221

uname -r
4.19.165+

cat /etc/os-release
NAME=Terra-Master Tnas
VERSION=3.x
ID=Tnas
VERSION_ID=2018.04.9
PRETTY_NAME="Tnas 2018.04.9"
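A sketch of what I plan to check if it happens again (the device and mount point are placeholders, and this assumes the data volume is ext4):

# ext4 reserves 5% of blocks for root by default (~370GB on a 7.24 TB volume),
# which df counts as unavailable while du never sees it
tune2fs -l /dev/md0 | grep -i 'reserved block'
tune2fs -m 1 /dev/md0          # optionally shrink the reservation to 1% on a data-only volume

# deleted-but-still-open files also show up in df but not in du
lsof +L1
df -h /mnt/raid ; du -shx /mnt/raid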
wabbit42 (101 rep)
Jan 21, 2023, 11:47 AM