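I suspect RAID is a common choice when building a high-capacity storage server. Still, the chance that the disk array itself suffers an unrecoverable failure, whether from multiple disk failures, a sudden power loss, or an operator mistake, is not zero. RAID-Z is ZFS's software RAID and is extremely robust, shutting out even the write hole problem, but once a disk array grows past 10 TB you cannot help being cautious about any RAID operation. Even with RAID-Z, losing all data to a collapsed array is something I want to avoid at all costs. So I am going to evaluate a flexible and robust storage server built on SnapRAID and mergerfs.
Or rather, a big part of the motivation is that I understand FreeBSD so poorly that continuing to run XigmaNAS scares me. ZFS on Linux was also an option, but I figured that while I was at it I might as well drop RAID altogether. It's not as if I'd be in any trouble without real-time synchronization.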
SnapRAID
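SnapRAID is a backup tool for disk arrays that mimics RAID (5, 6, or higher levels of redundancy). (As the word "mimics" suggests, keep in mind that it is not strictly RAID.) By backing up the parity of the whole disk array to separate disks, it can recover from up to six disk failures. Because SnapRAID runs on top of a JBOD layout, each disk can be accessed directly. As a result, even if more disks fail than the parity disks can cover, the disk array as a whole does not become unrecoverable.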
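To make the parity idea concrete, here is a toy sketch of single-parity recovery using XOR in bash. It only illustrates the general principle behind one parity level; it is not SnapRAID's actual implementation, and the byte values and variable names are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Toy values: one byte per "data disk" (hypothetical, for illustration only).
d1=$((0x55))
d2=$((0xA3))
d3=$((0x0F))

# The parity "disk" stores the XOR of all data disks.
parity=$(( d1 ^ d2 ^ d3 ))

# Pretend d2 has died: XOR the parity with the surviving disks to rebuild it.
rebuilt_d2=$(( parity ^ d1 ^ d3 ))

printf 'parity=0x%02X rebuilt_d2=0x%02X\n' "$parity" "$rebuilt_d2"
```

The other data disks are only read during the rebuild, which is why each disk stays independently usable.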
Pros
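High flexibility
  Data/parity disks can be added later
  Disks of different capacities can be used as data disks
High robustness
  Can recover from up to six disk failures
  Even if more disks fail than parity can cover, only the failed disks themselves become unrecoverable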
sudo dnf install -y gcc make
curl -LO https://github.com/amadvance/snapraid/releases/download/v12.0/snapraid-12.0.tar.gz
tar zxvf snapraid-12.0.tar.gz
cd snapraid-12.0/
./configure
make
make check
sudo make install
data d1 /mnt/data/disk1/
data d2 /mnt/data/disk2/
data d3 /mnt/data/disk3/
data d4 /mnt/data/disk4/
data d5 /mnt/data/disk5/
# data d6 /mnt/data/disk6/
# data d7 /mnt/data/disk7/
# data d8 /mnt/data/disk8/
exclude *.unrecoverable
exclude /lost+found/
blocksize 64
# hashsize 16
parity
This is where the parity file is stored, so point it at a disk dedicated to parity.
2-parity
To add N-parity redundancy, use 2-parity, 3-parity, and so on to define additional parity levels. Up to six parity levels (6-parity) can be configured. With N parity levels, up to N failed disks can be recovered.
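As a rough sketch of how the two options fit together, the snippet below writes a draft with one `parity` line and one `2-parity` line to a scratch file for review. The `2-parity` path is the one that appears in the sync logs of this article; the first `parity` path is an assumption that simply mirrors it, and the `draft` variable is hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Write to a scratch file first; review it before merging into /etc/snapraid.conf.
draft="$(mktemp)"

cat > "$draft" <<'EOF'
# First parity level (path is an assumption, mirroring the 2-parity path below).
parity /mnt/parity/disk1/snapraid.parity
# Second parity level, kept on a different physical disk.
2-parity /mnt/parity/disk2/snapraid.2-parity
EOF

cat "$draft"
```

Each parity level should live on its own disk; putting two parity files on the same disk would defeat the purpose.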
[rocky@localhost ~]$ sudo /usr/local/bin/snapraid sync
Self test...
Loading state from /var/snapraid.content...
WARNING! Content file '/var/snapraid.content' not found, trying with another copy...
Loading state from /mnt/data/disk1/snapraid.content...
WARNING! Content file '/mnt/data/disk1/snapraid.content' not found, trying with another copy...
Loading state from /mnt/data/disk2/snapraid.content...
WARNING! Content file '/mnt/data/disk2/snapraid.content' not found, trying with another copy...
Loading state from /mnt/data/disk3/snapraid.content...
WARNING! Content file '/mnt/data/disk3/snapraid.content' not found, trying with another copy...
Loading state from /mnt/data/disk4/snapraid.content...
WARNING! Content file '/mnt/data/disk4/snapraid.content' not found, trying with another copy...
Loading state from /mnt/data/disk5/snapraid.content...
No content file found. Assuming empty.
WARNING! With 5 disks it's recommended to use two parity levels.
Scanning...
Scanned d3 in 0 seconds
Scanned d2 in 0 seconds
Scanned d1 in 0 seconds
Scanned d4 in 0 seconds
Scanned d5 in 0 seconds
Using 0 MiB of memory for the file-system.
Initializing...
Resizing...
Saving state to /var/snapraid.content...
Saving state to /mnt/data/disk1/snapraid.content...
Saving state to /mnt/data/disk2/snapraid.content...
Saving state to /mnt/data/disk3/snapraid.content...
Saving state to /mnt/data/disk4/snapraid.content...
Saving state to /mnt/data/disk5/snapraid.content...
Verifying...
Verified /mnt/data/disk1/snapraid.content in 0 seconds
Verified /var/snapraid.content in 0 seconds
Verified /mnt/data/disk2/snapraid.content in 0 seconds
Verified /mnt/data/disk3/snapraid.content in 0 seconds
Verified /mnt/data/disk4/snapraid.content in 0 seconds
Verified /mnt/data/disk5/snapraid.content in 0 seconds
Nothing to do
[rocky@localhost ~]$ echo 'UNCHI!' | sudo tee /mnt/data/disk1/gomi.txt
UNCHI!
[rocky@localhost ~]$ ll /mnt/data/disk1
total 8
-rw-r--r--. 1 root root 7 Jan 2 03:46 gomi.txt
-rw-------. 1 root root 88 Jan 2 03:43 snapraid.content
[rocky@localhost ~]$ sudo /usr/local/bin/snapraid sync
Self test...
Loading state from /var/snapraid.content...
WARNING! With 5 disks it's recommended to use two parity levels.
Scanning...
Scanned d5 in 0 seconds
Scanned d3 in 0 seconds
Scanned d4 in 0 seconds
Scanned d2 in 0 seconds
Scanned d1 in 0 seconds
Using 0 MiB of memory for the file-system.
Initializing...
Resizing...
Saving state to /var/snapraid.content...
Saving state to /mnt/data/disk1/snapraid.content...
Saving state to /mnt/data/disk2/snapraid.content...
Saving state to /mnt/data/disk3/snapraid.content...
Saving state to /mnt/data/disk4/snapraid.content...
Saving state to /mnt/data/disk5/snapraid.content...
Verifying...
Verified /var/snapraid.content in 0 seconds
Verified /mnt/data/disk1/snapraid.content in 0 seconds
Verified /mnt/data/disk2/snapraid.content in 0 seconds
Verified /mnt/data/disk3/snapraid.content in 0 seconds
Verified /mnt/data/disk4/snapraid.content in 0 seconds
Verified /mnt/data/disk5/snapraid.content in 0 seconds
Using 48 MiB of memory for 128 cached blocks.
Selecting...
Syncing...
100% completed, 1 MB accessed in 0:00
[rocky@localhost storage]$ sudo /usr/local/bin/snapraid sync
Self test...
Error accessing 'content' dir '/mnt/data/disk1' specification in '/etc/snapraid.conf' at line 5
After adding the new disk, it showed up as /dev/sdl, so I format it and mount it.
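The exact commands are not shown in this article, so the following is only a sketch of what this step could look like. It assumes ext4 on the whole device with no partition table and the mount point /mnt/data/disk1 of the failed disk; the `run` and `prepare_disk` helpers and the `DRY_RUN` switch are hypothetical names. By default it only prints the commands, since formatting the wrong device is destructive.

```shell
#!/usr/bin/env bash
set -euo pipefail

DEVICE="${DEVICE:-/dev/sdl}"                # new disk as reported by the kernel
MOUNTPOINT="${MOUNTPOINT:-/mnt/data/disk1}" # assumed: same path the old d1 used
DRY_RUN="${DRY_RUN:-1}"                     # 1 = print only, 0 = really execute

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

prepare_disk() {
    run sudo mkfs.ext4 "$DEVICE"           # assumption: ext4, whole device
    run sudo mkdir -p "$MOUNTPOINT"
    run sudo mount "$DEVICE" "$MOUNTPOINT"
}

prepare_disk
```

Running it with `DRY_RUN=0` executes the commands for real; double-check the device name with `lsblk` before doing so.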
# -d --filter-disk: select the target disk
# -l --log: where to save the log
[rocky@localhost ~]$ sudo /usr/local/bin/snapraid -d d1 -l fix.log fix
Self test...
Loading state from /var/snapraid.content...
UUID change for disk 'd1' from 'e85bc923-5a55-45d9-ac4b-1d9e2640af27' to '9e2ac5c0-00f9-49d0-83cd-57b85a9eaffd'
Searching disk d1...
Searching disk d2...
Searching disk d3...
Searching disk d4...
Searching disk d5...
Searching disk d6...
Searching disk d7...
Searching disk d8...
Selecting...
Using 0 MiB of memory for the file-system.
Initializing...
Selecting...
Fixing...
recovered gomi.txt
recovered file46.txt
recovered file14.txt
recovered file22.txt
recovered file30.txt
recovered file38.txt
recovered file54.txt
recovered file62.txt
recovered file70.txt
recovered file78.txt
recovered file86.txt
recovered file94.txt
100% completed, 1 MB accessed in 0:00
12 errors
12 recovered errors
0 unrecoverable errors
Everything OK
Run snapraid check to confirm that the data has been recovered.
# -a --audit-only: ignore parity entirely and verify only the data
[rocky@localhost ~]$ sudo /usr/local/bin/snapraid -d d1 -a check
Self test...
Loading state from /var/snapraid.content...
UUID change for disk 'd1' from '3f866f6e-10a8-4551-a460-d6264482d54d' to '19b373e5-b98b-45e9-864e-af90435557ce'
Selecting...
Using 49 MiB of memory for the file-system.
Initializing...
Selecting...
Hashing...
100% completed, 2415 MB accessed in 0:00
Everything OK
[rocky@localhost ~]$ sudo /usr/local/bin/snapraid sync
Self test...
Loading state from /var/snapraid.content...
UUID change for disk 'd1' from 'e85bc923-5a55-45d9-ac4b-1d9e2640af27' to '9e2ac5c0-00f9-49d0-83cd-57b85a9eaffd'
Scanning...
Scanned d2 in 0 seconds
Scanned d1 in 0 seconds
Scanned d3 in 0 seconds
Scanned d4 in 0 seconds
Scanned d5 in 0 seconds
Scanned d6 in 0 seconds
Scanned d8 in 0 seconds
Scanned d7 in 0 seconds
WARNING! UUID is changed for disks: 'd1'. Not using inodes to detect move operations.
Using 0 MiB of memory for the file-system.
Initializing...
Resizing...
Saving state to /var/snapraid.content...
Saving state to /mnt/data/disk1/snapraid.content...
Saving state to /mnt/data/disk2/snapraid.content...
Saving state to /mnt/data/disk3/snapraid.content...
Saving state to /mnt/data/disk4/snapraid.content...
Saving state to /mnt/data/disk5/snapraid.content...
Saving state to /mnt/data/disk6/snapraid.content...
Saving state to /mnt/data/disk7/snapraid.content...
Saving state to /mnt/data/disk8/snapraid.content...
Verifying...
Verified /var/snapraid.content in 0 seconds
Verified /mnt/data/disk1/snapraid.content in 0 seconds
Verified /mnt/data/disk2/snapraid.content in 0 seconds
Verified /mnt/data/disk3/snapraid.content in 0 seconds
Verified /mnt/data/disk4/snapraid.content in 0 seconds
Verified /mnt/data/disk5/snapraid.content in 0 seconds
Verified /mnt/data/disk6/snapraid.content in 0 seconds
Verified /mnt/data/disk7/snapraid.content in 0 seconds
Verified /mnt/data/disk8/snapraid.content in 0 seconds
Using 80 MiB of memory for 128 cached blocks.
Selecting...
Syncing...
Nothing to do
[rocky@localhost ~]$ ll /mnt/data/disk1
total 56
-rw-------. 1 root root 3 Jan 3 10:01 file14.txt
-rw-------. 1 root root 3 Jan 3 10:01 file22.txt
-rw-------. 1 root root 3 Jan 3 10:01 file30.txt
-rw-------. 1 root root 3 Jan 3 10:01 file38.txt
-rw-------. 1 root root 3 Jan 3 10:01 file46.txt
-rw-------. 1 root root 3 Jan 3 10:01 file54.txt
-rw-------. 1 root root 3 Jan 3 10:01 file62.txt
-rw-------. 1 root root 3 Jan 3 10:01 file70.txt
-rw-------. 1 root root 3 Jan 3 10:01 file78.txt
-rw-------. 1 root root 3 Jan 3 10:01 file86.txt
-rw-------. 1 root root 3 Jan 3 10:01 file94.txt
-rw-------. 1 root root 7 Jan 3 09:48 gomi.txt
-rw-------. 1 root root 5134 Jan 3 10:17 snapraid.content
[rocky@localhost ~]$ sudo /usr/local/bin/snapraid sync -F
Self test...
Loading state from /var/snapraid.content...
Scanning...
Scanned d1 in 0 seconds
Scanned d2 in 0 seconds
Scanned d4 in 0 seconds
Scanned d5 in 0 seconds
Scanned d3 in 0 seconds
Scanned d6 in 0 seconds
Scanned d7 in 0 seconds
Scanned d8 in 0 seconds
Using 0 MiB of memory for the file-system.
Initializing...
Resizing...
Saving state to /var/snapraid.content...
Saving state to /mnt/data/disk1/snapraid.content...
Saving state to /mnt/data/disk2/snapraid.content...
Saving state to /mnt/data/disk3/snapraid.content...
Saving state to /mnt/data/disk4/snapraid.content...
Saving state to /mnt/data/disk5/snapraid.content...
Saving state to /mnt/data/disk6/snapraid.content...
Saving state to /mnt/data/disk7/snapraid.content...
Saving state to /mnt/data/disk8/snapraid.content...
Verifying...
Verified /var/snapraid.content in 0 seconds
Verified /mnt/data/disk1/snapraid.content in 0 seconds
Verified /mnt/data/disk2/snapraid.content in 0 seconds
Verified /mnt/data/disk3/snapraid.content in 0 seconds
Verified /mnt/data/disk4/snapraid.content in 0 seconds
Verified /mnt/data/disk5/snapraid.content in 0 seconds
Verified /mnt/data/disk6/snapraid.content in 0 seconds
Verified /mnt/data/disk7/snapraid.content in 0 seconds
Verified /mnt/data/disk8/snapraid.content in 0 seconds
Using 80 MiB of memory for 128 cached blocks.
Selecting...
Syncing...
100% completed, 1 MB accessed in 0:00
Error syncing parity file '/mnt/parity/disk2/snapraid.2-parity'. Input/output error.
DANGER! Unexpected sync error in 2-Parity disk. Ensure that disk '2-parity' is sane.
Stopping at block 13
Error advising parity file '/mnt/parity/disk2/snapraid.2-parity'. Input/output error.
Input/Output error in parity '2-parity' at position '1'
Error advising parity file '/mnt/parity/disk2/snapraid.2-parity'. Input/output error.
Input/Output error in parity '2-parity' at position '2'
Error advising parity file '/mnt/parity/disk2/snapraid.2-parity'. Input/output error.
Input/Output error in parity '2-parity' at position '3'
Error writing file '/mnt/parity/disk2/snapraid.2-parity'. Input/output error.
Input/Output error in parity '2-parity' at position '4'
Error writing file '/mnt/parity/disk2/snapraid.2-parity'. Input/output error.
Input/Output error in parity '2-parity' at position '5'
Error writing file '/mnt/parity/disk2/snapraid.2-parity'. Input/output error.
Input/Output error in parity '2-parity' at position '6'
Error writing file '/mnt/parity/disk2/snapraid.2-parity'. Input/output error.
Input/Output error in parity '2-parity' at position '7'
Error writing file '/mnt/parity/disk2/snapraid.2-parity'. Input/output error.
Input/Output error in parity '2-parity' at position '8'
Error writing file '/mnt/parity/disk2/snapraid.2-parity'. Input/output error.
Input/Output error in parity '2-parity' at position '9'
Error writing file '/mnt/parity/disk2/snapraid.2-parity'. Input/output error.
Input/Output error in parity '2-parity' at position '10'
Error writing file '/mnt/parity/disk2/snapraid.2-parity'. Input/output error.
Input/Output error in parity '2-parity' at position '11'
Error writing file '/mnt/parity/disk2/snapraid.2-parity'. Input/output error.
Input/Output error in parity '2-parity' at position '12'
Saving state to /var/snapraid.content...
Saving state to /mnt/data/disk1/snapraid.content...
Saving state to /mnt/data/disk2/snapraid.content...
Saving state to /mnt/data/disk3/snapraid.content...
Saving state to /mnt/data/disk4/snapraid.content...
Saving state to /mnt/data/disk5/snapraid.content...
Saving state to /mnt/data/disk6/snapraid.content...
Saving state to /mnt/data/disk7/snapraid.content...
Saving state to /mnt/data/disk8/snapraid.content...
Verifying...
Verified /var/snapraid.content in 0 seconds
Verified /mnt/data/disk1/snapraid.content in 0 seconds
Verified /mnt/data/disk2/snapraid.content in 0 seconds
Verified /mnt/data/disk3/snapraid.content in 0 seconds
Verified /mnt/data/disk4/snapraid.content in 0 seconds
Verified /mnt/data/disk5/snapraid.content in 0 seconds
Verified /mnt/data/disk6/snapraid.content in 0 seconds
Verified /mnt/data/disk7/snapraid.content in 0 seconds
Verified /mnt/data/disk8/snapraid.content in 0 seconds
[rocky@localhost ~]$ sudo /usr/local/bin/snapraid scrub
Self test...
Loading state from /var/snapraid.content...
Using 0 MiB of memory for the file-system.
Initializing...
Using 96 MiB of memory for 128 cached blocks.
Selecting...
Scrubbing...
Nothing to do