An RPI4 based file server

Not intending to hijack this thread, but I decided it would not hurt to post my rsync Bash script for people to see and use as they see fit:

Bash:
#!/bin/bash
RunFile=/var/run/gadmin-rsync-Main.run  # Run file to prevent multiple instances of the script running
LogFile=/var/log/gadmin-rsync/gadmin-rsync-Main.log  # Log file to keep track of backup sessions and / or failures

grep -iq true /etc/gadmin-rsync/inhibit && exit 0  # Exit without doing anything if the inhibit file contains "true"
CRC=""
grep -iq crc /etc/gadmin-rsync/inhibit && CRC=--checksum  # Check to see if checksum is required (takes much, much longer)
thisPID=$$

START_TIME=$(date +%Y-%m-%d_%H:%M:%S)

noMount=1
Running=1

mount | grep -q /Backup && noMount=0  # Check to see the RAID array is mounted

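if [[ -s "$RunFile" ]]; then  # Sanity check: a non-empty run file already exists
    if ps -p "$(cat "$RunFile")" > /dev/null 2>&1; then
        Running=0  # The recorded PID is still alive, so another instance is running
    else
        echo "$thisPID" > "$RunFile"  # Stale run file: claim it for this instance
    fi
else
    echo "$thisPID" > "$RunFile"
fi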

# Log any errors; if no errors, then run the rsync process
RsyncStatus=1  # Stays non-zero unless rsync runs and succeeds
if [ $noMount -eq 1 ]; then
    echo -n File_system_not_mounted:___ >> "$LogFile"
    MISSING_PATH=1
elif [ ! -e '/Backup/Recordings' ]; then
    MISSING_PATH=1
    echo -n Missing_destination_path:__ >> "$LogFile"
elif [ $Running -eq 0 ]; then
    MISSING_PATH=1
    echo -n Instance_of_rsync_running:_ >> "$LogFile"
else
    MISSING_PATH=0
    # $CRC is left unquoted on purpose so that an empty value adds no argument
    rsync --archive --progress --human-readable --verbose --stats --exclude Firefox/ --skip-compress='*' $CRC -e "ssh -l root -p 22" root@RAID-Server:'/RAID/' '/Backup'
    RsyncStatus=$?  # Save rsync's exit status before another command overwrites $?
fi

# Post run time or failures to log after rsync runs.
STOP_TIME=$(date +%Y-%m-%d_%H:%M:%S)
if [ $RsyncStatus -eq 0 ] && [ $MISSING_PATH -eq 0 ]; then
   echo "$START_TIME $STOP_TIME Backup successful: Source: [RAID-Server/RAID/] Destination: [/Backup]" >> "$LogFile"
else
   echo "$START_TIME $STOP_TIME Backup failure:    Source: [RAID-Server/RAID/] Destination: [/Backup]" >> "$LogFile"
fi
[[ $Running -eq 1 ]] && rm $RunFile
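
The run-file check in the script reads the old PID first and writes its own afterwards, so two copies started at almost the same moment could both get past it. One alternative, shown here only as a sketch and not what the script above does, is to let `flock` from util-linux take the lock in a single step. The function name `acquire_lock`, the lock file path in the example, and the use of file descriptor 9 are all arbitrary choices for illustration.

```shell
# Hypothetical helper: take an exclusive, non-blocking lock on the given file.
# Returns 0 if this process now holds the lock, non-zero if another one has it.
acquire_lock() {
    exec 9> "$1" || return 1   # Open (or create) the lock file on descriptor 9
    flock -n 9                 # -n: fail at once instead of waiting for the lock
}

# Example use (the path is an assumption, not taken from the script above):
# acquire_lock /var/run/gadmin-rsync-Main.lock || exit 0
# ... run rsync here ...
```

The lock is released when the script exits and descriptor 9 is closed, so there is no stale file to clean up after a crash.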

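The /etc/gadmin-rsync/inhibit file needs to contain some text. If it contains the text "true" (upper or lower case), the script exits without running a backup. If it contains the text "CRC", then the rsync process will do a full checksum on all the files, which takes much longer. Any other text in the file will prevent the checksum.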
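
The two `grep` tests at the top of the script decide what the inhibit file means. As a rough sketch, the same logic can be written as one function to make the three outcomes easier to see. The name `inhibit_mode` and the words it prints are made up for this example; only the `grep -iq` tests come from the script.

```shell
# Hypothetical helper: report what the inhibit file asks for.
# Prints "skip", "checksum", or "normal", following the two grep tests in the script.
inhibit_mode() {
    if grep -iq true "$1" 2>/dev/null; then
        echo skip        # "true" anywhere in the file: do not run the backup
    elif grep -iq crc "$1" 2>/dev/null; then
        echo checksum    # "crc" anywhere in the file: add --checksum to rsync
    else
        echo normal      # anything else, or an unreadable file: plain run
    fi
}

# Example: ask for a checksum pass on the next run (needs root on a real system)
# echo CRC > /etc/gadmin-rsync/inhibit
```

Both tests match substrings and ignore case, so a file containing a word such as "construed" would also count as "true".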
 
How is the project running at this point @WobblyHand?
The file server itself is a flop. I didn't want to buy any more WD NVME SSD's after the NVME SSD failed under RAID initialization. I got a replacement unit and it failed as well. It would seem that there are issues with the active hub and the USB NVME housings. Using NVME was not a good idea, compact, yes, but not all of the controller chips were compatible with the RPI.

But, I use the assembly as a portable little RPI, booting from normal SSD, with a 1TB NVME for storage. I use it in the shop to do Arduino work and programming my Teensy based ELS. It's also ok, to check up on HM while in the shop.
 
ZFS for the win. So much nicer to work with than RAID.
If by "RAID", you mean mdadm, then I disagree. Although fairly well supported, ZFS does not have the nearly universal support mdadm does. On Debian, for example, last I checked it is only available via backport. It also has some issues on a Raspberry Pi unless it supports SATA directly, rather than via USB link.

That said, I am not knocking ZFS. It is a full featured integrated drive system including multiple utilities. The problem is it is a full featured integrated drive system including multiple utilities. Using ZFS, one is more or less stuck with ZFS for the file system, RAID support, drive management, etc.

I use mdadm for RAID management, XFS for the file system on my large servers, ext3 and ext4 on my boot drives (which are also RAID), FAT and ext4 on my Raspberry Pi systems, and so forth.
 
The file server itself is a flop. I didn't want to buy any more WD NVME SSD's after the NVME SSD failed under RAID initialization. I got a replacement unit and it failed as well. It would seem that there are issues with the active hub and the USB NVME housings. Using NVME was not a good idea, compact, yes, but not all of the controller chips were compatible with the RPI.

But, I use the assembly as a portable little RPI, booting from normal SSD, with a 1TB NVME for storage. I use it in the shop to do Arduino work and programming my Teensy based ELS. It's also ok, to check up on HM while in the shop.
That is a shame. If you decide to revisit in some form, drop me a line. I would be most happy to help any way I can.
 
That is a shame. If you decide to revisit in some form, drop me a line. I would be most happy to help any way I can.
Thanks for the offer. I want to get my ELS project finished first, but I would like to get back to this. I will contact you.

First thing I need to do is to source some disks that are of a useful capacity and can be powered by an active USB hub. Then I need to revisit mdadm and make sure that I don't screw up yet another drive. I didn't think that adding a second disk to the array would cause the second disk to fail. Probably overheated the part to failure, as it was syncing. Darn crappy WD NVME SSD's.
 
I'm not a LINUX head, so forgive the simplicity here. ZFS seems to use an upgraded RAID5 to close the write hole problem on power loss, and mdadm seems to be yet another software RAID manager. (does it even have journaling? I couldn't find that in the docs)

I abandoned hardware RAID when I discovered that if you have a fault in your Adaptec RAID controller you lose ALL data. No going back, as putting a new controller in your system makes the drives unreadable (I lost 3 terabytes over that one).

Where I have come to is that for my simple network with about 32 terabytes of storage (2 banks of 16 terabytes), that a good mirroring backup system is far superior in terms of stability, reliability and availability. I rsync every 24 hours as very little changes each day, perhaps 5-10 gigabytes. Every few months I make a manual copy of a server for offsite backup. Sure it costs more, but is simple and effective.
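
A nightly mirror like the one described could be set up roughly as follows. This is only a sketch: the function name `mirror_once`, the paths, and the 02:30 schedule are invented for the example, and `--delete` is an assumption about what "mirroring" should mean here.

```shell
# Hypothetical nightly mirror: copy the contents of SRC ($1) into DEST ($2),
# preserving permissions, times, and ownership where possible.
# --delete removes files from DEST that no longer exist in SRC.
mirror_once() {
    rsync --archive --delete "$1"/ "$2"/
}

# Example crontab line (script name and log path are placeholders):
# 30 2 * * * /usr/local/bin/nightly-mirror.sh >> /var/log/nightly-mirror.log 2>&1
```

With `--delete`, a file removed by mistake on the source disappears from the mirror on the next run, so the occasional offsite copy still matters. Leaving the flag off keeps deleted files on the mirror at the cost of extra space.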
 
I'm not a LINUX head, so forgive the simplicity here. ZFS seems to use an upgraded RAID5 to close the write hole problem on power loss, and mdadm seems to be yet another software RAID manager. (does it even have journaling? I couldn't find that in the docs)
RAID is not a file system. Journaling is a feature of a file system. There are battery backed drive controllers which allow the writes to be completed after the OS crashes or power is lost.
I abandoned hardware RAID when I discovered that if you have a fault in your Adaptec RAID controller you lose ALL data. No going back, as putting a new controller in your system makes the drives unreadable (I lost 3 terabytes over that one).
No controller should do that, but even so, hardware RAID is a bad idea.

Where I have come to is that for my simple network with about 32 terabytes of storage (2 banks of 16 terabytes), that a good mirroring backup system is far superior in terms of stability, reliability and availability. I rsync every 24 hours as very little changes each day, perhaps 5-10 gigabytes. Every few months I make a manual copy of a server for offsite backup. Sure it costs more, but is simple and effective.
Once again, RAID is not a backup. RAID allows a system to remain up and running in the face of one or more failed drives. It prevents data loss due to a drive failure, but drive failures are not the principal cause of data loss. Backups can (hopefully) prevent or at least minimize data loss, but recovering an entire system can take days or even weeks. With a good RAID strategy and effective backups, recovering data can take mere minutes.
 
I understand that hardware raid gets a bad rap from independent raid controller boards, that require weird drivers and that controller becomes a single point of failure. Been there done that.

Software RAID is often firmware based, meaning some of the underlying parsing is done in the chipset on the motherboard. Not the CPU, but the I/O chipset, sometimes called the northbridge or various other names. This type of RAID somewhat reduces the load on the main CPU without the dependence on an add-on RAID controller; basically, some features of the RAID controller are built into the native SATA interfaces. It needs to be enabled in the BIOS on PCs. Obviously this only applies to computers with that type of support (primarily PCs), not the Raspberry Pi. Just a technical FYI for those interested in the depths of computers.

I tend to prefer raid mirroring. The calculations for raid 5/6 can be pretty CPU intensive and slow down disk access.
 
RAID is not a file system. Journaling is a feature of a file system.
But ZFS is. No journaling that I can find. But a very clever workaround that is *almost* like journaling to get around the write hole on power loss. And no need for fsck if you do get a power outage. With 16TB, that would take a *long* time.
hardware RAID is a bad idea
Agreed.
Once again, RAID is not a backup.
A surprisingly large number of people use it that way. Even businesses have tried to reduce the 'cost of backups' by 'using better raid' (these things have been said to me in multiple meetings).
I tend to prefer raid mirroring
Since I can easily tolerate loss of new data (that is data created in the last 24 hours), RAID has never had high value for me. Using rsync on my QNAP servers, I get sufficient prevention of data loss from a failed drive, and 100% available data in the event of a failure, without rebuilding drives. Over the years I have lost 7 critical drives. Only one time I was in a position that it was necessary to have it recovered by a data recovery firm.
 