I am a big fan of the Synology DS1511+, and I was settling in nicely for a well-deserved rest on Xmas eve when my inbox literally woke me up with an email..
The mail came from my Synology NAS (this was possible as I had set up email notifications on the DS1511+) and it stated nicely..
“Internal Disk 1 on Diskstation has crashed, please replace it”
And signed off sweetly as “Sincerely, Synology Diskstation”.
I ran to my NAS and indeed it was beeping furiously and the lights on the front of the NAS were flashing. This was the first time in my life I had to deal with a hard disk crash on a NAS. I feared the worst (and, truth be told, I was not even 1% confident in my own ability to handle this situation), and I really had no idea what to do. It is one thing to read about (and be sold on) the features of a product (in this case, the RAID features of a NAS), and quite another to face them in action when you are not so technically capable :)
Using my web browser, I found that I could still log into the NAS. Going to the Synology DSM storage manager, I first switched off the beeping sound and then saw this:
It did not look too good. I went into the individual hard disk level and was not sure what to do. I decided I would do a SMART test. I was thinking that I could then show the result to Seagate and get a new hard disk back under warranty !
Surprisingly, when I ran the SMART test, the “crashed” hard disk showed up as working fine. What I mean is that the SMART test showed NORMAL. There was, however, a message stating “System Partition Failed”.
Going back to the storage manager, I decided to try to repair the volume (that’s what they call the “collection” of the 5 hard disks creating a single data source in the NAS). This was despite the error message, and despite the logical thing being to replace the hard disk (I am a stubborn guy and besides that, I did not have a spare empty 2TB disk sitting around. They were all being used for backups of the contents of this very NAS).
Despite all this, it happily went ahead with the repair. I decided I would try to sleep.
8 hours later on Xmas day, after the repair had completed, the storage manager still showed a failure and I had in fact even received more failure emails from the NAS. It looked exactly the same as the initial error state when I first saw it on Xmas eve (perhaps the Synology had even attempted its own auto repair before that ?).
The thing was that even though there was this error, I could still access the NAS and I could still copy data out of the NAS (I did not dare to copy INTO the NAS, fearing that I might disturb it.. such is my ignorance..). But I was quite happy that, as far as I could tell, no data was lost (despite the one hard disk “crash” or “error”). I did manage to update my backups of the NAS (not that it was really necessary as I had just done my weekly backup of the NAS 2-3 days before).
Anyhow, I decided being stubborn was useless and it was time to replace the hard disk. I shut down the NAS and took out the hard disk. To my shock, the inside of the NAS was so dirty !! There were so many dirt lumps (you know.. lots of dirt lumped together in a ball-like shape.. looking very consumable like a rice ball :p) everywhere inside the NAS. There were lots of dirt balls sticking to the NAS disk latches and also to the sides of the hard disks. So I took out all 5 hard disks (placing them carefully in the SAME order they were in the NAS. You really do not want to mix them up..) and then cleaned the NAS and the hard disks. And the fans inside the NAS etc. All in all, not a very pretty sight on Xmas day.
As I unscrewed the “crashed” hard disk, a thought came to my mind. Perhaps the hard disk had not really crashed per se (after all, the SMART test showed me it was okay, right? I dun know). So perhaps I just needed to “clean up” the hard disk literally. I cleaned it up physically (ha) and then brought it to my PC and deleted all the partitions on that hard disk using Windows, hence cleaning it up logically (ha ha). It became a fresh new hard disk without any formatting.
I then put the hard disk back into the NAS (together with the rest, all cleaned up and with all screws tightened again) and launched the storage manager. This time there was no red color indicator (HA) but just this message:
Feeling slightly better, I then ran the REPAIR task again. 5 mins later, I received an email from the NAS:
Feeling much much better, I looked at the progress and indeed it looked sweeter:
8 hours later, it looked much much sweeter and better and happier and nicer and…
Finally about 10-11 hours later, everything was back to normal ! :) :)
While the repair was taking place, I was able to look through the support forums on the Synology web site and saw a lot of great help, including the option, if necessary, of actually contacting the Synology Support team to help rebuild the partition remotely. But what struck me most was this:
It looks like there may be a loose connection to disk 2. I recommend reseating this disk and making sure that it is well secured to the disk tray that you are using the long screws that secure the disk trays to the metal chassis. Crash doesn’t mean that there was an issue with the disk, it means that it was ejected and ‘crashed’ out of the array. There was a communication issue for longer than allowed so the system was forced to eject the disk to continue operations.
I thought it sounded like my case, and perhaps a loose connection was what happened here too. I will never know anyway.
What I learnt from this incident:
(1) The Synology NAS RAID 5 did work. No data was lost despite the one hard disk that “Crashed”. Ha. I know this is silly but try to understand it from a non-technical user's point of view. Many highly skilled IT people have probably seen this happen at work and know that this is how it goes. You have probably dealt with even worse situations. But home users like me have never seen it before, so it is nice to see it working first-hand.
(2) Backups. Backups. Backups. I had been backing up the data on the NAS regularly, as I was taught that a NAS is not a backup. That gave me much more assurance throughout this whole incident. At worst, if my attempts failed, I could either contact Synology or re-do the whole NAS. Painful, but at least I knew the hard work of years of collecting the prawns (oops.. I really meant the data) in the NAS was still there in the backups.
(3) Clean your NAS regularly please. Yes.. Clean your NAS physically once in a while. As it is usually sitting in a corner (in my case at the lowest level of the rack), it can really gather lots of dirt down there…. Once in a while, it is good to clean it up: remove the dirt (*YUCKS*), clean the fans (which provide ventilation and cooling to the hard disks) and then tighten the screws on the hard disk cradles. Just don’t mix up the order of the hard disks. Put them back in the exact order.
(4) A ‘crash’ is not always a crash :) :)
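For the curious, here is a small Python sketch of the idea behind lesson (1): how RAID 5 can lose one disk and still give back all the data. It is only a toy illustration of XOR parity, not how Synology DSM actually implements it. The block contents, the function names and the 5 x 2TB layout are my own assumptions for the example, and a real RAID 5 rotates the parity across all the disks instead of keeping it on one.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def raid5_usable_tb(disks, size_tb):
    """Rough usable capacity: one disk's worth of space goes to parity.

    Ignores file system and system partition overhead.
    """
    return (disks - 1) * size_tb


# One toy "stripe": four data blocks plus one parity block, as on 5 disks.
# (Hypothetical data; real RAID 5 also rotates which disk holds parity.)
data_blocks = [b"prawn001", b"prawn002", b"prawn003", b"prawn004"]
parity = xor_blocks(data_blocks)
stripe = data_blocks + [parity]

# Pretend "disk 1" (index 0) has crashed and its block is gone.
crashed = 0
survivors = [block for i, block in enumerate(stripe) if i != crashed]

# XOR-ing every surviving block (data + parity) rebuilds the missing one.
rebuilt = xor_blocks(survivors)

print(rebuilt == data_blocks[crashed])   # True
print(raid5_usable_tb(5, 2))             # 8
```

The repair that took my NAS 10-11 hours is, very roughly, this same rebuild done for every stripe on the disks. It also shows why RAID 5 survives only ONE failed disk at a time: with two blocks missing from a stripe, the XOR no longer has enough information to recover either of them. Hence lesson (2).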