Tuesday afternoon $manglement let me know that one of our sites had a borked server. I assisted the onsite tech with troubleshooting.
Verdict: server is dead.
Said server was shipped to us, and it arrived late on Wednesday afternoon.
I came in all bright-eyed and bushy-tailed very early and started to assess the server.
Dead, yup. (HP ML350 G6)
Plonked out the redundant power supplies, and inserted the spare redundant power supplies.
First oops: the spare PSUs do not fit, they’re for another server.
So… it went a bit hither and thither, then I remembered we have another HP server in our server room. Removed one of its redundant power supplies, but it is longer than the original. So no go there.
Noticed my appy also has an HP server he had been playing around with (a leftover from one site after a server upgrade), which is still working and, incidentally, has the exact same hardware configuration as the dead one.
Transferred the dead server’s PSUs over to appy’s server, and everything worked. So the PSUs were fine. It was then that I declared the dead server’s redundant power supply controller board to be dead. And that was the problem.
Transferred the entire RAID setup on a 1:1 basis from the dead server to appy’s server, switched it on, and success. Fired up, and everything worked from the get-go. (Apparently HP’s RAID stores the setup on the hard drives instead of on the RAID controller.)
So, I’m super happy and chuffed. No data was lost, no time is needed to set everything up again, and we can just ship it back to site.
Now if our resident electronics guru can figure out what the issue is with the redundant power supply controller… I have requested a quote anyway to see if it is affordable, but being HP it’ll cost you an arm, leg, kidney and left nut.