Oh boy, Linux server went bang
jibbajabba
Member Posts: 4,317 ■■■■■■■■□□
I run a daily Acronis task on a Linux server, but it complained that /usr is not readable / cannot read from source etc.
I ran fsck and it asked me to reboot .. now this:
I know basic stuff about Linux, but it stops right there.
Can someone give me a hint as to what the next step would be (if there is one) to hopefully get this one up again?
My own knowledge base made public: http://open902.com
Comments
-
rossonieri#1 Member Posts: 799 ■■■□□□□□□□
Hi,
I think you have a corrupted system over there.
Perhaps that missing libidl.so was deleted by unlinked inode 2523...
Try to repair the system using an emergency boot disk.
What distro, btw?
the More I know, that is more and More I dont know.
-
jibbajabba Member Posts: 4,317 ■■■■■■■■□□
RHEL 5.2
I made the stupid mistake of running fsck while the system was running. I didn't catch the "do not run when filesystems are mounted" warning, but it's a learning curve, so it's all good. It's not a crucial server and I have a backup of the data anyway. I still want to fix it as a learning exercise. I'm going to try the emergency boot disk later on and "call back" if I get stuck.
-
UnixGuy Mod Posts: 4,570 Mod
I don't know how this works, but if it were Solaris, you would have to boot from CD-ROM (emergency disk?) and fsck your file system.
-
UnixGuy Mod Posts: 4,570 Mod
Or you can run fsck in single-user mode... Because you can't unmount /usr, /var or /root on a running system, you have to fsck the raw disk while the kernel is booted from CD-ROM (i.e. /root is unmounted).
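The point about only checking filesystems that are not mounted can be sketched as a small guard. This is a minimal sketch and not something taken from the thread: the helper names `is_mounted` and `safe_fsck_cmd` are made up for illustration, it assumes an ext2/ext3 filesystem (hence `e2fsck`), and it only prints the read-only check command instead of running it.

```shell
# is_mounted DEVICE [MOUNTS_FILE]
# Succeeds (exit 0) if DEVICE is listed as a mount source in MOUNTS_FILE.
# MOUNTS_FILE defaults to /proc/mounts; the parameter exists so the check
# can be tried against a copy of the file.
is_mounted() {
    dev=$1
    mounts=${2:-/proc/mounts}
    # The first field of each /proc/mounts line is the mounted device.
    awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$mounts"
}

# safe_fsck_cmd DEVICE [MOUNTS_FILE]
# Prints the check command for DEVICE if it is not mounted, otherwise
# refuses with a message on stderr. Nothing is executed here.
safe_fsck_cmd() {
    if is_mounted "$1" "${2:-/proc/mounts}"; then
        echo "refusing: $1 is mounted; use rescue media or single-user mode" >&2
        return 1
    fi
    # Assumption: ext2/ext3. e2fsck -n opens the filesystem read-only and
    # answers "no" to every repair question.
    echo "e2fsck -n $1"
}
```

For example, `safe_fsck_cmd /dev/sda2` (device name hypothetical) prints `e2fsck -n /dev/sda2` only when /dev/sda2 does not appear in /proc/mounts. The comparison is by exact name, so a device mounted under another name (a label, an LVM path) would not be caught by this sketch.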
-
jibbajabba Member Posts: 4,317 ■■■■■■■■□□
Oops, something you don't want to see when you start a server with the rescue disk.
-
tiersten Member Posts: 4,505
You're not supposed to run fsck on a mounted RW FS, but that isn't what caused your problems. You already had issues before doing that, and they were major ones.
With the limited information, and assuming it hadn't crashed before and nobody has done anything to it, I'd hazard a guess that your drive or controller is failing.
-
rossonieri#1 Member Posts: 799 ■■■□□□□□□□
Hi gomjaba,
Relax, stay cool. That screen was only a warning, not a big deal.
Just hit ENTER and see whether it actually gave you the correct information.
Enter the shell and try fdisk /dev/<whatever_disk_you_have_there>
You do know how to use fdisk, right?
Just print the partition information of the corrupted disk. If the warning is correct and you no longer have any partitions, I hate to say that you lost them: AFAIK there is no way to recover a lost partition, even though there are reports that some third-party tools are able to do that.
But if you can still get the partitions printed out, then there is a chance to copy libdl.so from another machine and try to fix it using fsck.
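Printing the partition information can also be cross-checked without fdisk by reading the kernel's own partition list. A rough sketch, assuming the usual four-column /proc/partitions layout (major, minor, #blocks, name); the helper name `list_partitions` and the disk name `sda` are hypothetical and not from the thread.

```shell
# list_partitions DISK [PARTITIONS_FILE]
# Prints the names of the partitions of DISK (e.g. sda -> sda1, sda2) that
# are listed in PARTITIONS_FILE, which defaults to /proc/partitions.
list_partitions() {
    disk=$1
    table=${2:-/proc/partitions}
    # Column 4 is the device name. A partition is the disk name followed by
    # an optional "p" and a number; the header, the blank line and the
    # whole-disk entry itself do not match this pattern.
    awk -v d="$disk" '$4 ~ ("^" d "p?[0-9]+$") { print $4 }' "$table"
}
```

Running `list_partitions sda` from the rescue shell prints nothing when the kernel sees no partitions on that disk. `fdisk -l /dev/sda` prints the on-disk partition table without modifying it.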
@ unixguy: "sorry man, can't help in Linux"
No offense, but come on, you can do better than that.
Even in Solaris you'd still have to use fdisk, right?
Cheers!!!
-
darkerosxx Banned Posts: 1,343
You noticed it said it couldn't find your fstab file, right? I would say boot into rescue mode and recreate your fstab file using your backup or with what you know it should be. Without that, you won't have any mount points, so you won't have any partitions.
-
jibbajabba Member Posts: 4,317 ■■■■■■■■□□
Well, thanks guys, but even fdisk shows one big empty disk.
Guess it is reinstalling time ...
-
UnixGuy Mod Posts: 4,570 Mod
rossonieri#1 wrote: » @ unixguy no offense, but come on - you can do better than that, even in solaris - you'd still have to use fdisk right?
lol.. I look miserable though, calling myself UNIX guy and being unable to help.. I should change my nickname.
No, you don't use fdisk in Solaris when it's on SPARC (which is like 90% of the time anyway); you use fdisk in the unlikely situation of running Solaris on an x86 architecture.
I didn't want to suggest a solution because I really don't know how this "rescue" disk works, so I don't know why this message appeared. His production machine is not for R&D, I guess, and I don't know how the initial problem happened.
Yes, I don't have Linux experience yet.
-
jibbajabba Member Posts: 4,317 ■■■■■■■■□□
Well ... 4am here and server rebuilt lol ...
-
jibbajabba Member Posts: 4,317 ■■■■■■■■□□
"oh God, good job! how did this problem happen?"
Pure stupidity. We have a few servers (Linux) running Acronis, and recently they all stopped working with a read error. So we THOUGHT that the FS was bust for some reason (it started off with just one server).
So I thought: fsck, that'll do. However, I am a n00b when it comes to stuff like that. BUT before I tried that on that particular live system from a customer, I tried my own server, which hosts a few forums. I knew that if it did go bang, there would be no "harm" apart from whinging members, and nobody pays for that server anyway (apart from my boss lol).
So I ran fsck while the system was running, ignoring all the warnings that you shouldn't do that on mounted filesystems, and got slapped with a stick.
Yes, it was stupid, but the good thing is: I will probably NEVER EVER do that again on a live system, that is for sure.
-
UnixGuy Mod Posts: 4,570 Mod
lool so you had a nice time last night.
But do you know what caused the read error?
-
jibbajabba Member Posts: 4,317 ■■■■■■■■□□
UnixGuy wrote: » lool so you had nice time last night but did you know whats the cause of the Read error ?
Nope, we have a ticket open with Acronis, as it is clearly not the FS. We tested several other systems "the right" way and they all came back green. So the problem is definitely Acronis.
-
jibbajabba Member Posts: 4,317 ■■■■■■■■□□
jibbajabba wrote: » oops - something you don't want to see when you start a server with the rescue disk
LOL - I JUST realized that I was REALLY stupid ....
NO WONDER it didn't find an installation: at that point the server was still running CentOS and not RHEL, but I used the RHEL DVD. DOOOOH
-
tiersten Member Posts: 4,505
jibbajabba wrote: » LOL - I JUST realize that I was REALLY stupid .... NO WONDER it didn't find an installation - at that point the server was still running CentOS and not RHEL but I used the RHEL DVD DOOOOH
Any recent Linux distribution would have detected it as a valid partition. It might not have detected it as the same distribution, but it would know it is a Linux partition.
-
jibbajabba Member Posts: 4,317 ■■■■■■■■□□
CentOS = RHEL without support and from a third party.
tiersten wrote: » Any recent Linux distribution will have detected it as a valid partition. It might not have detected it as the same distribution but it will know it is a Linux partition.
Good point ...