senghor wrote: » Hi Gomjaba, how are you mounting the NFS share? With hard,intr, I hope. You can't kill the client until you kill -9 the processes that are using (or trying to use) the share. You can see those processes with: fuser /your/share. If it doesn't, there is a workaround that I use; let me know.
192.168.0.29:/shares/mike /backup nfs defaults 0 0
mount /backup
[root@mike-1 ~]# df -h
Filesystem                  Size  Used Avail Use% Mounted on
/dev/sda1                    65G  4.4G   57G   8% /
/dev/sdb1                   902G   39G  817G   5% /home
tmpfs                       2.0G     0  2.0G   0% /dev/shm
192.168.0.29:/shares/mike   197G   41G  157G  21% /backup
# device          mountpoint  fs-type  options       dump  fsckord
your.share:/home  /backup     nfs      rw,hard,intr  0     0
senghor wrote: » Gomjaba, I see that your fstab does not handle server or network failures, which is why your clients hang. I think you can handle it in two ways: soft (the magic recipe for corrupted data) or hard (the way to holiness). Try this:
# device          mountpoint  fs-type  options       dump  fsckord
your.share:/home  /backup     nfs      rw,hard,intr  0     0
You can't kill the process (kind of) if you don't specify intr. Try it on a test system and use iptables to simulate the NFS server being disconnected. One question, though: what happens when the share crashes? Meaning, why is the share down? Is it a network issue, or is the NFS server down or busy? Can you reach the NFS server from this client over the network (ICMP, SSH, ...)? I'm asking because there are ways to "trick" the client when there seems to be no connection to the NFS server.
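For anyone wanting to check their own fstab for entries like the one senghor describes, here is a minimal sketch. The helper name nfs_missing_opt is made up for illustration; it reads fstab-formatted text on stdin and prints the mount point of every NFS entry whose option list lacks the option you name. Note that, as far as I know, nfs(5) says intr is ignored on kernels after 2.6.25, so this mostly matters on older systems like the ones in this thread.

```shell
# Hypothetical helper: print mount points of NFS entries missing a given option.
# Usage: nfs_missing_opt intr < /etc/fstab
nfs_missing_opt() {
    awk -v want="$1" '
        /^[[:space:]]*#/ || NF < 4 { next }      # skip comments and incomplete lines
        $3 ~ /^nfs4?$/ {                         # only nfs / nfs4 entries
            n = split($4, opts, ",")             # options are comma-separated
            found = 0
            for (i = 1; i <= n; i++) if (opts[i] == want) found = 1
            if (!found) print $2                 # field 2 is the mount point
        }
    '
}
```

Running it against the two entries quoted in this thread would flag only the "defaults" one, since "rw,hard,intr" contains the option.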
Gomjaba wrote: » Heh, nice one senghor. Now added intr as well. df obviously still hangs, but it can be cancelled with CTRL-C and, even more importantly, the share can be unmounted without crashing the whole console / ssh session. Cheers.
Gomjaba wrote: » Nah, I just shut down the interface with the VLAN used solely for the backup share
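As an alternative to downing the interface, senghor's iptables idea could look roughly like the sketch below. The helper names (nfs_fw, nfs_block, nfs_unblock) are invented for this example, and it assumes the mount talks to the server over TCP port 2049; NFSv3 setups that use UDP or depend on portmapper/mountd traffic would need more rules. By default it only prints the command; set RUN=1 (as root, on a test box) to actually apply it.

```shell
# Hypothetical helpers: drop (or restore) outgoing NFS traffic to one server.
# Assumes NFS over TCP port 2049. Prints the command unless RUN=1 is set.
nfs_fw() {
    action=$1 server=$2      # action: -I inserts the rule, -D deletes it again
    set -- iptables "$action" OUTPUT -d "$server" -p tcp --dport 2049 -j DROP
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$@"; fi
}
nfs_block()   { nfs_fw -I "$1"; }
nfs_unblock() { nfs_fw -D "$1"; }
```

For example, nfs_block 192.168.0.29 followed by df -h should reproduce the hang, and nfs_unblock 192.168.0.29 removes the rule again.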
Gomjaba wrote: » That is what I do when all else fails. But the box in question also had a couple of DRBD drives worth a few TB, with an uptime of around 700 days. That was already impressive, but we knew it would force a filesystem check upon reboot, which would probably have taken weeks. As I also didn't have access to the Internet, I couldn't google how to disable the forced fsck lol - basically everything which could have gone wrong did
Gomjaba wrote: » Can this actually be turned off on the fly?
# device mount_point FS_type options dump_freq fsck_order
[root@test ~]# tune2fs -l /dev/mapper/test
<SNIP>
Filesystem created:       Wed Dec 2 16:23:05 2009
Last mount time:          Sat Apr 24 01:17:36 2010
Last write time:          Sat Apr 24 01:17:36 2010
Mount count:              3
Maximum mount count:      33
Last checked:             Wed Dec 2 16:23:05 2009
Check interval:           15552000 (6 months)
Next check after:         Mon May 31 17:23:05 2010
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
<SNIP>
[root@test ~]# tune2fs -i 0 /dev/mapper/test
tune2fs 1.39 (29-May-2006)
Setting interval between checks to 0 seconds
[root@test ~]# tune2fs -l /dev/mapper/test
<SNIP>
Filesystem created:       Wed Dec 2 16:23:05 2009
Last mount time:          Sat Apr 24 01:17:36 2010
Last write time:          Sun Apr 25 12:17:14 2010
Mount count:              3
Maximum mount count:      33
Last checked:             Wed Dec 2 16:23:05 2009
Check interval:           0 (<none>)
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
<SNIP>
Gomjaba wrote: » Ah. Guess that'll do ...
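One thing worth double-checking in that output: the check interval is now 0, but "Maximum mount count" is still 33, so a mount-count-based check can still kick in eventually. A rough sketch for summarising both settings from tune2fs -l output follows; fsck_schedule is a made-up helper name, and it only looks at the two fields shown in this thread.

```shell
# Hypothetical helper: summarise periodic-fsck settings from `tune2fs -l` output on stdin.
# Usage: tune2fs -l /dev/mapper/test | fsck_schedule
fsck_schedule() {
    awk -F': *' '
        /^Maximum mount count:/ { max = $2 }
        /^Check interval:/      { split($2, a, " "); interval = a[1] }   # keep only the seconds value
        END {
            print "mount-count check: " ((max + 0 > 0) ? "every " max " mounts" : "disabled")
            print "time-based check: "  ((interval + 0 > 0) ? "every " interval " seconds" : "disabled")
        }
    '
}
```

According to the tune2fs man page, -c 0 (or -1) makes the mount count be disregarded as well, so tune2fs -c 0 -i 0 on the device should switch off both triggers on the fly.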
Gomjaba wrote: » Lol Forsaken, reboot is a cool solution, forced file system check of 20TB data drives isn't lol.