Operating System - HP-UX

Failsafe for root filesystem filling to 100%

 
Charles Slivkoff
Respected Contributor

Having had two recent incidents with the root fs filling to capacity, I spent some time yesterday investigating exactly why the login attempt fails.

Using tusc, I was able to isolate the problem to an inability to write to /tcb. All of our systems run in "trusted" mode, BTW.

It would be nice if HP could add a test for UID==0 and still allow root to log in even though the tcb entry could not be updated. Anyone from the labs or Expert Center listening?

The obvious workaround seemed so simple that I never expected it to work. By moving /tcb to its own lvol, I was able to log in!

I have not done a lot of testing and won't really have the time to do any more, so I'm offering this to the community for further comments.

Enjoy!

-charles
7 REPLIES
Pete Randall
Outstanding Contributor

Re: Failsafe for root filesystem filling to 100%

Charles,

I would have to say that you're asking for trouble. If this lvol is not mounted, like in single user mode, for example, you won't be able to log in.


Pete

Tim Nelson
Honored Contributor

Re: Failsafe for root filesystem filling to 100%

I would fix the problem by configuring the system properly instead of creating a huge complicated workaround.

Do not allow applications to write into the root filesystem. Any applications that do: remove them, change them, or fix them.

It should not be that hard.
/
/etc
/dev
/sbin

are the only directories in the root filesystem.



Charles Slivkoff
Respected Contributor

Re: Failsafe for root filesystem filling to 100%

Pete,

There's no login process involved with single-user, so I'm not sure how this would be an issue. I did test this, BTW, and it worked.


Tim,

I agree completely, but the cases that I've seen recently were both due to user error. Even veteran admins make mistakes.

Tim Nelson
Honored Contributor

Re: Failsafe for root filesystem filling to 100%

Point taken.

One possible failsafe (this is stretching the idea factory):

Always leave your remote or hard console logged in to a Unix session. When disconnected via LAN, it will be protected by your GSP (MP) login. Your hard console should be protected by physical security?



Sandy Chen
Honored Contributor

Re: Failsafe for root filesystem filling to 100%

Hi Charles,

You are doing a risky thing here. You never know when you will need to go to single-user mode. It worked, but I would never recommend it. I wouldn't do it myself. But that's just me :)

Regards,
Sandy
I never think of the future. It comes soon enough.
A. Clay Stephenson
Acclaimed Contributor

Re: Failsafe for root filesystem filling to 100%

Consider what is going to happen when you reboot this box. The /tcb filesystem will not be mounted initially, so user verifications are going to fail. The /tcb directory is normally not all that large anyway. The real solution is to not fill up /. Once a system has been in use for a few hours, the size of / should be all but constant; you need to find why / is filling up rather than throwing a Band-Aid at the problem.
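
A minimal sketch of one way to look for what is filling /, assuming Python is available on the box (nothing in this thread says it is). The function name `largest_files` and its parameters are illustrative only. It walks one filesystem without descending into other mounted filesystems and returns the largest regular files it finds:

```python
import heapq
import os
import stat


def _same_device(path, dev):
    """True if `path` lives on the device numbered `dev`."""
    try:
        return os.lstat(path).st_dev == dev
    except OSError:
        return False  # vanished or unreadable: do not descend into it


def largest_files(root="/", count=10):
    """Return up to `count` (size, path) pairs for the largest regular
    files under `root`, staying on the filesystem `root` lives on."""
    root_dev = os.stat(root).st_dev
    sizes = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune directories on other devices (mount points such as /var).
        dirnames[:] = [
            d for d in dirnames
            if _same_device(os.path.join(dirpath, d), root_dev)
        ]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file disappeared while we were walking
            if stat.S_ISREG(st.st_mode):
                sizes.append((st.st_size, path))
    return heapq.nlargest(count, sizes)
```

Calling `largest_files("/", 20)` as root and comparing the result between two runs is one way to spot a file that keeps growing.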
If it ain't broke, I can fix that.
Charles Slivkoff
Respected Contributor

Re: Failsafe for root filesystem filling to 100%

I mounted the tcb lvol on top of the existing /tcb, so the original structure is still there underneath if it is needed in single-user mode. It would also be trivial to keep a daily backup of /tcb in case it is needed in single-user mode.
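
A rough sketch of what such a daily backup could look like, assuming Python is available. The function name `backup_tree` and the destination directory `/var/adm/tcb_backups` are hypothetical; the only requirement is that the destination is not on the root filesystem. It would be run once a day from cron:

```python
import os
import tarfile
import time


def backup_tree(source="/tcb", dest_dir="/var/adm/tcb_backups"):
    """Write a dated, gzip-compressed tar archive of `source` into
    `dest_dir` and return the archive path.

    `dest_dir` is a hypothetical location; pick one off the root fs.
    """
    os.makedirs(dest_dir, exist_ok=True)
    stamp = time.strftime("%Y%m%d")
    archive = os.path.join(dest_dir, "tcb-%s.tar.gz" % stamp)
    # Store paths relative to the directory name (e.g. "tcb/...").
    top = os.path.basename(os.path.normpath(source))
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(source, arcname=top)
    return archive
```

A second run on the same day overwrites that day's archive, so at most one copy per day is kept.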

The failing write() to /tcb is what prevents the login when / is full. Why can't HP simply code a fix for this, at least to allow root to still log in? (Believe me, I worked in the Response Center for 10 years, so I can come up with a number of excuses.)
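
To make the request concrete, here is a sketch of the logic I have in mind. This is not HP's login code: `login_allowed` and `update_record` are made-up names, and I am assuming the failed write reports ENOSPC, which is the usual errno for a full filesystem.

```python
import errno


def login_allowed(uid, update_record):
    """Decide whether a login may proceed.

    `update_record` is a hypothetical callable standing in for the
    write that updates the user's tcb entry; it raises OSError when
    the write fails.
    """
    try:
        update_record()
    except OSError as exc:
        # Excuse the failure only when the filesystem is full AND the
        # user is root (UID 0); every other failure still blocks login.
        return exc.errno == errno.ENOSPC and uid == 0
    return True
```

Ordinary users are still refused, so the exception only opens the door for the one account that can clean up the full filesystem.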

Having to force a reset to recover from a fat-fingered command seems excessive when the system and its applications are otherwise operating fine.