cmresmond/cmcheckdisk problem

 
Virgil Chereches_2
Frequent Advisor

Hi,
we have the following setup: MC/ServiceGuard for Linux version A.11.14.02 on two Red Hat 7.3 nodes running kernel 2.4.22 with devfs.
The cmcheckdisk program, which cmresmond spawns every x seconds, aborts with a core dump.
When we run cmcheckdisk on its own, the environment seems to affect it: sometimes unsetting a few environment variables makes the crash go away. The strange thing is that the names of the variables don't matter, only the number of characters.

We believe the cause is that the device names in /proc/partitions are long (e.g. 34 characters) and that some internal buffer may be overflowing.

Has anyone seen this behaviour?
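
On the observation that only the number of characters in the environment matters: on Linux the environment strings are copied to the top of the process's initial stack, so changing their total size shifts where later stack frames land. That would be consistent with a stack buffer overrun that sometimes hits something critical and sometimes does not. The helper below is only a sketch for measuring that size; env_bytes is a hypothetical name and is not part of ServiceGuard.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: total bytes occupied by the environment strings,
 * counting the terminating NUL of each one. */
size_t env_bytes(char *const *envp)
{
    size_t total = 0;

    assert(envp != NULL);
    for (; *envp != NULL; ++envp)
        total += strlen(*envp) + 1;   /* string plus its NUL byte */
    return total;
}
```

Calling env_bytes(environ) from a small test program (after declaring extern char **environ;) and printing it next to the address of a local variable, before and after unsetting variables, would show whether the crashes correlate with a particular stack offset.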

Huc_1
Honored Contributor

Re: cmresmond/cmcheckdisk problem

As no one seems to be answering this one, I will have a go!

The way I find out more about this kind of problem is perhaps a little twisted, but here goes!

I would try invoking the faulty command under strace, for example:

#strace -o test.dat df

This records all the system calls made by the df command and writes the result to test.dat. You can then read and analyse test.dat to see whether you can get more information out of it.

I also sometimes use the lsof command to dig up more information and continue the search for a solution...

In your case, you could try this before and after changing the variables and see where that leads...

Just hoping this will trigger some progress and get you on your way.

J-P
Smile I will feel the difference
Virgil Chereches_2
Frequent Advisor

Re: cmresmond/cmcheckdisk problem

Hello,

We already did that; strace and lsof are indeed the first debugging tools to reach for. The problem is that it crashes somewhere in userspace, and gdb shows (on the core) that it segfaulted somewhere in glibc's vprintf, which led me to the idea that it is somehow related to code such as (for example):
char devname[16];
sprintf(devname, "/dev/%s", whatever);
In our case, device names are longer than in the non-devfs case, which possibly triggers this. Before we switched to devfs, it was fine.
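
For comparison, a bounded version of the pattern suspected above could look like the sketch below. It is not taken from cmcheckdisk, whose source we have not seen; build_dev_path is a hypothetical helper, and the long name used in the usage note is only an illustration of a devfs-style entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes "/dev/<name>" into dst.
 * Returns 0 on success, -1 if the result does not fit in cap bytes. */
int build_dev_path(char *dst, size_t cap, const char *name)
{
    int n;

    assert(dst != NULL && name != NULL && cap > 0);
    n = snprintf(dst, cap, "/dev/%s", name);   /* never writes past cap */
    if (n < 0 || (size_t)n >= cap) {
        dst[0] = '\0';                         /* do not leave a truncated path */
        return -1;
    }
    return 0;
}
```

With a 16-byte buffer, a short name such as "sda1" fits, while a 34-character name such as "scsi/host0/bus0/target0/lun0/part1" is rejected instead of overrunning the buffer the way the sprintf call would.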
Huc_1
Honored Contributor

Re: cmresmond/cmcheckdisk problem

I suppose you have already done this, but did you try running devfsd with a higher trace level?

devfsd -t

I have had a look at man devfsd, and it also talks about variables in the bash environment.

Just trying to help

J-P
Smile I will feel the difference