Operating System - HP-UX

Who is cluttering /var/tmp with char dev files rdskAAA* ?

 
SOLVED
Dennis Handly
Acclaimed Contributor

Re: Who is cluttering /var/tmp with char dev files rdskAAA* ?

I have the same things, since 2002. But only 20 or so, about every month or so.

>Sameer: I am curious to know what if you move /var/tmp/sysstat_em.fmt

I don't have that.
Ralph Grothe
Honored Contributor

Re: Who is cluttering /var/tmp with char dev files rdskAAA* ?

Sameer,

I tentatively renamed sysstat_em.fmt
as you suggested.

# ps -fp $(fuser -u /var/tmp/sysstat_em.fmt 2>/dev/null)
UID PID PPID C STIME TTY TIME COMMAND
root 29200 1 0 09:04:18 ? 0:00 /usr/sbin/stm/uut/bin/tools/monitor/sysstat_em

# mv /var/tmp/sysstat_em.fmt /var/tmp/sysstat_em.fmt.moved

# ll /var/tmp/sysstat_em.fmt*
-rw-r--r-- 1 root root 0 Aug 9 09:25 /var/tmp/sysstat_em.fmt.moved


But this doesn't work as long as the sysstat_em agent is holding the file open:

# ps -fp $(fuser -u /var/tmp/sysstat_em.fmt.moved 2>/dev/null)
UID PID PPID C STIME TTY TIME COMMAND
root 29200 1 0 09:04:18 ? 0:00 /usr/sbin/stm/uut/bin/tools/monitor/sysstat_em

Madness, thy name is system administration
Steven E. Protter
Exalted Contributor

Re: Who is cluttering /var/tmp with char dev files rdskAAA* ?

Shalom Ralph,

I share your pain.

The bottom line here is the files are owned by oracle and they should be cleaned up by the oracle application.

I fought with Oracle to get them to do better testing on HP-UX after receiving an install script that didn't work on HP-UX and had clearly been tested only on Solaris.

I was partially successful, because I refused to accept delivery on products not tested on HP-UX. Still, they leave messes and don't like to clean those messes up.

It's frustrating. I suggest making your unhappiness clear to your Oracle Rep the next time they ask for more money.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Dennis Handly
Acclaimed Contributor

Re: Who is cluttering /var/tmp with char dev files rdskAAA* ?

>SEP: The bottom line here is the files are owned by oracle

Are you confusing this thread with "0 byte files created in /tmp"?
http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=1151735

I don't have Oracle.

Re: Who is cluttering /var/tmp with char dev files rdskAAA* ?

Ralph,

I think you're looking at the wrong culprit here - I don't think there's a problem with sysstat_em at all.

I tried a quick experiment on my 11.11 system here which is also patched up to June 07.

My system is trusted, so I was able to enable the auditing subsystem and tune it to just look for mknod system calls.

Here's what I found:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
070809 09:28:21 1632 S 14 1 0 0 0 0 0 ?????
[ Event=mknod; User=root; Real Grp=root; Eff.Grp=root; ]

RETURN_VALUE 1 = 0;
PARAM #1 (file path) = 0 (cnode);
0x40000008 (dev);
7 (inode);
(path) = /var/tmp/rdskUAAa01632
PARAM #2 (int) = 8576
PARAM #3 (int) = -888971264
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
070809 09:28:21 1632 S 14 1 0 0 0 0 0 ?????
[ Event=mknod; User=root; Real Grp=root; Eff.Grp=root; ]

RETURN_VALUE 1 = 0;
PARAM #1 (file path) = 0 (cnode);
0x40000008 (dev);
7 (inode);
(path) = /var/tmp/rdskVAAa01632
PARAM #2 (int) = 8576
PARAM #3 (int) = -888971264
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# ps -ef | grep 1632
root 1632 1 0 09:22:30 ? 0:00 /usr/sbin/stm/uut/bin/tools/monitor/disk_em


That looks pretty similar to what you are seeing. The difference is that on my system these files are being cleaned up straight away:

# ll /var/tmp/*dsk*
/var/tmp/*dsk* not found

So I would apply tusc to disk_em rather than sysstat_em.
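For anyone who wants to read the numeric mknod parameters in the audit records above, here is a minimal sketch that decodes them. It assumes a 32-bit dev_t with the major number in the top 8 bits and the minor number in the low 24 bits, and a shell whose arithmetic is wider than 32 bits; the function names decode_mode and decode_dev are made up for this example.

```shell
#!/bin/sh
# Sketch: decode PARAM #2 (mode) and PARAM #3 (dev) of the audited mknod calls.
# ASSUMPTION: dev_t is 32 bits, major = top 8 bits, minor = low 24 bits.
# ASSUMPTION: the shell evaluates $(( )) with more than 32 bits.

decode_mode() {
    # Print the mode in octal; 020000 is S_IFCHR, the rest are permission bits.
    printf '0%o\n' "$1"
}

decode_dev() {
    # The audit trail prints dev as a signed int, so fold it back to unsigned first.
    dev=$(( $1 & 0xFFFFFFFF ))
    printf 'major=%d minor=0x%06x\n' $(( dev >> 24 )) $(( dev & 0xFFFFFF ))
}

decode_mode 8576         # prints 020600, i.e. S_IFCHR|0600
decode_dev -888971264    # prints major=203 minor=0x036000
```

So each record is a character device created with mode 0600. Running lsdev should show which driver owns that major number on a given box.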

HTH

Duncan

I am an HPE Employee

Re: Who is cluttering /var/tmp with char dev files rdskAAA* ?

... and looking at the files created, note that the end of the filename contains the PID of the process.

The fact that yours are incrementing whilst mine stays the same suggests that disk_em is in fact failing on your system.

I believe it is supposed to log to the logfiles found in /var/opt/resmon/log, so take a look there.

HTH

Duncan

Ralph Grothe
Honored Contributor

Re: Who is cluttering /var/tmp with char dev files rdskAAA* ?

SEP and Dennis,

Never mind, this could well be an Oracle issue.

Normally, this box serves as a test bed for the Oracle DBAs and the application developers.

For some reason there is currently no DB instance online.
These are the only oracle procs right now

# ps -u oracle
PID TTY TIME COMMAND
2589 ? 23:44 perl
2602 ? 60:53 emagent

Hm, being a bit of a Perl aficionado myself,
I was curious what they are running:

# UNIX95= ps -x -o args -p 2589
COMMAND
/app/oracle/grid/agent10g/perl/bin/perl /app/oracle/grid/agent10g/bin/emwd.pl agent /app/oracle/grid/agent10g/sysman/log/emagent.nohup

So I looked at the agent Perl script which, according to its header, is an Oracle supplied piece, and was a bit shocked about lots of unperlish idioms or even deprecated Perl malpractices in the vein of "Pest Practices".

First thing that disqualifies it as production code for a piece that exceeds 100 lines (with crudely stripped comments)

# grep ^[^#] /app/oracle/grid/agent10g/bin/emwd.pl|wc -l
697

Where the heck is the strict pragma?

# grep -c strict /app/oracle/grid/agent10g/bin/emwd.pl
0

Let's pick out a few pest lines (line numbers prepended).

What a misnomer for an array:

216 my @COMMAND_STR=@ARGV;

This is C-style indenting, isn't it?
OK, it's just a matter of style:

228 if ($NOHUP_FILE eq "")
229 {
230 if($COMMAND eq "iasconsole")


Bizarre quoting and concatenation
(Mark Jason Dominus calls this "cave dwellers' voodoo"):

257 $reqPkg = "$moduleName"."\.pm";

Useless use of parentheses and weird arithmetic:

315 my($NUM_COLS) = 2;
316 my($NUM_COMPONENTS) = (scalar(@components)/$NUM_COLS);


Clumsy initialization:

324 # marked all components as just started
325 my @compJustStarted;
326 for $i ( 0 .. ($NUM_COMPONENTS-1) )
327 {
328 $compJustStarted[$i] = 1;
329 }


Again, ugly C style looping:

355 for($i=0, $baseCtr=0; $i < $NUM_COMPONENTS; $i++, $baseCtr+=2)
356 {


Unclear local scoping:

367 # Reap the child .... returns an array.
368 # [0] : How the process exited [normal/signal/coredump].
369 # [1] : Exit code/Signal Code
370 local (*processExit) = reapChild( $pid, $name );

this is returned by reapChild():

926 else
927 {
928 printDebugMessage("ProcessStatus is $processStatus. Assuming normal exit.");
929 @status = ($PROCESS_EXIT_NORMAL, $processStatus);
930 }
931 }
932 return (\@status);
933 }


Btw, in this sub we suddenly take up Perl style indenting (must be due to different coders):

901 if($reaped == -1) {
902 # we lost the xit code. somebody else reaped it.
903 # we report normal exit as we don't want it restarted.
904 printMessage("Lost xit code. Assuming normal exit. processStatus=$processStatus
904 ");
905 @status = ($PROCESS_EXIT_NORMAL, 0);
906 } else {


Strange logic, and again ample use of parens:

425 # If the status is no_process or process_hang ...
426 if( ($rc == $STATUS_NO_SUCH_PROCESS) or
427 ($rc == $STATUS_PROCESS_HANG) or
428 ($rc == $STATUS_AGENT_ABNORMAL) or
429 ( $processExit[0] != $PROCESS_OK ) )
430 {


Just a few lines later, an unmotivated switch to the higher-precedence operators:


434 if ( ($processExit[0] == $PROCESS_OK) &&
435 ( $rc == $STATUS_PROCESS_HANG ) || ( $rc == $STATUS_AGENT_ABNORMAL ) )
436 {
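A side note on that last test: in Perl, as in C, && binds tighter than ||, so the condition at lines 434-435 groups as (A && B) || C. Whether that grouping was intended can't be told from the excerpt alone. Shell arithmetic follows the same C precedence, so the effect can be sketched without Perl; the variables ok, hang and abnormal below are stand-ins for the three comparisons, not names from emwd.pl.

```shell
#!/bin/sh
# Sketch: precedence of && over || (shell arithmetic uses the C rules, as Perl does).
# ok       stands for ($processExit[0] == $PROCESS_OK)
# hang     stands for ($rc == $STATUS_PROCESS_HANG)
# abnormal stands for ($rc == $STATUS_AGENT_ABNORMAL)
ok=0 hang=0 abnormal=1

# Grouping as the code is written: (ok && hang) || abnormal
echo "as written: $(( ok && hang || abnormal ))"      # prints "as written: 1"

# Grouping with explicit parentheses: ok && (hang || abnormal)
echo "regrouped:  $(( ok && (hang || abnormal) ))"    # prints "regrouped:  0"
```

The two groupings disagree exactly when the process did not exit OK but the agent status is abnormal.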


Strange typeglob assignment from our ubiquitous reapChild sub, which merely returns an array ref:

464 (*processExit) = reapChild( $pid, $name );


Again, funny concatenation (first by operator, later by double-quote interpolation):

491 my($tmpMsg) = $name." exited at ".localtime($currentCrashTime).
492 " with return value $processExit[1].";
493 printMessage($tmpMsg);


while here we relapse to a more palatable and readable interpolation:

669 writeToEMAbbendFile("$EMHOME/sysman/log/agabend.log",
670 "$message");

Hard-to-read and obsolete sub dereferencing of an array element:

701 $tmp = &{$components[$baseCtr+$restart_offset]}();


Clumsy logic:

750 if($NUM_COMPONENTS == 0)
751 {
752 if($normalShutdown eq "FALSE")
753 {
754 printMessage("Exited due to Thrash.");
755 }
756 }


Why not better use sprintf or join?

798 my($appender) = $name."_".time();


Repetition of queer assignments:

# grep -n '@status = ($PROCESS_OK, $PROCESS_OK);' /app/oracle/grid/agent10g/bin/emwd.pl
880: @status = ($PROCESS_OK, $PROCESS_OK);
894: @status = ($PROCESS_OK, $PROCESS_OK);
954: @status = ($PROCESS_OK, $PROCESS_OK);
970: @status = ($PROCESS_OK, $PROCESS_OK);


Well, this goes on and on.
I only had a very superficial look at it without caring at all about the program logic or trying to understand the code.
Maybe I have disparaged indisputable bits?
And altogether this code is still quite acceptable and not as bad as other vendor-distributed code I have seen (have a look at the VCS entry points).
But if even a Perl rookie like me notices this then I wonder of what quality the code might be that they provide us only as binaries?


Madness, thy name is system administration
Ralph Grothe
Honored Contributor

Re: Who is cluttering /var/tmp with char dev files rdskAAA* ?

Sorry for my ranting digression.

Duncan,

many thanks for pointing me to disk_em.

Well, as I think I mentioned, I already looked at the logs in /var/opt/resmon/log.
The only current one is event.log (the other files date back to April), and all entries later than July 5th (when the box was last rebooted due to Support+ patching) only inform about a restart of EMS today, which took place as I stopped it via monconfig.

I also noticed the trailing PIDs in the device files' names.
But these PIDs are all of processes in the past.
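The check that those PIDs belong to past processes can be scripted. This is only a sketch: it assumes the trailing digits of each name are the zero-padded PID of the creating process (as in rdskUAAa01632 for PID 1632), and the helper names rdsk_pid and pid_alive are made up here.

```shell
#!/bin/sh
# Sketch: report which /var/tmp/rdsk* device files belong to PIDs that are gone.
# ASSUMPTION: the trailing digits of the file name are the creator's PID.

rdsk_pid() {
    digits=${1##*[!0-9]}      # keep only the trailing run of digits
    expr "$digits" + 0        # strip leading zeros; expr works in decimal
}

pid_alive() {
    # Signal 0 only tests whether the process exists; run this as root,
    # otherwise a live process owned by someone else also counts as gone.
    kill -0 "$1" 2>/dev/null
}

for f in /var/tmp/rdsk*; do
    [ -c "$f" ] || continue   # character special files only
    pid=$(rdsk_pid "$f")
    if pid_alive "$pid"; then
        echo "$f: PID $pid still running"
    else
        echo "$f: PID $pid gone (stale)"
    fi
done
```

Bear in mind that PIDs get recycled, so a "still running" line may refer to an unrelated later process.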

Actually, disk_em wasn't in the proc table while EMS was shut down by my monconfig intervention.
But yet, the device files were created as soon as diagmond was running.

I now restarted EMS via monconfig and attached another tusc to disk_em.

I think it is waiting for the master agent to fetch states because it lingers in a select call.

# UNIX95= ps -C disk_em
PID TTY TIME CMD
7425 ? 00:01 disk_em

# tusc -f -s open,mknod 7425
( Attached to process 7425 ("/usr/sbin/stm/uut/bin/tools/monitor/disk_em") [32-bit] )
select(2048, 0x77ff0b24, 0x77ff07a8, 0x77ff08a8, 0x77ff0d34) ............. [sleeping]


I will have to watch this for half an hour and come back.


Madness, thy name is system administration
Solution

Re: Who is cluttering /var/tmp with char dev files rdskAAA* ?

Ralph,

Being an impatient type, I wasn't willing to wait for disk_em to do this itself, so I tried to force it. I found stopping/starting diagnostics caused disk_em to start running those mknods:

/sbin/init.d/diagnostic stop

... wait for a minute...

/sbin/init.d/diagnostic start
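To see whether the restart leaves new files behind, the stop/start above can be wrapped in a before/after count. A rough sketch only: count_rdsk and bounce_diag are hypothetical names, and the 60-second waits are a guess based on the "wait for a minute" above.

```shell
#!/bin/sh
# Sketch: count rdsk* character devices before and after bouncing diagnostics.

count_rdsk() {
    dir=${1:-/var/tmp}
    n=0
    for f in "$dir"/rdsk*; do
        [ -c "$f" ] && n=$((n + 1))   # count character special files only
    done
    echo "$n"
}

bounce_diag() {
    before=$(count_rdsk)
    /sbin/init.d/diagnostic stop
    sleep 60                          # assumed wait, roughly "a minute"
    /sbin/init.d/diagnostic start
    sleep 60                          # give the monitors time to run their mknods
    after=$(count_rdsk)
    echo "rdsk device files in /var/tmp: $before before, $after after"
}
```

Run bounce_diag as root. If the monitors clean up after themselves, the two counts should match.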



HTH

Duncan

Ralph Grothe
Honored Contributor

Re: Who is cluttering /var/tmp with char dev files rdskAAA* ?

Yes, that's what I also observed.

# /sbin/init.d/diagnostic stop

# UNIX95= ps -C diagmond
PID TTY TIME CMD

But when I run the start script directly under tusc, rather than attaching to diagmond, it aborts.

# tusc -f -s open,mknod /sbin/init.d/diagnostic start 2>&1|grep /var/tmp
open("/var/tmp/aaaa09909", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) ....... = 3


# UNIX95= ps -C diagmond
PID TTY TIME CMD


But diagmond is really the culprit, as can be seen this way:

# /sbin/init.d/diagnostic start && sleep 10 && tusc -f -s open,mknod $(UNIX95= ps -C diagmond -o pid=) 2>&1|grep /var/tmp
mknod("/var/tmp/rdskAAAa11454", S_IFCHR|0600, 3405848576) .................... = 0
open("/var/tmp/rdskAAAa11454", O_RDONLY, 014644) ............................. = 5
open("/var/tmp/rdskAAAa11454", O_RDWR, 0160100) .............................. = 6
mknod("/var/tmp/rdskAAAa11480", S_IFCHR|0600, 3405905920) .................... = 0
open("/var/tmp/rdskAAAa11480", O_RDONLY, 014644) ............................. = 5
open("/var/tmp/rdskAAAa11480", O_RDWR, 0170100) .............................. = 6
mknod("/var/tmp/rdskAAAa11486", S_IFCHR|0600, 3405914112) .................... = 0
open("/var/tmp/rdskAAAa11486", O_RDONLY, 014644) ............................. = 5
open("/var/tmp/rdskAAAa11486", O_RDWR, 0170100) .............................. = 6
open("/var/tmp/CCLOGD_TEST_DATA", O_RDONLY, 032) ............................. ERR#2 ENOENT



Ok, now how can this be fixed?

Do you think this is worth a call to HP software support?
Madness, thy name is system administration