Operating System - Tru64 Unix

Re: Performance problem on a GS1280

 
Alexey Borchev
Regular Advisor

Re: Performance problem on a GS1280

Hi, Florent! You've made lots of wonderful observations!
>>Do you know how it can happen ? - No...

>>And why the system is not able to switch the memory of one process to the CPU #10 that is not used ? - As far as I know, Tru64 does its best to schedule a process close to its memory. But I have never heard of Tru64 relocating a process's memory to another RAD...

>>Concerning the large page memory, can you give me more details about the way to manage ?
1) #man sys_attrs_vm, read everything about vm_bigpg_*
2) Run the Kernel Tuner, section vm, set vm_bigpg_enabled=1, and reboot.
Your application seems to be memory-intensive, i.e. a good candidate for the feature. Please tell us if you get performance benefits from big pages.
I've just enabled the feature and got results (see attachment, it's the Oracle dbwriter process).
But this will not resolve the 'Foreign RAD' problem.

3) I have seen another process that now shares memory on two processors (#21 and #23)! So xmesh is showing a lot of transfer between these two. Do you think this is expected behavior? - Yes, definitely.

4) If you want to pin a process to its memory, then go for either 'runon' or sched_distance (but sched_distance<=1 can harm performance).

5) Sorry, I am not an HP person, I am just selling lipsticks for Avon :-)

The fire follows schedule...
Han Pilmeyer
Esteemed Contributor

Re: Performance problem on a GS1280

If you do decide to try VM:vm_bigpg_enabled=1, be sure to also set vm:vm_segmentation=0. We're working on an official message about that.
Florent Boucher
Occasional Advisor

Re: Performance problem on a GS1280

Dear Hein and Alexey,
I have not had time yet to test the vm_bigpg option. For this, I have to reboot the system and I should send a notification to the users. I think I will make this change in the middle of the week. In the meantime, I would like to come back to the difference between "home rad" and "current rad". On our system, jobs are often submitted for many hours (days). So, with a scheduler policy that can suspend one job to start another, it seems possible that two (or even more) heavy jobs have the same "home rad". Am I right? Of course, unix will try to have a different "current rad" for every very demanding process.
Is there a way to optimize how the "home rad" are distributed? One can imagine that unix could, in my case, move the "home rad" of process 688877 to rad#10 in order to avoid the large transfer between processors #12 and #10?
Concerning runon, it is impossible to use with MPI jobs. So I do not think I will keep this solution.
Regards
Florent

Han Pilmeyer
Esteemed Contributor

Re: Performance problem on a GS1280

Oops. That should be vm:vm_bigpg_seg=0 (not vm:vm_segmentation=0) when using big pages (vm:vm_bigpg_enabled=1).
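For reference, the two settings named in this thread could look roughly like this as a vm stanza in /etc/sysconfigtab. This is only a sketch: the attribute names and values are the ones given above, the stanza layout is the usual sysconfigtab form, and it is assumed that the change is made through the Kernel Tuner or sysconfigdb rather than by hand, followed by a reboot. Check sys_attrs_vm on your own release before applying it.

```
vm:
        vm_bigpg_enabled = 1
        vm_bigpg_seg = 0
```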
Joerg Schulenburg
Frequent Advisor

Re: Performance problem on a GS1280

I admin a GS1280 and also see performance problems. Some RADs are at nearly 99% system and 1% user time if the free memory of that RAD is too low. I also observed that 2 different jobs get memory from the same RAD. Maybe you have a similar problem, but not fully evolved. Unfortunately I don't see the vmstat -R or the ps output mentioned in this thread.
Please have a short look at
http://www.uni-magdeburg.de/urzs/marvel/vmbug3.html
to see what I am talking about.
Fighting for a better world with more penguins.
Joerg Schulenburg
Frequent Advisor

Re: Performance problem on a GS1280

I am also very interested in the mentioned tool for seeing where the memory of a process is located. I asked Google about it, but found nothing.
Fighting for a better world with more penguins.
Florent Boucher
Occasional Advisor

Re: Performance problem on a GS1280

Dear Joerg,
it seems to me that we have exactly the same problem. For the moment, no news at all from HP support. I have attached the first output from vmstat -R 5 and the information about memory allocation for the two processes that have the problem.
I hope somebody will give us a "good" answer to solve the problem.
Regards
Florent
Florent Boucher
Occasional Advisor

Re: Performance problem on a GS1280

Dear Joerg
The tool for analyzing vm allocation was provided by Alexey. You can find it at the beginning of the thread.
I have attached it again.
Regards
Florent
Joerg Schulenburg
Frequent Advisor

Re: Performance problem on a GS1280

Dear Florent,
I am happy that I am not alone. Thanks for the attachment. I just overlooked the paperclip symbol on the replies.
I will try out the program together with my test program tomorrow on the empty machine. Today it's too late for long experiments.
Best regards,
Joerg.
Fighting for a better world with more penguins.
Joerg Schulenburg
Frequent Advisor

Re: Performance problem on a GS1280

Dear Florent, it's not easy to run successful tests. I did some bad things. First, I called date, vmstat and ps from the program using the system() call. As I remember, that is not very clever, because system() usually calls fork + exec, which means the big multi-GB process is (virtually) doubled for a short time.
I saw that date, ps, etc. took a long time instead of responding quickly. So the outcome of my tests is not optimal.
I will try to give more details on the mentioned page and in another forum thread (subject: slow down (swapping) on a GS1280 with lot of free memory). As you can understand, it's not my job to use our expensive machine as a test machine and reboot it every day.
At first I saw the system becoming very slow when the free pages of one RAD dropped below 3000, down to 10, which usually happened at the 16th GB (with and without swap).
Today swap was growing very slowly, and speed was not as bad as some days ago, which could be a result of the other users (I did not reboot before the new tests).
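To illustrate the fork + exec point: one possible way to start small helpers such as date or ps from a very large process is posix_spawnp() instead of system(). This is only a sketch under assumptions: the helper name run_helper is made up for illustration, it is not known whether the Tru64 release in question provides posix_spawn at all, and whether the call really avoids duplicating the parent's address space depends on the C library (on older systems vfork + exec is the traditional alternative).

```c
#include <spawn.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/*
 * run_helper (hypothetical name): start argv[0] via posix_spawnp()
 * and wait for it to finish.
 * Returns the child's exit status, or -1 if it could not be started
 * or did not exit normally.
 */
int run_helper(char *const argv[])
{
    pid_t pid;
    int status;

    /* posix_spawnp searches PATH and returns an error number on failure. */
    int rc = posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ);
    if (rc != 0) {
        fprintf(stderr, "posix_spawnp(%s): %s\n", argv[0], strerror(rc));
        return -1;
    }

    /* Wait for the helper so it does not remain as a zombie. */
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/*
 * Usage inside the big compute program, in place of system("date"):
 *
 *     char *date_cmd[] = { "date", NULL };
 *     run_helper(date_cmd);
 */
```

Passing the arguments as an array also skips the extra /bin/sh that system() starts, so each call creates one process less.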
Fighting for a better world with more penguins.