
simulation job crash, help!

jane zhang
Regular Advisor

simulation job crash, help!

Hello all,

We need to run large IC simulation jobs on our Linux box, but the jobs always crash the box before they use up all the memory.

I increased swap space to 6GB after we increased memory to 3GB.

I have checked ulimit and memory and found no clue.
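For reference, the per-process limits that ulimit reports can also be read programmatically. The following is a minimal sketch using Python's standard `resource` module; the helper names `format_limit` and `report_limits` are illustrative only, and the choice of which three limits to show is an assumption about what matters for a large simulation job.

```python
import resource


def format_limit(value):
    """Render a limit in bytes, or 'unlimited' for RLIM_INFINITY."""
    return "unlimited" if value == resource.RLIM_INFINITY else f"{value} bytes"


def report_limits():
    """Return {limit name: (soft, hard)} for limits that can cap memory use."""
    names = ["RLIMIT_AS", "RLIMIT_DATA", "RLIMIT_STACK"]
    report = {}
    for name in names:
        soft, hard = resource.getrlimit(getattr(resource, name))
        report[name] = (format_limit(soft), format_limit(hard))
    return report


if __name__ == "__main__":
    for name, (soft, hard) in report_limits().items():
        print(f"{name}: soft={soft} hard={hard}")
```

The limits shown are those of the process running the script, so it should be started from the same shell and account that launches the simulation.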

Any suggestions?

Thanks,

Jane
6 REPLIES
xyko_1
Esteemed Contributor

Re: simulation job crash, help!

Hi Jane,

Did you see any messages in /var/log/messages regarding your job? Any message at all?

What do you mean by a server crash? Did it hang? Reboot? Or did you only lose control of your job?

We need more information to help you.

Tell us your hardware configuration and Linux distribution as well.

regards,
xyko
jane zhang
Regular Advisor

Re: simulation job crash, help!

Xyko,

Our IC designer was running his big job there. I did not see anything abnormal in /var/log/messages.

But the spectre (IC simulation) .out log file did complain.

We have increased memory from 1GB to 3GB and swap space to 6GB. Still no go.

Running top before the job crashed (the job just quit) showed only 2.5GB allocated to the job (from the SIZE column).

We are using Red Hat 8.0 on an Intel Dell PC.

Linux della02 2.4.18-14 #1 Wed Sep 4 13:35:50 EDT 2002 i686 i686 i386 GNU/Linux


Fatal error found by spectre at time = 41.6982 ps during transient analysis
`tran'.
Insufficient memory available.
Warning from spectre.
5 warnings suppressed.


Aggregate audit (7:35:26 PM, Thur Jan 6, 2005):
Time used: CPU = 3.37 ks (56m 14.1s), elapsed = 3.46 ks (57m 35.0s), util. =
97.7%.
Virtual memory used = 924 Mbytes.
spectre completes with 1 error, 388629 warnings, and 251654 notices.
spectre terminated prematurely due to fatal error.
Stuart Browne
Honored Contributor

Re: simulation job crash, help!

As a start, I'd suggest upgrading to the latest errata kernel, as it fixes many issues.

You may also need to use a differently-compiled kernel. By default, the latest errata kernels have the kernel flag 'CONFIG_HIGHMEM4G' enabled to support between 1 and 4GB of memory. It's turned off in all but the i686 series of kernel packages in the release set.
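One way to see which high-memory options the running kernel was built with is to look at its build configuration. The sketch below assumes the distribution installs that file as /boot/config-<kernel release>, which is a common convention but not guaranteed on every system; the function name `highmem_options` is illustrative only.

```python
import os


def highmem_options(config_text):
    """Return {option: value} for the high-memory options that are set."""
    found = {}
    for line in config_text.splitlines():
        line = line.strip()
        # Unset options appear as comments: "# CONFIG_X is not set".
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.startswith("CONFIG_HIGHMEM") or key == "CONFIG_NOHIGHMEM":
            found[key] = value
    return found


if __name__ == "__main__":
    # Assumed location of the build config for the running kernel.
    path = f"/boot/config-{os.uname().release}"
    try:
        with open(path) as handle:
            print(highmem_options(handle.read()))
    except OSError as exc:
        print(f"could not read {path}: {exc}")
```

If CONFIG_HIGHMEM4G does not show up in the result, the kernel was not built with that option.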
One long-haired git at your service...
xyko_1
Esteemed Contributor

Re: simulation job crash, help!

Hi Jane,

Stuart has a good idea. Verify that your kernel supports bigmem. The free command will show you whether your kernel is using the whole RAM.
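The same totals that free prints come from /proc/meminfo, so the check can be scripted. This is a minimal sketch, assuming a Linux /proc filesystem whose entries look like "MemTotal: <n> kB"; the function name `parse_meminfo` is illustrative, and lines without a kB unit are deliberately skipped.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: kilobytes}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only "Name: <digits> kB" lines; ignore headers and unitless fields.
        if sep and len(parts) >= 2 and parts[0].isdigit() and parts[1] == "kB":
            values[name.strip()] = int(parts[0])
    return values


if __name__ == "__main__":
    with open("/proc/meminfo") as handle:
        info = parse_meminfo(handle.read())
    for field in ("MemTotal", "SwapTotal"):
        if field in info:
            print(f"{field}: {info[field] // 1024} MB")
```

If MemTotal comes out well below the installed RAM, the kernel is not seeing all of the memory.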

Another thing to check is whether your job depends on memory parameters that you can configure at startup, such as shmmax, shmmni and shmall. I guess you will have to look for that kind of information in the software manual.

man sysctl will help you with kernel parameters.
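The current values of those three parameters can be read from /proc/sys/kernel, which is where sysctl exposes the kernel.* settings on Linux. A minimal sketch follows; the function name `read_shm_params` and its `base` argument are illustrative, and whether the simulator actually depends on these settings is something only its manual can confirm.

```python
from pathlib import Path

SHM_PARAMS = ("shmmax", "shmmni", "shmall")


def read_shm_params(base="/proc/sys/kernel"):
    """Return {parameter: int value}, or None where a value cannot be read."""
    values = {}
    for name in SHM_PARAMS:
        try:
            values[name] = int(Path(base, name).read_text().split()[0])
        except (OSError, ValueError, IndexError):
            values[name] = None
    return values


if __name__ == "__main__":
    for name, value in read_shm_params().items():
        print(f"kernel.{name} = {value}")
```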

regards,
xyko
Rick Beldin
Esteemed Contributor

Re: simulation job crash, help!

You don't mention what distro. If you are using Red Hat Enterprise Linux 3.0, you need to use one of the hugemem kernels from RHN. The description of one says:

This package includes a kernel that has appropriate configuration options
enabled for Pentium III machines with 16 Gigabytes or more of physical memory.
Necessary questions: Why? What? How? When?
jane zhang
Regular Advisor

Re: simulation job crash, help!

Hi all,

Thanks for the feedback.

I have applied the kernel-bigmem rpm package to Red Hat 8. The application makes more progress and now uses swap space, but it still quits for the same reason.

I am going to upgrade to WS V3 to see what happens.

Thanks,

Jane