Operating System - Linux
12-29-2004 12:33 AM
Nice process, bad scheduling, big trouble
Dear all,
I am having trouble with an nx2 Nastran process running on a DL380 G3 (dual-processor, multithreading enabled) with the 2.4.18-14smp Linux kernel.
I should say that this server was installed about a year ago and this problem never occurred before.
This is how things go:
The analysis process starts fine with a nice value of 10.
It does its work (computing and writing about 100 GB of data), using almost 100% of one CPU (we do not have the parallel (//) Nastran license).
BUT
After a while (depending on the size of the job), it takes 100% of ALL CPUs and never finishes (I let it run for 12 hours on a job that should take 2 hours).
So I tried to kill it: kill -15 does not work, after kill -9 it stays alive, and reboot is ignored (the server does not reboot at all).
In fact, the process uses 100% of the CPU in system mode, waiting for some I/O to complete (I think?!).
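One way to check the "system mode, waiting for I/O" guess is to read the process state and its user versus kernel CPU time from `/proc/<pid>/stat`. The following is a minimal sketch, assuming a Python 3 interpreter is available on the server; the helper names `parse_proc_stat`, `read_proc_stat`, and `system_time_fraction` are made up for illustration. A state of `D` (uninterruptible sleep) would be consistent with signals, including kill -9, not taking effect until the pending I/O returns, while a state of `R` with a steadily growing `stime` would point to the process spinning inside the kernel.

```python
def parse_proc_stat(stat_line):
    """Parse one line of /proc/<pid>/stat into the fields of interest."""
    # The command name sits in parentheses and may itself contain spaces,
    # so split on the last ')' instead of splitting on whitespace.
    head, _, tail = stat_line.rpartition(")")
    pid_text, _, comm = head.partition("(")
    fields = tail.split()
    return {
        "pid": int(pid_text),
        "comm": comm,
        "state": fields[0],        # field 3: R, S, D, Z, T, ...
        "utime": int(fields[11]),  # field 14: clock ticks spent in user mode
        "stime": int(fields[12]),  # field 15: clock ticks spent in kernel mode
        "nice": int(fields[16]),   # field 19: nice value
    }


def read_proc_stat(pid):
    """Read and parse /proc/<pid>/stat for a live process."""
    with open(f"/proc/{pid}/stat") as f:
        return parse_proc_stat(f.read())


def system_time_fraction(info):
    """Fraction of the accumulated CPU time that was spent in kernel mode."""
    total = info["utime"] + info["stime"]
    return info["stime"] / total if total else 0.0
```

Calling `read_proc_stat(pid)` twice, a minute apart, and comparing `utime` and `stime` shows which of the two counters is still growing.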
I let the process run until an "ASR Detected by System ROM" event rebooted the system.
We restarted the job and the process behaved the same way; every job behaves the same way.
After checking for hardware problems (I did not find any), I tried to renice the analysis process: at level 5 nothing changed; at level 0 the process stopped by itself after a few seconds.
We started a new job at nice level 0 and the process stopped by itself. But sometimes a process at nice level 0 gets into trouble again and I have to renice it further (-1, -2, -3, etc.).
My question is:
What does this process expect from me? More seriously, I understand that renicing to a lower value gives a process higher scheduling priority, but it seems that whatever the priority, the process freezes, and the only way to stop it is to renice it to a higher priority.
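For reference, nice values run from -20 (highest priority) to 19 (lowest), so a job started at 10 is scheduled less favourably than one at 0, and negative values normally require root. The sketch below only illustrates that ordering on the calling process itself, using Python's standard `os.getpriority` and `os.nice`; the helper names are hypothetical and nothing here is specific to Nastran.

```python
import os


def current_nice():
    """Return the nice value of the calling process."""
    return os.getpriority(os.PRIO_PROCESS, 0)


def is_higher_priority(nice_a, nice_b):
    """True if nice_a is scheduled more favourably than nice_b."""
    # A numerically lower nice value means a higher scheduling priority.
    return nice_a < nice_b


def lower_own_priority(increment=1):
    """Raise our own nice value (lower our priority); returns the new value.

    An unprivileged process may only increase its nice value. Going back
    down, or below 0, normally needs root.
    """
    return os.nice(increment)
```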
The data is written to a local SCSI hard disk with enough free space. There are no error messages in /var/log/*, dmesg, or the iLO log.
We stress-tested the system (NFS, I/O writes, CPU, memory) for 12 hours and nothing seemed to be wrong.
I tried with multithreading enabled and then disabled, and with the iLO card enabled and disabled.
If you have any ideas...
Thanks a lot
Best Regards
laurent Ranni
12-29-2004 02:58 AM
Re: Nice process, bad scheduling, big trouble
Ranni,
I'm afraid you have a software (Nastran) problem. But you don't have a Nastran license, which makes things a lot more serious.
Let's try some things.
Can you run the process with less data, i.e., a less resource-consuming job?
Can you check whether memory comes under pressure over the time the long-running process works?
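A simple way to watch memory over the life of the job is to sample `/proc/meminfo` at intervals and log a few values. This is only a sketch, assuming Python 3 on the server; `parse_meminfo` and `sample_memory` are hypothetical helper names, and the parser deliberately keeps only the `Name: value kB` lines so that the extra summary lines printed by older kernels are skipped.

```python
import re
import time

# Matches lines such as "MemFree:         13548 kB".
_MEMINFO_LINE = re.compile(r"^([^:]+):\s+(\d+)\s+kB$")


def parse_meminfo(text):
    """Return a dict of /proc/meminfo entries, with values in kB."""
    values = {}
    for line in text.splitlines():
        match = _MEMINFO_LINE.match(line.strip())
        if match:
            values[match.group(1)] = int(match.group(2))
    return values


def sample_memory(interval_s=60, samples=5, path="/proc/meminfo"):
    """Print free memory and free swap every interval_s seconds."""
    for _ in range(samples):
        with open(path) as f:
            info = parse_meminfo(f.read())
        print(time.strftime("%H:%M:%S"),
              "MemFree:", info.get("MemFree"),
              "SwapFree:", info.get("SwapFree"))
        time.sleep(interval_s)
```

Running `sample_memory()` in a second terminal while the job runs would show whether free memory and swap shrink steadily before the hang.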
Maybe your software problem (bug) appears only when a very large amount of data has to be processed.
So, since you don't have a license that would let you report a bug, I suggest that you break your long-running process into two or more, if possible.
I know it's not much help, but it's all I can do for you.
Good luck.
Xyko
The opinions expressed above are the personal opinions of the authors, not of Hewlett Packard Enterprise. By using this site, you accept the Terms of Use and Rules of Participation.
© Copyright 2024 Hewlett Packard Enterprise Development LP