04-17-2014 03:13 AM - last edited on 04-20-2014 03:33 PM by Lisa198503
Problem with HPC when running big programs
Hi,
I'm not an expert on this, but I would really appreciate it if somebody could help me find a solution.
I have an HPC cluster. On the master node, df -H shows the following:
[root@masterserver ~]# df -H
Filesystem Size Used Avail Use% Mounted on
/dev/cciss/c0d0p6 21G 17G 3.8G 82% /
/dev/cciss/c0d0p9 70G 8.2G 58G 13% /opt
/dev/cciss/c0d0p8 11G 159M 9.8G 2% /tmp
/dev/cciss/c0d0p5 32G 7.1G 23G 24% /usr
/dev/cciss/c0d0p3 32G 2.2G 28G 8% /var
/dev/cciss/c0d0p2 100G 11G 85G 11% /export
/dev/cciss/c0d0p1 5.3G 181M 4.8G 4% /boot
tmpfs 17G 0 17G 0% /dev/shm
tmpfs 8.3G 19M 8.3G 1% /var/lib/ganglia/rrds
192.168.1.2:/global/home 9.6T 31G 9.1T 1% /global/home
192.168.2.2:/global/apps 8.3T 24G 7.9T 1% /global/apps
*****************************
On the storage server it shows:
[root@storageserver ~]# df -H
Filesystem Size Used Avail Use% Mounted on
/dev/cciss/c0d0p6 21G 20G 0 100% /
/dev/cciss/c0d0p7 11G 158M 9.8G 2% /tmp
/dev/cciss/c0d0p5 21G 2.2G 18G 11% /usr
/dev/cciss/c0d0p3 21G 7.0G 13G 36% /var
/dev/cciss/c0d0p2 21G 3.7G 17G 19% /opt
/dev/cciss/c0d0p1 2.1G 43M 2.0G 3% /boot
tmpfs 4.2G 0 4.2G 0% /dev/shm
/dev/mapper/vg0-lv0 9.6T 31G 9.1T 1% /global/home
/dev/mapper/vg1-lv1 8.3T 24G 7.9T 1% /global/apps
******************************
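To make the full filesystem easier to spot, here is a minimal Python sketch that scans df-style output and lists every mount point at or above a usage threshold. The function name `full_filesystems` and the 90% default are assumptions for illustration only, and the sample text is a subset of the storageserver output above.

```python
# Sketch only: the function name and the 90% default threshold are illustrative.
SAMPLE = """\
Filesystem Size Used Avail Use% Mounted on
/dev/cciss/c0d0p6 21G 20G 0 100% /
/dev/cciss/c0d0p7 11G 158M 9.8G 2% /tmp
/dev/cciss/c0d0p3 21G 7.0G 13G 36% /var
/dev/mapper/vg0-lv0 9.6T 31G 9.1T 1% /global/home
"""


def full_filesystems(df_text, threshold=90):
    """Return (mount_point, use_percent) pairs at or above threshold."""
    flagged = []
    for line in df_text.strip().splitlines():
        fields = line.split()
        # Data rows have exactly six columns with a numeric Use% in the fifth.
        # The header splits into seven words, so it is skipped here.
        if len(fields) != 6 or not fields[4].rstrip("%").isdigit():
            continue
        use = int(fields[4].rstrip("%"))
        if use >= threshold:
            flagged.append((fields[5], use))
    return flagged


print(full_filesystems(SAMPLE))  # [('/', 100)]
```

On this sample only the root filesystem of the storageserver is reported, which matches the "/dev/cciss/c0d0p6 ... 100% /" line.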
Now, when I run any big program, the program halts, and when I run df -H again on the master, the lines for the /global/home and /global/apps mounts are missing:
[root@paramsheersh ~]# df -H
Filesystem Size Used Avail Use% Mounted on
/dev/cciss/c0d0p6 21G 17G 3.7G 82% /
/dev/cciss/c0d0p9 70G 8.2G 58G 13% /opt
/dev/cciss/c0d0p8 11G 159M 9.8G 2% /tmp
/dev/cciss/c0d0p5 32G 7.1G 23G 24% /usr
/dev/cciss/c0d0p3 32G 2.2G 28G 8% /var
/dev/cciss/c0d0p2 100G 11G 85G 11% /export
/dev/cciss/c0d0p1 5.3G 181M 4.8G 4% /boot
tmpfs 17G 0 17G 0% /dev/shm
tmpfs 8.3G 19M 8.3G 1% /var/lib/ganglia/rrds
*********************************
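The same check can be done without reading df output by eye. The sketch below assumes a Linux master node, where /proc/mounts lists one mount per line with the mount point in the second field. The names `missing_mounts`, `check_master` and `EXPECTED` are hypothetical, chosen only for this example.

```python
# Sketch only: function and constant names are illustrative, not from any tool.
EXPECTED = ["/global/home", "/global/apps"]  # NFS mounts shown in the first df -H


def missing_mounts(mounts_text, expected):
    """Return the expected mount points that do not appear in mounts_text.

    mounts_text uses the /proc/mounts layout:
    device, mount point, fs type, options, dump, pass.
    """
    mounted = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            mounted.add(fields[1])
    return [m for m in expected if m not in mounted]


def check_master(path="/proc/mounts"):
    """Read the live mount table (Linux only) and report missing mounts."""
    with open(path) as f:
        return missing_mounts(f.read(), EXPECTED)
```

Running `check_master()` on the master before and after a big job would show whether both NFS mounts drop out at the same moment the program halts.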
I am not able to log in to the storageserver at all. The login hangs and stays that way until I forcibly shut the machine down.
My guess is that the connection to the storageserver is lost when running big programs. The cause might be a disk space problem, possibly the full root filesystem on the storageserver ("/dev/cciss/c0d0p6", 100% usage on "/").
Please help me with this.
Thanks in advance and Regards,
Hamar
P.S. This thread has been moved from Servers>General to Server Clustering. -HP Forum Moderator