SGLX 12.80 - not resistant to file system overflow /tmp
03-29-2023 10:49 PM - last edited on 03-30-2023 02:31 PM by support_s
Hello, we are seeing unplanned package stops when the temporary directory /tmp fills up.
The Oracle database monitoring failed to complete its task:
Mar 29 15:57:50 root@sglx_node1 tkit_module.sh[6549]: Retrying 3 more time(s) before giving up.
/opt/cmcluster/oracletoolkit/hagetdbstatus.sh: line 33: cannot create temp file for here-document: No space left on device
Mar 29 15:57:54 root@sglx_node1 tkit_module.sh[6549]: Retrying 2 more time(s) before giving up.
/opt/cmcluster/oracletoolkit/halistener.mon: line 50: cannot create temp file for here-document: No space left on device
Mar 29 15:57:57 root@sglx_node1 tkit_module.sh[6544]: Oracle Listener unisvfe failure detected.
Mar 29 15:57:57 root@sglx_node1 tkit_module.sh[6544]: Oracle Listener unisvfe failed
/opt/cmcluster/oracletoolkit/hagetdbstatus.sh: line 33: cannot create temp file for here-document: No space left on device
Mar 29 15:57:58 root@sglx_node1 tkit_module.sh[6549]: Retrying 1 more time(s) before giving up.
Mar 29 15:58:00 root@sglx_node1 tkit_module.sh[6544]: All listeners have failed
I checked the hagetdbstatus.sh script - it uses the following construction:
/usr/local/cmcluster/oracletoolkit/hagetdbstatus.sh: if [[ -f /tmp/ora_error_${SID_NAME}.txt ]] ; then
/usr/local/cmcluster/oracletoolkit/hagetdbstatus.sh: cat /tmp/ora_error_${SID_NAME}.txt
Is this a bug or a feature of the SGLX product?
As I understand it, bash by default creates its temporary files in /tmp, or in the directory specified by the TMPDIR variable. In either case, if that directory overflows, we would face the same problem: the package stops.
YA
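The TMPDIR point raised in the question can be checked before a here-document is attempted. What follows is only a minimal sketch: the function names are hypothetical and not part of Serviceguard or the Oracle toolkit, and it assumes bash falls back to /tmp when TMPDIR is unset. To my knowledge, newer bash releases (5.1 and later) may use a pipe instead of a temporary file for small here-documents, so the exact failure shown in the log may depend on the bash version.

```bash
#!/usr/bin/env bash
# Sketch only: these helper names are hypothetical, not part of Serviceguard.

# Print the directory bash is expected to use for temporary files
# (assumption: TMPDIR if set, otherwise /tmp).
effective_tmpdir() {
    printf '%s\n' "${TMPDIR:-/tmp}"
}

# Print the available space, in KiB, on the file system holding a directory.
# df -P keeps each file system on one line; column 4 is "Available".
free_kib() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed only if the temporary directory has at least MIN_KIB available.
tmp_has_space() {
    local min_kib="${1:-1024}"
    local avail
    avail=$(free_kib "$(effective_tmpdir)") || return 1
    [ "$avail" -ge "$min_kib" ]
}
```

A wrapper could call `tmp_has_space 1024 || echo "low space in $(effective_tmpdir)" >&2` to emit a warning before the monitor scripts run. The helpers use command substitution, which relies on pipes rather than temporary files, so the check itself should still work when /tmp is full.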
03-30-2023 05:35 AM
I would position this as expected behavior. Serviceguard's primary role is to provide high availability to packaged applications. This means that if the node currently running the application is experiencing a problem of some sort, Serviceguard should fail the package over to one of the other adoptive nodes. So although the monitor may not be explicitly designed to detect and handle a full /tmp file system, I would not say the outcome (failure of the monitor service and failover of the database) is completely undesirable from an HA perspective. A full /tmp file system can certainly destabilize a Linux operating system, leading to problems across many subsystems. Although it might or might not affect Oracle directly, it can affect many other operating system processes, so in my mind this is a situation where a failover is probably desirable.
If /tmp is filling up repeatedly, that should of course be fixed, either by growing it or by figuring out what keeps filling it up and stopping it.
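As a starting point for finding what is filling /tmp, something like the following sketch could be used. The function name is hypothetical, and it assumes GNU find and du, as shipped on typical Linux distributions.

```bash
#!/usr/bin/env bash
# Sketch only: hypothetical helper, assumes GNU find and du.

# List the COUNT largest entries directly under DIR, sizes in KiB,
# largest first. -x keeps du on the same file system as each entry.
largest_entries() {
    local dir="${1:-/tmp}" count="${2:-10}"
    find "$dir" -mindepth 1 -maxdepth 1 -exec du -xsk {} + 2>/dev/null \
        | sort -rn \
        | head -n "$count"
}
```

For example, `largest_entries /tmp 5` prints the five largest files or directories under /tmp. Note that space held by files that were deleted while still open by a process will not show up in this listing.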
03-30-2023 02:37 PM
Re: SGLX 12.80 - not resistant to file system overflow /tmp
Hi @Mike_Chisholm.
Thank you for a balanced answer. I agree that any problem that could negatively affect the cluster node can also negatively affect the managed application.
YA.