08-10-2009 12:32 AM
unable to handle buffer overflow
As we approach our peak transaction period, our online server slows down its communication with the gateway servers. Data then piles up in the TCP socket buffer at the gateway server end. When the buffer hits its threshold, the gateway reports "tcp_writeAst: write error 28" (buffer exceeded), and communication between the online server and the gateway server never recovers until we reset the TCP connection.
For now we have resorted to increasing the TCPIP tcp sendspace buffer from 61142 to 987,136 and maxbuf from 8192 to 32000. So far this has stopped the "write error 28" from happening.
We are worried that it will happen again as the number of terminal connections to the gateways grows: more traffic means more processing at our online server, and in the end the error will return.
My question: why is the TCP/IP stack on the Integrity server unable to handle this type of buffer overflow?
The TCPIP version on the Integrity server is 5.6 ECO2.
Please find attached the TCPIP SHOW VERSION output from the Integrity server for your reference.
Below are the servers' specifications:
Online server
Alpha DS20
OpenVMS 7.3
TCPware
Gateway server
Integrity RX2660
OpenVMS 8.3-1H1
TCPIP ver 5.6 ECO2
Online----LAN----- gateway ---- router ---VPN ----- router --- terminals
Hope to hear from you soon.
Regards,
08-10-2009 01:17 AM
Re: unable to handle buffer overflow
08-10-2009 01:26 AM
Re: unable to handle buffer overflow
08-10-2009 02:08 AM
Re: unable to handle buffer overflow
With all due respect, there is a significant amount of information not yet present.
First, there are several possibilities here. One is that there is a problem in either of the TCP/IP stacks (please note that OpenVMS Alpha 7.3 is very old, and that the TCPware version is not given in the original post).
Second, the problem could be caused by a programming error in the database server, were it to stop processing requests from the gateway. Depending upon the transaction volume, the resulting backlog could conceivably create a scenario like what is being seen.
Is some event happening on the database server that is stopping processing of events? What is the time scale needed to create this problem (e.g., does the depth of the buffering hold 0.5 seconds of data or 50 seconds of data)?
- Bob Gezelter, http://www.rlgsc.com
08-10-2009 05:53 AM
Re: unable to handle buffer overflow
When the network (or application) limit is reached, the network protocol (or the application) attempts back-pressure, or the protocol drops messages. Which of these occurs depends on how bad the overload problem is, and how the protocol is designed; trade-offs made by designers.
Details of the particular limits vary. Widely.
While going to larger buffers can sometimes help smooth over the handling of bursty traffic, going to larger buffers with excess traffic simply forestalls the inevitable failure.
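To put rough numbers on that, here is a minimal sketch of the arithmetic in Python. The arrival and drain rates are invented for illustration and are not measurements from the systems in this thread; the two buffer sizes are the sendspace values quoted in the original post.

```python
def seconds_until_full(buffer_bytes, arrival_bps, drain_bps):
    """Time for a send buffer to fill when data arrives faster than it drains.

    Returns None when the drain rate keeps up, i.e. the buffer never fills.
    """
    excess = arrival_bps - drain_bps
    if excess <= 0:
        return None
    return buffer_bytes / excess


# Hypothetical rates in bytes per second; assumptions for illustration only.
ARRIVAL = 30_000
DRAIN = 20_000

for size in (61_142, 987_136):  # sendspace before and after the change
    t = seconds_until_full(size, ARRIVAL, DRAIN)
    print(f"{size:>7} bytes: full after {t:.1f} s of sustained overload")
```

Under these assumed rates the larger buffer lasts about 16 times longer, but it still fills whenever the overload outlasts it.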
There are various techniques available here, depending on which limit is being hit. Resolution could involve telecom bandwidth improvements or techniques including data compression, faster hardware or sharding techniques or both, tuned application software, or, well, something else.
It might mean the database is overloaded, the database design needs changes, the database server is overloaded, the network is overloaded, the disks on the server are overloaded, the application is overloaded, the quotas on the application server processes are insufficient, the memory on the servers is insufficient, or, well, anything.
Or, yes, there could be bugs in OpenVMS or the IP stacks, or in the application code. (OpenVMS and IP bugs are far less likely than application bugs.)
The likely first priority is to identify the particular limit being encountered here. That's also going to be your job; we cannot do that without direct access to the servers and the network involved.
08-11-2009 11:28 PM
Re: unable to handle buffer overflow
We have 11 gateway servers and one online server, and processing did not stop when the problem occurred.
Mostly the problem happens at peak hour, over about 2 to 3 minutes, after which the buffer stays jammed until we reset the link on both servers.
We have sufficient bandwidth from the telecom provider, and we do not use data compression on our network.
So far the online server has not been overloaded: processor and memory utilization were below 50%.
Gurus here,
How does the HP TCPIP mechanism work?
Is there a way for the stack to monitor the socket buffer when it reaches the quota limit?
In theory, the gateway server should resume sending data to the online server once the socket buffers have drained below the maximum, but sending only resumes after I reset both links.
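For reference, the "in theory" behaviour can be sketched with ordinary BSD-style sockets in Python. This is only an illustration on a Unix-like system using a local socket pair. It is not the gateway application's code, and it says nothing about how the AST-driven write path in TCPIP Services or TCPware actually behaves.

```python
import select
import socket

# A connected local socket pair stands in for the gateway-to-online link.
sender, receiver = socket.socketpair()
sender.setblocking(False)

chunk = b"x" * 4096
queued = 0
try:
    while True:                      # the receiver is not reading yet
        queued += sender.send(chunk)
except BlockingIOError:
    pass                             # send buffer full: hold off, keep the link

# While the peer is stalled, the socket is not reported as writable.
_, writable, _ = select.select([], [sender], [], 0)
stalled = not writable

# The peer catches up and drains everything that was queued.
drained = 0
while drained < queued:
    drained += len(receiver.recv(65536))

# The same socket becomes writable again; no reconnect was needed.
_, writable, _ = select.select([], [sender], [], 5)
resumed = bool(writable) and sender.send(chunk) > 0

print(f"queued={queued} stalled={stalled} resumed={resumed}")
sender.close()
receiver.close()
```

The point of the sketch is that a full send buffer is a condition to wait out rather than a fatal error: the sender holds its data until the socket is writable again and then continues on the same connection.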
Thanks
Wong seng guan
08-12-2009 01:46 AM
Re: unable to handle buffer overflow
TCP buffering will only do so much. Also, as Hoff has observed, it is not possible to exclude the possibility of two (sets?) of bugs: one in the application(s) on the gateway system and one in the TCP implementations, or the interaction of the different implementations (such interaction problems are exceedingly rare, but they do happen).
First, TCP buffering should generally not be used as a queue management implementation. If one is doing transaction processing of some sort, one should either implement a buffering solution (with flow controls) in the server (and its access routines that are used in the clients), or use a request management package (e.g., RTR).
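As a rough illustration of what "a buffering solution (with flow controls)" can look like, here is a minimal Python sketch of a bounded request queue that refuses new work when it is full instead of letting the backlog grow inside the TCP buffers. The class and method names are hypothetical, and this does not model RTR or any particular product.

```python
import queue


class BoundedRequestQueue:
    """Hypothetical application-level buffer with explicit flow control."""

    def __init__(self, max_pending):
        self._pending = queue.Queue(maxsize=max_pending)

    def submit(self, request):
        """Accept a request, or return False so the caller can slow down."""
        try:
            self._pending.put_nowait(request)
            return True
        except queue.Full:
            return False

    def next_request(self):
        """Hand the oldest pending request to the server, or None if idle."""
        try:
            return self._pending.get_nowait()
        except queue.Empty:
            return None


q = BoundedRequestQueue(max_pending=2)
results = [q.submit(f"txn-{i}") for i in range(3)]  # the third one is refused
first = q.next_request()                            # the server makes progress
retry_ok = q.submit("txn-2")                        # room again, retry succeeds
print(results, first, retry_ok)
# prints: [True, True, False] txn-0 True
```

A refused `submit` is the flow-control signal: the client can back off, retry later, or report "busy" to the terminal, and the limit is visible to the application rather than hidden in the socket layer.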
Identifying problems with the TCP stacks themselves would require either or both of two approaches: a trace of the communications flow to/from the server with a tool such as Wireshark (http://www.wireshark.org), and some simple, stripped-down test cases that reproduce the aberrant behavior.
The test cases can then be provided together with the resulting traces to the appropriate developers to reproduce the problem.
As a start, I would want to study the central server, to understand just what is going on that is causing the bottleneck. Approaches that may work at moderate or intermediate loads may not be suitable at high rates of activity. A performance study of the central processor and a code review are probably a good start.
- Bob Gezelter, http://www.rlgsc.com