Operating System - HP-UX

Connection Issues HPUX11 A500-5x

 
Charles Harris
Super Advisor

Dear all,

I'm having a very strange problem with two machines and I'd greatly appreciate any suggestions / help anyone could bestow!!!

It's a bit of a long question, so please bear with me!

Machine A & Machine B are identical A500-5x boxes running HP-UX 11, WebLogic and MQ.

Machine C & Machine D are identical hardware to A & B, but run WebLogic, MQ, Informix and an in-house app.

Traffic is directed on a round-robin basis to machines A & B. Incoming requests (HTTP) arrive there, and the outgoing traffic (MQ sockets and Informix sockets) is forwarded to the back-end machines for processing, which works fine. However, in the event of a failure on either C or D, all the traffic is directed to the non-failed host. Nothing out of the ordinary so far ;-)

The problem is that when either C or D fails, the front-end machines A or B cannot establish enough concurrent connections to the back end. They fail with a Java-related runtime error from the WebLogic servers (A & B) :-

nectServlet.init()----- Error creating resource pool
java.lang.RuntimeException: An MQ error occurred opening connections to QMgr or Queues: Completion code 2 Reason code 2009
at com.dhl.connector.GASMQConnector.<init>(GASMQConnector.java:92)
at com.dhl.connector.MQResourceAllocator.allocate(MQResourceAllocator.java:42)
at com.dhl.util.ResourcePool.acquire(ResourcePool.java:90)
at com.dhl.util.ResourcePool.<init>(ResourcePool.java:60)
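
For anyone reading along, the two numbers in that message can be decoded. Completion code 2 is MQCC_FAILED and reason code 2009 is MQRC_CONNECTION_BROKEN, i.e. the connection to the queue manager was lost, which points at the connection rather than at the queue definitions. Below is a minimal Python sketch that pulls the codes out of a log line. The lookup tables are deliberately tiny and the function name is made up for illustration; it is not part of MQ or of the application in the trace.

```python
import re

# Standard MQ completion codes (MQCC_*).
COMPLETION_CODES = {0: "MQCC_OK", 1: "MQCC_WARNING", 2: "MQCC_FAILED"}

# A handful of common reason codes (MQRC_*); nowhere near a complete table.
REASON_CODES = {
    2009: "MQRC_CONNECTION_BROKEN",
    2058: "MQRC_Q_MGR_NAME_ERROR",
    2059: "MQRC_Q_MGR_NOT_AVAILABLE",
}

# Matches the wording used in the exception text quoted in the post.
PATTERN = re.compile(r"Completion code (\d+) Reason code (\d+)")


def decode_mq_error(text):
    """Return (completion_name, reason_name) found in text, or None if absent."""
    match = PATTERN.search(text)
    if match is None:
        return None
    completion = int(match.group(1))
    reason = int(match.group(2))
    return (
        COMPLETION_CODES.get(completion, "unknown"),
        REASON_CODES.get(reason, "unknown"),
    )


print(decode_mq_error("Queues: Completion code 2 Reason code 2009"))
```

Codes missing from the small tables come back as "unknown", so the tables would need extending from the MQ documentation before relying on this for anything else.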

We've had an MQ consultant tell us that this is not an MQ issue (surprise), and I've tried everything I can think of but cannot resolve the issue!
The TCP_CONN_MAX is set at 4096, which is our default, the TCP outputs from netstat -p tcp look fine, with no dropped connections due to full queue etc., and I'm not aware of any other system-related constraints.

If anyone has any ideas, RTFM, RTFF's or suggestions, they will be warmly received!!!

Cheers,

-=ChaZ=-

ps. Despite my horrible description and spelling, this is not an exam question ;-)
6 REPLIES
Jeff Schussele
Honored Contributor

Re: Connection Issues HPUX11 A500-5x

Hi ChaZ,

Do you have *all* the possible Q mgrs defined on the A & B systems - especially the failover scenarios? How do they get activated in this scenario?
Also we've had trouble between middleware apps like MQ & Orbix or even CORBA-type apps because one app will throw an undescribed error type that the other has no option but to ignore & fire up another connection, leaving the original intact. This quickly consumes the connection limit & causes failures.
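
To make that failure mode concrete, here is a toy Python simulation of it. Everything in it is hypothetical (the `Backend` class, the limit of 10, the retry-once behaviour); it only illustrates how retrying on a fresh connection without closing the old one eats the limit, and says nothing about what the real application does.

```python
class Backend:
    """Stand-in for a back-end host that accepts at most `limit` connections."""

    def __init__(self, limit):
        self.limit = limit
        self.open_connections = 0

    def connect(self):
        if self.open_connections >= self.limit:
            raise ConnectionError("connection limit reached")
        self.open_connections += 1
        return self

    def close(self):
        self.open_connections -= 1


def run_requests(backend, requests, close_on_error):
    """Each request opens a connection, hits an error, then retries on a new one.

    Returns (requests completed, connections still open at the end).
    """
    completed = 0
    for _ in range(requests):
        try:
            first = backend.connect()
            # Simulated unrecognised error on the first connection.
            if close_on_error:
                first.close()          # release it before retrying
            retry = backend.connect()  # fire up another connection
            retry.close()              # the retry itself finishes cleanly
            completed += 1
        except ConnectionError:
            break
    return completed, backend.open_connections


print(run_requests(Backend(limit=10), 100, close_on_error=False))  # (9, 10)
print(run_requests(Backend(limit=10), 100, close_on_error=True))   # (100, 0)
```

With the leak, one connection is stranded per request, so the tenth request cannot get its retry connection even though only a single request is ever in flight at a time.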
I'd be getting traces - including kernel traces - of all the PIDs in question for examination.

Rgds,
Jeff
PERSEVERANCE -- Remember, whatever does not kill you only makes you stronger!
Charles Harris
Super Advisor

Re: Connection Issues HPUX11 A500-5x

Thanks for the tips. The Qmgrs are all set up correctly; the configuration was working perfectly for a while, then suddenly stopped.
Nothing (as far as we can tell) has changed configuration-wise. We are even unable to replicate the problem in our test environment, which is odd, although different hardware is used there!! I'm just trying to figure out the best way of resolving the issue as it's in a live environment, which makes testing / re-installation traumatic to say the least...

Cheers,

-=ChaZ=-
Charles Harris
Super Advisor

Re: Connection Issues HPUX11 A500-5x

Any pointers on getting said traces? I'm about to try some tcpdump runs to establish whether it's a network routing / protocol related problem....

Cheers,

-=ChaZ=-
Jeff Schussele
Honored Contributor

Re: Connection Issues HPUX11 A500-5x

Well...to trace the processes you need to get & use tusc - here's where it can be had:

http://hpux.cs.utah.edu/hppd/hpux/Sysadmin/tusc-7.5/

And to get kernel traces you need a program called kitrace - but you have to get that from HP. Generally they'll give it to you if you log a SW call & they think they need the output to troubleshoot the problem.

Rgds,
Jeff
rick jones
Honored Contributor

Re: Connection Issues HPUX11 A500-5x

Keep in mind that tcp_conn_max (no caps, we don't shout ndd settings here :) is the limit to the number of pending connections queued to a listen endpoint, not a limit on the number of connections one can establish to/from the system (ftp://ftp.cup.hp.com/dist/networking/briefs/annotated_ndd.txt)

Limits to the number of TCP connections tend towards the limits to the number of file descriptors - nfile for system-wide and maxfiles/maxfiles_lim for per-process.
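
As a rough illustration of the per-process side of that, the sketch below asks the operating system for the open-file limits of the current process and works out how many more descriptors (and so sockets) would fit. It uses Python's standard `resource` module on a Unix-like system; the helper names are made up, and it reports the limits of the Python process itself, not those of a running WebLogic JVM, which would have to be checked in that process's own environment.

```python
import resource


def fd_limits():
    """Return the (soft, hard) open-file limits for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def connections_that_fit(soft_limit, already_open):
    """How many more descriptors fit under the soft limit.

    Returns None when the limit is reported as unlimited.
    """
    if soft_limit == resource.RLIM_INFINITY:
        return None
    return max(0, soft_limit - already_open)


soft, hard = fd_limits()
print("soft limit:", soft, "hard limit:", hard)
# Assumed figure for illustration: 40 descriptors already used by logs, jars, etc.
print("room for new sockets:", connections_that_fit(soft, already_open=40))
```

Every socket costs one descriptor, so a pool that suddenly has to open twice as many back-end connections after a failover can cross a per-process limit that was never a problem under normal load.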

If the software is doing something unfortunate like thread-per-connection you might run into issues with max_thread_proc (iirc) and related settings. If it is doing something even more unfortunate like process-per-connection then you have to consider nproc (system wide) and maxuproc.

Knuth only knows what limits exist within Java...

None of those things (iirc) would appear as errors in the netstat -p tcp output. They would almost certainly appear in an error log if the software is "well written" and at the very least would appear as error returns in the aforementioned tusc system call traces.
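
Once a trace has been captured to a file, something like the following Python sketch could tally the failing system calls. It assumes failed calls are reported with an `ERR#<number> <NAME>` suffix, and the sample lines are invented for illustration rather than taken from a real trace, so the pattern may need adjusting to the actual output.

```python
import re
from collections import Counter

# Assumed shape of a failed call: name(args...) ..... ERR#24 EMFILE
ERR_PATTERN = re.compile(r"(\w+)\(.*\bERR#(\d+)\s+(\w+)")


def summarise_errors(trace_lines):
    """Count failed system calls, keyed by (syscall name, errno name)."""
    counts = Counter()
    for line in trace_lines:
        match = ERR_PATTERN.search(line)
        if match:
            counts[(match.group(1), match.group(3))] += 1
    return counts


# Hypothetical trace lines, made up for this example.
sample = [
    "socket(AF_INET, SOCK_STREAM, 0) ........ ERR#24 EMFILE",
    'open("/etc/hosts", O_RDONLY, 0) ........ = 5',
    "socket(AF_INET, SOCK_STREAM, 0) ........ ERR#24 EMFILE",
]

for (syscall, errno_name), count in summarise_errors(sample).items():
    print(syscall, errno_name, count)
```

A pile of EMFILE or ENFILE returns would point at the file descriptor limits, while errors on thread or process creation would point at the thread and process limits instead.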
there is no rest for the wicked yet the virtuous have no pillows
Charles Harris
Super Advisor

Re: Connection Issues HPUX11 A500-5x

Thanks for the followup information, I'm going to check the maxfiles situation and the other kernel parms, maybe even using lsof to see if I can spot what is really going on with the file handles......etc....

A well-written application should log errors ;-) Least said about that the better, I think!

Thanks again, I'll put my findings on here if we ever manage to sort it out without resorting to a total rebuild (which is the software equiv. of a reboot ;-) ........ watch this space!


Cheers,

-=ChaZ=-