class C network

 
Ray Aslin
Occasional Contributor

Hello,

We have an HP-UX 11i box that runs an Oracle database/application. We also have 2 network segments (VLANs): a 206.199.174.x segment, where the servers and IT department live, and a 172.16.x.x segment, where the rest of the organization lives. Both are fully 'classful', meaning they use 255.255.255.0 and 255.255.0.0 masks, respectively.

OK, here's the interesting part. Put the server on the 206.199.174.x side of the network and things are fast for the 172.16.x.x clients but slow for the 206.199.174.x clients. Put the Unix box on the 172.16.x.x segment and things are fast for everyone.

We thought we might have some type of flaky network issue, so we isolated the box and 1 client on a 24-port 100 Mb full-duplex switch (hard-setting speed and duplex on the switch and on the cards in both client and server). It was not connected to any part of the main network. Guess what: give the server a 206.199.174.x address and the client is slow (client set to a 206.199.174.x address); give it a 172.16.x.x address and the client is fast (client set to a 172.16.x.x address).
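For readers less familiar with the 'classful' terminology, here is a minimal Python sketch of how the default mask follows from the first octet of an address. The function names are made up for illustration, and 172.16.5.10 is a hypothetical host on the second segment; only 206.199.174.82 appears in this thread.

```python
import ipaddress


def classful_prefix(address: str) -> int:
    """Return the default (classful) prefix length for an IPv4 address."""
    first_octet = int(ipaddress.IPv4Address(address)) >> 24
    if first_octet < 128:
        return 8    # class A, mask 255.0.0.0
    if first_octet < 192:
        return 16   # class B, mask 255.255.0.0
    if first_octet < 224:
        return 24   # class C, mask 255.255.255.0
    raise ValueError(f"{address} is class D/E and has no default mask")


def classful_network(address: str) -> ipaddress.IPv4Network:
    """Return the classful network that contains the address."""
    prefix = classful_prefix(address)
    return ipaddress.ip_network(f"{address}/{prefix}", strict=False)


if __name__ == "__main__":
    # 172.16.5.10 is a hypothetical example host, not taken from the thread.
    for host in ("206.199.174.82", "172.16.5.10"):
        net = classful_network(host)
        print(host, "->", net, "mask", net.netmask)
```

Run as a script, this prints the network and mask for each address, which should match the 255.255.255.0 and 255.255.0.0 schemes described above.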

Nortel switches, running the 2 VLANs I listed above. Craziest thing I've seen (not that I've seen all that much).

The box just doesn't like a 206.199.174.x address.

Any thoughts, any at all?
The difference in speed in the application is profound: a query that should take 1 second takes as much as 35 seconds on the 206.199.174.x network.
Ray Aslin
Occasional Contributor

Re: class C network

I read over the post and realized I left out the troubleshooting we've done so far (no one should write these things at 5am).

1) The unit is equipped with a fiber card, so we used that to eliminate the copper 10/100/1000 card. However, both cards use pretty much the same chipset, Broadcom.

This also eliminated any copper cabling problems between the server and the punch block.

2) Tried different types of clients on the respective segments, with different OSes and hardware configurations. The problem follows the segment every time.

3) Replaced the network card with an identical one, at the request of the VAR that provided the hardware and application.

4) On the isolated network, played with the IP settings a bit: used a 172.16.5.x address with a class C subnet mask and set the client accordingly. I just wanted to see if the problem affected all class C subnets. Nope, that worked fine; it's just 206.199.174.x that gives us trouble.

The box isn't live yet (it was supposed to go live 3 weeks ago, but we are still trying to work this out with our VAR).

The only thing I am confident of is that it's not the network itself. The fact that we can take the box out of the Passport switch, put it on its own 24-port switch with one client, and still see the problem pretty much rules out the network for me. I see it as a misinterpretation of some kind within Unix: someplace, the pretty numbers we put in for IPs, like 206.199.174.82, are not translating correctly into binary. I know that's far-fetched, but I swear to god, if I'm still chewing on this in 2 more weeks, I'm going to look into an exciting career in truck driving.

Paula J Frazer-Campbell
Honored Contributor

Re: class C network

Ray

Have you checked things like hosts for duplicate entries?

Have you tried a reboot and tested immediately after?

Have a very close look at the netconf file - perhaps move it to one side and create a new clean one!

Just some ideas

Paula
If you can spell SysAdmin then you is one - anon
Paula J Frazer-Campbell
Honored Contributor

Re: class C network

Also

Anything in log files ?

Paula
If you can spell SysAdmin then you is one - anon
Ray Aslin
Occasional Contributor

Re: class C network


1) Checking the hosts file: I am told that the application doesn't need name resolution of the client PC to work, but I have tried adding the client just to see. I have also been quite careful, when switching the box back and forth between the networks, to make sure the hostname-to-IP mapping was correct for the box itself and for its default router.

2) netconf: we've taken the file, saved it off, and started from scratch; made changes manually; made changes through SAM; and compared the file to the current live box (which runs 11) and to another 11i box. No efforts in that regard have produced results.

3) Reboot immediately after: yes. In the isolated environment, when we first truly discovered the problem didn't have anything to do with the configuration of the main network, we decided we had better include reboots in our troubleshooting, just in case something wasn't 'taking the change' well.

Thanks for the ideas though...I appreciate it.

Ray
Steve Lewis
Honored Contributor

Re: class C network

Ray,

You have been very thorough and I doubt if I can think of anything you haven't tried, but just in case:

1. Rule out database settings:
Which IP is the listener attached to? Try changing it to the other one (the other hostname).
Compare query speeds with ftp times on both networks. If it's the network, then ftp should also be slow.

2. Check your client settings - how is the client getting the IP for the server - DNS? Windows hosts file? Rule out DNS.

3. Reboot, connect over fast lan, netstat -s > file1, connect over slow lan, netstat -s > file2, diff file1 file2, check for errors in all layers.

4. Check settings on server.
/etc/resolv.conf and /etc/nsswitch.conf

5. ndd -get /dev/tcp ?
ndd -get /dev/ip ?

Then check /etc/rc.config.d/nddconf

6. You said you had backed up the netconf file - make sure it isn't still in the rc.config.d directory or it will be executed along with the real one - maybe in the wrong order.

7. traceroute back from server to client as well as from client to server.

8. netstat -rn (on a PC use ROUTE PRINT) to make sure your route to the slow IP range goes the right way.

9. I have encountered network socket saturation on database queries through VLANs which caused >30 second delays to queries. These queries went a lot quicker by load balancing database access over several interface cards and also using shared memory connections for server based batch processes.
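The before/after comparison in step 3 can be sketched in Python as below. This is only an illustration: it assumes `netstat -s` output in the common layout of an unindented section header such as `tcp:` followed by indented `<count> <description>` lines, and the real HP-UX output may differ. The helper names are made up.

```python
import re

# An indented line that starts with a number, e.g. "\t12 packets sent".
COUNTER_LINE = re.compile(r"^\s+(\d+)\s+(.*\S)\s*$")


def parse_counters(text: str) -> dict:
    """Map (section, description) -> count for each counter line."""
    counters = {}
    section = ""
    for line in text.splitlines():
        match = COUNTER_LINE.match(line)
        if match:
            counters[(section, match.group(2))] = int(match.group(1))
        elif line.strip().endswith(":"):
            # Assumed section header such as "tcp:" or "ip:".
            section = line.strip().rstrip(":")
    return counters


def changed_counters(before: str, after: str) -> dict:
    """Return the increase of every counter that differs between snapshots."""
    old, new = parse_counters(before), parse_counters(after)
    return {
        key: value - old.get(key, 0)
        for key, value in new.items()
        if value != old.get(key, 0)
    }


if __name__ == "__main__":
    # file1 and file2 are the snapshot names used in step 3 above.
    with open("file1") as f1, open("file2") as f2:
        deltas = changed_counters(f1.read(), f2.read())
    for (section, name), delta in sorted(deltas.items()):
        print(f"{section}: {name}: +{delta}")
```

Because the counters are cumulative since boot, looking at the deltas rather than a raw diff makes it easier to spot an error counter that only moves during the slow test.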


Bill Douglass
Esteemed Contributor

Re: class C network

Definitely take a look at the TCP traffic between the host and client. You can capture the traffic using tcpdump:

http://hpux.cs.utah.edu/hppd/hpux/Networking/Admin/tcpdump-3.7.2/


which also requires libpcap


http://hpux.cs.utah.edu/hppd/hpux/Networking/Admin/libpcap-0.7.2/

Once installed, do something like:

tcpdump -v -i <interface>

to get all traffic coming into or out of your server, and

tcpdump -v -i <interface> host <clientip>

to see the traffic between your client and the host. You should be able to get an idea of where the conversation is getting bogged down.

HP-UX also includes nettl, which I have not used, so I can't offer advice on it.
Ray Aslin
Occasional Contributor

Re: class C network

Yesterday we had the VAR come on site. After some sniffing and mirroring the ports on the switch, we found out what was going on. Most of this is admittedly over my head, but I thought you all would like to know.

The server was transmitting packets with bad/missing checksums. We had tried another of the same network card and had the same problem, so we tried a different type of 10/100/1000Base-T card and the problem stopped occurring. So it's either something intrinsic in the NIC or something in the driver. HP came out with a different 10/100/1000 NIC and is taking a look at the previous NIC and the captures we did. The problem is believed to be 'data specific', and not one of us can imagine why it would manifest itself so much more prominently on one IP address as opposed to another.
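For anyone wondering what a 'bad checksum' means in practice, here is a minimal Python sketch of the standard Internet checksum (RFC 1071) that IP, TCP, and UDP all use. The thread does not say which layer's checksum was bad, so this checks an IPv4 header only as an illustration; the sample header is a generic textbook example, not taken from our captures, and the function names are made up.

```python
def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit words, then complemented (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def header_is_valid(header: bytes) -> bool:
    """A header whose checksum field is correct sums to zero overall."""
    return internet_checksum(header) == 0


# Illustrative 20-byte IPv4 header (192.168.0.1 -> 192.168.0.199, UDP);
# the checksum field is bytes 10-11.
SAMPLE_HEADER = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")

if __name__ == "__main__":
    print("valid header:", header_is_valid(SAMPLE_HEADER))
    corrupted = SAMPLE_HEADER[:-1] + b"\xc6"  # change one address byte
    print("corrupted header:", header_is_valid(corrupted))
```

The checksum depends on every byte it covers, and the TCP and UDP checksums also cover the source and destination IP addresses. That is at least consistent with a 'data specific' fault showing up on one address range and not another, although nothing in the captures we saw confirms that this is the actual cause.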

We do have another identical server, with the same NIC, running a different DB application. That one is working fine.

The NIC was an HP A6794-60001 PCI, and the application is McKesson HBOC Caremanager.

So, hopefully no other organizations will experience this same scenario, but if you do, there you have it.

A thank you to everyone who took time out of their day to give advice. It was much appreciated.