Apache performance on HPUX 11.00

HPP · ‎07-17-2002

Hi all,
I have installed Apache 2.0.39 on HP-UX 11.00 (32 bit). The system is a K370 with 2 GB memory. I have also installed the patches (Quality Pack March 2002) for HP-UX 11.00. I have a performance issue: the system load goes very high, around 14, and Apache starts responding very slowly, almost as if the website is down. Swap usage is at 5% during peak load, and the kernel parameters are all set to the maximums recommended for Apache (based on documents I found on the HP website).
I found a script on the Apache website that summarizes netstat status. Below is the output of that script at a system load of 7.5:
------------------------
# ./apache_socket.txt
LISTEN 25 ##
FIN_WAIT_1 13 #
FIN_WAIT_2 57 ####
LAST_ACK 3 #
CLOSING 4 #
CLOSE_WAIT 9 #
SYN_RCVD 5 #
TIME_WAIT 903 ############################################################
ESTABLISHED 62 #####
-------------------------------------------------------------------------------
TOTAL 1081
-----------------------------
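
For readers without that script, a tally like the one above can be approximated with a few lines of Python. This is only a sketch, not the script from the Apache website: it assumes `netstat -an` style lines in which the first column is the protocol and the last column is the TCP state, and the sample addresses are made up for illustration.

```python
import math
from collections import Counter


def tally_tcp_states(netstat_output: str) -> Counter:
    """Count TCP connection states in `netstat -an` style output."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumption: TCP lines start with "tcp" and end with the state name.
        if len(fields) >= 6 and fields[0].lower().startswith("tcp"):
            counts[fields[-1]] += 1
    return counts


def render(counts: Counter, width: int = 60) -> str:
    """Draw one bar per state, scaled so the largest state is `width` wide."""
    if not counts:
        return "TOTAL 0"
    peak = max(counts.values())
    lines = []
    for state, n in counts.most_common():
        bar = "#" * math.ceil(n * width / peak)
        lines.append(f"{state} {n} {bar}")
    lines.append(f"TOTAL {sum(counts.values())}")
    return "\n".join(lines)


# Hypothetical sample input; real input would come from `netstat -an`.
SAMPLE = """\
tcp        0      0  10.0.0.5.80      10.0.0.9.4411    TIME_WAIT
tcp        0      0  10.0.0.5.80      10.0.0.9.4412    TIME_WAIT
tcp        0      0  10.0.0.5.80      10.0.0.7.1201    ESTABLISHED
tcp        0      0  *.80             *.*              LISTEN
"""

print(render(tally_tcp_states(SAMPLE)))
```
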

Here is the list of httpd processes. The list keeps growing and the load goes high. The system has 2 GB memory.
# ps -ef|grep http
htpadm 5410 5896 0 15:19:50 ? 5:18 /opt/hpapache2/bin/httpd -d /opt/hpapache2
htpadm 15489 5896 0 10:12:58 ? 144:11 /opt/hpapache2/bin/httpd -d /opt/hpapache2
htpadm 6681 5896 0 15:55:37 ? 0:02 /opt/hpapache2/bin/httpd -d /opt/hpapache2
htpadm 23099 5896 0 12:10:19 ? 59:13 /opt/hpapache2/bin/httpd -d /opt/hpapache2
htpadm 6686 5896 0 15:57:00 ? 0:08 /opt/hpapache2/bin/httpd -d /opt/hpapache2
htpadm 6674 5896 0 15:48:22 ? 0:42 /opt/hpapache2/bin/httpd -d /opt/hpapache2
htpadm 21106 5896 0 11:38:23 ? 69:04 /opt/hpapache2/bin/httpd -d /opt/hpapache2
htpadm 13321 5896 0 09:56:56 ? 143:31 /opt/hpapache2/bin/httpd -d /opt/hpapache2
htpadm 6685 5896 0 15:56:56 ? 0:07 /opt/hpapache2/bin/httpd -d /opt/hpapache2
htpadm 5897 5896 0 08:07:46 ? 0:00 /opt/hpapache2/bin/httpd -d /opt/hpapache2
root 7864 13103 1 16:01:12 pts/ta 0:00 grep http
htpadm 27531 5896 0 13:18:16 ? 33:47 /opt/hpapache2/bin/httpd -d /opt/hpapache2
root 5896 1 0 08:07:45 ? 0:04 /opt/hpapache2/bin/httpd -d /opt/hpapache2
htpadm 6687 5896 0 15:58:45 ? 0:03 /opt/hpapache2/bin/httpd -d /opt/hpapache2
htpadm 6675 5896 0 15:49:33 ? 1:18 /opt/hpapache2/bin/httpd -d /opt/hpapache2
htpadm 28990 5896 0 14:00:19 ? 20:17 /opt/hpapache2/bin/httpd -d /opt/hpapache2
htpadm 6684 5896 0 15:56:56 ? 0:09 /opt/hpapache2/bin/httpd -d /opt/hpapache2
htpadm 6682 5896 0 15:56:16 ? 0:04 /opt/hpapache2/bin/httpd -d /opt/hpapache2
htpadm 20262 5896 0 11:23:47 ? 82:37 /opt/hpapache2/bin/httpd -d /opt/hpapache2
htpadm 28733 5896 0 13:42:36 ? 27:17 /opt/hpapache2/bin/httpd -d /opt/hpapache2
htpadm 6688 5896 0 15:59:45 ? 0:02 /opt/hpapache2/bin/httpd -d /opt/hpapache2

What I observed from netstat is that the number of TIME_WAIT connections is very high, and HP recommends not changing tcp_time_wait_interval with ndd. I have not touched any other TCP parameters using ndd.

Any thoughts on the performance issue described above will be appreciated.

Note: Apache is the one and the only application running on this server.

Thanks in advance.

Be Teachable

Mark Greene_1 · ‎07-17-2002

What is the NIC set to, and how does this compare to the switch/router to which it is attached? You can use lanadmin to check this: select "lan" and then "display" to see the current values used by the card.

It sounds like you may have a speed mismatch with the two. Set the NIC and the network device to 100 half (if not there already) and see if that makes a difference.

HTH
mark

the future will be a lot like now, only later

HPP · ‎07-17-2002

Mark,
Thanks for the quick response. Below is the output of lanadmin->lan->display
-----------------------------
LAN INTERFACE STATUS DISPLAY
Wed, Jul 17,2002 09:19:36

PPA Number = 0
Description = lan0 Hewlett-Packard 10/100Base-TX Full-Duplex Manu Hw Rev 0
Type (value) = ethernet-csmacd(6)
MTU Size = 1500
Speed = 100000000
Station Address = 0x00108396d09b
Administration Status (value) = up(1)
Operation Status (value) = up(1)
Last Change = 6359
Inbound Octets = 232134943
Inbound Unicast Packets = 52462632
Inbound Non-Unicast Packets = 3821284
Inbound Discards = 3485
Inbound Errors = 0
Inbound Unknown Protocols = 89611
Outbound Octets = 994852973
Outbound Unicast Packets = 54309745
Outbound Non-Unicast Packets = 3485
Outbound Discards = 862
Outbound Errors = 0
Outbound Queue Length = 0
Specific = 655367
---------------------------

The system is connected to a 10/100 switch. The switch port is set to auto-detect the speed, and it is currently at 100 full duplex.

Thanks

Be Teachable

HPP · ‎07-17-2002

Hi All,
Any help????

Thanks

Be Teachable

Mark Greene_1 · ‎07-17-2002

Have you done any general performance analysis? What does vmstat, sar, glance, etc. report? Also, what is your disk configuration? You may be getting I/O bound. Also, to try one more thing at the network level, what does netstat -s report in the way of errors?

mark

the future will be a lot like now, only later

HPP · ‎07-17-2002

Mark,
Below is the output of "netstat -s"

--------------------------
# netstat -s
tcp:
38089616 packets sent
35986074 data packets (908951240 bytes)
603576 data packets (561610119 bytes) retransmitted
2055547 ack-only packets (39611 delayed)
0 URG only packets
19077 window probe packets
0 window update packets
5333460 control packets
37624438 packets received
23000437 acks (for 4105703446 bytes)
2939880 duplicate acks
15 acks for unsent data
8158566 packets (2554029091 bytes) received in-sequence
11 completely duplicate packets (2496 bytes)
14 packets with some dup, data (15989 bytes duped)
4046 out of order packets (1356523 bytes)
116 packets (2048953902 bytes) of data after window
109 window probes
7723016 window update packets
29779 packets received after close
4922 segments discarded for bad checksum
0 bad TCP segments dropped due to state change
5078 connection requests
2688783 connection accepts
2693861 connections established (including accepts)
3180621 connections closed (including 486814 drops)
30008 embryonic connections dropped
20348447 segments updated rtt (of 20348447 attempts)
365728 retransmit timeouts
5207 connections dropped by rexmit timeout
19077 persist timeouts
180757 keepalive timeouts
180757 keepalive probes sent
1263 connections dropped by keepalive
14088 connect requests dropped due to full queue
6398 connect requests dropped due to no listener
udp:
0 incomplete headers
0 bad checksums
0 socket overflows
ip:
55377830 total packets received
0 bad IP headers
823073 fragments received
11 fragments dropped (dup or out of space)
60 fragments dropped after timeout
0 packets forwarded
0 packets not forwardable
icmp:
435 calls to generate an ICMP error message
0 ICMP messages dropped
Output histogram:
echo reply: 46
destination unreachable: 389
source quench: 0
routing redirect: 0
echo: 0
time exceeded: 0
parameter problem: 0
time stamp: 0
time stamp reply: 0
address mask request: 0
address mask reply: 0
0 bad ICMP messages
Input histogram:
echo reply: 2480
destination unreachable: 7589
source quench: 0
routing redirect: 0
echo: 46
time exceeded: 3999
parameter problem: 0
time stamp request: 0
time stamp reply: 0
address mask request: 0
address mask reply: 0
46 responses sent
igmp:
0 messages received
0 messages received with too few bytes
0 messages received with bad checksum
0 membership queries received
0 membership queries received with incorrect fields(s)
0 membership reports received
0 membership reports received with incorrect field(s)
0 membership reports received for groups to which this host belongs
0 membership reports sent

------------------------

I see no errors in the netstat -s output. The content for the website is served from the Network Appliance (NetApp). NetApp claims that reading from it is faster than reading from local disk. We have a couple of other big websites and we don't have any problems with them.
We were serving the same content from a D200 running HP-UX 10.20 with Apache 1.3.9 and 512 MB memory. We did not have these load increases on the old server. Now that we have migrated the website to the new server with the latest Apache, we are facing this problem.
We get 1.5 million hits per day on this website.

Thanks for the quick response.

Be Teachable

Mark Greene_1 · ‎07-17-2002

The number of duplicate packets in the tcp section *could* indicate a network issue; however, given this number is so low:

ip:
823073 fragments received

compared to the 55 million total packets, I'd say that you most likely don't have a network issue, and the tcp numbers point to some sort of bottleneck in the system, leading to a delayed response. This in turn generates a bunch of ack requests from the clients to see if the system is still there, which adds to the perceived slowness.

Do you have a server-side application running that may be single-threading all of the httpd sessions through a lock or config file of some sort?

What are the kernel parameters dbc_max_pct, dbc_min_pct, and shmmin set to?

And maybe try running iostat to see if any one drive is getting hit much more often than the others.

mark

the future will be a lot like now, only later

HPP · ‎07-17-2002

Mark,
dbc_max_pct --> 25
dbc_min_pct --> 5
shmmin --> 64

System has only one disk 32 GB, and below is the output of "iostat 10" and "iostat 2 10"

# iostat 10

device bps sps msps
c0t6d0 0 0.0 1.0
c0t6d0 2 0.2 1.0
c0t6d0 11 0.9 1.0
c0t6d0 0 0.0 1.0
c0t6d0 20 2.1 1.0
c0t6d0 33 6.9 1.0
c0t6d0 11 0.8 1.0
c0t6d0 12 1.1 1.0
c0t6d0 3 0.5 1.0
c0t6d0 8 0.4 1.0
c0t6d0 6 1.0 1.0
c0t6d0 13 1.0 1.0
c0t6d0 10 0.5 1.0
c0t6d0 10 0.4 1.0
c0t6d0 1 0.1 1.0
c0t6d0 9 0.7 1.0
c0t6d0 15 1.2 1.0
c0t6d0 4 0.5 1.0
c0t6d0 10 0.7 1.0

# iostat 2 10

device bps sps msps
c0t6d0 0 0.0 1.0
c0t6d0 39 1.5 1.0
c0t6d0 12 1.0 1.0
c0t6d0 0 0.0 1.0
c0t6d0 4 0.5 1.0
c0t6d0 6 2.0 1.0
c0t6d0 3 0.5 1.0
c0t6d0 5 1.0 1.0
c0t6d0 0 0.0 1.0
c0t6d0 25 7.0 1.0

---------------------------

Thanks again.

Be Teachable

Mark Greene_1 · ‎07-17-2002

My mistake, the parameter you want is SHMMNI. Make sure this is set to at least 512. You can change it in SAM if it is not. You can probably go to 1024 if you want--SAM may even recommend that.

mark

the future will be a lot like now, only later

rick jones · ‎07-18-2002

a couple of things:

900 TIME_WAIT states is nothing. servers today can handle many orders of magnitude more than that without breaking a sweat.

the inbound discards statistic implies that there are times when we are not sucking packets off the NIC fast enough, or perhaps dropping them somewhere else, but the number of them is small enough that it would be at most a tertiary issue. there was supposed to be a second block of stats - things like FCS errors and the like - that would have been good to see just on principle.

the tcp stats imply that your system is having to retransmit something like 1.7% of its segments. while that is not great, it is not abysmal.

since the TCP retransmit timeout statistic is ~1/2 the number of TCP retransmissions, roughly half the retransmissions are the "fast retransmit" type and the other half are based on timeouts.
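
The two ratios quoted above can be checked against the counters in the netstat -s output posted earlier in the thread. The sketch below assumes, as the reasoning above does, that each retransmit timeout accounts for one retransmitted data packet; with these counters the timeout share comes out nearer 0.6 than exactly one half, which does not change the conclusion.

```python
# Counters copied from the `netstat -s` output posted earlier in the thread.
data_packets_sent = 35_986_074
data_packets_retransmitted = 603_576
retransmit_timeouts = 365_728

# Fraction of data packets that had to be sent again.
retransmit_rate = data_packets_retransmitted / data_packets_sent

# Assumption: one retransmitted packet per retransmit timeout, so the
# remainder is attributed to fast retransmit.
timeout_share = retransmit_timeouts / data_packets_retransmitted
fast_retransmit_share = 1.0 - timeout_share

print(f"retransmitted:   {retransmit_rate:.1%} of data packets")
print(f"timeout-driven:  {timeout_share:.0%} of retransmissions")
print(f"fast retransmit: {fast_retransmit_share:.0%} of retransmissions")
```
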

the duplicate ACKs are related to the fast retransmits, but explaining fast rtx is more than I feel like doing right now :)

given that there were virtually no dropped fragments, i doubt that fragmentation of TCP segments is related to the duplicate ACKs or even the retransmissions. it does suggest though (if your traffic is mostly TCP) that some of the folks trying to reach your system are not using PathMTU discovery, but still using a largeish TCP MSS - so their traffic is being fragmented.

the 14000 connection requests dropped due to full queue indicate that one or more of the listen endpoints is not setting a large-enough listen backlog. you might want to first make sure that you use ndd (and edit /etc/rc.config.d/nddconf) to set tcp_conn_request_max to something other than 20 - I would try 1024. then you need to make sure that the Apache bits are setting the listen queue to something similar.
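
To illustrate the two halves of that advice: an application asks for a listen queue depth when it calls listen(), and the kernel caps the request at its own limit (tcp_conn_request_max on HP-UX). The Python sketch below shows only the application side; the function name, loopback address and ephemeral port are illustrative choices, not anything taken from Apache. In Apache itself the requested depth is set with the ListenBacklog directive rather than in code.

```python
import socket


def open_listener(host: str = "127.0.0.1", port: int = 0,
                  backlog: int = 1024) -> socket.socket:
    """Open a TCP listening socket that requests `backlog` queued connections."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    # The kernel silently caps this value at its own maximum, so raising it
    # here has no effect unless the system-wide limit is raised as well.
    sock.listen(backlog)
    return sock


listener = open_listener()
print("listening on port", listener.getsockname()[1])
listener.close()
```

If the kernel limit stays at 20, a request for 1024 is still truncated to 20, which is why both settings need to change together.
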

it is going to sound flippant, but perhaps the easiest way to improve the performance of a web server running Apache is to migrate to Zeus :)

anyhow, you need/want to see where CPU time is being spent. i would suggest getting glance (or a trial version) and taking a look at the user/kernel split. if most of the time is user, then grabbing a copy of prospect from ftp.cup.hp.com or the developer portal would be a good thing - you then profile Apache and see where it is wasting the CPU cycles.

Apache 2.mumble is threaded isn't it? When you look at the system calls in glance (or tusc from ftp.cup.hp.com) are there a lot of ksleep/kwakeup calls?

there is no rest for the wicked yet the virtuous have no pillows

HPP · ‎07-19-2002

Hi,
I installed the trial GlancePlus. I see that CPU is always at 99 or 100% usage. The system has two CPUs, and both are as high as 99 or 100%. Also, SysCall CPU % for both CPUs is at 70 to 72%.
The ksleep and kwakeup syscall counts are 250 and 260 respectively. The SysCall rate for both is 18.
One more thing I observed in the system calls is the sendfile system call, which is at 1700000.
Disk utilization is at 8%, mem utilization is at 36% and swap utilization is at 12%. All top ten processes in the process list are httpd. What would the experts advise?

Thanks again for the quick response.

Be Teachable

Stefan Farrelly · ‎07-19-2002

There is no way you can be running mem usage at 36% and swap usage at 12%. If swap > 0% used then you have NO memory left. Run vmstat and see what the total in the free column is, multiply it by 4096 and that's how much memory you have free - I bet it's a small number.

You've got to reduce memory pressure for one. Try closing down some processes, don't run so many httpd processes, and modify the kernel cache values in /stand/system to:
dbc_max_pct 2
dbc_min_pct 0

Rebuild the kernel and reboot. Keep a close eye on vmstat (free) free memory total as your server boots up and users get on. Your objective is to stop it getting to near 0.

I've also heard of problems running network cards at 100 FD (full duplex) - I remember Bill Hassell saying 100 HD is best, and set it the same on your switch/hub end (NOT autonegotiate).

That will do for starters.

Im from Palmerston North, New Zealand, but somehow ended up in London...

HPP · ‎07-19-2002

Stefan,
Thanks for your response. I gave what I saw for mem and swap utilization in Glance. Right now I am seeing mem at 36% and swap util at 13%. Also, vmstat gave free mem 333958, and multiplied by 4096 that is about 1.3 GB.
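
The conversion can be written out as below. It is a sketch that assumes the vmstat free column counts 4096-byte pages (the multiplier suggested above) and that the 2 GB of RAM is 2 x 2^30 bytes. Under those assumptions the free-page count and the memory utilization figure from Glance agree with each other.

```python
PAGE_SIZE_BYTES = 4096        # multiplier suggested earlier in the thread
free_pages = 333_958          # vmstat "free" column, as reported above
total_ram_bytes = 2 * 2**30   # assumption: "2 GB" means 2 GiB

free_bytes = free_pages * PAGE_SIZE_BYTES
used_pct = 100 * (1 - free_bytes / total_ram_bytes)

print(f"free: {free_bytes} bytes = {free_bytes / 2**30:.2f} GiB")
print(f"used: {used_pct:.0f}% of RAM")
```
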

Thanks

Be Teachable

rick jones · ‎07-19-2002

perhaps what is being recounted as swap usage is actually swap reservation. paging isn't my area of expertise, but if a system were paging (swapping) i wonder if it could really maintain 100% CPU utilization.

is setting dbc_min_pct to 0 really a good thing? there can indeed be problems (heard Nth hand) if the buffer cache is too small...

The system calls and rates are good to know, but they need to be in the complete context - all the system calls being made by an httpd process, not just one or two of them.

it may be goodness to take a tusc trace of one of the httpds. i'd grab rev 7.3 of tusc (ftp.cup.hp.com) and select one of the httpds, then take a verbose tusc trace and send that to a file and post it here. not something terribly huge please :) keep it < 1000 lines.

is your typical content large or small? the "breakeven" point for sendfile() versus send() is somewhere in the neighborhood of 1024 bytes.

there is no rest for the wicked yet the virtuous have no pillows

Bill Hassell · ‎07-19-2002

A couple of things: If the NIC and the switch are both capable of 100/full-duplex, then always use 100FD. Long LAN cables (something like 40 meters) can cause autonegotiation to fail but in your case, lanadmin -x reports 100 FD manual which is fine.

Swap space can indeed be used (look at swapinfo -tam) even when there is plenty of memory. This is likely due to memory mapped files and not real swapping. With Glance (or vmstat), look at the page-out stats. Zero page-outs over a busy period means that there is no swapping and you have plenty of RAM.

As mentioned, check the dbc_max_pct value...if it is 50% (or Glance reports that you are using more than 500 megs of buffer cache), then you need to reduce that value to about 300-500 megs. The K370 is a medium speed server so increasing the buffer cache beyond about 500 megs will introduce some overhead that you don't need. A vast improvement in buffer cache performance will be seen by going to 11.11 version of HP-UX.
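
Since dbc_max_pct is a percentage of physical memory, the ceiling it implies on this system can be worked out directly. The sketch below uses the 2 GB of RAM stated in the original post; the function name is only illustrative. The value of 25 reported earlier in the thread works out to 512 MB, right at the top of the range suggested above.

```python
def buffer_cache_ceiling_mb(ram_mb: float, dbc_max_pct: float) -> float:
    """Upper bound on the dynamic buffer cache implied by dbc_max_pct."""
    return ram_mb * dbc_max_pct / 100.0


RAM_MB = 2048  # 2 GB, as stated in the original post

# 50 is the value warned about above, 25 is the value reported earlier
# in the thread, and 2 is the value proposed in an earlier reply.
for pct in (50, 25, 2):
    ceiling = buffer_cache_ceiling_mb(RAM_MB, pct)
    print(f"dbc_max_pct={pct}: up to {ceiling:.0f} MB of buffer cache")
```
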

Bill Hassell, sysadmin

Christopher Caldwell · ‎07-19-2002

I've tried the Apache 2.0.x (HP-UX 11.0) versions - both the HP downloadable version and a custom compiled - with the thread model and the "traditional" model.

Under "normal" situations, the server ran just fine. In fact, when we used Apache in a custom configuration as an app server, we set a couple of internal performance benchmarks with our application.

When we really stressed Apache using a load testing tool, Apache would run great up until the point when the load test stopped (no requests). Apache would then take over the box and drive performance into the ground.

We were running the test on a 2X440 A class box that isn't used for anything else.

I haven't gone back yet to look at what was happening. I seem to recall a large number of sendfile calls in the system calls report in Glance.

We don't see the problem when Apache isn't under stress. (BTW, for high traffic, we normally use Netscape/iPlanet).

I've seen what you're talking about, and I can't yet tell you what to do.

You do want to take Rick Jones' suggestion and increase the tcp listen queue. 20 is way too short for a modern host on a high traffic web server.

BTW, we haven't run the performance benchmark since adjusting the listen queue - so that may be part of the problem.

HPP · ‎07-22-2002

Rick Jones,
I have attached the output of tusc, which I ran on the top two httpd processes. The attached file has the first 50 lines for each process.

Thanks

Be Teachable

rick jones · ‎07-26-2002

from the tusc output it would appear that apache 2 is indeed running multi-threaded. i base that on seeing multiple distinct system calls in the sleeping state on startup.

adding a -l to the tusc would be goodness in such cases - it adds the lwpid to the output so you can see which thread is doing what (tusc 7.3 can even be told to just report for a subset of the threads, but that is beyond current needs).

i would initially guess that the apache 2 was configured like an older apache 1 - lots of processes - each of those processes will have lots of threads, perhaps some thread issues - mutex contention or something else - however, having only the first 50 lines isn't sufficient to see - it appears that there may be nearly that many threads per process.

there is no rest for the wicked yet the virtuous have no pillows
