
SAP startsap of AS failing after patching

Ralph Grothe
Honored Contributor

Hi,

Last night I patched several SAP DB and AS servers running HP-UX 11i.
I installed the GOLD* patch bundles from the Support Plus CD-ROM of June 2004, as well as the ONC and Streams cumulative patches together with ENHAUTOFS.
On the other 7 SAP systems (DB as well as AS) everything works fine.
Thus I don't believe it has anything to do with my patching and enabling of AutoFS/automount.
Only on this one application server does SAP fail to start correctly, although startsap claims it started fine.
The only hint I've got is this excerpt from the startsap.log:


Execute Pre-Startup Commands
----------------------------
(23434) Local: /usr/sap/Z01/SYS/exe/run/sapmscsa -n pf=/usr/sap/Z01/SYS/profile/Z01_D02_don
/usr/sap/Z01/SYS/exe/run/sapmscsa: SCSA is attached and useable.
/usr/sap/Z01/SYS/exe/run/sapmscsa: finished.
(23434) Local: ln -s -f /usr/sap/Z01/SYS/exe/run/disp+work dw.sapZ01_D02
(23434) Local: ln -s -f /usr/sap/Z01/SYS/exe/run/rslgsend se.sapZ01_D02

Starting Programs
-----------------
/home/z01adm/startsap_don_02.log
Starting: local
dw.sapZ01_D02 pf=/usr/sap/Z01/SYS/profile/Z01_D02_don
(23445) Starting: local se.sapZ01_D02 -F pf=/usr/sap/Z01/SYS/profile/Z01_D02_don
(23434) Waiting for Child Processes to terminate.
(23434) **** 2004/10/13 09:22:05 Child 23444 died due to Signal 11 (core dumped). ****


I did find the core dump of the child that caught the SIGSEGV, but I don't know how to analyse it.
Does anyone have some adb tricks for an autopsy of the core file?
Unfortunately this logfile is my only source of hints, and the SAP DBAs aren't of much help because they cannot connect to the partly started instance, where they could use their SAP GUI tools.
Are there any other SAP logfiles that could give further hints?

During the startsap I also noticed a message that the socket on TCP port 3299 was already in use.
From /etc/services I found that the related service is sapdp99.
But the SAP DBAs assured me that this message is normal and can be ignored.
After installing lsof I checked which processes were using this socket:

[root@don(Z01):/root]
# /usr/local/sbin/lsof -i:3299|sed 's/>.*EST/> blanked for privacy EST/'
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
saprouter 2239 z01adm 4u inet 0x8e1d4800 0t0 TCP *:sapdp99 (LISTEN)
saprouter 2239 z01adm 5u inet 0x9b3470c0 0t177750 TCP don:sapdp99-> blanked for privacy ESTABLISHED)
saprouter 2239 z01adm 6u inet 0x9b3473c0 0t17962 TCP don:50141-> blanked for privacy ESTABLISHED)
saprouter 2239 z01adm 7u inet 0x8e1d4680 0t603224 TCP don:sapdp99-> blanked for privacy ESTABLISHED)
saprouter 2239 z01adm 8u inet 0x9b347540 0t39576 TCP don:50293-> blanked for privacy ESTABLISHED)
saprouter 2239 z01adm 9u inet 0x9b3476c0 0t214786 TCP don:sapdp99-> blanked for privacy ESTABLISHED)
saprouter 2239 z01adm 10u inet 0x9b347240 0t15835 TCP don:50305-> blanked for privacy ESTABLISHED)

As can be seen, the saprouter already handles a few client connections.

As I don't know where else to search (given my lack of SAP knowledge) I will reboot in the hope that everything will somehow automagically recover
(i.e. sysadmin desperation reboot ;-)

Regards
Ralph
Madness, thy name is system administration
3 REPLIES
Ralph Grothe
Honored Contributor

Re: SAP startsap of AS failing after patching

I found it!!! :-)

After running "startsap ALL" under tusc (with -f to follow forks) I was able to find the spot in the trace output where the SIGSEGV was caught.
It happened right when a shmget() syscall was issued.
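
For anyone repeating this, here is a minimal sketch of how the relevant spot in a large trace could be located, assuming the tusc output was saved to a file. The helper name context_before and the file name are made up for illustration, and the exact wording of the signal line in the trace is an assumption, so adjust the pattern to what your trace actually contains.

```shell
#!/bin/sh
# context_before: hypothetical helper, prints every line of FILE matching
# PATTERN together with up to N lines that precede it.
# Plain awk is used because grep's -B option is not available everywhere.
context_before() {
    pattern=$1 n=$2 file=$3
    awk -v pat="$pattern" -v n="$n" '
        $0 ~ pat {
            # print the buffered lines that came just before the match
            for (i = NR - n; i < NR; i++)
                if (i in buf) print buf[i]
            print
        }
        # remember the current line, forget the one that fell out of range
        { buf[NR] = $0; delete buf[NR - n] }
    ' "$file"
}

# Example call (the file name is a placeholder):
#   context_before SIGSEGV 5 /tmp/startsap.tusc
```

The last syscall shown before the signal line is the one worth a closer look.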
That made me look over the kernel shared memory tunables again.
So I compared them with a working SAP box and found that my shmmax was too small by a factor of 256:

[root@aller(Z01):/var/spool/sockets/pwgr]
# kmtune|grep shm
core_addshmem_read 0 Y 0
core_addshmem_write 0 Y 0
shmem 1 - 1
shmmax 17179869184 Y 17179869184
shmmni 256 - 256
shmseg 200 Y 200

[root@aller(Z01):/var/spool/sockets/pwgr]
# echo "obase=16;17179869184"|bc
400000000
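
The factor of 256 can be double-checked with plain shell arithmetic. This is a small sketch, assuming a shell with 64-bit arithmetic expansion; the function name shm_ratio is made up for illustration, and the two values are the ones quoted in this thread (the working box aller and the failing box don).

```shell
#!/bin/sh
# shm_ratio: hypothetical helper, prints how many times larger $1 is than $2.
# Decimal and 0x-prefixed hex are both accepted by the arithmetic expansion.
shm_ratio() {
    echo $(( $1 / $2 ))
}

good=17179869184   # shmmax on the working box (aller)
bad=0x4000000      # shmmax on the failing box (don) before the change

printf 'working: %d bytes, failing: %d bytes\n' "$good" "$bad"
shm_ratio "$good" "$bad"
```

This prints the two values in decimal (17179869184 and 67108864 bytes, i.e. 16 GB versus 64 MB) followed by the ratio 256.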

So on my box I raised this tunable from its previous value, shown below (luckily it is one of the very few dynamic tunables that don't require yet another reboot):

[root@don(Z01):/root]
# kmtune -q shmmax
Parameter Current Dyn Planned Module Version
===============================================================================
shmmax 0x4000000 Y 0X4000000


Then R/3 came up as expected. :-)

Now I wonder what needs to be done to make this dynamic tunable setting permanent, i.e. persistent across reboots?
I'm sure I will have forgotten what needed to be done by the next time the box reboots.
Madness, thy name is system administration
Ralph Grothe
Honored Contributor

Re: SAP startsap of AS failing after patching

Forgot to mention,
this was how I raised shmmax:

[root@don(Z01):/root]
# kmtune -u -s shmmax=0x400000000
The kernel's value of shmmax has been set to 17179869184 (0x400000000).

Does this suffice to make it persist across reboots, or is something more required?
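
One way to check this from the command line is to compare the Current and Planned columns of the "kmtune -q" output. The following is only a sketch: it assumes the column layout shown earlier in this thread, it assumes that the Planned column reflects the value used at the next boot (as the column name suggests), and the helper name tunable_pending is made up for illustration. It will not cope with a planned value given as a formula rather than a number.

```shell
#!/bin/sh
# tunable_pending: hypothetical helper. Reads `kmtune -q NAME` output on
# stdin and reports whether Current (column 2) and Planned (column 4)
# hold the same number. Returns 0 if they agree, 1 if they differ.
tunable_pending() {
    name=$1
    # pick the data line for this tunable, skip header and separator lines
    set -- $(awk -v p="$name" '$1 == p { print $2, $4 }')
    [ $# -eq 2 ] || { echo "no data for $name" >&2; return 2; }
    # arithmetic expansion normalises decimal, 0x and 0X notation
    if [ $(( $1 )) -eq $(( $2 )) ]; then
        echo "$name: current and planned agree ($(( $1 )))"
    else
        echo "$name: current $(( $1 )) differs from planned $(( $2 ))"
        return 1
    fi
}

# Usage on the box itself:
#   kmtune -q shmmax | tunable_pending shmmax
```

If the two values differ, the running kernel and the planned configuration are out of step, and the planned value is the one to look at before the next reboot.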
Madness, thy name is system administration
Eric Antunes
Honored Contributor

Re: SAP startsap of AS failing after patching

Hi Ralph,

You can check in SAM whether shmmax will keep your value after the next boot: it shows both the "Current Value" and the "Pending Value"...

Best Regards,

Eric Antunes
Each and every day is a good day to learn.