StoreEver Tape Storage


 
blokemann
Occasional Advisor

HP DPX 4.0 1 agent (linux 64-bit) keeps failing during backup

Hi

 

I need some advice on the best course of action with a frustrating issue.

 

We have Data Protector Express 4.00-sp1 - 56906 running on two Suse Linux Enterprise Server (SLES) v10 servers.

 

The domain server is running on 32 bit SLES 10 SP1, OES 2

while the remote agent server is running on 64 bit SLES 10 SP3, OES 2 SP2

 

I have run test backups during the day and they have worked, so I am not sure why the 2100hrs scheduled backup has the issue.

 

Basically the backup of the remote agents starts, files are counted and file backup commences but then it dies.

 

/var/log/messages shows the following for the job that was scheduled to run at 2100hrs.

The first entry and this one look interesting:

Sep 7 21:11:46 srv2 dplinsdr: *** glibc detected *** /usr/local/hp/dpx/lin/x86_64/dplinsdr: double free or corruption (!prev): 0x00002aaaaca008d0 ***

 

------------------------------------------------------------------------------------------

Sep 7 21:10:52 srv2 kernel: dplinsdr: page allocation failure. order:4, mode:0xd0

Sep 7 21:10:52 srv2 kernel:

Sep 7 21:10:52 srv2 kernel: Call Trace: <ffffffff80167964>{__alloc_pages+796} <ffffffff80182e4c>{kmem_getpages+106}

Sep 7 21:10:52 srv2 kernel: <ffffffff80184231>{fallback_alloc+275} <ffffffff80184753>{__kmalloc+179}

Sep 7 21:10:52 srv2 kernel: <ffffffff8016d1a7>{kzalloc+9} <ffffffff801a74a9>{getxattr+137}

Sep 7 21:10:52 srv2 kernel: <ffffffff80196cf4>{link_path_walk+218} <ffffffff802f1209>{__down_write+21}

Sep 7 21:10:52 srv2 kernel: <ffffffff801fee72>{__up_write+20} <ffffffff80174544>{sys_brk+244}

Sep 7 21:10:52 srv2 kernel: <ffffffff801a75cf>{sys_lgetxattr+75} <ffffffff802f1209>{__down_write+21}

Sep 7 21:10:52 srv2 kernel: <ffffffff801fee72>{__up_write+20} <ffffffff80174544>{sys_brk+244}

Sep 7 21:10:52 srv2 kernel: <ffffffff8010ae36>{system_call+126}

Sep 7 21:10:52 srv2 kernel: Mem-info:

Sep 7 21:10:52 srv2 kernel: Node 0 DMA per-cpu:

Sep 7 21:10:52 srv2 kernel: CPU 0: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0

Sep 7 21:10:52 srv2 kernel: CPU 1: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0

Sep 7 21:10:52 srv2 kernel: CPU 2: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0

Sep 7 21:10:52 srv2 kernel: CPU 3: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0

Sep 7 21:10:52 srv2 kernel: CPU 4: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0

Sep 7 21:10:52 srv2 kernel: CPU 5: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0

Sep 7 21:10:52 srv2 kernel: CPU 6: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0

Sep 7 21:10:52 srv2 kernel: CPU 7: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0

Sep 7 21:10:52 srv2 kernel: Node 0 DMA32 per-cpu:

Sep 7 21:10:52 srv2 kernel: CPU 0: Hot: hi: 186, btch: 31 usd: 100 Cold: hi: 62, btch: 15 usd: 55

Sep 7 21:10:52 srv2 kernel: CPU 1: Hot: hi: 186, btch: 31 usd: 169 Cold: hi: 62, btch: 15 usd: 11

Sep 7 21:10:52 srv2 kernel: CPU 2: Hot: hi: 186, btch: 31 usd: 173 Cold: hi: 62, btch: 15 usd: 51

Sep 7 21:10:52 srv2 kernel: CPU 3: Hot: hi: 186, btch: 31 usd: 159 Cold: hi: 62, btch: 15 usd: 54

Sep 7 21:10:52 srv2 kernel: CPU 4: Hot: hi: 186, btch: 31 usd: 179 Cold: hi: 62, btch: 15 usd: 48

Sep 7 21:10:52 srv2 kernel: CPU 5: Hot: hi: 186, btch: 31 usd: 178 Cold: hi: 62, btch: 15 usd: 12

Sep 7 21:10:52 srv2 kernel: CPU 6: Hot: hi: 186, btch: 31 usd: 155 Cold: hi: 62, btch: 15 usd: 50

Sep 7 21:10:52 srv2 kernel: CPU 7: Hot: hi: 186, btch: 31 usd: 156 Cold: hi: 62, btch: 15 usd: 51

Sep 7 21:10:52 srv2 kernel: Node 0 Normal per-cpu:

Sep 7 21:10:52 srv2 kernel: CPU 0: Hot: hi: 186, btch: 31 usd: 124 Cold: hi: 62, btch: 15 usd: 48

Sep 7 21:10:52 srv2 kernel: CPU 1: Hot: hi: 186, btch: 31 usd: 154 Cold: hi: 62, btch: 15 usd: 6

Sep 7 21:10:52 srv2 kernel: CPU 2: Hot: hi: 186, btch: 31 usd: 15 Cold: hi: 62, btch: 15 usd: 57

Sep 7 21:10:52 srv2 kernel: CPU 3: Hot: hi: 186, btch: 31 usd: 139 Cold: hi: 62, btch: 15 usd: 55

Sep 7 21:10:52 srv2 kernel: CPU 4: Hot: hi: 186, btch: 31 usd: 115 Cold: hi: 62, btch: 15 usd: 3

Sep 7 21:10:52 srv2 kernel: CPU 5: Hot: hi: 186, btch: 31 usd: 155 Cold: hi: 62, btch: 15 usd: 14

Sep 7 21:10:52 srv2 kernel: CPU 6: Hot: hi: 186, btch: 31 usd: 168 Cold: hi: 62, btch: 15 usd: 48

Sep 7 21:10:52 srv2 kernel: CPU 7: Hot: hi: 186, btch: 31 usd: 175 Cold: hi: 62, btch: 15 usd: 60

Sep 7 21:10:52 srv2 kernel: Free pages: 804448kB (0kB HighMem)

Sep 7 21:10:52 srv2 kernel: Active:202289 inactive:137558 dirty:229 writeback:0 unstable:0 free:201112 slab:717254 mapped:26192 pagetables:2886

Sep 7 21:10:52 srv2 kernel: Node 0 DMA free:12188kB min:16kB low:20kB high:24kB active:0kB inactive:0kB present:11780kB pages_scanned:0 all_unreclaimable? yes

Sep 7 21:10:52 srv2 kernel: lowmem_reserve[]: 0 3630 6029 6029

Sep 7 21:10:52 srv2 kernel: Node 0 DMA32 free:649472kB min:5976kB low:7468kB high:8964kB active:258228kB inactive:460832kB present:3717536kB pages_scanned:0 all_unreclaimable? no

Sep 7 21:10:52 srv2 kernel: lowmem_reserve[]: 0 0 2398 2398

Sep 7 21:10:52 srv2 kernel: Node 0 Normal free:142788kB min:3948kB low:4932kB high:5920kB active:550928kB inactive:89400kB present:2456320kB pages_scanned:5 all_unreclaimable? no

Sep 7 21:10:52 srv2 kernel: lowmem_reserve[]: 0 0 0 0

Sep 7 21:10:52 srv2 kernel: Node 0 DMA: 3*4kB 2*8kB 2*16kB 5*32kB 1*64kB 3*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 2*4096kB = 12188kB

Sep 7 21:10:52 srv2 kernel: Node 0 DMA32: 120290*4kB 20847*8kB 26*16kB 1*32kB 1*64kB 0*128kB 0*256kB 0*512kB 1*1024kB 0*2048kB 0*4096kB = 649472kB

Sep 7 21:10:52 srv2 kernel: Node 0 Normal: 27731*4kB 3857*8kB 13*16kB 1*32kB 0*64kB 0*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 142788kB

Sep 7 21:10:52 srv2 kernel: Swap cache: add 58, delete 58, find 4/4, race 0+0

Sep 7 21:10:52 srv2 kernel: Free swap = 1052064kB

Sep 7 21:10:52 srv2 kernel: Total swap = 1052248kB

Sep 7 21:10:52 srv2 kernel: Free swap: 1052064kB

Sep 7 21:10:52 srv2 kernel: 1671167 pages of RAM

Sep 7 21:10:52 srv2 kernel: 144740 reserved pages

Sep 7 21:10:52 srv2 kernel: 225539 pages shared

Sep 7 21:10:52 srv2 kernel: 0 pages swap cached

Sep 7 21:11:46 srv2 dplinsdr: *** glibc detected *** /usr/local/hp/dpx/lin/x86_64/dplinsdr: double free or corruption (!prev): 0x00002aaaaca008d0 ***

Sep 7 21:27:34 srv2 syslog-ng[30490]: STATS: dropped 0

Sep 7 22:27:34 srv2 syslog-ng[30490]: STATS: dropped 0

Sep 7 23:27:34 srv2 syslog-ng[30490]: STATS: dropped 0

 

------------------------------------------------------------------------------------------
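For anyone reading the trace above: "order:4" in the page allocation failure means the kernel wanted 2^4 = 16 contiguous pages, i.e. one 64kB block at a 4kB page size. The per-zone "Node 0 ..." lines list how many free blocks of each size were available. The following is only a rough sketch for tallying those lines; the function names are made up for illustration, and it assumes a 4kB page size as implied by the first column of the log.

```python
import re

PAGE_KB = 4  # assumption: 4kB pages, matching the smallest block size in the log

def free_blocks_by_order(line):
    """Parse one 'Node N ZONE: a*4kB b*8kB ...' line into {order: free_block_count}."""
    counts = {}
    for count, size_kb in re.findall(r"(\d+)\*(\d+)kB", line):
        # 4kB -> order 0, 8kB -> order 1, ..., 64kB -> order 4
        order = (int(size_kb) // PAGE_KB).bit_length() - 1
        counts[order] = int(count)
    return counts

def blocks_at_or_above(counts, order):
    """Number of free blocks large enough to satisfy an allocation of this order."""
    return sum(n for o, n in counts.items() if o >= order)

# Line copied from the /var/log/messages extract above.
normal = ("Node 0 Normal: 27731*4kB 3857*8kB 13*16kB 1*32kB 0*64kB 0*128kB "
          "1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 142788kB")
print(blocks_at_or_above(free_blocks_by_order(normal), 4))
```

Nearly all of the free memory in the DMA32 and Normal zones is in 4kB and 8kB blocks, which looks more like fragmentation than a plain shortage of memory, although that reading is an interpretation of the log and not something confirmed in this thread.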

 

Some things I have tried:

1. Recreated the backup job

2. Tested backups several times during the day and they worked.

 

Where to from here?

There appears to be a bug somewhere.

 

ATTACHMENTS:

(NB The attachments are from linux so if using Windows, it is best not to view with Notepad, use Wordpad or a word processor.)

 

hpdx-error Sep7.txt

= /var/log/messages extract for the 7th Sep 2010 on the remote agent server. Scheduled backup starts at 2100hrs.

 

var-log-messages-dplinsdr-page-alloc-failure.txt

= /var/log/messages grep of dplinsdr showing page allocation failures.

 

var-log-messages-extract.log

= /var/log/messages extract, full log excluding some irrelevant DNS and other daemon messages.

 

Should I raise this as a bug and if so, where is the best place to do that?

5 REPLIES
Alex Kuzmin
Advisor

Re: HP DPX 4.0 1 agent (linux 64-bit) keeps failing during backup

Hello Vlad,

 

Please open a support ticket with HP Software Support in order to troubleshoot the issue.

blokemann
Occasional Advisor

Re: HP DPX 4.0 1 agent (linux 64-bit) keeps failing during backup

We don't currently have a support agreement with HP for this product, so I was hoping for some direction that may put me on the path to a resolution.

 

There are other problems with running Data Protector Express (64 bit) remote agent.

 

I just now ran a standard:

/usr/local/hp/dpx/lin/x86_64/dplinsvc -x
and it just sits there with:

"Service is being stopped"

 

Our DPX Domain server is running SLES 10 SP1 and has no problems whatsoever.

Can someone confirm if they have this software running properly on a Suse Linux Enterprise Server (SLES) 10 64-bit with service pack 3?

 

blokemann
Occasional Advisor

Re: HP DPX 4.0 1 agent (linux 64-bit) keeps failing during backup

More than 10 minutes passed, so I stopped it with kill -9 against these three processes:

dplingqa

dplinsvc

dplinsdr
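The manual sequence described here (try the normal stop, wait, then kill -9 the three processes) could be scripted roughly as below. This is only a sketch under assumptions: the stop command and process names are the ones quoted in this thread, `pkill` is assumed to be installed on the agent server, and `stop_agent` and its `run` parameter are hypothetical names used for illustration, not part of DPX.

```python
import subprocess

# Stop command and process names as quoted in this thread.
STOP_CMD = ["/usr/local/hp/dpx/lin/x86_64/dplinsvc", "-x"]
DPX_PROCESSES = ["dplingqa", "dplinsvc", "dplinsdr"]

def stop_agent(timeout_s=600, run=subprocess.run):
    """Try a clean stop; if it hangs past timeout_s, SIGKILL the agent processes.

    Returns "clean" or "killed". `run` is injectable so the logic can be
    exercised without touching a real agent.
    """
    try:
        run(STOP_CMD, timeout=timeout_s, check=False)
        return "clean"
    except subprocess.TimeoutExpired:
        for name in DPX_PROCESSES:
            # pkill -9 -x NAME sends SIGKILL to processes whose name matches exactly
            run(["pkill", "-9", "-x", name], check=False)
        return "killed"
```

Calling `stop_agent()` as root would wait up to 10 minutes for the clean stop before falling back to kill -9. Whether killing the processes this way leaves the agent in a consistent state is not something this thread establishes.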

blokemann
Occasional Advisor

Re: HP DPX 4.0 1 agent (linux 64-bit) keeps failing during backup

After ongoing issues trying to back up the domain server and one remote system with the agent in a single scheduled job, we appear to have a decent workaround:

  • Splitting the one job with multiple hosts into two separate jobs.
  • On the remote agent server, cron a restart of the dplinux daemon every day at noon.

So far this has been running fine now for a couple of weeks.

 

I still believe that there is a compatibility issue with the agent and SLES 10 SP3 (64-bit).

christian fischer
Occasional Visitor

Re: HP DPX 4.0 1 agent (linux 64-bit) keeps failing during backup

DPX is not supported by HP Software!!!!

This is actually the wrong forum to discuss DPX issues.