
Ignite Brain Twister.

 
SOLVED
Steven E. Protter
Exalted Contributor


Happy working Ignite Setup.

3 rp450 boxes
HP-UX 11.00
Ignite B.3.62

1 box is golden image server
2 boxes are clients.

Here is what changes:

Ignite is upgraded to 4.1.61
Hostname of Ignite server is changed as are all appropriate DNS records.

A few patches.

Took a new Golden Image, weighing in at 20 GB. This includes some Oracle filesystems, taken cold. Earlier images never picked up files bigger than 2 GB, and I don't care about those.

Transfer is set up to run 100 BT manual. "-S 1 1"

Problem: You connect to the Ignite server, make positive contact, and an Ignite transfer starts up.

Mini system downloads.
Disks get configured okay.
Image starts to download.
Runs slow, with no progress reports for around 6 hours, then:

NFS Server (IP Address) Not responding. Still trying.

There is nothing in install.log file. Nothing.


Here is the thing.
I have the same filesystem mounted on the server from an 11.00 box, so there is nothing wrong with the NFS server. Probably.

Current ideas:

1) An NFS patch of some kind is missing and it's triggered by large file transfers. This one is bigger than the 7 GB images I was doing before; I don't know the limit.

2) Some unknown hardware defect has come up on the client. Except I can restore an Ignite tape and it runs fine.

3) Perhaps largefiles is set wrong on some filesystem and it's jamming up the download. But then again, Ignite doesn't do anything bigger than 2 GB, right?

I'm out of ideas.

I've been down with the flu and temps of up to 104 the past five days. I'm supposed to be on vacation next week so I might not act or reply quickly.

Strain those brains though folks, I give points for every reply.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
16 REPLIES
RikTytgat
Honored Contributor

Re: Ignite Brain Twister.

Hi,

I've had the "NFS not responding" error, too.

This was when I was creating a depot to ignite Bastion Servers from. I removed the NFS client filesets from the depot, and this resulted in the same kind of behaviour: everything worked, until after the first reboot!

NFS client is required on the Ignite client, at least until the end of the Ignite.

Regards,
Rik
Steven E. Protter
Exalted Contributor

Re: Ignite Brain Twister.

An interesting idea. Since the Golden image included the entire machine, which did have nfs.core, nfs.client and nfs.server, I'm not sure it's the golden idea. Thanks.

Think I'm going to build me a patch depot.
Alex Glennie
Honored Contributor

Re: Ignite Brain Twister.

The following may be of some use ...

There is an undocumented option to mount for NFS to enable debug output.

You can add the following line to the INSTALLFS (using instl_adm(1M)) on
the Ignite server that the client system will boot from across the
network:

_hp_nfs_mount_opts="-odebug"

This should print messages on the console about what the mount command
is doing.

Note: If you currently set _hp_nfs_mount_opts, you will need to add
-odebug to the options currently set. You should not have two lines
starting with _hp_nfs_mount_opts=; if you do, the second one will take
precedence over the first. The man page instl_adm(4) does not document
_hp_nfs_mount_opts as supporting += syntax, so you should not use it with
_hp_nfs_mount_opts.

So you will need to perform these commands:

instl_adm -d > /tmp/cfg
vi /tmp/cfg
(add or delete the variables in this file, then save and exit)
instl_adm -f /tmp/cfg

When instl_adm -d is run on the server again, you will see the changes.
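
Since a second _hp_nfs_mount_opts line silently overrides the first, it can help to check the dumped file before loading it back. The following is only a minimal sketch: the function names count_assignments and warn_if_duplicated are hypothetical helpers, not part of Ignite-UX, and the file is assumed to be the /tmp/cfg dump from the steps above.

```shell
# Count the lines in a dumped INSTALLFS config that assign a given variable.
# Usage: count_assignments VARIABLE FILE
# Matches lines that start (after optional whitespace and an optional
# "init" keyword) with VARIABLE followed by "=".
# Hypothetical helper, not an Ignite-UX command.
count_assignments() {
    var=$1
    file=$2
    # grep -c prints 0 and exits non-zero when nothing matches; "|| true"
    # keeps that case from looking like a failure.
    grep -c -E "^[[:space:]]*(init[[:space:]]+)?${var}[[:space:]]*=" "$file" || true
}

# Print a warning and return non-zero when VARIABLE is assigned more than
# once, because the later line would take precedence over the earlier one.
# Usage: warn_if_duplicated VARIABLE FILE
warn_if_duplicated() {
    var=$1
    file=$2
    n=$(count_assignments "$var" "$file")
    if [ "$n" -gt 1 ]; then
        echo "WARNING: $var is set $n times in $file; only the last one takes effect"
        return 1
    fi
    return 0
}
```

For example, running warn_if_duplicated _hp_nfs_mount_opts /tmp/cfg before instl_adm -f /tmp/cfg would flag a leftover line from an earlier edit.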



Duplex Mismatches

A mismatched duplex during a network recovery can cause a lot of
problems in Ignite-UX. Most commonly it causes performance issues when
the Ignite server is running 11.11 as NFS goes over TCP rather than UDP.
... this seems to cause major problems.

By default the duplex value during an Ignite-UX installation is
autonegotiated on 100BT and 1000BT interfaces. You can override this by
setting _hp_lanadmin_args in the INSTALLFS on the Ignite server (using
instl_adm(1M)) to a value that will cause the duplex value to be correct
for your environment.
Steven E. Protter
Exalted Contributor

Re: Ignite Brain Twister.

Thanks Alex. I have already used your instl_adm methodology to assure that I'm getting 100 BaseT manual on network.

I will try adding your diagnostic when I go back to work. Got the flu, spent 6 days in bed, woke up and found out today was the first day of a scheduled vacation. So I'm in bed, gonna try and make some progress toward the crown.

I also have a recommended NFS patch and I'm going to punch that in tonight while nobody is looking. I now know my PDC is current.

Correction for top of call: rp5450 servers. I'm sure most everyone knew.

Cheers.

SEP
Dietmar Konermann
Honored Contributor

Re: Ignite Brain Twister.

Hi, Steven!

What worries me is the fact that you use the _hp_lanadmin_args trick and got into trouble _after_ upgrading Ignite to 4.x. The reason is that this feature is broken in Ignite 3.x revisions (JAGae34118) and fixed with 4.1.

Couldn't it be possible that you have problems, _because_ you use 100FD now? Maybe you should try the default again...

Best regards...
Dietmar.
"Logic is the beginning of wisdom; not the end." -- Spock (Star Trek VI: The Undiscovered Country)
John Payne_2
Honored Contributor

Re: Ignite Brain Twister.

Steven,

I go on vacation for 10 days, and you have like 800 more points, all while being sick. Amazing. Just like Michael Jordan playing the Utah Jazz...

Anyway, in the config script for the image, you name an nfs_source in the file to point to the location of the hostname.gz image file.

i.e.:
nfs_source = "10.2.182.250:/var/opt/ignite/archives"
}

It's up at the top of the script under the sw_source section. If you pointed your script to hostname:/var/opt/ignite/archives instead of ip_address:/var/opt/ignite/archives, then Ignite will just sit there forever trying to mount the NFS mount from the old hostname, which it can't get to.

I use the IP address in my scripts. I moved the IP of my server to a private address about a year and a half ago and ran into this problem at the next reinstall.
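
A quick way to spot config files that still name the server by hostname is to pull out each nfs_source value and keep only those that do not start with a dotted IPv4 address. This is a rough sketch under that assumption; nfs_sources_by_hostname is a hypothetical helper name, and the file it is given is whichever Ignite config file holds the sw_source section.

```shell
# List nfs_source values in an Ignite config file whose server part is
# not a dotted-quad IPv4 address (so it is probably a hostname).
# Usage: nfs_sources_by_hostname FILE
# Hypothetical helper, not an Ignite-UX command.
nfs_sources_by_hostname() {
    # Extract the quoted value of each nfs_source line, then drop the
    # values that begin with four dot-separated digit groups and a colon.
    sed -n 's/^[[:space:]]*nfs_source[[:space:]]*=[[:space:]]*"\([^"]*\)".*/\1/p' "$1" |
        grep -v -E '^[0-9]{1,3}(\.[0-9]{1,3}){3}:' || true
}
```

No output means every nfs_source found uses an IP address; any line printed is a candidate for the stale-hostname problem described above.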

Hope it helps

John
Spoon!!!!
Steven E. Protter
Exalted Contributor

Re: Ignite Brain Twister.

I am concerned too, Dietmar. My 10 BaseT LAN, which the built-in card is on, is barely reliable enough for a boot. It won't do anything else.

I have to try and push the transfer through the faster non built in card.

I will see about borrowing a faster, more reliable switch.


I think what happened on the points thing was I had a lot of replies built up out there before this Ignite problem cut me down and the flu took me out of the game.

I'd be lucky to make fast mover this week, plus I've read a couple of lame replies I tried while sporting a 102 fever. They don't deserve disk space let alone points.

I thank you for your support. I too am on vacation, but the only thing Nurse Ratchet (kind Mrs. Protter; I think she looks better than Jennifer Garner) will let me do is occasionally jump on the computer via the cellular modem and a beat-up 8-year-old laptop.

As noted earlier since I'm on vacation I will not be able to try any of this until a week from today. But I'm making notes and planning patches and together we will foil this dragon. Or whatever.

SEP
Steven E. Protter
Exalted Contributor

Re: Ignite Brain Twister.

Note, it's looking a lot like a LAN problem.

Yes, fever is gone, headache is gone so I'm back at work pounding my head very hard on this one.

Ran the nfs debug command above, also ran the ignite transfer on debug level 5.

very interesting results.

79 DEBUG: Set erase character from 0177 to 0177
80 * Bringing up Network (lan1)
81 * running: lanadmin -X 100FD 1 1
82 Unable to open catalog for C
83 Switching to the use of Internal Messages.
84 btlan: ERROR: Setting 100FD 1 is unsupported
85 Valid types: HD, FD, 10HD, 10FD, 100HD, 100FD, AUTO_ON
86 DEBUG: sigchld: wait'ed on process: 83 (last_pid=83)
87 WARNING: lanadmin command returned non-zero exit status
88 DEBUG: running: ifconfig lan1 inet 10.1.11.50 netmask 255.255.0.0 up
89 DEBUG: sigchld: wait'ed on process: 84 (last_pid=84)

With the new instl_adm config from HP.


Last reasonable message logged.
8888 nfs mount: mount_nfs: fstype = nfs3
8889 nfs mount: mount_nfs: global_nokern = 0
8890 nfs mount: mount_nfs: Exiting procedure
8891 nfs mount: main: Exiting main for nfs command
8892 * Mon Feb 24 15:27:12 EST 2003: Starting archive load of the source
8893 (English HP-UX 11.11 CDE - 64 Bit OS).
8894 TEST: symlink(/d_cfg_mnt_sb61/., /d_cfg_mnt_sb61/ )
8895 NOTE: Contents of file "/tmp/ign_configure/archive_script":
8896 cd /d_cfg_mnt_sb61
8897 /sbin/cat /tmp/ign_configure/archive_nfs/tzfat.juf.net.gz | /monitor_bpr -s 20298698 | /usr/contrib/bin/gunzip -c | /sbin/pax_iux -r -p e


Ends repeating this instruction:
9421 "/var/opt/ignite/clients/0x001083FEDCF0/server.instr".
9422 DEBUG: No instructions file

instl_adm -d

[4898#] instl_adm -d
# instl_adm defaults:
# NOTE: Manual additions between the lines containing "instl_adm defaults"
# and "end instl_adm defaults" will not be preserved.
server="10.1.11.60"
netmask[]="255.255.0.0"
init _hp_lanadmin_args="-X 100FD 1"
_hp_nfs_mount_opts="-odebug"
# end instl_adm defaults.


any ideas?????
Steven E. Protter
Exalted Contributor

Re: Ignite Brain Twister.

Changed instl_adm file from:

init _hp_lanadmin_args="-X 100FD 1"

to:

init _hp_lanadmin_args="-X 100FD"

Apparently in Ignite the 1 is assumed, probably because of the prior step, which says it is bringing up lan1.
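
The btlan error in the debug log lists the values it accepts, so a setting can be checked against that list before it goes into the INSTALLFS. The sketch below assumes that list (HD, FD, 10HD, 10FD, 100HD, 100FD, AUTO_ON) applies to the card in question; other drivers may accept different values. Both function names are hypothetical.

```shell
# Check a candidate "lanadmin -X" setting against the valid types that
# the btlan driver printed in the debug log.
# Usage: valid_btlan_setting "100FD"
# Hypothetical helper; the list is copied from the log, not from a man page.
valid_btlan_setting() {
    case "$1" in
        HD|FD|10HD|10FD|100HD|100FD|AUTO_ON) return 0 ;;
        *) return 1 ;;
    esac
}

# Given the quoted value of _hp_lanadmin_args (for example "-X 100FD"),
# strip the leading "-X " and validate whatever remains.
# Usage: check_lanadmin_args "-X 100FD"
check_lanadmin_args() {
    setting=${1#-X }
    if valid_btlan_setting "$setting"; then
        echo "ok: $setting"
    else
        echo "rejected: $setting"
        return 1
    fi
}
```

With these, check_lanadmin_args "-X 100FD 1" is rejected because the trailing interface number becomes part of the setting, which matches the error in the log.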

SEP
Steven E. Protter
Exalted Contributor

Re: Ignite Brain Twister.

Ignite Brain twister update.

Still points to be had btw....

We have verified the network is not a problem. Of course it was a problem and we had to fix the router and the server to force 100FD, 100 BaseT Full Duplex without auto negotiate.

We ran some pretty esoteric traces and proved the golden image gz file can be run through a cat command from beginning to end on an NFS mount, which means the network is cool.

But when we ran the step that hangs, which pipes gunzip output through pax (here is the command):

/sbin/cat /tmp/ign_configure/archive_nfs/tzfat.juf.net.gz | /usr/contrib/bin/gunzip -c | /sbin/pax_iux -r -p e -v >images/pax.out 2>&1 &

with tusc attached to the pax_iux process (pid = 372) and tracing to images/pax.trace, we get a hang.
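
One further way to split the pipeline is to run the archive through gunzip alone and throw the output away, which shows whether decompression completes without pax involved. This is only a sketch: archive_decompresses is a hypothetical helper, and it calls whatever gunzip is first on the PATH rather than the /usr/contrib/bin/gunzip used on the Ignite client.

```shell
# Decompress an archive to /dev/null, leaving pax out of the pipeline.
# Succeeds only if gunzip reads the whole stream without error.
# Usage: archive_decompresses /path/to/archive.gz
# Hypothetical helper; assumes gunzip is on the PATH.
archive_decompresses() {
    # The exit status of the pipeline is that of gunzip, its last command.
    cat "$1" | gunzip -c > /dev/null 2>&1
}
```

If this completes but the full pipeline still hangs, the problem is more likely in the unpack step than in the transfer or the decompression.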

The hang happens right in the directory where our Oracle dbf files that are bigger than 2G get extracted.

Prior to my upgrade on the machine the golden image create process make_sys_image issued warnings and blew off files bigger than 2G.

The problem runs no longer showed this warning. Judging by the file size, I'm thinking these big Oracle files found their way into the image and can't be extracted.

Any thoughts?

The proposed fix (by me) is that I build a new image with those big Oracle filesystems unmounted so they can't create problems.
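
Before taking a new image, it may be worth listing which files are at or near the 2 GB mark so nothing large slips in unnoticed. The sketch below uses the standard find -size test in 512-byte blocks; list_big_files is a hypothetical helper name, and the default threshold of 4194303 blocks is an assumption chosen to sit just under 2 GB.

```shell
# List regular files under DIR whose size, rounded up to 512-byte blocks,
# is greater than BLOCKS. The default of 4194303 blocks flags files larger
# than 2147483136 bytes, i.e. files at or very near the 2 GB mark.
# Usage: list_big_files DIR [BLOCKS]
# Hypothetical helper, not part of Ignite-UX.
list_big_files() {
    dir=$1
    blocks=${2:-4194303}
    find "$dir" -type f -size +"$blocks" -print
}
```

Running it against each filesystem that goes into the image gives a list of files to exclude or unmount first.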

Interesting.

SEP
Jeff Schussele
Honored Contributor

Re: Ignite Brain Twister.

Hi Steven,

Ok I'll take a shot....

We've found in our testing that Ignite has a BIG problem when it's gzipping anything that breaches the 2 GB wall.

Why?....HP's still shipping gzip 1.2.4 in its SW-DIST fileset & 1.2.4 is totally unreliable with these files.
Pull 1.3.3 down from the porting center & dump 1.2.4, then link 1.3.3 into /usr/contrib/bin. 1.3.3 installs into /usr/local/bin, and Ignite *needs* it in /usr/contrib/bin.

Let me know,
Jeff

PERSEVERANCE -- Remember, whatever does not kill you only makes you stronger!
Steven E. Protter
Exalted Contributor

Re: Ignite Brain Twister.

Bing bing bing.

Prize winner.

I installed the PHCO_26422 pax(1) cumulative patch.

This enables pax to handle files up to 8 gigabytes.

One little problem with that.

pax_iux can't unpax what the latest pax patch can pax.

A really, really monster mistake on the part of HP. No installation warnings, no nothing.

I thought I could work around with tape but I bet that would have failed miserably too.

I've removed the patch and rebuilt the Golden image, which came out smaller.

We'll see how the Ignite push goes.

SEP

Jeff Schussele
Honored Contributor

Re: Ignite Brain Twister.

Wow..that 6th sentence sent me spinning into Seussland...I started dreaming & I've got this rhyme stuck in my head. You know like that melody that you hear at 8AM that doesn't check out 'til well past 5PM......

If a packet hits a pocket on a socket on a port,
and the bus is interrupted at a very last resort,
and the access of the memory makes your floppy disk abort,
then the socket packet pocket has an error to report.

If your cursor finds a menu item followed by a dash,
and the double-clicking icon puts your window in the trash,
and your data is corrupted cause the index doesn't hash,
then your situation's hopeless and your system's gonna crash!

If the label on the cable on the table at your house
says the network is connected to the button on your mouse,
but your packets want to tunnel to another protocol,
that's repeatedly rejected by the printer down the hall,

and your screen is all distorted by the side effects of gauss,
so your icons in the window are as wavy as a souse;
then you may as well reboot and go out with a bang,
'cuz sure as I'm a poet, the sucker's gonna hang!

When the copy on your floppy's getting sloppy in the disk,
and the macro code instructions cause unnecessary risk,
then you'll have to flash the memory and you'll want to RAM your ROM
then quickly turn off the computer and be sure to tell your MOM !

Keep me informed...I think you've nailed this puppy.

Later,
Jeff
Steven E. Protter
Exalted Contributor

Re: Ignite Brain Twister.

We have nailed it.

The pax_iux process has completed. The client server has booted, is in post_configure script building mirrors and all kinds of other mundane things.

Just call me HP bug finder.

SEP
Jeff Schussele
Honored Contributor
Solution

Re: Ignite Brain Twister.

Steven,

Cool glad we solved this.

Now, please promise me that you'll never doubt the power & capabilities of this forum again.
We get a big kick out of outperforming the Response Center 8~))

Cheers,
Jeff

P.S. Got a rabbit in your hat?
Steven E. Protter
Exalted Contributor

Re: Ignite Brain Twister.

Rabbit for Jeff.

I would note that I came up with the cause while I was too sick to do anything about it.

The ITRC forum thread here came to the conclusion faster than the response center.

The response center, however, deserves kudos for actually coming up with the methodology to prove the theory correct.

We pushed a shell onto the client and then re-ran the pax_iux unpack while attaching the tusc debugger.

That was cool stuff and everyone deserves a pat on the back.

SEP