
greatest blunders

 
Systeemingenieurs Infoc
Valued Contributor

Re: greatest blunders

I once spent a day hunting for an error in a C program:

if (a=b) {}

should've been:

if (a==b) {}
A Life ? Cool ! Where can I download one of those from ?
Alexander M. Ermes
Honored Contributor

Re: greatest blunders

Hi there.
Ever told the electricians that your server is hooked to a completely different circuit breaker? I had some nice reactions from my users about 2 seconds after this guy switched it.
;-)
Rgds
Alexander M. Ermes
.. and all these memories are going to vanish like tears in the rain! final words from Rutger Hauer in "Blade Runner"
steven Burgess_2
Honored Contributor

Re: greatest blunders

Hi all

My greatest to date was pvcreating a disk in a ServiceGuard cluster that was the root disk on the other node. The girl who had been setting up the server for testing (luckily) wasn't best pleased.

Steve
take your time and think things through
Bill Hassell
Honored Contributor

Re: greatest blunders

I guess my blunder sets the record for "most clobbered machines" in one day:

I created an inventory script to be used in the Response Center to track all the systems across the United States (about 320 systems). These are all test and problem replication machines but necessary for the R/C engineers to replicate customer problems.

The script was written about 1992 to handle version 7.0 and higher. About 1995, I had a number of useful scripts, and it seemed reasonable to drop them onto all 300 machines as part of the inventory process (so far, so good). Then about that time, 10.01 was released and I made a few changes to the script. One was to change the useful script location from /usr/local/bin to /usr/contrib/bin because of bad directory permissions. I considered 'fixing' the bad permissions but since these systems must represent the customer environment, I decided to move everything.

Enter the shell option -u. I did not use that option in my scripts and, due to a spelling error, an environment variable used in rm -r was null, thus removing the entire /usr/local directory on 320 machines overnight.

Needless to say, I now never write a script without set -u at the top.
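
A minimal sketch of what set -u buys you, assuming a POSIX sh with mktemp available. The names scratch, TOOLS_DIR and the misspelled TOOLS_DRI are made up for illustration and are not taken from the original inventory script. The risky step runs in a subshell so the demo itself keeps going after the abort.

```shell
#!/bin/sh
# Sketch only: scratch, TOOLS_DIR and the misspelled TOOLS_DRI are hypothetical.
scratch=$(mktemp -d) || exit 1
mkdir -p "$scratch/local/bin"
touch "$scratch/local/bin/keepme"

TOOLS_DIR=bin   # the variable the script meant to use

# Run the risky step in a subshell so the abort does not end this demo.
(
    set -u
    # Typo: TOOLS_DRI is unset, so the shell stops here and rm never runs.
    # Without set -u this would expand to "$scratch/local/" and wipe it all.
    rm -r "$scratch/local/$TOOLS_DRI"
) 2>/dev/null
status=$?

if [ -e "$scratch/local/bin/keepme" ]; then
    survived=yes
else
    survived=no
fi
echo "subshell exit status: $status, file survived: $survived"

# Clean up the scratch area created above.
rm -rf "$scratch"
```

Note that set -u only catches unset variables. A variable that is set but empty still expands to nothing, so writing "${TOOLS_DIR:?}" in front of a destructive command is a reasonable extra guard.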


Bill Hassell, sysadmin
Patrick Wallek
Honored Contributor

Re: greatest blunders

Hmmmm.........

I've got a couple that probably qualify. One is HP-UX related and one is MPE/V related.

On the MPE/V machine I helped to support a product that the company I worked for sold. I was doing some work with the product license file which was kept at CATALOG.PRODUCTNAME.SYS. Well, I removed CATALOG. The only thing was, I wasn't in the PRODUCTNAME.SYS group, I was in the PUB.SYS group. For anyone not familiar with MPE, CATALOG.PUB.SYS is a pretty important system file, especially when rebooting the machine. Well, I didn't realize what I had done until a couple of days later when I did attempt to reboot the machine. When it started complaining that it couldn't find the CATALOG.PUB.SYS file, I realized what I had done. The machine wasn't super-critical, but we did have some accounting stuff on it. I spent most of that day restoring the system from the 1600 BPI 9 track reel-to-reel tape.

Now for number 2 - on an HP-UX system -

This was the main accounting server for a medium sized company I worked for. I had been moving files around between the production server and our test server. I went to rm a file and thought I was on the test server, but in reality I was on the production server. I was at the office extra early the next morning with the latest backup tape in hand to restore the file. Fortunately it was not a critical file and no harm was done. At least no harm other than knocking a few years off of my life expectancy, probably.
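
One cheap habit against the "wrong server" mistake is to make the destructive command name the host it is meant for. A rough sketch, assuming a POSIX sh; require_host and the host name in the usage comment are hypothetical, not anything from the post.

```shell
#!/bin/sh
# Sketch only: require_host and the host names are hypothetical.
# Returns 0 only when this machine's name matches the one the operator intended.
require_host() {
    expected=$1
    actual=$(uname -n)
    if [ "$actual" != "$expected" ]; then
        echo "refusing: this is $actual, not $expected" >&2
        return 1
    fi
}

# Intended use: the rm only runs on the host that was named explicitly.
# require_host testsrv01 && rm /path/to/file
```

It does not stop a determined typo, but it forces you to type the name of the box you think you are on.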
Robert DJ
Frequent Advisor

Re: greatest blunders

Hi,

I had a great time once on a weekend: I mounted a file system on /usr and then tried to run a few commands. Either way, I learned: no jokes at weekends..............

On my second occasion, I unfortunately happened to change the permissions of /etc to read, and then the fun started: I was unable to log in to the workstation as root.....

Lifetime Achievement Error.......
"Login Incorrect.."

Thanx.

Regards,
Robert DJ

Robert DJ
Shannon Petry
Honored Contributor

Re: greatest blunders

The first time I got to work on my very own UNIX system was on a Sun Sparc2. I got to load the OS, and install apps, and it was mine.

I noticed that disk space was short, so got an external 512M (huge back then), and plugged it in.

I saw that /usr seemed pretty full, and /usr/lib held lots of data. I decided it would be cool to move this to my newly made mount, and link it over.

Well, I learned rather quickly that Solaris did not have a statically linked ln command ;)

Back in them days it was a good 8 hours to install an OS. So it took me 2 days to get my Sun Sparc up and running.. Live and learn!
Microsoft. When do you want a virus today?
John Poff
Honored Contributor

Re: greatest blunders

We were doing a disaster recovery drill. I was busy Igniting a V-class server for our database server. I had finally gotten the OS on it after about three hours and I was running a slick little script I had written to recreate all the volume groups and filesystems. My script takes a list of available PVs and does a 'pvcreate -f' on them. Well, we started our drill at midnight [not our idea but we had little choice], so around about 3:30am I was trying to run this script. It was chugging along just fine, pvcreating disks, and then the system hung. Not completely, but pretty much dead. After trying to reboot it, I eventually figured out that when I went through the interactive Ignite, I hadn't paid close attention to which disk Ignite had selected to load the OS on, and it had chosen one of the disks in the EMC array instead of one of the local Jamaica disks. My slick script came along and had pvcreated the disk that had the OS on it. Oops. There went a few more hours of work.

The good news is that after that mess they decided that we would never start a DR drill at midnight!
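
For a script like the one described, a dry run plus an explicit "never touch this device" filter would have shown the problem before any disk was written. A sketch under stated assumptions: filter_disks and the device names are invented for illustration, nothing here touches a disk, and how you find out which device holds the OS depends on the system and is not shown.

```shell
#!/bin/sh
# Sketch only: filter_disks and the device names are hypothetical.
# Prints every candidate except the protected device, one per line.
filter_disks() {
    protected=$1
    shift
    for disk in "$@"; do
        if [ "$disk" = "$protected" ]; then
            echo "skipping $disk (holds the OS)" >&2
        else
            echo "$disk"
        fi
    done
}

# Dry run: show the commands instead of executing them.
for d in $(filter_disks /dev/rdsk/c1t0d0 /dev/rdsk/c0t6d0 /dev/rdsk/c1t0d0 /dev/rdsk/c1t1d0); do
    echo "would run: pvcreate -f $d"
done
```

Only once the printed list looks right would you swap the echo for the real command.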

JP
Christopher McCray_1
Honored Contributor

Re: greatest blunders

Hello,

Here's mine.

Situation: two N-class servers, one already configured and running production databases, the other newly arrived and connected, with the two 12H AutoRAIDs cross-connected for the future MCSG implementation.

Using an ignite tape from the first N, I booted from it on the other N and installed in batch mode.

For the next couple of weeks, I was scratching my head, constantly running fsck on several filesystems on the first N, wondering where all the inode errors were coming from!!!!

We learn by doing.

Chris
It wasn't me!!!!
John Bolene
Honored Contributor

Re: greatest blunders

In UNIX it was changing netmask on a lan card in a machine across the ocean. Seems real strange to me that when you down the interface, your session goes away.

Turns out the local guys were working on the console network and I could not get in that way either.

Had to wait for the day crew in Dublin to finish working on the console network.

The UNISYS mainframe has a feature that lets you change OS code on the fly. I fat-fingered a jump and it immediately jumped somewhere, causing the OS to crash. Turns out that changes made this way also get written to the boot disk, so a disk boot would not work either. A tape boot was required.
It is always a good day when you are launching rockets! http://tripolioklahoma.org, Mostly Missiles http://mostlymissiles.com
Robie Lutsey
Frequent Advisor

Re: greatest blunders

Greatest Blunder? About 3 months after taking the Sys Admin job, trying to do an upgrade from HP-UX 10.20 to 11.0. Not being a Sys Admin by trade, I failed to recognize the need to VERIFY the backup tapes.... for the last 3 months. Needless to say, when I started to receive kernel errors and file system errors I flipped out. Not many people were very pleased when their entire projects were wiped out.

I learned.
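
A rough idea of what "verify" can mean in practice: restore the backup somewhere harmless and compare it with the source. This is only a sketch for a tar archive on disk, assuming tar, mktemp and diff -r are available; verify_backup is a hypothetical helper, the archive must be given as an absolute path, and it is expected to have been created with relative paths from inside the source directory.

```shell
#!/bin/sh
# Sketch only: verify_backup is a hypothetical helper.
# Restores the archive into a scratch directory and compares it with the source tree.
# The archive path must be absolute because the restore happens after a cd.
verify_backup() {
    archive=$1
    source_dir=$2
    restore_dir=$(mktemp -d) || return 1
    if ( cd "$restore_dir" && tar -xf "$archive" ) &&
        diff -r "$source_dir" "$restore_dir" >/dev/null; then
        result=0
    else
        result=1
    fi
    rm -rf "$restore_dir"
    return $result
}

# Example: create the archive from inside the source tree, then check it.
# ( cd /data/projects && tar -cf /backup/projects.tar . )
# verify_backup /backup/projects.tar /data/projects || echo "BACKUP IS BAD"
```

On a live system the source will have changed since the backup ran, so a real check would compare against a snapshot or stored checksums. The point is only that a backup nobody has restored is a guess.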
Mick Kearney
Advisor

Re: greatest blunders

I cannot take the credit for these, but I thought they were crackers.
Let's call him "Robin's mate", as he was the one I think bailed him out (Hi Robin)!!!!

Writing a script to change the root passwords on all servers globally, even in remote unmanned locations such as Bogota. Only thing was, the script zeroed the password files....cracker?

So he moved to another client.

Decided to untar an image of one server onto another machine...without specifying a path. Luckily there was a backup!
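
For the untar case, two habits help: look at the member names before extracting, and always extract inside a directory created for the purpose. A minimal sketch, assuming a POSIX sh with tar and grep -E; extract_into is a hypothetical helper and the archive must be given as an absolute path.

```shell
#!/bin/sh
# Sketch only: extract_into is a hypothetical helper.
# Lists the archive, refuses absolute or parent-relative member names,
# then unpacks inside the destination directory and nowhere else.
extract_into() {
    archive=$1
    dest=$2
    if tar -tf "$archive" | grep -E '^/|(^|/)\.\.(/|$)' >/dev/null; then
        echo "refusing: archive contains absolute or .. paths" >&2
        return 1
    fi
    mkdir -p "$dest" || return 1
    ( cd "$dest" && tar -xf "$archive" )
}

# Example (paths are made up):
# extract_into /tmp/server_image.tar /var/tmp/server_image
```

How a given tar treats absolute member names varies between implementations, which is one more reason to list the archive first.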
David Baraloto
Occasional Contributor

Re: greatest blunders

I had two Vax 6410's running simulation software for a customer. One Vax (which wasn't being used at the moment) had a bad disk. I got a new disk for the Vax, shut it down from the console, then walked over to the wrong Vax and yanked its disk out. Lost two days of customer simulation time while I restored the two disks.
Jeff Schussele
Honored Contributor

Re: greatest blunders

Almost ten years ago when I was working for a turnkey medical VAR who sold SCO systems to MDs & clinics, I was on-site at a client's office performing some SW maintenance.
The system was located under the receptionist's desk & I was sitting in her seat talking to the office mgr & typing at the same time. When the conversation ended I swiveled in the chair to face the monitor again when my foot bumped something under the desk. My face started to turn ashen gray watching the screen collapse & the power light go off as I realized I had just bumped the power switch for the UPS.

Needless to say after I brought the system back up & fsck-cleaned all the filesystems (luckily nothing lost or corrupted), I turned that UPS around so the power switch faced the wall.
I wondered why the receptionist hadn't done that until I remembered that she was about 5' tall whereas I'm about 6'. Her legs weren't long enough to reach the UPS from the seating position.

Cheers,
Jeff
PERSEVERANCE -- Remember, whatever does not kill you only makes you stronger!
Dave Johnson_1
Super Advisor

Re: greatest blunders

Here is my worst.
We use BCs on our XP512. We stop the application, resync the BC, split the BC, start the application, mount the BC on the same server, start backup to tape from the BC. Well, I had to add a LUN to the primary and BC. I recreated the BC. I forgot to change the script that mounts the BC to include the new LUN. The error message vgimport gives when you do not include all the LUNs is just a warning, and it makes the volume group available anyway. The backups seemed to be working just fine.
Well, 2 months go by. I did not have enough available disk space to test my backups. (That has been changed.) Then I decided to be proactive about deleting old files. So I wrote a script:
cd /the/directory/I/want/to/thin/out
find . -mtime +30 -exec rm {} \;

Well that was scheduled on cron to run just before backups one night. The next morning I get the call the system is not responding. (I guessed later the cd command had failed and the find ran from /).
After a reboot I find lots of files are missing from /etc /var /usr /stand and so on. No problem, just rebuild from the make_recovery tape created 2 nights before then restore the rest from backup.
Well, step 1 was fine, but the backup tape was bad. The database was incomplete. It took 3 days (that is 24 hours per day) to find the most recent tape with a valid database. Then we had to reload all the data. After the 3rd day I was able to turn over recovery to the developers. It took about a week to get the application back on-line.
I have sent a request to HP to have the vgimport command changed so a vgimport that does not specify all the LUN's will fail unless some new command line param is used. They have not yet provided this "enhancement" as of the last time I checked a couple of months ago. I now test for this condition and send mail to root as well as fail the BC mount if it does.

Life lesson: TEST YOUR BACKUPS!!
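
The cleanup script in this post goes wrong exactly when the cd fails, because find then runs from wherever cron started it. A hedged sketch of the same job with that hole closed, assuming a POSIX sh; thin_out is a hypothetical name, and the -type f test is an addition that was not in the original script.

```shell
#!/bin/sh
# Sketch only: thin_out is a hypothetical wrapper around the post's two lines.
# The subshell keeps the caller's working directory unchanged.
thin_out() {
    (
        # Stop unless the directory change succeeds, so find never runs elsewhere.
        cd "$1" || exit 1
        # Same 30-day cutoff as the post; -type f limits it to regular files.
        find . -type f -mtime +30 -exec rm {} \;
    )
}

# Example (path copied from the post):
# thin_out /the/directory/I/want/to/thin/out || echo "cleanup skipped" >&2
```

Passing the directory straight to find instead of using cd would also work. Either way the key is that a missing directory ends the job instead of redirecting it.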
Dave Unverhau_1
Honored Contributor

Re: greatest blunders

This is probably not too uncommon...needed to shutdown a server for service (one of several lined up along the floor...no...not racked). Grabbed the keyboard sitting on that box and quickly typed the shutdown string (with a -hy 0, of course) and got ready to service the box.

...ALWAYS make sure the keyboard is sitting on the box to which it is connected!

-------------------

Another thing to remember (I didn't do this one!) NEVER fool around like you're pretending to hit that big red button by the door! (Sometimes things just don't go as you had planned...)

(I'm not sure where he is now, but I know he doesn't work for us anymore...)

Dave
Romans 8:28
Deepak Extross
Honored Contributor

Re: greatest blunders

We had this developer who claimed that when he runs his program, it complains about /usr/bin/ld. (This was because of a missing shared library, he later discovered) It was decided to backup /usr/bin/ld and replace it with 'ld' from another machine on which his program worked.
No sooner was ld moved, than all hell breaks loose.
Users get coredumps in response to even simple commands like "ls", "pwd", "cd"... New users cannot telnet into the system and those who are logged in are frozen in their tracks.

Both the developer and admin are still working with us...
V. Nyga
Honored Contributor

Re: greatest blunders

Hi,

Once I had 3 identical workstations. Every station had a configuration sheet with (among other things) the LAN address.
Every workstation had a name, which I wrote on the sheets.
When one mainboard failed it was replaced, and I gave the sheet to the worker to set the LAN address.
From then on I had problems with two of my workstations - I checked with ping and they disappeared during the ping!
With the help of the support center we used lanadmin to check the LAN addresses of both workstations - they were identical!
I had written the wrong name on the sheet, and we couldn't work with these clients for one week!

So some things should be checked twice!

Volkma
*** Say 'Thanks' with Kudos ***
T G Manikandan
Honored Contributor

Re: greatest blunders

I wanted to enable remote console on the L class.
I remember giving the remote console configuration the same IP as the machine itself.

The battery inside the server had to be taken out and put back for the system to come up.

1 hour oooh.....
That was back when I was new to HP-UX.
V. Nyga
Honored Contributor

Re: greatest blunders

My greatest blunder
.....................
only getting 4 points
while anybody else gets a bunny
.....................
sorry that I didn't rm my /
.....................
*** Say 'Thanks' with Kudos ***
Bill McNAMARA_1
Honored Contributor

Re: greatest blunders

Hey Dave,
wouldn't it be easier to clean up your cron job!!
It works for me (tm)
U.SivaKumar_2
Honored Contributor

Re: greatest blunders

Dear nyga ,

Accept my sincere apologies. I meant to give you 9 points, but it scrolled down to 4 without my noticing.

Put up one more post; you deserve 10 more points.

regards,
U.SivaKumar
Innovations are made when conventions are broken
Nobody's Hero
Valued Contributor

Re: greatest blunders

Greatest blunder.

Back in the early 90's in the halon days. Working nights. I was alone on a snowy night. I was the only mainframe operator with a 4x4. I had everything running as smooth as possible. "Boy, my boss will love this when he makes it in tomorrow morning and sees that I can run this place by myself." I smelled something burning slightly but couldn't tell where it came from. So I walked around the room. Boom, above my head a light fixture caught on fire and flames were shooting out of the drop ceiling. You would have thought that the clue 'drop ceiling' would have rung a bell. Nothing above a drop ceiling. I let the halon discharge. Downed systems and a costly 50K refill for the halon.
UNIX IS GOOD
benoit Bruckert
Honored Contributor

Re: greatest blunders

Well,
For me the worst was on AIX and not HP-UX,
To install 2 new SSA disks (which are like Fibre Channel), I didn't want to stop the prod server.
So I installed the 2 disks and detected them from the OS: no trouble.
But then I declared these disks in the VG (like pvcreate on HP). At that point, all the disks of the system disappeared!!! The LVM structure was out of order; only rootvg (vg00) was OK!! (And remember that it was the prod server.) But another server connected to the same disks succeeded in importing the disks; the data was there!!!
Within 1 hour, I configured this server to work as the first one (hand-made ServiceGuard, if you want!!). And later in the evening I succeeded in importing the disks back on the main server!!
To this day, I don't understand what happened!!! I have done the same thing (pvcreate) many other times since that day without any trouble!
Who said that IS is logical??

regards
Benoit
Une application mal pansée aboutit à une usine à gaze (GHG)
RAC_1
Honored Contributor

Re: greatest blunders

Well, I was very, very new to HP-UX. I wanted to set up a PPP connection with a password borrowed from a friend so that I could browse the net.

I did not worry that the remote support modem cannot dial out from the remote support port.
I went through all the documents available and created device files a dozen times, but it never worked. In anguish I did rm -fr `ltr|tail -4|awk '{print $9}'`
(That was to pacify myself that I know complex commands.)

But alas, I was in /sbin/rc3.d.

I thought this was not going to work and left it at that.
Another colleague, not aware of this, rebooted the system for a Veritas NetBackup problem.

Within the next two hours an HP engineer was on-site, called in by my colleague.

I watched the whole recovery process, repeatedly saying "I want to learn, I want to learn".

Then came to know that can not be done.
There is no substitute to HARDWORK