Re: Linux image deploy

 
Teemu Turpeinen
Advisor

Re: Linux image deploy

Hello.

Ok. We'll be waiting for the fixes.

Br,


/teemu
Teemu Turpeinen
Advisor

Re: Linux image deploy

Hello.

We noticed some other odd issues after image deployment that are related to disk partitioning.
Please note the amount of used data before and after image deployment.

Before image deploy

# df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/cciss/c0d0p2 8256824 162356 7675044 3% /
/dev/cciss/c0d0p1 518000 11576 480112 3% /boot
/dev/cciss/c0d0p7 14449472 32848 13682636 1% /home
/dev/cciss/c0d0p8 4128368 32844 3885816 1% /opt
none 1027636 0 1027636 0% /dev/shm
/dev/cciss/c0d0p9 4128368 32852 3885808 1% /tmp
/dev/cciss/c0d0p6 16513688 971584 14703260 7% /usr
/dev/cciss/c0d0p5 16513688 64620 15610224 1% /var

The same system after image deploy. The image was captured from the same server it was deployed to, right after the above df command was issued.

# df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/cciss/c0d0p2 8254272 32828 7802148 1% /
/dev/cciss/c0d0p8 4127076 32828 3884604 1% /opt
/dev/cciss/c0d0p9 14452776 32828 13685780 1% /home
/dev/cciss/c0d0p7 16516052 32988 15644072 1% /var
/dev/cciss/c0d0p5 4127076 32828 3884604 1% /tmp
/dev/cciss/c0d0p6 16516052 32828 15644232 1% /usr
/dev/cciss/c0d0p1 505604 8239 471261 2% /boot

The amount of used data reported by df is incorrect after deploying the image. If I write something to the disk, df reports it correctly, but only the newly written data. Once the new file is deleted, df again shows that the disks are basically empty. For example, /usr actually holds over 900 MB of data.

The disk geometry is also different after deployment.

before:
# fdisk -l

Disk /dev/cciss/c0d0: 73.3 GB, 73372631040 bytes
255 heads, 32 sectors/track, 17562 cylinders
Units = cylinders of 8160 * 512 = 4177920 bytes

Device Boot Start End Blocks Id System
/dev/cciss/c0d0p1 * 1 129 526304 83 Linux
/dev/cciss/c0d0p2 130 2185 8388480 83 Linux
/dev/cciss/c0d0p3 2186 3213 4194240 82 Linux swap
/dev/cciss/c0d0p4 3214 17562 58543920 f Win95 Ext'd (LBA)
/dev/cciss/c0d0p5 3214 7325 16776944 83 Linux
/dev/cciss/c0d0p6 7326 11437 16776944 83 Linux
/dev/cciss/c0d0p7 11438 15035 14679824 83 Linux
/dev/cciss/c0d0p8 15036 16063 4194224 83 Linux
/dev/cciss/c0d0p9 16064 17091 4194224 83 Linux

after:
# fdisk -l

Disk /dev/cciss/c0d0: 73.3 GB, 73372631040 bytes
255 heads, 63 sectors/track, 8920 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/cciss/c0d0p1 1 65 522081 83 Linux
/dev/cciss/c0d0p2 66 1109 8385930 83 Linux
/dev/cciss/c0d0p3 1110 1631 4192965 82 Linux swap
/dev/cciss/c0d0p4 1632 8920 58548892+ 5 Extended
/dev/cciss/c0d0p5 1632 2153 4192933+ 83 Linux
/dev/cciss/c0d0p6 2154 4242 16779861 83 Linux
/dev/cciss/c0d0p7 4243 6331 16779861 83 Linux
/dev/cciss/c0d0p8 6332 6853 4192933+ 83 Linux
/dev/cciss/c0d0p9 6854 8681 14683378+ 83 Linux

dmesg after deployment still shows the original information for the disk

cciss: Device 0x3238 has been found at bus 8 dev 8 func 0
blocks= 143305920 block_size= 512
heads= 255, sectors= 32, cylinders= 17562 RAID 1(0+1)
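The two geometries can be checked with plain arithmetic. The sketch below is only an illustration, assuming 512-byte sectors as reported by dmesg; the function name is made up for this example. It multiplies out both cylinder/head/sector layouts and compares them with the sector count from dmesg.

```python
SECTOR_BYTES = 512  # block_size= reported by dmesg


def chs_capacity(heads, sectors_per_track, cylinders):
    """Return (total sectors, total bytes) addressable by a CHS geometry."""
    sectors = heads * sectors_per_track * cylinders
    return sectors, sectors * SECTOR_BYTES


disk_sectors = 143305920                 # blocks= from dmesg
before = chs_capacity(255, 32, 17562)    # fdisk geometry before deployment
after = chs_capacity(255, 63, 8920)      # fdisk geometry after deployment

print(before)                            # (143305920, 73372631040)
print(after)                             # (143299800, 73369497600)
print(disk_sectors // (255 * 63))        # 8920 whole cylinders fit
print(disk_sectors - after[0])           # 6120 sectors left over
```

Both layouts describe the same 73372631040-byte disk. The 255/63 geometry simply cannot express the last partial cylinder, so about 3 MB at the end of the disk falls outside the cylinder count.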

The image deploy task also does not include the following in /etc/fstab

none /dev/pts devpts gid=5,mode=620 0 0
none /dev/shm tmpfs defaults 0 0

OS in use is RHEL3 Update 7

Are these also being investigated?


Br,

/teemu
John T. Willis
Occasional Advisor

Re: Linux image deploy

Teemu,

The [ df ] command often reports open file nodes that are still in use, but that may no longer exist on the file system. It's mostly used for checking "free space". The usage column can vary depending on the situation.

Please try using the [ du -s ] command on the directories that act as file system mount points to get a different perspective. It's mostly used for checking files that occupy file system space. The "-s" option will summarize the results for the parent and all descending directories.

/
/boot
/home
/opt
/..

I'd like to see your before and after results if you don't mind.
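For a side-by-side view of the two perspectives, here is a rough Python sketch. It is not part of the deployment tooling, and the function names are invented for this example. The first function reads the filesystem's own counters, which is where df gets its numbers; the second adds up the allocated blocks of every file on the same device, roughly what du -s does when told to stay on one file system.

```python
import os


def df_used_kb(mount_point):
    """Used space from the filesystem's own counters (what df reads)."""
    st = os.statvfs(mount_point)
    return (st.f_blocks - st.f_bfree) * st.f_frsize // 1024


def du_used_kb(mount_point):
    """Used space from walking the tree and summing allocated blocks."""
    root_dev = os.lstat(mount_point).st_dev
    seen_inodes = set()
    total_sectors = 0

    def add(path):
        nonlocal total_sectors
        try:
            st = os.lstat(path)
        except OSError:
            return False          # vanished or unreadable entry
        if st.st_dev != root_dev:
            return False          # another file system is mounted here
        if st.st_ino not in seen_inodes:
            seen_inodes.add(st.st_ino)        # count hard links once
            total_sectors += st.st_blocks     # st_blocks is in 512-byte units
        return True

    add(mount_point)
    for dirpath, dirnames, filenames in os.walk(mount_point):
        # Count each subdirectory and do not descend into other devices.
        dirnames[:] = [d for d in dirnames if add(os.path.join(dirpath, d))]
        for name in filenames:
            add(os.path.join(dirpath, name))
    return total_sectors * 512 // 1024
```

Calling both functions on a mount point such as /usr, as root, should give figures in the same range on a healthy file system. A large gap in either direction is the kind of discrepancy worth reporting.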

The disk geometry changes due to the tools used to partition the disk.

Each Linux OS is installed with a native installer provided by the OS vendor that uses tools packed with that installer.

Since each toolset can vary between OS vendor, version and release, differences can arise in the default treatment of disk geometry.

The Capture and Restore process uses a single toolset common across all managed OS vendors, versions and releases.

Captures are file-based and copy the files found on the disk file systems being captured.

Restore repartitions the disk using the common partition toolset and treats disk geometry consistently. File systems are created, and the files are restored to the new file systems.

We test each supported combination of OS and file system to ensure the restored OS will be able to mount and use the new partitions and file systems.

The partition Id type of "- f -" is used by Win95 and similar products to represent an Extended partition; it's a placeholder for the Logical partitions that will be created inside the Extended partition on the disk.

Only four Primary "type" partitions are supported by convention on a single disk.

The Extended "type" is usually used on the fourth partition so that more partitions can be created on the disk.

The "recognition" of Extended Logical partitions is OS dependent and thus the vendor of the OS usually determines the Id number assigned to the Extended partition.

The partition Id type of "- 83 -" is used by Linux to represent an Extended partition type.

The original disk must have been created using a tool that chose to use the Id of "- f -" for the Extended partition type.

Most of the time Linux will honor this and understand it is a valid Extended partition Id, however Win95 OS types generally react badly to Linux partition types given the Id of 5 that are used for Linux file systems.

Pending the results of your before and after using [ du ], we'll be looking into any potential issue.

However the block counts in the fourth column from [ df ] suggest the Restore proceeded as it should.

Disk geometry and position on the disk probably accounts for the slight difference in numbers. This would be expected.

Here are your results, slightly rearranged.

Before image deploy
# df
Available Use% Mounted on
_7675044 3% /
__480112 3% /boot
13682636 1% /home
_3885816 1% /opt
_3885808 1% /tmp
14703260 7% /usr
15610224 1% /var

Same system after image deploy. Image was captured from the same server (right after the above df command was issued) it was deployed to

# df
Available Use% Mounted on
_7802148 1% /
__471261 2% /boot
13685780 1% /home
_3884604 1% /opt
_3884604 1% /tmp
15644232 1% /usr
15644072 1% /var
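The same rearrangement can be done mechanically by joining the two df listings on the mount point instead of reading them line by line. The following is a small sketch using the figures posted earlier in this thread; the helper names are made up for illustration.

```python
DF_BEFORE = """\
/dev/cciss/c0d0p2 8256824 162356 7675044 3% /
/dev/cciss/c0d0p1 518000 11576 480112 3% /boot
/dev/cciss/c0d0p7 14449472 32848 13682636 1% /home
/dev/cciss/c0d0p8 4128368 32844 3885816 1% /opt
none 1027636 0 1027636 0% /dev/shm
/dev/cciss/c0d0p9 4128368 32852 3885808 1% /tmp
/dev/cciss/c0d0p6 16513688 971584 14703260 7% /usr
/dev/cciss/c0d0p5 16513688 64620 15610224 1% /var
"""

DF_AFTER = """\
/dev/cciss/c0d0p2 8254272 32828 7802148 1% /
/dev/cciss/c0d0p8 4127076 32828 3884604 1% /opt
/dev/cciss/c0d0p9 14452776 32828 13685780 1% /home
/dev/cciss/c0d0p7 16516052 32988 15644072 1% /var
/dev/cciss/c0d0p5 4127076 32828 3884604 1% /tmp
/dev/cciss/c0d0p6 16516052 32828 15644232 1% /usr
/dev/cciss/c0d0p1 505604 8239 471261 2% /boot
"""


def parse_df(text):
    """Map mount point -> (device, used KB) for each six-column df row."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6 or not fields[2].isdigit():
            continue  # skip headers and anything that is not a data row
        device, _size, used, _avail, _pct, mount = fields
        rows[mount] = (device, int(used))
    return rows


def join_by_mount(before_text, after_text):
    """Return (mount, used_before, used_after, dev_before, dev_after) rows."""
    before, after = parse_df(before_text), parse_df(after_text)
    return [
        (m, before[m][1], after[m][1], before[m][0], after[m][0])
        for m in sorted(before.keys() & after.keys())
    ]


for row in join_by_mount(DF_BEFORE, DF_AFTER):
    print("%-6s used %7d -> %7d KB   %s -> %s" % row)
```

Joined this way, the "Used" column is where the listings really differ (for /usr, 971584 KB before versus 32828 KB after), and the device column shows that /home, /opt, /tmp and /var landed on different partition numbers after the restore.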

Thank you for your comments, and keen observations.

- JT
John T. Willis
Occasional Advisor

Re: Linux image deploy

Teemu,

Correction: the sentence that read:

The partition Id type of "- 83 -" is used by Linux to represent an Extended partition type.

Should have said:

The partition Id type of "- 5 -" is used by Linux to represent an Extended partition type.

Sorry for the mistake.

- JT
Teemu Turpeinen
Advisor

Re: Linux image deploy

Hello.

Thank you for the response. The 'du' command shows correct information for each filesystem; for example, for "/usr" it shows over 900 MB both before and after image deployment. I probably should have mentioned that in the previous post. The point was that this 900 MB of data does not show up in the 'df' output after image deployment, although it is there.

I am aware that 'df' might sometimes report the used space incorrectly, i.e. that more space is in use than there actually is, but I have never seen it report too little (1% in use when it should be about 7%). The gap is too big to be explained by different HD geometry alone.

I was not that concerned about the partitions created on the disk, but more about the geometry, which is reported differently by 'fdisk' and 'dmesg'. Probably this is not something to be concerned about, just something we noticed. The original disk was created with the RedHat installer (kickstart method).


There's also one other thing that caught my eye.

When fsck is run at boot, it displays statistics for each filesystem.

This one is _after_ image deployment:
Dec 1 20:04:37 localhost fsck: /: clean, 22576/1048576 files, 73105/2096482 blocks
Dec 1 20:04:37 localhost fsck: /opt: clean, 11/524288 files, 24671/1048233 blocks
Dec 1 20:04:37 localhost fsck: /home: clean, 12/1836928 files, 65861/3670844 blocks
Dec 1 20:04:37 localhost fsck: /var: clean, 21/2101152 files, 75241/4194965 blocks
Dec 1 20:04:37 localhost fsck: /tmp: clean, 1064/524288 files, 31405/1048233 blocks
Dec 1 20:04:37 localhost fsck: /usr: clean, 11/2101152 files, 74159/4194965 blocks
Dec 1 20:04:37 localhost fsck: /boot: clean, 11/130560 files, 24715/522080 blocks

This also shows incorrect information. For example the "/usr" partition again. fsck shows 11 files, while in reality, there is almost 50000:

# find /usr | wc -l
49874

fsck from boot.log after kickstart install from the same system shows correct information:
Dec 1 21:15:48 localhost fsck: /: clean, 22564/1048576 files, 73495/2097120 blocks
Dec 1 21:15:50 localhost fsck: /boot: clean, 39/65920 files, 4970/131576 blocks
Dec 1 21:15:50 localhost fsck: /home: clean, 15/1835008 files, 65799/3669956 blocks
Dec 1 21:15:50 localhost fsck: /opt: clean, 15/524288 files, 24675/1048556 blocks
Dec 1 21:15:50 localhost fsck: /tmp: clean, 13/524288 files, 24676/1048556 blocks
Dec 1 21:15:50 localhost fsck: /usr: clean, 46401/2097152 files, 308710/4194236 blocks
Dec 1 21:15:50 localhost fsck: /var: clean, 144/2097152 files, 82178/4194236 blocks

So what could cause the difference?

The time in the above lists is incorrect because of current BIOS settings, so don't pay attention to that.
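To make the comparison less error-prone, the boot.log lines can be parsed rather than read by eye. This is only a sketch; it assumes the usual e2fsck summary format, where each "N/M" pair is the in-use count followed by the total, and the pattern and function names are invented for the example.

```python
import re

# Pattern for the e2fsck "clean" summary as it appears in boot.log.
FSCK_LINE = re.compile(
    r"fsck: (?P<mount>[^:\s]+): clean, "
    r"(?P<files_used>\d+)/(?P<files_total>\d+) files, "
    r"(?P<blocks_used>\d+)/(?P<blocks_total>\d+) blocks"
)


def parse_fsck(log_text):
    """Map mount point -> dict of used/total inode and block counts."""
    result = {}
    for match in FSCK_LINE.finditer(log_text):
        fields = match.groupdict()
        mount = fields.pop("mount")
        result[mount] = {key: int(value) for key, value in fields.items()}
    return result


after_deploy = parse_fsck(
    "Dec 1 20:04:37 localhost fsck: /usr: clean, 11/2101152 files, 74159/4194965 blocks\n"
    "Dec 1 20:04:37 localhost fsck: /: clean, 22576/1048576 files, 73105/2096482 blocks\n"
)
after_kickstart = parse_fsck(
    "Dec 1 21:15:50 localhost fsck: /usr: clean, 46401/2097152 files, 308710/4194236 blocks\n"
    "Dec 1 21:15:48 localhost fsck: /: clean, 22564/1048576 files, 73495/2097120 blocks\n"
)

for mount in sorted(after_kickstart):
    ks, dep = after_kickstart[mount], after_deploy[mount]
    print(mount, "files in use:", ks["files_used"], "->", dep["files_used"],
          "| blocks in use:", ks["blocks_used"], "->", dep["blocks_used"])
```

For "/" the in-use counts barely move between the two boots, while for "/usr" they drop from 46401 files to 11, which is the discrepancy in question.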




Br,

/teemu
John T. Willis
Occasional Advisor

Re: Linux image deploy

Teemu,

Sorry it took so long to reply.

I'm still learning how to setup my forum notification messages.

Thanks for mentioning the "- du -" command, glad it provides encouraging results.

Going by measured "free space" using "- df -" is somewhat unreliable. Other things could be going on in the system that affect its report; it is Inode sensitive, and anything can open or close an Inode at any time.

One thing that comes to mind are sparse files, which by nature are mostly full of "hot air". After a restore they would need an index record access to "reinflate" them to their previous size.

I will admit 7% may sound high, but I can only speak from my experience, which is not to use "- df -" to provide an estimated count of files, or disk space in use from inference.

I would suggest you don't use "- df -" for judging the integrity of your files.

One thing that might also be confusing is the "order" the partitions are recreated in.

The Native installer creates its partitions in the order the designers of that OS Vendor, Version, Release needed, due to the order of install operations.

Capture saves the contents and size of the partitions required to restore the system, but does not necessarily put it back in the order the Native installer designers had to create them.

Rather the Deploy process recreates the partitions in the order found in the fstab file of the captured system.

This may have made it hard for you to notice that the before and after results are not line for line the same item unless you rearrange them based on the file system name.

It creates a visual bias that something is further out of alignment than it actually is.

It's hard to reformat your data in a fixed-width font in this forum, but after eliminating all but the easiest-to-interpret columns, here's an attempt:

fsck from boot.log after kickstart install
files 1048576 /
files 524288 /opt
files 1835008 /home
files 2097152 /var
files 524288 /tmp
files 2097152 /usr
files 65920 /boot

This one is _after_ image deployment:
files 1048576 /
files 524288 /opt
files 1836928 /home
files 2101152 /var
files 524288 /tmp
files 2101152 /usr
files 130560 /boot

fsck from boot.log after kickstart install
blocks 2097120 /
blocks 1048556 /opt
blocks 3669956 /home
blocks 4194236 /var
blocks 1048556 /tmp
blocks 4194236 /usr
blocks 131576 /boot

This one is _after_ image deployment:
blocks 2096482 /
blocks 1048233 /opt
blocks 3670844 /home
blocks 4194965 /var
blocks 1048233 /tmp
blocks 4194965 /usr
blocks 522080 /boot

e2fsck is a common program, but I'll be the first to admit I am not an expert in its output.

Theoretically it reconstructs the disk by reading each block in the system, then reconstructs the inodes, the directories and the directory relationships.

The numbers in front of the "- / -" slash in the files and blocks columns may refer to the directory inodes or to the journaled logs it replayed for each file system, but I do not know for sure.

The answer can probably be found in the source code of the e2fsck program developed by Theodore Ts'o.

I wouldn't strictly recommend relying on fsck initscript messages for checking the number of files and the file system integrity.

Especially since its output will vary depending on the actual file system type used and author of the fsck extension for that file system type.

I would recommend, if you like, using a "- find -" count as you suggest, and perhaps an "- md5sum -" command on all of the files in the filesystems that are important and "- diff -" the results.

Obviously /var and /tmp will be highly variable and change every time the system is booted.
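As a rough illustration of that find / md5sum / diff approach, here is a Python sketch. The function names and the exclude parameter are hypothetical, not part of any product. It builds a checksum manifest of a directory tree, skipping volatile subtrees, and reports what differs between two manifests.

```python
import hashlib
import os


def manifest(root, exclude=()):
    """Map relative path -> MD5 hex digest for regular files under root.

    exclude lists subdirectories (relative to root) to skip, e.g. "var".
    """
    excluded = set(exclude)
    result = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune excluded subtrees so os.walk never descends into them.
        dirnames[:] = [
            d for d in dirnames
            if os.path.relpath(os.path.join(dirpath, d), root) not in excluded
        ]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # only hash regular files
            digest = hashlib.md5()
            with open(path, "rb") as handle:
                for chunk in iter(lambda: handle.read(1 << 20), b""):
                    digest.update(chunk)
            result[os.path.relpath(path, root)] = digest.hexdigest()
    return result


def compare(before, after):
    """Return (missing, added, changed) relative paths between two manifests."""
    missing = sorted(set(before) - set(after))
    added = sorted(set(after) - set(before))
    changed = sorted(p for p in set(before) & set(after) if before[p] != after[p])
    return missing, added, changed
```

One way to use it would be to save manifest("/", exclude=("var", "tmp", "proc", "dev")) before capture, build it again after deployment, and pass both to compare(). Three empty lists would mean every checked file came back byte for byte.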

And since the most common file system type, and the default recommended in ICLE Kickstarts, is a journaling file system, Ext3, you can expect the file system count to drift upwards depending on file access patterns and how long the system has been online before capture.

The longer the system is online after kickstart, the more opportunity for user variations in the number of files created, and automated processes to create files or add to them based on logins and cron jobs.

Again thanks for your observations, and great comments.

- JT
Teemu Turpeinen
Advisor

Re: Linux image deploy

Hello John.

Thanks for the answers. I guess the conclusion is that the system is ok, though there seem to be a few differences before and after deployment.

As the environment we're building is quite critical, we wanted to make sure that these differences before and after deployment are not something to be concerned about. Especially since with Rapid Deployment Pack things worked differently.

Again, thank you for the responses.

Br,

/teemu