Disk write/cache and network
05-23-2003 11:42 AM
We have a cron process that uploads data from some real-time systems to the HP9K server every day. File sizes range from 10 KB to 30 GB. After the upload, the script compares the checksums of the source and destination files. Intermittently this comparison fails with timeout errors and incorrect file size errors. One suspect is the network and the other is the SCSI channel. Now, my questions:
1) During the transfer of big files, is the data first cached and then written to disk? If so, could the checksum test return the size of an incomplete file (one that is still in cache or on the network)?
2) Can anybody give a detailed description of the data transfer/write process when a file is transferred over the network?
I just need some input from you to test my assumptions.
Thanks!
Shiju
05-23-2003 11:51 AM
Re: Disk write/cache and network
I can say we had similar problems and they were caused by the network. We had files failing checksum after transfer and large transfers sometimes hanging or failing.
Specifically, a Cisco router was set to autonegotiate on the HP port.
lanadmin -x 1 showed 100 Base T full duplex autonegotiate.
The problem was eventually solved with two steps:
1) Set the router to explicit 100 BaseT Full Duplex.
2) Set the expected speeds for the NIC in /etc/hpbtlanconf
I'm attaching our conf file.
I'm sure you know all of this, but it is interesting that we have the same symptoms.
SEP
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
05-23-2003 11:51 AM
Re: Disk write/cache and network
The OS is 11.0, fully patched. Also, when I said cache, I actually meant the disk array and controller cache, system memory, etc. My bad :((
05-23-2003 11:53 AM
Re: Disk write/cache and network
If I assume that this is TCP/IP, then the packets have to be buffered: they may not even arrive in order, so the receiver has to reassemble them.
I will say that transferring 30GB files over anything except a rock-solid network is asking for trouble. If this were me, I would split the files into smaller chunks, transfer them with error checking, and then cat them together. That way, if a transmission error occurs, error recovery is much faster/cheaper than for a huge file.
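The split / verify / cat idea above can be sketched roughly as follows. This is only an illustration in Python, not what anyone in this thread actually ran: the function names, the 64 MB chunk size, and the use of SHA-256 (instead of cksum) are all assumptions, and the transfer step itself is left out.

```python
import hashlib
from pathlib import Path

CHUNK_BYTES = 64 * 1024 * 1024  # hypothetical chunk size; tune for your link


def split_file(src, out_dir, chunk_bytes=CHUNK_BYTES):
    """Split src into numbered chunk files.

    Returns a manifest: a list of (chunk_path, sha256_hex) tuples in order.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    manifest = []
    with open(src, "rb") as fh:
        index = 0
        while True:
            data = fh.read(chunk_bytes)
            if not data:
                break
            part = out_dir / f"{Path(src).name}.part{index:05d}"
            part.write_bytes(data)
            manifest.append((part, hashlib.sha256(data).hexdigest()))
            index += 1
    return manifest


def join_chunks(manifest, dest):
    """Check each chunk against its recorded digest, then append it to dest.

    Raises ValueError on the first bad chunk, so only that chunk needs
    to be sent again rather than the whole file.
    """
    with open(dest, "wb") as out:
        for part, expected in manifest:
            data = Path(part).read_bytes()
            if hashlib.sha256(data).hexdigest() != expected:
                raise ValueError(f"checksum mismatch in {part}")
            out.write(data)
```

In practice the manifest would travel with the chunks and `join_chunks` would run on the receiving side; a failed chunk leaves a partial `dest` behind, which the caller should discard and rebuild after the resend.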
05-23-2003 12:00 PM
Re: Disk write/cache and network
Clay, sorry, my bad! It happens when you have a bunch of issues on the Friday before a long weekend =))
Most of the RT systems use NFS mount points and some of them use an rcp command.
05-23-2003 12:50 PM
Re: Disk write/cache and network
Have a great weekend to all, I will check the responses again on Tuesday.
Thanks!
Shiju
05-27-2003 07:17 AM
Re: Disk write/cache and network
Thanx!
05-27-2003 11:43 AM
Re: Disk write/cache and network
As I see it you have possibly two types of caches at work here.
1) Let's tackle the first cache type: HW (either on the HBA itself, on the disk's controller, or both). As far as the OS is concerned, as soon as the controller/HBA reports that the write is complete, the data is ON the disk, while in actuality it may still be in the HW cache. Most HW caches report write complete once the data is safely in ECC cache memory. You need to find out how often this cache is flushed so you can allow for the maximum time before checking the disk. On some hardware - not all - this reporting & flush time is configurable. At the very least you should know the absolute max time from HW cache to disk & adjust the program to allow for this latency.
2) The second cache is the SW (or OS) buffer cache & you need to check the filesystem mount options to see whether you're using this cache (async write) or not using it (sync write). There are *many* reasons why one would or would not want to use the buffer cache & they mostly come down to performance, integrity or memory usage. The normal flush rate for the OS is *roughly* 5-6 times a minute, but flushes can come sooner under high disk usage, when all buffer entries fill before the next *normal* flush by the sync command.
But again you're in the position where the HW will tell the OS that the write is complete, even though the data sits in the HW cache.
So as I see it you need to determine:
A) What the max HW cache to actual disk write latency is
B) Whether you write async (using buf cache) or sync (NO buf cache) in those filesystems
and ensure the program has enough pause time for the sum of the max values of both.
I really don't think network latency affects you here, as the transfer mechanism probably will not report complete until all the data makes it to the destination intact. IF you allow time for all caches involved to flush, you are MUCH less likely to have a misreported file size due to incomplete cache flushes.
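One way to build that pause into the checking script, sketched in Python under some assumptions: instead of a fixed sleep, it polls the file size and only checksums once the size has stopped changing. The polling approach, the function names, and the default intervals are illustrative choices, not something taken from the setup described here, and a stable size says nothing about whether the HW cache has reached the platters.

```python
import hashlib
import os
import time


def wait_until_stable(path, interval=5.0, checks=3, timeout=600.0):
    """Poll the file size until it stops changing.

    Returns the size once `checks` consecutive polls match the previous
    observation; raises TimeoutError if the file keeps changing.
    """
    deadline = time.monotonic() + timeout
    last_size = -1
    stable = 0
    while time.monotonic() < deadline:
        size = os.path.getsize(path)
        if size == last_size:
            stable += 1
            if stable >= checks:
                return size
        else:
            # Size changed (or first poll): restart the stability count.
            stable = 0
            last_size = size
        time.sleep(interval)
    raise TimeoutError(f"{path} still changing after {timeout} seconds")


def sha256_of(path, block=1024 * 1024):
    """Checksum a file in fixed-size blocks so a 30 GB file never sits in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(block), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A checking script would call `wait_until_stable(path)` first and then compare `sha256_of(path)` with the digest computed on the source side; `interval` and `checks` should cover the worst-case flush latencies listed in A) and B).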
HTH,
Jeff
05-28-2003 05:43 AM
Re: Disk write/cache and network
Cksum or any other command really doesn't care if the file is in cache or on disk or any state in between - the buffer cache will take care of that and satisfy the read requests as needed.
If this were me, I would rethink the entire approach. I would much rather have error checking built into the transfer so that you know immediately if a) the transfer timed out or b) any other errors were encountered. There is then no need for a cksum because the error reporting is integral to the transfer. One method is to use the Net::FTP Perl module. You can easily script the transfer, and all the error checking you need is available simply by looking at a variable. If $status == 2, all is well and that's all you need to know.
If you do a boolean search for Perl and FTP, you should find examples. The Net::FTP module also uses the .netrc conventions so that the passwords do not have to be included in the source or passed as command line arguments.
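As a rough illustration of error checking built into the transfer, here is a sketch that uses Python's standard `ftplib` rather than the Perl Net::FTP module mentioned above. The helper name `upload_with_check` is hypothetical, and the extra size comparison relies on the server supporting the FTP SIZE command, which is an assumption; if it does not, the helper simply reports failure.

```python
import ftplib
import os


def upload_with_check(ftp, local_path, remote_name):
    """Upload one file over an already logged-in FTP connection.

    Returns True only if the server answered with a 2xx reply and the
    remote size matches the local size. Timeouts, dropped connections
    and FTP error replies all come back as False.
    """
    try:
        with open(local_path, "rb") as fh:
            reply = ftp.storbinary("STOR " + remote_name, fh)
        if not reply.startswith("2"):
            return False
        remote_size = ftp.size(remote_name)
    except ftplib.all_errors:
        # all_errors covers FTP protocol errors, OSError and EOFError.
        return False
    return remote_size == os.path.getsize(local_path)
```

With a real server, `ftp` would be an `ftplib.FTP` instance created with a timeout and logged in beforehand; credentials can be read from `.netrc` with the standard `netrc` module instead of being hard-coded in the script.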
05-28-2003 05:58 AM
Re: Disk write/cache and network
Thanks for the information Jeff, much appreciated. I will look into the parameters you mentioned. I have some questions though:
a) What's the default FS mount (vxfs) value - sync or async?
b) If the data is still in cache during a cksum command, will it return "time out" errors instead of an incomplete value?
c) I did a performance comparison between network and local checksum processes. The same directory on the host was used for the output in both cases, and there was a huge performance drop when I went over the network (NFS). Does that suggest anything?
Thanks ..points will follow!
05-28-2003 07:00 AM
Re: Disk write/cache and network
Did you mean that the buffer cache will satisfy the cksum command with the correct value (even if the data is still in cache)?
Thanks for the idea about the Net::FTP Perl module, but unfortunately I cannot do anything with it, since the data transfer is invoked by the RT systems, which are administered by a different set of people. I will definitely pass the suggestion along.
05-28-2003 07:14 AM
Solution
If you want to completely eliminate the buffer cache, mount the filesystem with convosync=direct,mincache=direct - that will bypass the buffer cache, but you will see a performance hit for precisely that reason. (Even in this case, the on-disk cache will still be in play, but that's completely invisible to the OS.)
05-28-2003 07:17 AM
Re: Disk write/cache and network
The default for a filesystem created from the command line is log, which is a sync write.
The default for one created through SAM is delaylog, which is an async write.
You also need to remember that there are two types of writes - Data & Metadata (inode info). Metadata is almost always sync unless overruled by mount options.
mount -v
will show the FS mount options
mount -o remount,option1,option2 /dev/vg_name/lv_name /mnt_point
will allow you to change options on the fly w/o unmounting. Easy way to test performance & integrity implications.
BTW - Clay is correct that any reads will force OS buffer flushes and I would *suppose* HW cache flushes as well.
HTH,
Jeff
05-28-2003 07:30 AM
Re: Disk write/cache and network
Any more input is welcome!
Shiju