06-09-2006 10:47 AM
Extracting data from a huge file
MY APPROACH: The data is in a huge file, in chunks delimited by, let's say, these two strings, i.e. a chunk starting at "
!!!!!!!!!!!!!!!EXAMPLE DATA!!!!!!
PIN INPUT(
first blah blah( blah
blah
blah blah !@#$#
blah
blah blah blah blah
blah
*** **PUT
!!!!!!!!!!!!!111
MY CODE
=======
loop1
IFS=":"
##DATA FROM BVR###
fileone=$(cat first.file)
dataone=${fileone#*"$i"}
dataone=${dataone%%PUT*}
print $dataone > buf.first
##DATA FROM EVR###
filetwo=$(cat second.file)
datatwo=${filetwo#*"$i"}
datatwo=${datatwo%%PUT*}
print $datatwo > buf.second
loop2
some basic grep data manipulation on
the chunks from above
endloop2
end of loop1
06-09-2006 10:54 AM
Re: Extracting data from a huge file
Awk is easier to learn, but a really sneaky method is to write your script in awk and then, when it is working as desired, use a2p to read your awk script and output an equivalent Perl script. Awk or Perl will probably be 50-100X faster than your current approach.
06-09-2006 11:02 AM
Re: Extracting data from a huge file
In my case the pattern is very simple: all data, as is, between " PUT" and "PUT" in a file. I can write a Python/Perl script, but I last wrote one a year ago, so any simple examples would be appreciated and I can then write my own code.
Thanks for the input
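A minimal sketch of the awk approach suggested above, wrapped in a shell function. The function name `extract_chunk` and the example markers are assumptions for illustration, not something from the original posts. It also assumes the markers are plain strings without backslashes (because `awk -v` interprets escape sequences) and that only the first chunk is wanted.

```shell
#!/bin/sh
# extract_chunk START END FILE   (hypothetical helper name)
# Prints every line from the first line containing START through the
# next later line containing END, then stops reading the file.
extract_chunk() {
    awk -v start_marker="$1" -v end_marker="$2" '
        # Not inside a chunk yet: look for the start marker as a literal string.
        !inside && index($0, start_marker) { inside = 1; print; next }
        # Inside the chunk: print the line and quit at the end marker.
        inside { print; if (index($0, end_marker)) exit }
    ' "$3"
}
```

Called as `extract_chunk 'PIN INPUT(' 'PUT' first.file > buf.first`, this reads the file one line at a time instead of holding it all in a shell variable, and `exit` stops the scan as soon as the chunk ends. The start line is never tested against the end marker, which matters here because "PIN INPUT(" itself contains "PUT".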
06-09-2006 12:40 PM
Re: Extracting data from a huge file
ex -s inputfile <
q
EOF
If you provide a representative sample of the input to be processed, the above ex code snippet can be tweaked further to meet the specified criteria.
06-09-2006 11:46 PM
Re: Extracting data from a huge file
If the amount of data you need (between the 'PUT' strings) is small compared to the whole file, it is best to put it into a temporary file first. Especially when you need to process this data more than once, this will really speed things up.
As for the parsing method, I would not say that a ksh solution is always much slower, but that holds only when your solution is written purely with builtin functions, e.g.
NOT
fileone=$(cat first.file)
(uses 'cat')
BUT
fileone=$(
BTW, awk would be my first choice nevertheless.
Regards, Peter
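The temporary-file advice above could be sketched roughly as follows. The helper name `cache_chunk`, the temporary file naming, and the example patterns are assumptions for illustration. The markers are passed as basic regular expressions for `sed` and must not contain a `/`.

```shell
#!/bin/sh
# cache_chunk START_RE END_RE FILE   (hypothetical helper name)
# Copies every START_RE..END_RE line range of FILE into a new temporary
# file and prints that file's name, so later passes can read the small
# file instead of rescanning the huge one.
cache_chunk() {
    tmp="${TMPDIR:-/tmp}/chunk.$$"
    sed -n "/$1/,/$2/p" "$3" > "$tmp" || return 1
    printf '%s\n' "$tmp"
}

# Example use (file name and grep patterns are placeholders):
#   chunk=$(cache_chunk 'PIN INPUT(' '\*\*PUT' first.file) || exit 1
#   grep -c 'blah' "$chunk"     # first pass over the small file
#   grep 'first' "$chunk"       # second pass, no rescan of first.file
#   rm -f "$chunk"
```

The huge file is read once by `sed`, and each later `grep` only touches the extracted chunk, which is where the saving comes from when the data is processed more than once.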
06-10-2006 06:07 AM
Re: Extracting data from a huge file
When I debugged it, this line takes 5 minutes when the pattern is in the middle of the file; near the top or bottom it takes seconds.
=======================
dataone=${fileone#*"$i"}
=========================
The Python/Perl equivalent gives the same delay. I will try this ex(1) thing, though.
Thanks all
06-10-2006 07:02 AM
Re: Extracting data from a huge file
Perhaps you could tell us what loop or loops you run over your data; what your pseudocode tells us about "$i" is really not clear.
Regards, Peter