Using sed or awk to validate a file content
12-03-2003 03:03 PM
I am new to the sed and awk commands. I have a text file containing lines of records whose fields are delimited by ";". Each record has 26 fields separated by 25 semicolons.
e.g.:
A;B;C;D;E;F;G;H;I;J;K;L;M;N;O;P;Q;R;S;T;U;V;W;X;Y;Z
Our objective is a shell script that validates whether each record in the text file is complete in terms of the number of columns. My idea is to count the delimiters in each record.
If a record has 25 delimiters, the complete record is written to another file; if it has fewer than 25, the record is written to a reject file.
Can this be accomplished with sed or awk? I am stuck at this point.
Any suggestions are very much appreciated. Thank you in advance!
Regards,
Melvin
12-03-2003 03:41 PM
Re: Using sed or awk to validate a file content
Awk's default field separator is white space.
But you can make it anything, including a semicolon (which you'll need to escape for the shell). Awk also keeps the number of fields on the current line in NF.
The two notions combined solve your problem:
awk -F\; '(NF==26); (NF!=26) {print $0 >> "bad-file"}' < mixed-file > good-file
Myself, I'd spend a few more minutes and create a Perl script to open the good and bad files, loop over the input, print to one or the other, and close.
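To see what awk makes of -F and NF before pointing it at real data, a quick check along these lines may help. It is only a sketch: the three short sample records and the variable name field_counts are made up for illustration and are not the 26-field layout from the question.

```shell
# Hypothetical sample records; the real file would have 26 fields per line.
# An empty field (as in 'A;;C') still counts as a field.
field_counts=$(printf '%s\n' 'A;B;C' 'A;B' 'A;;C' | awk -F';' '{ print NF }')
echo "$field_counts"
```

Each output line is the field count awk saw for the matching input line, so a record that is short of delimiters shows up as a smaller number.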
hth,
Hein.
12-03-2003 05:43 PM
Re: Using sed or awk to validate a file content
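while IFS= read -r line
do
    # Delete everything except ';' and count what is left (delimiters + newline)
    if [ $(printf '%s\n' "$line" | sed 's|[^;]||g' | wc -c) -ne 26 ]
    then
        printf '%s\n' "$line" >> rejectfile
    else
        printf '%s\n' "$line" >> another_file
    fi
done < files
The pipeline yields the number of ';' characters plus 1, which equals the number of columns. The plus 1 comes from the newline character at the end of the line, which wc -c also counts. Note that the output files must be appended to with >>; a plain > would overwrite them on every record.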
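The "delimiters plus newline" count can be checked on a single record. A minimal sketch, assuming a made-up three-field line (so the expected count is 3 rather than 26):

```shell
# Hypothetical 3-field record: 2 semicolons + 1 newline = 3 characters left.
line='A;B;C'
count=$(printf '%s\n' "$line" | sed 's/[^;]//g' | wc -c)
count=$((count))   # normalise the padding some wc implementations add
echo "$count"
```

The same arithmetic on a complete 26-field record gives 25 + 1 = 26, which is why the loop compares against 26.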
12-03-2003 07:08 PM
Re: Using sed or awk to validate a file content
A more traditional approach:
#!/usr/bin/sh
# Check for the CORRECT number of delimiters in each line of $1
typeset -i NUM_CHARS=0 AFTER_DEL=0 DIFF=0 CORRECT=25
typeset DELIM=";"
OK_FILE=./okrecs
REJ_FILE=./rejrecs
while IFS= read -r line
do
    # Length with and without delimiters; the difference is the delimiter count
    NUM_CHARS=$(printf '%s\n' "$line" | wc -c)
    AFTER_DEL=$(printf '%s\n' "$line" | tr -d "$DELIM" | wc -c)
    DIFF=$(( NUM_CHARS - AFTER_DEL ))
    if [ "$DIFF" -eq "$CORRECT" ]
    then
        printf '%s\n' "$line" >> "$OK_FILE"
    else
        printf '%s\n' "$line" >> "$REJ_FILE"
    fi
done < "$1"
Set OK_FILE and REJ_FILE to something appropriate and run it with your input file as $1.
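The same delimiter count can also be kept inside a single awk process, which avoids starting wc and tr for every record. This is a sketch under assumptions, not a replacement for the script above: the scratch directory, the sample.txt name, and want=2 (instead of 25, to keep the sample short) are all made up for the demo.

```shell
workdir=$(mktemp -d)            # scratch directory, assumed for the demo
printf '%s\n' 'A;B;C' 'A;B' 'D;E;F' > "$workdir/sample.txt"

# gsub() returns how many delimiters it replaced; "&" puts each one back,
# so the record itself is left unchanged.
awk -v want=2 -v ok="$workdir/okrecs" -v rej="$workdir/rejrecs" '
{
    n = gsub(/;/, "&")
    if (n == want) { print > ok } else { print > rej }
}' "$workdir/sample.txt"

cat "$workdir/okrecs"
```

With want=25 and real file names this does the same sorting as the loop, counting delimiters rather than fields.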
regards,
John K.
12-03-2003 07:21 PM
Re: Using sed or awk to validate a file content
12-04-2003 01:44 AM
Re: Using sed or awk to validate a file content
I'm curious to know why, judging by the assigned points, the author appears to value a simple one-line solution with the requested tool less than a broken, complex solution which will perform like a dog due to multiple forks per record.
(And there was even a bonus solution explanation! :-)
The explanation could be that I simply read too much into the points assigned: the author may be assigning medium points to a first solution while waiting to see whether something better shows up.
[no, I don't need more points, my boss might get worried where I found the time :^].
Just curious,
Hein.
12-04-2003 02:08 AM
Re: Using sed or awk to validate a file content
The AWK Programming Language, by the original authors of the language: Aho, Weinberger, and Kernighan.
It is a great little book, and you will be a master of awk by the time you are done. The sed & awk book from O'Reilly is good as well; you should be able to find it at a used book store for $5.
12-04-2003 02:09 AM
Solution
My one-liner would be marginally shorter still...
awk -F\; '(NF!=26) {print >> "bad-file";next} {print}' mixed-file > good-file
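A small dry run shows how next keeps rejected records out of the good file. This is a sketch only: the scratch directory, the file names inside it, and the three-field check (NF!=3 instead of NF!=26) are assumptions chosen to keep the sample short.

```shell
demo_dir=$(mktemp -d)           # scratch directory, assumed for the demo
printf '%s\n' 'A;B;C' 'A;B' 'A;B;C;D' 'X;Y;Z' > "$demo_dir/mixed-file"

# Rejects go to bad-file and skip the second rule; the rest reach stdout.
awk -F';' -v bad="$demo_dir/bad-file" \
    '(NF!=3) {print > bad; next} {print}' \
    "$demo_dir/mixed-file" > "$demo_dir/good-file"

echo "good:"; cat "$demo_dir/good-file"
echo "bad:";  cat "$demo_dir/bad-file"
```

The sketch uses > rather than >> inside awk. Within one awk run, > truncates the file when it is first opened and then keeps writing to it, so a rerun starts with an empty reject file, whereas >> keeps accumulating rejects from earlier runs.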
-- Graham
12-04-2003 03:48 AM
Re: Using sed or awk to validate a file content
awk '{
if(split($0,a,";")!=26) {print "bad"; exit}
}'
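Since split() returns the number of pieces, it can also drive a pass/fail check for the whole file through the exit status. A sketch under assumptions: the function name check_fields and the three-field samples are hypothetical, and the exit status is used here instead of printing "bad".

```shell
# check_fields N: read records on stdin, succeed only if every one has N fields.
# The function name and interface are hypothetical.
check_fields() {
    awk -v want="$1" '
        split($0, a, ";") != want { bad = 1; exit }
        END { exit bad + 0 }
    '
}

if printf '%s\n' 'A;B;C' 'D;E;F' | check_fields 3; then echo "all good"; fi
if ! printf '%s\n' 'A;B;C' 'D;E' | check_fields 3; then echo "bad record found"; fi
```

Like the original, this stops at the first bad record, so it answers "is the file clean?" rather than sorting records into two files.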
Rgds
Jean-Luc
12-04-2003 04:26 AM
Re: Using sed or awk to validate a file content
# perl -paF\; -e'select(@F==26?STDOUT:STDERR)' mixed-file > good-file 2>bad-file
Enjoy, Have FUN! H.Merijn
12-04-2003 06:04 AM
Re: Using sed or awk to validate a file content
12-04-2003 02:31 PM
Re: Using sed or awk to validate a file content
After looking through all the replies, I found that the solutions provided met the objective. Surprisingly, the performance (execution speed) of all these different approaches was almost equal when I tried them on a big file. As Hein commented, though, some ways of coding might take more resources. That's a good point to consider!
Lastly, I appreciate all your contributions here. Thank you very much!
12-04-2003 03:27 PM
Re: Using sed or awk to validate a file content
Hein.