04-13-2005 10:46 PM
Need help on creating script to split data
I need some help and guidance creating a shell script to split data into different files.
I have data in one file that looks like this:
file1:
|A|LR|
|B|LR|
|B|FO|
|C|LR|
|D|LR|
|D|FO|
|E|LR|
|F|LR|
|G|LR|
|G|FO|
I want to split the "double entries" (B, D and G) into one file and the "single entries" (A, C, E and F) into another file.
I would appreciate it if you could help me do so.
Regards,
Munawwar
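For grouped data like the sample above, one compact approach (not from the thread; a sketch in which the output file names `singles` and `dups` are just illustrative) is a two-pass awk that first counts each key and then routes every record:

```shell
# Sample data as in the question.
printf '|A|LR|\n|B|LR|\n|B|FO|\n|C|LR|\n|D|LR|\n|D|FO|\n|E|LR|\n|F|LR|\n|G|LR|\n|G|FO|\n' > file1

# Pass 1 (NR==FNR) counts occurrences of the key in field 2;
# pass 2 writes each record to "dups" or "singles" accordingly.
awk -F'|' 'NR==FNR { count[$2]++; next }
           { print > (count[$2] > 1 ? "dups" : "singles") }' file1 file1
```

Reading the file twice means this does not rely on duplicate records being adjacent, which the uniq-based approaches in the thread require.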
04-13-2005 10:52 PM
Re: Need help on creating script to split data
grep -E '^\|(B|D|G)\|' file1 > file3
SEP
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Tags: grep
04-13-2005 11:02 PM
Re: Need help on creating script to split data
Enjoy, Have FUN! H.Merijn
Tags: Perl
04-13-2005 11:18 PM
Re: Need help on creating script to split data
The thing is that I have about 30,000 such records in one file.
Actually, the first field represents a number:
A = 12345
B = 34521
C = 25431
D = 43521
E = 54213
F = 32541
G = 45123
The duplicate entries, i.e. B, D and G, have the same number.
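Since the duplicate entries share the same number and sit next to each other, `uniq -d` can list just the repeated keys directly; a minimal sketch (the file names `dupkeys`, `dups` and `singles` are illustrative, not from the thread):

```shell
# Sample grouped data (letters stand in for the real numbers).
printf '|A|LR|\n|B|LR|\n|B|FO|\n|C|LR|\n' > file1

# List each key that occurs more than once, then anchor it as ^|KEY|
cut -d'|' -f2 file1 | uniq -d | sed 's/.*/^|&|/' > dupkeys
grep  -f dupkeys file1 > dups      # records whose key repeats
grep -vf dupkeys file1 > singles   # everything else
```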
04-13-2005 11:24 PM
Re: Need help on creating script to split data
TMTOWTDI (There's More Than One Way To Do It)
Enjoy, Have FUN! H.Merijn
04-14-2005 12:09 AM
Re: Need help on creating script to split data
#!/usr/bin/sh
# Extract keys that occur only once (uniq -u assumes the input is grouped by key)
cut -d'|' -f2 datafile.lis | uniq -u > unique.lis
# Turn each key into an anchored grep pattern, e.g. ^|A|
sed '1,$ s/^/^|/' unique.lis > unique2.lis
sed '1,$ s/$/|/' unique2.lis > unique.lis
rm unique2.lis
# Extract uniques
grep -f unique.lis datafile.lis > unique.data
# Extract duplicates
grep -vf unique.lis datafile.lis > dup.data
rm unique.lis
datafile.lis is the input filename.
Regards
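The sed and grep steps above can be condensed into a single pipeline; this is just a restatement of the same uniq -u idea, still assuming the input is grouped by key:

```shell
# Build sample input (grouped by key, as in the thread).
printf '|A|LR|\n|B|LR|\n|B|FO|\n|C|LR|\n' > datafile.lis

# One sed call both anchors (^|) and terminates (|) each unique key.
cut -d'|' -f2 datafile.lis | uniq -u | sed 's/^/^|/; s/$/|/' > unique.lis
grep  -f unique.lis datafile.lis > unique.data
grep -vf unique.lis datafile.lis > dup.data
```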
Tags: uniq
04-14-2005 01:17 AM
Re: Need help on creating script to split data
I like the uniq method myself.
Do you know the records are in order, and have just a single duplicate per key?
Here is some somewhat convoluted awk to do the job:
----- x.awk ---------
{
  if ($2 == key) {
    print last >> "dups"
    dup = 1
  } else {
    if (dup) {
      print last >> "dups"
      dup = 0
    } else {
      if (NR > 1) { print last }
    }
  }
  last = $0
  key = $2
}
END { if (dup) { print last >> "dups" } else { print last } }
It processes the previous record based on whether the current key matches it.
It has to avoid printing anything for the first record, and it has to special-case the very last record at the end. Yikes.
Usage with your sample data in file 'x'
# awk -F"|" -f x.awk x
|A|LR|
|C|LR|
|E|LR|
|F|LR|
# cat dups
|B|LR|
|B|FO|
|D|LR|
|D|FO|
|G|LR|
|G|FO|
If you just have 30,000 records or so, then you can readily suck them into perl and spit them back out based on dups or not:
----- x.pl -----------
while (<>) {
$key = (split(/\|/))[1];
$records{$key} .= $_;
}
open (DUPS, ">dups");
foreach $key (sort keys %records) {
$_ = $records{$key};
if (/\n\|/) {print DUPS} else {print};
}
-----------------
So here each record gets concatenated with any prior data for its key. If there was nothing yet, the entry is just that new record; if there was something, the record gets appended.
When all input is read, retrieve each key and its data. If the data contains a newline followed by a bar, the key must have been a dup!
Usage: # perl x.pl x
hth,
Hein.
04-14-2005 03:09 AM
Re: Need help on creating script to split data
Thanks for the input... I will try them tomorrow and see which one works :-)
/munawar
04-14-2005 03:14 AM
Re: Need help on creating script to split data
We like feedback as well. That way we can also improve ourselves.
Enjoy, Have FUN! H.Merijn
04-14-2005 03:42 AM
Re: Need help on creating script to split data
Here is an alternate perl solution, suitable for much larger files. It makes two passes over the input: the first just counts the occurrences of each key; the second prints each record to the right file based on its key's count.
----
$file = shift @ARGV or die "please provide file";
open (IN, "<$file") or die "Could not open $file";
while (<IN>) {
    $keys{(split(/\|/))[1]}++;
}
close (IN);
open (DUPS, ">dups");
open (IN, "<$file") or die "Could not reopen $file";
while (<IN>) {
    if ($keys{(split(/\|/))[1]} > 1) { print DUPS } else { print }
}
-----------------
variant second part:
while (<IN>) {
    $filehandle = ($keys{(split(/\|/))[1]} > 1) ? *DUPS : *STDOUT;
    print $filehandle $_;
}
Cheers,
Hein.