scripts in shell
04-09-2008 03:06 PM
Could you send me a small program for the following problem? I hope you can. :)
I have 2 files, file A and file B, each with 2 columns, and neither is sorted. The two files have some numbers in common in the first column. I would like to get a new file (file C) containing the first-column numbers the files have in common, together with the corresponding second-column numbers from both files. For example:
file A:
12345000001 987230001
12345000006 987230002
12345000003 876450001
12394000012 789123456
File B:
12345000001 987230005
12345000004 987231202
12345000003 876450014
12394000012 789123466
File C:
12345000001 987230001
12345000003 876450001 876450014
12394000012 789123456 789123466
I hope you can help me.
Best regards,
Christian
04-09-2008 03:43 PM
Why does 12345000001 not show in C? Should it?
Will the output be sorted? Could it be?
What should happen to records that have no match? Drop them?
Is there one dominant/driver file and the other a secondary/slave, or do both weigh equally?
How much data? Less than a megabyte? More than a gigabyte?
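If the output may be sorted, then just use join. It needs both inputs sorted on the first column:
join A B > C
On the unsorted files join misses a match, so sort them first: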
$ join A B
12345000001 987230001 987230005
12394000012 789123456 789123466
$ sort A > AA
$ sort B > BB
$ join AA BB > C
$ cat C
12345000001 987230001 987230005
12345000003 876450001 876450014
12394000012 789123456 789123466
$ perl -e 'open B,"<B"; while (<B>) {($k,$v)=split; $b{$k}=$v}; open A,"<A"; while (<A>){($k,$v)=split; print qq($k $v $b{$k}\n)}'
12345000001 987230001 987230005
12345000006 987230002
12345000003 876450001 876450014
12394000012 789123456 789123466
$ perl -e 'open B,"<B"; while (<B>) {($k,$v)=split; $b{$k}=$v}; open A,"<A"; while (<A>){($k,$v)=split; print qq($k $v $b{$k}\n) if $b{$k}}'
12345000001 987230001 987230005
12345000003 876450001 876450014
12394000012 789123456 789123466
$
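The sort-then-join steps above could be collected into a small script along these lines. This is only a sketch: the sample data is copied from the question, AA and BB are the sorted scratch copies, and forcing LC_ALL=C so that sort and join agree on the ordering is my own assumption, not something the original commands did.

```shell
#!/bin/sh
# Sketch: sort both inputs on the key column, then join them.
# A, B, C are the file names from the question; AA and BB are scratch copies.

# Sample data from the question (replace with the real files).
cat > A <<'EOF'
12345000001 987230001
12345000006 987230002
12345000003 876450001
12394000012 789123456
EOF
cat > B <<'EOF'
12345000001 987230005
12345000004 987231202
12345000003 876450014
12394000012 789123466
EOF

# join expects both inputs sorted on the join field using the same collation,
# so sort and join are run under the same locale (an assumption of this sketch).
LC_ALL=C sort -k1,1 A > AA
LC_ALL=C sort -k1,1 B > BB

# Default join: match on field 1, print only keys present in both files.
LC_ALL=C join AA BB > C

cat C
```

Keys that appear in only one file (12345000006 and 12345000004 here) are dropped, which matches the corrected file C.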
hth,
Hein
04-09-2008 03:59 PM
Re: scripts in shell
Yep, you are right. It should be like this:
file A:
12345000001 987230001
12345000006 987230002
12345000003 876450001
12394000012 789123456
File B:
12345000001 987230005
12345000004 987231202
12345000003 876450014
12394000012 789123466
File C:
12345000001 987230001 987230005
12345000003 876450001 876450014
12394000012 789123456 789123466
04-09-2008 04:02 PM
Re: scripts in shell
04-09-2008 04:25 PM
Re: scripts in shell
Believe me, it does matter.
If both files are small, any solution will do.
If one file is much (10x) smaller than the other, and less than, say, 100 MB, then you want to load that one into memory first. Next, read the bigger file and look for matches in memory (the second perl solution).
If both files are large (> 1 GB), then you probably want to stick to sorting first, then joining.
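For the "small file in memory" case, the same idea as the second perl one-liner can also be sketched with awk from the shell. This is an illustration, not a tuned solution: it assumes B is the smaller file, that keys in B are unique, and that B is not empty (the NR == FNR test relies on that). The array name b is just a label I picked.

```shell
#!/bin/sh
# Sketch: load the smaller file (assumed to be B) into memory,
# then stream the bigger file (A) and print only matching keys.

# Sample data from the question (replace with the real files).
cat > A <<'EOF'
12345000001 987230001
12345000006 987230002
12345000003 876450001
12394000012 789123456
EOF
cat > B <<'EOF'
12345000001 987230005
12345000004 987231202
12345000003 876450014
12394000012 789123466
EOF

awk '
    # While reading the first file named on the command line (B):
    # remember its second column, indexed by the key in column 1.
    NR == FNR { b[$1] = $2; next }
    # While reading the second file (A): print the line plus the
    # value from B, but only when the key was seen in B.
    $1 in b   { print $1, $2, b[$1] }
' B A > C

cat C
```

No sorting is needed here, and the output keeps the order of file A. Memory use grows with the size of B only, which is why the smaller file should be the one that is loaded first.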
Hope this helps some,
Hein van den Heuvel (at gmail dot com)
HvdH Performance Consulting