Operating System - HP-UX

Re: How to grep for tab spaces?

 
Sean OB_1
Honored Contributor

How to grep for tab spaces?

I have a file with 25k lines in it.

I have another file with 100 lines in it.

I want to remove lines from the 25k file that contain a line from the 100-line file.

The 25k file looks like this:

nnnnn<tab>xxxxxx<tab>email

The 100-line file has just the xxxxxx entry.

The problem with just using grep is that many lines could contain a word from the 100-line file, but I only want to remove the record whose xxxxxx field is exactly that word.

ex:
25K file

12345<tab>carlot<tab>email1
2245<tab>car<tab>email1

100 line

car
boat
apple

I only want to remove
2245<tab>car<tab>email1

Any ideas?
James R. Ferguson
Acclaimed Contributor

Re: How to grep for tab spaces?

Hi Sean:

Based upon your last comment:

# perl -lne 'print unless m{\b2245\tcar\temail1\b}' yourfile

Regards!

...JRF...
Rodney Hills
Honored Contributor

Re: How to grep for tab spaces?

How about this perl one liner-

perl -ane 'print $_ unless $F[1] eq "car"' yourfile

HTH

-- Rod Hills
There be dragons...
Sean OB_1
Honored Contributor

Re: How to grep for tab spaces?

Any way to do all 100 lines at once? Or do I have to loop through each line in the 100 line file and run the perl command?
A. Clay Stephenson
Acclaimed Contributor

Re: How to grep for tab spaces?

Since you want to remove the line, we need to use -v to print the non-matching lines.

grep -E -v -e '<tab>car<tab>' < infile > outfile

You could also anchor your string for even better matching:

grep -E -v -e '^[0-9]+<tab>car<tab>' < infile > outfile

Note: <tab> means that you enclosed a TAB character within the quotes.
If it ain't broke, I can fix that.
Sean OB_1
Honored Contributor

Re: How to grep for tab spaces?


Note: <tab> means that you enclosed a TAB character within the quotes.

Ok, by this do you mean \t ?
Mel Burslan
Honored Contributor

Re: How to grep for tab spaces?

in my experimentation with ksh I found that

cat myfileof25klines | grep "\<tab>myword\<tab>"

yields the lines where myword is surrounded by tabs, but not the ones where it is not.

(<tab> means hit the Tab key after typing the backslash, not literally typing <tab> as above)

HTH
________________________________
UNIX because I majored in cryptology...
A. Clay Stephenson
Acclaimed Contributor

Re: How to grep for tab spaces?

No, I mean you hit the single-quote key and then hit the <tab> key.
If it ain't broke, I can fix that.
Sean OB_1
Honored Contributor

Re: How to grep for tab spaces?

ok, that doesn't work in Bash. Will try it in ksh.
Sean OB_1
Honored Contributor

Re: How to grep for tab spaces?

Sorry should have mentioned bash.
A. Clay Stephenson
Acclaimed Contributor

Re: How to grep for tab spaces?

Did I mention that I hate bash? Ok this construct should work:

XX=$(echo "\011car\011")
grep -E -v -e "${XX}" < infile > outfile
If it ain't broke, I can fix that.
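
A minimal sketch of the same construct that does not depend on how the shell's `echo` treats backslashes: `printf` expands the octal escape `\011` in its format string itself, so the variable holds real TAB characters under both ksh and bash. The scratch directory and the two sample rows are assumptions modelled on the original question.

```shell
#!/bin/sh
# Scratch directory and sample rows are assumptions, not the poster's real files.
workdir="${TMPDIR:-/tmp}/tabdemo.$$"
mkdir -p "$workdir"
trap 'rm -rf "$workdir"' EXIT

# Two records whose columns are separated by real TAB characters.
printf '12345\tcarlot\temail1\n2245\tcar\temail1\n' > "$workdir/infile"

# printf turns \011 into a TAB regardless of the shell's echo behaviour.
XX=$(printf '\011car\011')

# -v prints only the lines that do NOT contain TAB car TAB.
grep -v -e "$XX" < "$workdir/infile" > "$workdir/outfile"
cat "$workdir/outfile"
```

Only the carlot record should survive, because "car" there is not followed directly by a TAB.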
Sean OB_1
Honored Contributor

Re: How to grep for tab spaces?

hmm, no luck.


#!/bin/ksh

for USER in `cat /tmp/test`
do
echo $USER
grep -E -v -e' $USER ' < /tmp/emails.lst > /tmp/emails.out

cp -f /tmp/emails.out /tmp/emails.lst
done


Sean OB_1
Honored Contributor

Re: How to grep for tab spaces?

The spaces in ' $USER ' are actually tab chars.
James R. Ferguson
Acclaimed Contributor

Re: How to grep for tab spaces?

Hi (again) Sean:

OK, if you have a file with 100 tokens, each of which you want to delete from your first file then:

Assuming that the first file pattern is defined as some digits followed by a tab, followed by a token (specified in the second file) followed by another tab, like:

<digits><tab>TOKEN<tab>

...then:

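# cat ./perl.pl
#!/usr/bin/perl
use strict;
use warnings;
my %line;
my $key;
my $token;
my $file1=shift or die;
my $file2=shift or die;

# Load the data file, keyed by line number so the original order can be restored.
open (FH, "<", $file1) or die "Can't open $file1: $!\n";
while (<FH>) {
    chomp;
    $line{$.}= $_;
}
close(FH);
# For each token, delete every stored line whose second tab-delimited field is exactly that token.
open (FH, "<", $file2) or die "Can't open $file2: $!\n";
while (<FH>) {
    chomp;
    $token = $_;
    foreach $key (keys %line) {
        # \Q...\E keeps regex metacharacters in the token literal.
        if ($line{$key} =~m/^\d+\t\Q$token\E\t/) {
            delete ($line{$key});
        }
    }
}
close(FH);
# Sort numerically so the surviving lines come out in their original order.
foreach $key (sort { $a <=> $b } keys %line) {
    print "$line{$key}\n";
}
1;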

Run this as:

# ./perl.pl datafile tokenfile

Regards!

...JRF...
Sean OB_1
Honored Contributor

Re: How to grep for tab spaces?

hmm, still no joy.

[root@fishgeeks tmp]# cat /usr/local/bin/sean/cleanemails.lst
#!/usr/bin/perl
use strict;
use warnings;
my %line;
my $key;
my $token;
my $file1=shift or die;
my $file2=shift or die;

open (FH, "<", $file1) or die "Can't open $file1: $!\n";
while (<FH>) {
chomp;
$line{$.}= $_;
}
close(FH);
open (FH, "<", $file2) or die "Can't open $file2: $!\n";
while (<FH>) {
chomp;
$token = $_;
foreach $key (keys %line) {
if ($line{$key} =~m/^\d+\t$token\t/) {
delete ($line{$key});
}
}
}
foreach $key (sort keys %line) {
print "$line{$key}\n";
}
1;



[root@fishgeeks tmp]# /usr/local/bin/sean/cleanemails.lst /tmp/emails.lst /tmp/test > /tmp/emails.out
ll em*
cat emails.lst | wc -l
[root@fishgeeks tmp]# ll em*
-rw-r--r-- 1 root root 1004259 Feb 21 20:20 emails.lst
-rw-r--r-- 1 root root 1004259 Feb 21 20:25 emails.out
[root@fishgeeks tmp]# cat emails.lst | wc -l
24594
[root@fishgeeks tmp]# cat emails.out | wc -l
24594




[root@fishgeeks tmp]# head emails.lst
17779 : gemma : 03goodenoughgem@upperavon.wilts.sch.uk
17322 : Luanne : 11402@aol.com
8101 : 123 : 123@aol.com
15075 : Krokodil : 12794201@puknet.puk.ac.za
17655 : jc : 14337983@sun.ac.za
17656 : mariska : 14378574@sun.ac.za
17350 : 1charmed1 : 1charmed1@sympatico.ca
15765 : chester : 1doglover@sbcglobal.net
17654 : ksehler : 1forme@earthlink.net
24700 : cripps72 : 1fullyinvolved@charter.net



[root@fishgeeks tmp]# head test
ccotu
AngelaDickson
bato
pamrobi
capnben
basheeba
Terrymobil
gem
recycling=goddess
tinfoilowner
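
The pattern in the script expects TAB characters between the columns, so it can help to check what the separators in the data file really are. A minimal sketch, assuming a scratch file that stands in for /tmp/emails.lst: `od -c` prints every byte, showing a TAB as `\t` and leaving a plain space as a blank cell.

```shell
#!/bin/sh
# Scratch directory and sample row are assumptions standing in for the real data file.
workdir="${TMPDIR:-/tmp}/sepcheck.$$"
mkdir -p "$workdir"
trap 'rm -rf "$workdir"' EXIT

# One record with real TAB separators.
printf '2245\tcar\temail1\n' > "$workdir/sample"

# Dump the first line byte by byte; TABs appear as \t in the output.
head -1 "$workdir/sample" | od -c
```

Running the same pipeline on the first line of the real file shows whether the fields are split by TABs, by spaces, or by something else such as " : ".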
Rodney Hills
Honored Contributor

Re: How to grep for tab spaces?

Let's try this:

perl -ane 'BEGIN{open(INP,"<100linefile"); @a=<INP>; chomp @a; $flag{$_}=1 foreach @a};print $_ unless $flag{$F[1]}' yourfile

HTH

-- Rod Hills
There be dragons...
Muthukumar_5
Honored Contributor

Re: How to grep for tab spaces?

Simply as,

grep -Evf 100linefile 25kfile > new25kfile

mv new25kfile 25kfile

--
Muthu
Easy to suggest when don't know about the problem!
A. Clay Stephenson
Acclaimed Contributor

Re: How to grep for tab spaces?

Your attempt :

#!/bin/ksh

for USER in `cat /tmp/test`
do
echo $USER
grep -E -v -e' $USER ' < /tmp/emails.lst > /tmp/emails.out

can't possibly work. Why? Because variables within single quotes are not expanded, so ' ${USER} ' is passed to grep exactly as written. The solution is DOUBLE quotes.
If it ain't broke, I can fix that.
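
A minimal sketch of the loop with double quotes, so the variable is expanded before grep sees the pattern. The file names, the sample rows, and the use of `-F` (treat each name as a fixed string rather than a regular expression) are assumptions added for illustration.

```shell
#!/bin/sh
# Scratch directory and sample data are assumptions modelled on the thread.
workdir="${TMPDIR:-/tmp}/quotedemo.$$"
mkdir -p "$workdir"
trap 'rm -rf "$workdir"' EXIT

printf '12345\tcarlot\temail1\n2245\tcar\temail1\n77\tboat\temail2\n' > "$workdir/emails.lst"
printf 'car\nboat\napple\n' > "$workdir/tokens"

name=car
single='$name'   # single quotes: stays the five literal characters $name
double="$name"   # double quotes: expands to car

tab=$(printf '\t')   # a real TAB character

# Remove, one name at a time, the records whose name field is exactly that name.
while IFS= read -r name; do
    grep -F -v -e "${tab}${name}${tab}" "$workdir/emails.lst" > "$workdir/emails.out"
    cp -f "$workdir/emails.out" "$workdir/emails.lst"
done < "$workdir/tokens"

cat "$workdir/emails.lst"
```

With this sample data only the carlot record is left at the end; the car and boat records are removed and apple matches nothing.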
Arturo Galbiati
Esteemed Contributor

Re: How to grep for tab spaces?

Hi,
this runs fine on my server with the Korn shell:

1.
sed 's/^/ /;s/$/ /' 100line>100line.new
This will create a file 100line.new with each user name enclosed in tabs. Please note that you have to replace each space in the above command with a TAB typed on your keyboard.

2.
grep -vf 100line.new 25Kfile>25Kfile.new
this will obtain what you want

HTH,
Art
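
The two steps above can be sketched without typing literal TABs by keeping the TAB in a shell variable. This is only an illustration: the sample rows, the scratch directory, and the `-F` flag (fixed strings instead of regular expressions) are assumptions. The last command shows, for contrast, what happens when the raw 100-line file is used as the pattern list.

```shell
#!/bin/sh
# Scratch directory and sample data are assumptions modelled on the thread.
workdir="${TMPDIR:-/tmp}/patdemo.$$"
mkdir -p "$workdir"
trap 'rm -rf "$workdir"' EXIT

printf '12345\tcarlot\temail1\n2245\tcar\temail1\n99\tboathouse\temail3\n' > "$workdir/25Kfile"
printf 'car\nboat\napple\n' > "$workdir/100line"

tab=$(printf '\t')   # a real TAB character

# Step 1: wrap every name in TABs.
sed "s/^/${tab}/;s/\$/${tab}/" "$workdir/100line" > "$workdir/100line.new"

# Step 2: drop every record that contains one of the TAB-wrapped names.
grep -F -v -f "$workdir/100line.new" "$workdir/25Kfile" > "$workdir/25Kfile.new"
cat "$workdir/25Kfile.new"

# Contrast: the unwrapped names also hit carlot and boathouse, so nothing is left.
grep -F -v -f "$workdir/100line" "$workdir/25Kfile" > "$workdir/plain.out"
```

With the wrapped patterns only the exact "car" record is removed; the plain list removes all three sample records.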
Peter Nikitka
Honored Contributor

Re: How to grep for tab spaces?

Hi Sean,

if you like an awk-solution:


awk -F'\t' -v pat=100lines 'BEGIN {while ((getline line < pat) == 1) m[line]=1; close (pat)}
NF>=3 && !($2 in m)' logfile


mfG Peter
The Universe is a pretty big place, it's bigger than anything anyone has ever dreamed of before. So if it's just us, seems like an awful waste of space, right? Jodie Foster in "Contact"