
grep issue

Steve Lewis
Honored Contributor

We have a 9 GB text file which has spurious nulls in it at various places.
The file is too big to open in vi on our system.
We would like to print out all the lines which contain nulls. Unfortunately null (ASCII 0) is the string terminator, so awk, sed and grep take everything up to the null as one string and everything after it as the next one.
For example, grep '\0' doesn't work; the manual says that a null string matches every line anyway.

Please test your ideas before posting them.

harry d brown jr
Honored Contributor

Re: grep issue

Steve,

Are you trying to identify the lines that have embedded nulls in them, or trying to remove them?

live free or die
harry
Live Free or Die
Martin Johnson
Honored Contributor

Re: grep issue

How about "od filename | grep 0000"?

See "man od"

Marty
Steve Lewis
Honored Contributor

Re: grep issue

I need to identify which lines in the file have errors, so that the provider of the file can fix the erroneous records and re-send them to me.

These nulls form part of a fixed length record, so removing them or replacing them with something is not going to help.

If only it were a simple case of replacement with tr '\0' ' ' < infile > outfile ...

Steve Lewis
Honored Contributor

Re: grep issue

Martin obviously didn't test his idea, since it prints every line: od puts the address at the start of every output line, and those addresses start with 0000.
harry d brown jr
Honored Contributor

Re: grep issue


Why not have them send you the file again without the nulls?

live free or die
harry
Live Free or Die
Steve Lewis
Honored Contributor

Re: grep issue

We will ask for it again without the nulls, but they are our new customer and I want to be nice and helpful. It's a big file to search.

Actually I have just spotted the C function call gets() reads to a new line, whereas fgets stops at nulls. Maybe I could write a program, unless someone can come up with a neat bit of perl or python or something.
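
For the Python route mentioned above, a minimal sketch might look like the following. It assumes the records are newline-terminated and that reading the file in binary mode is acceptable; the function name lines_with_nul is made up for this illustration and is not from the thread. The file is read one line at a time, so the 9 GB size should not be a memory problem as long as individual lines are of reasonable length.

```python
import sys


def lines_with_nul(path):
    """Yield (line_number, offset) for every line that contains a NUL byte.

    line_number is 1-based; offset is the 0-based position of the first
    NUL within that line. The name of this helper is hypothetical.
    """
    # Binary mode: a NUL is just another byte here, not a string terminator.
    with open(path, "rb") as fh:
        for lineno, line in enumerate(fh, start=1):
            pos = line.find(b"\0")
            if pos != -1:
                yield lineno, pos


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (assumed): python find_nuls.py datafile
    for lineno, pos in lines_with_nul(sys.argv[1]):
        print(f"line {lineno}: first NUL at column {pos + 1}")
```

The line numbers it reports can be passed back to the provider of the file as they are, since nothing in the data is modified.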
Robin Wakefield
Honored Contributor
Solution

Re: grep issue

Steve,

I did this:

# echo 'a\0b' > /tmp/nulls
# echo hello >> /tmp/nulls
# od -c /tmp/nulls
0000000 a \0 b \n h e l l o \n
0000012
# echo '\0' > /tmp/null
# grep -f /tmp/null /tmp/nulls
ab

Rgds, Robin
Rodney Hills
Honored Contributor

Re: grep issue

Here is a short perl program that scans for NULs and displays the number of each record in which one is found.
Set $reclen to the length of your fixed length records.

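open(INP,"<$ARGV[0]") or die "cannot open $ARGV[0]: $!\n"; # input file is taken from the command line (an assumption)
binmode(INP); # read raw bytes
$reclen=8; # length of one fixed-length record
while(read(INP,$datum,$reclen)) {
$nrec++;
if($datum=~/\0/) { print "nul at record# $nrec\n";}
}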
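
The same fixed-length-record idea can be sketched in Python, in case perl is not to hand. This is only an illustration under the same assumption as the perl above (every record is exactly reclen bytes, with any line terminator counted as part of the record); the name records_with_nul is hypothetical.

```python
def records_with_nul(path, reclen):
    """Yield the 1-based number of each fixed-length record containing a NUL.

    reclen is the record length in bytes; a short final record is still
    checked. The helper name is made up for this sketch.
    """
    with open(path, "rb") as fh:  # binary mode so NUL bytes are preserved
        recno = 0
        while True:
            record = fh.read(reclen)
            if not record:  # empty bytes object means end of file
                break
            recno += 1
            if b"\0" in record:
                yield recno
```

Calling it as records_with_nul("datafile", 8) would mirror the $reclen=8 setting in the perl version; "datafile" is a placeholder name.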

Hope this helps...

-- Rod Hills
There be dragons...
Steve Lewis
Honored Contributor

Re: grep issue

Thanks Robin and Rodney, well earned.
I learned something new today.

Steve
Tom Maloy
Respected Contributor

Re: grep issue

This will print the lines with NUL characters in them:

perl -nle 'print if /\0/' < datafile

Tom
Carpe diem!
James R. Ferguson
Acclaimed Contributor

Re: grep issue

/No_Points_Please/

Robin, that was a very elegant solution!

Regards!

...JRF...