Operating System - Tru64 Unix

Re: grep command

 
Soontorn Vittayaprachsa
Occasional Advisor

grep command

Hi there,

I need to extract a pattern from a file that is collected daily (stat.txt, 300+ MB in size).

The file contains snapshots like the following:

-- Collecting Time at 00:01:02 ----
Data 1 -----------> 1
Data 2 -----------> 5
Data 3 -----------> 7
Data 4 -----------> 2
--TOTAL-----------> 15 ------------
-- Collecting Time at 00:06:02 ----
Data 1 -----------> 1
Data 2 -----------> 15
Data 3 -----------> 7
Data 4 -----------> 2
--TOTAL-----------> 25 ------------
-- Collecting Time at 00:11:02 ----
Data 1 -----------> 3
Data 2 -----------> 5
Data 3 -----------> 7
Data 4 -----------> 12
--TOTAL-----------> 27 ------------
. . .
-- Collecting Time at 23:59:02 ----
Data 1 -----------> 10
Data 2 -----------> 9
Data 3 -----------> 7
Data 4 -----------> 4
--TOTAL-----------> 30 ------------

I need to extract the pattern for a certain period only, at about 00:0x:xx, together with the last line of each snapshot in that period. For instance:

-- Collecting Time at 00:01:02 ----
--TOTAL-----------> 15 ------------
-- Collecting Time at 00:06:02 ----
--TOTAL-----------> 25 ------------
-- Collecting Time at 00:11:02 ----
--TOTAL-----------> 27 ------------

I have tried using the grep command as follows:

. grep -E 'Collecting Time at 00 | TOTAL' stat.txt |tee 0000.lst
. grep -E "Collecting Time at 00 | TOTAL" stat.txt |tee 0000.lst
. egrep -e 'Collecting Time at 00 | TOTAL' stat.txt |tee 0000.lst
. egrep -e "Collecting Time at 00" -e "TOTAL" stat.txt |tee 0000.lst
. . .

All of them ended up with output like the following, which is not exactly what I need:

--TOTAL-----------> 15 ------------
--TOTAL-----------> 25 ------------
--TOTAL-----------> 27 ------------

Any idea would help.

Soontorn
10 REPLIES
Michael Schulte zur Sur
Honored Contributor

Re: grep command

Eliminate the spaces around the |.
Is there also a space before the TOTAL?
If so, then my advice should solve it.
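
To illustrate the point about spaces, here is a minimal sketch using two lines copied from the sample in the question. The `count_matches` helper is a hypothetical name introduced only for this example. Inside a quoted regular expression a space is a literal character, so each alternative only matches where that space is really present. The real stat.txt may be spaced differently from the posted sample, so the counts on the actual file can differ.

```shell
# Two sample lines in the format shown in the question.
sample='-- Collecting Time at 00:01:02 ----
--TOTAL-----------> 15 ------------'

# count_matches PATTERN: print how many sample lines match the ERE.
# (Hypothetical helper; "|| true" hides grep's exit status 1 on zero matches.)
count_matches() {
    printf '%s\n' "$sample" | grep -cE "$1" || true
}

# Spaces are literal: the alternatives are "Collecting Time at 00 " and " TOTAL".
count_matches 'Collecting Time at 00 | TOTAL'

# No spaces around |: the alternatives are "Collecting Time at 00" and "TOTAL".
count_matches 'Collecting Time at 00|TOTAL'
```

On these two lines the first call prints 0 (the header has a colon after 00, and TOTAL is preceded by dashes), while the second prints 2.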

Michael
Soontorn Vittayaprachsa
Occasional Advisor

Re: grep command

Let me add some more information.

I found that when I run the above grep commands I get the following output:
. . .
--TOTAL-----------> xx ------------
--TOTAL-----------> xx ------------
--TOTAL-----------> xx ------------
-- Collecting Time at 00:01:02 ----
--TOTAL-----------> 15 ------------
-- Collecting Time at 00:06:02 ----
--TOTAL-----------> 25 ------------
-- Collecting Time at 00:11:02 ----
--TOTAL-----------> 27 ------------
. . .
--TOTAL-----------> xx ------------
--TOTAL-----------> xx ------------
--TOTAL-----------> xx ------------
--TOTAL-----------> xx ------------
. . .
--TOTAL-----------> 30 ------------
--TOTAL-----------> xx ------------
--TOTAL-----------> xx ------------
--TOTAL-----------> xx ------------
. . .

Question: how can I use grep to remove the TOTAL xx lines, that is, the TOTAL lines that are not associated with the selected period?

Regards
Soontorn
Michael Schulte zur Sur
Honored Contributor

Re: grep command

If I understand correctly, the unwanted lines are the ones characterized by xx. If so, use
grep -E 'Collecting Time at 00|TOTAL.*[0-9][0-9]' stat.txt

Michael
Soontorn Vittayaprachsa
Occasional Advisor

Re: grep command

Thanks, Michael, for the reply.

Let me add some more detail. I have a script running that collects the statistics at a set interval, anywhere from 1 to 600 seconds, for example.

At every interval the script writes a snapshot to the file stat.txt, and a snapshot looks like this:
-- Collecting Time at 00:01:02 ----
Data 1 -----------> 1
Data 2 -----------> 5
Data 3 -----------> 7
Data 4 -----------> 2
--TOTAL-----------> 15 ------------

There are four data values (1, 5, 7, 2), and the total is 15. I need to remove the details of Data 1, Data 2, Data 3, and Data 4 and keep only the first and last line of each snapshot, hour by hour. For example, I'd like to save the collection time and TOTAL value of the snapshots generated during the midnight hour of the day.

I wonder whether the grep command alone is enough in this case.

Regards,
Soontorn
Michael Schulte zur Sur
Honored Contributor
Solution

Re: grep command

Hi Soontorn,

If you don't want the Data lines, then use
grep -v Data
to filter them out.
But you are slowly getting into territory
where a language like awk is better suited.

Michael
Joris Denayer
Respected Contributor

Re: grep command

Soontorn,

You can't do this with grep alone.
grep is doing exactly what you ask it to do. This means, it shows you all lines containing the string "TOTAL" --or-- the string "Collecting Time at 00"
That's why you see the lines with "TOTAL .." at the end of your output.

The following script is not the shortest, but it is quite simple (ksh is needed).

grep -e "Collecting Time at 00" -e "TOTAL" stat.txt > tempfile.lst
LAST=`grep -n "Collecting Time" tempfile.lst | tail -1 | cut -d ":" -f 1`
let LAST=LAST+1
head -n ${LAST} < tempfile.lst > 0000.lst
rm tempfile.lst


Enjoy

Joris
To err is human, but to really foul things up requires a computer
Phillip Brown
Frequent Advisor

Re: grep command

Try a script like this (no arg checks, but
you get the idea):

clipper V5.1> cat prog

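input_time=$1   # time to select, e.g. 00:01:02
data=$2         # file to read

awk -v time="$input_time" '
{
# Restart the line count at each TOTAL, so the next header line has NR == 1
if($0 ~ /TOTAL/) { NR=0 }

# Header line of a snapshot ($5 is HH:MM:SS) with the requested time
if(NR == 1 && $5 == time) {
print
prt="true"
}

# Closing TOTAL line of the snapshot that was just selected
if(/TOTAL/ && prt=="true") {
print
prt="false"
}
}' "$data"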

Here's the test:

clipper V5.1> prog "00:01:02" data
-- Collecting Time at 00:01:02 ----
--TOTAL-----------> 15 ------------

clipper V5.1> prog "00:06:02" data
-- Collecting Time at 00:06:02 ----
--TOTAL-----------> 25 ------------

clipper V5.1> prog "00:11:02" data
-- Collecting Time at 00:11:02 ----
--TOTAL-----------> 27 ------------
Soontorn Vittayaprachsa
Occasional Advisor

Re: grep command

Thanks Michael, Joris, Phillip,

I get the point: grep alone can't produce the result I need, and other tools are required to put together a script. Thanks for your kind help.

Regards,
Soontorn
Hein van den Heuvel
Honored Contributor

Re: grep command

Simple AWK solution:

$ awk '/at 00:0/{x=1;print}/TOTAL/{if (x){x=0;print}}' < your-file.data

In words: if you see a line with 'at 00:0', set flag x and print the line. If you see a line with 'TOTAL' and flag x is set, clear flag x and print the line. If flag x was not set, meaning there was no recent 00:0x line, just ignore the TOTAL line.
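
The same flag idea can be spelled out in a longer form. The following is only a sketch, not the one-liner above: the function name `extract_hour` and the variable `keep` are hypothetical, and it assumes every snapshot header looks like the sample in the question, with the time in the fifth whitespace-separated field. It selects by the two-digit hour rather than by 'at 00:0', so times such as 00:11:02 are included, and it resets the flag at every header so a non-matching snapshot can never borrow an earlier match.

```shell
# extract_hour HOUR FILE: print each snapshot header whose time starts
# with HOUR (two digits, e.g. 00), plus the TOTAL line closing that snapshot.
extract_hour() {
    awk -v hour="$1" '
        /Collecting Time at/ {
            # $5 is the HH:MM:SS field of the header line
            keep = (substr($5, 1, 2) == hour)
            if (keep) print
            next
        }
        # Print a TOTAL line only if its snapshot header was selected
        /TOTAL/ && keep { print; keep = 0 }
    ' "$2"
}
```

A possible invocation, using the file names from the question, would be `extract_hour 00 stat.txt > 0000.lst`.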


hth,
Hein.
Felipe Cano Chávez
New Member

Re: grep command

grep -v ^Data arch.txt