Operating System - HP-UX

Only print certain lines in file

 
SOLVED
Steven Givens
New Member

I imagine this is one of those questions where I'll hit myself upside the head when I see the answers to my question, but I am tired of fumbling through this.

I have created a file using the following command:
grep jobstat.exe /tmp/pso_lp.out | awk '{print $1, $2, $3, $7, "0:"$9}' | sort -n -k 2,2 -k 5,5 > /tmp/jobstat.txt

Long story short, the file is sorted in the order I need, but I only want to print out the last line of each group of processes in the file. In other words, whenever the value of $2 in the file changes, print the line.

Thanks,

Steve
Darren Prior
Honored Contributor

Re: Only print certain lines in file

Hi Steve,

Here's one solution:

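#!/usr/bin/sh

# Read the 1st line of the file so we have the 1st checkpoint
LASTLINE=$(head -n 1 /tmp/jobstat.txt)
LASTCHG=$(echo "$LASTLINE" | awk '{print $2}')

# Start with an empty output file
: > /tmp/outfile

while read -r LINE
do
    THISCHG=$(echo "$LINE" | awk '{print $2}')
    if [[ "$THISCHG" != "$LASTCHG" ]]
    then
        # The $2 value has changed, so append the previous line to the output file
        echo "$LASTLINE" >> /tmp/outfile
    fi
    LASTLINE=$LINE
    LASTCHG=$THISCHG
done < /tmp/jobstat.txt

# The last line of the file always closes its group, so write it too
echo "$LASTLINE" >> /tmp/outfile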
regards,

Darren.
Calm down. It's only ones and zeros...
Robin Wakefield
Honored Contributor
Solution

Re: Only print certain lines in file

Hi Steve,

Try the following:

awk 'NR==1{v=$2}$2 != v{print l;v=$2}{l=$0}END{print l}' filename

rgds, Robin
John Palmer
Honored Contributor

Re: Only print certain lines in file

Another shell solution, assuming that when $2 changes you want to print the previous line...

B1=""
grep jobstat.exe /tmp/pso_lp.out | awk '{print $1, $2, $3, $7, "0:"$9}' | sort -n -k 2,2 -k 5,5 | {
    while read A B C D E
    do
        if [[ -n ${B1} ]]    # not first time
        then
            if [[ ${B} != ${B1} ]]
            then
                print "${A1} ${B1} ${C1} ${D1} ${E1}"
            fi
        fi
        # remember the current line as the previous one
        A1="${A}"
        B1="${B}"
        C1="${C}"
        D1="${D}"
        E1="${E}"
    done

    # print last line
    print "${A1} ${B1} ${C1} ${D1} ${E1}"
}

Regards,
John
Steven Givens
New Member

Re: Only print certain lines in file

Thanks all. I was spending most of my time trying to figure out how to do it with awk, but I like the idea of combining the commands too.

Robin, please help me understand exactly what the awk script is doing. Specifically, how does awk decide which record in the file gets printed?

As an example, see the following:

+ PROCFILE=/tmp/jobstat.txt
+ awk NR==1 {pid=$2}
$2 != pid {print $0; pid=$2}
END {print $0}
/tmp/jobstat.txt
bomk0b 14689 14636 15:58:07 0:00:05
bomk0b 15008 14976 15:58:36 0:00:01
bomk0b 15322 15267 15:59:02 0:00:07
bomk0b 15658 15625 15:59:41 0:00:00
bomk0b 15676 15658 15:59:41 0:00:00
bomk0b 15988 15941 16:00:24 0:00:05
bomjba 18009 17952 16:06:31 0:00:07
bomk0b 22210 15941 16:18:38 0:00:00

+ print

+ awk NR==1 {pid=$2}
$2 != pid {print line; pid=$2} {line=$0}
END {print line}
/tmp/jobstat.txt
bomk0b 13867 13819 15:56:24 0:23:01
bomk0b 14689 14636 15:58:07 0:21:07
bomk0b 15008 14976 15:58:36 0:20:32
bomk0b 15322 15267 15:59:02 0:20:00
bomk0b 15658 15625 15:59:41 0:19:09
bomk0b 15676 15658 15:59:41 0:00:00
bomk0b 15988 15941 16:00:24 0:18:08
bomjba 18009 17952 16:06:31 0:01:18
bomk0b 22210 15941 16:18:38 0:00:00

I can see that the first version prints the current line while the second prints the previous one, but I don't understand how.

Thanks,

Steve
Rodney Hills
Honored Contributor

Re: Only print certain lines in file

Steve,

Here is an explanation of Robin's script-

Script:
awk 'NR==1{v=$2}$2 != v{print l;v=$2}{l=$0}END{print l}' filename

Variables:
"v" will be used to compare to previous field #2, and "l" will hold previous record.

Explanation:
1) When NR==1 (the first record in the file), "v" is primed with field #2 of that record.

2) When field #2 is not the same as previous (as held in "v") then print "l" (the previous line), and save the new field #2 in "v".

3) In all cases save the current line in "l" (after all the above tests).

4) At the "END" print the last line (in "l").

For each input record, awk runs the first three rules in order; the END rule runs once, after the last record has been read. This technique is common in reports that print "break" totals: you have to save the previous values for comparison.
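The four steps of the explanation can also be written out as a commented, multi-line awk program. This is only a sketch of the same script: the function name last_of_group is made up for illustration, the "if (NR > 0)" guard is an addition (not in Robin's original) that avoids printing an empty line for empty input, and the three sample records are the ones from the small table in this post.

```shell
# last_of_group is a hypothetical wrapper name, not part of the original script.
# It reads the named files, or standard input when no file is given.
last_of_group() {
    awk '
        NR == 1 { v = $2 }               # 1) prime v with field 2 of the first record
        $2 != v { print l; v = $2 }      # 2) key changed: print the previous record
                { l = $0 }               # 3) always save the current record
        END     { if (NR > 0) print l }  # 4) print the last record (guard is an addition)
    ' "$@"
}

# Input must already be sorted so that equal values of field 2 are adjacent.
printf '%s\n' 'abc def hij' 'bbb def hhh' 'ccc klm ddd' | last_of_group
# prints:
#   bbb def hhh
#   ccc klm ddd
```

Rule 3 has no pattern, so it runs for every record, and because it comes after rule 2, "l" still holds the previous record at the moment rule 2 prints it.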

HTH

-- Rod Hills

File          NR   v   l
abc def hij   1
bbb def hhh   2
ccc klm ddd   3
There be dragons...
Robin Wakefield
Honored Contributor

Re: Only print certain lines in file

Hi Steve,

1) NR==1 {pid=$2}
2) $2 != pid {print line; pid=$2}
3) {line=$0}
4) END {print line}

1) initialise the pid variable with the value from the 1st line. If you don't do this, then you'd get an immediate mis-match, and it'd print an empty line because it would think the pid has changed (try taking it out and see what happens)

2) if the current pid doesn't equal the previous pid, print the previous line (captured in the next block). Reset the pid to the current value.

3) store the current line in case you need to print it out next time

4) ALWAYS print the last line, as that will always be a different pid to the previously printed value.
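To see what point 1 guards against, here is a small sketch that runs the same program with the NR==1 rule taken out. The function name without_priming is made up for illustration; the three input records are copied from Steve's listing earlier in the thread.

```shell
# without_priming is a hypothetical name: Robin's program minus rule 1.
without_priming() {
    awk '$2 != pid {print line; pid=$2} {line=$0} END {print line}' "$@"
}

printf '%s\n' \
    'bomk0b 15988 15941 16:00:24 0:18:08' \
    'bomjba 18009 17952 16:06:31 0:01:18' \
    'bomk0b 22210 15941 16:18:38 0:00:00' | without_priming
```

On the first record "pid" is still unset, so the comparison already counts as a change and awk prints "line", which is also still empty. The output therefore starts with a blank line, followed by the three records.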

hope that makes sense,

rgds, Robin.


Steven Givens
New Member

Re: Only print certain lines in file

It does make sense. Thanks again all! :)