Operating System - HP-UX

ksh: parse a file with different field separator

 
support_billa
Valued Contributor


hello,

I have files (for example, the Data Protector file "cell_info") whose lines contain a varying number of fields and, I think, different field separators.

example :

-host "server_dp_06_20" -os "hp ia64 hp-ux-11.31" -core A.06.20 -integ A.06.20 -cs A.06.20 -da A.06.20 -ma A.06.20 -cc A.06.20 -javagui A.06.20 -oracle8 A.06.20 -docs A.06.20
-host "server_dp_09_20" -os "gpl x86_64 linux-2.6.16.60-0.103.1-smp" -core A.09.00 -integ A.09.00 -da A.09.00 -ma A.09.00 -cc A.09.00 -oracle8 A.09.00 -autodr A.09.00 -ts_core A.09.00 -corepatch A.09.06 -integpatch A.09.06 -dapatch A.09.06 -mapatch A.09.06 -ccpatch A.09.06 -oracle8patch A.09.06 -autodrpatch A.09.06 -ts_corepatch A.09.06


I want to get the values (in double quotes) after -host and -os. In the first line, that is "server_dp_06_20" after -host and "hp ia64 hp-ux-11.31" after -os. I tried while IFS="-" read value1 value1 etc., and with awk I have a problem with the varying fields. Any ideas?

regards

 

 

Steven Schweda
Honored Contributor

Re: ksh: parse a file with different field separator

> I want to get the value ( in double quotes ) after -host [...]

mba$ line='-host "server_dp_06_20" -os "hp ia64 hp-ux-11.31" -core A.06.20'

mba$ echo "$line" | sed -e 's/-host "\([^"]*\)".*/\1/'
server_dp_06_20

   Or, if you want to include the quotation marks in the result:

mba$ echo "$line" | sed -e 's/-host \("[^"]*"\).*/\1/'
"server_dp_06_20"

   If you can't do it with "sed", then it's not worth doing, I always
say.  This example was done on a Mac, but at my level of expertise,
"sed" is "sed".
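
The same sed idea can be wrapped so that it works for -os as well as -host. This is only a sketch: the function name get_quoted_opt is made up for illustration, and it assumes each option appears at most once per line and that its value is always enclosed in double quotes.

```shell
# Hypothetical helper (not from the thread): print the double-quoted value
# that follows -NAME on each input line; lines without the option print nothing.
get_quoted_opt() {
    # $1 = option name without the leading dash, e.g. host or os
    sed -n 's/^.*-'"$1"' "\([^"]*\)".*$/\1/p'
}

line='-host "server_dp_06_20" -os "hp ia64 hp-ux-11.31" -core A.06.20'
echo "$line" | get_quoted_opt host    # server_dp_06_20
echo "$line" | get_quoted_opt os      # hp ia64 hp-ux-11.31
```

Because the pattern starts with ".*", the option may sit anywhere on the line, so the order of -host and -os does not matter.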

Bill Hassell
Honored Contributor

Re: ksh: parse a file with different field separator

My choice is to use awk to grab the fields you need using the -F \" option as the separator.
Here is the result with awk:

TXT='-host "server_dp_06_20" -os "hp ia64 hp-ux-11.31" -core A.06.20 -integ A.06...
-host "server_dp_09_20" -os "gpl x86_64 linux-2.6.16.60-0.103.1-smp" -core A.09....'

echo "$TXT" | awk -F \"  '{print $2,$4}'
server_dp_06_20 hp ia64 hp-ux-11.31
server_dp_09_20 gpl x86_64 linux-2.6.16.60-0.103.1-smp


Since the data fields may contain spaces, you can run the awk command for each variable:

TXT='-host "server_dp_06_20" -os "hp ia64 hp-ux-11.31" -core A.06.20 -integ A.06...'

SERVERNAME="$(echo "$TXT" | awk -F \" '{print $2}')"
OS="$(echo "$TXT" | awk -F \" '{print $4}')"
echo "Server: $SERVERNAME, OS: $OS"
Server: server_dp_06_20, OS: hp ia64 hp-ux-11.31


If this script is being used to read a long file, just use read to assign the TXT variable:

while IFS= read -r TXT
do
   SERVERNAME="$(echo "$TXT" | awk -F \" '{print $2}')"
   OS="$(echo "$TXT" | awk -F \" '{print $4}')"
   echo "Server: $SERVERNAME, OS: $OS"
done < "$SOMEFILE"
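
Since the original attempt used IFS with read, the same split on the double quote can also be done by the shell itself, without starting awk for every line. A minimal sketch, assuming -host is the first quoted value and -os the second on every line; the function name parse_cell_info and the variables skip1, skip2 and rest are invented for this example.

```shell
# Hypothetical function (not from the thread): split each line on the
# double-quote character. The 2nd piece is the -host value and the 4th
# piece is the -os value, exactly like $2 and $4 with awk -F \".
parse_cell_info() {
    # $1 = file to read
    while IFS='"' read -r skip1 host skip2 os rest
    do
        printf 'Server: %s, OS: %s\n' "$host" "$os"
    done < "$1"
}

# Usage: parse_cell_info "$SOMEFILE"
```

Like $2,$4 in awk, this depends on the position of the quoted values, not on the option names in front of them.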

 



Bill Hassell, sysadmin
Steven Schweda
Honored Contributor

Re: ksh: parse a file with different field separator

> My choice is to use awk to grab the fields you need [...]

> [...] $2,$4 [...]

   Ok, if you _know_ in which order the fields appear.

> I want to get the value ( in double quotes ) after -host [...]

   Knowing nothing about the source of these data, if someone tells me
"after -host", then I tend to look for "-host", and take what comes
after it.  Someone who knows more than I may assume things which I
won't.
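
One way to combine both ideas is to split on the double quote with awk, but pick each value by the option name that precedes it instead of by position. This is a sketch under assumptions: the function name host_and_os is made up, values are always double-quoted, and a value never contains a double quote itself.

```shell
# Hypothetical function (not from the thread): with -F'"', odd-numbered
# fields are the text outside quotes and even-numbered fields are the
# quoted values. A value belongs to -host (or -os) when the text just
# before it ends in that option name.
host_and_os() {
    awk -F'"' '{
        host = ""; os = ""
        for (i = 1; i < NF; i += 2) {
            if ($i ~ /(^| )-host $/)    host = $(i + 1)
            else if ($i ~ /(^| )-os $/) os   = $(i + 1)
        }
        print "host=" host "; os=" os
    }' "$@"
}

# -os deliberately placed before -host:
printf '%s\n' '-core A.06.20 -os "hp ia64 hp-ux-11.31" -host "server_dp_06_20"' | host_and_os
# host=server_dp_06_20; os=hp ia64 hp-ux-11.31
```

File names can be passed as arguments (host_and_os input-file); with no arguments it reads standard input.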

ranganath ramachandra
Esteemed Contributor

Re: ksh: parse a file with different field separator

The man page of the HP-UX version of awk that I tried this on does not document support for gensub or any other method of using a captured group as replacement text. So something like this may work:

awk 'match($0, "-host \"[^\"]*\" -os \"[^\"]*\"") { $0 = substr($0, RSTART, RLENGTH); sub("-host ", ""); sub("-os ", ""); print }'
 
--
ranga
[i work for hpe]


Dennis Handly
Acclaimed Contributor

Re: ksh: parse a file with different field separator

You can write a program in awk to look at each field.

awk '
{
   host = ""
   os = ""
   for (i = 1; i < NF; ++i) {
      if ($i == "-os") {
         os = $(i+1)
         ++i
         # find next quote
         for (; i < NF; ) {
            j = index(substr(os, 2), "\"")
            if (j > 0) break
            os = os " " $(i+1)
            ++i
         }
      } else if ($i == "-host") {
         host = $(i+1)
         ++i
      }
   }
   print "host=" host "; os=" os
}'  input-file