SwissKnife
Frequent Advisor

Difficult string extraction

Hi,

 

Here is the format of my filename:

aaaaaaaaaaaaaaaaaaaaaaa.bbb.cccccccccccccc.gz

 

How can I extract the first 8 characters of the section just before .gz?

 

Example:

SIEBER00_ora_38928476_1.aud.20170208163224.gz

=> 20170208

 

Any ideas?

 

kind regards

Den.

 

 

6 REPLIES
Patrick Wallek
Honored Contributor

Re: Difficult string extraction

If everything is the same format, then something like this may work:

 

# export VAR1=SIEBER00_ora_38928476_1.aud.20170208163224.gz

# echo $VAR1
SIEBER00_ora_38928476_1.aud.20170208163224.gz

# echo $VAR1 | awk -F . '{print $3}' | cut -c 1-8
20170208
SwissKnife
Frequent Advisor

Re: Difficult string extraction

Hi, thank you for your answer,

 

I should have given more details. I can't check right now, but the filename might contain more dots.

I forgot to mention this; of course, your solution works as long as the format stays the same.

Is there a way to treat the .gz as a marker and take the string just before it, or is that too complicated?

 

 

Kind regards,

Den

 

PWallek
New Member
Solution

Re: Difficult string extraction

Try this. It prints the second-to-last field (the one before the .gz) and cuts out columns 1-8:

# echo $VAR1 | awk -F . '{print $(NF-1)}' | cut -c 1-8
20170208
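
For comparison, the same "strip .gz, then take what follows the last dot" idea can probably be done with POSIX shell parameter expansion alone, leaving only cut as an external command. This is a minimal sketch, not part of the original answer; the function name extract_stamp is made up for illustration, and it assumes the name always ends in .gz.

```shell
# Hypothetical helper (name invented for this sketch); $1 is the file name.
extract_stamp() {
    name=$1
    base=${name%.gz}     # drop the trailing ".gz"
    field=${base##*.}    # keep only what follows the last remaining dot
    printf '%s\n' "$field" | cut -c 1-8
}

extract_stamp 'SIEBER00_ora_38928476_1.aud.20170208163224.gz'   # prints 20170208
```

Because only the last dot before .gz matters, extra dots earlier in the name should not change the result.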

 

 

Steven Schweda
Honored Contributor

Re: Difficult string extraction

   Or, if "sed" is your only friend:

pro3$ echo 'aaaa.aaa.bbb.c1c2c3c4c5c6c7.gz' | \
 sed -e 's/^.*\.\([^.]*\)\.gz$/\1/' -e 's/\(........\).*/\1/'
c1c2c3c4

   The first expression looks for any characters at the beginning {^.*},
a dot {\.}, any non-dot characters {[^.]*}, and ".gz" at the end
{\.gz$}, and keeps the non-dot characters between those dots (the last
dot before ".gz", and the dot in ".gz").  The second expression keeps
the first eight characters from that result.
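
   If one expression is preferred, the two steps can likely be folded together with a POSIX interval expression {\{1,8\}}, which captures at most eight non-dot characters and lets the rest of the field fall outside the group. This is a sketch under that assumption, not the command posted above; extract_sed is a hypothetical wrapper name.

```shell
# Hypothetical wrapper (name invented for this sketch); $1 is the file name.
extract_sed() {
    # Capture up to 8 non-dot characters after the last dot before ".gz",
    # then discard the remainder of that field and the ".gz" itself.
    printf '%s\n' "$1" |
        sed -e 's/^.*\.\([^.]\{1,8\}\)[^.]*\.gz$/\1/'
}

extract_sed 'aaaa.aaa.bbb.c1c2c3c4c5c6c7.gz'   # prints c1c2c3c4
```

   A name that does not match the pattern (no dot before ".gz", for example) is printed unchanged, so the caller may want to check for that.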

SwissKnife
Frequent Advisor

Re: Difficult string extraction

Hi,

perfect, thank you.

Kind regards, Den.

Dennis Handly
Acclaimed Contributor

Re: Difficult string extraction

You can of course program in awk:

echo "SIEBER00_ora_38928476_1.aud.20170208163224.gz" | awk '{print substr($0, index($0, ".gz") - 8, 8)}'
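
Note that, as written, the index-based one-liner takes the eight characters immediately before ".gz", which for the example name is 08163224 (the end of the timestamp) rather than 20170208. If the leading eight characters of that section are wanted, one way to stay entirely in awk is to split on dots and take the start of the next-to-last piece. This is a sketch, not the poster's code; extract_awk is a hypothetical name.

```shell
# Hypothetical wrapper (name invented for this sketch); $1 is the file name.
extract_awk() {
    printf '%s\n' "$1" | awk '{
        n = split($0, part, ".")          # part[n] is "gz", part[n-1] the section before it
        print substr(part[n - 1], 1, 8)   # first 8 characters of that section
    }'
}

extract_awk 'SIEBER00_ora_38928476_1.aud.20170208163224.gz'   # prints 20170208
```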