sed negative lookahead

Occasional Advisor

Hello all,

 

I have a text file with data similar to the following:

 

<test>testdata-A&B-testdata</test>

<test>testdata-C&amp;D-testdata</test>

<test>testdata-E&F-testdata</test>

<test>testdata-G&H-testdata</test>

<test>testdata-I&amp;J-testdata</test>

 

I need to replace every "&" with "&amp;", but only when the "&" is NOT already followed by "amp;".  Basically, every instance of & in this file should read &amp; when I am done with it.

 

I have been trying to find a solution with sed to do this and stumbled on Negative Lookahead, which sounds promising.  I am not even sure if this is the correct path to take and I am having a hard time with the syntax. 

 

Does anyone have some advice, or has anyone encountered a similar issue?  Any help with this issue is greatly appreciated.

 

Thank you

 

--John

3 REPLIES
Honored Contributor

Re: sed negative lookahead

 
Acclaimed Contributor

Re: sed negative lookahead

Perhaps the simplest way is to replace all "&" by "&amp;" and then replace all "&amp;amp;" by "&amp;":

sed -e 's/&/\&amp;/g' -e 's/&amp;amp;/\&amp;/g'

Occasional Advisor

Re: sed negative lookahead

Thank you both for your solutions.  I tested each and they both work perfectly.

 

Steven -- In my case, there should always be at least four characters after any & because of the closing tag on that line.  So I wouldn't have to worry about a situation where the & doesn't have four characters after it.

 

Dennis -- Your solution seems so obvious, now that you brought it up.

 

Thanks again both of you.

 

--John