Hello all,
I have a text file with data similar to the following:
<test>testdata-A&B-testdata</test>
<test>testdata-C&D-testdata</test>
<test>testdata-E&F-testdata</test>
<test>testdata-G&H-testdata</test>
<test>testdata-I&J-testdata</test>
I need to replace all instances of "&" with "&amp;". The tricky part is that I need to replace "&" only when it is NOT already followed by "amp;". Basically, I need every instance of & to read &amp; in this file when I am done with it.
I have been trying to find a solution with sed to do this and stumbled on Negative Lookahead, which sounds promising. I am not even sure if this is the correct path to take and I am having a hard time with the syntax.
Has anyone encountered a similar issue, or does anyone have some advice? Any help with this is greatly appreciated.
Thank you
--John
Perhaps the simplest way is to replace all "&amp;" by "&" and then replace all "&" by "&amp;":
sed -e 's/&amp;/\&/g' -e 's/&/\&amp;/g'
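For anyone who wants to try this on mixed input, here is a minimal sketch of the two-pass idea: first turn any existing "&amp;" back into a bare "&", then escape every "&". The function name escape_amp and the second sample line (one that already contains "&amp;") are made up for illustration; they are not from John's file.

```shell
# escape_amp is a hypothetical helper name; it reads stdin and writes stdout.
escape_amp() {
  # Pass 1: collapse existing "&amp;" to "&" so nothing gets escaped twice.
  # Pass 2: escape every "&". In a sed replacement, "\&" is a literal
  # ampersand; a bare "&" would mean "the text that was matched".
  sed -e 's/&amp;/\&/g' -e 's/&/\&amp;/g'
}

# One line with a bare "&" and one that is already escaped (assumed sample).
printf '%s\n' \
  '<test>testdata-A&B-testdata</test>' \
  '<test>testdata-C&amp;D-testdata</test>' | escape_amp
```

Both lines should come out with a single "&amp;". One caveat: this treats every "&" the same, so if the file also held other entities such as "&lt;", they would become "&amp;lt;". The sample data in this thread has none, so that may not matter here.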
Thank you both for your solutions. I tested each and they both work perfectly.
Steven -- In my case, there should always be at least four characters after any & because of the closing tag on that line, so I wouldn't have to worry about a situation where the & doesn't have four characters after it.
Dennis -- Your solution seems so obvious, now that you brought it up.
Thanks again both of you.
--John