BOM character issue

 
AwadheshPandey
Honored Contributor

BOM character issue

Dear Gurus,

I am getting a BOM character () in Unix files. Do you have any idea how to get rid of it?

Regards,

Awadhesh
It's kind of fun to do the impossible
2 REPLIES
Steven Schweda
Honored Contributor

Re: BOM character issue

> [...] BOM character () [...]

I don't know what a BOM character is, and, as
you can see, this forum is not very good at
rendering exotic non-ASCII characters.

> I am getting [...] in unix files.

Getting _how_? Which "unix files"? (What
_are_ "unix files"?)

> Do you have any idea how to get rid off.

Stop putting them in there in the first
place?

man sed
Matti_Kurkela
Honored Contributor

Re: BOM character issue

BOM = Byte-Order Mark (U+FEFF), an optional feature in Unicode text files. It should appear only at the very beginning of a file. The modern version of the Unicode standard says it should not be used in the middle of text.

http://en.wikipedia.org/wiki/Byte_order_mark

In UTF-8, the BOM is represented as a three-byte sequence: 0xEF,0xBB,0xBF.
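
To check whether a given file actually starts with those three bytes, something like the following could be used. This is only a sketch: the function name has_utf8_bom is made up for illustration, and it assumes your od accepts the POSIX options -An and -tx1.

```shell
#!/bin/sh
# has_utf8_bom FILE
# Succeeds (exit status 0) if FILE begins with the bytes EF BB BF.
# The helper name is hypothetical; it is not part of any standard tool.
has_utf8_bom() {
    # Read only the first three bytes and print them as hex digits,
    # then strip the whitespace that od puts between and around them.
    first3=$(dd if="$1" bs=3 count=1 2>/dev/null | od -An -tx1 | tr -d '[:space:]')
    [ "$first3" = "efbbbf" ]
}
```

After sourcing the function, "has_utf8_bom bomtext.txt && echo BOM found" reports whether the file needs filtering.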

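This "bomfilter" script could be used to filter the UTF-8 BOM out of any text piped to it. It uses printf rather than echo to build the BOM, because not every echo interprets octal escapes:

#!/bin/sh
# Build the three-byte UTF-8 BOM (octal 357 273 277).
BOM=$(printf '\357\273\277')
# Delete every occurrence of the BOM from standard input.
sed -e "s/$BOM//g"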

Examples of use:

bomfilter < bomtext.txt | more

grep someword bomtext.txt | bomfilter | more
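
If only a BOM at the very start of the input should be removed, and any later occurrence left untouched, the sed expression can be restricted to the beginning of line 1. A minimal sketch, assuming a POSIX printf and sed; the function name strip_leading_bom is invented for this example:

```shell
#!/bin/sh
# strip_leading_bom
# Copies standard input to standard output, dropping a UTF-8 BOM only
# when it opens the first line. The helper name is hypothetical.
strip_leading_bom() {
    # The three-byte UTF-8 BOM, written as octal escapes.
    bom=$(printf '\357\273\277')
    # "1" limits the substitution to line 1; "^" anchors it to the line start.
    sed -e "1s/^$bom//"
}
```

It is used the same way as the filter script, for example "strip_leading_bom < bomtext.txt > clean.txt" after sourcing the function.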

MK