09-03-2010 03:45 AM
sort file containing null characters
Our application team has a file which contains some null characters. After the file is sorted, the size of output file is less than the original size.
# sort -k1,1 -o output.data input.data
# ll
-rwxr-xr-x 1 vfeng techserv 18281639 Sep 2 08:32 input.data
-rw-r--r-- 1 vfeng techserv 16736272 Sep 2 08:42 output.data
I tried this on both our 11i v1 and 11i v2 systems.
I also tried this on Solaris, where the sort works correctly.
For now, my workaround is to remove the null characters with sed. A couple of years ago, somebody reported the same issue on AIX. Is this a known bug on HP-UX too?
Victor
09-03-2010 04:23 AM
Re: sort file containing null characters
No need to use sort prior to the workaround.
09-03-2010 04:26 AM
Re: sort file containing null characters
# sed '/^$/d' input.data > output.data
09-03-2010 05:44 AM
Re: sort file containing null characters
> Our application team has a file which contains some null characters. After the file is sorted, the size of output file is less than the original size.
A snippet of the first few lines of the input and the output files might be informative. Use something like 'xd file' so we can see things.
> For now, my workaround is to remove the null characters with sed.
Then what are you trying to do? You said that "...after the file is sorted, the size of the output file is less than the original..." Eliminating nulls before the sort would also reduce the file's size.
By the way, constructing a small file with embedded nulls and sorting it doesn't lead to any size change for me (as I would expect).
# cat -etv /tmp/sortme
ab1^@^@^@def 111$
ab2^@^@^@def 222$
ab3^@^@^@def 333$
For example, using a reverse sort for emphasis:
# sort -rk1,1 /tmp/sortme|cat -etv
ab3^@^@^@def 333$
ab2^@^@^@def 222$
ab1^@^@^@def 111$
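For anyone wanting to reproduce this test, here is a minimal sketch that builds a similar file and compares byte counts around the sort. The scratch path, the printf octal escapes used to create the NULs, and od -c (in place of cat -etv) are assumptions of this sketch, not what was actually typed above.

```shell
# Build a three-line test file with embedded NUL bytes (\000 is octal for NUL).
# The scratch path below is hypothetical.
f=/tmp/sortme.$$
printf 'ab1\000\000\000def 111\nab2\000\000\000def 222\nab3\000\000\000def 333\n' > "$f"

# Byte counts before and after a reverse sort on the first key.
before=$(wc -c < "$f" | tr -d ' ')
after=$(sort -rk1,1 "$f" | wc -c | tr -d ' ')
echo "before=$before after=$after"

# Dump the sorted result so that NULs show up as \0.
sort -rk1,1 "$f" | od -c
rm -f "$f"
```

If the two counts differ, the sort on that system is dropping bytes from records that contain NULs.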
Regards!
...JRF...
09-03-2010 10:01 AM
Re: sort file containing null characters
This is not a text file. sort(1) has a WARNING:
For non-text input files, the behaviour is undefined.
>JRF: Eliminating nulls before the sort would also reduce the file's size.
Undefined could mean that any chars in the record after the NUL could be lost.
But your example doesn't show that.
09-03-2010 11:18 AM
Re: sort file containing null characters
Here is how I noticed the nulls. When I open the file with the vi editor, I see the following message:
"vopx-extract-rn-am.data" 8930 lines, 18277789 characters (3850 nulls)
18277789 + 3850 = 18281639
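The same arithmetic can be checked without opening the file in vi. This is a rough sketch that assumes the local tr passes NUL bytes through (POSIX requires it, but it is worth verifying on older systems); count_nuls and the demo file are hypothetical names made up for illustration.

```shell
# count_nuls FILE: print the number of NUL bytes in FILE.
# tr -cd '\000' deletes every byte except NUL; wc -c counts what is left.
count_nuls() {
    tr -cd '\000' < "$1" | wc -c | tr -d ' '
}

# Demo on a small scratch file holding 4 NULs among 11 bytes.
demo=/tmp/nuldemo.$$
printf 'a\000\000b\nc\000d\000e\n' > "$demo"
total=$(wc -c < "$demo" | tr -d ' ')
nuls=$(count_nuls "$demo")
echo "total=$total nuls=$nuls non-nul=$((total - nuls))"
rm -f "$demo"
```

On the real file, the total minus the NUL count should match the size of the file after the NULs are stripped.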
I can just type w! to save the file, and the nulls will be removed.
-rwx------ 1 vfeng techserv 18277789 Sep 2 09:34 in.txt
Or I can use sed to redirect the input to an output file, and the nulls will be removed too, e.g.
sed 's///g' in.txt > out.txt
sed 's/SOMETHING-NOT-IN-THE-FILE//g' in.txt > out.txt
sed '/^$/d' in.txt > out.txt
# ll
-rwx------ 1 vfeng techserv 18281639 Sep 2 09:34 in.txt
-rw-r----- 1 vfeng techserv 18277789 Sep 3 14:57 out.txt
Then sort will work well on out.txt.
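Since the sed commands strip the NULs only as a side effect, an explicit alternative may be easier to reason about. The sketch below swaps in tr -d for sed, which is not what was used above, and the file names are hypothetical scratch paths.

```shell
# Scratch input with embedded NULs (hypothetical file names).
src=/tmp/nulsrc.$$
dst=/tmp/nuldst.$$
printf 'b\000x 2\na\000y 1\n' > "$src"

# Delete NUL bytes explicitly, then sort on the first field.
tr -d '\000' < "$src" | sort -k1,1 -o "$dst"

cat "$dst"
result=$(cat "$dst")
rm -f "$src" "$dst"
```

This keeps the stripping step visible in the pipeline instead of depending on how a particular sed treats NULs.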
Here are a few lines from the file:
AZ010 90001AMEND - POLICY CHANGE 999N KAT
AZ010 90002AMEND - POLICY CHANGE 999N KAT
Victor
09-03-2010 01:01 PM
Re: sort file containing null characters
I too can observe that 'vi' and the 'sed' substitution as you used it will eliminate the nulls. In my hands, either on an 11.11 or an 11.31 machine, the 'sort' *fails* to cause the loss of characters.
While I can accept 'vi' eliminating the null characters (because it warns you that they are present), I do not agree with 'sed's behavior when one does:
# sed -e '/^$/d'
This should eliminate lines consisting only of a newline --- i.e. an "empty" line, in my opinion. I observe the same behavior you do.
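A quick way to check how a given sed treats NULs is to push a known line through it and count the bytes that come out. This is only a probe under the assumption that printf emits a NUL for \000; the variable name is made up for illustration.

```shell
# One 4-byte line: 'a', NUL, 'b', newline. It is not empty, so /^$/d
# should leave it alone. A sed that passes NULs through prints 4 bytes;
# one that drops them prints fewer.
probe=$(printf 'a\000b\n' | sed '/^$/d' | wc -c | tr -d ' ')
echo "bytes after sed: $probe"
```

Anything under 4 means the sed in use is altering lines that the script never asked it to touch.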
> Here are a few lines from the file:
This isn't helpful. If you used 'cat -etv' or 'xd' to list the file(s) we could see where null characters occur. This is why I used it in my examples.
Regards!
...JRF...
09-04-2010 10:35 AM
Re: sort file containing null characters
From your numbers, the sorted file is a lot smaller than the NULs alone would explain: about 1.5 M bytes missing vs. 3.8 K NULs.
>-rwxr-xr-x 18281639 Sep 2 08:32 input.data
(It isn't a good idea to have data files be executable.)
>my workaround is to remove the null characters with sed.
Can you compare the sorted files you get by using sort directly and then sort on the file where you removed the NULs? Also use wc(1) on each.
That might indicate whether records are missing, or just parts of lines.
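A rough sketch of that comparison, assuming tr -d is used to strip the NULs (the thread used sed) and with compare_sorts and the scratch paths being hypothetical names:

```shell
# compare_sorts FILE: sort FILE directly and again after stripping NULs,
# then print line and byte counts for both results.
compare_sorts() {
    direct=/tmp/direct.$$        # hypothetical scratch files
    stripped=/tmp/stripped.$$
    sort -k1,1 "$1" > "$direct"
    tr -d '\000' < "$1" | sort -k1,1 > "$stripped"
    # wc prints lines, then bytes, for each file, plus a total row.
    wc -lc "$direct" "$stripped"
    rm -f "$direct" "$stripped"
}

# Demo on a two-line sample with one NUL per line.
sample=/tmp/sample.$$
printf 'b\000x 2\na\000y 1\n' > "$sample"
report=$(compare_sorts "$sample")
echo "$report"
rm -f "$sample"
```

Equal line counts with a byte difference larger than the number of NULs would point to truncated lines; differing line counts would point to whole records going missing.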