- Re: sed syntax, help
03-30-2011 12:51 AM
#Replace last comma(,) in each line with 'and'
sed 's#\(.*\),\([^,]*\)#\1 and\2#'
Tx to all
Best
Dai
03-30-2011 02:34 AM
Normally this syntax is described as s/
This allows you to use whatever character is convenient as a delimiter: if you're matching pathnames, using "/" as a delimiter would require escaping all the non-delimiter slashes with backslashes, making the expression harder to read.
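To make the delimiter point concrete, here is a small sketch. The path values are hypothetical and chosen only for illustration; both commands perform the same substitution, but the second one needs no escaping.

```shell
# Hypothetical path, used only for illustration.
path='/usr/local/bin/tool'

# With "/" as the delimiter, every slash inside the pattern must be escaped.
with_slash=$(printf '%s\n' "$path" | sed 's/\/usr\/local\/bin/\/opt\/bin/')

# With "#" as the delimiter, the same substitution reads naturally.
with_hash=$(printf '%s\n' "$path" | sed 's#/usr/local/bin#/opt/bin#')

printf '%s\n%s\n' "$with_slash" "$with_hash"
```

Both variables should end up holding the same rewritten path.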
Sed is based on regular expressions, or "regexps" for short. The regexp syntax is used with other tools too, like grep, awk, perl and many others. It does take a bit of effort to learn: regexps are sometimes half-jokingly called a "write-only language", i.e. reading a complicated regexp can be harder than designing and writing it.
> sed 's#\(.*\),\([^,]*\)#\1 and\2#'
Let's split this up a little. The first characters are "s#", so this is a search-and-replace expression, using # as a delimiter.
- search for '\(.*\),\([^,]*\)'
- replace with '\1 and\2'
- no options.
In the search expression, \( and \) are not part of the string to be searched. They define sub-expressions for later reference. In this case, they are referred to in the replacement expression.
So, in plain language, the search expression means:
- accept anything up to a comma, and remember that part as sub-expression 1.
- after the comma, take anything that does not include a comma, and remember that part as sub-expression 2.
Since the search expression neither begins with ^ nor ends with $, it isn't "anchored" to either the beginning or the end of the line. But there is a "maximal munch rule": unless a limit is specified, a regular expression tries to match as much data as possible.
So, if there are two commas on the line, everything on the line up to the _last_ comma (not including the comma itself) will be assigned to sub-expression 1, and whatever is after the last comma to sub-expression 2.
In the replacement part, "\1" and "\2" mean "insert whatever was assigned to the corresponding sub-expression".
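One way to see what each sub-expression captured is to print \1 and \2 inside brackets instead of joining them with "and". This is only a sketch; the sample line is made up for the demonstration.

```shell
# Sample input, invented for this demonstration.
line='red, green, blue'

# Show the two captured pieces in brackets instead of rejoining them.
groups=$(printf '%s\n' "$line" | sed 's#\(.*\),\([^,]*\)#[\1][\2]#')

# The original expression: replace the last comma with " and".
result=$(printf '%s\n' "$line" | sed 's#\(.*\),\([^,]*\)#\1 and\2#')

printf '%s\n%s\n' "$groups" "$result"
```

The first line of output should show that sub-expression 1 holds everything before the last comma, and that sub-expression 2 keeps the leading space after it, which is why the replacement is written as ' and' with no trailing space.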
Clear as mud?
MK
03-30-2011 05:13 AM
Matti's explanation is simply excellent.
What Matti describes as the "maximal munch rule" is generally spoken of as "greediness". It is worth noting that in languages like Perl, a regular expression can also be made "lazy"; that is, to match the least data possible.
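The basic regular expressions used by sed have no lazy quantifier, so the usual workaround is a bracket expression that cannot cross the boundary character. A minimal sketch, assuming we want to replace the first comma rather than the last (the sample line is invented):

```shell
# Sample input, invented for this demonstration.
line='red, green, blue'

# Greedy: .* runs as far as it can, so the comma matched is the last one.
last=$(printf '%s\n' "$line" | sed 's#\(.*\),#\1 and#')

# Bounded: [^,]* cannot cross a comma, so the match stops at the first one.
first=$(printf '%s\n' "$line" | sed 's#^\([^,]*\),#\1 and#')

printf '%s\n%s\n' "$last" "$first"
```

The second command behaves the way a lazy match would, without needing any Perl-specific syntax.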
A very short but important document about regular expressions can be found in the 'regexp(5)' manpage.
Regards!
...JRF...
03-30-2011 08:21 AM
Tx so much all.
Best
N
03-30-2011 09:45 AM
Yes, I get the idea, but it really takes some time to digest, especially p2.
p1..............p2..............p3
sed 's# \(.*\),\([^,]*\) # \1 and\2#'
Tx again
03-30-2011 11:35 AM
The notation '[^,]' says to match any character except a comma. This is called a non-matching list.
That said, however, given:
# X="line, line, line, another line"
# echo ${X}|sed 's#\(.*\),\([^,]*\)#\1 and\2#'
line, line, line and another line
.. is also produced by:
# echo ${X}|sed 's#\(.*\),\(.*\)#\1 and\2#'
line, line, line and another line
In either case, the regex engine bumps along *greedily*, capturing characters, until it has to give back a comma it "gobbled" so that the comma can serve as the second (albeit uncaptured) piece, followed by a third piece of zero or more characters.
A better example of greediness and the reason for using a non-matching list is this:
# Y='There is "yin" and "yang" in things'
Now, suppose all we wanted was to print "yin". Compare these:
# echo $Y|perl -nle 'm/(".*")/ and print $1'
"yin" and "yang"
...which isn't what we wanted.
# echo $Y|perl -nle 'm/("[^"]*")/ and print $1'
"yin"
...which is the desired, matched output.
Perl works the same way as 'sed', though I used Perl for its less cluttered syntax: there is no need to escape '(' and ')' when grouping and capturing. In my example, I am asking Perl to read STDIN from a pipe and, if it can match something bounded by double quotes, capture and print it.
The difference in the two examples underscores the greediness of the regular expression engine.
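For comparison, the same experiment can be sketched with 'sed' itself. This is one possible translation, not the only one: because 'sed' substitutes rather than prints a capture, the pattern also has to consume the text before the first quote and after the captured piece.

```shell
Y='There is "yin" and "yang" in things'

# Greedy: ".*" stretches from the first double quote to the last one.
greedy=$(printf '%s\n' "$Y" | sed 's#[^"]*\(".*"\).*#\1#')

# Non-matching list: "[^"]*" cannot cross a double quote, so it stops at "yin".
bounded=$(printf '%s\n' "$Y" | sed 's#[^"]*\("[^"]*"\).*#\1#')

printf '%s\n%s\n' "$greedy" "$bounded"
```

The two results should mirror the two Perl outputs above, in the same order.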
Regards!
...JRF...
04-01-2011 03:06 AM
Be aware that regexp(5) describes three types of regular expression and pattern matching notation:
1) Basic Regular Expressions, used by sed, vi, ex, grep
2) Extended Regular Expressions, used by awk and egrep
3) Pattern Matching Notation, used by shells and find
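A rough sketch of how the three notations differ in practice, using grep for the basic form, 'grep -E' (the POSIX spelling of egrep) for the extended form, and a shell 'case' statement for pattern matching. The test string is borrowed from the earlier example in this thread.

```shell
X='line, line, line, another line'

# 1) Basic RE: grouping and interval braces need backslashes.
bre=$(printf '%s\n' "$X" | grep -c '\(line, \)\{3\}')

# 2) Extended RE: the same pattern without the backslashes.
ere=$(printf '%s\n' "$X" | grep -E -c '(line, ){3}')

# 3) Shell pattern matching: "*" means "any string", not "repeat the previous item".
case $X in
  *,*) glob='has a comma' ;;
  *)   glob='no comma' ;;
esac

printf '%s %s %s\n' "$bre" "$ere" "$glob"
```

Both grep commands count one matching line; the 'case' pattern only looks similar and follows different rules.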