Operating System - HP-UX

Regular expression loop inside pattern?

 
Fredrik.eriksson
Valued Contributor

Hi,

I've been trying to figure out whether it's possible to loop inside a regular expression with capture buffers.

In theory this should be possible with logical grouping, but I can't seem to get it to work.

The source looks something like this:
Col1 Col2 Col3 Col4

The data in each column is unknown and can't be statically matched.
The end result should reverse the columns from 4 to 1 instead of 1 to 4.

My first thought was something like this:

s/^(?:([^\t]+)+\t*)([^$]+)$/$4\t$3\t$2\t$1/

This, however, gives results only for $1 and $2; $3 and $4 are empty.

Is it completely impossible to create a loop condition that fills multiple capture buffers inside a regexp?

P.S. There is no real problem with writing a larger regexp for this. It just piqued my curiosity :)

Best regards
Fredrik Eriksson
6 REPLIES
Dennis Handly
Acclaimed Contributor

Re: Regular expression loop inside pattern?

If your columns have a regexp delimiter, you can use awk to swap columns.
awk -Fchar '{print $4, $3, $2, $1 }'

(Or are you already using perl? :-)
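
For comparison, the same split-and-reorder idea can be sketched outside awk. This is a minimal Python sketch, not part of the original reply; it assumes the columns are separated by a single literal delimiter (a tab by default, as the `\t` in the question suggests), and the function name `reverse_fields` is made up for illustration. Unlike the awk line, it handles any number of columns.

```python
def reverse_fields(line: str, delimiter: str = "\t") -> str:
    """Split a line on a literal delimiter and rejoin the fields in reverse order."""
    # Drop a trailing newline so it does not stick to the last field.
    fields = line.rstrip("\n").split(delimiter)
    return delimiter.join(reversed(fields))


print(reverse_fields("Col1\tCol2\tCol3\tCol4"))
```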
Fredrik.eriksson
Valued Contributor

Re: Regular expression loop inside pattern?

Well as I said, there's no real issue :)

I'm just curious whether this is feasible with regular expressions :)

In basically any programming or scripting language there's a reverse function (or at least the means to create an array which you then walk backwards).

Google turns up close to nothing on this subject; most of what I find is either multi-line matching examples or code examples involving programmatic loops. I haven't found a single post or topic about looping inside regular expressions and building capture buffers.

Best regards
Fredrik Eriksson
James R. Ferguson
Acclaimed Contributor

Re: Regular expression loop inside pattern?

Hi Fredrik:

Perl has a 'reverse' function for you.

# perl -le 'while (<>) {@a=split;print join " ",reverse @a}'
col1 col2 col3 col4
col4 col3 col2 col1

# perl -le 'print join " ",reverse @ARGV' a b c
c b a

Regards!

...JRF...

James R. Ferguson
Acclaimed Contributor

Re: Regular expression loop inside pattern?

Hi (again) Fredrik:

If you want to use a regex to capture multiple fields and then present the captured variables in reverse order, you can do this:

# perl -nle '(@nums)=(m/(\b\d+\b)/g);print join(" ", reverse @nums)'
one 1 two 2 three 3 four 4 five 5
5 4 3 2 1

In this case, we read lines of input, pluck out only the runs of digits, and output them in reverse order.

The multiple captures are collected in an array and a 'g'lobal match is used.
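
The same "global match into an array" idea exists in other regex libraries. As a rough Python sketch (an illustration, not the Perl from this post): `re.findall` plays the role of `m//g` in list context and returns every non-overlapping match of the group. The function name `reversed_numbers` is hypothetical.

```python
import re


def reversed_numbers(line: str) -> str:
    """Collect every standalone run of digits and return them in reverse order."""
    # With one capture group, findall returns a list of that group's matches,
    # one entry per position where the pattern matched.
    numbers = re.findall(r"(\b\d+\b)", line)
    return " ".join(reversed(numbers))


print(reversed_numbers("one 1 two 2 three 3 four 4 five 5"))  # 5 4 3 2 1
```

The repetition here is driven by the match call, which restarts the pattern after each hit; the pattern itself still contains only one capture group.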

Regards!

...JRF...
Fredrik.eriksson
Valued Contributor

Re: Regular expression loop inside pattern?

JRF, thanks for the reply.

But that's not really what I was looking for.

I'm already well aware the solution can be done in Perl or pretty much any scripting/programming language.

What I was looking for is if it's possible to actually loop inside the regexp pattern to produce multiple captured buffers.

As I've described, my attempts have ended up with the second-to-last column captured in $1 and the last column captured in $2. But the first two just disappear on me, probably because this feature simply does not exist.

I will point it out again: this still isn't a real issue. It just got me interested, and I pretty much just wanted to know whether anyone has ever tried it or knows if it's possible specifically in regular expressions.
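
The disappearing captures can be reproduced in a few lines. Capture buffers are numbered by the opening parentheses written in the pattern, so repeating a group does not create new buffers; each pass overwrites the previous capture and only the last one is kept. Perl works this way, and Python's `re` module follows the same rule, so the sketch below uses Python. The sample line and the names `FOUR_COLUMNS` and `reverse_four` are assumptions for illustration, and the second pattern assumes exactly four tab-separated columns.

```python
import re

line = "Col1\tCol2\tCol3\tCol4"

# One capture group inside a repeated non-capturing group: the pattern has
# exactly one group, and every iteration overwrites what it captured.
looped = re.match(r"^(?:([^\t]+)\t*)+$", line)
print(looped.re.groups)  # 1
print(looped.group(1))   # Col4

# Four explicit groups give four captures that can be emitted in reverse.
FOUR_COLUMNS = re.compile(r"^([^\t]+)\t([^\t]+)\t([^\t]+)\t([^\t]+)$")


def reverse_four(text: str) -> str:
    """Reverse a line of exactly four tab-separated columns; leave others as-is."""
    return FOUR_COLUMNS.sub(lambda m: "\t".join(reversed(m.groups())), text)


print(reverse_four(line))
```

So a single substitution needs one written-out group per column, which is the "larger regexp" mentioned in the first post.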

Best regards
Fredrik Eriksson
James R. Ferguson
Acclaimed Contributor

Re: Regular expression loop inside pattern?

Hi (again) Fredrik:

> I'm already well aware the solution can be done in Perl or pretty much any scripting/programming language. What I was looking for is if it's possible to actually loop inside the regexp pattern to produce multiple captured buffers.

Let me be clear about what I suggested. The way I read your query is the way I answered you in my second post: The global match signaled by the 'g' in the expression 'm/(\b\d+\b)/g' gives a loop which produces multiple captured buffers (here, in an array).

The _exact_ abilities you have to manipulate regular expressions will depend in part on the actual engine available. Syntactically this may differ in various languages or tools. Perl has one of the best (if not the finest) engines available. The latest Perl (5.10.x) even enriches its feature set.

Regards!

...JRF...