David Harris's Technology Blog

ColdFusion, Flex, and other stuff...   (and 357,345 hours, 25 mins in to my plan for global domination)


To all you REGEX gurus out there

I have a block of text I am working with. Here is an example of it:

the start of the text here
<a href="Some text here" target="_blank" >Some text here</a>
some more text here
<a href="http://someurl">here is some stuff</a>
and some more text

I want to run a regex to get the "Some text here" from between the "a" tags.

When I put the regex "(<a>)(.*)(</a>)" into The Regex Coach, it happily finds the "Some text here" a tag.

When I run this code:

<cfsavecontent variable="thisText">
the start of the text here
<a href="Some text here" target="_blank" >Some text here</a>
some more text here
<a href="http://someurl">here is some stuff</a>
and some more text
</cfsavecontent>

<cfset aBlock = reFindNoCase('(<a href=")(.*)(" target="_blank" >)(.*)(</a>)',thisText,1,1)>

<cfset aPortion = mid(thisText, aBlock.pos[3], aBlock.len[3])>

<cfdump var="#aPortion#">

I get everything from "Some text here..." through to the closing a tag at the end of "...here is some stuff".

Can any of you regex Gurus spot my (obvious to you) mistake?

Update:

Been meaning to do this for a while... as per the comments from Steve the correct answer is:
(<a href=")(.*?)(" target="_blank" >)(.*?)(</a>)

Note the "?" after each ".*". As Steve put it: "The "?" qualifies the statement as non-greedy".

Thank you Steve!
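
For anyone comparing the two patterns, here is a minimal sketch that runs the greedy and the non-greedy version against the same sample text. Some details are assumptions that are not in the original code: the variable names (greedyPattern, lazyPattern and so on) are made up for illustration, the patterns are wrapped in single quotes so the double quotes inside them need no escaping, and it reads index 5 of the pos and len arrays (the link text between the tags), whereas the code above reads index 3 (the href value).

```cfml
<cfsavecontent variable="thisText">
the start of the text here
<a href="Some text here" target="_blank" >Some text here</a>
some more text here
<a href="http://someurl">here is some stuff</a>
and some more text
</cfsavecontent>

<!--- Single quotes around each pattern, so the double quotes inside need no escaping. --->
<cfset greedyPattern = '(<a href=")(.*)(" target="_blank" >)(.*)(</a>)'>
<cfset lazyPattern = '(<a href=")(.*?)(" target="_blank" >)(.*?)(</a>)'>

<!--- The fourth argument asks for the positions and lengths of the subexpressions. --->
<cfset greedyMatch = reFindNoCase(greedyPattern, thisText, 1, true)>
<cfset lazyMatch = reFindNoCase(lazyPattern, thisText, 1, true)>

<!--- Index 1 is the whole match, so group n sits at index n + 1.
      Group 4 (the link text) is therefore at index 5. --->
<cfset greedyText = mid(thisText, greedyMatch.pos[5], greedyMatch.len[5])>
<cfset lazyText = mid(thisText, lazyMatch.pos[5], lazyMatch.len[5])>

<cfdump var="#greedyText#">
<cfdump var="#lazyText#">
```

With the greedy pattern, the first dump should run on past the first closing a tag, as described above. With the non-greedy pattern, the second dump should stop at "Some text here".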

Comments
I think you want this regex instead: (<a>)(.*?)(</a>)

The "?" qualifies the statement as non-greedy, which is what you want.
# Posted By Steve Bryant | 3/3/07 12:05 PM
Thanks Steve!
I won't pretend I understand what you just said, but it works.

Regex is still an unknown to me!

so the "?" says, "find the first </a> one, then stop"?
# Posted By David | 3/3/07 12:14 PM
David,

That's right.

The concept of "greedy" indicates whether the expression will find as much text as it can or as little. A greedy expression will find the largest match that it can; a non-greedy expression will find the smallest.

So, the "?" makes the expression non-greedy so it will find the smallest match (in your case, just one link).
# Posted By Steve Bryant | 3/3/07 12:21 PM
Thanks Steve.
So would that suggest The Regex Coach got it wrong?
# Posted By David | 3/3/07 12:48 PM
I haven't ever used Regex Coach, so I can't say if it got it wrong or if it just doesn't ask for specific enough information to get the answer you wanted in this case.
# Posted By Steve Bryant | 3/6/07 2:45 AM
Fair enuff.
Thanks again for your help Steve!
# Posted By David | 3/6/07 5:20 PM