Posted to regexp-dev@jakarta.apache.org by bu...@apache.org on 2001/02/28 10:08:09 UTC

[Bug 744] New - case independent matchflag not working

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=744

*** shadow/744	Wed Feb 28 01:08:08 2001
--- shadow/744.tmp.4632	Wed Feb 28 01:08:08 2001
***************
*** 0 ****
--- 1,26 ----
+ +============================================================================+
+ | case independent matchflag not working                                     |
+ +----------------------------------------------------------------------------+
+ |        Bug #: 744                         Product: Regexp                  |
+ |       Status: NEW                         Version: unspecified             |
+ |   Resolution:                            Platform: PC                      |
+ |     Severity: Major                    OS/Version: Linux                   |
+ |     Priority: Medium                    Component: Other                   |
+ +----------------------------------------------------------------------------+
+ |  Assigned To: regexp-dev@jakarta.apache.org                                |
+ |  Reported By: alex@7val.de                                                 |
+ |      CC list: Cc:                                                          |
+ +----------------------------------------------------------------------------+
+ |          URL:                                                              |
+ +============================================================================+
+ |                              DESCRIPTION                                   |
+ I'm using V 1.1 of jakarta-regexp and the MATCH_CASEINDEPENDENT flag doesn't
+ seem to be working, neither when I set it using setMatchFlags() nor when I
+ pass it to the constructor.
+ 
+ This is my regexp:
+ queryLink = new RE("href=\"?([^ ]+\\?[^
+ queryLink.setMatchFlags(RE.MATCH_CASEINDEPENDENT);
+ 
+ I get different results depending on whether I write "href" or "HREF".
+ Personally I don't find it that urgent, but it's a convenient feature.
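
For comparison, the behaviour the report expects can be sketched with java.util.regex. This is a different engine from jakarta-regexp, used here only because it ships with the JDK, so it says nothing about how RE.MATCH_CASEINDEPENDENT is implemented. The pattern is a hypothetical simplification (the one in the report is cut off), and the class and method names are made up for the illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CaseIndependentSketch {
    // Hypothetical, simplified pattern; NOT the truncated one from the report.
    static final Pattern LINK =
        Pattern.compile("href=\"?([^ \">]+)", Pattern.CASE_INSENSITIVE);

    /** Returns the first link target, or null if no href attribute is found. */
    static String firstLink(String html) {
        Matcher m = LINK.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // With a working case-independent flag both calls give the same result.
        System.out.println(firstLink("<a href=\"page.html?x=1\">"));
        System.out.println(firstLink("<a HREF=\"page.html?x=1\">"));
    }
}
```

A case-independent match should return the same target for both spellings of the attribute name, which is the property the report says is missing.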

Re: Retrieving indices of repeated closures

Posted by Michael McCallum <mi...@spinsoftware.com>.

The reason I say that is that in most cases when using the regexp engine there is no
need to get more than one match, if any. This is based on my own usage of regular expressions.
It is not really necessary; these were just my ideas.

With the interface proposal I made earlier for matchers/compilers, it would be possible to
instantiate a matcher that did all 3, i.e. no closure returns, single closure returns, and multiple
closure returns, quite efficiently.

Michael

Re: Retrieving indices of repeated closures

Posted by Ian Swett <is...@ispheres.com>.

I had started to do this as you recommended (with Vector), but I'm curious as to
why you say not to initialize until I have 2+ elements.  For efficiency?
I agree it'll save a little bit of time, but I can't imagine that initializing
up front would be a huge slowdown.  But I very well may be wrong, and I'll try
to do it in the most efficient manner anyway.
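
The deferred initialisation being discussed could look roughly like the sketch below. It is only an illustration of the idea, not code from jakarta-regexp: the class ParenSpans and all of its members are hypothetical, and ArrayList stands in for the Vector mentioned in the thread.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical holder for the spans matched by one paren. */
class ParenSpans {
    private int firstStart = -1;
    private int firstEnd = -1;
    private List<int[]> extra;   // stays null until a second span arrives
    private int count;

    /** Records one more span for this paren. */
    void add(int start, int end) {
        if (count == 0) {
            // Common case: a single span needs no list at all.
            firstStart = start;
            firstEnd = end;
        } else {
            if (extra == null) {
                extra = new ArrayList<>();
            }
            extra.add(new int[] { start, end });
        }
        count++;
    }

    int size() {
        return count;
    }

    /** Returns {start, end} of the i-th recorded span. */
    int[] get(int i) {
        if (i < 0 || i >= count) {
            throw new IndexOutOfBoundsException("span " + i + " of " + count);
        }
        return i == 0 ? new int[] { firstStart, firstEnd } : extra.get(i - 1);
    }

    /** True once the list has actually been allocated. */
    boolean allocated() {
        return extra != null;
    }
}
```

The saving is one list allocation per paren for every match in which the paren matched at most once, which is the case the thread expects to be most common.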

Ian

On Wed, 28 Mar 2001, Michael McCallum wrote:

> The simple solution is to have a vector for each paren. At the moment only the last match of
> each paren is saved, as the matches are put into an array.
> 
> Just don't initialise the vector until you have 2+ elements.
> 
> The workaround is to use a reluctant match and then call it repeatedly using the match(string,
> index) method, which is supposed to be internal, but I don't think it is.
> 
> Michael
> 
> On 27 Mar 2001, at 11:01, Ian Swett wrote:
> 
> > I would like to be able to gather all the indices of matches inside
> > repeated closures, instead of just retrieving one of them.
> > 
> > A simple example:
> > matching regexp (abc)* against zyabcabcabc  will only allow the start/end
> > locations of one of the three abc sequences to be retrieved.  I would like
> > to retrieve all of them.  Obviously this is an excessively simple example,
> > but it conveys the idea.
> > 
> > Is there a relatively simple way to do this already?  Or a workaround I
> > could use that wouldn't be very ugly or slow?
> > 
> > Is there enough demand to add this capability?  It seems like it should be
> > relatively easy to do.  I could start working on it if others want it.
> > 
> > Ian

Re: Retrieving indices of repeated closures

Posted by Michael McCallum <mi...@spinsoftware.com>.

The simple solution is to have a vector for each paren. At the moment only the last match of
each paren is saved, as the matches are put into an array.

Just don't initialise the vector until you have 2+ elements.

The workaround is to use a reluctant match and then call it repeatedly using the match(string,
index) method, which is supposed to be internal, but I don't think it is.
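
A minimal sketch of that kind of workaround follows. It is written against java.util.regex rather than jakarta-regexp so that it runs with the JDK alone, and it matches the group body once per call instead of using a reluctant closure. Matcher.find(int) plays the role of a match-from-index call; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RepeatedMatchSketch {
    /** Collects {start, end} for every occurrence, resuming after each match. */
    static List<int[]> allSpans(String regex, String input) {
        List<int[]> spans = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        int pos = 0;
        while (pos <= input.length() && m.find(pos)) {
            spans.add(new int[] { m.start(), m.end() });
            // Step past empty matches so the loop always advances.
            pos = (m.end() > m.start()) ? m.end() : m.end() + 1;
        }
        return spans;
    }

    public static void main(String[] args) {
        // Prints 2-5, 5-8 and 8-11 for the example from the thread.
        for (int[] span : allSpans("abc", "zyabcabcabc")) {
            System.out.println(span[0] + "-" + span[1]);
        }
    }
}
```

Unlike a true repeated closure, this loop does not require the occurrences to be adjacent, so it only approximates what capturing every iteration of (abc)* would return.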

Michael

On 27 Mar 2001, at 11:01, Ian Swett wrote:

> I would like to be able to gather all the indices of matches inside
> repeated closures, instead of just retrieving one of them.
> 
> A simple example:
> matching regexp (abc)* against zyabcabcabc  will only allow the start/end
> locations of one of the three abc sequences to be retrieved.  I would like
> to retrieve all of them.  Obviously this is an excessively simple example,
> but it conveys the idea.
> 
> Is there a relatively simple way to do this already?  Or a workaround I
> could use that wouldn't be very ugly or slow?
> 
> Is there enough demand to add this capability?  It seems like it should be
> relatively easy to do.  I could start working on it if others want it.
> 
> Ian

Retrieving indices of repeated closures

Posted by Ian Swett <is...@ispheres.com>.

I would like to be able to gather all the indices of matches inside
repeated closures, instead of just retrieving one of them.

A simple example:
matching regexp (abc)* against zyabcabcabc  will only allow the start/end
locations of one of the three abc sequences to be retrieved.  I would like
to retrieve all of them.  Obviously this is an excessively simple example,
but it conveys the idea.
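
The limitation can be reproduced with a small sketch. It uses java.util.regex as a stand-in for jakarta-regexp (an assumption made so the example runs with the JDK alone), and it prefixes the pattern with "zy" so that the whole input is matched; the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastIterationSketch {
    /** Returns {start, end} of group 1 after a full match, or null if unavailable. */
    static int[] lastGroupSpan(String input) {
        Matcher m = Pattern.compile("zy(abc)*").matcher(input);
        if (!m.matches() || m.start(1) < 0) {
            return null;   // no match, or the group never took part in it
        }
        return new int[] { m.start(1), m.end(1) };
    }

    public static void main(String[] args) {
        int[] span = lastGroupSpan("zyabcabcabc");
        // Only the final iteration is reported: prints 8-11
        System.out.println(span[0] + "-" + span[1]);
    }
}
```

The spans 2-5 and 5-8 of the first two iterations are overwritten during matching, so they cannot be recovered from the matcher afterwards.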

Is there a relatively simple way to do this already?  Or a workaround I
could use that wouldn't be very ugly or slow?

Is there enough demand to add this capability?  It seems like it should be
relatively easy to do.  I could start working on it if others want it.

Ian