Posted to users@spamassassin.apache.org by Justin Mason <jm...@jmason.org> on 2007/06/02 12:21:26 UTC

Re: sa-compile and backlashes

Gus writes:
> Loren Wilton wrote:
> >> rule work, but the ones with the "\|\\\|" (matching "|\|") all choke:
> >>
> >> "fro|\""|"            {RET("__XM_Sft_Ms_Fp_L33T");}
> >>
> >> It produces the above output, instead of "fro|\|". Not sure what it's 
> >> doing. Any solution other than commenting out the offending rules?
> > 
> > What version of re2c?   There are reports that versions before version 12 
> > "have bugs".
> > The bugs aren't specified, but I assume they could include miscompiling 
> > the regexes.
> 
> It's v0.12. I don't think it's re2c anyway--re2c is choking on it because 
> sa-compile is producing the above .re file. That line of the .re file 
> should start with "fro|\|", and if I manually edit it, it compiles fine.
> 
> It's consistently interpreting escaped backslashes (\\) as either \" or 
> \"", which screws re2c up because that creates an incorrect amount of "'s. 
> I even tried escaping it as hex "\x52" (or whatever the right number 
> was--don't have my ascii table handy anymore :).

yep, this sounds like a bug in the code which extracts the "base
strings" from the full regexp: the BodyRuleBaseExtractor plugin, iirc.
could you open a bugzilla ticket?

--j.
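
For illustration, here is a rough sketch of the kind of escaping the
thread is about: turning an extracted base string into a double-quoted
literal for a .re rule line without unbalancing the quotes. It is not
sa-compile's or BodyRuleBaseExtractor's actual code (those are Perl);
the function names quote_for_re2c and rule_line are hypothetical, and
the sketch assumes re2c's quoted literals use C-style escapes, so that
a backslash is written as \\ and a double quote as \". Whether that is
exactly what sa-compile is meant to emit is not confirmed here.

```python
def quote_for_re2c(base: str) -> str:
    """Wrap a literal base string in double quotes for a re2c rule line.

    Assumption: re2c string literals follow C-style escaping, so only
    the backslash and the double quote need special handling here.
    """
    out = []
    for ch in base:
        if ch == "\\":
            out.append("\\\\")   # one backslash becomes two
        elif ch == '"':
            out.append('\\"')    # a quote becomes backslash + quote
        else:
            out.append(ch)
    return '"' + "".join(out) + '"'


def rule_line(base: str, rule_name: str) -> str:
    """Build a line in the same shape as the .re output quoted above."""
    return '%s {RET("%s");}' % (quote_for_re2c(base), rule_name)


if __name__ == "__main__":
    # The base string fro|\| from the thread (backslash doubled only
    # because this is Python source).
    print(rule_line("fro|\\|", "__XM_Sft_Ms_Fp_L33T"))
```

The point of escaping one character at a time is that each backslash or
quote in the input is handled exactly once, so the literal always ends
up with exactly two unescaped double quotes, whatever the input is.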