Posted to users@spamassassin.apache.org by Justin Mason <jm...@jmason.org> on 2007/06/02 12:21:26 UTC
Re: sa-compile and backlashes
Gus writes:
> Loren Wilton wrote:
> >> rule work, but the ones with the "\|\\\|" (matching "|\|") all choke:
> >>
> >> "fro|\""|" {RET("__XM_Sft_Ms_Fp_L33T");}
> >>
> >> It produces the above output, instead of "fro|\|". Not sure what it's
> >> doing. Any solution other than commenting out the offending rules?
> >
> > What version of re2c? There are reports that versions before 0.12
> > "have bugs".
> > The bugs aren't specified, but I assume they could include miscompiling
> > the regexes.
>
> It's v0.12. I don't think it's re2c anyway--re2c is choking on it because
> sa-compile is producing the above .re file. That line of the .re file
> should start with "fro|\|", and if I manually edit it, it compiles fine.
>
> It's consistently interpreting escaped backslashes (\\) as either \" or
> \"", which screws re2c up because that leaves an unbalanced number of "'s.
> I even tried escaping it as hex "\x52" (or whatever the right number
> was--don't have my ASCII table handy anymore :).
yep, this sounds like a bug in the code which extracts the "base
strings" from the full regexp: the BodyRuleBaseExtractor plugin, iirc.
could you open a bugzilla ticket?
--j.
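
For anyone following along, here is a rough sketch of the quoting problem described above. It is not the BodyRuleBaseExtractor code. It is a small Python illustration, with both function names made up for this example, and it assumes that re2c's double-quoted literals use C-style escapes (a literal backslash written as \\ and a literal double quote as \"). The first function escapes a base string under that assumption; the second counts unescaped double quotes, which is one way to spot a line like the broken one quoted earlier.

```python
def escape_for_quoted_literal(base: str) -> str:
    """Wrap a literal base string in double quotes.

    Assumption: the target syntax uses C-style escapes, so backslash
    and double quote each need a leading backslash.
    """
    out = []
    for ch in base:
        if ch in ('\\', '"'):
            out.append('\\')
        out.append(ch)
    return '"' + ''.join(out) + '"'


def unescaped_quote_count(line: str) -> int:
    """Count double quotes that are not preceded by an escaping backslash."""
    count = 0
    i = 0
    while i < len(line):
        if line[i] == '\\':
            i += 2  # skip the backslash and the character it escapes
            continue
        if line[i] == '"':
            count += 1
        i += 1
    return count


if __name__ == "__main__":
    base = "fro|\\|"  # six characters: f r o | \ |
    print(escape_for_quoted_literal(base))

    # The generated line quoted earlier in the thread.
    broken = r'"fro|\""|" {RET("__XM_Sft_Ms_Fp_L33T");}'
    print(unescaped_quote_count(broken))  # 5 (odd, so the quotes are unbalanced)
```

An odd count means at least one string literal on the line is left open, which matches the symptom reported above. Whether the extractor should emit \\ or something else for a literal backslash depends on what re2c actually accepts, so treat the escaping rule here as a guess to check against the re2c manual.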