Posted to dev@httpd.apache.org by Jim Jagielski <ji...@jaguNET.com> on 2007/03/13 14:24:25 UTC

sed filter module

There have been times when having a simple sed filter in Apache
would be useful... I used to use just ext_filter to do this,
but this got more and more painful the more I used it. So a while
ago I made mod_sed_filter which I find pretty useful. I've just
built and tested it with 2.2 and trunk...

Anyone mind if I fold it into trunk and maybe have us
consider making it part of 2.2 (even under experimental)?

No docs yet but the code is:

	http://people.apache.org/~jim/code/mod_sed_filter.c

and the usage is easy:

	AddOutputFilterByType SEDFILTER text/html
	Sed s/foo/bar/in
	Sed s#monkey(hat)#chimp-$1#i
	Sed "s/works/functions/in"

Note that it uses sed line controls and flexible
delims, and supports both regex and simple pattern matching (the 'n'
flag... no real sed option there ;) )
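
For readers new to the syntax, here is a minimal sketch of how a rule such as
s#monkey(hat)#chimp-$1#i could be split into pattern, replacement and flags
when the delimiter is whatever character follows the leading 's'. It is not
taken from mod_sed_filter.c: sed_rule and parse_sed_rule are hypothetical
names, the field sizes are arbitrary, and backslash-escaped delimiters are
deliberately ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical holder for one parsed rule; fixed-size fields keep the sketch short. */
typedef struct {
    char pattern[128];
    char replacement[128];
    char flags[8];
} sed_rule;

/* Split "s<d>pattern<d>replacement<d>flags", where <d> is the character that
 * follows the leading 's'. Returns 0 on success, -1 on malformed input.
 * Escaped delimiters are not handled in this sketch. */
int parse_sed_rule(const char *spec, sed_rule *out)
{
    assert(out != NULL);
    if (spec == NULL || spec[0] != 's' || spec[1] == '\0')
        return -1;

    char delim = spec[1];
    const char *pat = spec + 2;
    const char *pat_end = strchr(pat, delim);
    if (pat_end == NULL)
        return -1;

    const char *rep = pat_end + 1;
    const char *rep_end = strchr(rep, delim);
    if (rep_end == NULL)
        return -1;

    const char *flags = rep_end + 1;
    size_t pat_len = (size_t)(pat_end - pat);
    size_t rep_len = (size_t)(rep_end - rep);
    size_t flag_len = strlen(flags);

    /* Reject empty patterns and anything that would overflow the fields. */
    if (pat_len == 0 || pat_len >= sizeof out->pattern ||
        rep_len >= sizeof out->replacement || flag_len >= sizeof out->flags)
        return -1;

    memcpy(out->pattern, pat, pat_len);
    out->pattern[pat_len] = '\0';
    memcpy(out->replacement, rep, rep_len);
    out->replacement[rep_len] = '\0';
    memcpy(out->flags, flags, flag_len + 1);   /* includes the NUL */
    return 0;
}
```

Interpreting the flags ('i', 'n') and compiling the pattern would be separate
steps after this split.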

Re: sed filter module

Posted by Nick Kew <ni...@webthing.com>.
On Wed, 14 Mar 2007 11:15:00 -0400
Jim Jagielski <ji...@jaguNET.com> wrote:

> 
> On Mar 14, 2007, at 11:01 AM, Nick Kew wrote:
> 
> > Oh, I guess you mean the copying to get a null-terminated string
> > when applying a regexp?  And I see it's repeated for every regexp
> > (ouch)!  mod_line_edit uses a local pool which is cleared at the
> > end of each brigade, and avoids multiple copies of the same buffer.
> >
> 
> Hmmm... I'm confused. The way I do it is:
> 
> loop over sed scripts
>    loop over buckets
>      read bucket
>        make copy of bucket data for regex comparison

You're right, I was confused, and mod_line_edit does exactly the same.
What I'd like to get rid of is that copy inside the loop: once
copied, the copied bucket data should be reusable for other scripts.
But as we both found, that's harder!

-- 
Nick Kew

Application Development with Apache - the Apache Modules Book
http://www.apachetutor.org/

Re: sed filter module

Posted by Jim Jagielski <ji...@jaguNET.com>.
On Mar 14, 2007, at 11:01 AM, Nick Kew wrote:

> Oh, I guess you mean the copying to get a null-terminated string
> when applying a regexp?  And I see it's repeated for every regexp
> (ouch)!  mod_line_edit uses a local pool which is cleared at the
> end of each brigade, and avoids multiple copies of the same buffer.
>

Hmmm... I'm confused. The way I do it is:

loop over sed scripts
   loop over buckets
     read bucket
       make copy of bucket data for regex comparison

so every time we read in bucket data, I have to make
a null-termed string. It changes with each bucket.
So I don't understand the issue with it being "repeated
for every regexp". How can that be avoided?

I reuse allocated space (I don't just simply keep
making strdups)... so yeah, there will be a chunk
of allocated pool still hanging around. So maybe
making that a subpool and then clearing/destroying
it would be best.
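
The "reuse allocated space" idea can be sketched in plain C as follows. This
is an illustration only: it uses realloc/free rather than APR pools (the
actual module would allocate from a pool or subpool), and scratch_buf,
scratch_copy and scratch_free are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical scratch buffer that is grown on demand and reused across buckets. */
typedef struct {
    char *data;
    size_t cap;
} scratch_buf;

/* Copy len bytes of bucket data into the buffer and NUL-terminate it so a
 * regex routine expecting a C string can use it. The allocation is only
 * touched when the current capacity is too small.
 * Returns the terminated string, or NULL if memory could not be obtained. */
char *scratch_copy(scratch_buf *sb, const char *bytes, size_t len)
{
    assert(sb != NULL && (bytes != NULL || len == 0));

    if (len + 1 > sb->cap) {
        char *grown = realloc(sb->data, len + 1);
        if (grown == NULL)
            return NULL;
        sb->data = grown;
        sb->cap = len + 1;
    }
    if (len > 0)
        memcpy(sb->data, bytes, len);
    sb->data[len] = '\0';
    return sb->data;
}

/* Release the buffer: the plain-C counterpart of clearing or destroying a subpool. */
void scratch_free(scratch_buf *sb)
{
    assert(sb != NULL);
    free(sb->data);
    sb->data = NULL;
    sb->cap = 0;
}
```

The buffer only ever grows to the size of the largest bucket seen, which is
the "chunk still hanging around" until scratch_free is called.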

Re: sed filter module

Posted by Ruediger Pluem <rp...@apache.org>.

On 03/14/2007 09:14 PM, Jim Jagielski wrote:
> As a rough proof of concept, I refactored the design,

Ahh, and I was wondering myself why I could not find Joe's
concerns. I read your refactored code :-).

> allowing for the pattern matching and substitution to be
> done as soon as we have a "line". Also there is some
> rough ability to pass the data to the next filter
> after we get more than ~AP_MIN_BYTES_TO_WRITE bytes.

Sounds good, but do we need to do our own buffering here?
Shouldn't we leave this work to the filters down in the chain?
They also know how to handle flush buckets in this case.
Which brings up an interesting question: What do we do if there
is a flush bucket in the middle of a "line"?
Flush immediately and risk an unprocessed line or wait for the line
break and flush after this line (the second option seems to be more
sane to me).
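
The second option (hold the flush until the current line is complete) can be
modelled with a few counters. This is a sketch of the policy only, not of any
real bucket handling: line_state, feed_data and feed_flush are hypothetical
names, and "sending" is represented by incrementing a counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-filter state for the deferred-flush policy. */
typedef struct {
    size_t partial_len;   /* bytes of an unfinished line currently held back */
    int flush_deferred;   /* a flush arrived while a line was incomplete */
    int flushes_sent;     /* flushes forwarded downstream so far */
    int lines_sent;       /* complete lines forwarded downstream so far */
} line_state;

/* Account for a chunk of data; every '\n' completes a line. A deferred flush
 * is released immediately after the line that was pending when it arrived. */
void feed_data(line_state *st, const char *buf, size_t len)
{
    assert(st != NULL && (buf != NULL || len == 0));
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\n') {
            st->partial_len = 0;
            st->lines_sent++;
            if (st->flush_deferred) {
                st->flush_deferred = 0;
                st->flushes_sent++;
            }
        } else {
            st->partial_len++;
        }
    }
}

/* A flush marker arrived: forward it now only if no partial line is held. */
void feed_flush(line_state *st)
{
    assert(st != NULL);
    if (st->partial_len == 0)
        st->flushes_sent++;
    else
        st->flush_deferred = 1;
}
```

The cost of this policy is latency: a flush requested in the middle of a very
long line is delayed until that line ends.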

> Doesn't alleviate all the problems, but it allows
> for us to pass data quicker (we still have the issue
> where we need to fully read in the bb though...)

But from a first glance this approach has only a memory consumption
proportional to the longest line.
And I am not sure how you want to avoid reading the whole bb
if you want to modify the content. IMHO you can only avoid having
the *whole* content in memory at a certain point of time.

> It's rough but passes superficial testing...
> 
> More work needs to be done, but more people could
> work on it if I just commit to trunk :)

And then it would be easy to track the changes :-)

Regards

Rüdiger

Re: sed filter module

Posted by Jim Jagielski <ji...@jaguNET.com>.
As a rough proof of concept, I refactored the design,
allowing for the pattern matching and substitution to be
done as soon as we have a "line". Also there is some
rough ability to pass the data to the next filter
after we get more than ~AP_MIN_BYTES_TO_WRITE bytes.
Doesn't alleviate all the problems, but it allows
for us to pass data quicker (we still have the issue
where we need to fully read in the bb though...)
It's rough but passes superficial testing...

More work needs to be done, but more people could
work on it if I just commit to trunk :)

Same URL, different version:

     http://people.apache.org/~jim/code/mod_sed_filter.c


Re: sed filter module

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.
On 3/14/07, Nick Kew <ni...@webthing.com> wrote:
> to content size?  Other than when the entire contents arrive in a
> single bucket?

Uh, a file bucket?  -- justin

Re: sed filter module

Posted by Joe Orton <jo...@redhat.com>.
On Wed, Mar 14, 2007 at 06:38:48PM +0000, Nick Kew wrote:
> Now, what leads you to suppose mod_line_edit uses RAM proportional
> to content size?  Other than when the entire contents arrive in a
> single bucket?

Because it implements the naive filter implementation, equivalent to:

e = APR_BRIGADE_FIRST(bb);
while (e != APR_BRIGADE_SENTINEL(bb)) {
   apr_bucket_read(e, ...);
   ...process bucket without passing on to f->next or deleting...
   e = APR_BUCKET_NEXT(e);
}

For the general case, where bb contains a single FILE bucket, or a 
CGI/PIPE bucket, or any morphing bucket type which doesn't represent a 
chunk of memory, this does:

After Iter#	Contents of bb			Heap memory used
1		HEAP FILE			8K
2		HEAP HEAP FILE			16K
3		HEAP HEAP HEAP FILE		24K
...
n		HEAP*n				n*8K

where n ~= file size / 8K; FILE buckets will also morph into MMAP 
buckets so the practice is a bit more complicated but this illustrates 
the point... and the 8K is really 8000 bytes.
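
The arithmetic in the table can be written out as below. This is a simplified
model under the stated figures (8000 bytes per read, nothing passed on or
deleted); it ignores the MMAP case, counts every HEAP bucket as a full chunk,
and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define READ_CHUNK 8000u  /* per the message above, "8K" is really 8000 bytes */

/* Number of HEAP buckets once the whole FILE bucket has been read (rounded up). */
size_t heap_buckets_for(size_t file_size)
{
    return (file_size + READ_CHUNK - 1) / READ_CHUNK;
}

/* Heap bytes pinned after `iterations` reads when nothing is passed on or
 * deleted. Approximation: each HEAP bucket is counted as a full chunk. */
size_t heap_used_after(size_t file_size, size_t iterations)
{
    size_t n = heap_buckets_for(file_size);
    size_t held = iterations < n ? iterations : n;
    return held * READ_CHUNK;
}
```

For a 1,000,000-byte file this gives 125 buckets, so the heap use ends up
equal to the file size, which is the "RAM proportional to content size"
behaviour being criticised.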

joe

Re: sed filter module

Posted by Nick Kew <ni...@webthing.com>.
On Wed, 14 Mar 2007 16:56:41 +0000
Joe Orton <jo...@redhat.com> wrote:

> On Wed, Mar 14, 2007 at 03:45:05PM +0000, Nick Kew wrote:
> > Nope.  Just one brigade's worth at a time.  And the most likely case
> > for that to be an entire document is when it's a static file, and
> > document == brigade == bucket.
> 
> I'm not sure what you're saying here.  Which do you agree with:
> 
> a) size of data represented by a brigade is limited only by apr_off_t

ditto size of a bucket

> b) httpd does use brigades representing large amounts of content e.g. 
> containing FILE or CGI/PIPE buckets

Again, the unit of indefinite size is the bucket

> c) if you loop through all the buckets in a brigade calling read() on 
> every one, you map all the data represented by the brigade into RAM

Indeed.

> d) writing filters which use RAM proportional to content size is bad

Yep.

Now, what leads you to suppose mod_line_edit uses RAM proportional
to content size?  Other than when the entire contents arrive in a
single bucket?

-- 
Nick Kew


Re: sed filter module

Posted by Joe Orton <jo...@redhat.com>.
On Wed, Mar 14, 2007 at 03:45:05PM +0000, Nick Kew wrote:
> Nope.  Just one brigade's worth at a time.  And the most likely case
> for that to be an entire document is when it's a static file, and
> document == brigade == bucket.

I'm not sure what you're saying here.  Which do you agree with:

a) size of data represented by a brigade is limited only by apr_off_t
b) httpd does use brigades representing large amounts of content e.g. 
containing FILE or CGI/PIPE buckets
c) if you loop through all the buckets in a brigade calling read() on 
every one, you map all the data represented by the brigade into RAM
d) writing filters which use RAM proportional to content size is bad

joe

Re: sed filter module

Posted by Nick Kew <ni...@webthing.com>.
On Wed, 14 Mar 2007 15:27:44 +0000
Joe Orton <jo...@redhat.com> wrote:

> On Wed, Mar 14, 2007 at 03:01:53PM +0000, Nick Kew wrote:
> > On Wed, 14 Mar 2007 14:32:13 +0000
> > Joe Orton <jo...@redhat.com> wrote:
> > 
> > > 1) the filtering logic is broken and will consume RAM
> > > proportional to response size.
> > 
> > I must've missed that when I looked.  I thought it used the
> > same logic as mod_line_edit, which is very careful about that.
> 
> It looks just as broken to me.  It will read() from every bucket in
> the input brigade without passing anything on,

Yes, the processing unit is the brigade.  A bucket could easily be
just a byte or two, whereas a brigade is more likely to be a sensible
amount of the data (such as the 8K seen when mod_proxy is driving,
and which is the most common usage case).

>	 so you guarantee that
> the entire response is mapped into RAM for a single filter invocation.

Nope.  Just one brigade's worth at a time.  And the most likely case
for that to be an entire document is when it's a static file, and
document == brigade == bucket.


-- 
Nick Kew


Re: sed filter module

Posted by Joe Orton <jo...@redhat.com>.
On Wed, Mar 14, 2007 at 03:01:53PM +0000, Nick Kew wrote:
> On Wed, 14 Mar 2007 14:32:13 +0000
> Joe Orton <jo...@redhat.com> wrote:
> 
> > 1) the filtering logic is broken and will consume RAM proportional to 
> > response size.
> 
> I must've missed that when I looked.  I thought it used the
> same logic as mod_line_edit, which is very careful about that.

It looks just as broken to me.  It will read() from every bucket in the 
input brigade without passing anything on, so you guarantee that the 
entire response is mapped into RAM for a single filter invocation.

joe

Re: sed filter module

Posted by Nick Kew <ni...@webthing.com>.
On Wed, 14 Mar 2007 14:32:13 +0000
Joe Orton <jo...@redhat.com> wrote:

> 1) the filtering logic is broken and will consume RAM proportional to 
> response size.

I must've missed that when I looked.  I thought it used the
same logic as mod_line_edit, which is very careful about that.

Oh, I guess you mean the copying to get a null-terminated string
when applying a regexp?  And I see it's repeated for every regexp
(ouch)!  mod_line_edit uses a local pool which is cleared at the
end of each brigade, and avoids multiple copies of the same buffer.

> 2) 200-line functions are hard to read :)

mod_line_edit does the same there, but that's definitely being split
(not least so that the actual search-and-replace function can be
re-used in a companion input filter).  And given that it's unusually
well-commented and half of it features as example code in my book,
I don't think it's hard to read:-)

> Nick, are you actually planning to submit mod_line_edit for inclusion
> in the tree?

The subject hasn't arisen until this thread (which caught me rather
off-balance), but I'll be happy to include it if there's demand.

As I hinted, there are some enhancements in the pipeline.
If it goes in to trunk, a roadmap would probably be in order.

-- 
Nick Kew


Re: sed filter module

Posted by Joe Orton <jo...@redhat.com>.
On Tue, Mar 13, 2007 at 09:24:25AM -0400, Jim Jagielski wrote:
> There have been times when having a simple sed filter in Apache
> would be useful... I used to use just ext_filter to do this,
> but this got more and more painful the more I used it. So a while
> ago I made mod_sed_filter which I find pretty useful. I've just
> built and tested it with 2.2 and trunk...
> 
> Anyone mind if I fold it into trunk and maybe have us
> consider making it part of 2.2 (even under experimental)?
> 
> No docs yet but the code is:
> 
> 	http://people.apache.org/~jim/code/mod_sed_filter.c

It would be good to have a simple filter like this in the tree.  From a 
quick review:

1) the filtering logic is broken and will consume RAM proportional to 
response size.  The mantra for writing output filters should be: "read 
buckets, process buckets, pass buckets, repeat"

2) 200-line functions are hard to read :)

...otherwise looks like nice simple code.  I don't see a *big* issue 
with the name implying likeness-of-sed.  mod_{pcre,text}_filter or 
something is as good.

Nick, are you actually planning to submit mod_line_edit for inclusion in 
the tree?

joe
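
The "read, process, pass, repeat" mantra can be illustrated without APR as a
loop that never holds more than one chunk. This is a sketch under heavy
simplification: the "brigade" is a flat byte array, the processing step is a
single-byte substitution (so no line can straddle a chunk boundary), and
stream_filter, sink_fn, sink_stats and count_sink are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK 8000u

/* Stand-in for passing data to the next filter in the chain. */
typedef void (*sink_fn)(const char *data, size_t len, void *ctx);

/* Example sink state: bytes received and number of calls. */
typedef struct {
    size_t bytes;
    size_t calls;
} sink_stats;

/* Example sink: records how much was passed downstream. */
void count_sink(const char *data, size_t len, void *ctx)
{
    sink_stats *st = ctx;
    (void)data;
    st->bytes += len;
    st->calls++;
}

/* Walk src one chunk at a time: read a chunk into a fixed buffer, process it
 * (replace every `from` byte with `to`), pass it on, then reuse the buffer.
 * Returns the largest number of bytes held at once, which never exceeds CHUNK. */
size_t stream_filter(const char *src, size_t len, char from, char to,
                     sink_fn pass, void *ctx)
{
    char buf[CHUNK];
    size_t peak = 0;

    assert((src != NULL || len == 0) && pass != NULL);

    for (size_t off = 0; off < len; ) {
        size_t n = len - off < CHUNK ? len - off : CHUNK;
        memcpy(buf, src + off, n);              /* read */
        for (size_t i = 0; i < n; i++)          /* process */
            if (buf[i] == from)
                buf[i] = to;
        pass(buf, n, ctx);                      /* pass */
        if (n > peak)
            peak = n;
        off += n;                               /* repeat */
    }
    return peak;
}
```

Memory use is bounded by CHUNK regardless of input size, in contrast to
reading every bucket before passing anything on.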

Re: sed filter module

Posted by Nick Kew <ni...@webthing.com>.
On Wed, 14 Mar 2007 13:45:47 +0000
Nick Kew <ni...@webthing.com> wrote:


> As for the particular case Frank asked for, that works by
> expanding the union to include a function pointer alongside
> the strmatch and regexp cases.  So it's also a per-rule
> configuration flag, and never touches the code path except
> where explicitly invoked.

Sorry, I meant the "to" field becomes a union which may
be a function.
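
A "to" field that is either a fixed string or a function might look roughly
like this. It is a guess at the shape, not mod_line_edit's actual data
structure: rule, to_kind, to_fn, to_upper_fn and rule_replacement are all
hypothetical names, and the "from" side is reduced to a literal string.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Computes a replacement from the matched text; returns the length written. */
typedef size_t (*to_fn)(const char *match, char *out, size_t outsz);

typedef enum { TO_STRING, TO_FUNCTION } to_kind;

typedef struct {
    const char *from;     /* literal to match (regex case omitted in this sketch) */
    to_kind kind;         /* per-rule flag selecting the union member */
    union {
        const char *str;  /* fixed replacement text */
        to_fn fn;         /* function producing the replacement */
    } to;
} rule;

/* Example function: upper-case the matched text, truncating to fit. */
size_t to_upper_fn(const char *match, char *out, size_t outsz)
{
    size_t i = 0;
    assert(match != NULL && out != NULL && outsz > 0);
    for (; match[i] != '\0' && i + 1 < outsz; i++)
        out[i] = (char)toupper((unsigned char)match[i]);
    out[i] = '\0';
    return i;
}

/* Produce the replacement for one match. The function path is only taken
 * for rules configured with TO_FUNCTION; plain rules just copy the string. */
size_t rule_replacement(const rule *r, const char *match, char *out, size_t outsz)
{
    assert(r != NULL && match != NULL && out != NULL && outsz > 0);
    if (r->kind == TO_FUNCTION)
        return r->to.fn(match, out, outsz);

    size_t n = strlen(r->to.str);
    if (n >= outsz)
        n = outsz - 1;                 /* truncate rather than overflow */
    memcpy(out, r->to.str, n);
    out[n] = '\0';
    return n;
}
```

Rules that use a plain string pay only for one enum comparison, which matches
the point that the extra code path is touched only where explicitly invoked.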


-- 
Nick Kew


Re: sed filter module

Posted by Nick Kew <ni...@webthing.com>.
On Wed, 14 Mar 2007 09:25:11 -0400
Jim Jagielski <ji...@jaguNET.com> wrote:

> 
> On Mar 14, 2007, at 5:07 AM, Frank wrote:
> 
> >
> > RewriteBodyLine 'http://(.*?)/(.*)/(.*)'
> > 'http://${LOWERCASE:$1}/${MD5:$2}/$3'
> >
> 
> Yeah, that would be useful... Of course, the main issue is
> that whereas mod_rewrite can afford to be dog slow, because,
> after all, the URLs aren't *that* big, in-place rewriting
> of content can't be. The more complex the functionality,
> the slower it will be... :/

Solved in mod_line_edit: the code path for extra functionality
(such as per-rule conditional execution and environment variable
substitution) is invoked only when required.

As for the particular case Frank asked for, that works by
expanding the union to include a function pointer alongside
the strmatch and regexp cases.  So it's also a per-rule
configuration flag, and never touches the code path except
where explicitly invoked.

-- 
Nick Kew


Re: sed filter module

Posted by Jim Jagielski <ji...@jaguNET.com>.
On Mar 14, 2007, at 5:07 AM, Frank wrote:

>
> RewriteBodyLine 'http://(.*?)/(.*)/(.*)'
> 'http://${LOWERCASE:$1}/${MD5:$2}/$3'
>

Yeah, that would be useful... Of course, the main issue is
that whereas mod_rewrite can afford to be dog slow, because,
after all, the URLs aren't *that* big, in-place rewriting
of content can't be. The more complex the functionality,
the slower it will be... :/

Re: sed filter module

Posted by Nick Kew <ni...@webthing.com>.
On Wed, 14 Mar 2007 10:07:49 +0100
Frank <fr...@x09.de> wrote:

> Just wanted to add my two cents worth...
> 
> We are using mod_line_edit a lot and would like to see a similar 
> functionality coming with Apache by default. :-)

Sounds like a vote.

> If I am correct, mod_line_edit has the 'wrong' license model for
> being included into Apache by default.

Indeed.  When my modules have been integrated into the standard
distribution in the past, they've moved to the Apache license.
It's not a problem when there's a good reason for it.

> Just for your information: There are more modules having a similar 
> functionality:

Interesting!
> 
> http://mod-replace.sourceforge.net/

That one's genuinely interesting.  Looks like an alternative
reverse-proxy solution, combining filtering with the mod_proxy cookie
rewriting that was missing in 2.0.  But it buffers an entire response
in memory, which limits its usefulness.

> http://yomi.2288.org/forum/ftopic22.html (given by 
> http://modules.apache.org/search?id=857)

My Chinese isn't up to finding a download link there!

> http://happygiraffe.net/mod_sed.html (VERY old)

No thank you:-)

> All modules are missing a feature we would like to see: Like in 
> mod_rewrite's RewriteMap it would be cool to specify a function being 
> called on the argument while replacing. E.g.:
> 
> RewriteBodyLine 'http://(.*?)/(.*)/(.*)' 
> 'http://${LOWERCASE:$1}/${MD5:$2}/$3'

This kind of feature is on the to-do list, amongst some
hacks-in-progress that have yet to reach the mod_line_edit site.
This is actually what alarms me somewhat about the prospect of
a different but near-identical module in /trunk/: it leaves me 
either abandoning or redoing some of this stuff.

> P.S.: And I vote for a better name like 'mod_filter_pcre' ...

But it isn't.  It offers string as well as regex matching!

-- 
Nick Kew


Re: sed filter module

Posted by Frank <fr...@x09.de>.
Just wanted to add my two cents worth...

We are using mod_line_edit a lot and would like to see a similar 
functionality coming with Apache by default. :-)

If I am correct, mod_line_edit has the 'wrong' license model for being 
included into Apache by default.

Just for your information: There are more modules having a similar 
functionality:

http://mod-replace.sourceforge.net/
http://yomi.2288.org/forum/ftopic22.html (given by 
http://modules.apache.org/search?id=857)
http://happygiraffe.net/mod_sed.html (VERY old)


All modules are missing a feature we would like to see: Like in 
mod_rewrite's RewriteMap it would be cool to specify a function being 
called on the argument while replacing. E.g.:

RewriteBodyLine 'http://(.*?)/(.*)/(.*)' 
'http://${LOWERCASE:$1}/${MD5:$2}/$3'
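
To make the request concrete, here is a small sketch of expanding such a
replacement template once the capture groups are known. RewriteBodyLine is a
proposed directive, not an existing one, and expand_template is a
hypothetical helper. Only $N (single digit) and ${LOWERCASE:$N} are handled;
MD5 is left out because it would need a hashing library, so any other
function name is rejected.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Expand "$N" and "${LOWERCASE:$N}" in tmpl using caps[0..ncaps), where
 * caps[0] corresponds to "$1". Returns 0 on success, -1 on a malformed
 * template, an unknown function name, a bad group index, or lack of space. */
int expand_template(const char *tmpl, const char *const *caps, size_t ncaps,
                    char *out, size_t outsz)
{
    size_t len = 0;
    assert(tmpl != NULL && caps != NULL && out != NULL && outsz > 0);

    while (*tmpl != '\0') {
        int lower = 0, braced = 0;

        if (*tmpl != '$') {                       /* literal byte */
            if (len + 1 >= outsz)
                return -1;
            out[len++] = *tmpl++;
            continue;
        }
        tmpl++;                                   /* skip '$' */
        if (*tmpl == '{') {
            static const char name[] = "{LOWERCASE:$";
            if (strncmp(tmpl, name, sizeof name - 1) != 0)
                return -1;                        /* unknown function */
            tmpl += sizeof name - 1;
            lower = braced = 1;
        }
        if (*tmpl < '1' || *tmpl > '9')
            return -1;
        size_t idx = (size_t)(*tmpl++ - '1');
        if (idx >= ncaps)
            return -1;
        if (braced && *tmpl++ != '}')
            return -1;

        /* Copy the captured text, lower-casing it if requested. */
        for (const char *c = caps[idx]; *c != '\0'; c++) {
            if (len + 1 >= outsz)
                return -1;
            out[len++] = lower ? (char)tolower((unsigned char)*c) : *c;
        }
    }
    out[len] = '\0';
    return 0;
}
```

A real implementation would presumably look the function name up in a table,
as RewriteMap does, rather than hard-coding one name.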

... as I said before: just my $.02

P.S.: And I vote for a better name like 'mod_filter_pcre' ...

Re: sed filter module

Posted by Nick Kew <ni...@webthing.com>.
On Tue, 13 Mar 2007 09:24:25 -0400
Jim Jagielski <ji...@jaguNET.com> wrote:


> 	http://people.apache.org/~jim/code/mod_sed_filter.c

At a glance, it looks like mod_line_edit.
Are you doing anything different?

-- 
Nick Kew


Re: sed filter module

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
Nick Kew wrote:
> 
> I'm even more confused now, because I thought you were with Covalent,
> and I understood from Will that mod_line_edit was widely used by
> clients of Covalent.  Please tell me what I'm missing?

Just to ensure I'm not misquoted, I know I've suggested mod_line_edit
to a few Covalent clients whose desired manipulations would be best served
by a raw text manipulation program (e.g. no html/xml aware transforms).
I'm not clear if they adopted it (I haven't gotten follow up questions)
but I had passed on a quiet inquiry to you if you would be available for
consulting or support if users encountered issues, on Covalent's nickel,
of course, as anything we 'endorse' we back up in our support contracts.

Personally can't speak to any of your other questions or concerns, since
I just became aware of this module when you did.  But I'm sure Jim will
respond and satisfy your concerns.

Bill


Re: sed filter module

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
Jim Jagielski wrote:
> 
> Bill told me about mod_line_edit maybe 3-4 days ago.
> I had known about mod_proxy_html, which is also something
> we've pointed clients to, so maybe that's where
> the confusion comes from.

Good point - in my experience mod_proxy_html is much more broadly
adopted both by our customers, and by others I chat with at users@,
because it appears (to them) to be the obvious solution to their problem.

Most don't even realize that mod_line_edit can accomplish the same
(and perhaps more efficiently) in many cases :)

Bill

Re: sed filter module

Posted by Jim Jagielski <ji...@jaguNET.com>.
On Mar 13, 2007, at 2:08 PM, Nick Kew wrote:

>
> AFAICS, this not merely looks like mod_line_edit: the filter *is*
> mod_line_edit, right down to the bucket manipulation logic used as
> an example in The Book!  It's just missing a couple of minor features,
> and has a slightly different configuration syntax.  The other difference
> is 15 months "out there" in widespread use.
>

What logic? Let me know what sections you mean because
most of what I based it on is stuff from mod_include
and mod_proxy_ftp.c (and other ASF modules). I don't see
anything in either module which is "new" or not done by
any other modules out there that need to split out sections
from buckets.

Bill told me about mod_line_edit maybe 3-4 days ago.
I had known about mod_proxy_html, which is also something
we've pointed clients to, so maybe that's where
the confusion comes from.


Re: sed filter module

Posted by Nick Kew <ni...@webthing.com>.
On Tue, 13 Mar 2007 13:34:07 -0400
Jim Jagielski <ji...@jaguNET.com> wrote:

> 
> On Mar 13, 2007, at 1:10 PM, William A. Rowe, Jr. wrote:
> 
> >
> > Is this sed or pcre syntax?  I'm a bit confused :)
> >
> 
> It's a mutant ;) But, of course, we maintain
> that confusion internally with regex's being pcre...
> 
> > Although it's sed-ish, is it misleading to confuse the user with the
> > phrase sed considering the unsupported constructs?  E.g. I presume
> > the more complex sed language features aren't present.
> >
> > I'm wondering if mod_pcre_filter wouldn't be more accurate?
> >
> 
> 'sed' certainly gets the message across though :)
> But basically it allows for regex pattern matching
> and substitution in a very sed-like way.
> 
> But agreed that docs would help this....

AFAICS, this not merely looks like mod_line_edit: the filter *is*
mod_line_edit, right down to the bucket manipulation logic used as
an example in The Book!  It's just missing a couple of minor features,
and has a slightly different configuration syntax.  The other difference
is 15 months "out there" in widespread use.

I'm even more confused now, because I thought you were with Covalent,
and I understood from Will that mod_line_edit was widely used by
clients of Covalent.  Please tell me what I'm missing?

-- 
Nick Kew


Re: sed filter module

Posted by Jim Jagielski <ji...@jaguNET.com>.
On Mar 13, 2007, at 3:34 PM, William A. Rowe, Jr. wrote:

> Jim Jagielski wrote:
>>
>> On Mar 13, 2007, at 1:10 PM, William A. Rowe, Jr. wrote:
>>
>>>
>>> Is this sed or pcre syntax?  I'm a bit confused :)
>>
>> It's a mutant ;) But, of course, we maintain
>> that confusion internally with regex's being pcre...
>
> Of course :)  But it appears to be a tiny fraction of the sed language...
>
>>> Although it's sed-ish, is it misleading to confuse the user with the
>>> phrase sed considering the unsupported constructs?  E.g. I presume
>>> the more complex sed language features aren't present.
>>>
>>> I'm wondering if mod_pcre_filter wouldn't be more accurate?
>>
>> 'sed' certainly gets the message across though :)
>> But basically it allows for regex pattern matching
>> and substitution in a very sed-like way.
>
> since it is only a pattern substitution subset, I'd prefer to see some
> RewriteBody directive or similar.  As I'm looking at the module, I'm more
> convinced that Sed "foo" should be reserved for at least a basic sed
> implementation that implemented (at least!) the pre-GNU language subset.
>

:)

Well, like I said, the main issue was avoiding the overhead of
having mod_ext_filter do simple in-line replacements by calling
sed to do 's/foo/bar/'... So yeah, it's closer to what a Perl
guy would think than a Unix sed-head :)


Re: sed filter module

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
Jim Jagielski wrote:
> 
> On Mar 13, 2007, at 1:10 PM, William A. Rowe, Jr. wrote:
> 
>>
>> Is this sed or pcre syntax?  I'm a bit confused :)
> 
> It's a mutant ;) But, of course, we maintain
> that confusion internally with regex's being pcre...

Of course :)  But it appears to be a tiny fraction of the sed language...

>> Although it's sed-ish, is it misleading to confuse the user with the
>> phrase sed considering the unsupported constructs?  E.g. I presume
>> the more complex sed language features aren't present.
>>
>> I'm wondering if mod_pcre_filter wouldn't be more accurate?
> 
> 'sed' certainly gets the message across though :)
> But basically it allows for regex pattern matching
> and substitution in a very sed-like way.

since it is only a pattern substitution subset, I'd prefer to see some
RewriteBody directive or similar.  As I'm looking at the module, I'm more
convinced that Sed "foo" should be reserved for at least a basic sed
implementation that implemented (at least!) the pre-GNU language subset.

Bill

Re: sed filter module

Posted by Jim Jagielski <ji...@jaguNET.com>.
On Mar 13, 2007, at 1:10 PM, William A. Rowe, Jr. wrote:

>
> Is this sed or pcre syntax?  I'm a bit confused :)
>

It's a mutant ;) But, of course, we maintain
that confusion internally with regex's being pcre...

> Although it's sed-ish, is it misleading to confuse the user with the
> phrase sed considering the unsupported constructs?  E.g. I presume
> the more complex sed language features aren't present.
>
> I'm wondering if mod_pcre_filter wouldn't be more accurate?
>

'sed' certainly gets the message across though :)
But basically it allows for regex pattern matching
and substitution in a very sed-like way.

But agreed that docs would help this....

Re: sed filter module

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
Jim Jagielski wrote:
> Anyone mind if I fold it into trunk and maybe have us
> consider making it part of 2.2 (even under experimental)?

+1 to trunk!  No opinion yet on 2.2 (I'm not a big fan of growing
the stable branch since it entirely defeats the drive to release
2.next, ever.)

> No docs yet but the code is:
> 
>     http://people.apache.org/~jim/code/mod_sed_filter.c
> 
> and the usage is easy:
> 
>     AddOutputFilterByType SEDFILTER text/html
>     Sed s/foo/bar/in
>     Sed s#monkey(hat)#chimp-$1#i
>     Sed "s/works/functions/in"
> 
> Note that it uses sed line controls and flexible
> delims, and supports both regex and simple pattern matching (the 'n'
> flag... no real sed option there ;) )

Is this sed or pcre syntax?  I'm a bit confused :)

Although it's sed-ish, is it misleading to confuse the user with the
phrase sed considering the unsupported constructs?  E.g. I presume
the more complex sed language features aren't present.

I'm wondering if mod_pcre_filter wouldn't be more accurate?