Posted to users@httpd.apache.org by Roman Gelfand <rg...@gmail.com> on 2012/01/24 00:29:27 UTC

[users@httpd] mod_proxy_html Issue

I am using this module to rewrite the contents of HTML documents.  It
appears to strip &nbsp;, which causes me all kinds of grief with IE.
Looking briefly at mod_proxy_html.c, I couldn't find any reference to
&nbsp;.  Or is it a setting in the mod_proxy_html config file?

Any suggestions are appreciated.

Thanks in advance

---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.
To unsubscribe, e-mail: users-unsubscribe@httpd.apache.org
   "   from the digest: users-digest-unsubscribe@httpd.apache.org
For additional commands, e-mail: users-help@httpd.apache.org


Re: [users@httpd] mod_proxy_html Issue

Posted by Roman Gelfand <rg...@gmail.com>.
What happens is that "word1&nbsp;word2" becomes "word1 word2".

Perhaps the libxml2 that ships with Debian lenny, the OS I am using, is
outdated.

Is there a way, other than mod_proxy_html, to get rid of the base tag?

Thanks,

On Mon, Jan 23, 2012 at 8:57 PM, Igor Cicimov <ic...@gmail.com> wrote:
> Sorry mate, not a C person myself :) From the look of it, that function
> converts special characters like <, &, > and " into HTML entities. Since
> &nbsp; is already an HTML entity, this function shouldn't affect it at
> all. But as I said, I'm not the right person to comment on this;
> hopefully someone else can help.
>
> I think it would be good, though, if you gave an example of what exactly
> happens after parsing, like "word1&nbsp;word2" becomes "word1 word2" or
> something else...
>
> Cheers,
> Igor



Re: [users@httpd] mod_proxy_html Issue

Posted by Igor Cicimov <ic...@gmail.com>.
Sorry mate, not a C person myself :) From the look of it, that function
converts special characters like <, &, > and " into HTML entities. Since
&nbsp; is already an HTML entity, this function shouldn't affect it at
all. But as I said, I'm not the right person to comment on this;
hopefully someone else can help.

I think it would be good, though, if you gave an example of what exactly
happens after parsing, like "word1&nbsp;word2" becomes "word1 word2" or
something else...

Cheers,
Igor

On Tue, Jan 24, 2012 at 12:40 PM, Roman Gelfand <rg...@gmail.com> wrote:

> I think I have the latest version, as I picked it up from the site.
>
> Actually, after doing a little digging, I found that mod_proxy_html, by
> way of mod_xml2enc, parses the HTML and ultimately puts it back
> together again.  At the time of parsing, it replaces &nbsp; with 0xc2
> 0xa0 (the UTF-8 encoding of U+00A0, the no-break space).
>
> mod_proxy_html.c contains this code:
>
> static void pcharacters(void* ctxt, const xmlChar *uchars, int length) {
>  const char* chars = (const char*) uchars;
>  saxctxt* ctx = (saxctxt*) ctxt ;
>  int i ;
>  int begin ;
>  for ( begin=i=0; i<length; i++ ) {
>    switch (chars[i]) {
>      case '&' : FLUSH ; ap_fputs(ctx->f->next, ctx->bb, "&amp;") ; break ;
>      case '<' : FLUSH ; ap_fputs(ctx->f->next, ctx->bb, "&lt;") ; break ;
>      case '>' : FLUSH ; ap_fputs(ctx->f->next, ctx->bb, "&gt;") ; break ;
>      case '"' : FLUSH ; ap_fputs(ctx->f->next, ctx->bb, "&quot;") ; break ;
>      default : break ;
>    }
>  }
>  FLUSH ;
> }
>
> I suppose I need to add a line for &nbsp;, but it has been a long time
> since I coded in C and I don't know how to handle UTF-8 characters.
> If you could help, I would appreciate it.
>
> Thanks,

Re: [users@httpd] mod_proxy_html Issue

Posted by Roman Gelfand <rg...@gmail.com>.
I think I have the latest version, as I picked it up from the site.

Actually, after doing a little digging, I found that mod_proxy_html, by
way of mod_xml2enc, parses the HTML and ultimately puts it back
together again.  At the time of parsing, it replaces &nbsp; with 0xc2
0xa0 (the UTF-8 encoding of U+00A0, the no-break space).
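
As a sanity check on those byte values, here is a minimal standalone
sketch (not code from mod_proxy_html or libxml2) that encodes a code point
as UTF-8. The helper name utf8_encode is made up for illustration and only
covers code points up to U+FFFF. &nbsp; is U+00A0, which comes out as the
two bytes 0xC2 0xA0:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only: encode a Unicode code point
 * (U+0000..U+FFFF, no surrogate validation) as UTF-8 into out[] and
 * return the number of bytes written (1 to 3). */
static size_t utf8_encode(unsigned int cp, unsigned char out[3]) {
    assert(out != NULL);
    assert(cp <= 0xFFFF);
    if (cp < 0x80) {                     /* 1 byte: 0xxxxxxx */
        out[0] = (unsigned char) cp;
        return 1;
    }
    if (cp < 0x800) {                    /* 2 bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char) (0xC0 | (cp >> 6));
        out[1] = (unsigned char) (0x80 | (cp & 0x3F));
        return 2;
    }
    /* 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx */
    out[0] = (unsigned char) (0xE0 | (cp >> 12));
    out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
    out[2] = (unsigned char) (0x80 | (cp & 0x3F));
    return 3;
}
```

Calling utf8_encode(0x00A0, buf) returns 2 with buf[0] == 0xC2 and
buf[1] == 0xA0, so in a hex dump of the proxied output a no-break space
shows up as c2 a0, while an ordinary space is the single byte 20.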

mod_proxy_html.c contains this code:

static void pcharacters(void* ctxt, const xmlChar *uchars, int length) {
  const char* chars = (const char*) uchars;
  saxctxt* ctx = (saxctxt*) ctxt ;
  int i ;
  int begin ;
  for ( begin=i=0; i<length; i++ ) {
    switch (chars[i]) {
      case '&' : FLUSH ; ap_fputs(ctx->f->next, ctx->bb, "&amp;") ; break ;
      case '<' : FLUSH ; ap_fputs(ctx->f->next, ctx->bb, "&lt;") ; break ;
      case '>' : FLUSH ; ap_fputs(ctx->f->next, ctx->bb, "&gt;") ; break ;
      case '"' : FLUSH ; ap_fputs(ctx->f->next, ctx->bb, "&quot;") ; break ;
      default : break ;
    }
  }
  FLUSH ;
}

I suppose I need to add a line for &nbsp;, but it has been a long time
since I coded in C and I don't know how to handle UTF-8 characters.
If you could help, I would appreciate it.
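
A minimal sketch of the byte handling in question, assuming the text that
reaches the callback is UTF-8, so that &nbsp; arrives as the two bytes
0xC2 0xA0. This is a standalone illustration, not a patch to the module:
escape_text is a made-up name, and it writes into a plain buffer rather
than going through FLUSH and ap_fputs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for pcharacters(): copy `length` bytes of UTF-8
 * text from `chars` into `out`, escaping markup characters and turning
 * the byte pair 0xC2 0xA0 (U+00A0) back into "&nbsp;".
 * Returns the number of bytes written, not counting the trailing NUL,
 * or (size_t)-1 if `out` is too small. */
static size_t escape_text(const char *chars, size_t length,
                          char *out, size_t outsize) {
    size_t o = 0;
    assert(chars != NULL && out != NULL);
    if (outsize == 0) return (size_t) -1;
    for (size_t i = 0; i < length; i++) {
        const char *rep = NULL;
        /* Work on unsigned bytes: plain char may be signed, in which
         * case 0xC2 would otherwise compare as a negative value. */
        unsigned char c = (unsigned char) chars[i];
        switch (c) {
            case '&': rep = "&amp;";  break;
            case '<': rep = "&lt;";   break;
            case '>': rep = "&gt;";   break;
            case '"': rep = "&quot;"; break;
            case 0xC2:
                if (i + 1 < length && (unsigned char) chars[i + 1] == 0xA0) {
                    rep = "&nbsp;";
                    i++;              /* consume the second byte too */
                }
                break;
            default: break;
        }
        if (rep != NULL) {
            size_t n = strlen(rep);
            if (o + n >= outsize) return (size_t) -1;
            memcpy(out + o, rep, n);
            o += n;
        } else {
            /* Any other byte, including a lone 0xC2, is copied as-is. */
            if (o + 1 >= outsize) return (size_t) -1;
            out[o++] = (char) c;
        }
    }
    out[o] = '\0';
    return o;
}
```

The two points that matter are comparing the bytes as unsigned char and
skipping the second byte once the pair has matched. The sketch would miss
a pair whose two bytes arrived in separate calls, and it does not address
whether the parser ever splits text that way.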

Thanks,

On Mon, Jan 23, 2012 at 6:42 PM, Igor Cicimov <ic...@gmail.com> wrote:
> Which version? If it is the newest one, have you loaded mod_xml2enc too?
>
> Did you look for an answer on the module web site?
> http://apache.webthing.com/mod_proxy_html/config.html
>
> Igor



Re: [users@httpd] mod_proxy_html Issue

Posted by Igor Cicimov <ic...@gmail.com>.
Which version? If it is the newest one, have you loaded mod_xml2enc too?

Did you look for an answer on the module web site?
http://apache.webthing.com/mod_proxy_html/config.html

Igor

On Tue, Jan 24, 2012 at 10:29 AM, Roman Gelfand <rg...@gmail.com> wrote:

> I am using this module to rewrite the contents of HTML documents.  It
> appears to strip &nbsp;, which causes me all kinds of grief with IE.
> Looking briefly at mod_proxy_html.c, I couldn't find any reference to
> &nbsp;.  Or is it a setting in the mod_proxy_html config file?
>
> Any suggestions are appreciated.
>
> Thanks in advance