Posted to users@cocoon.apache.org by Joerg Heinicke <jo...@gmx.de> on 2004/05/17 22:17:57 UTC
Re: the Â_problem.
On 17.05.2004 14:59, leon tian wrote:
> hi,
>
> thanks for your advice. but i wanna transform the web pages from the
> internet and then display them. there are already " " between
> tags in the web pages. how can i change them to " "? should i
> configurate JTidy or XHTML serializer, or any other way?
IMO that's not related to any entity but to encoding directly. I guess
you're parsing the files in the wrong encoding.
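A minimal sketch of how this kind of mismatch can produce a stray "Â", assuming the page is UTF-8 and gets decoded as ISO-8859-1 (the class and method names are made up for illustration, and the actual encodings involved here are not confirmed in this thread). A non-breaking space (U+00A0) is the two bytes C2 A0 in UTF-8; read as ISO-8859-1, those bytes become "Â" followed by a non-breaking space:

```java
import java.nio.charset.StandardCharsets;

class NbspMojibake {
    // Encode the text as UTF-8, then (wrongly) decode the bytes as ISO-8859-1.
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String nbsp = "\u00A0";           // non-breaking space
        String wrong = misdecode(nbsp);   // two characters: U+00C2 ('Â') and U+00A0
        System.out.println(wrong.length());                  // prints 2
        System.out.println(wrong.equals("\u00C2\u00A0"));    // prints true
    }
}
```

If that is what happens here, no entity replacement will help; the bytes have to be decoded with the encoding the page was actually written in.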
Joerg
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@cocoon.apache.org
For additional commands, e-mail: users-help@cocoon.apache.org
Re: the Â_problem.
Posted by Joerg Heinicke <jo...@gmx.de>.
On 18.05.2004 00:57, Upayavira wrote:
>> And configuring the parser instead of jtidy is not possible?
>
> Sorry? Tidy takes in an input stream and gives out a DOM which is passed
> to a DOMStreamer. What parser should be configured?
Ah, of course. I'm tired :) I thought of some jtidy output that must be
parsed. But with a DOM that's obviously not the case.
Joerg
Re: the Â_problem.
Posted by Upayavira <uv...@upaya.co.uk>.
Joerg Heinicke wrote:
> On 18.05.2004 00:25, Upayavira wrote:
>
>>>> I use html generator without configuration and xhtml serializer
>>>> encoding to UTF-8. Could you tell me where the problem may be?
>>>
>>>
>>> The remote web page has a specific encoding. I guess the HTML
>>> generator is ignoring it and parses the remote webpage probably
>>> using UTF-8. I don't know about the details or how to solve it.
>>> Maybe you can get jtidy to output XML in a specific encoding that
>>> the parser parsing the jtidy output expects.
>>
>>
>>
>> I've recently tried to change the encoding on JTidy. It doesn't seem
>> to work. I followed it right in with a debugger - the configured locale
>> was set right inside JTidy, but it still output ISO-8859-1. No UTF-8.
>>
>> I'm thinking of extending the HTML generator to use something like
>> NekoHTML. I'm using it right now for a work project, and I reckon
>> it'd be pretty easy to do (like 10 lines of code). So the generator
>> would be configurable as to which tool it uses.
>
>
> And configuring the parser instead of jtidy is not possible?
Sorry? Tidy takes in an input stream and gives out a DOM which is passed
to a DOMStreamer. What parser should be configured?
Upayavira
Re: the Â_problem.
Posted by leon tian <ti...@yahoo.co.uk>.
Hi, I want to configure JTidy. How can I set the URL of jtidy.properties using the resource: protocol in the sitemap?
Joerg Heinicke <jo...@gmx.de> wrote:
On 18.05.2004 00:25, Upayavira wrote:
>>> I use html generator without configuration and xhtml serializer
>>> encoding to UTF-8. Could you tell me where the problem may be?
>>
>> The remote web page has a specific encoding. I guess the HTML
>> generator is ignoring it and parses the remote webpage probably using
>> UTF-8. I don't know about the details or how to solve it. Maybe you
>> can get jtidy to output XML in a specific encoding that the parser
>> parsing the jtidy output expects.
>
>
> I've recently tried to change the encoding on JTidy. It doesn't seem to
> work. I followed it right in with a debugger - the configured locale was
> set right inside JTidy, but it still output ISO-8859-1. No UTF-8.
>
> I'm thinking of extending the HTML generator to use something like
> NekoHTML. I'm using it right now for a work project, and I reckon it'd
> be pretty easy to do (like 10 lines of code). So the generator would be
> configurable as to which tool it uses.
And configuring the parser instead of jtidy is not possible?
Joerg
Re: the Â_problem.
Posted by Joerg Heinicke <jo...@gmx.de>.
On 18.05.2004 00:25, Upayavira wrote:
>>> I use html generator without configuration and xhtml serializer
>>> encoding to UTF-8. Could you tell me where the problem may be?
>>
>> The remote web page has a specific encoding. I guess the HTML
>> generator is ignoring it and parses the remote webpage probably using
>> UTF-8. I don't know about the details or how to solve it. Maybe you
>> can get jtidy to output XML in a specific encoding that the parser
>> parsing the jtidy output expects.
>
>
> I've recently tried to change the encoding on JTidy. It doesn't seem to
> work. I followed it right in with a debugger - the configured locale was
> set right inside JTidy, but it still output ISO-8859-1. No UTF-8.
>
> I'm thinking of extending the HTML generator to use something like
> NekoHTML. I'm using it right now for a work project, and I reckon it'd
> be pretty easy to do (like 10 lines of code). So the generator would be
> configurable as to which tool it uses.
And configuring the parser instead of jtidy is not possible?
Joerg
Re: the Â_problem.
Posted by leon tian <ti...@yahoo.co.uk>.
Thanks. I'll try it;)
Upayavira <uv...@upaya.co.uk> wrote:
Joerg Heinicke wrote:
> On 18.05.2004 00:01, leon tian wrote:
>
>> I use html generator without configuration and xhtml serializer
>> encoding to UTF-8. Could you tell me where the problem may be?
>
>
> The remote web page has a specific encoding. I guess the HTML
> generator is ignoring it and parses the remote webpage probably using
> UTF-8. I don't know about the details or how to solve it. Maybe you
> can get jtidy to output XML in a specific encoding that the parser
> parsing the jtidy output expects.
I've recently tried to change the encoding on JTidy. It doesn't seem to
work. I followed it right in with a debugger - the configured locale was
set right inside JTidy, but it still output ISO-8859-1. No UTF-8.
I'm thinking of extending the HTML generator to use something like
NekoHTML. I'm using it right now for a work project, and I reckon it'd
be pretty easy to do (like 10 lines of code). So the generator would be
configurable as to which tool it uses.
Upayavira
Re: the Â_problem.
Posted by Upayavira <uv...@upaya.co.uk>.
Joerg Heinicke wrote:
> On 18.05.2004 00:01, leon tian wrote:
>
>> I use html generator without configuration and xhtml serializer
>> encoding to UTF-8. Could you tell me where the problem may be?
>
>
> The remote web page has a specific encoding. I guess the HTML
> generator is ignoring it and parses the remote webpage probably using
> UTF-8. I don't know about the details or how to solve it. Maybe you
> can get jtidy to output XML in a specific encoding that the parser
> parsing the jtidy output expects.
I've recently tried to change the encoding on JTidy. It doesn't seem to
work. I followed it right in with a debugger - the configured locale was
set right inside JTidy, but it still output ISO-8859-1. No UTF-8.
I'm thinking of extending the HTML generator to use something like
NekoHTML. I'm using it right now for a work project, and I reckon it'd
be pretty easy to do (like 10 lines of code). So the generator would be
configurable as to which tool it uses.
Upayavira
Re: the Â_problem.
Posted by Joerg Heinicke <jo...@gmx.de>.
On 18.05.2004 00:01, leon tian wrote:
> I use html generator without configuration and xhtml serializer
> encoding to UTF-8. Could you tell me where the problem may be?
The remote web page has a specific encoding. I guess the HTML generator
is ignoring it and parses the remote webpage probably using UTF-8. I
don't know about the details or how to solve it. Maybe you can get jtidy
to output XML in a specific encoding that the parser parsing the jtidy
output expects.
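As a rough illustration of "respecting the page's encoding", here is a sketch that decodes a response body with the charset named in its Content-Type header instead of a fixed one. This is not how Cocoon's HTML generator or JTidy work internally; the class, the helper names and the ISO-8859-1 fallback are all assumptions made for the example:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

class DeclaredCharsetReader {
    // Hypothetical helper: extract "charset=..." from a Content-Type value.
    // Falls back to ISO-8859-1 (an assumption) when nothing usable is declared.
    static Charset charsetFromContentType(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                    String name = p.substring("charset=".length()).trim().replace("\"", "");
                    try {
                        return Charset.forName(name);
                    } catch (IllegalArgumentException e) {
                        break; // illegal or unsupported charset name: use the fallback
                    }
                }
            }
        }
        return StandardCharsets.ISO_8859_1;
    }

    // Decode the whole stream with the given charset.
    static String decode(InputStream in, Charset cs) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader r = new InputStreamReader(in, cs)) {
            char[] buf = new char[4096];
            int n;
            while ((n = r.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Simulated remote page: UTF-8 bytes containing a non-breaking space.
        byte[] body = "a\u00A0b".getBytes(StandardCharsets.UTF_8);
        Charset cs = charsetFromContentType("text/html; charset=UTF-8");
        String text = decode(new ByteArrayInputStream(body), cs);
        System.out.println(text.equals("a\u00A0b")); // prints true
    }
}
```

A real page may also declare its encoding in a meta tag rather than in the HTTP header, which this sketch does not handle.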
Joerg
Re: the Â_problem.
Posted by leon tian <ti...@yahoo.co.uk>.
I use the HTML generator without configuration and the XHTML serializer with its encoding set to UTF-8. Could you tell me where the problem may be? Thanks.
Joerg Heinicke <jo...@gmx.de> wrote:
On 17.05.2004 14:59, leon tian wrote:
> hi,
>
> thanks for your advice. but i wanna transform the web pages from the
> internet and then display them. there are already " " between
> tags in the web pages. how can i change them to " "? should i
> configurate JTidy or XHTML serializer, or any other way?
IMO that's not related to any entity but to encoding directly. I guess
you're parsing the files in the wrong encoding.
Joerg