Posted to users@cocoon.apache.org by Joerg Heinicke <jo...@gmx.de> on 2004/05/17 22:17:57 UTC

Re: the Â_problem.

On 17.05.2004 14:59, leon tian wrote:

> hi,
> 
> thanks for your advice. but i wanna transform the web pages from the
> internet and then display them. there are already "&nbsp;" between
> tags in the web pages. how can i change them to " "? should i
> configure JTidy or the XHTML serializer, or is there any other way?


IMO that's not related to any entity but to encoding directly. I guess 
you're parsing the files in the wrong encoding.
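
To illustrate the kind of mismatch meant here: a non-breaking space (U+00A0, the character &nbsp; stands for) is encoded in UTF-8 as the two bytes 0xC2 0xA0. If those bytes are then decoded as ISO-8859-1, 0xC2 turns into "Â" and 0xA0 into a non-breaking space, so a stray "Â" appears in front of every such space. The following is only a minimal sketch in plain Java; the class and method names are made up for illustration, and it does not involve Cocoon or JTidy.

```java
import java.nio.charset.StandardCharsets;

public class NbspMojibake {
    /** Encodes the text as UTF-8, then (wrongly) decodes the bytes as ISO-8859-1. */
    static String decodeUtf8AsLatin1(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    /** Encodes and decodes with the same charset, so the text survives unchanged. */
    static String decodeUtf8AsUtf8(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "a", a non-breaking space (U+00A0), "b"
        String original = "a\u00A0b";

        String wrong = decodeUtf8AsLatin1(original);
        String right = decodeUtf8AsUtf8(original);

        // The mismatched decode yields one extra character, the "Â" (U+00C2).
        System.out.println(wrong.length());             // 4
        System.out.println(wrong.charAt(1) == '\u00C2'); // true
        System.out.println(right.equals(original));      // true
    }
}
```

So the entity itself is not the culprit; the "Â" comes from the bytes being read or displayed with a different encoding than the one they were written in.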

Joerg

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@cocoon.apache.org
For additional commands, e-mail: users-help@cocoon.apache.org


Re: the Â_problem.

Posted by Joerg Heinicke <jo...@gmx.de>.
On 18.05.2004 00:57, Upayavira wrote:

>> And configuring the parser instead of jtidy is not possible?
> 
> Sorry? Tidy takes in an input stream and gives out a DOM which is passed 
> to a DOMStreamer. What parser should be configured?

Ah, of course. I'm tired :) I thought of some jtidy output that must be 
parsed. But with a DOM that's obviously not the case.

Joerg



Re: the Â_problem.

Posted by Upayavira <uv...@upaya.co.uk>.
Joerg Heinicke wrote:

> On 18.05.2004 00:25, Upayavira wrote:
>
>>>> I use html generator without configuration and xhtml serializer
>>>> encoding to UTF-8. Could you tell me where the problem may be?
>>>
>>>
>>> The remote web page has a specific encoding. I guess the HTML 
>>> generator is ignoring it and parses the remote webpage probably 
>>> using UTF-8. I don't know about the details or how to solve it. 
>>> Maybe you can get jtidy to output XML in a specific encoding that 
>>> the parser parsing the jtidy output expects.
>>
>>
>>
>> I've recently tried to change the encoding on JTidy. It doesn't seem 
>> to work. I stepped right into it in a debugger - the configured locale 
>> was set right inside JTidy, but it still outputted ISO-8859-1. No UTF-8.
>>
>> I'm thinking of extending the HTML generator to use something like 
>> NekoHTML (I'm using it right now for a work project), and I reckon 
>> it'd be pretty easy to do (like 10 lines of code). So the generator 
>> would be configurable as to which tool it uses.
>
>
> And configuring the parser instead of jtidy is not possible?

Sorry? Tidy takes in an input stream and gives out a DOM which is passed 
to a DOMStreamer. What parser should be configured?

Upayavira





Re: the Â_problem.

Posted by leon tian <ti...@yahoo.co.uk>.
Hi, I want to configure JTidy. How can I set the URL of jtidy.properties using the resource: protocol in the sitemap?

Joerg Heinicke <jo...@gmx.de> wrote:
On 18.05.2004 00:25, Upayavira wrote:

>>> I use html generator without configuration and xhtml serializer
>>> encoding to UTF-8. Could you tell me where the problem may be?
>>
>> The remote web page has a specific encoding. I guess the HTML 
>> generator is ignoring it and parses the remote webpage probably using 
>> UTF-8. I don't know about the details or how to solve it. Maybe you 
>> can get jtidy to output XML in a specific encoding that the parser 
>> parsing the jtidy output expects.
> 
> 
> I've recently tried to change the encoding on JTidy. It doesn't seem to 
> work. I stepped right into it in a debugger - the configured locale was 
> set right inside JTidy, but it still outputted ISO-8859-1. No UTF-8.
> 
> I'm thinking of extending the HTML generator to use something like 
> NekoHTML (I'm using it right now for a work project), and I reckon it'd 
> be pretty easy to do (like 10 lines of code). So the generator would be 
> configurable as to which tool it uses.

And configuring the parser instead of jtidy is not possible?

Joerg


Re: the Â_problem.

Posted by Joerg Heinicke <jo...@gmx.de>.
On 18.05.2004 00:25, Upayavira wrote:

>>> I use html generator without configuration and xhtml serializer
>>> encoding to UTF-8. Could you tell me where the problem may be?
>>
>> The remote web page has a specific encoding. I guess the HTML 
>> generator is ignoring it and parses the remote webpage probably using 
>> UTF-8. I don't know about the details or how to solve it. Maybe you 
>> can get jtidy to output XML in a specific encoding that the parser 
>> parsing the jtidy output expects.
> 
> 
> I've recently tried to change the encoding on JTidy. It doesn't seem to 
> work. I stepped right into it in a debugger - the configured locale was 
> set right inside JTidy, but it still outputted ISO-8859-1. No UTF-8.
> 
> I'm thinking of extending the HTML generator to use something like 
> NekoHTML (I'm using it right now for a work project), and I reckon it'd 
> be pretty easy to do (like 10 lines of code). So the generator would be 
> configurable as to which tool it uses.

And configuring the parser instead of jtidy is not possible?

Joerg



Re: the Â_problem.

Posted by leon tian <ti...@yahoo.co.uk>.
Thanks. I'll try it;)


Upayavira <uv...@upaya.co.uk> wrote:
Joerg Heinicke wrote:

> On 18.05.2004 00:01, leon tian wrote:
>
>> I use html generator without configuration and xhtml serializer
>> encoding to UTF-8. Could you tell me where the problem may be?
>
>
> The remote web page has a specific encoding. I guess the HTML 
> generator is ignoring it and parses the remote webpage probably using 
> UTF-8. I don't know about the details or how to solve it. Maybe you 
> can get jtidy to output XML in a specific encoding that the parser 
> parsing the jtidy output expects.

I've recently tried to change the encoding on JTidy. It doesn't seem to 
work. I stepped right into it in a debugger - the configured locale was 
set right inside JTidy, but it still outputted ISO-8859-1. No UTF-8.

I'm thinking of extending the HTML generator to use something like 
NekoHTML (I'm using it right now for a work project), and I reckon it'd 
be pretty easy to do (like 10 lines of code). So the generator would be 
configurable as to which tool it uses.

Upayavira




Re: the Â_problem.

Posted by Upayavira <uv...@upaya.co.uk>.
Joerg Heinicke wrote:

> On 18.05.2004 00:01, leon tian wrote:
>
>> I use html generator without configuration and xhtml serializer
>> encoding to UTF-8. Could you tell me where the problem may be?
>
>
> The remote web page has a specific encoding. I guess the HTML 
> generator is ignoring it and parses the remote webpage probably using 
> UTF-8. I don't know about the details or how to solve it. Maybe you 
> can get jtidy to output XML in a specific encoding that the parser 
> parsing the jtidy output expects.

I've recently tried to change the encoding on JTidy. It doesn't seem to 
work. I stepped right into it in a debugger - the configured locale was 
set right inside JTidy, but it still outputted ISO-8859-1. No UTF-8.

I'm thinking of extending the HTML generator to use something like 
NekoHTML (I'm using it right now for a work project), and I reckon it'd 
be pretty easy to do (like 10 lines of code). So the generator would be 
configurable as to which tool it uses.

Upayavira





Re: the Â_problem.

Posted by Joerg Heinicke <jo...@gmx.de>.
On 18.05.2004 00:01, leon tian wrote:

> I use html generator without configuration and xhtml serializer
> encoding to UTF-8. Could you tell me where the problem may be?

The remote web page has a specific encoding. I guess the HTML generator 
is ignoring it and parses the remote webpage probably using UTF-8. I 
don't know about the details or how to solve it. Maybe you can get jtidy 
to output XML in a specific encoding that the parser parsing the jtidy 
output expects.
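
As a rough sketch of what honouring the page's declared encoding could look like: the helper below pulls the charset parameter out of an HTTP Content-Type header value and falls back to a default when none is declared or the name is unknown. This is plain Java and not the Cocoon HTML generator's actual code; the class and method names and the ISO-8859-1 fallback are assumptions for illustration only.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class DeclaredCharset {
    /** Returns the charset named in a Content-Type value, or the fallback. */
    static Charset fromContentType(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Illegal or unsupported charset name: use the fallback.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    /** Decodes the body with the declared charset (assumed fallback: ISO-8859-1). */
    static String decode(byte[] body, String contentType) {
        Charset cs = fromContentType(contentType, StandardCharsets.ISO_8859_1);
        return new String(body, cs);
    }

    public static void main(String[] args) {
        // "a", a non-breaking space (U+00A0), "b", as a UTF-8 page would send it
        byte[] body = "a\u00A0b".getBytes(StandardCharsets.UTF_8);

        // Declared charset respected: 3 characters, no stray "Â".
        System.out.println(decode(body, "text/html; charset=UTF-8").length()); // 3
        // Declaration missing, fallback used: 4 characters, including the "Â".
        System.out.println(decode(body, "text/html").length());                // 4
    }
}
```

Whatever reads the remote page would need to pass the bytes through a decoder chosen this way (or from a meta tag in the page) before parsing.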

Joerg



Re: the Â_problem.

Posted by leon tian <ti...@yahoo.co.uk>.
I use the HTML generator without configuration and the XHTML serializer with its encoding set to UTF-8. Could you tell me where the problem may be? Thanks.
 

Joerg Heinicke <jo...@gmx.de> wrote:
On 17.05.2004 14:59, leon tian wrote:

> hi,
> 
> thanks for your advice. but i wanna transform the web pages from the
> internet and then display them. there are already "&nbsp;" between
> tags in the web pages. how can i change them to " "? should i
> configure JTidy or the XHTML serializer, or is there any other way?


IMO that's not related to any entity but to encoding directly. I guess 
you're parsing the files in the wrong encoding.

Joerg
