Posted to user@poi.apache.org by Rajeev Mohanraj <ra...@gmail.com> on 2011/08/01 09:48:16 UTC

Re: How to extract special character and symbols from the word document

Sergey Vladimirov <vlsergey <at> gmail.com> writes:

> 
> If it's a non-secret document, you can upload it to a file-hosting
> server (like www.rapidshare.com) or send it to me privately.
> 


I am also facing a similar problem. POI doesn't read the special
characters and symbols from a Word document. For example, my Word document
contains µĪĦĜăĂ; when I read it with POI, I get ?????? instead.
How can I get the exact special characters? Please help me out.


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: How to extract special character and symbols from the word document

Posted by Baby Periasamy <ba...@gmail.com>.
Hi Nick,

I could not find any related information. Please help me out.

--
View this message in context: http://apache-poi.1045710.n5.nabble.com/How-to-extract-special-character-and-symbols-from-the-word-document-tp4638645p4658562.html
Sent from the POI - User mailing list archive at Nabble.com.



Re: How to extract special character and symbols from the word document

Posted by Rajeev Mohanraj <ra...@gmail.com>.
Nick Burch <nick.burch <at> alfresco.com> writes:

> 
> On Mon, 1 Aug 2011, Rajeev Mohanraj wrote:
> > I am also facing a similar problem. POI doesn't read the special
> > characters and symbols from a Word document. For example, my Word document
> > contains µĪĦĜăĂ; when I read it with POI, I get ?????? instead.
> > How can I get the exact special characters? Please help me out.
>
> Looks like you've set an incorrect encoding on your output. This comes up
> a lot, mostly with people trying to use Excel. See the archives for
> details on how to resolve it for your given platform.
> 
> Nick


Hi Nick,

I have already set the encoding to UTF-8. I am using POI's Word-to-HTML
conversion, and I set the output encoding to UTF-8, but the special
characters µĪĦĜăĂ still come out as ??????. Do I also need to read the
content as UTF-8?
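
One way to approach the question above is to check whether the characters are already lost in memory or only when they are written out. The following is a minimal sketch using only the JDK, not POI code: the class CodePointDump and its describe helper are hypothetical names, and the hard-coded string stands in for whatever text the extraction step returns. If the dump shows U+003F for every character, the text was damaged before output; if it shows the real code points, the output encoding is the more likely culprit.

```java
import java.util.stream.Collectors;

class CodePointDump {
    // Render each Unicode code point of the text as U+XXXX, separated by spaces.
    static String describe(String text) {
        return text.codePoints()
                   .mapToObj(cp -> String.format("U+%04X", cp))
                   .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        // Assumption: stand-in for the text extracted from the Word document.
        // Unicode escapes keep the source file itself free of encoding problems.
        String extracted = "\u00B5\u012A\u0126\u011C\u0103\u0102";
        System.out.println(describe(extracted));
    }
}
```

Because the dump is plain ASCII, it is not affected by the console or log encoding, which makes it a safer diagnostic than printing the characters themselves.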




Re: How to extract special character and symbols from the word document

Posted by Sergey Vladimirov <vl...@gmail.com>.
Rajeev,

There can be a problem with bidirectional documents. Such things can be
complicated and errors can occur (first of all because I usually don't
work with bidirectional documents). To get the issue fixed, please open a
new bug in Bugzilla and attach the document and the resulting HTML,
generated using the latest build.

Also, when starting a new topic, please create a new message (do not use
reply-to-all).

Best regards,
Sergey.

On Mon, Aug 8, 2011 at 10:36 AM, Rajeev Mohanraj <ra...@gmail.com> wrote:
> Hi,
> I have run into trouble again, this time with alignment. I convert a Word
> document to HTML using POI's Word-to-HTML converter, but the alignment
> does not come out properly: right-aligned content is displayed on the
> left, and the alignment formatting is missing. Please help me out.



-- 
Sergey Vladimirov



Re: How to extract special character and symbols from the word document

Posted by Rajeev Mohanraj <ra...@gmail.com>.
Nick Burch <nick.burch <at> alfresco.com> writes:

> 
> On Mon, 1 Aug 2011, Rajeev Mohanraj wrote:
> > I am also facing a similar problem. POI doesn't read the special
> > characters and symbols from a Word document. For example, my Word document
> > contains µĪĦĜăĂ; when I read it with POI, I get ?????? instead.
> > How can I get the exact special characters? Please help me out.
>
> Looks like you've set an incorrect encoding on your output. This comes up
> a lot, mostly with people trying to use Excel. See the archives for
> details on how to resolve it for your given platform.
> 
> Nick


Hi,
I have run into trouble again, this time with alignment. I convert a Word
document to HTML using POI's Word-to-HTML converter, but the alignment
does not come out properly: right-aligned content is displayed on the
left, and the alignment formatting is missing. Please help me out.




Re: How to extract special character and symbols from the word document

Posted by Nick Burch <ni...@alfresco.com>.
On Mon, 1 Aug 2011, Rajeev Mohanraj wrote:
> I am also facing a similar problem. POI doesn't read the special
> characters and symbols from a Word document. For example, my Word document
> contains µĪĦĜăĂ; when I read it with POI, I get ?????? instead.
> How can I get the exact special characters? Please help me out.

Looks like you've set an incorrect encoding on your output. This comes up
a lot, mostly with people trying to use Excel. See the archives for
details on how to resolve it for your given platform.

Nick
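
To illustrate the point about output encoding, here is a minimal sketch using only the JDK (it makes no POI calls). The class EncodingDemo and the method roundTrip are hypothetical names introduced for illustration. It shows that encoding the sample text with a charset that cannot represent it replaces each character with ?, while UTF-8 preserves it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingDemo {
    // Encode the text to bytes with the given charset, then decode it again
    // with the same charset. String.getBytes(Charset) substitutes the
    // charset's replacement byte ('?' for US-ASCII) for unmappable characters.
    static String roundTrip(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // The sample characters from the thread, written as Unicode escapes.
        String sample = "\u00B5\u012A\u0126\u011C\u0103\u0102";

        // A charset that cannot represent the sample turns it into question marks.
        System.out.println(roundTrip(sample, StandardCharsets.US_ASCII).equals("??????"));

        // UTF-8 can represent every character, so the text survives unchanged.
        System.out.println(roundTrip(sample, StandardCharsets.UTF_8).equals(sample));
    }
}
```

The same idea applies when writing a file: pass the charset explicitly, for example new OutputStreamWriter(new FileOutputStream(file), StandardCharsets.UTF_8), rather than relying on the platform default. Whatever then displays the file (browser, editor, console) must also interpret it as UTF-8.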