Posted to user@poi.apache.org by Anish Sneh <an...@techblue.co.uk> on 2009/12/31 09:36:55 UTC

NON-LATIN fonts using POI

Hi all

I want to read and manipulate text in non-Latin fonts using the POI API.
Specifically, I want to implement text-based search for non-Latin languages
(Hindi and Punjabi) in my Java application.


Please suggest.

-- 
Anish Sneh
Software Engineer
Techblue Softwares Pvt. Ltd.
www.technologyblueprint.co.uk


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: NON-LATIN fonts using POI

Posted by MSB <ma...@tiscali.co.uk>.
Is that not just a case of setting the language the package uses? I would
have thought that if you change the dictionary, the application will
offer you the option to use words typed in non-Latin fonts, but I do
not know this for certain.

With regard to HWPF, the part of the POI API that you will need to use,
there is a distinct lack of examples for you to look at. Sadly, I do not
have the time here to create the code, but you will need to do something
like the following:

1. Create an instance of the HWPFDocument class by opening an InputStream
   connected to your Word file and passing it to the HWPFDocument
   constructor.
2. From the HWPFDocument, recover a reference to the overall Range object
   for the Word document by calling the getRange() method.
3. From the Range object, recover the number of paragraphs in the document
   by calling the numParagraphs() method.
4. Use a loop to iterate over the paragraphs in the document, calling the
   getParagraph(int) method to get at an individual Paragraph object.
5. Once you have a Paragraph object in hand, recover its text by calling
   the text() method on the Paragraph object.
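
The steps above can be sketched roughly as follows. This is a minimal,
untested-against-your-files sketch, not a definitive implementation: it
assumes the POI and POI scratchpad jars (which contain HWPF) are on the
classpath, and the class name WordParagraphSearch and the two command-line
arguments (file path and search term) are made up for illustration.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.usermodel.Paragraph;
import org.apache.poi.hwpf.usermodel.Range;

// Hypothetical example class; only the HWPF calls come from the steps above.
public class WordParagraphSearch {
    public static void main(String[] args) throws IOException {
        String path = args[0]; // path to a binary .doc file (assumption)
        String term = args[1]; // text to look for (assumption)

        try (InputStream in = new FileInputStream(path)) {
            // Step 1: build the document from an InputStream.
            HWPFDocument doc = new HWPFDocument(in);
            // Step 2: get the Range covering the whole document.
            Range range = doc.getRange();
            // Step 3: find out how many paragraphs there are.
            int count = range.numParagraphs();
            // Steps 4 and 5: visit each paragraph and read its text.
            for (int i = 0; i < count; i++) {
                Paragraph paragraph = range.getParagraph(i);
                String text = paragraph.text();
                if (text.contains(term)) {
                    System.out.println("Paragraph " + i + ": " + text.trim());
                }
            }
        }
    }
}
```

If the search term is passed on the command line, whether Hindi or Punjabi
input arrives intact depends on the console encoding, so reading the term
from a UTF-8 file or a GUI text field may be more reliable.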

The text() method returns a String that contains the paragraph's text, and
it should then be possible to use simple Java code to search for occurrences
of your search text. I think this will work equally well for non-Latin fonts,
and it should be easy enough to try quickly.
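
As a rough illustration of the "simple Java code" part, here is a sketch
that uses only the standard library. Java Strings are Unicode, so searching
Devanagari or Gurmukhi text works the same way as searching Latin text. The
class NonLatinSearch and its methods are hypothetical names, and the NFC
normalization step is my own assumption (it helps when the same character
can be encoded in more than one way), not something POI requires.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; works on any String, e.g. one returned by Paragraph.text().
public class NonLatinSearch {

    // Normalize to NFC so that canonically equivalent sequences compare equal.
    static String normalize(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    // Returns the start offsets (UTF-16 code units, in the normalized text)
    // of every non-overlapping occurrence of term.
    static List<Integer> findAll(String text, String term) {
        List<Integer> hits = new ArrayList<>();
        String haystack = normalize(text);
        String needle = normalize(term);
        if (needle.isEmpty()) {
            return hits; // nothing sensible to search for
        }
        int from = 0;
        while ((from = haystack.indexOf(needle, from)) >= 0) {
            hits.add(from);
            from += needle.length();
        }
        return hits;
    }

    public static void main(String[] args) {
        // "namaste duniya, namaste" written in Devanagari, using Unicode
        // escapes so the source file encoding does not matter.
        String paragraph = "\u0928\u092E\u0938\u094D\u0924\u0947 "
                + "\u0926\u0941\u0928\u093F\u092F\u093E, "
                + "\u0928\u092E\u0938\u094D\u0924\u0947";
        String term = "\u0928\u092E\u0938\u094D\u0924\u0947"; // "namaste"
        System.out.println(findAll(paragraph, term)); // prints [0, 15]
    }
}
```

The offsets are positions in the normalized string, which is enough to tell
whether and roughly where a paragraph matches. The font used in the Word
document does not matter here, only the characters stored in the text.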

Yours

Mark B



anishsneh wrote:
> 
>>
>> What will you be searching, Word documents, Excel spreadsheets or
>> Powerpoint
>> presentations? What sort of manipulations do you envisage?
> I'm using Non-Latin fonts. I'm searching in MS Word Document. Just need 
> to search particular words. The problem with Word 2003 search window is 
> when I type in the search box it gets typed in Latin font irrespective 
> of the document fonts.





Re: NON-LATIN fonts using POI

Posted by Anish Sneh <an...@techblue.co.uk>.
>
> What will you be searching, Word documents, Excel spreadsheets or Powerpoint
> presentations? What sort of manipulations do you envisage?
I'm using non-Latin fonts and searching in an MS Word document. I just need
to search for particular words. The problem with the Word 2003 search window
is that whatever I type in the search box appears in a Latin font,
irrespective of the fonts used in the document.





-- 
Anish Sneh
Software Engineer
Techblue Softwares Pvt. Ltd.
www.technologyblueprint.co.uk


Re: NON-LATIN fonts using POI

Posted by MSB <ma...@tiscali.co.uk>.
What will you be searching: Word documents, Excel spreadsheets or PowerPoint
presentations? What sort of manipulations do you envisage?

Yours

Mark B







Re: NON-LATIN fonts using POI

Posted by Ognjen Blagojevic <og...@etf.bg.ac.rs>.
Anish Sneh wrote:
> I want to read/manipulate NON-LATIN fonts using POI api. Actually I want 
> to implement text based search for NON-LATIN language (Hindi/Punjabi 
> languages) in my JAVA application.
> 
> Please suggest.

You can use POI to read (and write) non-Latin characters, usually 
without any special settings. If you have any specific problem with POI, 
please elaborate.

Regards,
Ognjen
