Posted to java-user@lucene.apache.org by Chandan Tamrakar <ch...@ccnep.com.np> on 2004/03/22 12:46:27 UTC

Indexing japanese PDF documents

I am using the latest PDFBox library for parsing. I can parse English
documents successfully, but when I parse a document containing both English
and Japanese text, I do not get the results I expected.

Has anyone tried using the PDFBox library to parse Japanese documents? Or
do I need to use another parser such as Xpdf or JPedal?

Thanks in advance
Chandan



---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org

Re: Indexing japanese PDF documents

Posted by Ben Litchfield <be...@csh.rit.edu>.

Yes, he did, but I was away for the past couple of days.  As this is more of
a PDFBox issue, I responded in the PDFBox forums; please follow the thread
there if you are interested.

Ben



On Mon, 22 Mar 2004, Otis Gospodnetic wrote:

> I have not tried these other tools yet.
> Have you asked Ben Litchfield, the PDFBox author, about handling of
> Japanese text?
>
> Otis


Re: Indexing japanese PDF documents

Posted by Otis Gospodnetic <ot...@yahoo.com>.

I have not tried these other tools yet.
Have you asked Ben Litchfield, the PDFBox author, about handling of
Japanese text?

Otis

--- Chandan Tamrakar <ch...@ccnep.com.np> wrote:
> I am using the latest PDFBox library for parsing. I can parse English
> documents successfully, but when I parse a document containing both
> English and Japanese text, I do not get the results I expected.
>
> Has anyone tried using the PDFBox library to parse Japanese documents?
> Or do I need to use another parser such as Xpdf or JPedal?
>
> Thanks in advance
> Chandan

