Posted to java-user@lucene.apache.org by Chandan Tamrakar <ch...@ccnep.com.np> on 2004/03/22 12:46:27 UTC
Indexing Japanese PDF documents
I am using the latest PDFBox library for parsing. I can parse English
documents successfully, but when I parse a document containing both English
and Japanese, I do not get the output I expected.
Has anyone tried using the PDFBox library to parse Japanese documents? Or
do I need to use another parser such as Xpdf or JPedal?
Thanks in advance,
Chandan
---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org
Re: Indexing Japanese PDF documents
Posted by Ben Litchfield <be...@csh.rit.edu>.
Yes, he did, but I was away for the past couple of days. As this is more of
a PDFBox issue, I responded in the PDFBox forums; please follow the thread
there if you are interested.
Ben
On Mon, 22 Mar 2004, Otis Gospodnetic wrote:
> I have not tried these other tools yet.
> Have you asked Ben Litchfield, the PDFBox author, about handling of
> Japanese text?
>
> Otis
Re: Indexing Japanese PDF documents
Posted by Otis Gospodnetic <ot...@yahoo.com>.
I have not tried these other tools yet.
Have you asked Ben Litchfield, the PDFBox author, about handling of
Japanese text?
Otis