Posted to users@pdfbox.apache.org by Najib Sahyoun <na...@concordia.ca> on 2016/03/11 04:50:51 UTC

Look for text in pdf file and then extract it

Hello,


I am Najib Sahyoun, PhD student in accounting.


I am looking for an application that searches for a specific term (e.g. "board of directors") and then extracts all the sentences that contain it.


Does your application perform this?


Thanks and best regards,

Najib




Re: Look for text in pdf file and then extract it

Posted by Muhammad Ismail <it...@gmail.com>.
Try extracting the text from the PDF, building a search index from that
text, and then running your searches against the index.
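
The "search index" idea can be sketched roughly as below. This is only a
minimal illustration in plain Java, not something PDFBox provides: the class
name TinyIndex and its methods are made up for this example, and the input is
assumed to be a list of text units (sentences or pages) that have already been
extracted from the PDF. A real project would more likely use a dedicated
search library.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of PDFBox: a tiny in-memory inverted index.
class TinyIndex {
    private final List<String> units;                               // sentences or pages
    private final Map<String, Set<Integer>> postings = new HashMap<>(); // word -> unit ids

    TinyIndex(List<String> units) {
        this.units = units;
        for (int i = 0; i < units.size(); i++) {
            for (String word : tokenize(units.get(i))) {
                postings.computeIfAbsent(word, k -> new TreeSet<>()).add(i);
            }
        }
    }

    // Lower-cases the text and splits it on anything that is not a letter or digit.
    private static List<String> tokenize(String s) {
        List<String> words = new ArrayList<>();
        for (String w : s.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!w.isEmpty()) {
                words.add(w);
            }
        }
        return words;
    }

    // Returns the units that contain the phrase as consecutive words, ignoring case.
    List<String> search(String phrase) {
        List<String> words = tokenize(phrase);
        List<String> hits = new ArrayList<>();
        if (words.isEmpty()) {
            return hits;
        }
        // Step 1: use the index to keep only units that contain every word.
        Set<Integer> candidates = null;
        for (String w : words) {
            Set<Integer> ids = postings.getOrDefault(w, Collections.emptySet());
            if (candidates == null) {
                candidates = new TreeSet<>(ids);
            } else {
                candidates.retainAll(ids);
            }
        }
        // Step 2: confirm the words appear next to each other and in order.
        String needle = " " + String.join(" ", words) + " ";
        for (int id : candidates) {
            String haystack = " " + String.join(" ", tokenize(units.get(id))) + " ";
            if (haystack.contains(needle)) {
                hits.add(units.get(id));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> units = new ArrayList<>();
        units.add("The board of directors met.");
        units.add("Directors of the board left.");
        units.add("Revenue grew.");
        for (String hit : new TinyIndex(units).search("Board of Directors")) {
            System.out.println(hit);
        }
    }
}
```

The index narrows the search to units that contain all the words; the second
step is needed because a unit can contain "board", "of" and "directors"
without containing the phrase itself.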

On Fri, Mar 11, 2016 at 10:37 AM, Tilman Hausherr <TH...@t-online.de>
wrote:

> On 11.03.2016 at 04:50, Najib Sahyoun wrote:
>
>> Hello,
>>
>>
>> I am Najib Sahyoun, PhD student in accounting.
>>
>>
>> I am looking for an application that searches for a specific term (e.g.
>> "board of directors") and then extracts all the sentences that contain
>> it.
>>
>>
>> Does your application perform this?
>>
>
> No. You'll have to develop this on top of the text extraction or hire
> someone to do it. PDFBox just extracts the text.
>
> Tilman
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
> For additional commands, e-mail: users-help@pdfbox.apache.org
>
>


-- 
Thanks
Muhammad Ismail
cell (PAK) : +92.322.5100362
cell (Sweden): +46 700-321-521
e-mail: it.is.ismail@gmail.com

Re: Look for text in pdf file and then extract it

Posted by Tilman Hausherr <TH...@t-online.de>.
On 11.03.2016 at 04:50, Najib Sahyoun wrote:
> Hello,
>
>
> I am Najib Sahyoun, PhD student in accounting.
>
>
> I am looking for an application that searches for a specific term (e.g. "board of directors") and then extracts all the sentences that contain it.
>
>
> Does your application perform this?

No. You'll have to develop this on top of the text extraction or hire
someone to do it. PDFBox just extracts the text.

Tilman
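
For readers wondering what "on top of the text extraction" might look like,
here is a rough sketch and not an official PDFBox feature. It assumes the text
has already been pulled out of the PDF (with PDFBox 2.x that is typically
PDDocument.load(...) followed by new PDFTextStripper().getText(document)),
and it uses the JDK's java.text.BreakIterator to split that text into
sentences. The class name SentenceFinder and the method sentencesContaining
are invented for this example.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of PDFBox: filters sentences by a search term.
class SentenceFinder {

    // Returns every sentence of the text that contains the term, ignoring case.
    static List<String> sentencesContaining(String text, String term) {
        // Extracted PDF text is usually hard-wrapped, so collapse whitespace runs first.
        String normalized = text.replaceAll("\\s+", " ").trim();
        String needle = term.toLowerCase(Locale.ROOT);

        BreakIterator boundaries = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        boundaries.setText(normalized);

        List<String> hits = new ArrayList<>();
        int start = boundaries.first();
        for (int end = boundaries.next(); end != BreakIterator.DONE;
                start = end, end = boundaries.next()) {
            String sentence = normalized.substring(start, end).trim();
            if (sentence.toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(sentence);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Stand-in for the string that the PDF text extraction would return.
        String text = "The board of\ndirectors met in March. Revenue grew. "
                + "Two members joined the Board of Directors.";
        for (String sentence : sentencesContaining(text, "board of directors")) {
            System.out.println(sentence);
        }
    }
}
```

Sentence splitting with BreakIterator is heuristic: abbreviations, headers,
tables and multi-column layouts in real reports can produce wrong boundaries,
so the results would need checking against the source documents.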


