Posted to dev@pdfbox.apache.org by "Andreas Lehmkühler (JIRA)" <ji...@apache.org> on 2010/03/07 15:17:27 UTC

[jira] Created: (PDFBOX-650) Remove dependency to lucene

Remove dependency to lucene
---------------------------

                 Key: PDFBOX-650
                 URL: https://issues.apache.org/jira/browse/PDFBOX-650
             Project: PDFBox
          Issue Type: Improvement
          Components: Lucene, Utilities
    Affects Versions: 1.0.0
            Reporter: Andreas Lehmkühler
            Assignee: Andreas Lehmkühler


The current pdfbox version extracts all needed data from a pdf document and uses lucene to create an index for the lucene search engine. 

To avoid the dependency to lucene, pdfbox should only extract the data, which can then be used to create a lucene index outside of pdfbox. That would decrease the number of external jars and would eliminate another potential issue caused by changing APIs, like those coming with lucene 3.0.

I've created 2 new classes (one for the extraction and one as an example of how to use that feature) based on existing code and attached them as a patch.

WDYT?

If that patch is added to the trunk, the existing code will be removed, including both lucene jars.
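
To make the idea concrete, here is a rough sketch of the split described above: extraction produces plain name/value pairs, and the caller's own code decides how to index them. It is not the content of the attached patch. Every name in it (PdfFieldsSketch, FieldSink, extractFields, feed, and the field names) is hypothetical, and the values are passed in by hand instead of being read from a real pdf so that the sketch needs no libraries.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names come from pdfbox or from the attached patch.
final class PdfFieldsSketch {

    /** Implemented by the caller's own indexing code, e.g. a thin wrapper around lucene. */
    interface FieldSink {
        void add(String name, String value);
    }

    /**
     * Stand-in for the extraction step. A real version would read these values
     * from a parsed pdf document; here they are passed in directly.
     */
    static Map<String, String> extractFields(String path, String title, String author, String contents) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("path", path);
        putIfPresent(fields, "title", title);
        putIfPresent(fields, "author", author);
        putIfPresent(fields, "contents", contents);
        return fields;
    }

    /** Skips missing or empty values so the index only sees fields that exist. */
    private static void putIfPresent(Map<String, String> fields, String name, String value) {
        if (value != null && !value.isEmpty()) {
            fields.put(name, value);
        }
    }

    /** Hands every extracted field to the sink, in insertion order. */
    static void feed(Map<String, String> fields, FieldSink sink) {
        for (Map.Entry<String, String> entry : fields.entrySet()) {
            sink.add(entry.getKey(), entry.getValue());
        }
    }

    public static void main(String[] args) {
        Map<String, String> fields = extractFields("docs/sample.pdf", "Sample", null, "Hello PDF");
        List<String> indexed = new ArrayList<>();
        // A real sink would build an index document; this one just records the pairs.
        feed(fields, (name, value) -> indexed.add(name + "=" + value));
        System.out.println(indexed);
    }
}
```

With a split like this, only the code behind FieldSink would need to change when the lucene API changes; the extraction side would have no lucene types in its signatures.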


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (PDFBOX-650) Remove dependency on lucene

Posted by "Andreas Lehmkühler (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/PDFBOX-650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andreas Lehmkühler updated PDFBOX-650:
--------------------------------------

    Description: 
The current pdfbox version extracts all needed data from a pdf document and uses lucene to create an index for the lucene search engine. 

To avoid the dependency on lucene, pdfbox should only extract the data, which can then be used to create a lucene index outside of pdfbox. That would decrease the number of external jars and would eliminate another potential issue caused by changing APIs, like those coming with lucene 3.0.

I've created 2 new classes (one for the extraction and one as an example of how to use that feature) based on existing code and attached them as a patch.

WDYT?

If that patch is added to the trunk, the existing code will be removed, including both lucene jars.


  was:
The current pdfbox version extracts all needed data from a pdf document and uses lucene to create an index for the lucene search engine. 

To avoid the dependency to lucene, pdfbox should only extract the data, which can then be used to create a lucene index outside of pdfbox. That would decrease the number of external jars and would eliminate another potential issue caused by changing APIs, like those coming with lucene 3.0.

I've created 2 new classes (one for the extraction and one as an example of how to use that feature) based on existing code and attached them as a patch.

WDYT?

If that patch is added to the trunk, the existing code will be removed, including both lucene jars.


        Summary: Remove dependency on lucene  (was: Remove dependency to lucene)

> Remove dependency on lucene
> ---------------------------
>
>                 Key: PDFBOX-650
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-650
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: Lucene, Utilities
>    Affects Versions: 1.0.0
>            Reporter: Andreas Lehmkühler
>            Assignee: Andreas Lehmkühler
>         Attachments: removing_lucene_patch.txt
>
>
> The current pdfbox version extracts all needed data from a pdf document and uses lucene to create an index for the lucene search engine. 
> To avoid the dependency on lucene, pdfbox should only extract the data, which can then be used to create a lucene index outside of pdfbox. That would decrease the number of external jars and would eliminate another potential issue caused by changing APIs, like those coming with lucene 3.0.
> I've created 2 new classes (one for the extraction and one as an example of how to use that feature) based on existing code and attached them as a patch.
> WDYT?
> If that patch is added to the trunk, the existing code will be removed, including both lucene jars.

[jira] Updated: (PDFBOX-650) Remove dependency to lucene

Posted by "Andreas Lehmkühler (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/PDFBOX-650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andreas Lehmkühler updated PDFBOX-650:
--------------------------------------

    Attachment: removing_lucene_patch.txt

[jira] Closed: (PDFBOX-650) Remove dependency on lucene

Posted by "Andreas Lehmkühler (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/PDFBOX-650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andreas Lehmkühler closed PDFBOX-650.
-------------------------------------


[jira] Resolved: (PDFBOX-650) Remove dependency on lucene

Posted by "Andreas Lehmkühler (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/PDFBOX-650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andreas Lehmkühler resolved PDFBOX-650.
---------------------------------------

    Resolution: Won't Fix

As PDFBOX-752 moves the lucene code to a separate component, we don't need this any more.
