Posted to dev@jspwiki.apache.org by "NicolaFischer (JIRA)" <ji...@apache.org> on 2009/01/15 15:16:59 UTC

[jira] Created: (JSPWIKI-469) Enhance LuceneSearchProvider for other Attachments

Enhance LuceneSearchProvider for other Attachments 
---------------------------------------------------

                 Key: JSPWIKI-469
                 URL: https://issues.apache.org/jira/browse/JSPWIKI-469
             Project: JSPWiki
          Issue Type: New Feature
    Affects Versions: 2.8.1
            Reporter: NicolaFischer
         Attachments: patch.txt

LuceneSearchProvider should index more file types than just plain text. This is one attempt to index PDF files.

Required jars:

* [Apache POI|http://ftp.tpnet.pl/vol/d1/apache/poi/release/bin] (not tested with 3.0.1 final)
* [PDFBox|http://www.pdfbox.org]
* [FontBox|http://www.fontbox.org] 
* [OpenDocumentTextInputStream|http://books.evc-cit.info/odf_utils/index.html]

Patch attached for 2.8.1

Maybe we should check how to index more documents.
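
As a rough illustration of the general idea (this is not the attached patch), the attachment's file extension could select a text extractor, and the extracted text would then go to the indexer. The sketch below uses only the JDK. The names AttachmentTextExtractors, TextExtractor, and readUtf8 are hypothetical and are not part of JSPWiki, PDFBox, or POI. A real PDF or Office extractor would delegate to one of the libraries listed above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Hypothetical registry that maps attachment file extensions to text extractors. */
class AttachmentTextExtractors {

    /** Turns the raw bytes of an attachment into plain text suitable for indexing. */
    interface TextExtractor {
        String extract(InputStream in) throws IOException;
    }

    private final Map<String, TextExtractor> byExtension = new HashMap<>();

    /** Registers an extractor for an extension such as "pdf" (case-insensitive). */
    void register(String extension, TextExtractor extractor) {
        byExtension.put(extension.toLowerCase(Locale.ROOT), extractor);
    }

    /** Returns the lower-cased extension of a file name, or "" if it has none. */
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return "";
        }
        return fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    /**
     * Extracts text when an extractor is registered for the file's extension.
     * Returns an empty Optional otherwise, so the caller can skip the attachment.
     */
    Optional<String> extract(String fileName, InputStream in) {
        TextExtractor extractor = byExtension.get(extensionOf(fileName));
        if (extractor == null) {
            return Optional.empty();
        }
        try {
            return Optional.of(extractor.extract(in));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Extractor for plain-text attachments; assumes the content is UTF-8. */
    static String readUtf8(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int read;
        while ((read = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, read);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

With this shape, supporting another format would mean registering one more extractor, for example register("txt", AttachmentTextExtractors::readUtf8), and unsupported attachments are simply skipped.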

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (JSPWIKI-469) Enhance LuceneSearchProvider for other Attachments

Posted by "NicolaFischer (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JSPWIKI-469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

NicolaFischer updated JSPWIKI-469:
----------------------------------

    Attachment: patch.txt



[jira] Commented: (JSPWIKI-469) Enhance LuceneSearchProvider for other Attachments

Posted by "NicolaFischer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JSPWIKI-469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664805#action_12664805 ] 

NicolaFischer commented on JSPWIKI-469:
---------------------------------------

I think the author meant to do this for the wiki. Maybe he left an e-mail address, and we can try to reach him if you check his account details on jspwiki.org.

FontBox seems to be part of PDFBox.
http://svn.apache.org/repos/asf/incubator/pdfbox/trunk/build.xml



[jira] Commented: (JSPWIKI-469) Enhance LuceneSearchProvider for other Attachments

Posted by "Janne Jalkanen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JSPWIKI-469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664815#action_12664815 ] 

Janne Jalkanen commented on JSPWIKI-469:
----------------------------------------

LGPL is not ok.  That dependency needs to be cleared away.



[jira] Commented: (JSPWIKI-469) Enhance LuceneSearchProvider for other Attachments

Posted by "Harry Metske (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JSPWIKI-469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664592#action_12664592 ] 

Harry Metske commented on JSPWIKI-469:
--------------------------------------

I have a couple of questions/remarks:

# The patch imports classes from the pl.com.pbpolsoft.wiki.util.attachment package. I could not find the sources or a license for them; are they available?
# It is unclear to me what to do with http://www.fontbox.org
# POI is a fairly large library (1.4 MB); the license is OK (Apache).
# PDFBox is an even larger library (2.8 MB); the license is OK (Apache).
# The ODF utils license is GNU LGPL (Lesser General Public License). I don't know if that's OK. Janne?


