Posted to dev@tika.apache.org by Litrik De Roy <li...@gmail.com> on 2008/01/11 09:40:06 UTC

Eclipse plug-in

Hi,

I have made an Eclipse plug-in so you can see all metadata of any
resource in your workspace in the Properties dialog. Nothing fancy,
but it works.

I'm not sure it is really worth the effort to make the source code
available. Maybe I'll set up a Google Code project if anybody is
interested.

-- 
Litrik De Roy
Norio ICT Consulting - http://www.norio.be/

Re: Eclipse plug-in

Posted by Litrik De Roy <li...@gmail.com>.
On Jan 11, 2008 11:06 AM, Jukka Zitting <ju...@gmail.com> wrote:
> On Jan 11, 2008 11:50 AM, Litrik De Roy <li...@gmail.com> wrote:
> > And more importantly - because I don't want to get into trouble - how
> > do I figure out what the license of each dependency is?
>
> They are all either ALv2 or compatible licenses, so you'll be fine as
> long as you include all the relevant copyright notices. If you like, I
> can help you with that.

That would be great. I was hoping that maven would be able to spit
out all the licenses/copyrights as well.
I guess you'll need those licenses sooner or later anyway, if you plan
to provide a binary release (including dependencies).

> Legally there's no trouble if you just make your sources available. You
> only need to worry about the licenses of the dependencies if you
> release a binary plugin that contains the dependency jars.

Hmm...

I don't see how I can get the Eclipse plug-in to work without
including the dependencies in my own project. Unless I go for a source
only plug-in where you'll have to build Tika first and then copy the
dependencies yourself. For the Eclipse plug-in to work, the dependency
jars must be located inside the plug-in. I don't think Eclipse OSGi is
able to dynamically pull them from a maven repository.
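
One common way to bundle jars inside a plug-in is to list them in the
Bundle-ClassPath header of META-INF/MANIFEST.MF. The following is only a
minimal sketch of building that header value. The "lib" folder and the
class name BundleClassPath are assumptions for illustration, not taken
from the plug-in discussed in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class BundleClassPath {
    // Builds the value of the OSGi Bundle-ClassPath manifest header.
    // "." stands for the bundle root; each jar is expected under libDir
    // inside the plug-in (libDir is an assumed layout).
    static String headerValue(String libDir, List<String> jarNames) {
        List<String> entries = new ArrayList<>();
        entries.add(".");
        for (String jar : jarNames) {
            entries.add(libDir + "/" + jar);
        }
        return String.join(",", entries);
    }

    public static void main(String[] args) {
        // Two jar names from the Tika dependency list, used as sample input.
        List<String> jars = Arrays.asList("poi-3.0-FINAL.jar", "pdfbox-0.7.3.jar");
        System.out.println("Bundle-ClassPath: " + headerValue("lib", jars));
    }
}
```

Note that lines in a real manifest file are limited to 72 bytes, so a
long header value has to be wrapped onto continuation lines when it is
written out.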

I think I'll just read each 3rd party license and, if it is OK, I'll
check the jars in (together with their license/copyright file). This
allows me to do a binary release.

-- 
Litrik De Roy
Norio ICT Consulting - http://www.norio.be/

Re: Eclipse plug-in

Posted by Litrik De Roy <li...@gmail.com>.
On Jan 22, 2008 12:27 AM, Jukka Zitting <ju...@gmail.com> wrote:
> As a part of TIKA-115 I reviewed all the licenses of our current (post
> TIKA-117) dependencies and updated the license and notice files
> accordingly in preparation for distributing a Tika bundle with all the
> dependencies.

Awesome.
This will make it a lot easier to build/distribute the Eclipse plug-in.

I really should be able to find some time later this week to come up
with an initial alpha version.

-- 
Litrik De Roy
Norio ICT Consulting - http://www.norio.be/

Re: Eclipse plug-in

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Jan 11, 2008 12:06 PM, Jukka Zitting <ju...@gmail.com> wrote:
> On Jan 11, 2008 11:50 AM, Litrik De Roy <li...@gmail.com> wrote:
> > And more importantly - because I don't want to get into trouble - how
> > do I figure out what the license of each dependency is?
>
> They are all either ALv2 or compatible licenses, so you'll be fine as
> long as you include all the relevant copyright notices.

Hmm, that's not exactly correct... It turns out that the XOM library
brought in as a transitive dependency by Jaxen is LGPL software. You
should be fine as LGPL makes few demands on redistribution, but we'll
need to evaluate the impact of this LGPL dependency in relation to
the Apache third party licensing policy.

AFAIK Jaxen uses XOM only when querying XOM trees, and since Tika
never invokes that functionality the XOM dependency is fully optional.
According to the current draft of the third party licensing policy [1]
that should be OK, so there is no need to pull the 0.1 release because
of this.

And now after TIKA-117 we no longer even depend on Jaxen or XOM.

> If you like, I can help you with that.

As a part of TIKA-115 I reviewed all the licenses of our current (post
TIKA-117) dependencies and updated the license and notice files
accordingly in preparation for distributing a Tika bundle with all the
dependencies. See [2] and [3] for the copyright notices and license
information that cover both Tika itself and all the required runtime
dependencies.

[1] http://www.apache.org/legal/3party.html
[2] http://svn.apache.org/repos/asf/incubator/tika/trunk/NOTICE.txt
[3] http://svn.apache.org/repos/asf/incubator/tika/trunk/LICENSE.txt

BR,

Jukka Zitting

Re: Eclipse plug-in

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Jan 11, 2008 11:50 AM, Litrik De Roy <li...@gmail.com> wrote:
> I'll need to include all dependencies in the Eclipse plug-in.
> Currently I have copied a couple of jar files from my local maven
> repository into my Eclipse project. I have copied enough jars to get
> rid of compile errors and prevent NPEs at runtime.
>
> If I forget some jars I might be missing functionality that gets
> loaded dynamically. Excel files show no metadata, so I suspect I'm
> missing some more jars. How do I know I copied enough jars?

The full list of current dependencies in Tika is:

    bcmail-jdk14-136.jar
    bcprov-jdk14-136.jar
    commons-codec-1.3.jar
    commons-lang-2.1.jar
    commons-logging-1.0.4.jar
    dom4j-1.6.1.jar
    fontbox-0.1.0.jar
    icu4j-3.4.4.jar
    jaxen-1.1.1.jar
    jdom-1.0.jar
    jempbox-0.2.0.jar
    log4j-1.2.14.jar
    nekohtml-0.9.5.jar
    pdfbox-0.7.3.jar
    poi-3.0-FINAL.jar
    xalan-2.6.0.jar
    xercesImpl-2.6.2.jar
    xml-apis-1.3.02.jar
    xmlParserAPIs-2.6.2.jar
    xom-1.0.jar

You can get them all by running "mvn dependency:copy-dependencies"
from the Tika source directory.
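
To answer the "did I copy enough jars" question mechanically, the list
above can be compared against a directory of jars. This is a rough
sketch, not part of Tika. The class name JarChecklist is hypothetical,
and target/dependency is assumed to be the output directory because it is
the default for the copy-dependencies goal.

```java
import java.io.File;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class JarChecklist {
    // Returns the expected jar names that are absent from 'present', sorted.
    static Set<String> missing(List<String> expected, Set<String> present) {
        Set<String> result = new TreeSet<>(expected);
        result.removeAll(present);
        return result;
    }

    // Collects the names of all *.jar files directly inside 'dir'.
    // Returns an empty set if 'dir' does not exist or is not a directory.
    static Set<String> jarsIn(File dir) {
        Set<String> names = new HashSet<>();
        String[] files = dir.list();
        if (files != null) {
            for (String name : files) {
                if (name.endsWith(".jar")) {
                    names.add(name);
                }
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // A few entries from the list above; extend with the remaining jars.
        List<String> expected = Arrays.asList(
                "poi-3.0-FINAL.jar", "pdfbox-0.7.3.jar", "nekohtml-0.9.5.jar");
        // Assumed location: pass your plug-in's jar folder as the first argument.
        File dir = new File(args.length > 0 ? args[0] : "target/dependency");
        System.out.println("Missing jars: " + missing(expected, jarsIn(dir)));
    }
}
```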

> And more importantly - because I don't want to get into trouble - how
> do I figure out what the license of each dependency is?

They are all either ALv2 or compatible licenses, so you'll be fine as
long as you include all the relevant copyright notices. If you like, I
can help you with that.

Legally there's no trouble if you just make your sources available. You
only need to worry about the licenses of the dependencies if you
release a binary plugin that contains the dependency jars.

BR,

Jukka Zitting

Re: Eclipse plug-in

Posted by Litrik De Roy <li...@gmail.com>.
On Jan 11, 2008 9:54 AM, Litrik De Roy <li...@litrik.com> wrote:
>
> I'll clean up the source a little bit and I'll try to start a project
> at Google Code in the next couple of days.
> Stay tuned...
>

I'll need to include all dependencies in the Eclipse plug-in.
Currently I have copied a couple of jar files from my local maven
repository into my Eclipse project. I have copied enough jars to get
rid of compile errors and prevent NPEs at runtime.

If I forget some jars I might be missing functionality that gets
loaded dynamically. Excel files show no metadata, so I suspect I'm
missing some more jars. How do I know I copied enough jars?
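
One way to spot this kind of gap is to probe for a class that a parser
needs and see whether it can be loaded. This is a small sketch under
assumptions: ClassProbe is a hypothetical helper, and
org.apache.poi.hssf.usermodel.HSSFWorkbook is used as the probe because
it is POI's Excel workbook class. Whether Tika touches exactly that
class for Excel files is an assumption here.

```java
final class ClassProbe {
    // Returns true if the named class can be found by the class loader
    // that loaded ClassProbe. The class is located but not initialized.
    static boolean isAvailable(String className) {
        try {
            Class.forName(className, false, ClassProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Either the class itself or something it links against is missing.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed probe class for Excel support (from the POI jar).
        String excel = "org.apache.poi.hssf.usermodel.HSSFWorkbook";
        System.out.println(excel + " available: " + isAvailable(excel));
    }
}
```

The result depends on which class loader runs the check, so inside
Eclipse it would have to run from within the plug-in itself to say
anything about the plug-in's classpath.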

And more importantly - because I don't want to get into trouble - how
do I figure out what the license of each dependency is?

-- 
Litrik De Roy
Norio ICT Consulting - http://www.norio.be/

Re: Eclipse plug-in

Posted by Litrik De Roy <li...@litrik.com>.
On Jan 11, 2008 9:48 AM, Jukka Zitting <ju...@gmail.com> wrote:
> On Jan 11, 2008 10:40 AM, Litrik De Roy <li...@gmail.com> wrote:
> > I have made an Eclipse plug-in so you can see all metadata of any
> > resource in your workspace in the Properties dialog. Nothing fancy,
> > but it works.
>
> I'd be eager to try it out, so it would be nice if you'd make at least
> the binary plugin available somewhere. Sources would be even better.
> :-)
>

I'll clean up the source a little bit and I'll try to start a project
at Google Code in the next couple of days.
Stay tuned...

-- 
Litrik De Roy
Norio ICT Consulting - http://www.norio.be/

Re: Eclipse plug-in

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Jan 11, 2008 10:40 AM, Litrik De Roy <li...@gmail.com> wrote:
> I have made an Eclipse plug-in so you can see all metadata of any
> resource in your workspace in the Properties dialog. Nothing fancy,
> but it works.

Oh, that's way cool! This is exactly the kind of higher-level
functionality I was hoping to see emerging once a generic metadata
extraction toolkit like Tika is available.

Excellent, we must be doing something right! :-)

> I'm not sure it is really worth the effort to make the source code
> available. Maybe I'll set up a Google Code project if anybody is
> interested.

I'd be eager to try it out, so it would be nice if you'd make at least
the binary plugin available somewhere. Sources would be even better.
:-)

BR,

Jukka Zitting