Posted to legal-discuss@apache.org by Karl Wettin <ka...@apache.org> on 2008/04/15 01:57:23 UTC

IP on algorithmic implementations

LUCENE-1039 is a Bayesian classifier that uses Lucene for data storage
and tokenization of text for various text mining purposes.

The code is based on the Naïve Bayes and Fisher method algorithms as 
described by Toby Segaran in "Programming Collective Intelligence",
ISBN 978-0-596-52932-1.

I can't find my copy right now, but the legal part says something like
"you may use the code in this book in your commercial products but you
cannot distribute it any way you want, email permissions@oreilly.com".

I've sent multiple mails to them, and a week ago I tried Toby
directly, but never got a response from any of them.

However, I don't use the code. The book uses Python and Lucene is Java.
There are a few similarities in the architecture, mainly due to the fact
that they are both implementations of an algorithm this guy in the UK
came up with some 250 years ago.

People have hinted that the only issue is that I have brought it up as a 
potential issue.

Do I dare proceed committing this patch to the SVN trunk?


     karl

---------------------------------------------------------------------
DISCLAIMER: Discussions on this list are informational and educational
only.  Statements made on this list are not privileged, do not
constitute legal advice, and do not necessarily reflect the opinions
and policies of the ASF.  See <http://www.apache.org/licenses/> for
official ASF policies and documents.
---------------------------------------------------------------------
To unsubscribe, e-mail: legal-discuss-unsubscribe@apache.org
For additional commands, e-mail: legal-discuss-help@apache.org


Re: IP on algorithmic implementations

Posted by Paul Libbrecht <pa...@activemath.org>.
Karl,

Could you be more precise than "250 years ago" and get hold of the
original article and maybe related patents? I think these are the ones
that can bite the ASF.

Naive Bayes is a fairly well-known algorithm which, I think, is
implemented in other tools, for example Weka or Yale/RapidMiner (both
are GPL). Have you had a chance to look at their source code and
documentation? There must be pointers there.

Keep up the good work for mining-in-Lucene though! I may be wrong but  
I think it is the first BSD-ish licensed set of tools for mining, at  
least in Java.

paul



Re: IP on algorithmic implementations

Posted by Doug Cutting <cu...@apache.org>.
Karl Wettin wrote:
> However, I don't use the code. The book uses Python and Lucene is Java.
> There are a few similarities in the architecture, mainly due to the fact
> that they are both implementations of an algorithm this guy in the UK
> came up with some 250 years ago.

A re-implementation in a different programming language may be a
derivative work, depending on how similar it is to the original.  If the
original license prohibits commercial use, and the new implementation
has substantial similarities to the original, then the code could not be
used at Apache, since the Apache license permits commercial use.

> Do I dare proceed committing this patch to the SVN trunk?

How similar is it to the original?  If it is no more similar than other 
implementations of the same algorithms, then there should not be a 
problem.  But if it shares many elements that are particular to the 
original program, that could be a problem.

Doug



Re: IP on algorithmic implementations

Posted by Ian Holsman <li...@holsman.net>.
Niclas Hedhman wrote:
> On Tuesday 15 April 2008 10:51, Ian Holsman wrote:
>> someone has mentioned there might be a patent on the algorithm.
>
> I have been told by a Patent Lawyer that algorithms are not patentable, when I
> wanted to patent a search algorithm some 6-7 years ago. Seems there is some
> discrepancy here...
Hi Niclas.
There was a lot of controversy regarding the GIF file format some years
back because the compression algorithm it used (LZW) was patented, and
Unisys (the patent holder) demanded money for it
(see
http://en.wikipedia.org/wiki/Graphics_Interchange_Format#Unisys_and_LZW_patent_enforcement).
It's the reason we have PNGs today.

Regards
Ian

PS: that one expired in 2003.




Re: IP on algorithmic implementations

Posted by Niclas Hedhman <ni...@hedhman.org>.
On Tuesday 15 April 2008 10:51, Ian Holsman wrote:
> someone has mentioned there might be a patent on the algorithm.

I was told by a patent lawyer that algorithms are not patentable when I
wanted to patent a search algorithm some 6-7 years ago. There seems to be
some discrepancy here...


Cheers
-- 
Niclas Hedhman, Software Developer

I  live here; http://tinyurl.com/2qq9er
I  work here; http://tinyurl.com/2ymelc
I relax here; http://tinyurl.com/2cgsug



Re: IP on algorithmic implementations

Posted by Doug Cutting <cu...@apache.org>.
Ian Holsman wrote:
> we have a similar issue on MAHOUT-31 
> (https://issues.apache.org/jira/browse/MAHOUT-31 ) where someone has 
> mentioned there might be a patent on the algorithm.

They're not actually very similar.  One involves copyright and the other
patents.  In the case of patents we generally wait for a patent owner to
assert that we've infringed before we worry about it.  In the case of
copyright, we are generally proactive.

Doug



Re: IP on algorithmic implementations

Posted by Ian Holsman <li...@holsman.net>.
We have a similar issue on MAHOUT-31
(https://issues.apache.org/jira/browse/MAHOUT-31) where someone has
mentioned there might be a patent on the algorithm.

