Posted to java-user@lucene.apache.org by sa...@cfl.rr.com on 2006/02/09 17:13:10 UTC
anyone interested in taking over textmining.org?
The TextMining.org website keeps getting hacked, and I don't have the
time to upgrade PostNuke to a more secure version. Also, for legal
reasons, I can't maintain the software. I am more than willing to
hand off the project to Lucene or someone else. It's under the Apache 2
license, so anyone can branch at any time and use any license they want.
However, if someone wants to take over and gets my seal of approval, I
will make the textmining.org home page redirect to your site.
The software extracts text from Word documents reliably. When there are
problems, they are caused by fast-saved files or by files saved with the
.doc extension that aren't actually Word documents (RTF, HTML). Unlike
POI, it supports Word 6.0/95 documents. There are many ways it could be
improved, but in my opinion those are minor changes. The core logic is
solid and is used in commercial and government applications.
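
One way to guard against the mislabeled-file case is to look at the first few bytes before handing a file to an extractor. The following is only a minimal sketch, not code from TextMining.org: the class name DocSniffer and its methods are hypothetical. It relies on the OLE2 compound-file signature (D0 CF 11 E0 A1 B1 1A E1) that binary Word files, including Word 6.0/95, begin with, and on RTF files starting with "{\rtf". The HTML check is a loose heuristic. A matching OLE2 signature does not prove the file is a Word document (other Office formats use the same container), and the check does nothing to detect fast-saved files.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.Locale;

// Hypothetical helper, not part of TextMining.org or POI.
class DocSniffer {
    // Signature at the start of every OLE2 compound file.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Classifies the leading bytes of a file as "OLE2", "RTF", "HTML", or "UNKNOWN".
    static String sniff(byte[] head) {
        if (startsWith(head, OLE2_MAGIC)) {
            return "OLE2";
        }
        // Treat the bytes as Latin-1 text, ignoring leading whitespace and case.
        String text = new String(head, StandardCharsets.ISO_8859_1)
                .trim()
                .toLowerCase(Locale.ROOT);
        if (text.startsWith("{\\rtf")) {
            return "RTF";
        }
        // Heuristic only: HTML may begin with comments or other markup.
        if (text.startsWith("<!doctype html") || text.startsWith("<html")) {
            return "HTML";
        }
        return "UNKNOWN";
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    // Reads up to 64 bytes from the start of the file and classifies them.
    static String sniffFile(Path file) throws IOException {
        byte[] buf = new byte[64];
        int total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while (total < buf.length
                    && (n = in.read(buf, total, buf.length - total)) != -1) {
                total += n;
            }
        }
        return sniff(Arrays.copyOf(buf, total));
    }

    public static void main(String[] args) throws IOException {
        for (String arg : args) {
            System.out.println(arg + ": " + sniffFile(Paths.get(arg)));
        }
    }
}
```

Files reported as RTF or HTML could then be routed to a different parser, or skipped, instead of being treated as Word documents.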
Send me an email directly if you are interested.