Posted to dev@lucene.apache.org by Alan Gibson <al...@gmail.com> on 2005/01/28 03:26:32 UTC

A Lucene document factory implementation

I've created a simple document factory for a Lucene-based project that
I'm working on. I want to see if anyone would have any interest in
this; maybe it could go in the sandbox.

Basically, the following steps are performed:
1. DocumentFactory uses a component that determines the file type via
extension and, soon, magic number.
2. DocumentFactory then hands off the URL to the appropriate Builder.
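
As a rough illustration of step 1, extension-based guessing could be
sketched as below. The class name ExtensionTypeGuesser, its method names
and the five table entries are assumptions made for this example only;
they are not the actual component, and magic-number detection is not
shown.

```java
import java.net.URL;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper: guesses a MIME type from a file extension. */
final class ExtensionTypeGuesser {
    private final Map<String, String> mimeByExtension = new HashMap<>();

    ExtensionTypeGuesser() {
        // Tiny illustrative table; these entries are assumptions for the sketch.
        mimeByExtension.put("pdf", "application/pdf");
        mimeByExtension.put("rtf", "application/rtf");
        mimeByExtension.put("html", "text/html");
        mimeByExtension.put("htm", "text/html");
        mimeByExtension.put("txt", "text/plain");
    }

    /** Guesses from the path part of the URL, ignoring query and fragment. */
    String guess(URL url) {
        return guessFromPath(url.getPath());
    }

    /** Returns the guessed MIME type, or null if the extension is missing or unknown. */
    String guessFromPath(String path) {
        int slash = path.lastIndexOf('/');
        int dot = path.lastIndexOf('.');
        // The dot must come after the last slash and must not be the final character.
        if (dot <= slash || dot == path.length() - 1) {
            return null;
        }
        String extension = path.substring(dot + 1).toLowerCase(Locale.ROOT);
        return mimeByExtension.get(extension);
    }
}
```

Returning null for an unknown extension leaves it to the caller to decide
whether to fall back to another detection method or to reject the file.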

Classes:

DocumentFactory
    // primary method to create document
    Document build(URL url);
    // force to build as given mime type
    Document build(URL url, String mimeType);

Builder (Interface)
    Document build(URL url);    
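
To make the hand-off concrete, here is a minimal sketch of how the two
build methods might dispatch to a Builder. Everything beyond the
signatures listed above is an assumption: the register method, the
injected type guesser function, the exception thrown for unknown types,
and the placeholder Document class, which only stands in for
org.apache.lucene.document.Document so that the example compiles by
itself.

```java
import java.net.URL;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Placeholder for Lucene's Document class (assumption, for a standalone sketch). */
final class Document {
    final Map<String, String> fields = new HashMap<>();
}

/** Builds a Document from the content behind a URL. */
interface Builder {
    Document build(URL url);
}

final class DocumentFactory {
    private final Map<String, Builder> buildersByMimeType = new HashMap<>();
    private final Function<URL, String> typeGuesser;

    /** The guesser maps a URL to a MIME type (hypothetical wiring). */
    DocumentFactory(Function<URL, String> typeGuesser) {
        this.typeGuesser = typeGuesser;
    }

    /** Hypothetical registration hook: associates a MIME type with a Builder. */
    void register(String mimeType, Builder builder) {
        buildersByMimeType.put(mimeType, builder);
    }

    /** Primary method: guess the type, then delegate. */
    Document build(URL url) {
        return build(url, typeGuesser.apply(url));
    }

    /** Forces the URL to be built as the given MIME type. */
    Document build(URL url, String mimeType) {
        Builder builder = buildersByMimeType.get(mimeType);
        if (builder == null) {
            throw new IllegalArgumentException(
                "No Builder registered for MIME type " + mimeType + ": " + url);
        }
        return builder.build(url);
    }
}
```

Keeping the builders in a map keyed by MIME type means a new file format
only needs a new Builder and one registration call.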

So far the file type guesser guesses 383 or so file types reliably. I
have working Builder implementations for PDF, RTF, HTML and plain
text files.

alan

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-dev-help@jakarta.apache.org