Posted to user@uima.apache.org by Richard Eckart de Castilho <re...@apache.org> on 2016/12/01 01:09:47 UTC
Re: Example of Tika FilesystemReader working with uimaFIT?
Hi,
you can set up a "types.txt" file as documented here [1] to
point uimaFIT to the type system descriptor that contains the missing
annotation type.
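
A rough sketch of that first option, assuming the descriptor for
SourceDocumentInformation is on your classpath at the location shown
(that path is an assumption, adjust the pattern to wherever your type
system XML actually lives): create a file named
META-INF/org.apache.uima.fit/types.txt on the classpath and list one
location pattern per line, for example:

```
classpath*:org/apache/uima/examples/SourceDocumentInformation.xml
```

With that file in place, uimaFIT should pick up the type system
automatically and no descriptor needs to be passed in code.
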
Alternatively, you can load your type system description
in code and pass it after the class argument to createCollectionReader,
e.g.
    TypeSystemDescription tsd = TypeSystemDescriptionFactory.createTypeSystemDescriptionFromPath(
        "path/to/your/typesystem.xml");
    CollectionReader readerEngine = CollectionReaderFactory.createCollectionReader(
        FileSystemCollectionReader.class, tsd, ... params ...);
Cheers,
-- Richard
[1] https://uima.apache.org/d/uimafit-current/tools.uimafit.book.html#ugr.tools.uimafit.typesystem
> On 26.11.2016, at 22:26, Paul Browne <pa...@firstpartners.net> wrote:
>
> Folks,
>
> Wondering if there are any samples of using the Uima component Tika
> FilesystemReader working with uimaFIT?
>
> I've been playing around with it and getting several errors (probably my
> fault), but I can't seem to find a similar example on the website or mailing
> list despite a search. I have downloaded and compiled the source (Uima, Uima
> tools, examples); the existing code is clear, but when I try to combine them to
> do the following outline I get errors.
>
> Aim is to:
> 1) Read a collection of documents using the Uima component Tika
> FilesystemReader
> 2) Later, do more serious POS tagging.
>
> The code is:
>
> CollectionReader readerEngine =
> CollectionReaderFactory.createCollectionReader(FileSystemCollectionReader.class,
> FileSystemCollectionReader.PARAM_INPUTDIR,
> "C:\\Somelocation",
> FileSystemCollectionReader.PARAM_ENCODING, "UTF-8",
> FileSystemCollectionReader.PARAM_LANGUAGE, "EN");
>
> AggregateBuilder builder = new AggregateBuilder();
>
> SimplePipeline.runPipeline(readerEngine, builder.createAggregate());
>
> And the error is
> Exception in thread "main" org.apache.uima.cas.CASRuntimeException: JCas
> type "org.apache.uima.examples.SourceDocumentInformation" used in Java
> code, but was not declared in the XML type descriptor.
>
> Similar error referenced at link below, but not clear how to implement the
> suggested fix
> http://user.uima.apache.narkive.com/b940cOrO/how-to-test-a-collectionreader
>
> Any suggestions or pointers on the web that I should be looking at?
>
> Thanks for your help
>
> Paul