Posted to dev@uima.apache.org by "Richard Eckart de Castilho (JIRA)" <de...@uima.apache.org> on 2012/06/09 16:38:42 UTC

[jira] [Created] (UIMA-2419) Initial view for sofa unaware components not automatically created

Richard Eckart de Castilho created UIMA-2419:
------------------------------------------------

             Summary: Initial view for sofa unaware components not automatically created
                 Key: UIMA-2419
                 URL: https://issues.apache.org/jira/browse/UIMA-2419
             Project: UIMA
          Issue Type: Bug
          Components: Core Java Framework
            Reporter: Richard Eckart de Castilho


When running a sofa-unaware component in an aggregate analysis engine, the initial view for the component to operate on is not automatically created if it does not exist. This causes a CASRuntimeException, here "No sofaFS with name A found.".

{noformat}
org.apache.uima.analysis_engine.AnalysisEngineProcessException: Annotator processing failed.    
	at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:394)
	at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:298)
	at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:568)
	at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:410)
	at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:343)
	at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:265)
	at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.processAndOutputNewCASes(AnalysisEngineImplBase.java:340)
         ...
Caused by: org.apache.uima.cas.CASRuntimeException: No sofaFS with name A found.
	at org.apache.uima.cas.impl.CASImpl.getSofa(CASImpl.java:661)
	at org.apache.uima.cas.impl.CASImpl.getView(CASImpl.java:2658)
	at org.apache.uima.impl.Util.getStartingView(Util.java:46)
	at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:349)
	... 31 more
{noformat}

I'd consider this a bug, because a sofa-unaware component cannot be expected to create a view itself. If the sofa-unaware component is the first one in an aggregate, e.g. acting as a reader, then there is also no earlier component that could create the view.

If the view that a component's initial view is mapped to does not exist, it should be the task of the UIMA framework to create it.
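
To make the difference between the current and the proposed behaviour concrete, here is a minimal sketch in plain Java. It deliberately does not use the UIMA API: the class StartingViewSketch and its methods getView, getOrCreateView and hasView are hypothetical stand-ins, and the "CAS" is modelled as nothing more than a map of named views.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy stand-in for a CAS: a set of named views holding text. Not the UIMA API. */
public class StartingViewSketch {
    private final Map<String, StringBuilder> views = new LinkedHashMap<>();

    /** Strict lookup, mirroring the reported failure: a missing view is an error. */
    public StringBuilder getView(String name) {
        StringBuilder view = views.get(name);
        if (view == null) {
            throw new IllegalStateException("No sofaFS with name " + name + " found.");
        }
        return view;
    }

    /** Lenient lookup, modelling the behaviour proposed in this issue (an assumption). */
    public StringBuilder getOrCreateView(String name) {
        return views.computeIfAbsent(name, key -> new StringBuilder());
    }

    public boolean hasView(String name) {
        return views.containsKey(name);
    }

    public static void main(String[] args) {
        StartingViewSketch cas = new StartingViewSketch();
        try {
            cas.getView("A"); // view "A" was never created
        } catch (IllegalStateException e) {
            System.out.println("strict: " + e.getMessage());
        }
        // With create-if-missing, a sofa-unaware component can simply write its data.
        cas.getOrCreateView("A").append("some document text");
        System.out.println("lenient: " + cas.getView("A"));
    }
}
```

In this model the strict lookup corresponds to what the stack trace above shows, while the lenient lookup is what I would expect the framework to do for a sofa-unaware component.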

See also CPE ArtifactProducer:481 (UIMA 2.4.0).


Re: [jira] [Created] (UIMA-2419) Initial view for sofa unaware components not automatically created

Posted by Eddie Epstein <ea...@gmail.com>.
Hi Richard,

The _InitialView is different in order to maintain compatibility with
UIMA applications that predated Views. For example, an application
creates a CAS and then operates on it without creating any Views.

Regards,
Eddie

On Mon, Jul 30, 2012 at 4:42 AM, Richard Eckart de Castilho
<ec...@ukp.informatik.tu-darmstadt.de> wrote:
> Hello Eddie,
>
> I did not try it (yet), but I agree that this should work. While I understand your reasoning, my subjective feeling
> is that the naming of SofAs at the pipeline level and at the AE level should be completely independent and the
> mapping flexible. I think the _InitialView should not receive special treatment in this context.
>
> I'll get back to you if I run into really substantial problems or if your suggestion does not work out.
>
> Thanks!
>
> -- Richard

Re: [jira] [Created] (UIMA-2419) Initial view for sofa unaware components not automatically created

Posted by Richard Eckart de Castilho <ec...@ukp.informatik.tu-darmstadt.de>.
Hello Eddie,

I did not try it (yet), but I agree that this should work. While I understand your reasoning, my subjective feeling
is that the naming of SofAs at the pipeline level and at the AE level should be completely independent and the
mapping flexible. I think the _InitialView should not receive special treatment in this context.

I'll get back to you if I run into really substantial problems or if your suggestion does not work out.

Thanks!

-- Richard

On 17.06.2012 at 18:11, Eddie Epstein wrote:

> Richard,
> 
> Non-default views are currently created by application code, not by
> the framework. The absence of an expected view is more clearly
> diagnostic than the highly varied errors that would come if the
> framework automatically created a view.
> 
> Sofa mapping is intended to solve your scenario by having the CR fill
> the default _InitialView and then mapping view A to the _InitialView
> for the analyzer. When the analyzer asks for view(A) it would get
> _InitialView.
> 
> Did you try this?
> 
> Eddie

-- 
------------------------------------------------------------------- 
Richard Eckart de Castilho
Technical Lead
Ubiquitous Knowledge Processing Lab (UKP-TUD) 
FB 20 Computer Science Department      
Technische Universität Darmstadt 
Hochschulstr. 10, D-64289 Darmstadt, Germany 
phone [+49] (0)6151 16-7477, fax -5455, room S2/02/B117
eckart@ukp.informatik.tu-darmstadt.de 
www.ukp.tu-darmstadt.de 
Web Research at TU Darmstadt (WeRC) www.werc.tu-darmstadt.de
------------------------------------------------------------------- 







Re: [jira] [Created] (UIMA-2419) Initial view for sofa unaware components not automatically created

Posted by Eddie Epstein <ea...@gmail.com>.
Richard,

Non-default views are currently created by application code, not by
the framework. The absence of an expected view is more clearly
diagnostic than the highly varied errors that would come if the
framework automatically created a view.

Sofa mapping is intended to solve your scenario by having the CR fill
the default _InitialView and then mapping view A to the _InitialView
for the analyzer. When the analyzer asks for view(A) it would get
_InitialView.
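
As a rough illustration of that mapping idea, here is a plain-Java sketch. It is only a toy model and not the UIMA API: the class SofaMappingSketch and its methods map and resolve are hypothetical names, and the component keys "reader" and "analyzer" are assumed for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of sofa name mapping; class and method names are hypothetical, not UIMA API. */
public class SofaMappingSketch {
    // Key: "componentKey/componentSofaName", value: the view name used at the aggregate level.
    private final Map<String, String> mappings = new HashMap<>();

    public void map(String componentKey, String componentSofaName, String aggregateSofaName) {
        mappings.put(componentKey + "/" + componentSofaName, aggregateSofaName);
    }

    /** Returns the aggregate-level view name a component receives; unmapped names pass through. */
    public String resolve(String componentKey, String componentSofaName) {
        return mappings.getOrDefault(componentKey + "/" + componentSofaName, componentSofaName);
    }

    public static void main(String[] args) {
        SofaMappingSketch aggregate = new SofaMappingSketch();
        // The reader stays unmapped and fills the default view.
        // The analyzer's view "A" is mapped onto that default view.
        aggregate.map("analyzer", "A", "_InitialView");

        System.out.println(aggregate.resolve("reader", "_InitialView")); // _InitialView
        System.out.println(aggregate.resolve("analyzer", "A"));          // _InitialView
    }
}
```

Both components end up on the same underlying view, which always exists, so no view has to be created by the framework.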

Did you try this?

Eddie


On Fri, Jun 15, 2012 at 5:36 PM, Richard Eckart de Castilho
<ec...@ukp.informatik.tu-darmstadt.de> wrote:
> On 11.06.2012 at 20:11, Eddie Epstein wrote:
>
>> Can you be a bit more explicit what the failing scenario is?
>
> Take a scenario where you want to access the CASes produced by an aggregate pipeline directly (no CAS consumer), but you want to use a reader to fill the CASes (this is what the demo implements).
>
> Now add the need for sofa mapping to that scenario, because you want to run a complex analysis. The collection reader is not sofa-aware, but you do want it to write to some view "A" instead of writing to the "_InitialView", because "A" is what the next component will process. This is possible now, because in the AnalysisEngineDescription I can declare sofa mappings for the reader. However, I would get an exception due to UIMA-2419.

Re: [jira] [Created] (UIMA-2419) Initial view for sofa unaware components not automatically created

Posted by Richard Eckart de Castilho <ec...@ukp.informatik.tu-darmstadt.de>.
On 11.06.2012 at 20:11, Eddie Epstein wrote:

> Can you be a bit more explicit what the failing scenario is?

Take a scenario where you want to access the CASes produced by an aggregate pipeline directly (no CAS consumer), but you want to use a reader to fill the CASes (this is what is implemented in the demo below).

Now add the need for sofa mapping to that scenario, because you want to run a complex analysis. The collection reader is not sofa-aware, but you do want it to write to some view "A" instead of writing to the "_InitialView", because "A" is what the next component will process. This is possible now, because in the AnalysisEngineDescription I can declare sofa mappings for the reader. However, I would get an exception due to UIMA-2419.

> I'm definitely confused by wrapping a CR in an AE descriptor. Is it
> possible to paste here an aggregate descriptor using sample components
> from the UIMA SDK that demonstrates the problem?

So here is the demo of wrapping a CR in an AE - no sofa mappings here because they would cause an exception. The SimpleReader
creates a single CAS and sets the text; the SimpleAnalyzer additionally sets the document language. It's a very basic example.
The full runnable sources are at

http://code.google.com/p/uimafit/source/browse/trunk/uimaFIT/src/test/java/org/uimafit/factory/AggregateWithReaderTest.java

/**
 * Demo of disguising a reader as a CAS multiplier. This works because internally, UIMA wraps
 * the reader in a CollectionReaderAdapter. The nice thing about this is that, in principle,
 * it would be possible to define sofa mappings. However, UIMA-2419 prevents this.
 */
@Test
public void demoAggregateWithDisguisedReader() throws UIMAException {
  ResourceSpecifierFactory factory = UIMAFramework.getResourceSpecifierFactory();
	
  AnalysisEngineDescription reader = factory.createAnalysisEngineDescription();
  reader.getMetaData().setName("reader");
  reader.setPrimitive(true);
  reader.setImplementationName(SimpleReader.class.getName());
  reader.getAnalysisEngineMetaData().getOperationalProperties().setOutputsNewCASes(true);

  AnalysisEngineDescription analyzer = factory.createAnalysisEngineDescription();
  analyzer.getMetaData().setName("analyzer");
  analyzer.setPrimitive(true);
  analyzer.setImplementationName(SimpleAnalyzer.class.getName());

  FixedFlow flow = factory.createFixedFlow();
  flow.setFixedFlow(new String[] { "reader", "analyzer" });

  AnalysisEngineDescription aggregate = factory.createAnalysisEngineDescription();
  aggregate.getMetaData().setName("aggregate");
  aggregate.setPrimitive(false);
  aggregate.getAnalysisEngineMetaData().setFlowConstraints(flow);
  aggregate.getAnalysisEngineMetaData().getOperationalProperties().setOutputsNewCASes(true);
  aggregate.getAnalysisEngineMetaData().getOperationalProperties()
      .setMultipleDeploymentAllowed(false);
  aggregate.getDelegateAnalysisEngineSpecifiersWithImports().put("reader", reader);
  aggregate.getDelegateAnalysisEngineSpecifiersWithImports().put("analyzer", analyzer);

  AnalysisEngine pipeline = UIMAFramework.produceAnalysisEngine(aggregate);
  CasIterator iterator = pipeline.processAndOutputNewCASes(pipeline.newCAS());
  while (iterator.hasNext()) {
    CAS cas = iterator.next();
    System.out.printf("[%s] is [%s]%n", cas.getDocumentText(), cas.getDocumentLanguage());
  }
}

-- Richard


Re: [jira] [Created] (UIMA-2419) Initial view for sofa unaware components not automatically created

Posted by Richard Eckart de Castilho <ec...@ukp.informatik.tu-darmstadt.de>.
Hi Eddie,

I'd suggest I put together a piece of code that creates such a descriptor programmatically. For an example, I should be able to do that with reasonable effort and few lines of code, even without uimaFIT. If that's fine with you as well, stay tuned.

-- Richard

On 11.06.2012 at 20:11, Eddie Epstein wrote:

> Hi Richard,
> 
> Can you be a bit more explicit what the failing scenario is? I'm
> definitely confused by wrapping a CR in an AE descriptor. Is it
> possible to paste here an aggregate descriptor using sample components
> from the UIMA SDK that demonstrates the problem?
> 
> Thanks,
> Eddie

Re: [jira] [Created] (UIMA-2419) Initial view for sofa unaware components not automatically created

Posted by Eddie Epstein <ea...@gmail.com>.
Hi Richard,

Can you be a bit more explicit what the failing scenario is? I'm
definitely confused by wrapping a CR in an AE descriptor. Is it
possible to paste here an aggregate descriptor using sample components
from the UIMA SDK that demonstrates the problem?

Thanks,
Eddie


On Sun, Jun 10, 2012 at 2:11 PM, Richard Eckart de Castilho
<ec...@ukp.informatik.tu-darmstadt.de> wrote:
> Actually, the exception triggered by PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess() when accessing the non-existent mapped CAS view seems completely redundant, because if the analysis engine delegate is a sofa-unaware CasMultiplier or CollectionReader(Adapter) that doesn't actually use its input CAS, it doesn't matter at all at that point that the mapped view does not exist. It's enough if the mapped initial view is set up in any new CAS created for the CasMultiplier/CollectionReader.
>
> So, there are many possible ways. I personally don't find it very attractive to change the CollectionReaderDescription because I think that has quite some overhead. Even if that were done, the problem would probably remain for mapped CasMultipliers. I like that UIMA internally treats all components equally, so I would prefer a solution that also works well when sofa mappings are used on components that produce new CASes and potentially make no use of the input CAS at all.
>
> -- Richard

Re: [jira] [Created] (UIMA-2419) Initial view for sofa unaware components not automatically created

Posted by Richard Eckart de Castilho <ec...@ukp.informatik.tu-darmstadt.de>.
On 10.06.2012 at 19:50, Richard Eckart de Castilho wrote:

> I guess another option would be to change CollectionReaderAdapter to create any missing initial view for sofa-unaware readers. That would not have any side effects on other component types and it would solve the problem for my use-case as well. The problem is that this doesn't work, because PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess() already tries to access the mapped view and fails. Changing that to test if the mAnalysisComponent is a sofa-unaware CollectionReaderAdapter and creating a new view only in that case looks rather like a hack to me, although it would probably resolve the situation. I didn't test that yet, but if you think it reasonable, I can check it.

Actually, the exception triggered by PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess() when accessing the non-existent mapped CAS view seems completely redundant, because if the analysis engine delegate is a sofa-unaware CasMultiplier or CollectionReader(Adapter) that doesn't actually use its input CAS, it doesn't matter at all at that point that the mapped view does not exist. It's enough if the mapped initial view is set up in any new CAS created for the CasMultiplier/CollectionReader.
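
A tiny plain-Java sketch of that last point, under the assumption that a reader-like component never looks at its input CAS. Nothing here is UIMA code: the class NewCasViewSketch and the method produceNext are hypothetical, and a "CAS" is modelled as a map from view name to text.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Plain-Java sketch; a "CAS" here is just a map from view name to text. Not the UIMA API. */
public class NewCasViewSketch {

    /**
     * Models a sofa-unaware, reader-like component: it ignores its input CAS and
     * emits a new CAS in which the mapped initial view has been set up.
     */
    public static Map<String, String> produceNext(Map<String, String> inputCas,
                                                  String mappedViewName, String text) {
        // inputCas is deliberately never inspected, so a missing view in it is harmless.
        Map<String, String> newCas = new LinkedHashMap<>();
        newCas.put(mappedViewName, text);
        return newCas;
    }

    public static void main(String[] args) {
        Map<String, String> emptyInput = new LinkedHashMap<>(); // contains no view "A"
        Map<String, String> output = produceNext(emptyInput, "A", "some document text");
        System.out.println(output.containsKey("A"));     // true
        System.out.println(emptyInput.containsKey("A")); // false
    }
}
```

In this model, checking the input CAS for view "A" before calling the component would fail even though the component produces a perfectly usable output CAS.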

So, there are many possible ways. I personally don't find it very attractive to change the CollectionReaderDescription because I think that has quite some overhead. Even if that were done, the problem would probably remain for mapped CasMultipliers. I like that UIMA internally treats all components equally, so I would prefer a solution that also works well when sofa mappings are used on components that produce new CASes and potentially make no use of the input CAS at all.

-- Richard

Re: [jira] [Created] (UIMA-2419) Initial view for sofa unaware components not automatically created

Posted by Richard Eckart de Castilho <ec...@ukp.informatik.tu-darmstadt.de>.
On 10.06.2012 at 19:03, Marshall Schor wrote:

> Hmmm, it seems to me that something is wrong if a UIMA pipeline ended up sending a CAS to a sofa-unaware component without a default view having been set up. I would guess that in this situation, it would be better to throw an exception rather than hide this by automatically creating the view. If a missing view is created, its subject-of-analysis would be left unset? I'm guessing that most sofa-unaware annotators would not expect that, and would fail in mysterious ways.
> 
> What would be the use cases where it would be more valuable to create the view, rather than signal something's amiss?

My use-case is an aggregate analysis engine that uses a CollectionReader as its first component (a CasMultiplier may also work, I didn't test that). UIMA doesn't support sofa mappings for readers other than in CPEs (or I missed something). We would like to add support for sofa-mapped readers in uimaFIT though and would like to do so implementing as little infrastructure as possible on top of UIMA. Ideally, we'd just cleverly configure UIMA to get the feature implemented.

So, to work around the fact that CollectionReaderDescriptions do not support sofa mappings, I configured an AnalysisEngineDescription for a CollectionReader. UIMA doesn't really care much which kind of processing component is declared in an AnalysisEngineDescription, because internally it is all handled the same. I dimly remember a post to one of the UIMA mailing lists saying that the distinction between readers, analysis engines and consumers is largely arbitrary and that everything could be done with CasMultipliers as well.

So when I run the aggregate, the collection reader tries to write data to some mapped sofa, but the sofa does not yet exist. The reader is not sofa-aware, so it shouldn't have to create its initial view itself. If I use a sofa-unaware CasMultiplier instead, I suppose the same thing will happen. The reader/CasMultiplier would set the sofa of course, but since it is sofa-unaware, it wouldn't create the view.

I guess another option would be to change CollectionReaderAdapter to create any missing initial view for sofa-unaware readers. That would not have any side effects on other component types and it would solve the problem for my use-case as well. The problem is that this doesn't work, because PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess() already tries to access the mapped view and fails. Changing that method to test whether the mAnalysisComponent is a sofa-unaware CollectionReaderAdapter and to create a new view only in that case looks rather like a hack to me, although it would probably resolve the situation. I didn't test that yet, but if you think it reasonable, I can check it.
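
To make the "create the view only for a sofa-unaware reader adapter" idea concrete, here is a minimal sketch. It is not UIMA code: Component, ReaderAdapter, Annotator and resolveStartingView are hypothetical stand-ins, and the set of view names only models which sofas already exist in a CAS. It shows nothing more than the conditional described above.

```java
import java.util.HashSet;
import java.util.Set;

/** Sketch only: all types here are hypothetical stand-ins, not UIMA classes. */
class StartingViewSketch {

    /** Stand-in for a processing component inside an analysis engine. */
    interface Component {
        boolean isSofaAware();
    }

    /** Stand-in for an adapter that wraps a collection reader. */
    static class ReaderAdapter implements Component {
        private final boolean sofaAware;

        ReaderAdapter(boolean sofaAware) {
            this.sofaAware = sofaAware;
        }

        public boolean isSofaAware() {
            return sofaAware;
        }
    }

    /** Stand-in for an ordinary sofa-unaware annotator. */
    static class Annotator implements Component {
        public boolean isSofaAware() {
            return false;
        }
    }

    /**
     * Returns the mapped view name. A missing view is created only for a
     * sofa-unaware reader adapter; every other component still gets the error.
     */
    static String resolveStartingView(Set<String> existingViews, Component component,
            String mappedName) {
        if (!existingViews.contains(mappedName)) {
            boolean unawareReader =
                    component instanceof ReaderAdapter && !component.isSofaAware();
            if (!unawareReader) {
                throw new IllegalStateException(
                        "No sofaFS with name " + mappedName + " found.");
            }
            existingViews.add(mappedName); // models creating the missing view
        }
        return mappedName;
    }

    public static void main(String[] args) {
        Set<String> views = new HashSet<>();
        // The reader case: the missing view "A" is created silently.
        System.out.println(resolveStartingView(views, new ReaderAdapter(false), "A"));
        try {
            // The annotator case: a missing view is still reported.
            resolveStartingView(views, new Annotator(), "B");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The instanceof check is what makes this feel like a hack: the engine has to know about one specific component type instead of treating all components equally.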

Actually, thinking about it, I wonder if missing views should not be created on the first request in general. I have several times seen people use some helper methods that try to get a view and if an exception is thrown create the view and return it.
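
The helper pattern mentioned above (try to get the view, create it if that fails) can be sketched as follows. MiniCas is a hypothetical stand-in that only models named views. In real code the corresponding calls would presumably be CAS.getView(), CAS.createView() and a caught CASRuntimeException, but that mapping is an assumption of this sketch and is not taken from the patch.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch only: MiniCas is a hypothetical stand-in, not the UIMA CAS API. */
class ViewHelperSketch {

    /** Stand-in for a CAS that holds named views. */
    static class MiniCas {
        private final Map<String, StringBuilder> views = new HashMap<>();

        /** Fails for an unknown view, like the exception in the stack trace. */
        StringBuilder getView(String name) {
            StringBuilder view = views.get(name);
            if (view == null) {
                throw new IllegalStateException("No sofaFS with name " + name + " found.");
            }
            return view;
        }

        /** Creates a new, empty view; fails if the name is already taken. */
        StringBuilder createView(String name) {
            if (views.containsKey(name)) {
                throw new IllegalStateException("View " + name + " already exists.");
            }
            StringBuilder view = new StringBuilder();
            views.put(name, view);
            return view;
        }
    }

    /** Tries to get the view; if it does not exist, creates it and returns it. */
    static StringBuilder getOrCreateView(MiniCas cas, String name) {
        try {
            return cas.getView(name);
        } catch (IllegalStateException e) {
            return cas.createView(name);
        }
    }

    public static void main(String[] args) {
        MiniCas cas = new MiniCas();
        StringBuilder first = getOrCreateView(cas, "A");  // created on first request
        StringBuilder second = getOrCreateView(cas, "A"); // reused afterwards
        System.out.println(first == second);
    }
}
```

Note that a view created this way starts out empty, which is exactly the concern raised in the quoted message: a sofa-unaware annotator may not expect a view whose subject of analysis is unset.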

Or maybe it'd make sense to simply add the possibility to declare sofa mappings to the CollectionReaderDescription.

-- Richard


Re: [jira] [Created] (UIMA-2419) Initial view for sofa unaware components not automatically created

Posted by Marshall Schor <ms...@schor.com>.
Hmmm,  it seems to me that something is wrong if a UIMA pipeline ended up 
sending a CAS to a sofa-unaware component without a default view having been set 
up.  I would guess that in this situation, it would be better to throw an 
exception rather than hide this by automatically creating the view.   If a 
missing view is created, its subject-of-analysis would be left unset?  I'm 
guessing that most sofa-unaware annotators would not expect that, and would fail 
in mysterious ways.

What would be the use cases where it would be more valuable to create the view, 
rather than signal something's amiss?

-Marshall



On 6/9/2012 10:38 AM, Richard Eckart de Castilho (JIRA) wrote:
> Richard Eckart de Castilho created UIMA-2419:
> ------------------------------------------------
>
>               Summary: Initial view for sofa unaware components not automatically created
>                   Key: UIMA-2419
>                   URL: https://issues.apache.org/jira/browse/UIMA-2419
>               Project: UIMA
>            Issue Type: Bug
>            Components: Core Java Framework
>              Reporter: Richard Eckart de Castilho
>
>
> When running a sofa-unaware component in an aggregate analysis engine, the initial view for the component to operate on is not automatically created if it does not exist. This causes a CASRuntimeException, here "No sofaFS with name A found.".
>
> {noformat}
> org.apache.uima.analysis_engine.AnalysisEngineProcessException: Annotator processing failed.
> 	at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:394)
> 	at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:298)
> 	at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:568)
> 	at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:410)
> 	at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:343)
> 	at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:265)
> 	at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.processAndOutputNewCASes(AnalysisEngineImplBase.java:340)
>           ...
> Caused by: org.apache.uima.cas.CASRuntimeException: No sofaFS with name A found.
> 	at org.apache.uima.cas.impl.CASImpl.getSofa(CASImpl.java:661)
> 	at org.apache.uima.cas.impl.CASImpl.getView(CASImpl.java:2658)
> 	at org.apache.uima.impl.Util.getStartingView(Util.java:46)
> 	at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:349)
> 	... 31 more
> {noformat}
>
> I'd consider this a bug, because a sofa-unaware component cannot be expected to create a view itself. If the sofa-unaware component is the first one in an aggregate, e.g. acting as a reader, then there is also no other component to create the view before.
>
> If the initial view of a component it is mapped to does not exist, it should be the task of the UIMA framework to create this view.
>
> See also CPE ArtifactProducer:481 (UIMA 2.4.0).
>
> --
> This message is automatically generated by JIRA.
> If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
> For more information on JIRA, see: http://www.atlassian.com/software/jira
>
>
>

[jira] [Updated] (UIMA-2419) Initial view for sofa unaware components not automatically created

Posted by "Richard Eckart de Castilho (JIRA)" <de...@uima.apache.org>.
     [ https://issues.apache.org/jira/browse/UIMA-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Richard Eckart de Castilho updated UIMA-2419:
---------------------------------------------

    Affects Version/s: 2.4.0SDK
    
> Initial view for sofa unaware components not automatically created
> ------------------------------------------------------------------
>
>                 Key: UIMA-2419
>                 URL: https://issues.apache.org/jira/browse/UIMA-2419
>             Project: UIMA
>          Issue Type: Bug
>          Components: Core Java Framework
>    Affects Versions: 2.4.0SDK
>            Reporter: Richard Eckart de Castilho
>              Labels: patch
>         Attachments: UIMA-2419-REC-20120609.patch
>
>
> When running a sofa-unaware component in an aggregate analysis engine, the initial view for the component to operate on is not automatically created if it does not exist. This causes a CASRuntimeException, here "No sofaFS with name A found.".
> {noformat}
> org.apache.uima.analysis_engine.AnalysisEngineProcessException: Annotator processing failed.    
> 	at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:394)
> 	at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:298)
> 	at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:568)
> 	at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:410)
> 	at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:343)
> 	at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:265)
> 	at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.processAndOutputNewCASes(AnalysisEngineImplBase.java:340)
>          ...
> Caused by: org.apache.uima.cas.CASRuntimeException: No sofaFS with name A found.
> 	at org.apache.uima.cas.impl.CASImpl.getSofa(CASImpl.java:661)
> 	at org.apache.uima.cas.impl.CASImpl.getView(CASImpl.java:2658)
> 	at org.apache.uima.impl.Util.getStartingView(Util.java:46)
> 	at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:349)
> 	... 31 more
> {noformat}
> I'd consider this a bug, because a sofa-unaware component cannot be expected to create a view itself. If the sofa-unaware component is the first one in an aggregate, e.g. acting as a reader, then there is also no other component to create the view before.
> If the initial view of a component it is mapped to does not exist, it should be the task of the UIMA framework to create this view. 
> See also CPE ArtifactProducer:481 (UIMA 2.4.0).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (UIMA-2419) Initial view for sofa unaware components not automatically created

Posted by "Richard Eckart de Castilho (JIRA)" <de...@uima.apache.org>.
     [ https://issues.apache.org/jira/browse/UIMA-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Richard Eckart de Castilho updated UIMA-2419:
---------------------------------------------

    Attachment: UIMA-2419-REC-20120609.patch

Patch to create the mapped initial view if it doesn't exist yet.
                
