Posted to dev@uima.apache.org by "Marshall Schor (JIRA)" <de...@uima.apache.org> on 2013/07/01 20:40:23 UTC

[jira] [Resolved] (UIMA-3017) Getting feature value from feature structure longer than expected

     [ https://issues.apache.org/jira/browse/UIMA-3017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marshall Schor resolved UIMA-3017.
----------------------------------

    Resolution: Later
    
> Getting feature value from feature structure longer than expected
> -----------------------------------------------------------------
>
>                 Key: UIMA-3017
>                 URL: https://issues.apache.org/jira/browse/UIMA-3017
>             Project: UIMA
>          Issue Type: Improvement
>          Components: Core Java Framework
>    Affects Versions: 2.3
>         Environment: Linux x86_64
>            Reporter: Mike Barborak
>            Priority: Minor
>
> Should getting the value of a feature from a feature structure be fast? Intuitively, I would expect it to perform about as well as getting an entry from a Java HashMap, or better, but in my experiments it is about 8 times slower. To work around this, I wrap my feature structures in caching Java code, but there might be an opportunity to speed up UIMA generally.
> My test creates a CAS containing a single feature structure. It sets a string feature on that feature structure and then simply gets the value of that feature in a tight loop. I compare that to an instance of a Java class that holds an internal HashMap of strings to strings; there, a method on the instance is called in an equally tight loop to get an entry from the map.
> I run 5 rounds of each loop, with 100,000,000 iterations per round. The total times for the rounds involving the CAS were:
> round 0 total time 1: 7.520104509s
> round 1 total time 1: 6.812214938s
> round 2 total time 1: 6.882752307s
> round 3 total time 1: 6.728515004s
> round 4 total time 1: 6.813674956s
> The total times for the rounds just using the Java class were:
> round 0 total time 2: 0.847296054s
> round 1 total time 2: 0.814570347s
> round 2 total time 2: 0.814399859s
> round 3 total time 2: 0.814189383s
> round 4 total time 2: 0.814979357s
> Here is my Java code:
> {code:title=MyTest.java}
> package test;
> import java.io.InputStream;
> import java.util.HashMap;
> import java.util.Map;
> import org.apache.uima.UIMAFramework;
> import org.apache.uima.cas.CAS;
> import org.apache.uima.cas.Feature;
> import org.apache.uima.cas.FeatureStructure;
> import org.apache.uima.cas.Type;
> import org.apache.uima.resource.metadata.TypeSystemDescription;
> import org.apache.uima.util.CasCreationUtils;
> import org.apache.uima.util.XMLInputSource;
> public class MyTest {
>   
>   static class MyClass {
>     Map<String, String> myFeatures = new HashMap<String, String>();
>     
>     void setStringValue(String feature, String value) {
>       myFeatures.put(feature, value);
>     }
>     
>     String getStringValue(String feature) {
>       return myFeatures.get(feature);
>     }
>   }
>   
>   static public void main(String[] argv) throws Exception {
>     InputStream stream = MyTest.class.getClassLoader().getResourceAsStream("MyTypes.xml");
>     TypeSystemDescription typeSystemDescription = UIMAFramework.getXMLParser().parseTypeSystemDescription(new XMLInputSource(stream, null));
>     CAS cas = CasCreationUtils.createCas(typeSystemDescription, null, null);
>     Type myType = cas.getTypeSystem().getType("MyType");
>     FeatureStructure fs = cas.createFS(myType);
>     Feature myFeature = myType.getFeatureByBaseName("myFeature");
>     fs.setStringValue(myFeature, "myString");
>     cas.addFsToIndexes(fs);
>     
>     MyClass myInstance = new MyClass();
>     myInstance.setStringValue("myFeature2", "myString2");
>     
>     long iterations = 100000000;
>     double nanoSecsPerSec = 1000000000.0d;
>     
>     for (int round = 0; round < 5; round++) {
>       long start = System.nanoTime();
>       for (long i = 0; i < iterations; i++) {
>         fs.getStringValue(myFeature);
>       }
>       long end = System.nanoTime();
>       System.out.println("round " + round + " total time 1: " + ((end - start) / nanoSecsPerSec) + "s");
>     }
>       
>     for (int round = 0; round < 5; round++) {
>       long start = System.nanoTime();
>       for (long i = 0; i < iterations; i++) {
>         myInstance.getStringValue("myFeature2");
>       }
>       long end = System.nanoTime();
>       System.out.println("round " + round + " total time 2: " + ((end - start) / nanoSecsPerSec) + "s");
>     }
>   }
> }
> {code}
> Here is my type descriptor:
> {code:xml}
> <?xml version="1.0" encoding="UTF-8"?>
> <typeSystemDescription xmlns="http://uima.apache.org/resourceSpecifier">
>   <name>MyTypes</name>
>   <description/>
>   <version>1.0</version>
>   <vendor/>
>   <types>
>     <typeDescription>
>       <name>MyType</name>
>       <description/>
>       <supertypeName>uima.cas.TOP</supertypeName>
>       <features>
>         <featureDescription>
>           <name>myFeature</name>
>           <description></description>
>           <rangeTypeName>uima.cas.String</rangeTypeName>
>         </featureDescription>
>       </features>
>     </typeDescription>
>   </types>
> </typeSystemDescription>
> {code}
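
For scale, the round timings quoted above can be turned into a per-call cost. Each round runs 100,000,000 iterations, so a round-1 time of about 6.81 s for the CAS loop is roughly 68 ns per getStringValue call, against roughly 8 ns per HashMap lookup, a ratio of about 8.4. The sketch below only redoes that arithmetic; the class and method names are made up for illustration and are not part of UIMA or of the reporter's test.

```java
public class PerCallCost {

  /** Converts one round's total time in seconds to nanoseconds per call. */
  static double nanosPerCall(double totalSeconds, long iterations) {
    return totalSeconds * 1_000_000_000.0 / iterations;
  }

  public static void main(String[] args) {
    long iterations = 100_000_000L;  // iteration count from the reporter's test
    double casRound1 = 6.812214938;  // "round 1 total time 1", in seconds
    double mapRound1 = 0.814570347;  // "round 1 total time 2", in seconds

    double casNs = nanosPerCall(casRound1, iterations);
    double mapNs = nanosPerCall(mapRound1, iterations);

    System.out.printf("CAS:     %.1f ns per call%n", casNs);
    System.out.printf("HashMap: %.1f ns per call%n", mapNs);
    System.out.printf("ratio:   %.1f%n", casNs / mapNs);
  }
}
```

Round 1 is used rather than round 0 because the first round of each loop is visibly slower, which is most likely JIT warm-up.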

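The description mentions wrapping feature structures with caching Java code but does not show it. Below is a minimal sketch of what such a wrapper might look like. It is not the reporter's actual code: CachingReader and its methods are hypothetical names, it assumes Java 8 or later for java.util.function.Function, and a plain HashMap stands in for the feature structure so that the example runs without UIMA on the classpath. With UIMA, the lookup passed in would presumably be something like feature -> fs.getStringValue(feature).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class CachingReaderSketch {

  /** Hypothetical wrapper that memoizes values returned by a slower lookup. */
  public static final class CachingReader<K, V> {
    private final Function<K, V> slowLookup;
    private final Map<K, V> cache = new HashMap<>();
    private int misses = 0;

    public CachingReader(Function<K, V> slowLookup) {
      this.slowLookup = slowLookup;
    }

    /** Returns the cached value, calling the slow lookup only on a miss. */
    public V get(K key) {
      if (cache.containsKey(key)) {  // containsKey so that cached nulls count as hits
        return cache.get(key);
      }
      misses++;
      V value = slowLookup.apply(key);
      cache.put(key, value);
      return value;
    }

    /** Must be called whenever the underlying value is changed. */
    public void invalidate(K key) {
      cache.remove(key);
    }

    public int misses() {
      return misses;
    }
  }

  public static void main(String[] args) {
    // Stand-in for a feature structure; an assumption made for this sketch only.
    Map<String, String> backing = new HashMap<>();
    backing.put("myFeature", "myString");

    CachingReader<String, String> reader = new CachingReader<>(backing::get);
    for (int i = 0; i < 1000; i++) {
      reader.get("myFeature");
    }
    System.out.println("value:  " + reader.get("myFeature"));
    System.out.println("misses: " + reader.misses());
  }
}
```

A cache like this only pays off when the same feature is read many times between writes. Any code that sets the feature also has to call invalidate, otherwise stale values are returned.
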
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira