Posted to dev@jena.apache.org by "Crapo, Andrew (GE Global Research)" <cr...@ge.com> on 2015/06/15 14:30:36 UTC

Threading Issue in OntDocumentManager.loadImports

I posted this to users but on second thought maybe this is more appropriate to dev...

I have multiple clients accessing a common service, which runs in a multi-threaded environment. Each client will have different scenario data, e.g., OWL models SC1, SC2, SC3, etc. Each SCi model imports a common hierarchy of domain models: SCi imports tbox A, which imports tbox B, which imports tbox C. There is also a rule that inserts a new triple into an inferred model, the object value of which is specific to the data in the input model SCi.

To validate the thread safety of the server, I've created a test case with one or more threads. Each thread accesses concepts in the imported models via the thread-specific scenario model. That part of the test code is:

    for (int i = 0; i < 50; i++) {
        JenaBareThread myThread = new JenaBareThread();
        myThread.start();
    }
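
With a loop like this, an assertion failure inside a worker thread does not propagate to the thread that started it, so the test driver has to collect failures itself. A minimal sketch of such a driver, using only the JDK, might look like the following. `ThreadedTestHarness` and `runConcurrently` are hypothetical names that are not part of the attached project, and the empty lambda in `main` stands in for the body of `JenaBareThread.run()`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;

// Hypothetical helper; not part of Jena or of the attached Eclipse project.
final class ThreadedTestHarness {

    /**
     * Runs the task on threadCount threads that are released together and
     * returns every Throwable thrown by a worker (empty list if all passed).
     */
    static List<Throwable> runConcurrently(int threadCount, Runnable task)
            throws InterruptedException {
        Queue<Throwable> failures = new ConcurrentLinkedQueue<>();
        CountDownLatch startGate = new CountDownLatch(1);
        List<Thread> threads = new ArrayList<>();

        for (int i = 0; i < threadCount; i++) {
            Thread t = new Thread(() -> {
                try {
                    startGate.await();   // hold until every thread exists
                    task.run();
                } catch (Throwable e) {  // also catches AssertionError
                    failures.add(e);
                }
            }, "worker-" + i);
            threads.add(t);
            t.start();
        }

        startGate.countDown();           // release all workers at once
        for (Thread t : threads) {
            t.join();                    // wait so the failure list is complete
        }
        return new ArrayList<>(failures);
    }

    public static void main(String[] args) throws InterruptedException {
        List<Throwable> failures = runConcurrently(50, () -> {
            // Placeholder for the body of JenaBareThread.run()
        });
        System.out.println("failed threads: " + failures.size());
    }
}
```

Releasing the workers through a latch increases the overlap between threads, which tends to make timing-dependent failures easier to reproduce than starting each thread as it is created.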

There are several versions of the run method of JenaBareThread. The one that works loads one of three scenario data models, which in turn imports the tbox models, and then queries for the inferred triple and checks for the correct value for the given scenario. This code is of the form:

    public void run() {
        String modelFolderDir = "file:/C:tmp/DataModels/Shapes/OwlModels";
        String abox = modelFolderDir + "/Test" + (getInoutidx() + 1) + ".owl";
        String ontPolicy = modelFolderDir + "/ont-policy.rdf";

        // Load the pre-existing scenario model; its imports pull in the tbox models.
        OntModel dataModel = ModelFactory.createOntologyModel(OntModelSpec.OWL_MEM);
        try {
            dataModel.enterCriticalSection(Lock.WRITE);
            dataModel.getDocumentManager().setProcessImports(true);
            dataModel.read(abox);
        }
        catch (Throwable t) {
            t.printStackTrace();
        }
        finally {
            dataModel.leaveCriticalSection();
        }

        String rulefn = modelFolderDir + "/Rule.rules";
        List<Rule> ruleList = new ArrayList<Rule>();
        try {
            InputStream in = FileManager.get().open(rulefn);
            if (in != null) {
                ... (load rule)
            }
            // Apply the rule to the data model and query for the inferred triple.
            GenericRuleReasoner reasoner = new GenericRuleReasoner(ruleList);
            reasoner.setDerivationLogging(true);
            reasoner.setMode(GenericRuleReasoner.HYBRID);
            InfModel infModel = ModelFactory.createInfModel(reasoner, dataModel);
            String askQuery = "select ?s ?a where {?s <http://sadl.org/Shapes/Shapes#area> ?a}";
            QueryExecution qexec = QueryExecutionFactory.create(QueryFactory.create(askQuery, Syntax.syntaxARQ), infModel);
            ResultSet results = qexec.execSelect();
            ... (test results)
        }
        catch (RulesetNotFoundException e) {
            e.printStackTrace();
        }
        catch (HttpException e) {
            e.printStackTrace();
        }
    }

In this test the scenario models are pre-existing and the multi-threaded test passes for all threads regardless of the number of threads. However, I really need to construct each scenario model with data coming from the client. This I am unable to do without encountering various errors which appear to be due to threading issues.

There are 3 other versions of the test, each trying a different way of creating a scenario model with data not already in an OWL file. For each, if I run 1 thread the test passes. If I set breakpoints and run multiple threads, the test passes for each thread. Otherwise I get errors. Several of the problems appear to occur when a call is made to find a concept in one of the imported models. For example, one version has this in the run method:

    long timestamp = System.currentTimeMillis();
    String tsStr = "" + timestamp;
    OntModel dataModel = ModelFactory.createOntologyModel(OntModelSpec.OWL_MEM);
    try {
        dataModel.enterCriticalSection(Lock.WRITE);
        dataModel.setDynamicImports(true);
        ReadFailureHandler rfHandler = new SadlReadFailureHandler(logger);
        dataModel.getDocumentManager().setReadFailureHandler(rfHandler);
        dataModel.getDocumentManager().setMetadataSearchPath(ontPolicy, true);
        dataModel.getDocumentManager().setProcessImports(true);
        dataModel.createOntology(aboxNS);
        Ontology ont = dataModel.getOntology(aboxNS);
        Resource imp = dataModel.createResource(ruleNS);
        ont.addImport(imp);
        OntClass cls = dataModel.getOntClass(rectNS + "#Rectangle");
        Individual inst = dataModel.createIndividual(aboxNS + "#MyRect", cls);
        assertNotNull(inst);
        OntProperty height = dataModel.getOntProperty(rectNS + "#height");
        assertNotNull(height);
        OntProperty width = dataModel.getOntProperty(rectNS + "#width");
        assertNotNull(width);
        String last3 = tsStr.substring(tsStr.length() - 3);
        float w = Float.parseFloat(last3);
        float h = w + 10;
        area = h * w;
        inst.addProperty(height, dataModel.createTypedLiteral(w));
        inst.addProperty(width,  dataModel.createTypedLiteral(h));

On a few threads, at least one of the calls to dataModel.getOntClass or dataModel.getOntProperty returns null. This is because MultiUnion.graphBaseContains(Triple t) returns false.

The stack trace looks like this:

                MultiUnion.graphBaseContains(Triple) line: 132
                MultiUnion(GraphBase).contains(Triple) line: 321
                MultiUnion(GraphBase).contains(Node, Node, Node) line: 338
                OntModelImpl.findByURIAs(String, Class<T>) line: 2873
                OntModelImpl.getOntClass(String) line: 908
                JenaBareThread3.run() line: 76

Further examination shows that the m_subGraphs ArrayList contains an incomplete set of imports, sometimes only 2, sometimes 3 and sometimes the 4 that it should contain. This appears to be the case because the transitive closure of imports performed by OntDocumentManager.loadImports does not always load all of the indirect imports.
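
For reference, the transitive closure of imports that a loader is expected to produce can be sketched independently of Jena as a worklist traversal. This is only an illustration under stated assumptions, not the code in OntDocumentManager.loadImports: `ImportClosure`, `closure`, and the names "SC1", "A", "B" and "C" are hypothetical and simply mirror the import chain described at the top of this message.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; not Jena's implementation.
final class ImportClosure {

    /**
     * Returns every document reachable from root through the direct-import
     * map, excluding root itself. Cycles are tolerated.
     */
    static Set<String> closure(String root, Map<String, List<String>> directImports) {
        Set<String> loaded = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.add(root);

        while (!pending.isEmpty()) {
            String uri = pending.remove();
            List<String> imports =
                    directImports.getOrDefault(uri, Collections.<String>emptyList());
            for (String imported : imports) {
                // add() returns false if already seen, so each document
                // is queued at most once
                if (loaded.add(imported)) {
                    pending.add(imported);
                }
            }
        }
        loaded.remove(root); // only present if an import cycle leads back to root
        return loaded;
    }

    public static void main(String[] args) {
        Map<String, List<String>> imports = new HashMap<>();
        imports.put("SC1", Arrays.asList("A")); // scenario model imports tbox A
        imports.put("A", Arrays.asList("B"));   // tbox A imports tbox B
        imports.put("B", Arrays.asList("C"));   // tbox B imports tbox C
        System.out.println(closure("SC1", imports));
    }
}
```

Comparing an expected set like this against what each thread's model actually reports, for example via `OntModel.listImportedOntologyURIs(true)`, may help narrow down which indirect import goes missing on the failing threads.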

I will try attaching a ZIP containing an export of the Eclipse project containing all tests and the models required. It can be imported into Eclipse and only the one line setting the location of the "modelFolderDir" will need adjusting to the path to which the project is imported. Since my recollection is that attachments here will not work, I've also shared this ZIP file publicly on Google Drive at https://drive.google.com/file/d/0B7lXpNc8yGmTWFZZZzRHOWdtZTQ/view?usp=sharing.

Any insight into why this doesn't work/how to create data models in a thread-safe manner would be greatly appreciated!

Andrew W Crapo
Information Scientist
GE Global Research

T +1 518 387-5729
crapo@research.ge.com

One Research Circle
Niskayuna, NY 12302 USA

GE imagination at work


Re: Threading Issue in OntDocumentManager.loadImports

Posted by Andy Seaborne <an...@apache.org>.
Andrew,

I may have something running and illustrating something.  It is not
quite as simple as changing modelFolderDir: ont-policy.rdf has E:/
references, and your description has file:/C:tmp/.  I still get
intermittent 404s where the ont-policy isn't being applied.

I'm not seeing what you see though.  I get 404s or concurrent 
modification exceptions.

Can we focus on one test case? Which is the simplest?  JenaBareThread2
looks closest to the code in your message and works (?) for me.

TestJenaBareThreads gives me the exceptions. Is that a suitable focus?

Is the rule reading part necessary to illustrate the problem?  Not 
knowing this part of the system, reducing to something manageable is 
necessary for me.

	Andy
