You are viewing a plain text version of this content. The canonical link for it is here.
Posted to pr@jena.apache.org by GitBox <gi...@apache.org> on 2022/11/25 10:32:30 UTC
[GitHub] [jena] rvesse commented on a diff in pull request #1637: apacheGH-1636 add support for semantic actions

rvesse commented on code in PR #1637:
URL: https://github.com/apache/jena/pull/1637#discussion_r1032221065


##########
jena-shex/src/main/java/org/apache/jena/shex/parser/ParserShExC.java:
##########
@@ -632,6 +641,48 @@ protected void numericFacetLength(String facetKind, int length, int line, int co
         addNodeConstraint(numLength);
     }
 
+    protected SemAct crackSemanticAction(String iriAndCode, int line, int column) {
+        // e.g. % <http://shex.io/extensions/Test/> { print(s) %}
+        // or   % ex:Test { print(s) %}
+        // or   %<http://shex.io/extensions/Test/>%
+
+        // Trim leading '%' and ws and trim trailing '%}'
+        String whitespaces = " \t\n\r\f";
+        int startOfIri = 1; // get past '%'
+        for (; whitespaces.indexOf(iriAndCode.charAt(startOfIri)) != -1; ++startOfIri)

Review Comment:
   This all seems a bit brittle e.g. what if there's excess whitespace anywhere?
   
   We have a commons-lang dependency already so `StringUtils.stripStart()` would be more robust



##########
jena-shex/src/main/java/org/apache/jena/shex/sys/ShexValidatorImpl.java:
##########
@@ -18,20 +18,31 @@
 
 package org.apache.jena.shex.sys;
 
-import java.util.ArrayList;
-import java.util.Collection;
-import java.util.List;
-import java.util.Objects;
+import java.util.*;
 
 import org.apache.jena.atlas.lib.InternalErrorException;
 import org.apache.jena.atlas.lib.ListUtils;
 import org.apache.jena.graph.Graph;
 import org.apache.jena.graph.Node;
 import org.apache.jena.graph.Triple;
 import org.apache.jena.shex.*;
+import org.apache.jena.shex.semact.SemanticActionPlugin;
 
 class ShexValidatorImpl implements ShexValidator{
 
+    private Map<String,SemanticActionPlugin> semanticActionPluginIndex;
+
+    ShexValidatorImpl() {}
+
+    ShexValidatorImpl(Map<String,SemanticActionPlugin> semActPluginIndex) {
+        this.semanticActionPluginIndex = semActPluginIndex;
+    }
+
+    private static ShexValidator shexValidator = new ShexValidatorImpl();
+    public static ShexValidator lightlyUsedInstance() {

Review Comment:
   Why? I didn't see this used anywhere or if it was I missed it



##########
jena-shex/src/main/java/org/apache/jena/shex/sys/ValidationContext.java:
##########
@@ -79,37 +127,81 @@ public void startValidate(ShexShape shape, Node data) {
 
     // Return true if done or in-progress (i.e. don't walk further)
     public boolean cycle(ShexShape shape, Node data) {
-        return inProgress.stream().anyMatch(p->p.equalElts(data, shape));
+        return inProgress.stream().anyMatch(p -> p.equalElts(data, shape));
+    }
+
+    public boolean dispatchStartSemanticAction(ShexSchema schema, ValidationContext vCxt) {
+        return !schema.getSemActs().stream().anyMatch(semAct -> {
+            String semActIri = semAct.getIri();
+            SemanticActionPlugin semActPlugin = this.semActPluginIndex.get(semActIri);
+            if (semActPlugin != null) {
+                if (!semActPlugin.evaluateStart(semAct, schema)) {
+                    vCxt.reportEntry(new ReportItem(String.format("%s start shape failed", semActIri), null));
+                    return true;
+                }
+            }

Review Comment:
   It would be helpful to report that a semantic action plugin was unavailable so users have some indication of why validation failed



##########
jena-shex/src/test/files/spec/schemas/1dotCode3fail.shex:
##########
@@ -1,6 +1,11 @@
+PREFIX p: <http://shex.io/extensions/>

Review Comment:
   Per my comments elsewhere it would be useful to have a variety of test cases here with variations in whitespace, use of URIs vs prefixed names to identify plugins etc, to check that the parser modifications are sufficiently robust
   
   Although looking at other changes perhaps the reduction in test exclusions is already picking up a selection of pre-existing test cases that were previously skipped?



##########
jena-shex/src/main/java/org/apache/jena/shex/parser/javacc/ShExJavacc.java:
##########
@@ -1,4 +1,3 @@
-/* ShExJavacc.java */

Review Comment:
   Again unclear how much of the diff here is due to using an older JavaCC version vs actual functional changes from the grammar changes



##########
jena-shex/src/test/java/org/apache/jena/shex/runner/ShexTests.java:
##########
@@ -329,10 +325,41 @@ private static boolean runTestExclusionsInclusions(ManifestEntry entry) {
                 //System.err.println("Skipping:  "+fragment);
                 return false;
             }
+
+            List<Resource> traits = extractTraits(entry);
+            if (traits != null) {
+                List<Resource> excludedBecause = traits.stream().filter(excl -> excludeTraits.contains(excl)).collect(Collectors.toList());

Review Comment:
   Think you can simplify the whole interior of this if block to just `return !traits.stream().anyMatch(excl -> excludeTraits.contains(excl));`



##########
jena-shex/src/test/java/org/apache/jena/shex/runner/ShexT.java:
##########
@@ -74,4 +81,12 @@ public class ShexT {
     public final static Property sx_json = ResourceFactory.createProperty(NS_SX + "json");
     public final static Property sx_ttl  = ResourceFactory.createProperty(NS_SX + "ttl");
 
+
+    /** <p>The test statusThe expected output of semantic actions</p> */

Review Comment:
   Some copy-paste leftovers here?



##########
jena-shex/src/main/java/org/apache/jena/shex/parser/javacc/ParseException.java:
##########
@@ -1,5 +1,5 @@
-/* Generated By:JavaCC: Do not edit this line. ParseException.java Version 7.0 */
-/* JavaCCOptions:KEEP_LINE_COLUMN=true */
+/* Generated By:JavaCC: Do not edit this line. ParseException.java Version 5.0 */

Review Comment:
   I believe we moved to using JavaCC 7 as there's known deficiencies in the parsers generated by older JavaCC versions, could you please regenerate with that as that will reduce unnecessary diff content in this large PR



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org
For additional commands, e-mail: pr-help@jena.apache.org