You are viewing a plain text version of this content. The canonical link for it is here.
Posted to pr@jena.apache.org by "afs (via GitHub)" <gi...@apache.org> on 2023/03/29 16:54:58 UTC

[GitHub] [jena] afs opened a new pull request, #1818: Misc

afs opened a new pull request, #1818:
URL: https://github.com/apache/jena/pull/1818

   Pull request Description:
   Miscellaneous small things accumulated while 
   
   [Count triples in a quads stream](https://github.com/apache/jena/commit/9864de292da5b7808f9609fd545021ba6ebcb853) which is a bug fix for reporting upload stats in Fuseki.
   
   [Fix generating file name when running on MS Windows](https://github.com/apache/jena/commit/e3c02a6612f0daad703176e9b3574e5773481c17) is test related and should make the GH/Windows action work.
   
   [Document and test bnode-iri round trip](https://github.com/apache/jena/commit/a1cf0685c3bd8610dad76a24d9cd29cbc49fb718) adds library functions that are missing.
   
   ----
   
   By submitting this pull request, I acknowledge that I am making a contribution to the Apache Software Foundation under the terms and conditions of the [Contributor's Agreement](https://www.apache.org/licenses/contributor-agreements.html).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org
For additional commands, e-mail: pr-help@jena.apache.org


[GitHub] [jena] afs merged pull request #1818: Misc

Posted by "afs (via GitHub)" <gi...@apache.org>.
afs merged PR #1818:
URL: https://github.com/apache/jena/pull/1818


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org
For additional commands, e-mail: pr-help@jena.apache.org


[GitHub] [jena] afs commented on a diff in pull request #1818: Misc

Posted by "afs (via GitHub)" <gi...@apache.org>.
afs commented on code in PR #1818:
URL: https://github.com/apache/jena/pull/1818#discussion_r1152443056


##########
jena-arq/src/main/java/org/apache/jena/riot/out/NodeFmtLib.java:
##########
@@ -195,19 +198,24 @@ private static String displayStrNodes(Node...nodes) {
     // ---- Blank node labels.
 
     // Strict N-triples only allows [A-Za-z][A-Za-z0-9]
-    static char encodeMarkerChar = 'X';
+    private static final char encodeMarkerChar = 'X';
 
     // These two form a pair to convert bNode labels to a safe (i.e. legal N-triples form) and back again.
 
-    // Encoding is:
-    // 1 - Add a Letter
-    // 2 - Hexify, as Xnn, anything outside ASCII A-Za-z0-9
-    // 3 - X is encoded as XX
+    /**
+     *  Encoding is:
+     *  <ul>
+     *  <li> Add a start letter
+     *  <li> Hexify, as Xnn, anything outside ASCII A-Za-z0-9
+     *  <li> X is encoded as XX
+     */
 
-    private static char LabelLeadingLetter = 'B';
+    private static final char LabelLeadingLetter = 'B';
 
     public static String encodeBNodeLabel(String label) {
-        StringBuilder buff = new StringBuilder();
+        // A UUID string is 36 characters including 4 dashes.
+        // Together with the leading 'B', a total of 45 characters + any additional encoding needed.
+        StringBuilder buff = new StringBuilder(48);

Review Comment:
   The import thing is to avoid reallocation in the StringBuilder in common usage (UUID strings).
   
   Dashes get X-encoded so the overall is 45 -- a dash is "X2D".
   
   `878b5787-2c84-4624-a237-ad6fc35ec260`
   becomes 
   `B878b5787X2D2c84X2D4624X2Da237X2Dad6fc35ec260` which is 45.
   
   So the buffer must be >=45 for UUIDs.
   
   48 is a multiple of 8 (possible cache effect) and more than 45.
   
   The label isn't required to be a UUID.
   
   😄 
   
   Comment revised.
   



##########
jena-arq/src/main/java/org/apache/jena/riot/system/RiotLib.java:
##########
@@ -85,6 +85,16 @@ public static String blankNodeToIriString(Node node) {
         throw new RiotException("Not a blank node or URI");
     }
 
+    /**
+     * Convert a ARQ-encoded blank node URI nod to a blank node, otehrwise return the argument node unchanged.

Review Comment:
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org
For additional commands, e-mail: pr-help@jena.apache.org


[GitHub] [jena] afs commented on a diff in pull request #1818: Misc

Posted by "afs (via GitHub)" <gi...@apache.org>.
afs commented on code in PR #1818:
URL: https://github.com/apache/jena/pull/1818#discussion_r1152449499


##########
jena-arq/src/main/java/org/apache/jena/riot/system/RiotLib.java:
##########
@@ -107,6 +117,15 @@ public static boolean isBNodeIRI(String iri) {
         return iri.startsWith(bNodeLabelStart);
     }
 
+    /**
+     * Test whether a node is an ARQ-encoded blank node IRI.
+     */
+    public static boolean isBNodeIRI(Node node) {
+        if ( ! node.isURI() )
+            return false;
+        return isBNodeIRI(node.getURI());

Review Comment:
   Style choice. The `if` is a guard on the argument, then a test on the `getURI`. Think of it as lisp `(cond)` inspired‼️
   
   I'm not a fan of (IMO) unnecessary brevity in compound expressions because multiple lines given multiple debug points.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org
For additional commands, e-mail: pr-help@jena.apache.org


[GitHub] [jena] kinow commented on a diff in pull request #1818: Misc

Posted by "kinow (via GitHub)" <gi...@apache.org>.
kinow commented on code in PR #1818:
URL: https://github.com/apache/jena/pull/1818#discussion_r1152471301


##########
jena-arq/src/main/java/org/apache/jena/riot/out/NodeFmtLib.java:
##########
@@ -195,19 +198,24 @@ private static String displayStrNodes(Node...nodes) {
     // ---- Blank node labels.
 
     // Strict N-triples only allows [A-Za-z][A-Za-z0-9]
-    static char encodeMarkerChar = 'X';
+    private static final char encodeMarkerChar = 'X';
 
     // These two form a pair to convert bNode labels to a safe (i.e. legal N-triples form) and back again.
 
-    // Encoding is:
-    // 1 - Add a Letter
-    // 2 - Hexify, as Xnn, anything outside ASCII A-Za-z0-9
-    // 3 - X is encoded as XX
+    /**
+     *  Encoding is:
+     *  <ul>
+     *  <li> Add a start letter
+     *  <li> Hexify, as Xnn, anything outside ASCII A-Za-z0-9
+     *  <li> X is encoded as XX
+     */
 
-    private static char LabelLeadingLetter = 'B';
+    private static final char LabelLeadingLetter = 'B';
 
     public static String encodeBNodeLabel(String label) {
-        StringBuilder buff = new StringBuilder();
+        // A UUID string is 36 characters including 4 dashes.
+        // Together with the leading 'B', a total of 45 characters + any additional encoding needed.
+        StringBuilder buff = new StringBuilder(48);

Review Comment:
   Aaaahhh. Thanks!!!!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org
For additional commands, e-mail: pr-help@jena.apache.org


[GitHub] [jena] kinow commented on a diff in pull request #1818: Misc

Posted by "kinow (via GitHub)" <gi...@apache.org>.
kinow commented on code in PR #1818:
URL: https://github.com/apache/jena/pull/1818#discussion_r1152416980


##########
jena-arq/src/main/java/org/apache/jena/riot/out/NodeFmtLib.java:
##########
@@ -195,19 +198,24 @@ private static String displayStrNodes(Node...nodes) {
     // ---- Blank node labels.
 
     // Strict N-triples only allows [A-Za-z][A-Za-z0-9]
-    static char encodeMarkerChar = 'X';
+    private static final char encodeMarkerChar = 'X';
 
     // These two form a pair to convert bNode labels to a safe (i.e. legal N-triples form) and back again.
 
-    // Encoding is:
-    // 1 - Add a Letter
-    // 2 - Hexify, as Xnn, anything outside ASCII A-Za-z0-9
-    // 3 - X is encoded as XX
+    /**
+     *  Encoding is:
+     *  <ul>
+     *  <li> Add a start letter
+     *  <li> Hexify, as Xnn, anything outside ASCII A-Za-z0-9
+     *  <li> X is encoded as XX
+     */
 
-    private static char LabelLeadingLetter = 'B';
+    private static final char LabelLeadingLetter = 'B';
 
     public static String encodeBNodeLabel(String label) {
-        StringBuilder buff = new StringBuilder();
+        // A UUID string is 36 characters including 4 dashes.
+        // Together with the leading 'B', a total of 45 characters + any additional encoding needed.
+        StringBuilder buff = new StringBuilder(48);

Review Comment:
   >A UUID string is 36 characters 
   
   `36`
   
   >including 4 dashes.
   
   `36 + 4 = 40`
   
   >Together with the leading 'B'
   
   `40 + 1 = 41 ?`
   
   >a total of 45 characters + any additional encoding needed.
   
   `45`?
   
   >new StringBuilder(48);
   
   `48`? :sweat_smile: 
   
   Is the `48` just to have some buffer?
   



##########
jena-arq/src/main/java/org/apache/jena/riot/system/RiotLib.java:
##########
@@ -85,6 +85,16 @@ public static String blankNodeToIriString(Node node) {
         throw new RiotException("Not a blank node or URI");
     }
 
+    /**
+     * Convert a ARQ-encoded blank node URI nod to a blank node, otehrwise return the argument node unchanged.

Review Comment:
   s/a ARQ/an ARQ if we say it like ark/arc? I actually say it that way but no idea what's the correct way.
   
   And I think the `nod` can be deleted?
   
   And s/otehrwise/otherwise



##########
jena-arq/src/main/java/org/apache/jena/riot/system/RiotLib.java:
##########
@@ -107,6 +117,15 @@ public static boolean isBNodeIRI(String iri) {
         return iri.startsWith(bNodeLabelStart);
     }
 
+    /**
+     * Test whether a node is an ARQ-encoded blank node IRI.
+     */
+    public static boolean isBNodeIRI(Node node) {
+        if ( ! node.isURI() )
+            return false;
+        return isBNodeIRI(node.getURI());

Review Comment:
   Or `return node.isURI() && isBNodeIRI(node.getURI())`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org
For additional commands, e-mail: pr-help@jena.apache.org


[GitHub] [jena] afs commented on a diff in pull request #1818: Misc

Posted by "afs (via GitHub)" <gi...@apache.org>.
afs commented on code in PR #1818:
URL: https://github.com/apache/jena/pull/1818#discussion_r1152443056


##########
jena-arq/src/main/java/org/apache/jena/riot/out/NodeFmtLib.java:
##########
@@ -195,19 +198,24 @@ private static String displayStrNodes(Node...nodes) {
     // ---- Blank node labels.
 
     // Strict N-triples only allows [A-Za-z][A-Za-z0-9]
-    static char encodeMarkerChar = 'X';
+    private static final char encodeMarkerChar = 'X';
 
     // These two form a pair to convert bNode labels to a safe (i.e. legal N-triples form) and back again.
 
-    // Encoding is:
-    // 1 - Add a Letter
-    // 2 - Hexify, as Xnn, anything outside ASCII A-Za-z0-9
-    // 3 - X is encoded as XX
+    /**
+     *  Encoding is:
+     *  <ul>
+     *  <li> Add a start letter
+     *  <li> Hexify, as Xnn, anything outside ASCII A-Za-z0-9
+     *  <li> X is encoded as XX
+     */
 
-    private static char LabelLeadingLetter = 'B';
+    private static final char LabelLeadingLetter = 'B';
 
     public static String encodeBNodeLabel(String label) {
-        StringBuilder buff = new StringBuilder();
+        // A UUID string is 36 characters including 4 dashes.
+        // Together with the leading 'B', a total of 45 characters + any additional encoding needed.
+        StringBuilder buff = new StringBuilder(48);

Review Comment:
   The import thing is to avoid reallocation in the StringBuilder in common usage (UUID strings).
   
   Dashes get X-encoded so the overall is 45 -- a dash is "X2D".
   
   `878b5787-2c84-4624-a237-ad6fc35ec260`
   becomes 
   `B878b5787X2D2c84X2D4624X2Da237X2Dad6fc35ec260` which is 45.
   
   So the buffer must be >=45 for UUIDs.
   
   48 is a multiple of 8 (possible cache effect) and more than 45.
   
   The label isn't required to be a UUID.
   
   😄 
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org
For additional commands, e-mail: pr-help@jena.apache.org


[GitHub] [jena] kinow commented on a diff in pull request #1818: Misc

Posted by "kinow (via GitHub)" <gi...@apache.org>.
kinow commented on code in PR #1818:
URL: https://github.com/apache/jena/pull/1818#discussion_r1152515740


##########
jena-arq/src/main/java/org/apache/jena/riot/out/NodeFmtLib.java:
##########
@@ -195,19 +198,24 @@ private static String displayStrNodes(Node...nodes) {
     // ---- Blank node labels.
 
     // Strict N-triples only allows [A-Za-z][A-Za-z0-9]
-    static char encodeMarkerChar = 'X';
+    private static final char encodeMarkerChar = 'X';
 
     // These two form a pair to convert bNode labels to a safe (i.e. legal N-triples form) and back again.
 
-    // Encoding is:
-    // 1 - Add a Letter
-    // 2 - Hexify, as Xnn, anything outside ASCII A-Za-z0-9
-    // 3 - X is encoded as XX
+    /**
+     *  Encoding is:
+     *  <ul>
+     *  <li> Add a start letter
+     *  <li> Hexify, as Xnn, anything outside ASCII A-Za-z0-9
+     *  <li> X is encoded as XX
+     */
 
-    private static char LabelLeadingLetter = 'B';
+    private static final char LabelLeadingLetter = 'B';
 
     public static String encodeBNodeLabel(String label) {
-        StringBuilder buff = new StringBuilder();
+        // A UUID string is 36 characters including 4 dashes.
+        // Together with the leading 'B', a total of 45 characters + any additional encoding needed.
+        StringBuilder buff = new StringBuilder(48);

Review Comment:
   s/gerater/greater



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org
For additional commands, e-mail: pr-help@jena.apache.org