You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pdfbox.apache.org by "Toparvion (via GitHub)" <gi...@apache.org> on 2023/01/22 05:40:46 UTC

[GitHub] [pdfbox] Toparvion opened a new pull request, #153: PDFBOX-4189: Add a method to get GSUB data for specific script tag

Toparvion opened a new pull request, #153:
URL: https://github.com/apache/pdfbox/pull/153

   This is a proposal for overcoming current GSUB table handling limitation of FontBox library. It is currently not possible to read GSUB table data for any language other than Bengali.
   
   This PR introduces 2 new methods to `org.apache.fontbox.ttf.GlyphSubstitutionTable` class:
   
   1. `getSupportedScriptTags` to fetch a read-only view of the `scriptList` map keys;
   1. `getGsubData(String)` to obtain `GsubData` for given script tag (that can be determined by the method above).
   
   In order to make it possible to read GSUB data without actual glyph substitution, the PR also changes the behavior of `org.apache.fontbox.ttf.gsub.GsubWorkerFactory`:
   
   - **Before** the change it used to throw an exception for any non-supported language (in fact - for any one other than Bengali);
   - **After** the change it would return a default `GsubWorker` implementation that doesn't perform actual glyph transformations (substitutions) but emits a WARN message into the log.
   
   The PR introduces primarily additive changes. The previous behavior of obtaining GSUB data for the first detected supported language (and its caching) has been kept intact.   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pdfbox.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: dev-help@pdfbox.apache.org


[GitHub] [pdfbox] Toparvion commented on a diff in pull request #153: PDFBOX-4189: Add a method to get GSUB data for specific script tag

Posted by "Toparvion (via GitHub)" <gi...@apache.org>.
Toparvion commented on code in PR #153:
URL: https://github.com/apache/pdfbox/pull/153#discussion_r1089665713


##########
fontbox/src/main/java/org/apache/fontbox/ttf/gsub/DefaultGsubWorker.java:
##########
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.fontbox.ttf.gsub;
+
+import java.util.List;
+
+import org.apache.commons.logging.Log;
+import org.apache.commons.logging.LogFactory;
+import org.apache.fontbox.ttf.GlyphSubstitutionTable;
+
+/**
+ * A default implementation of {@link GsubWorker} that actually does not transform the glyphs yet
+ * allows to correctly {@linkplain GlyphSubstitutionTable#getGsubData(String) load} GSUB table
+ * data even from fonts for which a complete glyph substitution is not implemented.
+ *
+ * @author Vladimir Plizga
+ */
+class DefaultGsubWorker implements GsubWorker
+{
+    private static final Log LOG = LogFactory.getLog(DefaultGsubWorker.class);
+
+    @Override
+    public List<Integer> applyTransforms(List<Integer> originalGlyphIds)
+    {
+        LOG.warn(getClass().getSimpleName() + " class does not perform actual GSUB substitutions. "
+                + "Perhaps the selected language is not yet supported by the FontBox library.");
+        return originalGlyphIds;

Review Comment:
   Indeed. Thank you for pointing to this!
   
   In this particular case I've preferred to make the result unmodifiable as it would clearly inform the user about incorrect modification attempt (unlike copying which would "swallow" the modification thus hiding a probable bug).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pdfbox.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: dev-help@pdfbox.apache.org


[GitHub] [pdfbox] lehmi commented on a diff in pull request #153: PDFBOX-4189: Add a method to get GSUB data for specific script tag

Posted by "lehmi (via GitHub)" <gi...@apache.org>.
lehmi commented on code in PR #153:
URL: https://github.com/apache/pdfbox/pull/153#discussion_r1088628537


##########
fontbox/src/main/java/org/apache/fontbox/ttf/gsub/DefaultGsubWorker.java:
##########
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.fontbox.ttf.gsub;
+
+import java.util.List;
+
+import org.apache.commons.logging.Log;
+import org.apache.commons.logging.LogFactory;
+import org.apache.fontbox.ttf.GlyphSubstitutionTable;
+
+/**
+ * A default implementation of {@link GsubWorker} that actually does not transform the glyphs yet
+ * allows to correctly {@linkplain GlyphSubstitutionTable#getGsubData(String) load} GSUB table
+ * data even from fonts for which a complete glyph substitution is not implemented.
+ *
+ * @author Vladimir Plizga
+ */
+class DefaultGsubWorker implements GsubWorker
+{
+    private static final Log LOG = LogFactory.getLog(DefaultGsubWorker.class);
+
+    @Override
+    public List<Integer> applyTransforms(List<Integer> originalGlyphIds)
+    {
+        LOG.warn(getClass().getSimpleName() + " class does not perform actual GSUB substitutions. "
+                + "Perhaps the selected language is not yet supported by the FontBox library.");
+        return originalGlyphIds;

Review Comment:
   We should return either an unmodifiable version or a copy of the origin list, as the caller doesn't expect to get back the provided instance of the list. This may lead to unwanted changes of the origin list.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pdfbox.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: dev-help@pdfbox.apache.org


[GitHub] [pdfbox] Toparvion commented on pull request #153: PDFBOX-4189: Add a method to get GSUB data for specific script tag

Posted by "Toparvion (via GitHub)" <gi...@apache.org>.
Toparvion commented on PR #153:
URL: https://github.com/apache/pdfbox/pull/153#issuecomment-1408525920

   ✅ The fix has been [integrated](https://svn.apache.org/viewvc?view=revision&revision=1907078) into Apache Git repository by @lehmi within issue [PDFBOX-4189](https://issues.apache.org/jira/browse/PDFBOX-4189).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pdfbox.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: dev-help@pdfbox.apache.org


[GitHub] [pdfbox] Toparvion commented on a diff in pull request #153: PDFBOX-4189: Add a method to get GSUB data for specific script tag

Posted by "Toparvion (via GitHub)" <gi...@apache.org>.
Toparvion commented on code in PR #153:
URL: https://github.com/apache/pdfbox/pull/153#discussion_r1083583024


##########
fontbox/src/main/java/org/apache/fontbox/ttf/GlyphSubstitutionTable.java:
##########
@@ -701,6 +703,42 @@ public GsubData getGsubData()
         return gsubData;
     }
 
+    /**
+     * Builds a new {@link GsubData} instance for given script tag. In contrast to neighbour
+     * {@link #getGsubData()} method, this one does not try to find the first supported language
+     * and load GSUB data for it. Instead, it fetches the data for the given {@code scriptTag} (if
+     * it's supported by the font) leaving the language unspecified. It means that even after
+     * successful reading of GSUB data, the actual glyph substitution may not work if there is no
+     * corresponding {@link GsubWorker} implementation for it.
+     *
+     * @implNote This method performs searching on every invocation (no results are cached)
+     * @param scriptTag
+     * a <a href="https://learn.microsoft.com/en-us/typography/opentype/spec/scripttags">script
+     * tag</a> for which the data is needed
+     * @return GSUB data for the given script or {@code null} if no such script in the font
+     */
+    public GsubData getGsubData(String scriptTag)
+    {
+        ScriptTable scriptTable = scriptList.get(scriptTag);
+        if (scriptTable == null)
+        {
+            return null;

Review Comment:
   It could seem reasonable to align this behavior with the sibling method and return `GsubData.NO_DATA_FOUND` from here, but I find it confusable for the user because without thorough reading of JavaDoc (which is normal) it would be hard to understand than the returned result is actually "no result", while with `null` it is definitely clear.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pdfbox.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: dev-help@pdfbox.apache.org


[GitHub] [pdfbox] asfgit closed pull request #153: PDFBOX-4189: Add a method to get GSUB data for specific script tag

Posted by "asfgit (via GitHub)" <gi...@apache.org>.
asfgit closed pull request #153: PDFBOX-4189: Add a method to get GSUB data for specific script tag
URL: https://github.com/apache/pdfbox/pull/153


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pdfbox.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: dev-help@pdfbox.apache.org


[GitHub] [pdfbox] Toparvion commented on a diff in pull request #153: PDFBOX-4189: Add a method to get GSUB data for specific script tag

Posted by "Toparvion (via GitHub)" <gi...@apache.org>.
Toparvion commented on code in PR #153:
URL: https://github.com/apache/pdfbox/pull/153#discussion_r1083583024


##########
fontbox/src/main/java/org/apache/fontbox/ttf/GlyphSubstitutionTable.java:
##########
@@ -701,6 +703,42 @@ public GsubData getGsubData()
         return gsubData;
     }
 
+    /**
+     * Builds a new {@link GsubData} instance for given script tag. In contrast to neighbour
+     * {@link #getGsubData()} method, this one does not try to find the first supported language
+     * and load GSUB data for it. Instead, it fetches the data for the given {@code scriptTag} (if
+     * it's supported by the font) leaving the language unspecified. It means that even after
+     * successful reading of GSUB data, the actual glyph substitution may not work if there is no
+     * corresponding {@link GsubWorker} implementation for it.
+     *
+     * @implNote This method performs searching on every invocation (no results are cached)
+     * @param scriptTag
+     * a <a href="https://learn.microsoft.com/en-us/typography/opentype/spec/scripttags">script
+     * tag</a> for which the data is needed
+     * @return GSUB data for the given script or {@code null} if no such script in the font
+     */
+    public GsubData getGsubData(String scriptTag)
+    {
+        ScriptTable scriptTable = scriptList.get(scriptTag);
+        if (scriptTable == null)
+        {
+            return null;

Review Comment:
   It could seem reasonable to align this behavior with the sibling method and return `GsubData.NO_DATA_FOUND` from here, but I find it confusable for the user because without thorough reading of JavaDoc (which is normal) it would be hard to understand that the returned result is actually "no result", while with `null` it is definitely clear.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pdfbox.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: dev-help@pdfbox.apache.org