You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@daffodil.apache.org by GitBox <gi...@apache.org> on 2021/02/11 09:01:45 UTC

[GitHub] [incubator-daffodil] susmita-m145 opened a new pull request #490: Unsupported characterset

susmita-m145 opened a new pull request #490:
URL: https://github.com/apache/incubator-daffodil/pull/490


   Jira- DAFFODIL-2049


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-daffodil] stevedlawrence commented on a change in pull request #490: Unsupported characterset

Posted by GitBox <gi...@apache.org>.
stevedlawrence commented on a change in pull request #490:
URL: https://github.com/apache/incubator-daffodil/pull/490#discussion_r574637772



##########
File path: daffodil-io/src/main/scala/org/apache/daffodil/processors/charset/IBM1047.scala
##########
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.daffodil.processors.charset
+
+import org.apache.daffodil.io.InputSourceDataInputStream
+import org.apache.daffodil.io.FormatInfo
+
+object BitsCharsetIBM1047 extends {
+  override val name = "IBM1047"
+  override val aliases = Seq("X-IBM-1047-S390")
+} with BitsCharsetJava {
+
+  val decodeStringTable =
+    "\u0000\u0001\u0002\u0003\u009C\u0009\u0086\u007F\u0097\u008D\u008E\u000B\u000C\u000D\u000E\u000F" +
+    "\u0010\u0011\u0012\u0013\u009D\u0085\u0008\u0087\u0018\u0019\u0092\u008F\u001C\u001D\u001E\u005E" +
+    "\u0080\u0081\u0082\u0083\u0084\u000A\u0017\u001B\u0088\u0089\u008A\u008B\u008C\u0005\u0006\u0007" +
+    "\u0090\u0091\u0016\u0093\u0094\u0095\u0096\u0004\u0098\u0099\u009A\u009B\u0014\u0015\u009E\u001A" +
+    "\u0020\u00A0\u00E2\u00E4\u00E0\u00E1\u00E3\u00E5\u00E7\u00F1\u00A2\u002E\u003C\u0028\u002B\u007C" +
+    "\u0026\u00E9\u00EA\u00EB\u00E8\u00ED\u00EE\u00EF\u00EC\u00DF\u0021\u0024\u002A\u0029\u003B\u005E" +
+    "\u002D\u002F\u00C2\u00C4\u00C0\u00C1\u00C3\u00C5\u00C7\u00D1\u00A6\u002C\u0025\u005F\u003E\u003F" +
+    "\u00F8\u00C9\u00CA\u00CB\u00C8\u00CD\u00CE\u00CF\u00CC\u0060\u003A\u0023\u0040\u0027\u003D\"" +
+    "\u00D8\u0061\u0062\u0063\u0064\u0065\u0066\u0067\u0068\u0069\u00AB\u00BB\u00F0\u00FD\u00FE\u00B1" +
+    "\u00B0\u006A\u006B\u006C\u006D\u006E\u006F\u0070\u0071\u0072\u00AA\u00BA\u00E6\u00B8\u00C6\u00A4" +
+    "\u00B5\u007E\u0073\u0074\u0075\u0076\u0077\u0078\u0079\u007A\u00A1\u00BF\u00D0\u005B\u00DE\u00AE" +
+    "\u00AC\u00A3\u00A5\u00B7\u00A9\u00A7\u00B6\u00BC\u00BD\u00BE\u00DD\u00A8\u00AF\u005D\u00B4\u00D7" +
+    "\u007B\u0041\u0042\u0043\u0044\u0045\u0046\u0047\u0048\u0049\u00AD\u00F4\u00F6\u00F2\u00F3\u00F5" +
+    "\u007D\u004A\u004B\u004C\u004D\u004E\u004F\u0050\u0051\u0052\u00B9\u00FB\u00FC\u00F9\u00FA\u00FF" +
+    "\\\u00F7\u0053\u0054\u0055\u0056\u0057\u0058\u0059\u005A\u00B2\u00D4\u00D6\u00D2\u00D3\u00D5" +
+    "\u0030\u0031\u0032\u0033\u0034\u0035\u0036\u0037\u0038\u0039\u00B3\u00DB\u00DC\u00D9\u00DA\u009F"
+
+  override def newDecoder() = new BitsCharsetDecoderIBM1047()
+}
+
+class BitsCharsetDecoderIBM1047
+  extends BitsCharsetDecoderByteSize {
+
+  protected override def decodeOneChar(dis: InputSourceDataInputStream, finfo: FormatInfo): Char = {

Review comment:
       > Suggest since the finfo argument is unused that it be named "unused" instead of "finfo". Makes it explicit that you intended to not use it, rather than you forgot something.
   
   Scala 2.13 adds an @unused annotation for this purpose. This way the variable name can be kept, which can sometimes be self documenting (though it doesn't add much in this case). This is also supposed to silence element unused warnings. We dont' get those right now, but maybe that's because we don't have that warning turned on.
   
   It doesn't look like Scala 2.12 has anything similar. But maybe this is the kindof thing we just wait until we bump up to 2.13 to fix find all the unused parameters and annotate them correctly?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-daffodil] stevedlawrence commented on a change in pull request #490: Unsupported characterset

Posted by GitBox <gi...@apache.org>.
stevedlawrence commented on a change in pull request #490:
URL: https://github.com/apache/incubator-daffodil/pull/490#discussion_r574459908



##########
File path: daffodil-io/src/main/scala/org/apache/daffodil/processors/charset/IBM1047.scala
##########
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.daffodil.processors.charset
+
+import org.apache.daffodil.io.InputSourceDataInputStream
+import org.apache.daffodil.io.FormatInfo
+
+object BitsCharsetIBM1047 extends {
+  override val name = "IBM1047"
+  override val aliases = Seq("X-IBM-1047-S390")
+} with BitsCharsetJava {
+
+  val decodeStringTable =
+    "\u0000\u0001\u0002\u0003\u009C\u0009\u0086\u007F\u0097\u008D\u008E\u000B\u000C\u000D\u000E\u000F" +
+    "\u0010\u0011\u0012\u0013\u009D\u0085\u0008\u0087\u0018\u0019\u0092\u008F\u001C\u001D\u001E\u005E" +

Review comment:
       According to https://www.compart.com/en/unicode/charsets/IBM1047, the ``\u005E`` at the end of this line is incorrect and should be ``\u001F``. I think everything else looks correct.

##########
File path: daffodil-io/src/main/scala/org/apache/daffodil/processors/charset/DaffodilCharsetProvider.scala
##########
@@ -51,7 +51,8 @@ object DaffodilCharsetProvider {
     BitsCharsetUTF16LE,
     BitsCharsetUTF32BE,
     BitsCharsetUTF32LE,
-    BitsCharsetUTF8)
+    BitsCharsetUTF8,
+    BitsCharsetIBM1047 )

Review comment:
       I would prefer that this be added to this list in alphabetical order so things are a bit more organized. So move this right after BitsCharsetIBM037

##########
File path: daffodil-io/src/main/scala/org/apache/daffodil/processors/charset/IBM1047.scala
##########
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.daffodil.processors.charset
+
+import org.apache.daffodil.io.InputSourceDataInputStream
+import org.apache.daffodil.io.FormatInfo
+
+object BitsCharsetIBM1047 extends {
+  override val name = "IBM1047"
+  override val aliases = Seq("X-IBM-1047-S390")

Review comment:
       Why the ``X-``? Can this just be ``IBM-1047-S390``?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-daffodil] mbeckerle commented on a change in pull request #490: Unsupported characterset

Posted by GitBox <gi...@apache.org>.
mbeckerle commented on a change in pull request #490:
URL: https://github.com/apache/incubator-daffodil/pull/490#discussion_r574582771



##########
File path: daffodil-io/src/main/scala/org/apache/daffodil/processors/charset/IBM1047.scala
##########
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.daffodil.processors.charset
+
+import org.apache.daffodil.io.InputSourceDataInputStream
+import org.apache.daffodil.io.FormatInfo
+
+object BitsCharsetIBM1047 extends {
+  override val name = "IBM1047"
+  override val aliases = Seq("X-IBM-1047-S390")

Review comment:
       To clarify, (maybe add this as a code comment for future developers). Charset names and aliases should be taken from: https://www.iana.org/assignments/character-sets/character-sets.xhtml
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-daffodil] mbeckerle commented on a change in pull request #490: Unsupported characterset

Posted by GitBox <gi...@apache.org>.
mbeckerle commented on a change in pull request #490:
URL: https://github.com/apache/incubator-daffodil/pull/490#discussion_r574594121



##########
File path: daffodil-io/src/main/scala/org/apache/daffodil/processors/charset/IBM1047.scala
##########
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.daffodil.processors.charset
+
+import org.apache.daffodil.io.InputSourceDataInputStream
+import org.apache.daffodil.io.FormatInfo
+
+object BitsCharsetIBM1047 extends {
+  override val name = "IBM1047"
+  override val aliases = Seq("X-IBM-1047-S390")
+} with BitsCharsetJava {
+
+  val decodeStringTable =
+    "\u0000\u0001\u0002\u0003\u009C\u0009\u0086\u007F\u0097\u008D\u008E\u000B\u000C\u000D\u000E\u000F" +
+    "\u0010\u0011\u0012\u0013\u009D\u0085\u0008\u0087\u0018\u0019\u0092\u008F\u001C\u001D\u001E\u005E" +

Review comment:
       I think a unit test should parse and unparse a string containing every character, all 256 of them, perhaps in groups of 16 to match these lines of code. That would make it absolutely positive that there are no typos in these, at least for the characters that are not control-codes/whitespace.  

##########
File path: daffodil-io/src/main/scala/org/apache/daffodil/processors/charset/IBM1047.scala
##########
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.daffodil.processors.charset
+
+import org.apache.daffodil.io.InputSourceDataInputStream
+import org.apache.daffodil.io.FormatInfo
+
+object BitsCharsetIBM1047 extends {
+  override val name = "IBM1047"
+  override val aliases = Seq("X-IBM-1047-S390")
+} with BitsCharsetJava {
+
+  val decodeStringTable =
+    "\u0000\u0001\u0002\u0003\u009C\u0009\u0086\u007F\u0097\u008D\u008E\u000B\u000C\u000D\u000E\u000F" +
+    "\u0010\u0011\u0012\u0013\u009D\u0085\u0008\u0087\u0018\u0019\u0092\u008F\u001C\u001D\u001E\u005E" +
+    "\u0080\u0081\u0082\u0083\u0084\u000A\u0017\u001B\u0088\u0089\u008A\u008B\u008C\u0005\u0006\u0007" +
+    "\u0090\u0091\u0016\u0093\u0094\u0095\u0096\u0004\u0098\u0099\u009A\u009B\u0014\u0015\u009E\u001A" +
+    "\u0020\u00A0\u00E2\u00E4\u00E0\u00E1\u00E3\u00E5\u00E7\u00F1\u00A2\u002E\u003C\u0028\u002B\u007C" +
+    "\u0026\u00E9\u00EA\u00EB\u00E8\u00ED\u00EE\u00EF\u00EC\u00DF\u0021\u0024\u002A\u0029\u003B\u005E" +
+    "\u002D\u002F\u00C2\u00C4\u00C0\u00C1\u00C3\u00C5\u00C7\u00D1\u00A6\u002C\u0025\u005F\u003E\u003F" +
+    "\u00F8\u00C9\u00CA\u00CB\u00C8\u00CD\u00CE\u00CF\u00CC\u0060\u003A\u0023\u0040\u0027\u003D\"" +
+    "\u00D8\u0061\u0062\u0063\u0064\u0065\u0066\u0067\u0068\u0069\u00AB\u00BB\u00F0\u00FD\u00FE\u00B1" +
+    "\u00B0\u006A\u006B\u006C\u006D\u006E\u006F\u0070\u0071\u0072\u00AA\u00BA\u00E6\u00B8\u00C6\u00A4" +
+    "\u00B5\u007E\u0073\u0074\u0075\u0076\u0077\u0078\u0079\u007A\u00A1\u00BF\u00D0\u005B\u00DE\u00AE" +
+    "\u00AC\u00A3\u00A5\u00B7\u00A9\u00A7\u00B6\u00BC\u00BD\u00BE\u00DD\u00A8\u00AF\u005D\u00B4\u00D7" +
+    "\u007B\u0041\u0042\u0043\u0044\u0045\u0046\u0047\u0048\u0049\u00AD\u00F4\u00F6\u00F2\u00F3\u00F5" +
+    "\u007D\u004A\u004B\u004C\u004D\u004E\u004F\u0050\u0051\u0052\u00B9\u00FB\u00FC\u00F9\u00FA\u00FF" +
+    "\\\u00F7\u0053\u0054\u0055\u0056\u0057\u0058\u0059\u005A\u00B2\u00D4\u00D6\u00D2\u00D3\u00D5" +
+    "\u0030\u0031\u0032\u0033\u0034\u0035\u0036\u0037\u0038\u0039\u00B3\u00DB\u00DC\u00D9\u00DA\u009F"
+
+  override def newDecoder() = new BitsCharsetDecoderIBM1047()
+}
+
+class BitsCharsetDecoderIBM1047
+  extends BitsCharsetDecoderByteSize {
+
+  protected override def decodeOneChar(dis: InputSourceDataInputStream, finfo: FormatInfo): Char = {

Review comment:
       Also, this method seems generic, and seems like it should be the default implementation, and every single-byte-character-set (SBCS) would share it. 
   
   If the BitsCharsetDecoderByteSize had a constructor that took the BitsCharsetJava object, then is could be a one-liner:
   ```
   class BitsCharsetDecoderIBM1047 extends BitsCharsetDecoderByteSize(BitsCharset1047)
   ```
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-daffodil] mbeckerle commented on a change in pull request #490: Unsupported characterset

Posted by GitBox <gi...@apache.org>.
mbeckerle commented on a change in pull request #490:
URL: https://github.com/apache/incubator-daffodil/pull/490#discussion_r574580481



##########
File path: daffodil-io/src/main/scala/org/apache/daffodil/processors/charset/IBM1047.scala
##########
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.daffodil.processors.charset
+
+import org.apache.daffodil.io.InputSourceDataInputStream
+import org.apache.daffodil.io.FormatInfo
+
+object BitsCharsetIBM1047 extends {
+  override val name = "IBM1047"
+  override val aliases = Seq("X-IBM-1047-S390")

Review comment:
       Should not have "-s390" suffix. Name of this charset should just be "IBM-1047"




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-daffodil] mbeckerle commented on a change in pull request #490: Unsupported characterset

Posted by GitBox <gi...@apache.org>.
mbeckerle commented on a change in pull request #490:
URL: https://github.com/apache/incubator-daffodil/pull/490#discussion_r574582771



##########
File path: daffodil-io/src/main/scala/org/apache/daffodil/processors/charset/IBM1047.scala
##########
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.daffodil.processors.charset
+
+import org.apache.daffodil.io.InputSourceDataInputStream
+import org.apache.daffodil.io.FormatInfo
+
+object BitsCharsetIBM1047 extends {
+  override val name = "IBM1047"
+  override val aliases = Seq("X-IBM-1047-S390")

Review comment:
       To clarify, (maybe add this as a code comment for future developers). Charset names should be taken from: https://www.iana.org/assignments/character-sets/character-sets.xhtml
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-daffodil] mbeckerle commented on a change in pull request #490: Unsupported characterset

Posted by GitBox <gi...@apache.org>.
mbeckerle commented on a change in pull request #490:
URL: https://github.com/apache/incubator-daffodil/pull/490#discussion_r574586034



##########
File path: daffodil-io/src/main/scala/org/apache/daffodil/processors/charset/IBM1047.scala
##########
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.daffodil.processors.charset
+
+import org.apache.daffodil.io.InputSourceDataInputStream
+import org.apache.daffodil.io.FormatInfo
+
+object BitsCharsetIBM1047 extends {
+  override val name = "IBM1047"
+  override val aliases = Seq("X-IBM-1047-S390")
+} with BitsCharsetJava {
+
+  val decodeStringTable =
+    "\u0000\u0001\u0002\u0003\u009C\u0009\u0086\u007F\u0097\u008D\u008E\u000B\u000C\u000D\u000E\u000F" +
+    "\u0010\u0011\u0012\u0013\u009D\u0085\u0008\u0087\u0018\u0019\u0092\u008F\u001C\u001D\u001E\u005E" +
+    "\u0080\u0081\u0082\u0083\u0084\u000A\u0017\u001B\u0088\u0089\u008A\u008B\u008C\u0005\u0006\u0007" +
+    "\u0090\u0091\u0016\u0093\u0094\u0095\u0096\u0004\u0098\u0099\u009A\u009B\u0014\u0015\u009E\u001A" +
+    "\u0020\u00A0\u00E2\u00E4\u00E0\u00E1\u00E3\u00E5\u00E7\u00F1\u00A2\u002E\u003C\u0028\u002B\u007C" +
+    "\u0026\u00E9\u00EA\u00EB\u00E8\u00ED\u00EE\u00EF\u00EC\u00DF\u0021\u0024\u002A\u0029\u003B\u005E" +
+    "\u002D\u002F\u00C2\u00C4\u00C0\u00C1\u00C3\u00C5\u00C7\u00D1\u00A6\u002C\u0025\u005F\u003E\u003F" +
+    "\u00F8\u00C9\u00CA\u00CB\u00C8\u00CD\u00CE\u00CF\u00CC\u0060\u003A\u0023\u0040\u0027\u003D\"" +
+    "\u00D8\u0061\u0062\u0063\u0064\u0065\u0066\u0067\u0068\u0069\u00AB\u00BB\u00F0\u00FD\u00FE\u00B1" +
+    "\u00B0\u006A\u006B\u006C\u006D\u006E\u006F\u0070\u0071\u0072\u00AA\u00BA\u00E6\u00B8\u00C6\u00A4" +
+    "\u00B5\u007E\u0073\u0074\u0075\u0076\u0077\u0078\u0079\u007A\u00A1\u00BF\u00D0\u005B\u00DE\u00AE" +
+    "\u00AC\u00A3\u00A5\u00B7\u00A9\u00A7\u00B6\u00BC\u00BD\u00BE\u00DD\u00A8\u00AF\u005D\u00B4\u00D7" +
+    "\u007B\u0041\u0042\u0043\u0044\u0045\u0046\u0047\u0048\u0049\u00AD\u00F4\u00F6\u00F2\u00F3\u00F5" +
+    "\u007D\u004A\u004B\u004C\u004D\u004E\u004F\u0050\u0051\u0052\u00B9\u00FB\u00FC\u00F9\u00FA\u00FF" +
+    "\\\u00F7\u0053\u0054\u0055\u0056\u0057\u0058\u0059\u005A\u00B2\u00D4\u00D6\u00D2\u00D3\u00D5" +
+    "\u0030\u0031\u0032\u0033\u0034\u0035\u0036\u0037\u0038\u0039\u00B3\u00DB\u00DC\u00D9\u00DA\u009F"
+
+  override def newDecoder() = new BitsCharsetDecoderIBM1047()
+}
+
+class BitsCharsetDecoderIBM1047
+  extends BitsCharsetDecoderByteSize {
+
+  protected override def decodeOneChar(dis: InputSourceDataInputStream, finfo: FormatInfo): Char = {

Review comment:
       Suggest since the finfo argument is unused that it be named "unused" instead of "finfo". Makes it explicit that you intended to not use it, rather than you forgot something.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-daffodil] susmita-m145 commented on pull request #490: Unsupported characterset

Posted by GitBox <gi...@apache.org>.
susmita-m145 commented on pull request #490:
URL: https://github.com/apache/incubator-daffodil/pull/490#issuecomment-777395780


   Kindly review my changes


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-daffodil] mbeckerle commented on a change in pull request #490: Unsupported characterset

Posted by GitBox <gi...@apache.org>.
mbeckerle commented on a change in pull request #490:
URL: https://github.com/apache/incubator-daffodil/pull/490#discussion_r574652201



##########
File path: daffodil-io/src/main/scala/org/apache/daffodil/processors/charset/IBM1047.scala
##########
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.daffodil.processors.charset
+
+import org.apache.daffodil.io.InputSourceDataInputStream
+import org.apache.daffodil.io.FormatInfo
+
+object BitsCharsetIBM1047 extends {
+  override val name = "IBM1047"
+  override val aliases = Seq("X-IBM-1047-S390")
+} with BitsCharsetJava {
+
+  val decodeStringTable =
+    "\u0000\u0001\u0002\u0003\u009C\u0009\u0086\u007F\u0097\u008D\u008E\u000B\u000C\u000D\u000E\u000F" +
+    "\u0010\u0011\u0012\u0013\u009D\u0085\u0008\u0087\u0018\u0019\u0092\u008F\u001C\u001D\u001E\u005E" +
+    "\u0080\u0081\u0082\u0083\u0084\u000A\u0017\u001B\u0088\u0089\u008A\u008B\u008C\u0005\u0006\u0007" +
+    "\u0090\u0091\u0016\u0093\u0094\u0095\u0096\u0004\u0098\u0099\u009A\u009B\u0014\u0015\u009E\u001A" +
+    "\u0020\u00A0\u00E2\u00E4\u00E0\u00E1\u00E3\u00E5\u00E7\u00F1\u00A2\u002E\u003C\u0028\u002B\u007C" +
+    "\u0026\u00E9\u00EA\u00EB\u00E8\u00ED\u00EE\u00EF\u00EC\u00DF\u0021\u0024\u002A\u0029\u003B\u005E" +
+    "\u002D\u002F\u00C2\u00C4\u00C0\u00C1\u00C3\u00C5\u00C7\u00D1\u00A6\u002C\u0025\u005F\u003E\u003F" +
+    "\u00F8\u00C9\u00CA\u00CB\u00C8\u00CD\u00CE\u00CF\u00CC\u0060\u003A\u0023\u0040\u0027\u003D\"" +
+    "\u00D8\u0061\u0062\u0063\u0064\u0065\u0066\u0067\u0068\u0069\u00AB\u00BB\u00F0\u00FD\u00FE\u00B1" +
+    "\u00B0\u006A\u006B\u006C\u006D\u006E\u006F\u0070\u0071\u0072\u00AA\u00BA\u00E6\u00B8\u00C6\u00A4" +
+    "\u00B5\u007E\u0073\u0074\u0075\u0076\u0077\u0078\u0079\u007A\u00A1\u00BF\u00D0\u005B\u00DE\u00AE" +
+    "\u00AC\u00A3\u00A5\u00B7\u00A9\u00A7\u00B6\u00BC\u00BD\u00BE\u00DD\u00A8\u00AF\u005D\u00B4\u00D7" +
+    "\u007B\u0041\u0042\u0043\u0044\u0045\u0046\u0047\u0048\u0049\u00AD\u00F4\u00F6\u00F2\u00F3\u00F5" +
+    "\u007D\u004A\u004B\u004C\u004D\u004E\u004F\u0050\u0051\u0052\u00B9\u00FB\u00FC\u00F9\u00FA\u00FF" +
+    "\\\u00F7\u0053\u0054\u0055\u0056\u0057\u0058\u0059\u005A\u00B2\u00D4\u00D6\u00D2\u00D3\u00D5" +
+    "\u0030\u0031\u0032\u0033\u0034\u0035\u0036\u0037\u0038\u0039\u00B3\u00DB\u00DC\u00D9\u00DA\u009F"
+
+  override def newDecoder() = new BitsCharsetDecoderIBM1047()
+}
+
+class BitsCharsetDecoderIBM1047
+  extends BitsCharsetDecoderByteSize {
+
+  protected override def decodeOneChar(dis: InputSourceDataInputStream, finfo: FormatInfo): Char = {

Review comment:
       I'm fine with that approach, we have a separate JIRA ticket about re-enabling warnings for things like unused params. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-daffodil] stevedlawrence commented on pull request #490: Unsupported characterset

Posted by GitBox <gi...@apache.org>.
stevedlawrence commented on pull request #490:
URL: https://github.com/apache/incubator-daffodil/pull/490#issuecomment-777462074


   It would also be really helpful to add a test so that this new code has some test coverage to make sure we don't accidentally break this in the future.
   
   It looks like we have similar tests in ``daffodil-test/src/test/resources/org/apache/daffodil/section10/representation_properties/encodings.tdml``. The easiest way to add this test would be to:
   1. Copy the ``eString`` element to make a new element with this new encoding
   2. Copy the ``ebcidc1`` parserTestCase to create a new parserTestCase with a unique name and that uses your new element as the root
   
   With that done, you just need to modify ``daffodil-test/src/test/scala/org/apache/daffodil/section10/representation_properties/TestRepProps2.scala`` and copy ``def test_ebcdic1`` to run your new parser test case.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-daffodil] susmita-m145 closed pull request #490: Unsupported characterset

Posted by GitBox <gi...@apache.org>.
susmita-m145 closed pull request #490:
URL: https://github.com/apache/incubator-daffodil/pull/490


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org