Posted to dev@tika.apache.org by "Chaitra Rajappa (Jira)" <ji...@apache.org> on 2021/08/06 17:44:00 UTC

[jira] [Updated] (TIKA-3516) Unexpected charset IBM424_rtl detected for utf_8 file by CharsetDetector

     [ https://issues.apache.org/jira/browse/TIKA-3516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chaitra Rajappa updated TIKA-3516:
----------------------------------
    Description: 
Hi,
 The CharsetDetector detects the wrong charset, IBM424_rtl, for a UTF-8 file.
 This results in the following exception:

*_java.nio.charset.UnsupportedCharsetException: IBM424_rtl 17 at java.nio.charset.Charset.forName(Charset.java:531)_*

I see there is also an existing ticket for the same issue that has not been fixed:

https://issues.apache.org/jira/browse/TIKA-2396
 Please suggest changes to fix this.

Versions being used:

apache-core - 1.20
apache-parsers-1.20

Thanks

  was:
Hi,
The CharsetDetector detects the wrong charset, IBM424_rtl, for a UTF-8 file.
This results in the following exception:

*_java.nio.charset.UnsupportedCharsetException: IBM424_rtl 17 at java.nio.charset.Charset.forName(Charset.java:531)_*

I see there is also an existing ticket for the same issue that has not been fixed:

https://issues.apache.org/jira/browse/TIKA-2396
Please suggest changes to fix this.

Thanks


> Unexpected charset IBM424_rtl detected for utf_8 file by CharsetDetector
> -------------------------------------------------------------------------
>
>                 Key: TIKA-3516
>                 URL: https://issues.apache.org/jira/browse/TIKA-3516
>             Project: Tika
>          Issue Type: Bug
>          Components: detector, parser
>            Reporter: Chaitra Rajappa
>            Priority: Major
>
> Hi,
>  The CharsetDetector detects the wrong charset, IBM424_rtl, for a UTF-8 file.
>  This results in the following exception:
> *_java.nio.charset.UnsupportedCharsetException: IBM424_rtl 17 at java.nio.charset.Charset.forName(Charset.java:531)_*
> I see there is also an existing ticket for the same issue that has not been fixed:
> https://issues.apache.org/jira/browse/TIKA-2396
>  Please suggest changes to fix this.
> Versions being used:
> apache-core - 1.20
> apache-parsers-1.20
> Thanks
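
The exception quoted above comes from passing a detected charset name straight to Charset.forName when the running JVM has no charset by that name. The following is only a caller-side sketch of one possible guard, not a fix inside Tika itself: it walks a list of candidate names in order of preference and returns the first one the JVM supports, falling back to a default otherwise. The class name SafeCharsets and the method firstSupported are hypothetical, and the candidate list is assumed to be built by the caller from whatever names the detector reports (for example, its ranked matches).

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.util.List;

// Hypothetical helper, not part of Tika: picks a charset the JVM can actually load.
final class SafeCharsets {
    private SafeCharsets() {}

    // candidates: detected charset names, best match first (assumed to come from the caller).
    // fallback: returned when none of the candidates is supported by this JVM.
    static Charset firstSupported(List<String> candidates, Charset fallback) {
        for (String name : candidates) {
            if (name == null) {
                continue; // Charset.isSupported rejects null, so skip it
            }
            try {
                // isSupported returns false for legal but unknown names such as "IBM424_rtl",
                // so forName is only called when it cannot throw UnsupportedCharsetException.
                if (Charset.isSupported(name)) {
                    return Charset.forName(name);
                }
            } catch (IllegalCharsetNameException e) {
                // Syntactically illegal name (for example, an empty string): try the next one.
            }
        }
        return fallback;
    }
}
```

A caller could then use something like SafeCharsets.firstSupported(names, StandardCharsets.UTF_8) instead of calling Charset.forName on the top match directly. This only avoids the exception; it does not address why IBM424_rtl is ranked first for a UTF-8 file.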



--
This message was sent by Atlassian Jira
(v8.3.4#803005)