Posted to dev@tika.apache.org by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2010/05/12 17:34:41 UTC
[jira] Resolved: (TIKA-417) Unable to parse the content for UCS2 Little Endian encoded file
[ https://issues.apache.org/jira/browse/TIKA-417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jukka Zitting resolved TIKA-417.
--------------------------------
Assignee: Jukka Zitting
Fix Version/s: 0.8
Resolution: Fixed
This problem was caused by a rare MP3 byte pattern that happened to also match the UCS2 LE byte order mark. I've fixed this in revision 943554. Thanks for the problem report!
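The kind of collision described above can be illustrated with a minimal sketch. It assumes the generic MPEG audio frame sync check (eleven set bits at the start of a frame), which is not necessarily the exact pattern Tika used before revision 943554; the class and method names are hypothetical and are not part of Tika.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; this is not Tika's detection code.
public class BomVersusMpegSync {

    /** True if the data starts with the UTF-16LE (UCS-2 LE) byte order mark FF FE. */
    public static boolean startsWithUtf16LeBom(byte[] data) {
        return data.length >= 2
                && (data[0] & 0xFF) == 0xFF
                && (data[1] & 0xFF) == 0xFE;
    }

    /**
     * True if the first two bytes carry the eleven set bits of an MPEG audio
     * frame sync (assumed generic check: FF followed by a byte whose top
     * three bits are set).
     */
    public static boolean looksLikeMpegAudioSync(byte[] data) {
        return data.length >= 2
                && (data[0] & 0xFF) == 0xFF
                && (data[1] & 0xE0) == 0xE0;
    }

    public static void main(String[] args) {
        // Build a small UTF-16LE text file: BOM (FF FE) followed by the text.
        byte[] text = "hello".getBytes(StandardCharsets.UTF_16LE);
        byte[] file = new byte[text.length + 2];
        file[0] = (byte) 0xFF;
        file[1] = (byte) 0xFE;
        System.arraycopy(text, 0, file, 2, text.length);

        System.out.println(startsWithUtf16LeBom(file));   // prints "true"
        System.out.println(looksLikeMpegAudioSync(file)); // prints "true"
    }
}
```

Because both checks accept the same two leading bytes, a detector that tries a loose audio pattern first could plausibly treat the text file as audio and extract no text from it, without raising an error.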
> Unable to parse the content for UCS2 Little Endian encoded file
> ---------------------------------------------------------------
>
> Key: TIKA-417
> URL: https://issues.apache.org/jira/browse/TIKA-417
> Project: Tika
> Issue Type: Bug
> Components: parser
> Affects Versions: 0.7
> Environment: Windows
> Reporter: Rajiv Kumar
> Assignee: Jukka Zitting
> Fix For: 0.8
>
> Attachments: TXT_UCS2_LE2.txt
>
>
> I have a text file that I encoded in UCS2 Little Endian format using Notepad++. Tika is unable to parse the content, and it does not throw any exception.