Posted to dev@tika.apache.org by "Hudson (Jira)" <ji...@apache.org> on 2020/04/01 18:05:00 UTC
[jira] [Commented] (TIKA-3080) CharsetMatch.getString can get stuck in infinite loop
[ https://issues.apache.org/jira/browse/TIKA-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17073050#comment-17073050 ]
Hudson commented on TIKA-3080:
------------------------------
UNSTABLE: Integrated in Jenkins build Tika-trunk #1799 (See [https://builds.apache.org/job/Tika-trunk/1799/])
TIKA-3080 -- prevent infinite loop in CharsetMatch.getString (tallison: [https://github.com/apache/tika/commit/8e33e28b72b791710a1e9fdf515c2fcd72f82deb])
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/txt/CharsetMatch.java
> CharsetMatch.getString can get stuck in infinite loop
> -----------------------------------------------------
>
> Key: TIKA-3080
> URL: https://issues.apache.org/jira/browse/TIKA-3080
> Project: Tika
> Issue Type: Bug
> Components: parser
> Affects Versions: 1.24
> Reporter: Vikram Shrowty
> Priority: Major
>
> The problem is in the read loop here:
> [https://github.com/apache/tika/blob/fb5a191edac2cef28c0a4ac390c9156acdc9e673/tika-parsers/src/main/java/org/apache/tika/parser/txt/CharsetMatch.java#L147-L150]
> If you specify a maxLength and the stream is longer than that, the max variable in the loop drops to zero. From then on every call asks to read 0 bytes and gets 0 back, but the loop only exits when the number of bytes read is < 0, so it never terminates.
> The loop condition should probably be > 0 instead of >= 0.
>
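A minimal sketch of the loop shape described in the issue, using the suggested "> 0" condition. This is not the code from the linked commit: the class BoundedRead and the method readUpTo are hypothetical names chosen for illustration, and the sketch assumes the reader returns 0 when asked for 0 characters, as the standard JDK readers do.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class BoundedRead {
    // Hypothetical helper, not Tika's code: reads at most maxLength chars
    // from the reader; a negative maxLength means "no limit".
    static String readUpTo(Reader reader, int maxLength) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buffer = new char[1024];
        int max = maxLength < 0 ? Integer.MAX_VALUE : maxLength;
        int charsRead;
        // "> 0" rather than ">= 0": read() returns -1 at end of stream, and
        // (assumption, true for the JDK readers) 0 when asked for 0 chars
        // once max is exhausted. Both results must end the loop.
        while ((charsRead = reader.read(buffer, 0, Math.min(max, buffer.length))) > 0) {
            sb.append(buffer, 0, charsRead);
            max -= charsRead;
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stream longer than maxLength: the case that hung with ">= 0".
        System.out.println(readUpTo(new StringReader("abcdefghij"), 4)); // abcd
        // No limit: the loop ends on the -1 returned at end of stream.
        System.out.println(readUpTo(new StringReader("abc"), -1));       // abc
    }
}
```

With ">= 0" in place of "> 0", the first call would keep receiving 0 from read() after the fourth character and never leave the loop.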