Posted to dev@tika.apache.org by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2010/08/25 18:04:17 UTC

[jira] Commented: (TIKA-498) HTML parser fails on turkish locale

    [ https://issues.apache.org/jira/browse/TIKA-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12902489#action_12902489 ] 

Chris A. Mattmann commented on TIKA-498:
----------------------------------------

Hi Robert:

Thanks mucho! I've verified that the test also fails on my machine (Mac OS X 10.5.6, JDK 1.6.0, System JRE) when run with the command line you gave:

{noformat}
Running org.apache.tika.parser.html.HtmlParserTest
Tests run: 25, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.778 sec <<< FAILURE!

Results :

Failed tests: 
  testLineBreak(org.apache.tika.parser.html.HtmlParserTest)

Tests run: 165, Failures: 1, Errors: 0, Skipped: 0

{noformat}

I'll now apply your patch and see if it fixes it...

Cheers,
Chris


> HTML parser fails on turkish locale
> -----------------------------------
>
>                 Key: TIKA-498
>                 URL: https://issues.apache.org/jira/browse/TIKA-498
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>            Reporter: Robert Muir
>            Assignee: Chris A. Mattmann
>         Attachments: TIKA-498.patch
>
>
> To reproduce: mvn test -DargLine=-Duser.language=tr
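> This is because it calls toLowerCase() with the default Locale, and under a Turkish locale an uppercase "I" lowercases to a dotless "ı" (U+0131) rather than "i".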
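
For anyone following along, here is a minimal sketch of the behaviour described in the report. It is not the attached TIKA-498.patch (whose contents are not shown in this message); the class and method names are hypothetical, and the string "TITLE" is just an illustrative tag name containing an "I". It assumes the usual remedy of passing an explicit locale such as Locale.ROOT instead of relying on the default.

```java
import java.util.Locale;

// Hypothetical demo class; not part of Tika.
class LocaleLowerCaseDemo {

    // Locale-sensitive lowercasing: the result depends on the locale passed in,
    // just as the no-argument toLowerCase() depends on the JVM default locale.
    static String lowerIn(String s, Locale locale) {
        return s.toLowerCase(locale);
    }

    // Locale-independent lowercasing, suitable for identifiers such as tag names.
    static String lowerRoot(String s) {
        return s.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");
        String tag = "TITLE"; // illustrative input only

        // Under Turkish casing rules 'I' maps to dotless 'ı' (U+0131),
        // so the comparison against the ASCII name fails.
        System.out.println(lowerIn(tag, turkish).equals("title"));

        // With Locale.ROOT the mapping is the same on every machine.
        System.out.println(lowerRoot(tag).equals("title"));
    }
}
```

Running the tests with -Duser.language=tr, as in the reproduction step above, changes the JVM default locale, which is why a bare toLowerCase() call behaves like the first case.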
