Posted to dev@lucene.apache.org by "Hoss Man (JIRA)" <ji...@apache.org> on 2014/01/04 06:54:50 UTC
[jira] [Commented] (SOLR-5605) MapReduceIndexerTool fails in some
locales -- seen in random failures of
MapReduceIndexerToolArgumentParserTest
[ https://issues.apache.org/jira/browse/SOLR-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862220#comment-13862220 ]
Hoss Man commented on SOLR-5605:
--------------------------------
Filing this because I encountered it in randomized testing. It sounded familiar, but I was surprised not to be able to find an existing issue about it.
Easy to reproduce:
{code}
ant test -Dtestcase=MapReduceIndexerToolArgumentParserTest -Dtests.method=testArgsParserHelp -Dtests.slow=true -Dtests.locale=hi_IN -Dtests.file.encoding=UTF-8
...
[junit4] 2> NOTE: reproduce with: ant test -Dtestcase=MapReduceIndexerToolArgumentParserTest -Dtests.method=testArgsParserHelp -Dtests.seed=90EEAEBDB08626A8 -Dtests.slow=true -Dtests.locale=hi_IN -Dtests.timezone=Pacific/Apia -Dtests.file.encoding=UTF-8
[junit4] ERROR 0.25s | MapReduceIndexerToolArgumentParserTest.testArgsParserHelp <<<
[junit4] > Throwable #1: java.util.UnknownFormatConversionException: Conversion = '१'
[junit4] > at __randomizedtesting.SeedInfo.seed([90EEAEBDB08626A8:C3C04CAF7E84AE5]:0)
[junit4] > at java.util.Formatter.checkText(Formatter.java:2547)
[junit4] > at java.util.Formatter.parse(Formatter.java:2523)
[junit4] > at java.util.Formatter.format(Formatter.java:2469)
[junit4] > at java.io.PrintWriter.format(PrintWriter.java:905)
[junit4] > at net.sourceforge.argparse4j.helper.TextHelper.printHelp(TextHelper.java:206)
[junit4] > at net.sourceforge.argparse4j.internal.ArgumentImpl.printHelp(ArgumentImpl.java:247)
[junit4] > at net.sourceforge.argparse4j.internal.ArgumentParserImpl.printArgumentHelp(ArgumentParserImpl.java:253)
[junit4] > at net.sourceforge.argparse4j.internal.ArgumentParserImpl.printHelp(ArgumentParserImpl.java:279)
[junit4] > at org.apache.solr.hadoop.MapReduceIndexerTool$MyArgumentParser$1.run(MapReduceIndexerTool.java:187)
{code}
Analysis from Uwe on the list when Jenkins hit this a while back...
{quote}
Locale problem with the argument parser.
The sperm-like symbol (१) is DEVANAGARI DIGIT ONE (U+0967). It looks like, while testing, some foreign (non-Lucene) code converts the digit "1" to this small creature, probably through use of the default locale. As the Lucene code is forbidden-api checked, this seems to be a bug somewhere else. The stack trace shows the bug: net.sourceforge.argparse4j.helper.TextHelper calls String.format without a Locale!
{quote}
...and...
{quote}
The problem is in argparse4j:
http://grepcode.com/file/repo1.maven.org/maven2/net.sourceforge.argparse4j/argparse4j/0.3.2/net/sourceforge/argparse4j/helper/TextHelper.java#197
The code does the following:
String fmt = String.format("%%%ds%%s\n", indentWidth);
writer.format(fmt,....)
So it uses the first String.format (without a locale) to produce the format string for the second call. The %d is replaced by indentWidth, so that the text is right-aligned to that width. But the indent-width pattern is formatted using the default locale, so the first line produces something like the following:
"%१s%s" instead of "%1s%s"
This will then fail format parsing in the second call. In my opinion the whole approach is a bug in itself: creating a format pattern with another format pattern is slow and, as shown, buggy!
{quote}
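To make Uwe's point concrete, here is a minimal sketch of two ways the indent pattern could be built without depending on the default locale. The class and method names (IndentFormatSketch, buildFormat, buildFormatByConcat) are hypothetical and are not part of argparse4j or Solr; this only illustrates the idea and is not the actual fix. Whether hi-IN really yields Devanagari digits depends on the locale data of the JDK in use, so the sketch prints that pattern rather than assuming it.

```java
import java.util.Locale;

public class IndentFormatSketch {

    // Same pattern-building trick as the quoted code, but with an explicit
    // locale, so the width digits do not depend on Locale.getDefault().
    static String buildFormat(Locale locale, int indentWidth) {
        return String.format(locale, "%%%ds%%s\n", indentWidth);
    }

    // Locale-free alternative: int-to-String concatenation always produces
    // ASCII digits, and no intermediate format pattern is needed.
    static String buildFormatByConcat(int indentWidth) {
        return "%" + indentWidth + "s%s\n";
    }

    public static void main(String[] args) {
        // With some JDK locale data this prints a pattern with a Devanagari
        // digit, which java.util.Formatter then rejects as a conversion.
        Locale hindi = Locale.forLanguageTag("hi-IN");
        System.out.println("hi-IN pattern: " + buildFormat(hindi, 1).trim());

        // With Locale.ROOT the pattern is always "%1s%s\n" and parses fine.
        String safe = buildFormat(Locale.ROOT, 1);
        System.out.print(String.format(Locale.ROOT, safe, "", "help text"));
    }
}
```

Either variant would have to be applied inside the argument parsing library itself, so this does not help callers that are stuck with the affected version.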
> MapReduceIndexerTool fails in some locales -- seen in random failures of MapReduceIndexerToolArgumentParserTest
> ---------------------------------------------------------------------------------------------------------------
>
> Key: SOLR-5605
> URL: https://issues.apache.org/jira/browse/SOLR-5605
> Project: Solr
> Issue Type: Bug
> Reporter: Hoss Man
>
> I noticed a randomized failure in MapReduceIndexerToolArgumentParserTest which is reproducible with any seed -- all that matters is the locale.
> The problem sounded familiar, and a quick search verified that Jenkins has in fact hit this a couple of times in the past -- Uwe commented on the list that this is due to a real problem in one of the third-party dependencies (the one that does the argument parsing) that will affect usage on some systems.
> If working around the bug in the arg parsing lib isn't feasible, MapReduceIndexerTool should fail cleanly if the locale isn't one we know is "supported".