Posted to commits@samza.apache.org by "Tobias Roth (JIRA)" <ji...@apache.org> on 2015/10/06 12:03:26 UTC

[jira] [Commented] (SAMZA-700) YarnJob mangles config properties containing quotes

    [ https://issues.apache.org/jira/browse/SAMZA-700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14944823#comment-14944823 ] 

Tobias Roth commented on SAMZA-700:
-----------------------------------

As mentioned above, the problem is due to Util.envVarEscape(), which simply escapes single and double quotes globally. The result is then passed to the shell using weak quoting, i.e. inside double quotes, where a backslash in front of a single quote is not removed (see for example https://www.gnu.org/software/bash/manual/html_node/Double-Quotes.html).
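
To make the failure concrete, here is a minimal sketch that reproduces the mangling. It is not Samza code: envVarEscape() below only mirrors the behavior described in this issue (an assumption, I have not copied the real implementation), and unquoteWeak() is a hypothetical, simplified model of how bash treats backslashes inside double quotes (the backslash-newline case is left out).

```java
class WeakQuotingDemo {
    // Assumed behavior of Util.envVarEscape() as described in this issue:
    // put a backslash in front of every double quote and every single quote.
    static String envVarEscape(String s) {
        return s.replace("\"", "\\\"").replace("'", "\\'");
    }

    // Hypothetical, simplified model of bash double-quote processing:
    // a backslash is dropped only when it precedes $, `, " or \.
    // Before any other character (such as ') the backslash stays in the value.
    static String unquoteWeak(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean escapesSpecial = c == '\\'
                    && i + 1 < s.length()
                    && "$`\"\\".indexOf(s.charAt(i + 1)) >= 0;
            if (escapesSpecial) {
                out.append(s.charAt(++i)); // keep the escaped character only
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String json = "{\"expression\":\"type == 'LINEAR'\"}";
        // prints {"expression":"type == \'LINEAR\'"}
        System.out.println(unquoteWeak(envVarEscape(json)));
    }
}
```

The double quotes survive the round trip, but each single quote arrives with a stray backslash, which is the invalid escape sequence Jackson complains about.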

To fix this issue, the value of SAMZA_CONFIG should be escaped with:
{code:java}
value.replace("\\", "\\\\").replace("`", "\\`").replace("$", "\\$").replace("\"", "\\\"").replace("!", "\"'!'\"")
{code}

I'm not sure what the best approach is here: change the behavior of Util.envVarEscape(), or define a new function Util.envVarWeakQuotingBashEscape() and add a configuration parameter for selecting the strategy that should be used?
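
As a rough sketch of the second option, the replacement chain could be wrapped in a standalone helper like the one below. The class name WeakQuotingEscape is made up for illustration, the method name is only the one suggested in this comment, and neither exists in Samza. The sketch assumes the caller wraps the result in double quotes before handing it to the shell.

```java
class WeakQuotingEscape {
    // Hypothetical helper, named after the suggestion in this comment.
    // Escapes a value so that it can be placed inside bash double quotes.
    static String envVarWeakQuotingBashEscape(String value) {
        return value
                .replace("\\", "\\\\")    // backslash first, so the escapes added below are not doubled
                .replace("`", "\\`")      // no command substitution
                .replace("$", "\\$")      // no parameter expansion
                .replace("\"", "\\\"")    // keep the surrounding double quotes open
                .replace("!", "\"'!'\""); // leave double quotes, single-quote the !, re-enter double quotes
    }

    public static void main(String[] args) {
        String json = "{\"expression\":\"type == 'LINEAR'\"}";
        // prints "{\"expression\":\"type == 'LINEAR'\"}" (single quotes untouched)
        System.out.println("\"" + envVarWeakQuotingBashEscape(json) + "\"");
    }
}
```

Single quotes are deliberately left alone, since they have no special meaning inside double quotes.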

> YarnJob mangles config properties containing quotes
> ---------------------------------------------------
>
>                 Key: SAMZA-700
>                 URL: https://issues.apache.org/jira/browse/SAMZA-700
>             Project: Samza
>          Issue Type: Bug
>    Affects Versions: 0.8.0, 0.9.0
>            Reporter: Tommy Becker
>
> YarnJob passes the Config to the AM via an environment variable, SAMZA_CONFIG.  After serializing the Config to JSON, it goes through Util.envVarEscape(), which I think is behaving improperly.  Specifically, that method escapes single quotes globally, even inside double quotes.  Consider the following config property:
> {code:javascript}
> expression="type == 'LINEAR'"
> {code}
> After encoding to JSON this looks like this:
> {code:javascript}
> {"expression":"type == 'LINEAR'"}
> {code}
> And after being run through Util.envVarEscape():
> {code:javascript}
> {\"expression\":\"type == \'LINEAR\'\"}
> {code}
> I presume these values are being escaped because the YARN client is passing them through the shell at some point.  But the escaping is too simplistic; single quotes should not be escaped within double quotes.  As a result, the value arrives at the AppMaster as follows:
> {code:javascript}
> {"expression": "type == \'LINEAR\'"}
> {code}
> At which point Jackson chokes on it because \' is invalid JSON (invalid escape sequence):
> {noformat}
> Exception in thread "main" org.codehaus.jackson.JsonParseException: Unrecognized character escape ''' (code 39)
>  at [Source: java.io.StringReader@1b6e1eff; line: 1, column: 2814]
>         at org.codehaus.jackson.JsonParser._constructError(JsonParser.java:1433)
>         at org.codehaus.jackson.impl.JsonParserMinimalBase._reportError(JsonParserMinimalBase.java:521)
>         at org.codehaus.jackson.impl.JsonParserMinimalBase._handleUnrecognizedCharacterEscape(JsonParserMinimalBase.java:496)
>         at org.codehaus.jackson.impl.ReaderBasedParser._decodeEscaped(ReaderBasedParser.java:1606)
>         at org.codehaus.jackson.impl.ReaderBasedParser._finishString2(ReaderBasedParser.java:1353)
>         at org.codehaus.jackson.impl.ReaderBasedParser._finishString(ReaderBasedParser.java:1330)
>         at org.codehaus.jackson.impl.ReaderBasedParser.getText(ReaderBasedParser.java:200)
>         at org.codehaus.jackson.map.deser.std.UntypedObjectDeserializer.deserialize(UntypedObjectDeserializer.java:59)
>         at org.codehaus.jackson.map.deser.std.MapDeserializer._readAndBind(MapDeserializer.java:319)
>         at org.codehaus.jackson.map.deser.std.MapDeserializer.deserialize(MapDeserializer.java:249)
>         at org.codehaus.jackson.map.deser.std.MapDeserializer.deserialize(MapDeserializer.java:33)
>         at org.codehaus.jackson.map.ObjectMapper._readMapAndClose(ObjectMapper.java:2732)
>         at org.codehaus.jackson.map.ObjectMapper.readValue(ObjectMapper.java:1863)
>         at org.apache.samza.config.serializers.JsonConfigSerializer$.fromJson(JsonConfigSerializer.scala:34)
>         at org.apache.samza.job.yarn.SamzaAppMaster$.main(SamzaAppMaster.scala:72)
>         at org.apache.samza.job.yarn.SamzaAppMaster.main(SamzaAppMaster.scala)
> {noformat}
> This is particularly nasty since I don't see a way for any quotes, single or double, to get passed to the job successfully and remain intact.  I know the way this config is passed has undergone some change, but I don't know the details, so I wanted to get this issue on record.


