Posted to commits@dolphinscheduler.apache.org by GitBox <gi...@apache.org> on 2021/01/13 08:43:45 UTC

[GitHub] [incubator-dolphinscheduler] Slivery1 opened a new issue #4438: [Bug][Server] There is something wrong with the spliced sentences in Mr task

Slivery1 opened a new issue #4438:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4438


   
   Describe the bug
   When DS assembles the command for an MR task, the parameters are placed in the wrong order, so the final command cannot be executed.
   
   To Reproduce
   Steps to reproduce the behavior, for example:
   1. Create an MR task
   2. For the main class and the jar package, select a commonly used WordCount program prepared in advance
   3. In the command line parameters field, enter /dolphinscheduler/dolphinscheduler/resources/word.txt /dolphinscheduler/dolphinscheduler/resources/out
   4. Save and execute
   
   Expected behavior
   The generated MR command should be `hadoop jar wordCount_MR-1.0-SNAPSHOT.jar pers.jun.WordcountDriver /dolphinscheduler/dolphinscheduler/resources/word.txt /dolphinscheduler/dolphinscheduler/resources/out -D mapreduce.job.queuename=default`
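   
   The two possible orderings can be sketched roughly as follows. This is only an illustration: `MrCommandSketch` and `buildCommand` are hypothetical names and not DolphinScheduler code, and the options-first variant is an assumption about what DS generates (the screenshots show the actual command).
   
   ```java
   import java.util.ArrayList;
   import java.util.Arrays;
   import java.util.List;
   
   // Hypothetical helper for illustration only; this is not DolphinScheduler's code.
   final class MrCommandSketch {
   
       /**
        * Joins the parts of a "hadoop jar" command. When optionsFirst is true the
        * extra options go right after the main class; otherwise they go last.
        */
       static String buildCommand(String jar, String mainClass, List<String> appArgs,
                                  List<String> options, boolean optionsFirst) {
           List<String> parts = new ArrayList<>();
           parts.add("hadoop");
           parts.add("jar");
           parts.add(jar);
           parts.add(mainClass);
           if (optionsFirst) {
               parts.addAll(options);
               parts.addAll(appArgs);
           } else {
               parts.addAll(appArgs);
               parts.addAll(options);
           }
           return String.join(" ", parts);
       }
   
       public static void main(String[] args) {
           String jar = "wordCount_MR-1.0-SNAPSHOT.jar";
           String mainClass = "pers.jun.WordcountDriver";
           List<String> appArgs = Arrays.asList(
                   "/dolphinscheduler/dolphinscheduler/resources/word.txt",
                   "/dolphinscheduler/dolphinscheduler/resources/out");
           List<String> options = Arrays.asList("-D", "mapreduce.job.queuename=default");
   
           // Options placed before the application arguments (assumed current output)
           System.out.println(buildCommand(jar, mainClass, appArgs, options, true));
           // Options placed after the application arguments (the expected command)
           System.out.println(buildCommand(jar, mainClass, appArgs, options, false));
       }
   }
   ```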
   
   
   Screenshots
   ![image](https://user-images.githubusercontent.com/30253711/104427239-facf4680-55bd-11eb-9f1f-ab21dee09bd3.png)
   ![image](https://user-images.githubusercontent.com/30253711/104427431-30742f80-55be-11eb-8368-235e4b3f07a2.png)
   ![image](https://user-images.githubusercontent.com/30253711/104427529-4f72c180-55be-11eb-8cb0-524e78d101eb.png)
   
   
   
   Which version of Dolphin Scheduler:
    -[1.3.4-release]
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-dolphinscheduler] zhuangchong commented on issue #4438: [Bug][Server] There is something wrong with the spliced sentences in Mr task

Posted by GitBox <gi...@apache.org>.
zhuangchong commented on issue #4438:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4438#issuecomment-759424400


   I can fix it.





[GitHub] [incubator-dolphinscheduler] Slivery1 closed issue #4438: [Bug][Server] There is something wrong with the spliced sentences in Mr task

Posted by GitBox <gi...@apache.org>.
Slivery1 closed issue #4438:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4438


   





[GitHub] [incubator-dolphinscheduler] chengshiwen commented on issue #4438: [Bug][Server] There is something wrong with the spliced sentences in Mr task

Posted by GitBox <gi...@apache.org>.
chengshiwen commented on issue #4438:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4438#issuecomment-759433321


   @Slivery1 
   Do you use the `GenericOptionsParser` to handle generic Hadoop command-line options? Something like the following?
   ```
   public static void main(String[] args) throws Exception {
     Configuration conf = new Configuration();
     // Parse the generic Hadoop options (e.g. -D key=value) and apply them to conf
     GenericOptionsParser optionParser = new GenericOptionsParser(conf, args);
     // The arguments left over are the application's own: <in> <out> [-skip file]
     String[] remainingArgs = optionParser.getRemainingArgs();
     if ((remainingArgs.length != 2) && (remainingArgs.length != 4)) {
       System.err.println("Usage: wordcount <in> <out> [-skip skipPatternFile]");
       System.exit(2);
     }
     Job job = Job.getInstance(conf, "word count");
     job.setJarByClass(WordCount2.class);
     job.setMapperClass(TokenizerMapper.class);
     job.setCombinerClass(IntSumReducer.class);
     job.setReducerClass(IntSumReducer.class);
     job.setOutputKeyClass(Text.class);
     job.setOutputValueClass(IntWritable.class);
   
     // Pull out "-skip <file>"; everything else is kept as a positional path
     List<String> otherArgs = new ArrayList<String>();
     for (int i=0; i < remainingArgs.length; ++i) {
       if ("-skip".equals(remainingArgs[i])) {
         job.addCacheFile(new Path(remainingArgs[++i]).toUri());
         job.getConfiguration().setBoolean("wordcount.skip.patterns", true);
       } else {
         otherArgs.add(remainingArgs[i]);
       }
     }
     // The first positional argument is the input path, the second the output path
     FileInputFormat.addInputPath(job, new Path(otherArgs.get(0)));
     FileOutputFormat.setOutputPath(job, new Path(otherArgs.get(1)));
   
     System.exit(job.waitForCompletion(true) ? 0 : 1);
   }
   ```
   





[GitHub] [incubator-dolphinscheduler] chengshiwen commented on issue #4438: [Bug][Server] There is something wrong with the spliced sentences in Mr task

Posted by GitBox <gi...@apache.org>.
chengshiwen commented on issue #4438:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4438#issuecomment-759457064


   Confirmed by source code:
   1. Running command `bin/hadoop jar wordcount.jar org.apache.hadoop.examples.WordCount [GENERIC_OPTIONS] [COMMAND_OPTIONS]` will execute `"$JAVA" $JAVA_HEAP_MAX $HADOOP_OPTS $CLASS "$@"` with `CLASS=org.apache.hadoop.util.RunJar` ([source code](https://github.com/apache/hadoop/blob/branch-2.8.5/hadoop-common-project/hadoop-common/src/main/bin/hadoop#L166))
   2. The class `org.apache.hadoop.util.RunJar` parses the args with usage `"RunJar jarFile [mainClass] args..."` ([source code](https://github.com/apache/hadoop/blob/branch-2.8.5/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/RunJar.java#L156))
   3. The example `WordCount` will handle generic Hadoop command-line options ([source code](https://github.com/apache/hadoop/blob/branch-2.8.5/hadoop-mapreduce-project/hadoop-mapreduce-examples/src/main/java/org/apache/hadoop/examples/WordCount.java#L68))
   4. The class `GenericOptionsParser` will parse the options and set the configuration ([source code](https://github.com/apache/hadoop/blob/branch-2.8.5/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/GenericOptionsParser.java))
   
   So, the `[GENERIC_OPTIONS]` and `[COMMAND_OPTIONS]` can be swapped in position and order if you use the `GenericOptionsParser`
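   
   Steps 1 and 2 above can be pictured with a small stand-alone sketch. It does not call Hadoop: `RunJarArgsSketch` and `argsForMainClass` are hypothetical names that only mimic the `RunJar jarFile [mainClass] args...` usage string, under the assumption that the main class is given on the command line rather than in the jar manifest.
   
   ```java
   import java.util.Arrays;
   
   // Hypothetical stand-in that mimics the "RunJar jarFile [mainClass] args..." usage.
   final class RunJarArgsSketch {
   
       /** Returns the arguments that would be handed on to the main class unchanged. */
       static String[] argsForMainClass(String[] runJarArgs) {
           if (runJarArgs.length < 2) {
               throw new IllegalArgumentException("Usage: RunJar jarFile [mainClass] args...");
           }
           // Index 0 is the jar file and index 1 the main class (assumed present);
           // the rest is passed through without any option parsing.
           return Arrays.copyOfRange(runJarArgs, 2, runJarArgs.length);
       }
   
       public static void main(String[] args) {
           String[] runJarArgs = {
               "wordcount.jar", "org.apache.hadoop.examples.WordCount",
               "-D", "mapreduce.job.queuename=default", "input", "output"
           };
           // prints [-D, mapreduce.job.queuename=default, input, output]
           System.out.println(Arrays.toString(argsForMainClass(runJarArgs)));
       }
   }
   ```
   
   In this picture the generic options reach the application's `main` as ordinary strings, so it is up to the application to interpret them.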





[GitHub] [incubator-dolphinscheduler] chengshiwen edited a comment on issue #4438: [Bug][Server] There is something wrong with the spliced sentences in Mr task

Posted by GitBox <gi...@apache.org>.
chengshiwen edited a comment on issue #4438:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4438#issuecomment-759457064


   Confirmed by source code:
   1. Running command `bin/hadoop jar wordcount.jar org.apache.hadoop.examples.WordCount [GENERIC_OPTIONS] [COMMAND_OPTIONS]` will execute `"$JAVA" $JAVA_HEAP_MAX $HADOOP_OPTS $CLASS "$@"` with `CLASS=org.apache.hadoop.util.RunJar` ([source code](https://github.com/apache/hadoop/blob/branch-2.8.5/hadoop-common-project/hadoop-common/src/main/bin/hadoop#L166))
   2. The class `org.apache.hadoop.util.RunJar` parses the args with usage `"RunJar jarFile [mainClass] args..."` ([source code](https://github.com/apache/hadoop/blob/branch-2.8.5/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/RunJar.java#L156))
   3. The example `WordCount` will handle generic Hadoop command-line options ([source code](https://github.com/apache/hadoop/blob/branch-2.8.5/hadoop-mapreduce-project/hadoop-mapreduce-examples/src/main/java/org/apache/hadoop/examples/WordCount.java#L68))
   4. The class `GenericOptionsParser` will parse the options and set the configuration ([source code](https://github.com/apache/hadoop/blob/branch-2.8.5/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/GenericOptionsParser.java))
   
   So, the `[GENERIC_OPTIONS]` and `[COMMAND_OPTIONS]` can be swapped in position and order if you use the `GenericOptionsParser`. Otherwise, this `[GENERIC_OPTIONS]` will not take effect. In other words, this issue depends on your implementation.
   
   **Conclusion**: This issue is not a bug.
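   
   To illustrate what "depends on your implementation" means, here is a sketch of a driver that reads its paths by position. It has no Hadoop dependency; `NaiveDriverSketch`, `inputPath` and `stripLeadingDOptions` are hypothetical names, and the stripping helper is a deliberately simplified stand-in that only handles leading `-D key=value` or `-Dkey=value` tokens. It is not a reimplementation of `GenericOptionsParser`.
   
   ```java
   import java.util.Arrays;
   
   // Hypothetical sketch: a driver that does no option parsing of its own.
   final class NaiveDriverSketch {
   
       /** Without any option parsing, the first argument is taken as the input path. */
       static String inputPath(String[] args) {
           return args[0];
       }
   
       /** Simplified stand-in: drops leading "-D key=value" or "-Dkey=value" tokens. */
       static String[] stripLeadingDOptions(String[] args) {
           int i = 0;
           while (i < args.length && args[i].startsWith("-D")) {
               // A bare "-D" is followed by a separate key=value token
               i += args[i].equals("-D") ? 2 : 1;
           }
           return Arrays.copyOfRange(args, Math.min(i, args.length), args.length);
       }
   
       public static void main(String[] args) {
           String[] received = {"-D", "mapreduce.job.queuename=default", "/in", "/out"};
           // The naive driver mistakes the option for its input path
           System.out.println(inputPath(received)); // prints -D
           // After the leading options are consumed, the positions line up again
           System.out.println(inputPath(stripLeadingDOptions(received))); // prints /in
       }
   }
   ```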





[GitHub] [incubator-dolphinscheduler] chengshiwen commented on issue #4438: [Bug][Server] There is something wrong with the spliced sentences in Mr task

Posted by GitBox <gi...@apache.org>.
chengshiwen commented on issue #4438:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4438#issuecomment-759424882


   @Slivery1 
   As I recall, the `hadoop jar` or `yarn jar` command should be
   ```
   hadoop jar <jar> [mainClass] [GENERIC_OPTIONS] args...
   yarn jar <jar> [mainClass] [GENERIC_OPTIONS] args...
   ```
   Refs:
   - [hadoop commands](http://hadoop.apache.org/docs/r2.8.5/hadoop-project-dist/hadoop-common/CommandsManual.html)
   - [mapreduce tutorial](http://hadoop.apache.org/docs/r2.8.5/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html)
   ```
   hadoop [--config confdir] [--loglevel loglevel] [COMMAND] [GENERIC_OPTIONS] [COMMAND_OPTIONS]
   hadoop jar <jar> [mainClass] args...
   ```
   Examples:
   ```
   bin/hadoop jar hadoop-mapreduce-examples-<ver>.jar wordcount -files cachefile.txt -libjars mylib.jar -archives myarchive.zip input output
   ```
   
   Let me check it and fix it if there is a problem.





[GitHub] [incubator-dolphinscheduler] chengshiwen edited a comment on issue #4438: [Bug][Server] There is something wrong with the spliced sentences in Mr task

Posted by GitBox <gi...@apache.org>.
chengshiwen edited a comment on issue #4438:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4438#issuecomment-759424882


   @Slivery1 
   As I recall, the `hadoop jar` or `yarn jar` command should be
   ```
   hadoop jar <jar> [mainClass] [GENERIC_OPTIONS] args...
   yarn jar <jar> [mainClass] [GENERIC_OPTIONS] args...
   ```
   Refs:
   - [hadoop commands](http://hadoop.apache.org/docs/r2.8.5/hadoop-project-dist/hadoop-common/CommandsManual.html)
   - [mapreduce tutorial](http://hadoop.apache.org/docs/r2.8.5/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html)
   ```
   hadoop [--config confdir] [--loglevel loglevel] [COMMAND] [GENERIC_OPTIONS] [COMMAND_OPTIONS]
   hadoop jar <jar> [mainClass] args...
   ```
   Examples:
   ```
   bin/hadoop jar hadoop-mapreduce-examples-<ver>.jar wordcount -files cachefile.txt -libjars mylib.jar -archives myarchive.zip input output
   bin/hadoop jar wc.jar WordCount2 -Dwordcount.case.sensitive=false /user/joe/wordcount/input /user/joe/wordcount/output -skip /user/joe/wordcount/patterns.txt
   ```
   
   Let me check it and fix it if there is a problem.





[GitHub] [incubator-dolphinscheduler] Slivery1 commented on issue #4438: [Bug][Server] There is something wrong with the spliced sentences in Mr task

Posted by GitBox <gi...@apache.org>.
Slivery1 commented on issue #4438:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4438#issuecomment-759843844


   Thank you!





[GitHub] [incubator-dolphinscheduler] chengshiwen edited a comment on issue #4438: [Bug][Server] There is something wrong with the spliced sentences in Mr task

Posted by GitBox <gi...@apache.org>.
chengshiwen edited a comment on issue #4438:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4438#issuecomment-759433321


   @Slivery1 
   Do you use the `GenericOptionsParser` to handle generic Hadoop command-line options? Something like the following?
   An example command is
   ```
   bin/hadoop jar wc.jar WordCount2 -Dwordcount.case.sensitive=false /user/joe/wordcount/input /user/joe/wordcount/output -skip /user/joe/wordcount/patterns.txt
   ```
   ```
   public static void main(String[] args) throws Exception {
     Configuration conf = new Configuration();
     // Parse the generic Hadoop options (e.g. -D key=value) and apply them to conf
     GenericOptionsParser optionParser = new GenericOptionsParser(conf, args);
     // The arguments left over are the application's own: <in> <out> [-skip file]
     String[] remainingArgs = optionParser.getRemainingArgs();
     if ((remainingArgs.length != 2) && (remainingArgs.length != 4)) {
       System.err.println("Usage: wordcount <in> <out> [-skip skipPatternFile]");
       System.exit(2);
     }
     Job job = Job.getInstance(conf, "word count");
     job.setJarByClass(WordCount2.class);
     job.setMapperClass(TokenizerMapper.class);
     job.setCombinerClass(IntSumReducer.class);
     job.setReducerClass(IntSumReducer.class);
     job.setOutputKeyClass(Text.class);
     job.setOutputValueClass(IntWritable.class);
   
     // Pull out "-skip <file>"; everything else is kept as a positional path
     List<String> otherArgs = new ArrayList<String>();
     for (int i=0; i < remainingArgs.length; ++i) {
       if ("-skip".equals(remainingArgs[i])) {
         job.addCacheFile(new Path(remainingArgs[++i]).toUri());
         job.getConfiguration().setBoolean("wordcount.skip.patterns", true);
       } else {
         otherArgs.add(remainingArgs[i]);
       }
     }
     // The first positional argument is the input path, the second the output path
     FileInputFormat.addInputPath(job, new Path(otherArgs.get(0)));
     FileOutputFormat.setOutputPath(job, new Path(otherArgs.get(1)));
   
     System.exit(job.waitForCompletion(true) ? 0 : 1);
   }
   ```
   





[GitHub] [incubator-dolphinscheduler] zhuangchong removed a comment on issue #4438: [Bug][Server] There is something wrong with the spliced sentences in Mr task

Posted by GitBox <gi...@apache.org>.
zhuangchong removed a comment on issue #4438:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4438#issuecomment-759424400


   I can fix it.





[GitHub] [incubator-dolphinscheduler] chengshiwen edited a comment on issue #4438: [Bug][Server] There is something wrong with the spliced sentences in Mr task

Posted by GitBox <gi...@apache.org>.
chengshiwen edited a comment on issue #4438:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4438#issuecomment-759424882


   @Slivery1 
   As I recall, the `hadoop jar` or `yarn jar` command should be
   ```
   hadoop jar <jar> [mainClass] [GENERIC_OPTIONS] args...
   yarn jar <jar> [mainClass] [GENERIC_OPTIONS] args...
   ```
   Refs:
   - [hadoop commands](http://hadoop.apache.org/docs/r2.8.5/hadoop-project-dist/hadoop-common/CommandsManual.html)
   - [mapreduce tutorial](http://hadoop.apache.org/docs/r2.8.5/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html)
   ```
   hadoop [--config confdir] [--loglevel loglevel] [COMMAND] [GENERIC_OPTIONS] [COMMAND_OPTIONS]
   hadoop jar <jar> [mainClass] args...
   ```
   Examples:
   ```
   bin/hadoop jar hadoop-mapreduce-examples-<ver>.jar wordcount -files cachefile.txt -libjars mylib.jar -archives myarchive.zip input output
   bin/hadoop jar wc.jar WordCount2 -Dwordcount.case.sensitive=false /user/joe/wordcount/input /user/joe/wordcount/output -skip /user/joe/wordcount/patterns.txt
   ```
   
   Let me check it and fix it if there is a problem.





[GitHub] [incubator-dolphinscheduler] chengshiwen edited a comment on issue #4438: [Bug][Server] There is something wrong with the spliced sentences in Mr task

Posted by GitBox <gi...@apache.org>.
chengshiwen edited a comment on issue #4438:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4438#issuecomment-759457064


   Confirmed by source code:
   1. Running command `bin/hadoop jar wordcount.jar org.apache.hadoop.examples.WordCount [GENERIC_OPTIONS] [COMMAND_OPTIONS]` will execute `"$JAVA" $JAVA_HEAP_MAX $HADOOP_OPTS $CLASS "$@"` with `CLASS=org.apache.hadoop.util.RunJar` ([source code](https://github.com/apache/hadoop/blob/branch-2.8.5/hadoop-common-project/hadoop-common/src/main/bin/hadoop#L166))
   2. The class `org.apache.hadoop.util.RunJar` parses the args with usage `"RunJar jarFile [mainClass] args..."` ([source code](https://github.com/apache/hadoop/blob/branch-2.8.5/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/RunJar.java#L156))
   3. The example `WordCount` will handle generic Hadoop command-line options ([source code](https://github.com/apache/hadoop/blob/branch-2.8.5/hadoop-mapreduce-project/hadoop-mapreduce-examples/src/main/java/org/apache/hadoop/examples/WordCount.java#L68))
   4. The class `GenericOptionsParser` will parse the options and set the configuration ([source code](https://github.com/apache/hadoop/blob/branch-2.8.5/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/GenericOptionsParser.java))
   
   So, the `[GENERIC_OPTIONS]` and `[COMMAND_OPTIONS]` can be swapped in position and order if you use the `GenericOptionsParser`. Otherwise, this `[GENERIC_OPTIONS]` will not work. In other words, this issue depends on your implementation.
   
   **Conclusion**: This issue is not a bug.





[GitHub] [incubator-dolphinscheduler] zhuangchong commented on issue #4438: [Bug][Server] There is something wrong with the spliced sentences in Mr task

Posted by GitBox <gi...@apache.org>.
zhuangchong commented on issue #4438:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4438#issuecomment-759427204


   > @Slivery1
   > As I recall, the `hadoop jar` or `yarn jar` command should be
   > 
   > ```
   > hadoop jar <jar> [mainClass] [GENERIC_OPTIONS] args...
   > yarn jar <jar> [mainClass] [GENERIC_OPTIONS] args...
   > ```
   > 
   > Refs:
   > 
   > * [hadoop commands](http://hadoop.apache.org/docs/r2.8.5/hadoop-project-dist/hadoop-common/CommandsManual.html)
   > * [mapreduce tutorial](http://hadoop.apache.org/docs/r2.8.5/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html)
   > 
   > ```
   > hadoop [--config confdir] [--loglevel loglevel] [COMMAND] [GENERIC_OPTIONS] [COMMAND_OPTIONS]
   > hadoop jar <jar> [mainClass] args...
   > ```
   > 
   > Examples:
   > 
   > ```
   > bin/hadoop jar hadoop-mapreduce-examples-<ver>.jar wordcount -files cachefile.txt -libjars mylib.jar -archives myarchive.zip input output
   > ```
   > 
   > Let me check it and fix it if there is a problem.
   
   👍





[GitHub] [incubator-dolphinscheduler] chengshiwen edited a comment on issue #4438: [Bug][Server] There is something wrong with the spliced sentences in Mr task

Posted by GitBox <gi...@apache.org>.
chengshiwen edited a comment on issue #4438:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4438#issuecomment-759457064


   Confirmed by source code:
   1. Running command `bin/hadoop jar wordcount.jar org.apache.hadoop.examples.WordCount [GENERIC_OPTIONS] [COMMAND_OPTIONS]` will execute `"$JAVA" $JAVA_HEAP_MAX $HADOOP_OPTS $CLASS "$@"` with `CLASS=org.apache.hadoop.util.RunJar` ([source code](https://github.com/apache/hadoop/blob/branch-2.8.5/hadoop-common-project/hadoop-common/src/main/bin/hadoop#L166))
   2. The class `org.apache.hadoop.util.RunJar` parses the args with usage `"RunJar jarFile [mainClass] args..."`, but does not parse the generic options ([source code](https://github.com/apache/hadoop/blob/branch-2.8.5/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/RunJar.java#L156))
   3. The example `WordCount` will handle generic Hadoop command-line options ([source code](https://github.com/apache/hadoop/blob/branch-2.8.5/hadoop-mapreduce-project/hadoop-mapreduce-examples/src/main/java/org/apache/hadoop/examples/WordCount.java#L68))
   4. The class `GenericOptionsParser` will parse the options and set the configuration ([source code](https://github.com/apache/hadoop/blob/branch-2.8.5/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/GenericOptionsParser.java))
   
   So, the `[GENERIC_OPTIONS]` and `[COMMAND_OPTIONS]` can be swapped in position and order if you use the `GenericOptionsParser`. Otherwise, this `[GENERIC_OPTIONS]` will not take effect. In other words, this issue depends on your implementation.
   
   **Conclusion**: This issue is not a bug.

