Posted to dev@spark.apache.org by Rachana Srivastava <Ra...@markmonitor.com> on 2016/02/05 18:38:18 UTC

Spark process failing to receive data from the Kafka queue in yarn-client mode.

I am trying to run the following code in yarn-client mode, but I am getting the Slow ReadProcessor error shown below; the same code works just fine in local mode. Any pointer is really appreciated.

Code that receives data from the Kafka queue:
JavaPairReceiverInputDStream<String, String> messages = KafkaUtils.createStream(
    jssc, String.class, String.class, StringDecoder.class, StringDecoder.class,
    kafkaParams, kafkaTopicMap, StorageLevel.MEMORY_ONLY());

JavaDStream<String> lines = messages.map(new Function<Tuple2<String, String>, String>() {
  @Override
  public String call(Tuple2<String, String> tuple2) {
    LOG.info(" &&&&&&&&&&&&&&&&&&&& Input json stream data  " + tuple2._2());
    return tuple2._2();
  }
});
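For context, the surrounding setup looks roughly like the sketch below; the batch interval, ZooKeeper quorum, consumer group, and topic name are placeholders, not the actual job configuration.

import java.util.HashMap;
import java.util.Map;

import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

public class KafkaStreamJob {
  public static void main(String[] args) throws Exception {
    // The master ("yarn-client" vs. "local[*]") is supplied via spark-submit, not hard-coded here.
    SparkConf conf = new SparkConf().setAppName("KafkaStreamJob");
    JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

    // Receiver configuration; ZooKeeper quorum and consumer group are placeholders.
    Map<String, String> kafkaParams = new HashMap<String, String>();
    kafkaParams.put("zookeeper.connect", "zk-host:2181");
    kafkaParams.put("group.id", "test-consumer-group");

    // Topic name -> number of receiver threads; the topic name is a placeholder.
    Map<String, Integer> kafkaTopicMap = new HashMap<String, Integer>();
    kafkaTopicMap.put("test-topic", 1);

    // ... the createStream and map() calls from the snippet above go here ...

    jssc.start();
    jssc.awaitTermination();
  }
}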


Error Details:
2016-02-05 11:44:00 WARN DFSClient:975 - Slow ReadProcessor read fields took 30011ms (threshold=30000ms); ack: seqno: 1960 reply: 0 reply: 0 reply: 0 downstreamAckTimeNanos: 1227280, targets: [DatanodeInfoWithStorage[10.0.0.245:50010,DS-a55d9212-3771-4936-bbe7-02035e7de148,DISK], DatanodeInfoWithStorage[10.0.0.243:50010,DS-231b9915-c2e2-4392-b075-8a52ba1820ac,DISK], DatanodeInfoWithStorage[10.0.0.244:50010,DS-6b8b5814-7dd7-4315-847c-b73bd375af0e,DISK]]
2016-02-05 11:44:00 INFO BlockManager:59 - Removing RDD 1954
2016-02-05 11:44:00 INFO MapPartitionsRDD:59 - Removing RDD 1955 from persistence list