Posted to user@drill.apache.org by LittleCho <li...@littlecho.tw> on 2017/11/07 14:21:29 UTC
Question about how Drill optimizes the queries and splits the loads in
HDFS cluster?
Hello Sir,
I have been studying how to install Drill on the data nodes of a Hadoop
cluster. According to Drill's online documentation, we can install
Drill on each HDFS datanode and then point the connection setting of
the file storage plugin at the HDFS namenode to finish the setup.
Here is my question: since HDFS splits a file into several blocks
based on the block size setting, will the query also be split and
assigned to the Drill instance on each datanode? I would like to
know more about how Drill works in distributed mode with an HDFS
cluster. Thank you!
--
BR, LittleCho
Re: Question about how Drill optimizes the queries and splits the
loads in HDFS cluster?
Posted by Chun Chang <cc...@mapr.com>.
Yes, data locality is considered in deciding which drillbit gets to work on what.
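To make the idea of locality-aware assignment concrete, here is a toy sketch in Python. It is not Drill's actual planner or scheduling algorithm; the function name assign_blocks, the host names, and the block IDs are all made up for illustration. It only shows the general principle: each block of work goes to a drillbit running on a host that holds a replica of that block when one exists, and falls back to the least-loaded drillbit otherwise.

```python
def assign_blocks(block_replicas, drillbit_hosts):
    """Toy greedy assignment (hypothetical, not Drill's real algorithm).

    block_replicas: dict mapping block id -> list of hosts holding a replica
    drillbit_hosts: list of hosts that run a drillbit
    Returns a dict mapping block id -> chosen drillbit host.
    """
    load = {host: 0 for host in drillbit_hosts}
    assignment = {}
    for block, replica_hosts in block_replicas.items():
        # Prefer drillbits that share a host with one of the block's replicas.
        local = [h for h in replica_hosts if h in load]
        # If no drillbit is local to the block, any drillbit may read it remotely.
        candidates = local or list(load)
        # Break ties by current load, then by host name for determinism.
        chosen = min(candidates, key=lambda h: (load[h], h))
        assignment[block] = chosen
        load[chosen] += 1
    return assignment


# Hypothetical cluster: three drillbits, four blocks, one block with no local drillbit.
blocks = {
    "blk_0": ["node1", "node2"],
    "blk_1": ["node2", "node3"],
    "blk_2": ["node1", "node3"],
    "blk_3": ["node4"],  # node4 runs no drillbit, so this read is remote
}
drillbits = ["node1", "node2", "node3"]
plan = assign_blocks(blocks, drillbits)
for block, host in plan.items():
    print(block, "->", host)
```

In this sketch the first three blocks are each read by a drillbit on a host that stores a replica, while blk_3 has no local drillbit and is handed to whichever drillbit currently has the least work.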