Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/02/10 02:21:27 UTC

[GitHub] [druid] xiaohui0318 opened a new issue #9342: Use spark to load the Druid

xiaohui0318 opened a new issue #9342: Use spark to load the Druid
URL: https://github.com/apache/druid/issues/9342
 
 
   My company is planning to adopt Druid, and I have been studying how it stores and loads data. I found that batch data loading uses MapReduce (MR). I later came across the spark-druid-batch project, but it no longer appears to be updated. Why has the community not developed a Spark version of the batch loading code? What considerations led to that: is the MR-based loader already performing well enough, or is Spark considered unstable? Any advice or comments would be appreciated.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org

[GitHub] [druid] sascha-coenen commented on issue #9342: Use spark to load the Druid

Posted by GitBox <gi...@apache.org>.

sascha-coenen commented on issue #9342: Use spark to load the Druid
URL: https://github.com/apache/druid/issues/9342#issuecomment-584773323
 
 
   Hi,
   I'm just another Druid user, so I can't comment on the "why" part of your question, but you may be interested to know that the two most recent Druid releases focused heavily on a new native ingestion mechanism, which might eventually become preferable to both the MR-based and the Spark-based ingestion methods.
   You can find it here: https://druid.apache.org/docs/latest/ingestion/native-batch.html
   The index_parallel task, together with a new type of Druid node named "indexer"
   (https://staging-druid.imply.io/docs/design/indexer.html), is the most recent option for ingesting data. This new native setup is still labelled experimental, but it might soon be the simplest way to ingest data.
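
   For readers who have not seen a native batch task, here is a rough sketch of what an index_parallel ingestion spec might look like, built as a Python dict and printed as JSON. The field layout follows the native batch documentation linked above as I understand it. The helper name `build_index_parallel_spec`, the datasource name, the file paths, and the dimension names are all made up for illustration, so check the spec against the docs for your Druid version before using it.

```python
import json


def build_index_parallel_spec(datasource, base_dir, file_filter, dimensions,
                              timestamp_column="timestamp", max_subtasks=2):
    """Hypothetical helper: return a dict shaped like an index_parallel task spec."""
    return {
        "type": "index_parallel",
        "spec": {
            "dataSchema": {
                "dataSource": datasource,
                "timestampSpec": {"column": timestamp_column, "format": "auto"},
                "dimensionsSpec": {"dimensions": list(dimensions)},
                "granularitySpec": {
                    "type": "uniform",
                    "segmentGranularity": "DAY",
                    "queryGranularity": "NONE",
                    "rollup": False,
                },
            },
            "ioConfig": {
                "type": "index_parallel",
                # Assumed input: newline-delimited JSON files on local disk.
                "inputSource": {
                    "type": "local",
                    "baseDir": base_dir,
                    "filter": file_filter,
                },
                "inputFormat": {"type": "json"},
            },
            "tuningConfig": {
                "type": "index_parallel",
                # Upper bound on sub-tasks running at the same time.
                "maxNumConcurrentSubTasks": max_subtasks,
            },
        },
    }


# Example values are placeholders, not taken from the issue.
spec = build_index_parallel_spec(
    datasource="events",
    base_dir="/tmp/events",
    file_filter="*.json",
    dimensions=["country", "device"],
)
print(json.dumps(spec, indent=2))
```

   The printed JSON is the kind of document that would be submitted to the Overlord's task endpoint (`/druid/indexer/v1/task`); the sketch stops short of making that HTTP call so that it runs without a cluster.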


[GitHub] [druid] itaiy commented on issue #9342: Use spark to load the Druid

Posted by GitBox <gi...@apache.org>.

itaiy commented on issue #9342: Use spark to load the Druid
URL: https://github.com/apache/druid/issues/9342#issuecomment-591433732
 
 
   To follow up on the question and the reply above: in some cases, we'd still like a way to ingest data directly from Spark.
   A related conversation took place in Druid's user Google group and on the Slack channel, and it has now been consolidated on Druid's dev mailing list; see [here](https://lists.apache.org/thread.html/r8219a7be0583ae3d9a2303fa7f21872782cf0703812a410bb62acfef%40%3Cdev.druid.apache.org%3E).


[GitHub] [druid] xiaohui0318 closed issue #9342: Use spark to load the Druid

Posted by GitBox <gi...@apache.org>.

xiaohui0318 closed issue #9342: Use spark to load the Druid
URL: https://github.com/apache/druid/issues/9342
 
 
   
