Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/05/13 21:57:13 UTC

[GitHub] [iceberg] kbendick removed a comment on pull request #2577: Spark: Add read.locality.enabled to TableProperties to support disabl…

kbendick removed a comment on pull request #2577:
URL: https://github.com/apache/iceberg/pull/2577#issuecomment-840856930


   > I think this can be configured via `spark.locality.wait`. If you set it to zero, the scheduler will immediately give up looking for a data-local node. At least that's what I've done when reading from S3 with YARN (which is by definition not local).
   > 
   > ```
   > Number of milliseconds to wait to launch a data-local task before giving up and launching it on a less-local node.
   > The same wait will be used to step through multiple locality levels (process-local, node-local, rack-local and then any).
   > It is also possible to customize the waiting time for each level by setting spark.locality.wait.node, etc.
   > You should increase this setting if your tasks are long and see poor locality, but the default usually works well.
   > ```
   
   Given that you say it takes 30 seconds, that would align with the default value of `3000` (which appears to be in milliseconds).
   
   If it's possible to leave this as a Spark property, maybe it's not something we really need to define at the table level? I'm open to discussing that.
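   
   For reference, the approach described above could be sketched as a Spark configuration fragment. This is a sketch, not an Iceberg recommendation; the property names come from the Spark configuration docs, and the per-level overrides are optional:
   
   ```properties
   # spark-defaults.conf (sketch): disable locality waits entirely.
   # With a wait of 0, the scheduler launches tasks immediately on a
   # less-local node instead of waiting (default is 3s, i.e. 3000 ms).
   spark.locality.wait 0s
   
   # Optionally, each locality level can be tuned independently:
   # spark.locality.wait.process 0s
   # spark.locality.wait.node 0s
   # spark.locality.wait.rack 0s
   ```
   
   The same values can also be passed per job via `--conf spark.locality.wait=0s` on `spark-submit`, which avoids changing cluster-wide defaults.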


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org