Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/02/12 09:03:31 UTC

[GitHub] HyukjinKwon edited a comment on issue #23746: [SPARK-26761][SQL][R] Vectorized R gapply() implementation

URL: https://github.com/apache/spark/pull/23746#issuecomment-462671978
 
 
   > what about [e8982ca#diff-b11442485f6b77bf47b58b4747321638R267](https://github.com/apache/spark/commit/e8982ca7ad94e98d907babf2d6f1068b7cd064c6#diff-b11442485f6b77bf47b58b4747321638R267) can this be changed from file to stream also?
   
   Yes. It used a file there on purpose. If I use a file to transfer the data (instead of a socket), I can send it batch by batch in a streaming manner, but if I use a socket, I have to buffer all the batches and then send them at once due to an Arrow R API limitation (ARROW-4512).
   
   I used a file there because the existing protocol for R DataFrame -> Spark DataFrame was already using a file, but it looks like all the other protocols use sockets. So, I had to use a socket here to match the protocol, although I have to buffer all the batches due to the Arrow R API limitation.
   
   BTW, this limitation looks like it is going to be fixed in Arrow 0.14.0. As soon as it's fixed, we can easily match it to the Python side's vectorization, because I tried to follow the Python side as closely as I could.
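
To make the trade-off above concrete, here is a minimal Python sketch of the two transfer strategies. It is not the SparkR or Arrow code from this PR: the length-prefixed framing and the names `make_batches`, `write_streaming`, `write_buffered`, and `read_batches` are hypothetical and only illustrate the idea. The streaming writer holds one batch at a time, while the buffered writer has to materialize every batch before a single send.

```python
import io
import struct
from typing import BinaryIO, Iterable, Iterator, List


def make_batches(n: int) -> Iterator[bytes]:
    """Hypothetical stand-in for a lazily produced sequence of record batches."""
    for i in range(n):
        yield f"batch-{i}".encode("utf-8")


def write_streaming(sink: BinaryIO, batches: Iterable[bytes]) -> int:
    """Write each batch as soon as it is produced.

    Returns the peak number of batches held in memory at once.
    """
    peak = 0
    for batch in batches:
        # 4-byte big-endian length prefix, then the payload (assumed framing).
        sink.write(struct.pack(">i", len(batch)))
        sink.write(batch)
        peak = max(peak, 1)
    return peak


def write_buffered(sink: BinaryIO, batches: Iterable[bytes]) -> int:
    """Collect every batch first, then send the whole payload in one write.

    Returns the peak number of batches held in memory at once.
    """
    buffered = list(batches)  # all batches are materialized here
    payload = b"".join(struct.pack(">i", len(b)) + b for b in buffered)
    sink.write(payload)
    return len(buffered)


def read_batches(source: BinaryIO) -> List[bytes]:
    """Read length-prefixed batches until the source is exhausted."""
    out: List[bytes] = []
    while True:
        header = source.read(4)
        if len(header) < 4:
            break
        (size,) = struct.unpack(">i", header)
        out.append(source.read(size))
    return out
```

Both writers put the same bytes on the wire, so the reader does not need to know which one was used. The difference is only in how much the sender has to hold in memory before anything is sent.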
