Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/01/19 16:00:41 UTC

[GitHub] [iceberg] RussellSpitzer commented on issue #3921: A BaseDataReader error always occurs when traversing a partitioned table

RussellSpitzer commented on issue #3921:
URL: https://github.com/apache/iceberg/issues/3921#issuecomment-1016613183


   Is the purpose here to accumulate all the records into an executor-side linked queue? I'm a little nervous about the direct manipulation of the iterators here, as well as the building of executor-specific memory constructs.
   
   If I were debugging this, the first thing I would try is a full collect of the DataFrame, to make sure the normal pathway works.
   
   ```scala
   ds.collect // or, if this is too large, ds.take(10)
   ```
   
   After that, I would probably try an implementation that doesn't manually touch the iterators and doesn't use a shared memory construct, something like:
   
   ```scala
   ds.foreachPartition{ p: Iterator[java.lang.Long] => 
     p.foreach( i => print(i)) 
   }
   ```
   
   Then I would add back in the memory construct.
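   
   As a rough sketch of that last step, assuming the construct is an executor-local `ConcurrentLinkedQueue`: the `ExecutorQueue` and `DebugSteps` objects and the `spark.range` stand-in dataset below are hypothetical, not taken from the issue, and the real `ds` would be the Iceberg table read. The iterator is still only consumed through `foreach`, never advanced by hand.
   
   ```scala
   import java.util.concurrent.ConcurrentLinkedQueue
   import org.apache.spark.sql.SparkSession
   
   // Hypothetical executor-local holder: a Scala object is initialized once per JVM,
   // so each executor gets its own queue.
   object ExecutorQueue {
     val queue = new ConcurrentLinkedQueue[java.lang.Long]()
   }
   
   object DebugSteps {
     def main(args: Array[String]): Unit = {
       val spark = SparkSession.builder()
         .master("local[2]")
         .appName("debug-steps")
         .getOrCreate()
   
       // Stand-in for the real table read; spark.range yields a Dataset[java.lang.Long].
       val ds = spark.range(0, 100)
   
       // Same shape as the foreachPartition example, but adding to the queue
       // instead of printing. The iterator is not touched manually.
       ds.foreachPartition { p: Iterator[java.lang.Long] =>
         p.foreach(i => ExecutorQueue.queue.add(i))
       }
   
       // Only meaningful in local mode, where driver and executor share one JVM.
       println(s"queued records: ${ExecutorQueue.queue.size()}")
   
       spark.stop()
     }
   }
   ```
   
   If the error only comes back at this step, that points at the memory construct rather than at the read path.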


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


