Posted to issues@iceberg.apache.org by "Fokko (via GitHub)" <gi...@apache.org> on 2023/03/31 11:18:59 UTC

[GitHub] [iceberg] Fokko opened a new issue, #7256: Python: Write Parquet file using PyArrow

Fokko opened a new issue, #7256:
URL: https://github.com/apache/iceberg/issues/7256

   ### Feature Request / Improvement
   
   When writing data from Python, we want to rely on external libraries to write the actual data files. Many libraries already do this well, and we don't want to reinvent the wheel.
   
   I think we can mirror the read path: where reading uses `pq.read_table`, writing can use `pq.write_table` and pass a `metadata_collector` to collect the statistics that need to be stored in the `ManifestEntry`.
   
   To keep the PR small, I would like to start with unpartitioned tables only, to avoid having to construct the right path.
   
   ### Query engine
   
   None


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] Fokko commented on issue #7256: Python: Write Parquet file using PyArrow

Posted by "Fokko (via GitHub)" <gi...@apache.org>.
Fokko commented on issue #7256:
URL: https://github.com/apache/iceberg/issues/7256#issuecomment-1738546750

   Not stale :)




Re: [I] Python: Write Parquet file using PyArrow [iceberg]

Posted by "Fokko (via GitHub)" <gi...@apache.org>.
Fokko closed issue #7256: Python: Write Parquet file using PyArrow
URL: https://github.com/apache/iceberg/issues/7256




Re: [I] Python: Write Parquet file using PyArrow [iceberg]

Posted by "Fokko (via GitHub)" <gi...@apache.org>.
Fokko commented on issue #7256:
URL: https://github.com/apache/iceberg/issues/7256#issuecomment-1742730066

   I'll close this one. We're migrating to the `iceberg-python` repository, but I think this one is in :)




[GitHub] [iceberg] github-actions[bot] commented on issue #7256: Python: Write Parquet file using PyArrow

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on issue #7256:
URL: https://github.com/apache/iceberg/issues/7256#issuecomment-1738265588

   This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in the next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.




[GitHub] [iceberg] ggbaro commented on issue #7256: Python: Write Parquet file using PyArrow

Posted by "ggbaro (via GitHub)" <gi...@apache.org>.
ggbaro commented on issue #7256:
URL: https://github.com/apache/iceberg/issues/7256#issuecomment-1491954019

   @dariocurr @emanueledomingo @giacomorebecchi @enryls

