Posted to jira@arrow.apache.org by "Joris Van den Bossche (Jira)" <ji...@apache.org> on 2022/12/07 14:07:00 UTC

[jira] [Commented] (ARROW-14799) [C++] Adding tabular pretty printing of Table / RecordBatch

    [ https://issues.apache.org/jira/browse/ARROW-14799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17644352#comment-17644352 ] 

Joris Van den Bossche commented on ARROW-14799:
-----------------------------------------------

If we tackle this in C++, it might be worth checking out DuckDB's implementation. If we decide to tackle this in the bindings, then for Python it might be worth checking out ibis' implementation (it uses rich; they recently revamped their table representation, including support for nested columns).

> [C++] Adding tabular pretty printing of Table / RecordBatch
> -----------------------------------------------------------
>
>                 Key: ARROW-14799
>                 URL: https://issues.apache.org/jira/browse/ARROW-14799
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Joris Van den Bossche
>            Priority: Major
>
> It would be nice to show a "preview" (e.g. the first and last N rows) of a Table or RecordBatch in a traditional tabular form (like pandas DataFrames or R data.frames / tibbles have, or in a format that resembles markdown tables).
> This could also be added in the bindings, but we could also do it on the C++ level to benefit multiple bindings at once.
> Based on a quick search, there is https://github.com/p-ranav/tabulate which could be vendored (it has a single-include version).
> I suppose that nested data types could represent a challenge on how to include those in a tabular format, though.
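
To make the idea in the description concrete, here is a minimal sketch of a head/tail preview rendered in a markdown-like layout. It deliberately does not use the Arrow C++ API: the `Column` struct and the `PreviewTable` and `AppendRow` functions are hypothetical names invented for this illustration, and it assumes every cell has already been converted to a string and that all columns have the same length.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a table column (not an Arrow type):
// a name plus values that were already rendered as strings.
struct Column {
  std::string name;
  std::vector<std::string> values;
};

// Append one "| a | b |" style line, right-padding each cell to its width.
// Note: size() counts bytes, so non-ASCII text may not align on screen.
static void AppendRow(std::ostringstream& out,
                      const std::vector<std::string>& cells,
                      const std::vector<std::size_t>& widths) {
  for (std::size_t c = 0; c < cells.size(); ++c) {
    out << "| " << cells[c]
        << std::string(widths[c] - cells[c].size(), ' ') << " ";
  }
  out << "|\n";
}

// Render at most `head` leading and `tail` trailing rows; skipped rows are
// replaced by a single "..." row. Assumes all columns have equal length.
std::string PreviewTable(const std::vector<Column>& columns,
                         std::size_t head, std::size_t tail) {
  if (columns.empty()) return "";
  const std::size_t num_rows = columns[0].values.size();
  const bool truncated = num_rows > head + tail;
  const std::size_t head_end = truncated ? head : num_rows;
  const std::size_t tail_begin = truncated ? num_rows - tail : num_rows;

  // Column width = widest of the name, the shown cells, and "..." / "---".
  std::vector<std::size_t> widths;
  for (const Column& col : columns) {
    std::size_t w = std::max<std::size_t>(col.name.size(), 3);
    for (std::size_t r = 0; r < head_end; ++r)
      w = std::max(w, col.values[r].size());
    for (std::size_t r = tail_begin; r < num_rows; ++r)
      w = std::max(w, col.values[r].size());
    widths.push_back(w);
  }

  std::ostringstream out;
  std::vector<std::string> cells(columns.size());

  // Header line followed by a dashed separator line.
  for (std::size_t c = 0; c < columns.size(); ++c) cells[c] = columns[c].name;
  AppendRow(out, cells, widths);
  for (std::size_t c = 0; c < columns.size(); ++c)
    cells[c] = std::string(widths[c], '-');
  AppendRow(out, cells, widths);

  // Leading rows.
  for (std::size_t r = 0; r < head_end; ++r) {
    for (std::size_t c = 0; c < columns.size(); ++c)
      cells[c] = columns[c].values[r];
    AppendRow(out, cells, widths);
  }
  // Marker for the rows that were left out.
  if (truncated) {
    for (std::size_t c = 0; c < columns.size(); ++c) cells[c] = "...";
    AppendRow(out, cells, widths);
  }
  // Trailing rows.
  for (std::size_t r = tail_begin; r < num_rows; ++r) {
    for (std::size_t c = 0; c < columns.size(); ++c)
      cells[c] = columns[c].values[r];
    AppendRow(out, cells, widths);
  }
  return out.str();
}
```

Calling `PreviewTable(columns, 5, 5)` on a long table would print the header, the first five rows, a `...` row, and the last five rows. Nested types are not addressed here: in this sketch they would have to be flattened to a string per cell before rendering, which is exactly the open question raised in the description.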



--
This message was sent by Atlassian Jira
(v8.20.10#820010)