Posted to dev@phoenix.apache.org by "Chaitanya (JIRA)" <ji...@apache.org> on 2017/05/25 12:59:04 UTC
[jira] [Commented] (PHOENIX-3887) Bulk export of large query result set
[ https://issues.apache.org/jira/browse/PHOENIX-3887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16024702#comment-16024702 ]
Chaitanya commented on PHOENIX-3887:
------------------------------------
This looks like it can be supported by an MR job very similar to org.apache.phoenix.mapreduce.CsvBulkLoadTool. I am ready to contribute to this feature. Any help regarding the structure of the code or the contribution guidelines would be appreciated.
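One piece such an export job would need, however the work is split, is turning result rows into CSV lines. Below is a minimal sketch in plain Java, assuming each worker (for example a mapper) already holds a java.sql.ResultSet for its share of the query. CsvRowFormatter and its methods are hypothetical names for illustration and are not part of Phoenix or CsvBulkLoadTool; the quoting follows the common RFC 4180 convention.

```java
import java.io.IOException;
import java.io.Writer;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not a Phoenix class): formats rows as CSV lines. */
final class CsvRowFormatter {

    private CsvRowFormatter() {}

    /** Quotes a field only if it contains a comma, quote, CR or LF; null becomes an empty field. */
    public static String escape(Object value) {
        if (value == null) {
            return "";
        }
        String s = value.toString();
        boolean needsQuotes = s.indexOf(',') >= 0 || s.indexOf('"') >= 0
                || s.indexOf('\n') >= 0 || s.indexOf('\r') >= 0;
        // Embedded double quotes are doubled, then the whole field is wrapped in quotes.
        return needsQuotes ? "\"" + s.replace("\"", "\"\"") + "\"" : s;
    }

    /** Joins the escaped fields with commas; no trailing newline. */
    public static String toCsvLine(List<?> fields) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(escape(fields.get(i)));
        }
        return sb.toString();
    }

    /** Streams every remaining row of rs to out, one CSV line per row; returns the row count. */
    public static long writeAll(ResultSet rs, Writer out) throws SQLException, IOException {
        ResultSetMetaData md = rs.getMetaData();
        int cols = md.getColumnCount();
        List<Object> fields = new ArrayList<>(cols);
        long rows = 0;
        while (rs.next()) {
            fields.clear();
            for (int c = 1; c <= cols; c++) { // JDBC columns are 1-indexed
                fields.add(rs.getObject(c));
            }
            out.write(toCsvLine(fields));
            out.write('\n');
            rows++;
        }
        return rows;
    }
}
```

For example, CsvRowFormatter.toCsvLine(Arrays.asList("a", null, "b,c")) yields a,,"b,c". How the query is split across workers and where the output lands (local filesystem or HDFS) is left open here.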
> Bulk export of large query result set
> -------------------------------------
>
> Key: PHOENIX-3887
> URL: https://issues.apache.org/jira/browse/PHOENIX-3887
> Project: Phoenix
> Issue Type: Improvement
> Affects Versions: 4.8.0
> Reporter: Chaitanya
> Priority: Minor
> Labels: beginner
> Original Estimate: 504h
> Remaining Estimate: 504h
>
> Query results with a large number of rows cannot be consumed by a single JDBC connection.
> To export these results as a CSV file on either a local filesystem or HDFS, we can connect to Phoenix from Spark or Hive, but there could also be a dedicated tool (or MR job) implemented just like CsvBulkLoadTool or ./psql.py. This is a very common use case for big results.
> Similar functionality exists in Postgres via the COPY command and in Redshift via the UNLOAD command.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)