Posted to issues@arrow.apache.org by "Yosuke Shiro (Jira)" <ji...@apache.org> on 2019/12/31 01:04:00 UTC
[jira] [Resolved] (ARROW-7474) [Ruby] Save CSV files faster
[ https://issues.apache.org/jira/browse/ARROW-7474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yosuke Shiro resolved ARROW-7474.
---------------------------------
Fix Version/s: 1.0.0
Resolution: Fixed
Issue resolved by pull request 6106
[https://github.com/apache/arrow/pull/6106]
> [Ruby] Save CSV files faster
> ----------------------------
>
> Key: ARROW-7474
> URL: https://issues.apache.org/jira/browse/ARROW-7474
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Ruby
> Reporter: kojix2
> Assignee: Kouhei Sutou
> Priority: Minor
> Labels: pull-request-available
> Fix For: 1.0.0
>
> Attachments: arrow.png
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Hi developers,
> Saving an Arrow::Table in CSV format can be slow.
> An ad hoc benchmark:
>
> {code:ruby}
>
> require 'arrow'
> require 'csv'
> require 'gr/plot'
>
> # Load a tab-separated file as an Arrow::Table.
> t = Arrow::Table.load('some_nice.tsv', format: :csv, delimiter: "\t".ord)
> n = 1.step(1000, 100).to_a
> arrow_save_times = []
> csv_save_times = []
> n.each do |i|
>   t2 = t.slice(0, i)
>
>   # Time Arrow::Table#save on the first i rows.
>   start = Time.now
>   t2.save('test.csv')
>   arrow_save_times << p(Time.now - start)
>
>   # Time Ruby's CSV library on the same i rows.
>   records = t2.raw_records
>   start = Time.now
>   CSV.open('test2.csv', 'w') do |csv|
>     records.each do |r|
>       csv << r
>     end
>   end
>   csv_save_times << p(Time.now - start)
> end
>
> # Plot both timing series and save the figure.
> GR.stem([n, arrow_save_times], [n, csv_save_times],
>         labels: ["arrow", "CSV"], xlabel: "lines", ylabel: "time", location: 2)
> GR.savefig("arrow.png")
> gets # keep the plot window open until Enter is pressed
> {code}
>
>
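The timing pattern in the benchmark above can also be written with a monotonic clock, which is less sensitive to system clock adjustments than Time.now. The following is only a minimal sketch using Ruby's standard library: the helper names `measure` and `write_csv` and the sample rows are hypothetical stand-ins, not part of the original report or of Red Arrow.

```ruby
require 'csv'
require 'tmpdir'

# Hypothetical helper: return the seconds a block takes, measured with
# the monotonic clock instead of Time.now.
def measure
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

# Hypothetical helper: write rows to path with Ruby's standard CSV library.
def write_csv(path, rows)
  CSV.open(path, 'w') do |csv|
    rows.each { |row| csv << row }
  end
end

# Made-up sample data standing in for the reporter's TSV file.
rows = Array.new(1000) { |i| [i, "name#{i}", i * 0.5] }

Dir.mktmpdir do |dir|
  path = File.join(dir, 'test2.csv')
  1.step(1000, 100) do |n|
    # Keep the fastest of three runs to reduce noise.
    elapsed = Array.new(3) { measure { write_csv(path, rows.first(n)) } }.min
    puts format('%4d rows: %.6f s', n, elapsed)
  end
end
```

The same `measure` helper could presumably wrap the `t2.save('test.csv')` call from the report, so that both code paths are timed in the same way.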
--
This message was sent by Atlassian Jira
(v8.3.4#803005)