Posted to issues@arrow.apache.org by "kojix2 (Jira)" <ji...@apache.org> on 2019/12/27 06:48:00 UTC
[jira] [Created] (ARROW-7474) [Ruby] Slow to save files in CSV format
kojix2 created ARROW-7474:
-----------------------------
Summary: [Ruby] Slow to save files in CSV format
Key: ARROW-7474
URL: https://issues.apache.org/jira/browse/ARROW-7474
Project: Apache Arrow
Issue Type: Improvement
Components: Ruby
Reporter: kojix2
Attachments: arrow.png
Hi developers,
Saving an Arrow::Table in CSV format can be slow.
Here is an ad hoc benchmark comparing Arrow::Table#save with Ruby's CSV library:
{code:ruby}
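require 'arrow'
require 'csv'
require 'gr/plot'

# Load a tab-separated file as an Arrow::Table.
t = Arrow::Table.load('some_nice.tsv', format: :csv, delimiter: "\t".ord)
n = 1.step(1000, 100).to_a
arrow_save_times = []
csv_save_times = []

n.each do |i|
  # Time Arrow::Table#save on the first i rows.
  t2 = t.slice(0, i)
  start = Time.now
  t2.save('test.csv')
  arrow_save_times << p(Time.now - start)

  # Time Ruby's CSV library on the same i rows
  # (the conversion to raw records is not included in the timing).
  rows = t2.raw_records
  start = Time.now
  CSV.open('test2.csv', 'w') do |csv|
    rows.each do |r|
      csv << r
    end
  end
  csv_save_times << p(Time.now - start)
end

# Plot both series against the number of rows written.
GR.stem([n, arrow_save_times], [n, csv_save_times],
        labels: ["arrow", "CSV"], xlabel: "lines", ylabel: "time", location: 2)
GR.savefig("arrow.png")
gets # keep the plot window open until Enter is pressed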
{code}
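A side note on measurement: the benchmark above takes its timings with Time.now, which follows the wall clock and can shift if the system time is adjusted. Below is a minimal sketch of the same kind of measurement using a monotonic clock and only Ruby's standard library. The helper name `measure` and the sample rows are hypothetical and are not part of Arrow or of the benchmark above; the join-based write is only a stand-in for the code being timed.

```ruby
require 'tempfile'

# Hypothetical helper: returns the elapsed seconds for the given block,
# measured with the monotonic clock so wall-clock changes do not skew it.
def measure
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

# Made-up sample rows standing in for the output of Arrow::Table#raw_records.
rows = Array.new(1000) { |i| [i, "name#{i}", i * 0.5] }

# Time a plain comma-joined write to a temporary file.
elapsed = nil
Tempfile.create('bench') do |f|
  elapsed = measure { rows.each { |r| f.puts(r.join(',')) } }
end

puts format('%.6f s', elapsed)
```

The same helper could wrap t2.save('test.csv') or the CSV.open block in the benchmark above, replacing the start = Time.now / Time.now - start pairs.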