Posted to issues@arrow.apache.org by "kojix2 (Jira)" <ji...@apache.org> on 2019/12/27 06:48:00 UTC

[jira] [Created] (ARROW-7474) [Ruby] Slow to save files in CSV format

kojix2 created ARROW-7474:
-----------------------------

             Summary: [Ruby] Slow to save files in CSV format
                 Key: ARROW-7474
                 URL: https://issues.apache.org/jira/browse/ARROW-7474
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Ruby
            Reporter: kojix2
         Attachments: arrow.png

Hi developers,

Saving an Arrow::Table in CSV format can be slow compared with writing the same rows through Ruby's CSV library.

Here is an ad hoc benchmark:

 
{code:ruby}
require 'arrow'
require 'csv'
require 'gr/plot'

t = Arrow::Table.load('some_nice.tsv', format: :csv, delimiter: "\t".ord)
n = 1.step(1000, 100).to_a
arrow_save_times = []
csv_save_times = []

n.each do |i|
  t2 = t.slice(0, i)

  # Time saving the first i rows with Arrow::Table#save.
  start = Time.now
  t2.save('test.csv')
  arrow_save_times << p(Time.now - start)

  # Time writing the same i rows with Ruby's CSV library.
  # (Uses the slice, not the full table, so both sides write equal row counts.)
  rows = t2.raw_records
  start = Time.now
  CSV.open('test2.csv', 'w') do |csv|
    rows.each do |r|
      csv << r
    end
  end
  csv_save_times << p(Time.now - start)
end

# Plot both timings against the number of rows saved.
GR.stem([n, arrow_save_times], [n, csv_save_times],
        labels: ["arrow", "CSV"], xlabel: "lines", ylabel: "time", location: 2)
GR.savefig("arrow.png")
gets # wait for Enter so the plot window stays open
{code}
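
Single Time.now measurements can be noisy, so the timing could perhaps be made steadier with a small helper. The following is only a sketch: best_time is a hypothetical helper (not part of Red Arrow or the CSV library), and the synthetic rows stand in for the table's records. It runs a block a few times on a monotonic clock and keeps the fastest run. Only the CSV side is shown; the t2.save call could be wrapped in the same way.

```ruby
require 'csv'
require 'tmpdir'

# Hypothetical helper: run the block `repeat` times and return the fastest
# wall-clock time in seconds, measured with a monotonic clock.
def best_time(repeat: 3)
  times = Array.new(repeat) do
    start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield
    Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
  end
  times.min
end

# Synthetic rows stand in for the records of the table being saved.
rows = Array.new(1_000) { |i| [i, "name#{i}", i * 0.5] }

Dir.mktmpdir do |dir|
  path = File.join(dir, 'test2.csv')
  elapsed = best_time do
    CSV.open(path, 'w') { |csv| rows.each { |r| csv << r } }
  end
  puts format('CSV: %.6f s for %d rows', elapsed, rows.size)
end
```

Taking the minimum of several runs tends to filter out one-off pauses (disk cache misses, GC) that would otherwise distort a single measurement.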