Posted to issues@ozone.apache.org by "Ethan Rose (Jira)" <ji...@apache.org> on 2021/10/20 20:38:10 UTC

[jira] [Updated] (HDDS-3552) OzoneFS is slow compared to HDFS using Spark job

     [ https://issues.apache.org/jira/browse/HDDS-3552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ethan Rose updated HDDS-3552:
-----------------------------
    Target Version/s: 1.3.0  (was: 1.2.0)

I am managing the 1.2.0 release, and we currently have more than 600 issues targeted for 1.2.0, so I am moving the target version of this issue to 1.3.0.

If you are actively working on this Jira and believe it should be targeted for the 1.2.0 release, please reach out to me via Apache email or Slack.

> OzoneFS is slow compared to HDFS using Spark job
> ------------------------------------------------
>
>                 Key: HDDS-3552
>                 URL: https://issues.apache.org/jira/browse/HDDS-3552
>             Project: Apache Ozone
>          Issue Type: Improvement
>            Reporter: Marton Elek
>            Assignee: Marton Elek
>            Priority: Major
>              Labels: performance
>
> Reported by "Andrey Mindrin" on the the-asf Slack:
> {quote}
> We have made a few tests to compare OZONE (0.4 and 0.5 on Cloudera Runtime 7.0.3 with 3 nodes) performance with HDFS, and OZONE is slower in most cases. For example, a Spark application with 18 containers that copies a 6 GB Parquet file is about 50% slower on OzoneFS. Only one case shows the same performance: Hive queries over partitioned tables.
> The simple Spark code we used:
> {code}
> val file = spark.read.format(format).load(path_input)
> file.write.format(format).save(path_output)
> {code}
> Tested on a CSV file with 800 million records and 2 columns, and on a Parquet file converted from that CSV. We just copied the file from HDFS to HDFS and from Ozone to Ozone. Application time is 1m 14s on HDFS and 1m 51s (+50%) on Ozone (Parquet file). Ozone has default settings.
> {quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
