Posted to reviews@spark.apache.org by "beliefer (via GitHub)" <gi...@apache.org> on 2023/10/25 09:27:46 UTC

[PR] [SPARK-45664][SQL] Introduce a mapper for orc compression codecs [spark]

beliefer opened a new pull request, #43528:
URL: https://github.com/apache/spark/pull/43528

   ### What changes were proposed in this pull request?
   Spark currently supports all of the ORC compression codecs, but the codecs supported by ORC and those supported by Spark do not map one-to-one, because Spark introduces the two compression codecs `NONE` and `UNCOMPRESSED`.
   
   In addition, many magic strings are copied from the ORC compression codec names. Developers therefore have to keep them consistent by hand, which is error-prone and reduces development efficiency.
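
A minimal sketch of what such a mapper could look like, written as a self-contained Java enum. It is an illustration under stated assumptions, not the code from this PR: the nested `CompressionKind` enum is a stand-in for ORC's own codec enum so the example runs without ORC on the classpath, and the names `OrcCompressionCodec`, `getCompressionKind`, `lowerCaseName`, and `fromString` are hypothetical. The sketch also assumes that both `NONE` and `UNCOMPRESSED` resolve to ORC's uncompressed setting.

```java
import java.util.Locale;

public class OrcCodecMapperSketch {

    // Stand-in for ORC's codec enum; redefined here (assumption) so the
    // sketch compiles and runs without the ORC library.
    enum CompressionKind { NONE, ZLIB, SNAPPY, LZO, LZ4, ZSTD }

    // Hypothetical mapper: each Spark-side codec name carries the ORC-side
    // codec it resolves to, so call sites no longer repeat string literals.
    enum OrcCompressionCodec {
        NONE(CompressionKind.NONE),
        UNCOMPRESSED(CompressionKind.NONE), // assumed alias for "no compression"
        ZLIB(CompressionKind.ZLIB),
        SNAPPY(CompressionKind.SNAPPY),
        LZO(CompressionKind.LZO),
        LZ4(CompressionKind.LZ4),
        ZSTD(CompressionKind.ZSTD);

        private final CompressionKind compressionKind;

        OrcCompressionCodec(CompressionKind compressionKind) {
            this.compressionKind = compressionKind;
        }

        // The ORC-side codec this Spark-side name maps to.
        CompressionKind getCompressionKind() {
            return compressionKind;
        }

        // Lower-case form of the codec name, e.g. for option values.
        String lowerCaseName() {
            return name().toLowerCase(Locale.ROOT);
        }

        // Case-insensitive lookup; throws IllegalArgumentException for
        // names that are not declared above.
        static OrcCompressionCodec fromString(String name) {
            return valueOf(name.toUpperCase(Locale.ROOT));
        }
    }

    public static void main(String[] args) {
        // Print every Spark-side name next to the ORC-side codec it maps to.
        for (OrcCompressionCodec codec : OrcCompressionCodec.values()) {
            System.out.println(codec.lowerCaseName() + " -> " + codec.getCompressionKind());
        }
    }
}
```

With a mapping like this, tests and option parsing can iterate over `OrcCompressionCodec.values()` instead of maintaining a separate list of string literals, and the two Spark-only spellings are declared in one place.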
   
   
   ### Why are the changes needed?
   To make the ORC compression codecs easier for developers to use.
   
   
   ### Does this PR introduce _any_ user-facing change?
   'No'.
   This PR only introduces a new class.
   
   
   ### How was this patch tested?
   Existing test cases.
   
   
   ### Was this patch authored or co-authored using generative AI tooling?
   'No'.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45664][SQL] Introduce a mapper for orc compression codecs [spark]

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on PR #43528:
URL: https://github.com/apache/spark/pull/43528#issuecomment-1782155961

   > Just a question. May I ask why we do this ORC-specific change? Are you going to do the same things for all data sources like Parquet and Avro at Apache Spark 4.0.0?
   
   Because the compression codecs supported by ORC and those supported by Spark do not map one-to-one, since Spark introduces the two compression codecs `NONE` and `UNCOMPRESSED`. This change also makes tests easier to write and reduces the number of magic strings.
   
   I'm doing the same for Parquet and Avro in Apache Spark 4.0.0.




Re: [PR] [SPARK-45664][SQL] Introduce a mapper for orc compression codecs [spark]

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun closed pull request #43528: [SPARK-45664][SQL] Introduce a mapper for orc compression codecs
URL: https://github.com/apache/spark/pull/43528




Re: [PR] [SPARK-45664][SQL] Introduce a mapper for orc compression codecs [spark]

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on PR #43528:
URL: https://github.com/apache/spark/pull/43528#issuecomment-1784869112

   @dongjoon-hyun Thank you!




Re: [PR] [SPARK-45664][SQL] Introduce a mapper for orc compression codecs [spark]

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on PR #43528:
URL: https://github.com/apache/spark/pull/43528#issuecomment-1778881370

   ping @dongjoon-hyun cc @srowen @viirya 

