You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/10/20 18:07:34 UTC

[GitHub] [arrow-datafusion] avantgardnerio commented on a diff in pull request #3907: Add serde for plans with tables from `TableProviderFactory`s

avantgardnerio commented on code in PR #3907:
URL: https://github.com/apache/arrow-datafusion/pull/3907#discussion_r1000965184


##########
datafusion/proto/src/logical_plan.rs:
##########
@@ -760,55 +804,21 @@ impl AsLogicalPlan for LogicalPlanNode {
                     .collect::<Result<Vec<_>, _>>()?;
 
                 if let Some(listing_table) = source.downcast_ref::<ListingTable>() {
-                    let any = listing_table.options().format.as_any();
-                    let file_format_type = if let Some(parquet) =
-                        any.downcast_ref::<ParquetFormat>()
-                    {
-                        FileFormatType::Parquet(protobuf::ParquetFormat {
-                            enable_pruning: parquet.enable_pruning(),
-                        })
-                    } else if let Some(csv) = any.downcast_ref::<CsvFormat>() {
-                        FileFormatType::Csv(protobuf::CsvFormat {
-                            delimiter: byte_to_string(csv.delimiter())?,
-                            has_header: csv.has_header(),
-                        })
-                    } else if any.is::<AvroFormat>() {
-                        FileFormatType::Avro(protobuf::AvroFormat {})
-                    } else {
-                        return Err(proto_error(format!(
-                            "Error converting file format, {:?} is invalid as a datafusion foramt.",
-                            listing_table.options().format
-                        )));
-                    };
-                    Ok(protobuf::LogicalPlanNode {
-                        logical_plan_type: Some(LogicalPlanType::ListingScan(
-                            protobuf::ListingTableScanNode {
-                                file_format_type: Some(file_format_type),
-                                table_name: table_name.to_owned(),
-                                collect_stat: listing_table.options().collect_stat,
-                                file_extension: listing_table
-                                    .options()
-                                    .file_extension
-                                    .clone(),
-                                table_partition_cols: listing_table
-                                    .options()
-                                    .table_partition_cols
-                                    .clone(),
-                                paths: listing_table
-                                    .table_paths()
-                                    .iter()
-                                    .map(|x| x.to_string())
-                                    .collect(),
-                                schema: Some(schema),
-                                projection,
-                                filters,
-                                target_partitions: listing_table
-                                    .options()
-                                    .target_partitions
-                                    as u32,
-                            },
-                        )),
-                    })
+                    Self::serialize_listing_table(
+                        table_name,
+                        projection,
+                        schema,
+                        filters,
+                        listing_table,
+                    )
+                } else if let Some(custom_table) = source.downcast_ref::<CustomTable>() {

Review Comment:
   > I don't fully understand this proposal 
   
   That delta PR really explains what I'm trying to do. Unfortunately the end goal spans datafusion, delta-rs & ballista, so it's hard to convey. And I'm open to doing it simpler ways if they exist and fulfill the broader goal of supporting table types unknown to datafusion (i.e. delta, but possibly others).
   
   > the client code needs to get the concrete TableProvider
   
   As long as the deserializing node also has the `TableProviderFactory` registered, it can get a concrete one created. The CustomTable is just a wrapper that contains enough information to get it instantiated by the factory.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org