Posted to github@arrow.apache.org by "alamb (via GitHub)" <gi...@apache.org> on 2023/10/26 21:27:26 UTC

[PR] `WIP` Prototype new DFSchema implementation [arrow-datafusion]

alamb opened a new pull request, #7944:
URL: https://github.com/apache/arrow-datafusion/pull/7944

   I have long been bothered by the amount of copying required in DFSchema and how awkward it is to use.
   
   The current setup also makes it very hard to add additional indexes, such as the one described in https://github.com/apache/arrow-datafusion/issues/7698
   
   This PR is part of a plan to review the current implementation and possibly simplify it and prepare for faster DataFusion planning.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] `WIP` Prototype new DFSchema implementation [arrow-datafusion]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #7944:
URL: https://github.com/apache/arrow-datafusion/pull/7944#discussion_r1373834408


##########
datafusion/common/src/dfschema.rs:
##########
@@ -33,17 +33,36 @@ use crate::{
 
 use arrow::compute::can_cast_types;
 use arrow::datatypes::{DataType, Field, FieldRef, Fields, Schema, SchemaRef};
+use sqlparser::ast::Table;
 
 /// A reference-counted reference to a `DFSchema`.
 pub type DFSchemaRef = Arc<DFSchema>;
 
 /// DFSchema wraps an Arrow schema and adds relation names
+///
+/// # Example
+/// ```
+/// Creating a DF schema from an arrow schema
+/// ```
+///
+/// ```
+/// Converting from DF schema to arrow schema
+/// ```
+///
+/// ```
+/// Iterating over qualified fields
+/// ```
 #[derive(Debug, Clone, PartialEq, Eq)]
 pub struct DFSchema {
-    /// Fields
-    fields: Vec<DFField>,
-    /// Additional metadata in form of key value pairs
-    metadata: HashMap<String, String>,
+    /// Inner arrow schema
+    inner: SchemaRef,
+
+    /// Optional qualifiers for each column in this schema. In the same order as

Review Comment:
   This is the key design: remove `DFField` and simply wrap an inner arrow `SchemaRef`.
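
   To make the idea concrete, here is a minimal sketch of what wrapping an inner schema could look like. It is not the code in this PR: `Field` and `Schema` are simplified stand-ins for the arrow types so that the example compiles on its own, and the `qualifiers` field and the `try_new`, `inner`, and `iter` methods are hypothetical names chosen for illustration.

```rust
use std::sync::Arc;

// Stand-ins for arrow's `Field` and `Schema` (assumption: the real types also
// carry data types, nullability, and metadata).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Field {
    pub name: String,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Schema {
    pub fields: Vec<Field>,
}

pub type SchemaRef = Arc<Schema>;

/// Sketch of the wrapper: a shared inner schema plus one optional qualifier
/// per column. `qualifiers` is a hypothetical field name.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DFSchema {
    inner: SchemaRef,
    qualifiers: Vec<Option<String>>,
}

impl DFSchema {
    /// Wraps `inner` without copying its fields. Returns `None` if the number
    /// of qualifiers does not match the number of fields.
    pub fn try_new(inner: SchemaRef, qualifiers: Vec<Option<String>>) -> Option<Self> {
        if inner.fields.len() != qualifiers.len() {
            return None;
        }
        Some(Self { inner, qualifiers })
    }

    /// Converting back to the plain schema only bumps a reference count.
    pub fn inner(&self) -> SchemaRef {
        Arc::clone(&self.inner)
    }

    /// Iterates over `(qualifier, field)` pairs in column order.
    pub fn iter(&self) -> impl Iterator<Item = (Option<&str>, &Field)> + '_ {
        self.qualifiers
            .iter()
            .map(|q| q.as_deref())
            .zip(self.inner.fields.iter())
    }
}
```

   In this sketch, creating the wrapper and converting back to the inner schema both avoid copying any fields, which is the copying the PR description complains about. Keeping qualifiers in a parallel vector is one possible layout, not necessarily the one the final design would use.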





Re: [PR] `WIP` Prototype new DFSchema implementation [arrow-datafusion]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb closed pull request #7944: `WIP` Prototype new DFSchema implementation
URL: https://github.com/apache/arrow-datafusion/pull/7944




Re: [PR] `WIP` Prototype new DFSchema implementation [arrow-datafusion]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on PR #7944:
URL: https://github.com/apache/arrow-datafusion/pull/7944#issuecomment-1905889772

   I think @matthewmturner is working on something real, so I am closing this one.

