Posted to github@arrow.apache.org by "byteink (via GitHub)" <gi...@apache.org> on 2023/06/02 03:13:37 UTC

[GitHub] [arrow-datafusion] byteink commented on issue #6492: Mismatch in MemTable (Select Into with aggregate window functions having no alias)

byteink commented on issue #6492:
URL: https://github.com/apache/arrow-datafusion/issues/6492#issuecomment-1573076672

   Another similar case of failure:
   ```shell
   DataFusion CLI v25.0.0
   ❯ create table t (a int not null);
   0 rows in set. Query took 0.004 seconds.
   
   ❯ insert into t values(1);
   Error during planning: Inserting query must have the same schema with the table.
   ```
   The two schemas differ only in the nullability of column `a`:
   **input schema**: Schema { fields: [Field { name: "a", data_type: Int32, nullable: **true**, dict_id: 0, dict_is_ordered: false, metadata: {} }], metadata: {} }
   **table schema**: Schema { fields: [Field { name: "a", data_type: Int32, nullable: **false**, dict_id: 0, dict_is_ordered: false, metadata: {} }], metadata: {} }

   Can we ignore the schema attached to the incoming batches and validate only the actual data?
   `RecordBatch::try_new` could be used to check whether the columns of each batch conform to the schema of the target table, for example:
   ```rust
   impl MemTable {
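       /// Create a new in-memory table from the provided schema and record batches.
       ///
       /// Each batch's columns are re-wrapped under `schema`, so a batch is accepted
       /// whenever its data conforms to the table schema, even if the batch's own
       /// schema differs (for example in nullability).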
       pub fn try_new(schema: SchemaRef, partitions: Vec<Vec<RecordBatch>>) -> Result<Self> {
           let mut batches = Vec::with_capacity(partitions.len());
           for partition in partitions {
               let new_partition = partition
                   .iter()
                   .map(|batch| {
                       RecordBatch::try_new(schema.clone(), batch.columns().to_vec())
                           .map_err(DataFusionError::ArrowError)
                   })
                   .collect::<Result<Vec<_>>>()?;
               batches.push(Arc::new(RwLock::new(new_partition)));
           }
           Ok(Self { schema, batches })
       }
   }
   ```
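   To illustrate the idea, here is a minimal, std-only sketch of the kind of validation that happens when columns are re-wrapped under the target schema. It is a toy model, not the arrow-rs API: `DataType`, `Field`, `Column`, `Batch` and `rewrap` are hypothetical stand-ins, and the checks (column count, data type, no nulls in a non-nullable field) only approximate what `RecordBatch::try_new` validates.

   ```rust
   // Toy model only: these types are hypothetical stand-ins, not arrow-rs types.
   #[derive(Debug, Clone, PartialEq)]
   enum DataType {
       Int32,
       Utf8,
   }

   #[derive(Debug, Clone, PartialEq)]
   struct Field {
       name: String,
       data_type: DataType,
       nullable: bool,
   }

   /// A column tagged with a data type; `None` models a null slot.
   #[derive(Debug, Clone, PartialEq)]
   struct Column {
       data_type: DataType,
       values: Vec<Option<i32>>,
   }

   impl Column {
       fn null_count(&self) -> usize {
           self.values.iter().filter(|v| v.is_none()).count()
       }
   }

   #[derive(Debug)]
   struct Batch {
       schema: Vec<Field>,
       columns: Vec<Column>,
   }

   /// Re-wrap `columns` under `target`, validating the data itself rather than
   /// comparing the input batch's schema with the table schema.
   fn rewrap(target: &[Field], columns: Vec<Column>) -> Result<Batch, String> {
       if target.len() != columns.len() {
           return Err(format!("expected {} columns, got {}", target.len(), columns.len()));
       }
       for (field, column) in target.iter().zip(&columns) {
           if field.data_type != column.data_type {
               return Err(format!("column '{}' has the wrong data type", field.name));
           }
           if !field.nullable && column.null_count() > 0 {
               return Err(format!("column '{}' is non-nullable but contains nulls", field.name));
           }
       }
       Ok(Batch { schema: target.to_vec(), columns })
   }

   fn main() {
       // Mirrors `create table t (a int not null)`.
       let table_schema = vec![Field {
           name: "a".to_string(),
           data_type: DataType::Int32,
           nullable: false,
       }];

       // Data with no actual nulls is accepted and takes on the table schema.
       let ok = rewrap(
           &table_schema,
           vec![Column { data_type: DataType::Int32, values: vec![Some(1)] }],
       )
       .expect("data without nulls should be accepted");
       assert!(!ok.schema[0].nullable);
       assert_eq!(ok.columns[0].values.len(), 1);

       // Data that really contains a null is rejected.
       let with_null = vec![Column { data_type: DataType::Int32, values: vec![None] }];
       assert!(rewrap(&table_schema, with_null).is_err());

       // A column of the wrong type is rejected as well.
       let wrong_type = vec![Column { data_type: DataType::Utf8, values: vec![] }];
       assert!(rewrap(&table_schema, wrong_type).is_err());
   }
   ```

   Under this model, `insert into t values(1)` would succeed because the value itself is not null, while inserting a real null would still fail.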
   

