Posted to github@arrow.apache.org by "tustvold (via GitHub)" <gi...@apache.org> on 2023/02/12 12:35:06 UTC

[GitHub] [arrow-datafusion] tustvold commented on a diff in pull request #4908: added a method to read multiple locations at the same time.

tustvold commented on code in PR #4908:
URL: https://github.com/apache/arrow-datafusion/pull/4908#discussion_r1103794858


##########
datafusion/core/src/execution/context.rs:
##########
@@ -613,50 +615,29 @@ impl SessionContext {
     /// [`read_table`](Self::read_table) with a [`ListingTable`].
     async fn _read_type<'a>(
         &self,
-        table_path: impl AsRef<str>,
+        table_paths: Vec<impl AsRef<str>>,
         options: impl ReadOptions<'a>,
     ) -> Result<DataFrame> {
-        let table_path = ListingTableUrl::parse(table_path)?;
+        let table_paths = table_paths
+            .iter()
+            .map(ListingTableUrl::parse)
+            .collect::<Result<Vec<ListingTableUrl>>>()?;
         let session_config = self.copied_config();
         let listing_options = options.to_listing_options(&session_config);
         let resolved_schema = match options
-            .get_resolved_schema(&session_config, self.state(), table_path.clone())
+            .get_resolved_schema(&session_config, self.state(), table_paths[0].clone())
             .await
         {
             Ok(resolved_schema) => resolved_schema,
             Err(e) => return Err(e),
         };
-        let config = ListingTableConfig::new(table_path)
+        let config = ListingTableConfig::new_with_multi_paths(table_paths)
             .with_listing_options(listing_options)
             .with_schema(resolved_schema);
         let provider = ListingTable::try_new(config)?;
         self.read_table(Arc::new(provider))
     }
 
-    /// Creates a [`DataFrame`] for reading an Avro data source.
-    ///
-    /// For more control such as reading multiple files, you can use
-    /// [`read_table`](Self::read_table) with a [`ListingTable`].
-    pub async fn read_avro(
-        &self,
-        table_path: impl AsRef<str>,
-        options: AvroReadOptions<'_>,
-    ) -> Result<DataFrame> {
-        self._read_type(table_path, options).await
-    }

Review Comment:
   > But we have to use the use statement for the DataFilePaths whenever we call read_csv and others.
   
   I'm surprised by this. You shouldn't need to `use` a trait just to satisfy a type constraint on a generic method, should you?
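
A minimal sketch of the point, not the code from this PR: the `library` module, the shape of `DataFilePaths`, its `to_paths` method, and the `read_paths` function are all hypothetical stand-ins (only the trait name comes from the quoted comment). The caller passes values to a generic function bounded by the trait without ever importing the trait. An import would only be needed if the caller invoked the trait's method directly with method syntax, e.g. `"a.csv".to_paths()`.

```rust
// Hypothetical stand-in for the crate that defines the trait and the generic method.
mod library {
    /// Hypothetical trait; the name echoes the quoted comment, the body is assumed.
    pub trait DataFilePaths {
        fn to_paths(self) -> Vec<String>;
    }

    impl DataFilePaths for &str {
        fn to_paths(self) -> Vec<String> {
            vec![self.to_string()]
        }
    }

    impl DataFilePaths for Vec<&str> {
        fn to_paths(self) -> Vec<String> {
            self.into_iter().map(str::to_string).collect()
        }
    }

    /// Generic function bounded by the trait, standing in for `read_csv` and friends.
    pub fn read_paths<P: DataFilePaths>(paths: P) -> Vec<String> {
        // The trait is in scope inside this module, so method syntax works here.
        paths.to_paths()
    }
}

// Caller side: note that there is no `use super::library::DataFilePaths;`.
mod caller {
    pub fn single() -> Vec<String> {
        // The bound `P: DataFilePaths` is checked by the compiler without an import.
        super::library::read_paths("a.csv")
    }

    pub fn multiple() -> Vec<String> {
        super::library::read_paths(vec!["a.csv", "b.csv"])
    }
}
```

If the call sites in the PR really do fail without the import, that would suggest they call a trait method directly somewhere, rather than only passing arguments through the generic bound.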


