Posted to issues@arrow.apache.org by "HaoXuAI (via GitHub)" <gi...@apache.org> on 2024/04/30 23:51:15 UTC

[I] Directly load ADBC to Spark Dataframe [arrow-adbc]

HaoXuAI opened a new issue, #1801:
URL: https://github.com/apache/arrow-adbc/issues/1801

   ### What feature or improvement would you like to see?
   
   Similar to JDBC, something like:
   ```
   jdbcDF = spark.read \
       .format("adbc") \
       .option("url", "adbc:postgresql") \
       .option("dbtable", "schema.tablename") \
       .option("user", "username") \
       .option("password", "password") \
       .load()
   ```
   That would help leverage ADBC in a Spark compute environment.
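
For comparison, here is a minimal sketch of what can be approximated today without a dedicated Spark data source. It assumes the ADBC Python DB-API bindings (`cursor.fetch_arrow_table()`) and an existing `SparkSession`. The helper name `read_adbc_to_spark` and its `connect` parameter are hypothetical, introduced only for illustration.

```python
from contextlib import closing


def read_adbc_to_spark(spark, connect, query):
    """Run `query` over an ADBC DB-API connection and return a Spark DataFrame.

    `spark` is a SparkSession. `connect` is a zero-argument callable that
    returns an ADBC DB-API connection (hypothetical parameter, so the caller
    decides which ADBC driver to use).
    """
    with closing(connect()) as conn, closing(conn.cursor()) as cur:
        cur.execute(query)
        # ADBC's DB-API extension: fetch the full result as a pyarrow.Table.
        table = cur.fetch_arrow_table()
    # The whole result is materialized in the calling process first,
    # then handed to Spark as a pandas DataFrame.
    return spark.createDataFrame(table.to_pandas())
```

With the PostgreSQL driver installed, `connect` could be something like `lambda: adbc_driver_postgresql.dbapi.connect(uri)`. Unlike the `spark.read.format("adbc")` API proposed above, this fetches everything through a single process, so the read itself is not distributed.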


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] Directly read ADBC to Spark Dataframe [arrow-adbc]

Posted by "lidavidm (via GitHub)" <gi...@apache.org>.
lidavidm commented on issue #1801:
URL: https://github.com/apache/arrow-adbc/issues/1801#issuecomment-2087740883

   I think this should be a Spark feature request?
   
   What I would like to do here is provide a JNI driver that can leverage the better-optimized postgresql/snowflake drivers from Java, though.


Re: [I] Directly read ADBC to Spark Dataframe [arrow-adbc]

Posted by "lidavidm (via GitHub)" <gi...@apache.org>.
lidavidm commented on issue #1801:
URL: https://github.com/apache/arrow-adbc/issues/1801#issuecomment-2087748054

   I don't believe anyone is working on this. Best to take it to the Spark community.
   
   The Java ADBC drivers for PostgreSQL and Snowflake just wrap JDBC, so they don't provide any benefit over using JDBC directly. If we had JNI bindings to the C++/Go drivers, we might see some performance benefits.


Re: [I] Directly read ADBC to Spark Dataframe [arrow-adbc]

Posted by "HaoXuAI (via GitHub)" <gi...@apache.org>.
HaoXuAI commented on issue #1801:
URL: https://github.com/apache/arrow-adbc/issues/1801#issuecomment-2087746496

   > I think this should be a Spark feature request?
   > 
   > What I would like to do here is provide a JNI driver that can leverage the better-optimized postgresql/snowflake drivers from Java, though.
   
   Right, it should be a Spark feature. I'm posting here to check whether it is a meaningful feature and whether someone from the Arrow team is already working on it. :)
   What do you mean by JNI driver?


Re: [I] Directly read ADBC to Spark Dataframe [arrow-adbc]

Posted by "tokoko (via GitHub)" <gi...@apache.org>.
tokoko commented on issue #1801:
URL: https://github.com/apache/arrow-adbc/issues/1801#issuecomment-2098340047

   @HaoXuAI hey, fancy seeing you here 😄 I started [this](https://github.com/tokoko/spark-adbc) a while ago and then abandoned it (I changed jobs and was no longer using Dremio). I can help you bring it back from the dead if you have a use case.


Re: [I] Directly read ADBC to Spark Dataframe [arrow-adbc]

Posted by "HaoXuAI (via GitHub)" <gi...@apache.org>.
HaoXuAI commented on issue #1801:
URL: https://github.com/apache/arrow-adbc/issues/1801#issuecomment-2087750446

   Makes sense. Let me post it in the Spark repo.


Re: [I] Directly read ADBC to Spark Dataframe [arrow-adbc]

Posted by "HaoXuAI (via GitHub)" <gi...@apache.org>.
HaoXuAI commented on issue #1801:
URL: https://github.com/apache/arrow-adbc/issues/1801#issuecomment-2098782347

   Hey @tokoko! Great to see you here as well. It's not a direct use case at work, but I'm thinking about using ADBC in a project to read data in Spark. Do you want to contribute it directly to Spark or keep it as a plugin?


Re: [I] Directly read ADBC to Spark Dataframe [arrow-adbc]

Posted by "tokoko (via GitHub)" <gi...@apache.org>.
tokoko commented on issue #1801:
URL: https://github.com/apache/arrow-adbc/issues/1801#issuecomment-2098980039

   @HaoXuAI My goal at the time was to get it to a mostly working condition as a plugin and then contribute it, but we can do it either way.
   
   @lidavidm Even with JNI drivers, the ADBC Java interface itself will still look the same, right? The Spark data source implementation will be independent of how the drivers are implemented.


Re: [I] Directly read ADBC to Spark Dataframe [arrow-adbc]

Posted by "lidavidm (via GitHub)" <gi...@apache.org>.
lidavidm commented on issue #1801:
URL: https://github.com/apache/arrow-adbc/issues/1801#issuecomment-2099402770

   Yes, the idea of JNI would be to implement the same Java-side interface.
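
The point about interface independence can be illustrated with a toy sketch. It is written in Python to match the snippet in the original issue, and every name in it (`Driver`, `JdbcBackedDriver`, `JniBackedDriver`, `read_rows`) is hypothetical. None of this is the real ADBC Java API. It only shows that code written against a shared interface does not change when the driver behind it does.

```python
from typing import Iterable, Protocol, Tuple


class Driver(Protocol):
    """Hypothetical stand-in for the shared driver interface (not the real ADBC API)."""

    def execute(self, query: str) -> Iterable[Tuple]:
        ...


class JdbcBackedDriver:
    """Hypothetical driver that would delegate to JDBC under the hood."""

    def execute(self, query: str) -> Iterable[Tuple]:
        return [("jdbc", query)]


class JniBackedDriver:
    """Hypothetical driver that would call a native C++/Go driver through JNI."""

    def execute(self, query: str) -> Iterable[Tuple]:
        return [("jni", query)]


def read_rows(driver: Driver, query: str) -> list:
    # The "data source" only sees the interface, so either driver can be passed in.
    return list(driver.execute(query))
```

Swapping `JdbcBackedDriver()` for `JniBackedDriver()` requires no change to `read_rows`, which is the property a Spark data source built on the ADBC interface would rely on.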

