You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "WillAyd (via GitHub)" <gi...@apache.org> on 2023/06/02 18:32:22 UTC

[GitHub] [arrow-adbc] WillAyd opened a new pull request, #725: feat(c/driver/postgresql): Implement PRIMARY KEY in GetObjects ALL depth

WillAyd opened a new pull request, #725:
URL: https://github.com/apache/arrow-adbc/pull/725

   (no comment)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-adbc] WillAyd commented on a diff in pull request #725: feat(c/driver/postgresql): Implement PRIMARY KEY in GetObjects ALL depth

Posted by "WillAyd (via GitHub)" <gi...@apache.org>.
WillAyd commented on code in PR #725:
URL: https://github.com/apache/arrow-adbc/pull/725#discussion_r1214689873


##########
c/driver/postgresql/postgresql_test.cc:
##########
@@ -258,6 +258,131 @@ TEST_F(PostgresConnectionTest, GetObjectsGetDbSchemas) {
   ASSERT_TRUE(seen_public) << "public schema does not exist";
 }
 
+TEST_F(PostgresConnectionTest, GetObjectsGetAllFindsPrimaryKey) {
+  ASSERT_THAT(AdbcConnectionNew(&connection, &error), IsOkStatus(&error));
+  ASSERT_THAT(AdbcConnectionInit(&connection, &database, &error), IsOkStatus(&error));
+
+  if (!quirks()->supports_get_objects()) {
+    GTEST_SKIP();
+  }
+
+  adbc_validation::StreamReader reader;
+  ASSERT_THAT(
+      AdbcConnectionGetObjects(&connection, ADBC_OBJECT_DEPTH_ALL, nullptr, nullptr,
+                               nullptr, nullptr, nullptr, &reader.stream.value, &error),
+      IsOkStatus(&error));
+  ASSERT_NO_FATAL_FAILURE(reader.GetSchema());
+  ASSERT_NO_FATAL_FAILURE(reader.Next());
+  ASSERT_NE(nullptr, reader.array->release);
+  ASSERT_GT(reader.array->length, 0);
+
+  // should exist on public.adbc_test id column
+  bool seen_id_column = false;
+  bool seen_primary_key = false;
+
+  // TODO: these are in the PqObjectsHelper.GetObjects method; move to shared location

Review Comment:
   Thinking these eventually should belong in utils so they can be used by all c/c++ driver implementations



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-adbc] lidavidm commented on pull request #725: feat(c/driver/postgresql): Implement PRIMARY KEY in GetObjects ALL depth

Posted by "lidavidm (via GitHub)" <gi...@apache.org>.
lidavidm commented on PR #725:
URL: https://github.com/apache/arrow-adbc/pull/725#issuecomment-1577357285

   In general I wouldn't assume that a table in one test will exist in another, best to create it again


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-adbc] lidavidm commented on a diff in pull request #725: feat(c/driver/postgresql): Implement PRIMARY KEY in GetObjects ALL depth

Posted by "lidavidm (via GitHub)" <gi...@apache.org>.
lidavidm commented on code in PR #725:
URL: https://github.com/apache/arrow-adbc/pull/725#discussion_r1218485284


##########
c/driver/postgresql/connection.cc:
##########
@@ -459,6 +460,24 @@ class PqGetObjectsHelper {
     return ADBC_STATUS_OK;
   }
 
+  // libpq PQexecParams can use either text or binary transfers
+  // For now we are using text transfer internally, so arrays are sent
+  // back like {element1, element2} within a const char*
+  std::vector<std::string> PqTextArrayToVector(std::string text_array) {
+    text_array.erase(0, 1);
+    text_array.erase(text_array.size() - 1);
+
+    std::vector<std::string> elements;
+    std::stringstream ss(text_array);

Review Comment:
   nit: move `text_array`?



##########
c/driver/postgresql/connection.cc:
##########
@@ -459,6 +460,24 @@ class PqGetObjectsHelper {
     return ADBC_STATUS_OK;
   }
 
+  // libpq PQexecParams can use either text or binary transfers
+  // For now we are using text transfer internally, so arrays are sent
+  // back like {element1, element2} within a const char*
+  std::vector<std::string> PqTextArrayToVector(std::string text_array) {
+    text_array.erase(0, 1);
+    text_array.erase(text_array.size() - 1);
+
+    std::vector<std::string> elements;
+    std::stringstream ss(text_array);
+    std::string tmp;
+
+    while (getline(ss, tmp, ',')) {
+      elements.push_back(tmp);

Review Comment:
   And move `tmp`?



##########
c/driver/postgresql/connection.cc:
##########
@@ -513,12 +532,12 @@ class PqGetObjectsHelper {
           ArrowArrayAppendString(constraint_type_col_, ArrowCharView(constraint_type)),
           error_);
 
-      for (auto i = 0; i < row[2].len; i++) {
-        // TODO: get actual name in here
-        const char* data = row[2].data;
+      auto constraint_column_names =
+          PqTextArrayToVector(std::move(std::string(row[2].data)));
+      for (auto constraint_column_name : constraint_column_names) {

Review Comment:
   nit: `const auto&`?



##########
c/driver/postgresql/connection.cc:
##########
@@ -513,12 +532,12 @@ class PqGetObjectsHelper {
           ArrowArrayAppendString(constraint_type_col_, ArrowCharView(constraint_type)),
           error_);
 
-      for (auto i = 0; i < row[2].len; i++) {
-        // TODO: get actual name in here
-        const char* data = row[2].data;
+      auto constraint_column_names =
+          PqTextArrayToVector(std::move(std::string(row[2].data)));

Review Comment:
   move here is redundant, though



##########
c/driver/postgresql/postgresql_test.cc:
##########
@@ -258,6 +258,131 @@ TEST_F(PostgresConnectionTest, GetObjectsGetDbSchemas) {
   ASSERT_TRUE(seen_public) << "public schema does not exist";
 }
 
+TEST_F(PostgresConnectionTest, GetObjectsGetAllFindsPrimaryKey) {
+  ASSERT_THAT(AdbcConnectionNew(&connection, &error), IsOkStatus(&error));
+  ASSERT_THAT(AdbcConnectionInit(&connection, &database, &error), IsOkStatus(&error));
+
+  if (!quirks()->supports_get_objects()) {
+    GTEST_SKIP();
+  }
+
+  adbc_validation::StreamReader reader;
+  ASSERT_THAT(
+      AdbcConnectionGetObjects(&connection, ADBC_OBJECT_DEPTH_ALL, nullptr, nullptr,
+                               nullptr, nullptr, nullptr, &reader.stream.value, &error),
+      IsOkStatus(&error));
+  ASSERT_NO_FATAL_FAILURE(reader.GetSchema());
+  ASSERT_NO_FATAL_FAILURE(reader.Next());
+  ASSERT_NE(nullptr, reader.array->release);
+  ASSERT_GT(reader.array->length, 0);
+
+  // should exist on public.adbc_test id column
+  bool seen_id_column = false;
+  bool seen_primary_key = false;
+
+  // TODO: these are in the PqObjectsHelper.GetObjects method; move to shared location

Review Comment:
   I think eventually we want a better way of adapting iterators/rows to Arrow, yeah. I had been toying with the idea of having an IDL so you could generate type-safe row-based iterators/builders from a schema definition.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-adbc] WillAyd commented on pull request #725: feat(c/driver/postgresql): Implement PRIMARY KEY in GetObjects ALL depth

Posted by "WillAyd (via GitHub)" <gi...@apache.org>.
WillAyd commented on PR #725:
URL: https://github.com/apache/arrow-adbc/pull/725#issuecomment-1577388525

   Makes sense. I was just a bit unsure how to make this table specifically for the test as we don't yet have access to the `statement` member to create a key in the connection test class. A simple hack would be to add that, though to maintain the hierachy of tests going from Connection -> Statements tests I think better if I construct the table with a key via the quirks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-adbc] lidavidm merged pull request #725: feat(c/driver/postgresql): Implement PRIMARY KEY in GetObjects ALL depth

Posted by "lidavidm (via GitHub)" <gi...@apache.org>.
lidavidm merged PR #725:
URL: https://github.com/apache/arrow-adbc/pull/725


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-adbc] WillAyd commented on pull request #725: feat(c/driver/postgresql): Implement PRIMARY KEY in GetObjects ALL depth

Posted by "WillAyd (via GitHub)" <gi...@apache.org>.
WillAyd commented on PR #725:
URL: https://github.com/apache/arrow-adbc/pull/725#issuecomment-1577110795

   I think the test failure is a timing issue where the table being tested against doesn't get created until the Statement tests are executed. Do you think we should try and update the current `bulk_ingest` table to have a primary key across the board? Or is it worth creating a separate quirk to create such a table?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-adbc] lidavidm commented on pull request #725: feat(c/driver/postgresql): Implement PRIMARY KEY in GetObjects ALL depth

Posted by "lidavidm (via GitHub)" <gi...@apache.org>.
lidavidm commented on PR #725:
URL: https://github.com/apache/arrow-adbc/pull/725#issuecomment-1577405217

   Sounds reasonable enough (especially as that may be database-dependent).
   
   You can always create a statement on the fly (using Handle to manage via RAII) if you need to


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org