You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "brettbuddin (via GitHub)" <gi...@apache.org> on 2023/02/01 15:38:29 UTC

[GitHub] [arrow] brettbuddin opened a new pull request, #33983: [Go] Use shared cache in SQLite example.

brettbuddin opened a new pull request, #33983:
URL: https://github.com/apache/arrow/pull/33983

   <!--
   Thanks for opening a pull request!
   If this is your first pull request you can find detailed information on how 
   to contribute here:
     * [New Contributor's Guide](https://arrow.apache.org/docs/dev/developers/guide/step_by_step/pr_lifecycle.html#reviews-and-merge-of-the-pull-request)
     * [Contributing Overview](https://arrow.apache.org/docs/dev/developers/overview.html)
   
   
   If this is not a [minor PR](https://github.com/apache/arrow/blob/master/CONTRIBUTING.md#Minor-Fixes). Could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose
   
   Opening GitHub issues ahead of time contributes to the [Openness](http://theapacheway.com/open/#:~:text=Openness%20allows%20new%20users%20the,must%20happen%20in%20the%20open.) of the Apache Arrow project.
   
   Then could you also rename the pull request title in the following format?
   
       GH-${GITHUB_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}
   
   or
   
       MINOR: [${COMPONENT}] ${SUMMARY}
   
   In the case of PARQUET issues on JIRA the title also supports:
   
       PARQUET-${JIRA_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}
   
   -->
   
   ### Rationale for this change
   
   When calling this server with a few concurrent requests, you can easily force the Go stdlib to spawn a new connection. Since only the `:memory:` DSN is specified, the stdlib doesn't know any better and effectively creates a new database. The result is a lot of `no such table` errors.
   
   ### What changes are included in this PR?
   
   This change adds the `?cache=shared` as specified at:
   
   https://www.sqlite.org/sharedcache.html#enabling_shared_cache_mode
   
   This ensures that these new connections participate in same in-memory database and can see all the tables created at server start.
   
   ### Are these changes tested?
   
   I haven't written any additional tests here, because this behavior is difficult to test. Suggestions are welcome on an approach here.
   
   ### Are there any user-facing changes?
   
   No.
   
   <!--
   Please uncomment the line below (and provide explanation) if the changes fix either (a) a security vulnerability, (b) a bug that caused incorrect or invalid data to be produced, or (c) a bug that causes a crash (even when the API contract is upheld). We use this to highlight fixes to issues that may affect users without their knowledge. For this reason, fixing bugs that cause errors don't count, since those are usually obvious.
   -->
   <!-- **This PR contains a "Critical Fix".** -->


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] brettbuddin commented on pull request #33983: MINOR: [Go] Use shared cache in SQLite example.

Posted by "brettbuddin (via GitHub)" <gi...@apache.org>.
brettbuddin commented on PR #33983:
URL: https://github.com/apache/arrow/pull/33983#issuecomment-1420032159

   @zeroshade After working the problem a bit I wasn't able to come up with a solution where augmenting the Flight server gave me what I needed. To me, it's more clear to separate the creation of the database from the gRPC service—making a dependency relationship. In `main.go` this results in the ability to just `defer` closure for when the server is completely out of the way, and in tests we just articulate closure in the suite's test teardown.
   
   The consequence is an added step in creation of the server. Given my experience with this example: the point is to provide a binary to use in integration testing. It's not so important to me how many steps it takes to create that gRPC service layer. Thoughts?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] lidavidm commented on pull request #33983: MINOR: [Go] Use shared cache in SQLite example.

Posted by "lidavidm (via GitHub)" <gi...@apache.org>.
lidavidm commented on PR #33983:
URL: https://github.com/apache/arrow/pull/33983#issuecomment-1412423801

   Hmm, you can't easily force Go to open two connections?
   
   Possibly have one client start a transaction (with just a BEGIN TRANSACTION) and hopefully that ties up a connection in the pool, forcing the next request to use a new connection?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] zeroshade commented on pull request #33983: MINOR: [Go] Use shared cache in SQLite example.

Posted by "zeroshade (via GitHub)" <gi...@apache.org>.
zeroshade commented on PR #33983:
URL: https://github.com/apache/arrow/pull/33983#issuecomment-1412463314

   @brettbuddin looks like you're getting "table already exists" errors in the CI. I'm guessing because I put the table creation into the startup of the server so when it goes to run the next test the tables already exist because of the cached option.
   
   You could potentially put the table creation into the test itself via `SetupSuite` and then have the tests run in parallel?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] brettbuddin commented on pull request #33983: MINOR: [Go] Use shared cache in SQLite example.

Posted by "brettbuddin (via GitHub)" <gi...@apache.org>.
brettbuddin commented on PR #33983:
URL: https://github.com/apache/arrow/pull/33983#issuecomment-1419340957

   @zeroshade Yep I'll take a shot at it today. I realize now I didn't actually add a test like we discussed either. Think once I adjust a few things we'll be able to have both.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] brettbuddin commented on pull request #33983: MINOR: [Go] Use shared cache in SQLite example.

Posted by "brettbuddin (via GitHub)" <gi...@apache.org>.
brettbuddin commented on PR #33983:
URL: https://github.com/apache/arrow/pull/33983#issuecomment-1412943819

   @zeroshade I took a crack at it. I realized that the server shutdown is never closing the SQLite database so even though we're creating a new server structure and overwriting `srv` in the tests, it will continue to think its in the same memory space. We'd have the same problem if we simply pushed it down into the tests without closure.
   
   I'm not exactly happy with the shape of what I have though. It seems like the owning Flight server should be able to ensure this is called without me manually calling it the teardown. function. However, there isn't a suitable hook point without being too invasive in `server.go` and friends. The `Shutdown()` I've added feels tacked on, because it is.
   
   Either `db.Close()` will need to be called or a new "memory space" will need to be created explicitly for each server start-up via something like `file:<random-id-for-server>?mode=memory&cache=shared` in dialing. My vote is for proper closure, because it's less articulation. However, if concurrent execution of test cases (like you mentioned) is a priority, the new memory space idea should be considered.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] brettbuddin commented on pull request #33983: MINOR: [Go] Use shared cache in SQLite example.

Posted by "brettbuddin (via GitHub)" <gi...@apache.org>.
brettbuddin commented on PR #33983:
URL: https://github.com/apache/arrow/pull/33983#issuecomment-1412425356

   > Possibly have one client start a transaction (with just a BEGIN TRANSACTION) and hopefully that ties up a connection in the pool, forcing the next request to use a new connection?
   
   @lidavidm This is similar to what I was thinking of trying. Let me sit with it for the day and see how far I get.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] zeroshade commented on pull request #33983: MINOR: [Go] Use shared cache in SQLite example.

Posted by "zeroshade (via GitHub)" <gi...@apache.org>.
zeroshade commented on PR #33983:
URL: https://github.com/apache/arrow/pull/33983#issuecomment-1412409845

   @lidavidm can you think of a good way to test this properly?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] ursabot commented on pull request #33983: MINOR: [Go] Use shared cache in SQLite example.

Posted by "ursabot (via GitHub)" <gi...@apache.org>.
ursabot commented on PR #33983:
URL: https://github.com/apache/arrow/pull/33983#issuecomment-1422403912

   Benchmark runs are scheduled for baseline = 63cb14e374dfb602aaac5319daf61258f17556e3 and contender = 39bad5442c6447bf07594b09e4b29118b3211460. 39bad5442c6447bf07594b09e4b29118b3211460 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Failed :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/c1eb9a548ac840b885ac88b1b582ea56...150aa9ef5d7945969ae992309a845bd0/)
   [Failed :arrow_down:1.22% :arrow_up:0.12%] [test-mac-arm](https://conbench.ursa.dev/compare/runs/c075289922034d53affc875c7223a6b9...4550260c557b4719b00fd4dc3972387a/)
   [Failed :arrow_down:0.0% :arrow_up:0.0%] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/9cd759abddb94d9b9d0d6164de193441...3dee5dfdb23f4a17a97a4f03a607887a/)
   [Finished :arrow_down:0.67% :arrow_up:0.03%] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/e897ac2aa9ac4c249bb1fe2bacf9166d...55ef4e4c61f44e6b825c36c26d303b8a/)
   Buildkite builds:
   [Failed] [`39bad544` ec2-t3-xlarge-us-east-2](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/2345)
   [Failed] [`39bad544` test-mac-arm](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/2373)
   [Failed] [`39bad544` ursa-i9-9960x](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/2343)
   [Finished] [`39bad544` ursa-thinkcentre-m75q](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/2364)
   [Failed] [`63cb14e3` ec2-t3-xlarge-us-east-2](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/2344)
   [Failed] [`63cb14e3` test-mac-arm](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/2372)
   [Failed] [`63cb14e3` ursa-i9-9960x](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/2342)
   [Finished] [`63cb14e3` ursa-thinkcentre-m75q](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/2363)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] brettbuddin commented on pull request #33983: MINOR: [Go] Use shared cache in SQLite example.

Posted by "brettbuddin (via GitHub)" <gi...@apache.org>.
brettbuddin commented on PR #33983:
URL: https://github.com/apache/arrow/pull/33983#issuecomment-1412424061

   It's possible this is causing some failures in CI. I'll have to go through it and see the impact.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] zeroshade commented on pull request #33983: MINOR: [Go] Use shared cache in SQLite example.

Posted by "zeroshade (via GitHub)" <gi...@apache.org>.
zeroshade commented on PR #33983:
URL: https://github.com/apache/arrow/pull/33983#issuecomment-1419299343

   @brettbuddin I agree with you, there should be a better hook point. Do you want to take a crack at being invasive and putting together a better hook point? If not, I think this solution is fine for now until we put together a better one.
   
   Thoughts?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] zeroshade merged pull request #33983: MINOR: [Go] Use shared cache in SQLite example.

Posted by "zeroshade (via GitHub)" <gi...@apache.org>.
zeroshade merged PR #33983:
URL: https://github.com/apache/arrow/pull/33983


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org