Posted to commits@devlake.apache.org by "samcrang (via GitHub)" <gi...@apache.org> on 2023/03/24 15:15:46 UTC

[GitHub] [incubator-devlake] samcrang opened a new issue, #4772: [Bug][pydevlake] Panic when running blueprint with large-ish amounts of data

samcrang opened a new issue, #4772:
URL: https://github.com/apache/incubator-devlake/issues/4772

   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/incubator-devlake/issues?q=is%3Aissue) and found no similar issues.
   
   
   ### What happened
   
   We have an internal release tool, and I've written a plugin that scrapes releases from it using the new `pydevlake` plugin framework.
   
   I've obfuscated some of the logs because the details aren't relevant outside of where I work, so I hope I haven't removed anything useful.
   
   When we trigger a run of a blueprint (`curl -v -X 'POST' 'http://localhost:8080/blueprints/1/trigger'`), we end up with a `panic`:
   
   ```
   ...
   time="2023-03-24 15:01:22" level=info msg=" [pipeline service] [pipeline #1] [task #1] [collectHopperReleases] DEBUG: https://***.net:443 "GET /api/apps/test-http/releases?page=13 HTTP/1.1" 200 None"
   time="2023-03-24 15:01:22" level=info msg=" [pipeline service] [pipeline #1] [task #1] [collectHopperReleases] DEBUG: https://***.net:443 "GET /api/apps/test-http/releases?page=15 HTTP/1.1" 200 None"
   time="2023-03-24 15:01:22" level=info msg=" [pipeline service] [pipeline #1] [task #1] [collectHopperReleases] DEBUG: https://***.net:443 "GET /api/apps/test-http/releases?page=17 HTTP/1.1" 200 None"
   time="2023-03-24 15:01:23" level=info msg=" [pipeline service] [pipeline #1] [task #1] [collectHopperReleases] DEBUG: https://***.net:443 "GET /api/apps/test-http/releases?page=19 HTTP/1.1" 200 None"
   time="2023-03-24 15:01:23" level=info msg=" [pipeline service] [pipeline #1] [task #1] [collectHopperReleases] DEBUG: https://***.net:443 "GET /api/apps/test-http/releases?page=21 HTTP/1.1" 200 None"
   panic: runtime error: invalid memory address or nil pointer dereference
   [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x10082d0aa]
   
   goroutine 77 [running]:
   github.com/apache/incubator-devlake/core/utils.scanOutputPipe.func1()
           /Users/sam/src/incubator-devlake/backend/core/utils/ipc.go:246 +0xea
   created by github.com/apache/incubator-devlake/core/utils.StreamProcess
           /Users/sam/src/incubator-devlake/backend/core/utils/ipc.go:190 +0x34b
   exit status 2
   make[1]: *** [run] Error 1
   make: *** [dev] Error 2
   ```
   
   This works fine when run directly:
   
   ```
   poetry run hopper/main.py collect "{\"db_url\": \"mysql://deliveroo:deliveroo@127.0.0.1:3306/devlake\", \"scope\": {\"id\": \"test-http\", \"name\": \"test-http\"}, \"connection\": { \"id\": 1, \"name\": \"hopper\", \"endpoint\": \"https://***.net\", \"token\": \"${TOKEN}\"}}" releases 3>&1
   DEBUG: Starting new HTTPS connection (1): hopper-sandbox.deliveroo.net:443
   DEBUG: https://***.net:443 "GET /api/apps/test-http/releases?page=1 HTTP/1.1" 200 None
   DEBUG: https://***.net:443 "GET /api/apps/test-http/releases?page=3 HTTP/1.1" 200 None
   DEBUG: https://***.net:443 "GET /api/apps/test-http/releases?page=5 HTTP/1.1" 200 None
   DEBUG: https://***.net:443 "GET /api/apps/test-http/releases?page=7 HTTP/1.1" 200 None
   DEBUG: https://***.net:443 "GET /api/apps/test-http/releases?page=9 HTTP/1.1" 200 None
   DEBUG: https://***.net:443 "GET /api/apps/test-http/releases?page=11 HTTP/1.1" 200 None
   DEBUG: https://***.net:443 "GET /api/apps/test-http/releases?page=13 HTTP/1.1" 200 None
   DEBUG: https://***.net:443 "GET /api/apps/test-http/releases?page=15 HTTP/1.1" 200 None
   DEBUG: https://***.net:443 "GET /api/apps/test-http/releases?page=17 HTTP/1.1" 200 None
   DEBUG: https://***.net:443 "GET /api/apps/test-http/releases?page=19 HTTP/1.1" 200 None
   DEBUG: https://***.net:443 "GET /api/apps/test-http/releases?page=21 HTTP/1.1" 200 None
   {"increment": 100, "current": 100}
   DEBUG: https://***.net:443 "GET /api/apps/test-http/releases?page=23 HTTP/1.1" 200 None
   DEBUG: https://***.net:443 "GET /api/apps/test-http/releases?page=23 HTTP/1.1" 200 None
   DEBUG: https://***.net:443 "GET /api/apps/test-http/releases?page=23 HTTP/1.1" 200 None
   DEBUG: https://***.net:443 "GET /api/apps/test-http/releases?page=25 HTTP/1.1" 200 None
   DEBUG: https://***.net:443 "GET /api/apps/test-http/releases?page=27 HTTP/1.1" 200 None
   DEBUG: https://***.net:443 "GET /api/apps/test-http/releases?page=29 HTTP/1.1" 200 None
   ...
   ```
   
   I suspect that [the message sent here](https://github.com/apache/incubator-devlake/blob/main/backend/python/pydevlake/pydevlake/subtasks.py#L68-L71) isn't handled correctly on the Go side of the IPC.
   
   I've worked around this in the short term by setting `sync_point_interval` to something very large.
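
The workaround above can be pictured with a minimal sketch, assuming the collector writes one JSON progress line every `sync_point_interval` records, in the shape of the `{"increment": 100, "current": 100}` line seen in the direct run. The names `collect` and `count_progress_messages` are hypothetical and are not the `pydevlake` API; only `sync_point_interval` and the message shape come from the report.

```python
import io
import json
from typing import Iterable, TextIO


def collect(records: Iterable[dict], progress_out: TextIO,
            sync_point_interval: int = 100) -> int:
    """Process records, writing a JSON progress line every
    `sync_point_interval` records. Hypothetical stand-in for a collector."""
    processed = 0
    for _ in records:
        processed += 1  # stand-in for actually storing the record
        if processed % sync_point_interval == 0:
            message = {"increment": sync_point_interval, "current": processed}
            progress_out.write(json.dumps(message) + "\n")
    return processed


def count_progress_messages(total_records: int, sync_point_interval: int) -> int:
    """Run `collect` over dummy records and count the progress lines written."""
    out = io.StringIO()
    collect(({"id": i} for i in range(total_records)), out, sync_point_interval)
    return len(out.getvalue().splitlines())
```

Under these assumptions, 250 records with an interval of 100 produce two progress lines, while a very large interval produces none. That would be consistent with the workaround avoiding the panic, since the suspected message is then never sent.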
   
   ### What do you expect to happen
   
   It should not panic; the blueprint should run successfully.
   
   ### How to reproduce
   
   See above.
   
   ### Anything else
   
   Every time.
   
   ### Version
   
   main
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@devlake.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-devlake] hezyin closed issue #4772: [Bug][pydevlake] Panic when running blueprint with large-ish amounts of data

Posted by "hezyin (via GitHub)" <gi...@apache.org>.
hezyin closed issue #4772: [Bug][pydevlake] Panic when running blueprint with large-ish amounts of data
URL: https://github.com/apache/incubator-devlake/issues/4772



[GitHub] [incubator-devlake] keon94 commented on issue #4772: [Bug][pydevlake] Panic when running blueprint with large-ish amounts of data

Posted by "keon94 (via GitHub)" <gi...@apache.org>.
keon94 commented on issue #4772:
URL: https://github.com/apache/incubator-devlake/issues/4772#issuecomment-1490502943

   I've fixed the issue here: https://github.com/apache/incubator-devlake/pull/4815
   The root cause was a simple missing nil check on the Go side. The fix will be merged soon.
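
As a generic illustration of this class of bug (not the actual code from the pull request, which is in Go), the sketch below shows a reader that parses lines from a progress stream and guards against a missing result before using it. Python's `None` stands in for Go's `nil`; the names `parse_progress` and `read_progress` are hypothetical.

```python
import json
from typing import Iterable, Iterator, Optional


def parse_progress(line: str) -> Optional[dict]:
    """Return the decoded progress message, or None if the line is not one."""
    try:
        message = json.loads(line)
    except json.JSONDecodeError:
        return None
    return message if isinstance(message, dict) else None


def read_progress(lines: Iterable[str]) -> Iterator[int]:
    """Yield the `current` counter from each well-formed progress line."""
    for line in lines:
        message = parse_progress(line)
        if message is None:
            # The guard: without it, message.get(...) below would raise
            # on any line that did not decode to a JSON object.
            continue
        current = message.get("current")
        if isinstance(current, int):
            yield current
```

Lines that are not valid progress messages are skipped instead of crashing the reader, which is the behaviour a missing nil check fails to provide.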



[GitHub] [incubator-devlake] samcrang commented on issue #4772: [Bug][pydevlake] Panic when running blueprint with large-ish amounts of data

Posted by "samcrang (via GitHub)" <gi...@apache.org>.
samcrang commented on issue #4772:
URL: https://github.com/apache/incubator-devlake/issues/4772#issuecomment-1497415325

   @keon94 We're running from 561ce0d and it looks like this no longer crashes 👍🏻 



[GitHub] [incubator-devlake] samcrang commented on issue #4772: [Bug][pydevlake] Panic when running blueprint with large-ish amounts of data

Posted by "samcrang (via GitHub)" <gi...@apache.org>.
samcrang commented on issue #4772:
URL: https://github.com/apache/incubator-devlake/issues/4772#issuecomment-1497415561

   Thanks for fixing this!



[GitHub] [incubator-devlake] klesh commented on issue #4772: [Bug][pydevlake] Panic when running blueprint with large-ish amounts of data

Posted by "klesh (via GitHub)" <gi...@apache.org>.
klesh commented on issue #4772:
URL: https://github.com/apache/incubator-devlake/issues/4772#issuecomment-1484703995

   Thanks for reporting the issue.



[GitHub] [incubator-devlake] keon94 commented on issue #4772: [Bug][pydevlake] Panic when running blueprint with large-ish amounts of data

Posted by "keon94 (via GitHub)" <gi...@apache.org>.
keon94 commented on issue #4772:
URL: https://github.com/apache/incubator-devlake/issues/4772#issuecomment-1489596909

   Hi @samcrang. So the default value of 100 in subtasks.py causes the crash? What value have you set it to? Also, is the crash consistent?



[GitHub] [incubator-devlake] keon94 commented on issue #4772: [Bug][pydevlake] Panic when running blueprint with large-ish amounts of data

Posted by "keon94 (via GitHub)" <gi...@apache.org>.
keon94 commented on issue #4772:
URL: https://github.com/apache/incubator-devlake/issues/4772#issuecomment-1496216303

   @samcrang the fix is in main. Can you confirm your issue is resolved?

