Posted to notifications@skywalking.apache.org by "Superskyyy (via GitHub)" <gi...@apache.org> on 2023/02/18 21:49:06 UTC

[GitHub] [skywalking] Superskyyy opened a new issue, #10408: [Feature] Python agent performance enhancement with asyncio

Superskyyy opened a new issue, #10408:
URL: https://github.com/apache/skywalking/issues/10408

   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/skywalking/issues?q=is%3Aissue) and found no similar feature requirement.
   
   
   ### Description
   
   Currently, our Python agent uses the `threading` module to run its data reporters. As the agent has grown to full capability, it needs more resources than when only tracing was supported: we start many threads, and gRPC itself creates even more threads when streaming.
   
   Casual benchmarking with wrk shows that running FastAPI and Uvicorn together with the agent considerably reduces throughput.
   
   Background:
   
   In Python, the Global Interpreter Lock (GIL), which is still present at least through Python 3.12 (there is hope that it will be removed in the near future), ensures that **at any given time only one thread can execute Python code,** except during I/O waits and time spent in C libraries. Since our data reporting is mostly I/O bound, the threads will not block each other, but they waste a lot of work switching between threads to check whether their I/O tasks have completed.
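   
   To illustrate the point about I/O time, here is a minimal sketch (not agent code; `fake_io` and `run_threads` are made-up names) in which `time.sleep` stands in for a blocking network call. Because the wait releases the GIL, the four threads overlap instead of running one after another:

```python
import threading
import time


def fake_io(delay: float) -> None:
    # time.sleep releases the GIL, much like a blocking socket call would.
    time.sleep(delay)


def run_threads(count: int, delay: float) -> float:
    """Run `count` threads that each wait `delay` seconds; return elapsed time."""
    threads = [threading.Thread(target=fake_io, args=(delay,)) for _ in range(count)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


elapsed = run_threads(count=4, delay=0.2)
# The waits overlap, so the total is close to one delay rather than four.
print(f"4 threads x 0.2s of simulated I/O took {elapsed:.2f}s")
```

   The threads do not block each other here, but each one still has to be scheduled and woken up by the interpreter, which is the overhead this issue is about.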
   
   *Asyncio*: asyncio is a built-in Python library that provides cooperative multitasking through async/await coroutines. Each of the protocols we use (gRPC, HTTP, Kafka) has mature support for asyncio-based clients. This will eliminate thread-switching cost within the agent scope and give us finer control over I/O waits.
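   
   As a rough sketch of what this could look like (all names are hypothetical, and `send_batch` only simulates a network call with `asyncio.sleep`), three reporters can share one thread and one event loop, each yielding control whenever it waits:

```python
import asyncio


async def send_batch(batch: list) -> int:
    # Stand-in for a network call (gRPC/HTTP/Kafka); hypothetical, no real I/O.
    await asyncio.sleep(0.01)
    return len(batch)


async def reporter(queue: asyncio.Queue, batch_size: int = 3) -> int:
    """Drain the queue in batches until a None sentinel arrives."""
    sent = 0
    batch = []
    while True:
        item = await queue.get()  # yields to other coroutines while waiting
        if item is None:
            break
        batch.append(item)
        if len(batch) >= batch_size:
            sent += await send_batch(batch)
            batch = []
    if batch:
        sent += await send_batch(batch)
    return sent


async def main() -> dict:
    queues = {name: asyncio.Queue() for name in ("trace", "log", "meter")}
    for name, q in queues.items():
        for i in range(5):
            q.put_nowait(f"{name}-{i}")
        q.put_nowait(None)  # sentinel: no more data
    # All three reporters share one thread and one event loop.
    results = await asyncio.gather(*(reporter(q) for q in queues.values()))
    return dict(zip(queues, results))


print(asyncio.run(main()))
```

   A real reporter would replace `send_batch` with the async client of the chosen protocol; the queue-and-batch structure is only one possible design.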
   
   *Uvloop*: uvloop is also a mature library, used by Uvicorn as a drop-in replacement for the built-in Python event loop. It can provide a 2-4x speedup over the native event loop.
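   
   Since uvloop is a third-party package that is not available everywhere (for example on Windows), it would probably need to stay optional. A minimal sketch of such a fallback, assuming a hypothetical helper `run_on_preferred_loop`:

```python
import asyncio

try:
    import uvloop  # third-party and optional

    new_loop = uvloop.new_event_loop
except ImportError:
    # Fall back to the built-in event loop when uvloop is not installed.
    new_loop = asyncio.new_event_loop


async def ping() -> str:
    await asyncio.sleep(0)
    return "ok"


def run_on_preferred_loop(coro):
    """Run a coroutine to completion on uvloop if available, else on asyncio."""
    loop = new_loop()
    try:
        return loop.run_until_complete(coro)
    finally:
        loop.close()


print(run_on_preferred_loop(ping()))
```

   The rest of the code stays the same either way, which is what makes uvloop a drop-in replacement.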
   
   The plan is to deprecate the current data reporters (Trace/Log/Meter), and maybe also the profilers, or to provide an alternative implementation of them. The alternative implementation should also work for gRPC/HTTP/Kafka using the corresponding async clients.
   
   gRPC with asyncio API: https://grpc.github.io/grpc/python/grpc_asyncio.html
   
   To replace: Sync API
   
   HTTP: aiohttp or HTTPX. I personally prefer aiohttp since it is more mature, but the two should be easily swappable, so we could try both and see which is better.
   
   To replace: Requests
   
   
   Kafka: 
   Confluent client (seems better, yet its asyncio support is a bit shaky): https://www.confluent.io/blog/kafka-python-asyncio-integration/
   aiokafka (may not be that actively maintained): https://github.com/aio-libs/aiokafka
   
   To replace: kafka-python (which is also unmaintained)
   
   Important consideration: an event loop cannot survive a fork, so be careful to postpone agent start when a fork can be predicted (so that, for example, Gunicorn with Uvicorn workers works properly).
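   
   One way to guard against this, sketched here with a hypothetical `LoopRunner` helper (not the agent's actual design), is to remember the PID that created the loop and recreate the loop and its thread when the PID no longer matches, since a forked child inherits the object but not the running thread:

```python
import asyncio
import os
import threading


class LoopRunner:
    """Hypothetical helper: owns a background event loop tied to one process."""

    def __init__(self) -> None:
        self._pid = None
        self._loop = None
        self._thread = None

    def ensure_started(self) -> asyncio.AbstractEventLoop:
        # After a fork the child inherits this object but not the running
        # thread, so a PID mismatch means the loop must be recreated.
        if self._pid != os.getpid():
            self._loop = asyncio.new_event_loop()
            self._thread = threading.Thread(
                target=self._loop.run_forever, daemon=True
            )
            self._thread.start()
            self._pid = os.getpid()
        return self._loop

    def stop(self) -> None:
        # Only shut down a loop that belongs to the current process.
        if self._loop is not None and self._pid == os.getpid():
            self._loop.call_soon_threadsafe(self._loop.stop)
            self._thread.join()
            self._loop.close()
        self._pid = self._loop = self._thread = None


runner = LoopRunner()
loop = runner.ensure_started()
future = asyncio.run_coroutine_threadsafe(asyncio.sleep(0, result="started"), loop)
print(future.result(timeout=5))
runner.stop()
```

   Calling `ensure_started()` lazily, on first use in each worker process, has the same effect as postponing agent start until after the fork.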
   
   Since FastAPI and other ASGI-based web frameworks now dominate the Python web stack, and direct fork usage is very rare (an event loop can safely use a process pool executor), this can happen almost without breaking any user application.
   
   The old reporters should be kept as a fallback for a release or two and then gradually deprecated.
   
   
   
   ### Use case
   
   Significantly lower the I/O overhead that the Python agent introduces to user applications.
   
   
   
   ### Related issues
   
   There could also be overhead in non-optimized span creation, yet the IO overhead should be addressed first as the primary target. 
   
   ### Are you willing to submit a PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] kezhenxu94 commented on issue #10408: [GSOC][Feature] Python agent performance enhancement with asyncio

Posted by "kezhenxu94 (via GitHub)" <gi...@apache.org>.
kezhenxu94 commented on issue #10408:
URL: https://github.com/apache/skywalking/issues/10408#issuecomment-1484323334

   > > Hey @Superskyyy, Thanks for letting me know, and yeah, sure, I'm eager to contribute.
   > 
   > 
   > 
   > Cool! We currently have a potential thing yet to be implemented. Like our OAP backend repo, the Python agent could benefit from publishing ghcr images on every successful commit to the main branch.
   
   Hey, this is WIP in https://github.com/apache/skywalking-python/pull/297




[GitHub] [skywalking] Nageshbansal commented on issue #10408: [Feature] Python agent performance enhancement with asyncio

Posted by "Nageshbansal (via GitHub)" <gi...@apache.org>.
Nageshbansal commented on issue #10408:
URL: https://github.com/apache/skywalking/issues/10408#issuecomment-1462717707

   Hey @Superskyyy, I would like to work on the Python agent. Could you please suggest some issues I can work on to gain experience with this project? Thanks.




[GitHub] [skywalking] Superskyyy commented on issue #10408: [Feature] Python agent performance enhancement with asyncio

Posted by "Superskyyy (via GitHub)" <gi...@apache.org>.
Superskyyy commented on issue #10408:
URL: https://github.com/apache/skywalking/issues/10408#issuecomment-1435777137

   Two things should be done:
   1. A simple but reliable performance test job in CI/local.
   2. Replace all clients with asyncio-based ones.
   




[GitHub] [skywalking] Superskyyy commented on issue #10408: [GSOC][Feature] Python agent performance enhancement with asyncio

Posted by "Superskyyy (via GitHub)" <gi...@apache.org>.
Superskyyy commented on issue #10408:
URL: https://github.com/apache/skywalking/issues/10408#issuecomment-1484329595

   > > > Hey @Superskyyy, Thanks for letting me know, and yeah, sure, I'm eager to contribute.
   > > 
   > > 
   > > 
   > > Cool! We currently have a potential thing yet to be implemented. Like our OAP backend repo, the Python agent could benefit from publishing ghcr images on every successful commit to the main branch.
   > 
   > Hey, this is WIP in https://github.com/apache/skywalking-python/pull/297
   
   Right, as @kezhenxu94 said, a student has already been working on the task for a week. I forgot to update here.
   




[GitHub] [skywalking] Superskyyy commented on issue #10408: [Feature] Python agent performance enhancement with asyncio

Posted by "Superskyyy (via GitHub)" <gi...@apache.org>.
Superskyyy commented on issue #10408:
URL: https://github.com/apache/skywalking/issues/10408#issuecomment-1469057983

   > Hey @Superskyyy, I would like to work on the Python agent. Could you please suggest some issues I can work on to gain experience with this project? Thanks.
   
   Hi there, since @FAWC438 has already taken various steps on a similar Python agent issue, this one is currently reserved for him, and I'm afraid it won't be available for you. However, if you would like to work on the Python agent in general, I can certainly help you get started. Have you tried it and SkyWalking OAP yet?




[GitHub] [skywalking] Superskyyy commented on issue #10408: [Feature] Python agent performance enhancement with asyncio

Posted by "Superskyyy (via GitHub)" <gi...@apache.org>.
Superskyyy commented on issue #10408:
URL: https://github.com/apache/skywalking/issues/10408#issuecomment-1444973840

   > > @FAWC438 Since you said you are new to open source (though I know you are familiar with OpenTelemetry), I suggest looking at a less complex issue I just created, #10447. It should be a good way for you to familiarize yourself with open-source contribution (a potentially major enhancement from simple changes) and our ecosystem projects.
   > 
   > That's great. I will check it soon. Thank you very much.
   
   Please ping me in that issue when you are ready to work on it.




[GitHub] [skywalking] Superskyyy commented on issue #10408: [Feature] Python agent performance enhancement with asyncio

Posted by "Superskyyy (via GitHub)" <gi...@apache.org>.
Superskyyy commented on issue #10408:
URL: https://github.com/apache/skywalking/issues/10408#issuecomment-1444241709

   @FAWC438 Since you said you are new to open source (though I know you are familiar with OpenTelemetry), I suggest looking at a less complex issue I just created, https://github.com/apache/skywalking/issues/10447. It should be a good way for you to familiarize yourself with open-source contribution (a potentially major enhancement from simple changes) and our ecosystem projects.




[GitHub] [skywalking] Superskyyy commented on issue #10408: [Feature] Python agent performance enhancement with asyncio

Posted by "Superskyyy (via GitHub)" <gi...@apache.org>.
Superskyyy commented on issue #10408:
URL: https://github.com/apache/skywalking/issues/10408#issuecomment-1442785768

   @moz9ovic Hi there, were you saying that you will take this one? I saw the message was deleted.




[GitHub] [skywalking] wu-sheng commented on issue #10408: [GSOC][Feature] Python agent performance enhancement with asyncio

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on issue #10408:
URL: https://github.com/apache/skywalking/issues/10408#issuecomment-1696603136

   @Superskyyy @FAWC438 Is this a GSoC or OSPP project?




[GitHub] [skywalking] Superskyyy commented on issue #10408: [Feature] Python agent performance enhancement with asyncio

Posted by "Superskyyy (via GitHub)" <gi...@apache.org>.
Superskyyy commented on issue #10408:
URL: https://github.com/apache/skywalking/issues/10408#issuecomment-1436493366

   `loop.call_soon_threadsafe` or `asyncio.run_coroutine_threadsafe` should be used to wrap job submission to the asyncio event loop, instead of forcing users to `await` (which would otherwise propagate from the bottom up throughout the application, which is very bad).
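   
   A minimal sketch of that idea, assuming a background loop owned by the agent (`_loop`, `_report` and `report` are hypothetical names, and the network call is simulated):

```python
import asyncio
import threading

# Background loop owned by the agent; the names here are hypothetical.
_loop = asyncio.new_event_loop()
threading.Thread(target=_loop.run_forever, daemon=True).start()


async def _report(segment: str) -> str:
    await asyncio.sleep(0.01)  # stand-in for async network I/O
    return f"reported {segment}"


def report(segment: str):
    """Synchronous entry point: user code never needs `await`."""
    # Returns a concurrent.futures.Future, safe to use from any thread.
    return asyncio.run_coroutine_threadsafe(_report(segment), _loop)


future = report("segment-1")     # fire and forget, or...
print(future.result(timeout=5))  # ...block only when the result is needed
```

   The user-facing function stays synchronous, so instrumented application code does not have to become `async` just because the reporter is.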




[GitHub] [skywalking] Nageshbansal commented on issue #10408: [Feature] Python agent performance enhancement with asyncio

Posted by "Nageshbansal (via GitHub)" <gi...@apache.org>.
Nageshbansal commented on issue #10408:
URL: https://github.com/apache/skywalking/issues/10408#issuecomment-1470926860

   Hey @Superskyyy, Thanks for letting me know, and yeah, sure, I'm eager to contribute.




[GitHub] [skywalking] Nageshbansal commented on issue #10408: [GSOC][Feature] Python agent performance enhancement with asyncio

Posted by "Nageshbansal (via GitHub)" <gi...@apache.org>.
Nageshbansal commented on issue #10408:
URL: https://github.com/apache/skywalking/issues/10408#issuecomment-1484659130

   Okay, thanks for letting me know. And what about this current issue for GSoC?




[GitHub] [skywalking] Superskyyy commented on issue #10408: [Feature] Python agent performance enhancement with asyncio

Posted by "Superskyyy (via GitHub)" <gi...@apache.org>.
Superskyyy commented on issue #10408:
URL: https://github.com/apache/skywalking/issues/10408#issuecomment-1471022017

   > Hey @Superskyyy, Thanks for letting me know, and yeah, sure, I'm eager to contribute.
   
   Cool! We currently have a potential thing yet to be implemented. Like our OAP backend repo, the Python agent could benefit from publishing ghcr images on every successful commit to the main branch. If you would like to do this one, let me know and I will open an issue to track it. See https://github.com/apache/skywalking/blob/master/.github/workflows/publish-docker.yaml for reference.




[GitHub] [skywalking] wu-sheng closed issue #10408: [GSOC][Feature] Python agent performance enhancement with asyncio

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng closed issue #10408: [GSOC][Feature] Python agent performance enhancement with asyncio
URL: https://github.com/apache/skywalking/issues/10408




[GitHub] [skywalking] FAWC438 commented on issue #10408: [Feature] Python agent performance enhancement with asyncio

Posted by "FAWC438 (via GitHub)" <gi...@apache.org>.
FAWC438 commented on issue #10408:
URL: https://github.com/apache/skywalking/issues/10408#issuecomment-1442902278

   Can I get more information about this issue? I'm new to open source and noticed the SkyWalking community on GSoC 2023. I have some experience with Python async that may be helpful with this one.




[GitHub] [skywalking] Nageshbansal commented on issue #10408: [GSOC][Feature] Python agent performance enhancement with asyncio

Posted by "Nageshbansal (via GitHub)" <gi...@apache.org>.
Nageshbansal commented on issue #10408:
URL: https://github.com/apache/skywalking/issues/10408#issuecomment-1484222917

   Hello @Superskyyy, I apologize for the delayed response; I have been traveling since last week. Thank you for bringing this to my attention, and I am happy to look into it. Additionally, I wanted to mention that I am considering submitting a proposal for the project in GSoC. If you have a moment, I would appreciate your feedback on whether the idea I have in mind would be a good fit for the program.




[GitHub] [skywalking] Superskyyy commented on issue #10408: [GSOC][Feature] Python agent performance enhancement with asyncio

Posted by "Superskyyy (via GitHub)" <gi...@apache.org>.
Superskyyy commented on issue #10408:
URL: https://github.com/apache/skywalking/issues/10408#issuecomment-1485474485

   > Okay, thanks for letting me know. And what about this current issue for GSoC?
   
   As stated before, this issue has been actively worked on for a month and has already made significant progress. Maybe you can take a look at the BanyanDB GSoC tasks; I'm not sure whether they already have potential candidates.




[GitHub] [skywalking] Superskyyy commented on issue #10408: [GSOC][Feature] Python agent performance enhancement with asyncio

Posted by "Superskyyy (via GitHub)" <gi...@apache.org>.
Superskyyy commented on issue #10408:
URL: https://github.com/apache/skywalking/issues/10408#issuecomment-1696628414

   > @Superskyyy @FAWC438 Is this a GSoC or OSPP project?
   
   It's a GSoC project. There might be some further adjustments coming, but the main body is done.




[GitHub] [skywalking] moz9ovic commented on issue #10408: [Feature] Python agent performance enhancement with asyncio

Posted by "moz9ovic (via GitHub)" <gi...@apache.org>.
moz9ovic commented on issue #10408:
URL: https://github.com/apache/skywalking/issues/10408#issuecomment-1442020606

   Hello @Superskyyy, I'm willing to take it on.




[GitHub] [skywalking] Nageshbansal commented on issue #10408: [GSOC][Feature] Python agent performance enhancement with asyncio

Posted by "Nageshbansal (via GitHub)" <gi...@apache.org>.
Nageshbansal commented on issue #10408:
URL: https://github.com/apache/skywalking/issues/10408#issuecomment-1486133074

   Okay. Thanks




[GitHub] [skywalking] FAWC438 commented on issue #10408: [Feature] Python agent performance enhancement with asyncio

Posted by "FAWC438 (via GitHub)" <gi...@apache.org>.
FAWC438 commented on issue #10408:
URL: https://github.com/apache/skywalking/issues/10408#issuecomment-1444969992

   > @FAWC438 Since you said you are new to open source (though I know you are familiar with OpenTelemetry), I suggest looking at a less complex issue I just created, https://github.com/apache/skywalking/issues/10447. It should be a good way for you to familiarize yourself with open-source contribution (a potentially major enhancement from simple changes) and our ecosystem projects.
   
   That's great. I will check it soon. Thank you very much.




[GitHub] [skywalking] Superskyyy commented on issue #10408: [Feature] Python agent performance enhancement with asyncio

Posted by "Superskyyy (via GitHub)" <gi...@apache.org>.
Superskyyy commented on issue #10408:
URL: https://github.com/apache/skywalking/issues/10408#issuecomment-1444046831

   > Can I get more information about this issue? I'm new to open source and noticed the SkyWalking community on GSoC 2023. I have some experience with Python async that may be helpful with this one.
   
   Hello there! Good to see that you are interested. What specific info do you require in addition to the description?
   
   Btw, regarding GSoC 2023, do you have a rough idea of what to work on? This particular issue cannot qualify as a GSoC topic, as it cannot wait until the summer (it is aimed at the next release). But I'm more than happy to work with you to come up with a valid one. For detailed discussions on GSoC/OSPP, you can send me an email.

