Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/04/24 07:54:31 UTC

[GitHub] [arrow] motybz opened a new issue, #12974: [Python][flight] How to log the GeneratorStream duration

motybz opened a new issue, #12974:
URL: https://github.com/apache/arrow/issues/12974

   
   I made a do_get function that returned the GeneratorStream object.
   
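   ```python
   import logging
   import time
   from functools import wraps
   
   import pyarrow as pa
   import pyarrow.flight  # makes pa.flight available
   
   def time_it(func):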
       """This decorator prints the execution time for the decorated function."""
   
       @wraps(func)
       def wrapper(*args, **kwargs):
           start = time.time()
           result = func(*args, **kwargs)
           end = time.time()
           logging.debug("{} executed in {}s".format(func.__name__, round(end - start, 2)))
           return result
   
       return wrapper
   
   class FlightServer(pa.flight.FlightServerBase):
       def __init__(self, host="localhost", location=None,filesystem="s3", **kwargs):
           super(FlightServer, self).__init__(location, **kwargs)
   
       @time_it
       def do_get(self, context, ticket):
           ....
           scanner = dataset.scanner(
               batch_size=1000*1000*10
               )
           return pa.flight.GeneratorStream(
               scanner.projected_schema, scanner.to_batches()
               )
   ```
   The logged time covers only the do_get call itself (the initial connection), not the complete data transfer between the server and the client.
   How can I log the complete duration of the data streaming, measured on the server side?
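
A minimal sketch of why the decorator reports such a short time, assuming the batches are produced lazily, as with a Python generator. The names `slow_batches` and `make_stream` are hypothetical stand-ins for `scanner.to_batches()` and `do_get`; no Flight code is involved. Building the stream returns almost immediately, and the real work only happens while the stream is consumed, which in `do_get` is after the decorated function has already returned.

```python
import time


def slow_batches(n, delay=0.01):
    """Hypothetical stand-in for a lazy batch source: yields n items slowly."""
    for i in range(n):
        time.sleep(delay)
        yield i


def make_stream():
    # Like do_get above: builds the generator and returns right away.
    return slow_batches(5)


start = time.monotonic()
stream = make_stream()
creation_time = time.monotonic() - start  # no item has been produced yet

start = time.monotonic()
items = list(stream)  # the sleeping/producing happens only during consumption
consumption_time = time.monotonic() - start

print(f"creation: {creation_time:.4f}s, consumption: {consumption_time:.4f}s")
```

A timer wrapped around `make_stream` alone therefore sees only the creation step, which matches the behaviour described above.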


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] lidavidm commented on issue #12974: [Python][flight] How to log the GeneratorStream duration

Posted by GitBox <gi...@apache.org>.
lidavidm commented on issue #12974:
URL: https://github.com/apache/arrow/issues/12974#issuecomment-1108490825

   Do the timing in the generator itself, or use middleware, something like
   
   ```python
   import time
   
   from pyarrow import flight
   
   class TimingServerMiddleware(flight.ServerMiddleware):
       def __init__(self, start):
           self.start = start
           
       def call_completed(self, exception):
           print("Duration:", time.monotonic() - self.start)
   
   class TimingServerMiddlewareFactory(flight.ServerMiddlewareFactory):
       def start_call(self, info, headers):
           return TimingServerMiddleware(time.monotonic())
   
   server = Server(..., middlware={"timing": TimingServerMiddlewareFactory()})
   ```
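
The first option, timing in the generator itself, could look roughly like the sketch below. `timed_batches` and its `label` parameter are hypothetical names, not part of pyarrow; the wrapper works on any iterable, so under that assumption it could be passed to the stream as `pa.flight.GeneratorStream(scanner.projected_schema, timed_batches(scanner.to_batches()))`.

```python
import logging
import time


def timed_batches(batches, label="do_get"):
    """Yield every item from `batches`, then log how long the stream took."""
    # A generator body does not run until the first item is requested,
    # so the clock starts at the first pull, not when do_get returns.
    start = time.monotonic()
    count = 0
    try:
        for batch in batches:
            count += 1
            yield batch
    finally:
        # Runs when the source is exhausted, raises, or the generator is closed early.
        elapsed = time.monotonic() - start
        logging.debug("%s streamed %d batches in %.2fs", label, count, elapsed)
```

This measures the time the server spends pulling batches out of the generator. The middleware variant instead measures the whole call, from `start_call` to `call_completed`, so the two numbers may differ slightly.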




[GitHub] [arrow] motybz commented on issue #12974: [Python][flight] How to log the GeneratorStream duration

Posted by GitBox <gi...@apache.org>.
motybz commented on issue #12974:
URL: https://github.com/apache/arrow/issues/12974#issuecomment-1117201801

   It works, thank you...
   
   I made some changes and fixed the "middlware" typo
   
   ```python
   class TimingServerMiddleware(flight.ServerMiddleware):
       def __init__(self, start):
           self.start = start
   
       def call_completed(self, exception):
           logging.debug(f"Data consumed in {time.monotonic() - self.start}s")
   
   class TimingServerMiddlewareFactory(flight.ServerMiddlewareFactory):
       def start_call(self, info, headers):
           return TimingServerMiddleware(time.monotonic())
   ```

