Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/04/24 07:54:31 UTC
[GitHub] [arrow] motybz opened a new issue, #12974: [Python][flight] How to log the GeneratorStream duration
motybz opened a new issue, #12974:
URL: https://github.com/apache/arrow/issues/12974
I made a do_get function that returned the GeneratorStream object.
```python
import logging
import time
from functools import wraps

import pyarrow as pa
import pyarrow.flight


def time_it(func):
    """This decorator logs the execution time of the decorated function."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.time()
        result = func(*args, **kwargs)
        end = time.time()
        logging.debug("{} executed in {}s".format(func.__name__, round(end - start, 2)))
        return result
    return wrapper


class FlightServer(pa.flight.FlightServerBase):
    def __init__(self, host="localhost", location=None, filesystem="s3", **kwargs):
        super(FlightServer, self).__init__(location, **kwargs)

    @time_it
    def do_get(self, context, ticket):
        # .... (elided in the original post)
        scanner = dataset.scanner(
            batch_size=1000 * 1000 * 10
        )
        return pa.flight.GeneratorStream(
            scanner.projected_schema, scanner.to_batches()
        )
```
The logged time is not for the complete data transfer between the server and the client but for the initial connection.
How can I log the complete duration of the data streaming (from the server-side...)?
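
The behaviour described above can be reproduced without Flight at all, since a generator does no work until something iterates over it. The following is a minimal sketch in plain Python; `timed_call`, `slow_batches`, and `make_stream` are hypothetical names used only for illustration, and the `time.sleep` merely stands in for the cost of producing a batch.

```python
import time
from functools import wraps


def timed_call(func):
    """Record (rather than log) how long the decorated call itself takes."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.monotonic()
        result = func(*args, **kwargs)
        wrapper.last_elapsed = time.monotonic() - start
        return result
    wrapper.last_elapsed = None
    return wrapper


def slow_batches(n, delay=0.05):
    # Stand-in for an iterator of record batches: each item takes time to produce.
    for i in range(n):
        time.sleep(delay)
        yield i


@timed_call
def make_stream():
    # Returns immediately: the generator body has not started running yet.
    return slow_batches(3)


stream = make_stream()
creation_time = make_stream.last_elapsed  # time spent inside the decorated call

start = time.monotonic()
items = list(stream)                      # the sleeps happen here, during iteration
consume_time = time.monotonic() - start
```

The decorated call returns almost instantly, while the iteration accounts for nearly all of the elapsed time. This is the same pattern as a `do_get` that returns a lazily evaluated stream.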
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] lidavidm commented on issue #12974: [Python][flight] How to log the GeneratorStream duration
Posted by GitBox <gi...@apache.org>.
lidavidm commented on issue #12974:
URL: https://github.com/apache/arrow/issues/12974#issuecomment-1108490825
Do the timing in the generator itself, or use middleware, something like
```python
import time
from pyarrow import flight

class TimingServerMiddleware(flight.ServerMiddleware):
    def __init__(self, start):
        self.start = start

    def call_completed(self, exception):
        print("Duration:", time.monotonic() - self.start)

class TimingServerMiddlewareFactory(flight.ServerMiddlewareFactory):
    def start_call(self, info, headers):
        return TimingServerMiddleware(time.monotonic())

# NB: the keyword is spelled "middleware"; it is misspelled here, as the reply below notes.
server = Server(..., middlware={"timing": TimingServerMiddlewareFactory()})
```
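
The first option, timing in the generator itself, might look roughly like the sketch below. It is an illustration rather than an Arrow API: `timed_batches` and its `label` and `log` parameters are hypothetical names, and the wrapper works on any iterable. Note that the clock starts when the first batch is requested, not when the wrapper is created. The `finally` block runs when the generator is exhausted or closed, and whether Flight closes the generator promptly when a client cancels is not something this sketch assumes.

```python
import logging
import time


def timed_batches(batches, label="do_get", log=logging.debug):
    """Yield every item from `batches`, then log how long the iteration took."""
    # This body only starts running on the first next() call,
    # so `start` marks the moment the consumer begins pulling batches.
    start = time.monotonic()
    count = 0
    try:
        for batch in batches:
            count += 1
            yield batch
    finally:
        # Runs on normal exhaustion and also if the generator is closed early.
        log("%s streamed %d batches in %.2fs", label, count, time.monotonic() - start)
```

In the `do_get` from the question, the wrapped iterator would be passed as the second argument, for example `pa.flight.GeneratorStream(scanner.projected_schema, timed_batches(scanner.to_batches()))`.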
[GitHub] [arrow] motybz commented on issue #12974: [Python][flight] How to log the GeneratorStream duration
Posted by GitBox <gi...@apache.org>.
motybz commented on issue #12974:
URL: https://github.com/apache/arrow/issues/12974#issuecomment-1117201801
It works, thank you...
I made some changes and fixed the "middlware" typo
```python
import logging
import time
from pyarrow import flight

class TimingServerMiddleware(flight.ServerMiddleware):
    def __init__(self, start):
        self.start = start

    def call_completed(self, exception):
        logging.debug(f"Data consumed in {time.monotonic() - self.start}s")

class TimingServerMiddlewareFactory(flight.ServerMiddlewareFactory):
    def start_call(self, info, headers):
        return TimingServerMiddleware(time.monotonic())
```