Posted to issues@arrow.apache.org by "marcin-krystianc (via GitHub)" <gi...@apache.org> on 2024/03/14 11:36:40 UTC

[I] Is it possible to enable logging with Python/PyArrow ? [arrow]

marcin-krystianc opened a new issue, #40550:
URL: https://github.com/apache/arrow/issues/40550

   ### Describe the enhancement requested
   
   Hi,
   
   We have repeatedly run into situations where more debug logs would have helped us understand performance discrepancies when reading Parquet files. It turns out that different APIs use different default values for performance-relevant parameters.
   If it were possible to enable debug logs and, for example, see which parameters are used to create a `ParquetFileReader`, it would be much easier to understand these differences.
   
   Would you be open to contributions that:
   - Add the possibility of enabling logging from PyArrow (as far as I can tell, it is currently only possible from C++)
   - Extend logging around `ParquetFileReader`
   
   ### Component(s)
   
   Parquet, Python


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@arrow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] [Python] Is it possible to enable logging with Python/PyArrow ? [arrow]

Posted by "marcin-krystianc (via GitHub)" <gi...@apache.org>.
marcin-krystianc commented on issue #40550:
URL: https://github.com/apache/arrow/issues/40550#issuecomment-2003642989

   > Existing code has some "tracing", would that work?
   
   The existing tracing/logging infrastructure is OK, but:
   - I don't see a way to call `arrow::util::ArrowLog::StartArrowLog` from PyArrow
   - The traces I have in mind are not implemented yet (there is in fact very little tracing in the entire library)


Re: [I] [Python] Is it possible to enable logging with Python/PyArrow ? [arrow]

Posted by "marcin-krystianc (via GitHub)" <gi...@apache.org>.
marcin-krystianc commented on issue #40550:
URL: https://github.com/apache/arrow/issues/40550#issuecomment-2003617188

   For now, I'm thinking about adding code to log the contents of `parquet::ReaderProperties` and `parquet::ArrowReaderProperties`.
   For example, the value of the `pre_buffer` parameter has a large impact on high-latency file systems. Sometimes that value is not set explicitly and the user relies on its implicit default. If the user then switches to a different API, they might end up with a different value for `pre_buffer` without being aware of it. With the logs I have in mind, it would be much easier to track these situations down.
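
To illustrate the idea, here is a minimal Python sketch of what such logging could look like. It is not PyArrow code: the logger name, the `log_effective_args` decorator, and the two reader functions are hypothetical stand-ins, and the default values shown are made up for the example. The point is only that logging the resolved arguments, and marking which ones came from an implicit default, makes a silent difference in `pre_buffer` visible.

```python
import functools
import inspect
import logging

# Hypothetical logger name, chosen only for this sketch.
logger = logging.getLogger("parquet_reader_debug")


def log_effective_args(func):
    """Log every argument a call resolves to, including implicit defaults."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        explicit = set(bound.arguments)  # names the caller actually passed
        bound.apply_defaults()           # fill in the defaults the caller omitted
        for name, value in bound.arguments.items():
            origin = "explicit" if name in explicit else "default"
            logger.debug("%s: %s=%r (%s)", func.__name__, name, value, origin)
        return func(*args, **kwargs)

    return wrapper


# Hypothetical stand-ins for two reading APIs whose defaults disagree.
@log_effective_args
def read_whole_table(path, pre_buffer=True):
    return {"path": path, "pre_buffer": pre_buffer}


@log_effective_args
def open_file_reader(path, pre_buffer=False):
    return {"path": path, "pre_buffer": pre_buffer}


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    read_whole_table("data.parquet")   # logs pre_buffer=True, marked as a default
    open_file_reader("data.parquet")   # logs pre_buffer=False, marked as a default
```

With debug logging enabled, the two calls above emit one line per parameter, so a user comparing the two code paths can see at a glance that neither call set `pre_buffer` and that the effective values differ.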
   


Re: [I] [Python] Is it possible to enable logging with Python/PyArrow ? [arrow]

Posted by "mapleFU (via GitHub)" <gi...@apache.org>.
mapleFU commented on issue #40550:
URL: https://github.com/apache/arrow/issues/40550#issuecomment-2002793148

   🤔What kind of logging would you like to have?


Re: [I] [Python] Is it possible to enable logging with Python/PyArrow ? [arrow]

Posted by "mapleFU (via GitHub)" <gi...@apache.org>.
mapleFU commented on issue #40550:
URL: https://github.com/apache/arrow/issues/40550#issuecomment-2003629683

   The existing code has some "tracing"; would that work?

