Posted to github@arrow.apache.org by "kat-grayson (via GitHub)" <gi...@apache.org> on 2023/05/18 09:36:59 UTC

[GitHub] [arrow] kat-grayson commented on issue #35508: [C++][Python] Adding data to tdigest in pyarrow

kat-grayson commented on issue #35508:
URL: https://github.com/apache/arrow/issues/35508#issuecomment-1552795987

   Hi @marsupialtail, I have just tried your implementation above using the package `ldbpy`, and unfortunately I'm running into some errors. The lines I tried to run are:
   
   ```python
   import pyarrow as pa
   import numpy as np
   from pyarrow.cffi import ffi
   import polars
   import ldbpy, time
   import pyarrow.compute as pac

   c_schema = ffi.new("struct ArrowSchema*")
   schema_ptr = int(ffi.cast("uintptr_t", c_schema))
   c_array = ffi.new("struct ArrowArray*")
   array_ptr = int(ffi.cast("uintptr_t", c_array))

   a = ldbpy.NTDigest(20,100,10000)
   start = time.time()

   a.batch_add_arrow([array_ptr] * 20, [schema_ptr] * 20)
   ```
   
   Running the last line with the function `batch_add_arrow` caused the error:

   ```
   Assertion failed: arrowSchema->release != nullptr
   Error message: arrowSchema was released.
   terminate called recursively
   Aborted
   ```
   
   Do you know what's happening here? I also have a more general question: in the line `a = ldbpy.NTDigest(20,100,10000)`, what do the input values represent?

