Posted to github@arrow.apache.org by "kat-grayson (via GitHub)" <gi...@apache.org> on 2023/05/18 09:36:59 UTC
[GitHub] [arrow] kat-grayson commented on issue #35508: [C++][Python] Adding data to tdigest in pyarrow
kat-grayson commented on issue #35508:
URL: https://github.com/apache/arrow/issues/35508#issuecomment-1552795987
Hi @marsupialtail, I have just tried your implementation above using the `ldbpy` package, and unfortunately I'm running into some errors. The lines I tried to run are:
```
import pyarrow as pa
import numpy as np
from pyarrow.cffi import ffi
import polars
import ldbpy, time
import pyarrow.compute as pac
c_schema = ffi.new("struct ArrowSchema*")
schema_ptr = int(ffi.cast("uintptr_t", c_schema))
c_array = ffi.new("struct ArrowArray*")
array_ptr = int(ffi.cast("uintptr_t", c_array))
a = ldbpy.NTDigest(20,100,10000)
start = time.time()
a.batch_add_arrow([array_ptr] * 20, [schema_ptr] * 20)
```
Running the last line, the call to `batch_add_arrow`, caused the error:
```
Assertion failed: arrowSchema->release != nullptr
Error message: arrowSchema was released.
terminate called recursively
Aborted
```
Do you know what's happening here? I also have a more general question: in the line `a = ldbpy.NTDigest(20,100,10000)`, what do the input values represent?