Posted to github@arrow.apache.org by "Fokko (via GitHub)" <gi...@apache.org> on 2023/06/05 22:44:12 UTC

[GitHub] [arrow] Fokko commented on issue #35748: [Python] Implement efficient merging of chunked arrays

Fokko commented on issue #35748:
URL: https://github.com/apache/arrow/issues/35748#issuecomment-1577578808

   @jorisvandenbossche Sorry for not being clear. This is not about the concat operation but about a merge operation. Both input arrays are already sorted and contain only unique values. The two arrays can overlap with each other, but there is no overlap within a single array.
   
   The expected output would be:
   ```
   >>> merged_chunked = pa.chunked_array([chunk for arr in [a1, a2] for chunk in arr.chunks])
   >>> merged_chunked.unique()
   <pyarrow.lib.Int64Array object at 0x7fe2e692f880>
   [
     1,
     2,
     3,
     4, <-- 4 comes after 3 and before 6
     6,
     7, <-- 7 is present in both arrays, but we are interested in the value just once.
     8
   ]
   ```
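
To make the intended semantics concrete, here is a minimal sketch of a two-pointer merge in plain Python. It is not a pyarrow API, and the function name `merge_sorted_unique` and the sample values of `a1` and `a2` are hypothetical (chosen only so that the result matches the output above). It assumes both inputs are sorted and free of duplicates, so a value shared by both inputs is emitted once.

```python
def merge_sorted_unique(left, right):
    """Merge two sorted, duplicate-free sequences into one sorted, duplicate-free list."""
    merged = []
    i = j = 0
    # Walk both inputs in lockstep, always taking the smaller head value.
    while i < len(left) and j < len(right):
        if left[i] < right[j]:
            merged.append(left[i])
            i += 1
        elif right[j] < left[i]:
            merged.append(right[j])
            j += 1
        else:
            # Same value in both inputs: keep it once and advance both sides.
            merged.append(left[i])
            i += 1
            j += 1
    # At most one of these tails is non-empty; it is already sorted and unique.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


# Hypothetical sample inputs, not taken from the issue.
a1 = [1, 3, 6, 7]
a2 = [2, 4, 7, 8]
result = merge_sorted_unique(a1, a2)
print(result)
```

Because each element is visited once, this runs in time linear in the combined length, whereas concatenating and calling `unique()` does not exploit the fact that the inputs are already sorted.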

