Posted to notifications@asterixdb.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2021/11/10 00:34:00 UTC

[jira] [Commented] (ASTERIXDB-2895) Support variable size buffers in Python UDF IPC

    [ https://issues.apache.org/jira/browse/ASTERIXDB-2895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17441444#comment-17441444 ] 

ASF subversion and git services commented on ASTERIXDB-2895:
------------------------------------------------------------

Commit 8d6cf7574430de4f4049fbf5e6726cf2ede5d8aa in asterixdb's branch refs/heads/master from Ian Maxon
[ https://gitbox.apache.org/repos/asf?p=asterixdb.git;h=8d6cf75 ]

[ASTERIXDB-2895][RT] Vsize buffers in PyUDF IPC

- user mode changes: no
- storage format changes: no
- interface changes: no

Details:

- Convert most uses of ByteBuffer to ArrayBackedValueStorage
  so that the size of the buffer can grow arbitrarily with
  the data
- Convert ADM-to-Msgpack serialization to use IVisitablePointable
- Convert all serialization interfaces that used ByteBuffer
  to use DataOutput instead
- Fix UTF8 encoding bugs by using StandardToModifiedUTF8DataOutput
- Adapt some of the UTF8 printing code to be used for
  UTF8 output to msgpack
- Fix CSV output printer to not ignore surrogate pairs
- Fix ASTERIXDB-29773 (returned records from PyUDF aren't sorted)
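
For readers unfamiliar with the UTF8 items above, the following is a minimal sketch of how standard UTF-8 and Java's modified UTF-8 differ. It uses only JDK classes and does not show AsterixDB's StandardToModifiedUTF8DataOutput; the class name Utf8Forms and its helper methods are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration class; not part of AsterixDB.
class Utf8Forms {
    // Standard UTF-8: a supplementary code point becomes one 4-byte sequence.
    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Modified UTF-8 as produced by DataOutput.writeUTF, with the
    // 2-byte length prefix stripped. Each UTF-16 surrogate is encoded
    // separately (3 bytes each) and U+0000 is encoded as two bytes.
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] all = bytes.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    public static void main(String[] args) {
        String emoji = "\uD83D\uDE00"; // U+1F600: one code point, two UTF-16 chars
        System.out.println(standardUtf8(emoji).length); // 4
        System.out.println(modifiedUtf8(emoji).length); // 6
    }
}
```

A consumer that expects standard UTF-8 would not accept the 6-byte form as a single character, which is presumably why strings containing surrogate pairs need care when they cross a serialization boundary.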

Change-Id: Ic95e592b42139b4750af8bb20291f926b3c973e2
Reviewed-on: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/12643
Reviewed-by: Wael Alkowaileet <wa...@gmail.com>
Integration-Tests: Jenkins <je...@fulliautomatix.ics.uci.edu>
Tested-by: Jenkins <je...@fulliautomatix.ics.uci.edu>
Contrib: Ian Maxon <im...@uci.edu>


> Support variable size buffers in Python UDF IPC
> -----------------------------------------------
>
>                 Key: ASTERIXDB-2895
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2895
>             Project: Apache AsterixDB
>          Issue Type: Improvement
>          Components: FUN - Functions
>    Affects Versions: 0.9.7
>            Reporter: Ian Maxon
>            Assignee: Ian Maxon
>            Priority: Major
>
> Currently, the Python IPC uses largely fixed buffer sizes for object serialization into Python. Ideally, it should support variable-sized buffers so that large objects can be used in UDFs without requiring huge buffer sizes.
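
The limitation described in the issue can be sketched with plain JDK classes. This is an illustration under stated assumptions, not the AsterixDB code: ByteArrayOutputStream stands in for a growable store such as ArrayBackedValueStorage, and the class GrowableBufferSketch, its methods, and the length-prefixed record layout are all hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;

// Hypothetical illustration class; not part of AsterixDB.
class GrowableBufferSketch {
    // Assumed record layout: 4-byte length followed by the payload.
    static void writeRecord(DataOutput out, byte[] payload) throws IOException {
        out.writeInt(payload.length);
        out.write(payload);
    }

    // Fixed-capacity path: returns false when the record does not fit.
    static boolean fitsInFixedBuffer(int capacity, byte[] payload) {
        ByteBuffer buf = ByteBuffer.allocate(capacity);
        try {
            buf.putInt(payload.length);
            buf.put(payload);
            return true;
        } catch (BufferOverflowException e) {
            return false;
        }
    }

    // Growable path: the backing array expands as data is written,
    // so the initial capacity is only a hint.
    static int writtenSizeGrowable(int initialCapacity, byte[] payload) {
        ByteArrayOutputStream storage = new ByteArrayOutputStream(initialCapacity);
        try {
            writeRecord(new DataOutputStream(storage), payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return storage.size();
    }

    public static void main(String[] args) {
        byte[] big = new byte[64];
        System.out.println(fitsInFixedBuffer(16, big));   // false
        System.out.println(writtenSizeGrowable(16, big)); // 68
    }
}
```

Writing against DataOutput rather than ByteBuffer keeps the serializer independent of how the bytes are stored, so the same code works whether the destination is fixed or growable.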



--
This message was sent by Atlassian Jira
(v8.20.1#820001)