Posted to github@arrow.apache.org by "slobodan-ilic (via GitHub)" <gi...@apache.org> on 2023/09/12 06:38:50 UTC

[GitHub] [arrow] slobodan-ilic commented on a diff in pull request #37656: [Python][Docs] Add examples for MapArray.from_arrays

slobodan-ilic commented on code in PR #37656:
URL: https://github.com/apache/arrow/pull/37656#discussion_r1322505446


##########
python/pyarrow/array.pxi:
##########
@@ -2363,6 +2363,78 @@ cdef class MapArray(ListArray):
         Returns
         -------
         map_array : MapArray
+
+        Examples
+        --------
+        First, let's understand the structure of our dataset when viewed in a rectangular data model. 
+        The total of 5 respondents answered the question "How much did you like the movie x?".
+        The value -1 in the integer array means that the value is missing. The boolean array
+        represents the null bitmask corresponding to the missing values in the integer array.
+
+        >>> movies_rectangular = np.ma.masked_array([
+        >>>     [10, -1, -1],
+        >>>     [8, 4, 5],
+        >>>     [-1, 10, 3],
+        >>>     [-1, -1, -1],
+        >>>     [-1, -1, -1]
+        >>> ],
+        >>> [
+        >>>     [False, True, True],
+        >>>     [False, False, False],
+        >>>     [True, False, False],
+        >>>     [True, True, True],
+        >>>     [True, True, True],
+        >>> ])
+
+        To represent the same data with the MapArray and from_arrays, the data is
+        formed like this:
+
+        >>> offsets = [
+        >>>     0, #  -- row 1 start
+        >>>     1, #  -- row 2 start
+        >>>     4, #  -- row 3 start
+        >>>     6, #  -- row 4 start
+        >>>     6, #  -- row 5 start
+        >>>     6, #  -- row 5 end
+        >>> ]
+        >>> movies = [
+        >>>     "Dark Knight", #  ---------------------------------- row 1
+        >>>     "Dark Knight", "Meet the Parents", "Superman", #  -- row 2
+        >>>     "Meet the Parents", "Superman", #  ----------------- row 3
+        >>> ]
+        >>> likings = [
+        >>>     10, #  -------- row 1
+        >>>     8, 4, 5, #  --- row 2
+        >>>     10, 3 #  ------ row 3
+        >>> ]
+        >>> pa.MapArray.from_arrays(offsets, movies, likings).to_pandas()
+        0                                  [(Dark Knight, 10)]
+        1    [(Dark Knight, 8), (Meet the Parents, 4), (Sup...
+        2              [(Meet the Parents, 10), (Superman, 5)]

Review Comment:
   True. Fixed now. Thx.
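
   For reference, the offsets in the quoted example can be checked by hand. Row i of the map covers the flat key and item entries from offsets[i] up to, but not including, offsets[i + 1]. Below is a minimal pure-Python sketch of that slicing rule. It does not call pyarrow, and `rows_from_offsets` is a hypothetical helper name used only for illustration. It is not part of the pyarrow API.

```python
def rows_from_offsets(offsets, keys, items):
    """Group (key, item) pairs into rows using consecutive offsets.

    Hypothetical helper for illustration only; it mimics how a map array
    interprets its offsets, but it is not a pyarrow function.
    """
    if len(keys) != len(items):
        raise ValueError("keys and items must have the same length")
    pairs = list(zip(keys, items))
    # Row i spans pairs[offsets[i]:offsets[i + 1]], so n + 1 offsets give n rows.
    return [pairs[start:end] for start, end in zip(offsets, offsets[1:])]


# Same inputs as in the docstring example quoted above.
offsets = [0, 1, 4, 6, 6, 6]
movies = [
    "Dark Knight",
    "Dark Knight", "Meet the Parents", "Superman",
    "Meet the Parents", "Superman",
]
likings = [10, 8, 4, 5, 10, 3]

rows = rows_from_offsets(offsets, movies, likings)
for index, row in enumerate(rows):
    print(index, row)
```

   With these inputs the third row comes out as (Meet the Parents, 10), (Superman, 3), so the quoted output line showing (Superman, 5) does not match the `likings` values. Rows 4 and 5 are empty because their start and end offsets are both 6.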


