Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2021/11/05 17:52:18 UTC

[GitHub] [beam] TheNeuralBit commented on a change in pull request #15833: [WIP] BEAM-12561 method truncate on series and dataframe

TheNeuralBit commented on a change in pull request #15833:
URL: https://github.com/apache/beam/pull/15833#discussion_r743869147



##########
File path: sdks/python/apache_beam/dataframe/frames.py
##########
@@ -922,6 +922,19 @@ def mask(self, cond, **kwargs):
     """mask is not parallelizable when ``errors="ignore"`` is specified."""
     return self.where(~cond, **kwargs)
 
+  @frame_base.with_docs_from(pd.DataFrame)
+  @frame_base.args_to_kwargs(pd.DataFrame)
+  @frame_base.populate_defaults(pd.DataFrame)
+  def truncate(self, before, after, axis):
+    return frame_base.DeferredFrame.wrap(
+        expressions.ComputedExpression(
+            'truncate',
+            lambda df: df.truncate(before=before, after=after, axis=axis),

Review comment:
       Hey @AlikRodriguez, sorry I missed your question!
   
   > I'm struggling with an error with df['A'].truncate(before=2, after=4) on pandas_doctest_test, and the error message is ValueError: truncate requires a sorted index, How can I fix this?
   
   The issue here seems to be that each individual partition of the dataframe (i.e. `df` in this method) is not necessarily sorted by index. I'd suggest having this method call `sort_index` before it calls `truncate`, e.g. `df.sort_index().truncate(...)`. We should only do that when axis is 'index', though. In the `axis='columns'` case the user controls the order of the columns, so raising that error would be appropriate.
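
A rough sketch of that suggestion in plain pandas, outside of Beam. The helper name `truncate_partition` and the sample data are hypothetical and only illustrate the idea; the real change would go inside the lambda passed to `ComputedExpression`:

```python
import pandas as pd


def truncate_partition(df, before=None, after=None, axis=None):
    """Hypothetical helper: truncate one partition of a dataframe.

    Sorts by index first when truncating along the row index, since a
    partition is not guaranteed to arrive sorted.
    """
    # In pandas, axis=None, 0 and 'index' all refer to the row index here.
    if axis in (None, 0, 'index'):
        df = df.sort_index()
    # For axis='columns' the column order is left alone, so pandas still
    # raises ValueError if the columns are not sorted.
    return df.truncate(before=before, after=after, axis=axis)


# A partition whose index is not sorted (made-up data).
unsorted = pd.DataFrame({'A': list('abcde')}, index=[5, 1, 4, 2, 3])
result = truncate_partition(unsorted, before=2, after=4)
print(result)
```

Calling `unsorted.truncate(before=2, after=4)` directly raises the `ValueError` quoted above, while the helper returns the rows with index 2 through 4.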



