Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/07/14 08:08:07 UTC

[GitHub] [beam] PhilippeMoussalli opened a new issue, #22267: [Feature Request]: Implement 'DataFrame.loc.setitem' for deferred DataFrame

PhilippeMoussalli opened a new issue, #22267:
URL: https://github.com/apache/beam/issues/22267

   ### What would you like to happen?
   
   We would like to use the Beam DataFrame API to perform an operation on selected columns and assign the result back to those columns with `.loc`.
   
   As an example: 
   
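   ```
   import apache_beam as beam
   import apache_beam.dataframe.io  # ensures beam.dataframe.io is loaded
   from apache_beam.runners.interactive.interactive_runner import InteractiveRunner
   
   # Initialize pipeline
   p = beam.Pipeline(InteractiveRunner())
   
   # Create a deferred Beam DataFrame
   beam_df = p | beam.dataframe.io.read_csv('test.csv', splittable=True)
   
   # Perform operation on specific columns: center each selected column on its mean.
   # The assignment through .loc is the part that is not supported.
   columns = ['col_1', 'col_2']
   beam_df.loc[:, columns] = beam_df.loc[:, columns] - beam_df.loc[:, columns].mean()
   ```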
   
   This is currently not supported because the `loc.setitem()` method (that is, assignment of the form `df.loc[...] = value`) is not yet implemented for deferred DataFrames.
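
For reference, the eager pandas behavior that the request asks the deferred DataFrame to mirror can be sketched as follows. This is only an illustration under stated assumptions: it uses plain pandas rather than Beam, and the in-memory frame, its values, and the extra `col_3` column are hypothetical stand-ins for the contents of `test.csv`.

```python
import pandas as pd

# Hypothetical stand-in for test.csv (values are made up for illustration).
df = pd.DataFrame({
    "col_1": [1.0, 2.0, 3.0],
    "col_2": [10.0, 20.0, 30.0],
    "col_3": ["a", "b", "c"],
})

columns = ["col_1", "col_2"]

# Work on a copy so the original frame stays available for comparison.
centered = df.copy()

# Same statement as in the issue: subtract each selected column's mean
# and write the result back through .loc. Other columns are left as-is.
centered.loc[:, columns] = centered.loc[:, columns] - centered.loc[:, columns].mean()

print(centered)
```

In pandas, the right-hand side is aligned to the selected columns by label, so only `col_1` and `col_2` change.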
   
   
   
   ### Issue Priority
   
   Priority: 2
   
   ### Issue Component
   
   Component: dsl-dataframe


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] tvalentyn commented on issue #22267: [Feature Request]: Implement 'DataFrame.loc.setitem' for deferred DataFrame

Posted by GitBox <gi...@apache.org>.
tvalentyn commented on issue #22267:
URL: https://github.com/apache/beam/issues/22267#issuecomment-1271867999

   Actually, I found some notes where it was sized as small.




[GitHub] [beam] tvalentyn commented on issue #22267: [Feature Request]: Implement 'DataFrame.loc.setitem' for deferred DataFrame

Posted by GitBox <gi...@apache.org>.
tvalentyn commented on issue #22267:
URL: https://github.com/apache/beam/issues/22267#issuecomment-1271865087

   @TheNeuralBit how would you size this request (S, M, L), and would it make a good starter issue?

