Posted to dev@systemds.apache.org by GitBox <gi...@apache.org> on 2022/11/07 23:13:51 UTC

[GitHub] [systemds] BACtaki commented on pull request #1714: [SYSTEMDS-3254] Add new aliases for countDistinct built-in function

BACtaki commented on PR #1714:
URL: https://github.com/apache/systemds/pull/1714#issuecomment-1306350347

   > Hi @BACtaki !
   > 
   > Thanks for the addition,
   > When I made the task, I was trying to make the methods the same as in R, since much of the syntax is similar.
   > 
   > https://www.geeksforgeeks.org/unique-function-in-r/
   > 
   > Unfortunately, it seems I was a bit quick in my decision, since unique != count distinct.
   > What really is missing is the unique() function to return all unique elements in a matrix.
   > 
   > Sorry for the inconvenience.
   > 
   > Best regards
   > Sebastian.
   
   Thanks for clearing that up, @Baunsgaard!
   
   > [..] what really is missing is the unique() function to return all unique elements in a matrix.
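   
   To make the distinction in the quote concrete, here is a minimal sketch in base R. The matrix values are made up for illustration and are not taken from the SystemDS tests, and the sketch says nothing about how SystemDS itself handles multi-column input:
   
   ```r
   # Hypothetical 3x2 input, filled column by column: rows are (1,3), (2,3), (2,3).
   X <- matrix(c(1, 2, 2, 3, 3, 3), nrow = 3, ncol = 2)
   
   # unique() on the flattened values returns the distinct values themselves.
   distinct_values <- unique(as.vector(X))
   
   # A count-distinct aggregate only reports how many distinct values there are.
   distinct_count <- length(distinct_values)
   
   # Applied directly to a matrix, base R's unique() drops duplicate rows instead.
   distinct_rows <- unique(X)
   ```
   
   So `unique` yields the elements (or rows), while a count-distinct style function only yields their number.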
   
   I see a `unique` function in `Builtins.java`:
   ```
   	UNIQUE("unique", true),
   ```
   
   In fact, there is an existing test for this function, which checks that the following two scripts produce equivalent results:
   
   - `unique.dml`
   ```
   X = read($1);
   R = unique(X = X);
   write(R, $2);
   ```
   
   - `unique.R`
   ```
   args<-commandArgs(TRUE)
   options(digits=22)
   library("Matrix")
   
   X = as.matrix(readMM(paste(args[1], "X.mtx", sep="")));
   R = unique(X[order(X[,1]),]);
   writeMM(as(R, "CsparseMatrix"), paste(args[2], "R", sep=""));
   ```
   
   Is this the function you had in mind for the [JIRA](https://issues.apache.org/jira/browse/SYSTEMDS-3254) or was it something else? 

