Posted to issues@commons.apache.org by "Thomas Neidhart (JIRA)" <ji...@apache.org> on 2012/11/14 19:48:12 UTC

[jira] [Created] (MATH-897) Add DBScan clustering algorithm

Thomas Neidhart created MATH-897:
------------------------------------

             Summary: Add DBScan clustering algorithm
                 Key: MATH-897
                 URL: https://issues.apache.org/jira/browse/MATH-897
             Project: Commons Math
          Issue Type: Sub-task
            Reporter: Thomas Neidhart
            Priority: Minor




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (MATH-897) Add DBScan clustering algorithm

Posted by "Thomas Neidhart (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MATH-897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Neidhart updated MATH-897:
---------------------------------

    Fix Version/s:     (was: 3.2)
                   3.1
    
> Add DBScan clustering algorithm
> -------------------------------
>
>                 Key: MATH-897
>                 URL: https://issues.apache.org/jira/browse/MATH-897
>             Project: Commons Math
>          Issue Type: Sub-task
>            Reporter: Thomas Neidhart
>            Priority: Minor
>             Fix For: 3.1
>
>         Attachments: MATH-748.txt, MATH-897-review.patch, MATH-897-test.patch
>
>



[jira] [Commented] (MATH-897) Add DBScan clustering algorithm

Posted by "Thomas Neidhart (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13499790#comment-13499790 ] 

Thomas Neidhart commented on MATH-897:
--------------------------------------

Committed in r1410882 with minor modifications and additional tests.

User guide still needs to be updated.
                
> Add DBScan clustering algorithm
> -------------------------------
>
>                 Key: MATH-897
>                 URL: https://issues.apache.org/jira/browse/MATH-897
>             Project: Commons Math
>          Issue Type: Sub-task
>            Reporter: Thomas Neidhart
>            Priority: Minor
>             Fix For: 3.1
>
>         Attachments: MATH-748.txt, MATH-897-review.patch, MATH-897-test.patch
>
>



[jira] [Commented] (MATH-897) Add DBScan clustering algorithm

Posted by "Thomas Neidhart (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505279#comment-13505279 ] 

Thomas Neidhart commented on MATH-897:
--------------------------------------

Why did you change the fix version to 3.2? The code is already committed; only the user guide is missing, but there is none yet for the whole clustering package.
                
> Add DBScan clustering algorithm
> -------------------------------
>
>                 Key: MATH-897
>                 URL: https://issues.apache.org/jira/browse/MATH-897
>             Project: Commons Math
>          Issue Type: Sub-task
>            Reporter: Thomas Neidhart
>            Priority: Minor
>             Fix For: 3.2
>
>         Attachments: MATH-748.txt, MATH-897-review.patch, MATH-897-test.patch
>
>



[jira] [Commented] (MATH-897) Add DBScan clustering algorithm

Posted by "Thomas Neidhart (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13498904#comment-13498904 ] 

Thomas Neidhart commented on MATH-897:
--------------------------------------

OK, thanks for the test and the link to the original paper. I will update the patch.
                
> Add DBScan clustering algorithm
> -------------------------------
>
>                 Key: MATH-897
>                 URL: https://issues.apache.org/jira/browse/MATH-897
>             Project: Commons Math
>          Issue Type: Sub-task
>            Reporter: Thomas Neidhart
>            Priority: Minor
>             Fix For: 3.2
>
>         Attachments: MATH-748.txt, MATH-897-review.patch, MATH-897-test.patch
>
>



[jira] [Commented] (MATH-897) Add DBScan clustering algorithm

Posted by "Reid Hochstedler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13498149#comment-13498149 ] 

Reid Hochstedler commented on MATH-897:
---------------------------------------

You may want to add a JavaDoc comment referencing the paper located at http://www.dbs.ifi.lmu.de/Publikationen/Papers/KDD-96.final.frame.pdf
                
> Add DBScan clustering algorithm
> -------------------------------
>
>                 Key: MATH-897
>                 URL: https://issues.apache.org/jira/browse/MATH-897
>             Project: Commons Math
>          Issue Type: Sub-task
>            Reporter: Thomas Neidhart
>            Priority: Minor
>             Fix For: 3.2
>
>         Attachments: MATH-748.txt, MATH-897-review.patch, MATH-897-test.patch
>
>



[jira] [Updated] (MATH-897) Add DBScan clustering algorithm

Posted by "Thomas Neidhart (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MATH-897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Neidhart updated MATH-897:
---------------------------------

    Attachment: MATH-897-review.patch

Hi Reid,

please find attached a review of your patch with the following changes:

 * minor javadoc updates
 * use more specific exceptions
 * the data points are now an input to the cluster method instead of the ctor (similar to the kmeans++ clusterer)
 * fix expandCluster to match the algorithm on Wikipedia (the last if, plus the way to determine whether a point is already part of a cluster)
 * change the visited set to a map so that it also marks whether a point is part of a cluster (see above)
 * improve the merge method
 * make the call to cluster thread-safe, similar to the kmeans++ clusterer. This may not be necessary, but I usually prefer it this way.

What do you think about the changes?

Btw, for the future: if there are multiple clustering algorithms, we should think about a unifying interface.
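
For readers without the patch at hand, the review points above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the committed Commons Math code: the class name DbscanSketch, the Status enum, the use of double[] points and the index-keyed map are all hypothetical. It shows the data points passed to the cluster method rather than the constructor, a visited map that also records whether a point is part of a cluster, and the final "if" of expandCluster as in the Wikipedia pseudocode. Here the eps-neighborhood includes the point itself, which is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative DBSCAN sketch; all names are hypothetical, not the Commons Math API. */
class DbscanSketch {

    /** Status recorded for a visited point (an absent key means "not visited yet"). */
    enum Status { NOISE, PART_OF_CLUSTER }

    /** Points are an argument, so no state is kept between calls. */
    static List<List<double[]>> cluster(List<double[]> points, double eps, int minPts) {
        List<List<double[]>> clusters = new ArrayList<>();
        // Keyed by index so that equal coordinates at different positions stay distinct.
        Map<Integer, Status> visited = new HashMap<>();
        for (int i = 0; i < points.size(); i++) {
            if (visited.containsKey(i)) {
                continue;
            }
            List<Integer> neighbors = regionQuery(points, i, eps);
            if (neighbors.size() >= minPts) {
                clusters.add(expandCluster(points, i, neighbors, visited, eps, minPts));
            } else {
                // May later be re-labelled as a border point of some cluster.
                visited.put(i, Status.NOISE);
            }
        }
        return clusters;
    }

    private static List<double[]> expandCluster(List<double[]> points, int start,
                                                List<Integer> neighbors,
                                                Map<Integer, Status> visited,
                                                double eps, int minPts) {
        List<double[]> cluster = new ArrayList<>();
        cluster.add(points.get(start));
        visited.put(start, Status.PART_OF_CLUSTER);

        // The seed list grows while we iterate over it by index.
        List<Integer> seeds = new ArrayList<>(neighbors);
        for (int k = 0; k < seeds.size(); k++) {
            int q = seeds.get(k);
            Status status = visited.get(q);
            if (status == null) {
                // Unvisited point: if it is a core point, merge its neighborhood.
                List<Integer> qNeighbors = regionQuery(points, q, eps);
                if (qNeighbors.size() >= minPts) {
                    seeds.addAll(qNeighbors); // duplicates are skipped by the check below
                }
            }
            // The "last if": add the point unless it already belongs to a cluster.
            if (status != Status.PART_OF_CLUSTER) {
                visited.put(q, Status.PART_OF_CLUSTER);
                cluster.add(points.get(q));
            }
        }
        return cluster;
    }

    /** Indices of all points within eps of the given point (the point itself included). */
    private static List<Integer> regionQuery(List<double[]> points, int center, double eps) {
        List<Integer> result = new ArrayList<>();
        for (int j = 0; j < points.size(); j++) {
            if (distance(points.get(center), points.get(j)) <= eps) {
                result.add(j);
            }
        }
        return result;
    }

    /** Euclidean distance between two points of equal dimension. */
    private static double distance(double[] a, double[] b) {
        double sum = 0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }
}
```

Because the sketch keeps all working state in local variables, concurrent calls to cluster do not interfere with each other, which is one simple way to get the thread-safety mentioned in the last bullet. Points marked NOISE never appear in the returned clusters unless they are later reached as border points.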
                
> Add DBScan clustering algorithm
> -------------------------------
>
>                 Key: MATH-897
>                 URL: https://issues.apache.org/jira/browse/MATH-897
>             Project: Commons Math
>          Issue Type: Sub-task
>            Reporter: Thomas Neidhart
>            Priority: Minor
>             Fix For: 3.2
>
>         Attachments: MATH-748.txt, MATH-897-review.patch
>
>



[jira] [Updated] (MATH-897) Add DBScan clustering algorithm

Posted by "Reid Hochstedler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MATH-897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Reid Hochstedler updated MATH-897:
----------------------------------

    Attachment: MATH-897-test.patch

Those changes look good to me. I've added a patch that includes tests for EuclideanDoublePoint, to ensure that it stays consistent with EuclideanIntegerPoint.
                
> Add DBScan clustering algorithm
> -------------------------------
>
>                 Key: MATH-897
>                 URL: https://issues.apache.org/jira/browse/MATH-897
>             Project: Commons Math
>          Issue Type: Sub-task
>            Reporter: Thomas Neidhart
>            Priority: Minor
>             Fix For: 3.2
>
>         Attachments: MATH-748.txt, MATH-897-review.patch, MATH-897-test.patch
>
>



[jira] [Commented] (MATH-897) Add DBScan clustering algorithm

Posted by "Thomas Neidhart (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13497487#comment-13497487 ] 

Thomas Neidhart commented on MATH-897:
--------------------------------------

Ah yes, I also extracted EuclideanDoublePoint from the test and included it in the source, as it is quite reasonable imho.
                
> Add DBScan clustering algorithm
> -------------------------------
>
>                 Key: MATH-897
>                 URL: https://issues.apache.org/jira/browse/MATH-897
>             Project: Commons Math
>          Issue Type: Sub-task
>            Reporter: Thomas Neidhart
>            Priority: Minor
>             Fix For: 3.2
>
>         Attachments: MATH-748.txt, MATH-897-review.patch
>
>



[jira] [Commented] (MATH-897) Add DBScan clustering algorithm

Posted by "Gilles (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505346#comment-13505346 ] 

Gilles commented on MATH-897:
-----------------------------

bq. Why did you change the fix version to 3.2? The code is already committed, just the user guide is missing [...]

Since you did not resolve it, I thought that you intentionally wanted it to be kept open until the user guide is updated. I think that it's fine to resolve it now (and add an entry in "changes.xml", also in the "description").

                
> Add DBScan clustering algorithm
> -------------------------------
>
>                 Key: MATH-897
>                 URL: https://issues.apache.org/jira/browse/MATH-897
>             Project: Commons Math
>          Issue Type: Sub-task
>            Reporter: Thomas Neidhart
>            Priority: Minor
>             Fix For: 3.2
>
>         Attachments: MATH-748.txt, MATH-897-review.patch, MATH-897-test.patch
>
>



[jira] [Updated] (MATH-897) Add DBScan clustering algorithm

Posted by "Gilles (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MATH-897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gilles updated MATH-897:
------------------------

    Fix Version/s:     (was: 3.1)
                   3.2
    
> Add DBScan clustering algorithm
> -------------------------------
>
>                 Key: MATH-897
>                 URL: https://issues.apache.org/jira/browse/MATH-897
>             Project: Commons Math
>          Issue Type: Sub-task
>            Reporter: Thomas Neidhart
>            Priority: Minor
>             Fix For: 3.2
>
>         Attachments: MATH-748.txt, MATH-897-review.patch, MATH-897-test.patch
>
>



[jira] [Commented] (MATH-897) Add DBScan clustering algorithm

Posted by "Thomas Neidhart (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505744#comment-13505744 ] 

Thomas Neidhart commented on MATH-897:
--------------------------------------

Thanks, just wanted to do it myself now ;-)
                
> Add DBScan clustering algorithm
> -------------------------------
>
>                 Key: MATH-897
>                 URL: https://issues.apache.org/jira/browse/MATH-897
>             Project: Commons Math
>          Issue Type: Sub-task
>            Reporter: Thomas Neidhart
>            Priority: Minor
>             Fix For: 3.1
>
>         Attachments: MATH-748.txt, MATH-897-review.patch, MATH-897-test.patch
>
>



[jira] [Updated] (MATH-897) Add DBScan clustering algorithm

Posted by "Reid Hochstedler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MATH-897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Reid Hochstedler updated MATH-897:
----------------------------------

    Attachment: MATH-748.txt

Implementation of DBSCAN clustering.
                
> Add DBScan clustering algorithm
> -------------------------------
>
>                 Key: MATH-897
>                 URL: https://issues.apache.org/jira/browse/MATH-897
>             Project: Commons Math
>          Issue Type: Sub-task
>            Reporter: Thomas Neidhart
>            Priority: Minor
>             Fix For: 3.2
>
>         Attachments: MATH-748.txt
>
>



[jira] [Resolved] (MATH-897) Add DBScan clustering algorithm

Posted by "Gilles (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MATH-897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gilles resolved MATH-897.
-------------------------

       Resolution: Fixed
    Fix Version/s:     (was: 3.2)
                   3.1
    
> Add DBScan clustering algorithm
> -------------------------------
>
>                 Key: MATH-897
>                 URL: https://issues.apache.org/jira/browse/MATH-897
>             Project: Commons Math
>          Issue Type: Sub-task
>            Reporter: Thomas Neidhart
>            Priority: Minor
>             Fix For: 3.1
>
>         Attachments: MATH-748.txt, MATH-897-review.patch, MATH-897-test.patch
>
>



[jira] [Commented] (MATH-897) Add DBScan clustering algorithm

Posted by "Gilles (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MATH-897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505737#comment-13505737 ] 

Gilles commented on MATH-897:
-----------------------------

bq. add an entry in "changes.xml"

Done in revision 1414848.
                
> Add DBScan clustering algorithm
> -------------------------------
>
>                 Key: MATH-897
>                 URL: https://issues.apache.org/jira/browse/MATH-897
>             Project: Commons Math
>          Issue Type: Sub-task
>            Reporter: Thomas Neidhart
>            Priority: Minor
>             Fix For: 3.1
>
>         Attachments: MATH-748.txt, MATH-897-review.patch, MATH-897-test.patch
>
>

