Posted to dev@couchdb.apache.org by Randall Leeds <ra...@gmail.com> on 2012/06/04 06:19:50 UTC

Re: Hierarchical comments Hacker News style

On Wed, May 16, 2012 at 2:40 AM, Luca Matteis <lm...@gmail.com> wrote:
> I'm trying to implement a basic way of displaying comments in the way
> that Hacker News provides, using CouchDB. Not only ordered
> hierarchically, but also, each level of the tree should be ordered by
> a "points" variable. If you're familiar with sites such as Hacker News
> or Reddit you know that each level of the comment tree is ordered by
> the number of points and also the date they were posted - however, for
> the sake of simplicity I'll just talk about a "points" variable.
>
> The idea is that I want a view to return it in the order I expect, and
> not make many Ajax calls, for example, to retrieve them and make them
> look like they're ordered correctly.
>
> This is what I got so far: Each document is a "comment". Each comment
> has a property `path` which is an ordered list containing all its
> parents. So for example, imagine I have 4 comments (with _id `1`, `2`,
> `3` and `4`). Comment `2` is a child of `1`, comment `3` is a child
> of `2`, and comment `4` is also a child of `1`. This is what the data
> would look like:
>
>    { _id: 1, path: ["1"] },
>    { _id: 2, path: ["1", "2"] },
>    { _id: 3, path: ["1", "2", "3"] },
>    { _id: 4, path: ["1", "4"] }
>
> This works quite well for the hierarchy. A simple `view` will already
> return things ordered with each comment underneath the correct one.
>
> The issue comes when I want to order each "level" of the tree
> independently. So for example documents `2` and `4` belong to the same
> branch, but are ordered, on that level, by their ID. Instead I want
> them ordered based on a "points" variable that I want to add to the
> path - but can't seem to understand where I could be adding this
> variable for it to work the way I want it.
>
> Is there a way to do this? Consider that the "points" variable will
> change in time.

The standard threading algorithm is jwz's:
http://www.jwz.org/doc/threading.html

Max Ogden ported a ruby implementation to JS:
https://github.com/maxogden/conversationThreading-js/

I've just fixed a couple bugs in there this week as I needed something similar.

It makes sense to me to use the format you have so that a range query
can get a full thread, but then do the sorting in a list function or
client side using the jwz algorithm. It's simple to write a recursive
traversal through the children so you can sort the children in each
iteration of the recursion by whatever property you like.
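
A minimal sketch of that recursive traversal, assuming each comment
carries the `path` array from the original message plus a numeric
`points` field. The function names (buildTree, sortTree, flatten) are
made up for illustration and are not part of conversationThreading-js
or CouchDB; the same code could run client side or, with small
changes, inside a list function.

```javascript
// Illustrative sketch: `points` and all helper names are assumptions.

// Build a tree from comments whose `path` lists ancestors, ending in the
// comment's own id. The parent is the second-to-last path entry.
function buildTree(comments) {
  const nodes = new Map();
  const root = { id: null, children: [] };
  for (const c of comments) {
    nodes.set(String(c._id), { id: String(c._id), points: c.points, children: [] });
  }
  for (const c of comments) {
    const parentId = c.path.length > 1 ? c.path[c.path.length - 2] : null;
    // Fall back to the root if the parent is top-level or not in the result set.
    const parent = parentId !== null && nodes.has(parentId) ? nodes.get(parentId) : root;
    parent.children.push(nodes.get(String(c._id)));
  }
  return root;
}

// Sort the children of every node, one level at a time, with any comparator.
function sortTree(node, compare) {
  node.children.sort(compare);
  for (const child of node.children) sortTree(child, compare);
  return node;
}

// Depth-first walk producing the display order with an indent depth.
function flatten(node, depth = 0, out = []) {
  for (const child of node.children) {
    out.push({ id: child.id, depth });
    flatten(child, depth + 1, out);
  }
  return out;
}

const comments = [
  { _id: 1, path: ["1"], points: 10 },
  { _id: 2, path: ["1", "2"], points: 3 },
  { _id: 3, path: ["1", "2", "3"], points: 7 },
  { _id: 4, path: ["1", "4"], points: 8 },
];
const byPointsDesc = (a, b) => b.points - a.points;
const ordered = flatten(sortTree(buildTree(comments), byPointsDesc));
```

With the sample data, comment 4 (8 points) sorts ahead of comment 2
(3 points) under comment 1, and comment 3 stays beneath comment 2.
Since the sort happens after the view query, changing `points` never
requires rewriting a key; swapping the comparator is enough to sort by
date or a combined score instead.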

It's ugly, poorly commented (read: not at all), written in
CoffeeScript, and contains a bunch of d3.js stuff, but you can see a
similar loop I did here:
https://github.com/hypothesis/hypothes.is/blob/master/hypothesis/js/src/hypothesis.coffee#L201

... with the actual use of the message threading library here (which
uses undefined for the subject, as I'm not grouping by "subject" as in
email):
https://github.com/hypothesis/hypothes.is/blob/master/hypothesis/js/src/hypothesis.coffee#L155

In my case the "thread" is a '/'-delimited string rather than an
array, so you'll see some .split() going on.

Another advantage of this algorithm is that it properly handles
threading even when some nodes are not available (deleted, not
replicated in order, etc.), inserting blank containers with no
`message` property of their own.
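
To illustrate the blank-container idea, here is a simplified sketch
that works from the `path` arrays in the original message. It is not
the jwz algorithm itself, nor the API of conversationThreading-js; the
function name threadWithPlaceholders and the container shape are
assumptions made for this example.

```javascript
// Illustrative sketch: names and container shape are assumptions.

// Walk every comment's path and create a container for each id seen.
// Only ids that have an actual document get a `message` property, so
// ancestors missing from the result set remain as blank containers.
function threadWithPlaceholders(comments) {
  const containers = new Map();
  const root = { id: null, children: [] };

  const getContainer = (id, parent) => {
    if (!containers.has(id)) {
      const container = { id, children: [] };
      containers.set(id, container);
      parent.children.push(container);
    }
    return containers.get(id);
  };

  for (const comment of comments) {
    let current = root;
    for (const id of comment.path) {
      current = getContainer(id, current);
    }
    // The last path entry is the comment itself.
    current.message = comment;
  }
  return root;
}

// Comments 1 and 2 are missing (deleted or not yet replicated).
const partial = [
  { _id: 3, path: ["1", "2", "3"] },
  { _id: 4, path: ["1", "4"] },
];
const thread = threadWithPlaceholders(partial);
```

Here the containers for `1` and `2` exist only to hold the hierarchy
together, so comments 3 and 4 still land at the right depth and a
renderer can show a "deleted" stub wherever `message` is absent.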

I didn't read the whole thread, but just scanned to see if anyone
mentioned this. Hope it helps!

-Randall