Posted to dev@cassandra.apache.org by Lior Menashe <li...@gmail.com> on 2015/12/31 14:36:12 UTC

Apache Cassandra - Question about data model

Hi,

I got your address from the #cassandra channel on the web chat, because I
couldn't get an answer there.

I have a question, and I'd be glad if you could help me or point me in the
right direction.

I have an activity feed like the one on Instagram. When a user (let's say
UserA) opens his page, he sees all the activities related to him, for
example: user B liked your post, user C commented on your post, etc.

The Cassandra data model that I thought about is:

userID UUID (partition key)
datetimeadded timestamp (clustering column DESC)
userID_Name text
userID_Picture_URL text
userID_From UUID (this is userB from the example)
userID_From_Name text
userID_From_Picture_URL text

With this structure I can get the different activities for a user, and it
works just fine. My problem is that userID_From can change his name and his
picture, and I need this data to be updated across all the different tables
because I want to show the current values.

The problem is that the update requires a table scan, which is not
efficient. Should I store only the ID and, every time I select a slice of
the data and get several IDs, run another query for the users' names and
picture paths? Should I do something else?
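
To make the cost of that update concrete, here is a minimal Python sketch. It
is not Cassandra code: the dictionary `feed` is a hypothetical in-memory
stand-in for the denormalized table, the helper names `add_activity` and
`rename_user` are invented for illustration, and short strings stand in for
the ID values. It shows that because userID_From is not part of the key, a
rename has to examine every row in every partition.

```python
from collections import defaultdict

# Hypothetical stand-in for the denormalized feed table:
# partition key (userID) -> list of activity rows.
feed = defaultdict(list)

def add_activity(user_id, ts, from_id, from_name, from_picture_url):
    """Append one activity row, copying the acting user's name and picture."""
    feed[user_id].append({
        "datetimeadded": ts,
        "userID_From": from_id,
        "userID_From_Name": from_name,
        "userID_From_Picture_URL": from_picture_url,
    })

def rename_user(from_id, new_name):
    """Rewrite every copied name; returns (rows_examined, rows_updated)."""
    examined = updated = 0
    for rows in feed.values():          # every partition has to be visited
        for row in rows:
            examined += 1
            if row["userID_From"] == from_id:
                row["userID_From_Name"] = new_name
                updated += 1
    return examined, updated

add_activity("userA", 1, "userB", "Bob", "b.png")
add_activity("userA", 2, "userC", "Carol", "c.png")
add_activity("userD", 3, "userB", "Bob", "b.png")

examined, updated = rename_user("userB", "Robert")
print(examined, updated)
```

In this toy data set the rename examines all three rows to change two of them;
the work grows with the size of the whole table, not with the number of rows
that actually mention the renamed user.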

Best regards,
Lior

Re: Apache Cassandra - Question about data model

Posted by Jack Krupansky <ja...@gmail.com>.
It's best to ask usage and data modeling questions on the user email list -
this list is the dev list, for development of Cassandra itself, not for
development of applications.

See:
http://cassandra.apache.org/


-- Jack Krupansky


Re: Apache Cassandra - Question about data model

Posted by Lior Menashe <li...@gmail.com>.
Hi Matthias,

Thanks for your answer.
According to what you wrote, if I select the first 30 rows from the feed
table for a user, I'll need to perform up to 30 more queries against the
user table to get the users' data.

Isn't it better to use Cassandra for the feed and some SQL server to get
the users' data in one query?
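
As a rough illustration of the "up to 30 more queries" figure, the Python
sketch below counts the lookups needed for one page of the feed. It assumes
(this is not stated in the thread) that several activities on a page come from
the same user, so the IDs can be de-duplicated first. `fetch_user` is a
hypothetical stand-in for a single-row read from the user table by partition
key, and the feed rows are made-up data.

```python
def distinct_actor_ids(feed_rows):
    """Return each userID_From once, in first-seen order."""
    seen = {}
    for row in feed_rows:
        seen.setdefault(row["userID_From"], None)
    return list(seen)

def load_profiles(user_ids, fetch_user):
    """Perform one lookup per distinct ID and return {userID: profile}."""
    return {uid: fetch_user(uid) for uid in user_ids}

# 30 feed rows generated by 4 different users (made-up data).
feed_rows = [
    {"datetimeadded": 30 - i, "userID_From": f"user{i % 4}"}
    for i in range(30)
]

calls = []  # records every lookup so they can be counted

def fetch_user(uid):
    """Hypothetical single-row read from the user table."""
    calls.append(uid)
    return {"userID_Name": uid.upper()}

profiles = load_profiles(distinct_actor_ids(feed_rows), fetch_user)
print(len(feed_rows), "feed rows,", len(calls), "user lookups")
```

With this data, 30 feed rows need only 4 lookups. Thirty lookups is the worst
case, reached only when every row on the page comes from a different user.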

BR,
Lior



2015-12-31 17:58 GMT+02:00 Matthias Eichstaedt <
matthias.eichstaedt@gmail.com>:

> Hi Lior,
> How about something like this, where you move the user fields into a
> separate USER_TABLE:
>
> FEED_TABLE
> userID UUID (partition key)
> datetimeadded timestamp (clustering column DESC)
> userID_From UUID (this is userB from the example)
>
> USER_TABLE
> userID UUID (partition key)
> userID_Name text
> userID_Picture_URL text
>
> You have an extra query, but you can change the name and picture in one
> place.
>
> Matthias
>



-- 
ליאור מנשה

Re: Apache Cassandra - Question about data model

Posted by Matthias Eichstaedt <ma...@gmail.com>.
Hi Lior,
How about something like this, where you move the user fields into a
separate USER_TABLE:

FEED_TABLE
userID UUID (partition key)
datetimeadded timestamp (clustering column DESC)
userID_From UUID (this is userB from the example)

USER_TABLE
userID UUID (partition key)
userID_Name text
userID_Picture_URL text

You have an extra query, but you can change the name and picture in one
place.
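
The trade-off can be sketched in a few lines of Python. This is only an
in-memory illustration, not Cassandra code: the dictionaries `feed_table` and
`user_table` and the function `read_feed` are hypothetical stand-ins, and short
strings are used in place of real ID values.

```python
# Hypothetical stand-ins for the two tables.
feed_table = {  # userID -> activity rows, newest first
    "userA": [
        {"datetimeadded": 3, "userID_From": "userB"},
        {"datetimeadded": 2, "userID_From": "userC"},
        {"datetimeadded": 1, "userID_From": "userB"},
    ],
}
user_table = {  # userID -> profile fields
    "userB": {"userID_Name": "Bob", "userID_Picture_URL": "b.png"},
    "userC": {"userID_Name": "Carol", "userID_Picture_URL": "c.png"},
}

def read_feed(user_id):
    """Read the feed slice, then attach each acting user's current profile."""
    return [
        {**row, **user_table[row["userID_From"]]}
        for row in feed_table[user_id]
    ]

# A rename is a single write to the user table; no feed row is touched.
user_table["userB"]["userID_Name"] = "Robert"

names = [item["userID_Name"] for item in read_feed("userA")]
print(names)
```

Every feed entry from userB shows the new name on the next read, at the price
of the extra lookup per acting user.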

Matthias
