Posted to dev@esme.apache.org by Ethan Jewett <es...@gmail.com> on 2010/01/01 18:41:42 UTC

Search questions

I'm testing out the search functionality a bit and I've got a couple
of questions:

1. Search doesn't appear to be working on
http://esmecloudserverapache.dickhirsch.staxapps.net/  Or rather, it's
working, but it only has messages in the index from around Dec 15th
and 16th. It seems to work fine when I build locally.

2. It looks like search filters messages by people I follow. When I
follow a new user, their messages start showing up in my search. When
I unfollow, those messages no longer show up in my search. Was this by
design? Would people be open to at least allowing an option to search
across all messages I have *access* to (by pool), rather than just
those of people I follow?

3. I'm thinking of using search in a way where I would want to search
*only* based on tags and *not* on the message text itself. Is this
currently possible? If not, can we set it up?

I'll be looking into this further on my own, but any pointers or
answers would be much appreciated.

Ethan

Re: Search questions

Posted by Ethan Jewett <es...@gmail.com>.
Also, one more thing:

4. If I am not in a pool "test_pool", then search doesn't return
messages to that pool. However, if I am added to pool "test_pool",
then search will return messages from *before* I was added. Is that by
design? I think that violates our pool design principles. If it is an
issue, then I'll create a Jira ticket and we'll need to get it fixed,
probably by checking the date/time that a user was added to a pool
against the message posting date/time.
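
A minimal sketch of the join-time check proposed above, not ESME's actual code. The message and membership shapes (`pool`, `posted_at`, a dict mapping pool name to join time) are hypothetical, and treating a message with no pool as public is an assumption.

```python
from datetime import datetime

def visible_in_pool(message_posted_at, member_joined_at):
    """True only if the user is a member and the message was posted
    at or after the moment the user joined the pool."""
    return member_joined_at is not None and message_posted_at >= member_joined_at

def filter_search_results(results, memberships):
    """Drop search hits the user should not see.

    results: list of dicts with 'pool' (name or None) and 'posted_at' (datetime).
    memberships: dict of pool name -> datetime the user was added to that pool.
    Assumption: a message with pool None is public and always visible.
    """
    visible = []
    for msg in results:
        pool = msg["pool"]
        if pool is None:
            visible.append(msg)
        elif visible_in_pool(msg["posted_at"], memberships.get(pool)):
            visible.append(msg)
    return visible
```

With this check, a message posted to "test_pool" before the user was added stays hidden even after the user joins.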

Thanks,
Ethan


Re: Search questions

Posted by Ethan Jewett <es...@gmail.com>.
On Fri, Jan 1, 2010 at 1:54 PM, Richard Hirsch <hi...@gmail.com> wrote:

> This is what I meant by indirectly.

Ok, this is what I've done in the latest commit for the
/api2/pools/POOLID/messages?history=40 calls. I'll switch the other
ones over if this seems to be working well. (I'd like to hear from
anyone who has performance or other design concerns.)

> Exactly. I think that the search should be hidden in the existing API
>
> http://apiwiki.twitter.com/Twitter-Search-API-Method%3A-search

I was initially heavily influenced by the Twitter Search API, but the
tack you are suggesting (and I'm now thinking it is the right way) is
the exact opposite. I'm just pointing this out to everybody. I think
it's neither good nor bad to diverge from Twitter's API design
philosophy.

> The easiest way would be to try it and then see if search still works.

It did! It's in my latest commit.

Ethan

Re: Search questions

Posted by Richard Hirsch <hi...@gmail.com>.
On Fri, Jan 1, 2010 at 8:30 PM, Ethan Jewett <es...@gmail.com> wrote:
> On Fri, Jan 1, 2010 at 12:33 PM, Richard Hirsch <hi...@gmail.com> wrote:
>
>> Are you thinking of adding search to the new api indirectly - which I
>> prefer - or directly with a "search" method?
>
> I'm not sure I understand what you mean by "directly" and "indirectly".
>
> My initial goal was to be able to implement a relatively flexible
> "search" method that would allow for a search on any searchable
> property or combination (and/or) of searchable properties. I think
> that's probably what you mean by "directly".
>
> I also see an opportunity to replace the non-streaming parts of things
> like the api2/user/messages (for example
> api2/user/messages?history=300 << ouch) with a call to the
> Lucene/Compass index. There is still a call to Messages.findMessages,
> but at least we wouldn't be doing search queries on the messages DB. I
> take it that's what you mean by "indirectly"?

This is what I meant by indirectly.

>
> Do you think it would make sense to go with purely incorporating
> search into the existing API methods as filter parameters on the
> message endpoints in api2? For example GET
> /api2/user/messages?history=20&filter_tags=esme,api (returns the last
> 20 messages in the user's timeline tagged with "esme" and "api"). I'm
> not sure what the syntax would be here, but I'll think about it and
> I'll have a look at other search APIs.

Exactly. I think that the search should be hidden in the existing API

http://apiwiki.twitter.com/Twitter-Search-API-Method%3A-search

>
>> Probably a local problem. I haven't tried search on the stax instance
>> for a while. If desired, I could restart the server.
>
> Not a big deal. I just wanted to make sure that there isn't something
> broken about search in general, only with that server.
>
>> The search functionality was conceived before we started pools. Thus,
>> pools weren't included back then. I'd like to think that search in
>> pool would be useful. What about searching across all "my" pools or a
>> certain pool?
>
> I see that from looking at the code now (thanks for pointing me
> there!). I think adding pool as a searchable property should be done,
> and I can definitely do that. What I'm not clear on is if this will
> break the existing search index in some way, so maybe someone who
> knows a lot more about the search than me can guide me here.

The easiest way would be to try it and then see if search still works.

>
>> Look here for the current search implementation:
>> http://svn.apache.org/viewvc/incubator/esme/trunk/server/src/main/scala/org/apache/esme/model/Message.scala?view=markup
>> - starting on line 148
>
> Thanks for the pointer. Looks like tags are already indexed, so we can
> do an arbitrary search query based only on tags. Awesome! I'll try
> this out soon-ish to make sure.
>
> Ethan
>

Re: Search questions

Posted by Ethan Jewett <es...@gmail.com>.
On Fri, Jan 1, 2010 at 12:33 PM, Richard Hirsch <hi...@gmail.com> wrote:

> Are you thinking of adding search to the new api indirectly - which I
> prefer - or directly with a "search" method?

I'm not sure I understand what you mean by "directly" and "indirectly".

My initial goal was to be able to implement a relatively flexible
"search" method that would allow for a search on any searchable
property or combination (and/or) of searchable properties. I think
that's probably what you mean by "directly".

I also see an opportunity to replace the non-streaming parts of things
like the api2/user/messages (for example
api2/user/messages?history=300 << ouch) with a call to the
Lucene/Compass index. There is still a call to Messages.findMessages,
but at least we wouldn't be doing search queries on the messages DB. I
take it that's what you mean by "indirectly"?

Do you think it would make sense to go with purely incorporating
search into the existing API methods as filter parameters on the
message endpoints in api2? For example GET
/api2/user/messages?history=20&filter_tags=esme,api (returns the last
20 messages in the user's timeline tagged with "esme" and "api"). I'm
not sure what the syntax would be here, but I'll think about it and
I'll have a look at other search APIs.
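
A rough sketch of how the example request above could be interpreted, not the api2 implementation. The function names, the in-memory message shape, and the default of 20 for `history` are all assumptions; the syntax of `filter_tags` is still undecided in this thread.

```python
from urllib.parse import urlsplit, parse_qs

def parse_message_query(url):
    """Extract `history` and `filter_tags` from a request URL.

    Assumption: tags are comma-separated and case-insensitive, and
    `history` defaults to 20 when it is missing.
    """
    query = parse_qs(urlsplit(url).query)
    history = int(query.get("history", ["20"])[0])
    raw_tags = query.get("filter_tags", [""])[0]
    tags = {t.strip().lower() for t in raw_tags.split(",") if t.strip()}
    return history, tags

def filter_timeline(messages, history, tags):
    """Return at most `history` messages that carry every requested tag.

    messages: list of dicts with a 'tags' set, ordered newest first.
    """
    matching = [m for m in messages if tags <= m["tags"]]
    return matching[:history]
```

Here the tag filter is applied before the `history` cut-off, so the call returns the last 20 matching messages rather than the matches among the last 20 messages.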

> Probably a local problem. I haven't tried search on the stax instance
> for a while. If desired, I could restart the server.

Not a big deal. I just wanted to make sure that there isn't something
broken about search in general, only with that server.

> The search functionality was conceived before we started pools. Thus,
> pools weren't included back then. I'd like to think that search in
> pool would be useful. What about searching across all "my" pools or a
> certain pool?

I see that from looking at the code now (thanks for pointing me
there!). I think adding pool as a searchable property should be done,
and I can definitely do that. What I'm not clear on is if this will
break the existing search index in some way, so maybe someone who
knows a lot more about the search than me can guide me here.

> Look here for the current search implementation:
> http://svn.apache.org/viewvc/incubator/esme/trunk/server/src/main/scala/org/apache/esme/model/Message.scala?view=markup
> - starting on line 148

Thanks for the pointer. Looks like tags are already indexed, so we can
do an arbitrary search query based only on tags. Awesome! I'll try
this out soon-ish to make sure.
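
To illustrate what a tags-only query means when text and tags are indexed as separate fields, here is a toy in-memory index. It is a stand-in for illustration only and does not use the Lucene/Compass API; the class and field names are hypothetical.

```python
from collections import defaultdict

class TinyIndex:
    """Toy inverted index with one posting list per (field, term)."""

    def __init__(self):
        # field name -> term -> set of message ids
        self._postings = defaultdict(lambda: defaultdict(set))

    def add(self, msg_id, text, tags):
        """Index the message body under 'text' and its tags under 'tags'."""
        for word in text.lower().split():
            self._postings["text"][word].add(msg_id)
        for tag in tags:
            self._postings["tags"][tag.lower()].add(msg_id)

    def search(self, term, fields=("text", "tags")):
        """Return sorted ids of messages matching `term` in any given field."""
        hits = set()
        for field in fields:
            hits |= self._postings[field].get(term.lower(), set())
        return sorted(hits)
```

Calling `search("api", fields=("tags",))` ignores messages that only mention "api" in their text, which is the behavior asked about in question 3.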

Ethan

Re: Search questions

Posted by Richard Hirsch <hi...@gmail.com>.
On Fri, Jan 1, 2010 at 6:41 PM, Ethan Jewett <es...@gmail.com> wrote:
> I'm testing out the search functionality a bit and I've got a couple
> of questions:

Are you thinking of adding search to the new api indirectly - which I
prefer - or directly with a "search" method?

>
> 1. Search doesn't appear to be working on
> http://esmecloudserverapache.dickhirsch.staxapps.net/  Or rather, it's
> working, but it only has messages in the index from around Dec 15th
> and 16th. It seems to work fine when I build locally.

Probably a local problem. I haven't tried search on the stax instance
for a while. If desired, I could restart the server.

>
> 2. It looks like search filters messages by people I follow. When I
> follow a new user, their messages start showing up in my search. When
> I unfollow, those messages no longer show up in my search. Was this by
> design? Would people be open to at least allowing an option to search
> across all messages I have *access* to (by pool), rather than just
> those of people I follow?

The search functionality was conceived before we started pools. Thus,
pools weren't included back then. I'd like to think that search in
pool would be useful. What about searching across all "my" pools or a
certain pool?
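
One way to picture the two options (all "my" pools versus a single pool) is the sketch below. It is only an illustration over an in-memory list; the function name, the message fields, and the simple substring match are assumptions and not how the current search works.

```python
def search_in_pools(messages, term, my_pools, pool=None):
    """Search message text, restricted to pools the user belongs to.

    messages: list of dicts with 'pool' and 'text'.
    my_pools: set of pool names the user is a member of.
    pool: optional single pool; it is ignored unless the user is a member.
    """
    if pool is not None:
        scope = {pool} & my_pools  # empty if the user is not in that pool
    else:
        scope = my_pools
    term = term.lower()
    return [m for m in messages
            if m["pool"] in scope and term in m["text"].lower()]
```

Intersecting the requested pool with `my_pools` means that asking for a pool the user does not belong to returns nothing instead of leaking messages.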

>
> 3. I'm thinking of using search in a way where I would want to search
> *only* based on tags and *not* on the message text itself. Is this
> currently possible? If not, can we set it up?

Look here for the current search implementation:
http://svn.apache.org/viewvc/incubator/esme/trunk/server/src/main/scala/org/apache/esme/model/Message.scala?view=markup
- starting on line 148
>
> I'll be looking into this further on my own, but any pointers or
> answers would be much appreciated.
>
> Ethan
>