Posted to solr-user@lucene.apache.org by harish singh <ha...@gmail.com> on 2015/01/20 05:09:49 UTC

Newly observed Facets

Hi,

I have asked a question on Stackoverflow:
http://stackoverflow.com/questions/28036051/solr-newly-observed-facets

I searched the mailing list and found that not many reply there. So asking
the same question here:

I have two fields in my Solr index data: "userName" and "startTimeISO" along
with many other fields. Now I want to query for all the "userNames" that
were seen TODAY but not seen in the last 30 days. Basically, I am trying to
find out Newly Observed UserNames for today.

Now the Solr Facet query I am running is:

facet.pivot: "userName,startTimeISO",
fq: " NOT startTimeISO:["2014-12-20T00:00:00.000Z" TO
"2015-01-18T00:00:00.000Z"] AND
startTimeISO:["2015-01-19T00:00:00.000Z" TO
"2015-01-20T00:00:00.000Z"]"

But for some reason I am getting incorrect results. For example, I see
userName: "bla" in the results of the above query. If I run the same query
for tomorrow, I again see "bla" in my facet results.

I am somehow not able to get the logic right. Perhaps I am not using all
the tools provided by Solr, some of which I may be unaware of?

Can someone help me here? I don't mind testing all of your suggestions and
going back and forth with different suggestions.


Thanks,

Harish

Re: Newly observed Facets

Posted by harish singh <ha...@gmail.com>.
Thanks Alvaro. That worked.

On Tue, Jan 20, 2015 at 9:59 AM, harish singh <ha...@gmail.com>
wrote:

> ok. So I am trying this query:
>
>
> http://cluster1.com:8983/solr/my_collection_shard4_replica1/select?q=*%3A*&rows=0&wt=json&indent=true&facet=true&facet.field=userName&fq=startTimeISO:[NOW-1DAY%20TO%20NOW]&fq=-_query_:%22{!join%20from=%
>  userName%20to=%userName}startTimeISO:[NOW-30DAYS%20TO%20NOW-1DAYS]%22
>
> This query is giving me results. I will now validate it.
>
> Just to confirm, Is this the right query that I am using? Please correct
> me if I am wrong here.
>
> On Tue, Jan 20, 2015 at 9:47 AM, Alvaro Cabrerizo <to...@gmail.com>
> wrote:
>
>> Hi,
>>
>> In case your data looks like:
>>
>> "id": "1",
>> "userName": "one",
>> "startTimeISO": "2015-01-20T17:24:32.888Z"
>>
>> "id": "2",
>> "userName": "one",
>> "startTimeISO": "2015-01-16T17:24:50.208Z"
>>
>> "id": "3",
>> "userName": "two",
>> "startTimeISO": "2015-01-20T17:25:06.109Z"
>>
>>
>> You could use the next query combination
>>
>> q=*:*
>> fq=startTimeISO:[NOW-1DAY TO NOW]  //this will give you all the users that
>> were seen today
>> fq=-_query_:"{!join from=userName to=userName}startTimeISO:[NOW-30DAYS TO
>> NOW-1DAYS]"  //dont include those documents that have others with the same
>> name and were viewed during the last 30 days.
>>
>> Regards.
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> On Tue, Jan 20, 2015 at 5:32 PM, harish singh <ha...@gmail.com>
>> wrote:
>>
>> > Well, that is the problem I am facing. Just checking if there is a way
>> to
>> > compute the diff from 18th for the 19th.
>> > One option is:
>> > Get all the facets for 19th.
>> > Get all facets for 18th.
>> > Do a diff and Eliminate intersection.
>> >
>> > But this isn't optimal as the number of facets returned but solr query
>> can
>> > be huge.
>> >
>> > Any other way to get around with? Any tool that solr provides?
>> >
>> > On Tue, Jan 20, 2015, 8:10 AM Shawn Heisey <ap...@elyograg.org> wrote:
>> >
>> > > On 1/20/2015 8:52 AM, harish singh wrote:
>> > > > Yes I got that. But I am still stuck at this point. Consider it like
>> > > this:
>> > > > I do not know what are the usernames in all the documents.
>> > > > I only know there is time associated with each record.
>> > > >
>> > > > So Say, I have usernames "a", "b", "c", "d" present in my data for
>> the
>> > > 18th
>> > > > of January.
>> > > > And for the 19th, I have usernames "a", "b","c", "d", "e".
>> > > > Then my query for newly observed username for today over last two
>> days
>> > > > should return me "e"
>> > >
>> > > If you query for only documents that match the 19th, and those
>> documents
>> > > contain all five usernames, you're going to get all five usernames in
>> > > your facet.  Solr has no way to know that the other four usernames
>> > > should be excluded - it can only show you facets for the documents
>> > > matching your query, the other documents in your index will have no
>> > > bearing on the results.
>> > >
>> > > Thanks,
>> > > Shawn
>> > >
>> > >
>> >
>>
>
>

Re: Newly observed Facets

Posted by harish singh <ha...@gmail.com>.
ok. So I am trying this query:

http://cluster1.com:8983/solr/my_collection_shard4_replica1/select?q=*%3A*&rows=0&wt=json&indent=true&facet=true&facet.field=userName&fq=startTimeISO:[NOW-1DAY%20TO%20NOW]&fq=-_query_:%22{!join%20from=userName%20to=userName}startTimeISO:[NOW-30DAYS%20TO%20NOW-1DAYS]%22

This query is giving me results. I will now validate it.

Just to confirm, is this the right query? Please correct me
if I am wrong here.
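
A minimal sketch of how a request like the one above could be assembled with
standard URL encoding, so that spaces, quotes and braces are escaped
consistently. The host and core name are the ones from this message; the
variable names `base`, `params` and `url` are only illustrative.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Host and core name are taken from the message above; adjust for your cluster.
base = "http://cluster1.com:8983/solr/my_collection_shard4_replica1/select"

# A list of pairs (not a dict) so that the fq parameter can appear twice.
params = [
    ("q", "*:*"),
    ("rows", "0"),
    ("wt", "json"),
    ("indent", "true"),
    ("facet", "true"),
    ("facet.field", "userName"),
    ("fq", "startTimeISO:[NOW-1DAY TO NOW]"),
    ("fq", '-_query_:"{!join from=userName to=userName}'
           'startTimeISO:[NOW-30DAYS TO NOW-1DAYS]"'),
]

url = base + "?" + urlencode(params)
print(url)

# Decode the query string again to check that both filters survived encoding.
decoded_filters = parse_qs(urlsplit(url).query)["fq"]
print(decoded_filters)
```

Sending the request (for example with urllib.request.urlopen) is left out on
purpose, since it needs a running Solr instance.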


Re: Newly observed Facets

Posted by Alvaro Cabrerizo <to...@gmail.com>.
Hi,

In case your data looks like:

"id": "1",
"userName": "one",
"startTimeISO": "2015-01-20T17:24:32.888Z"

"id": "2",
"userName": "one",
"startTimeISO": "2015-01-16T17:24:50.208Z"

"id": "3",
"userName": "two",
"startTimeISO": "2015-01-20T17:25:06.109Z"


You could use the following query combination:

q=*:*
fq=startTimeISO:[NOW-1DAY TO NOW]
   // this will give you all the users that were seen today
fq=-_query_:"{!join from=userName to=userName}startTimeISO:[NOW-30DAYS TO NOW-1DAYS]"
   // don't include those documents that have others with the same
   // name that were seen during the last 30 days
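
As a rough illustration of what the two filters select, here is a minimal
Python sketch that applies the same logic in memory to the three sample
documents above. It is not how Solr executes the join. The fixed `now` value
and the helper names `parse` and `newly_observed` are assumptions made for
the example; Solr evaluates NOW at query time.

```python
from datetime import datetime, timedelta, timezone

# The three sample documents from this message.
docs = [
    {"id": "1", "userName": "one", "startTimeISO": "2015-01-20T17:24:32.888Z"},
    {"id": "2", "userName": "one", "startTimeISO": "2015-01-16T17:24:50.208Z"},
    {"id": "3", "userName": "two", "startTimeISO": "2015-01-20T17:25:06.109Z"},
]


def parse(ts):
    # Parse the ISO-8601 UTC timestamps used in the sample data.
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)


def newly_observed(docs, now):
    day_ago = now - timedelta(days=1)
    month_ago = now - timedelta(days=30)
    # Inner query of the join: users with a document in [NOW-30DAYS TO NOW-1DAYS].
    seen_before = {
        d["userName"] for d in docs
        if month_ago <= parse(d["startTimeISO"]) <= day_ago
    }
    # The first fq keeps documents in [NOW-1DAY TO NOW]; the negated join
    # drops every document whose userName is in seen_before.
    return sorted({
        d["userName"] for d in docs
        if day_ago <= parse(d["startTimeISO"]) <= now
        and d["userName"] not in seen_before
    })


# Hypothetical "now", chosen so that documents 1 and 3 fall within the last day.
now = datetime(2015, 1, 20, 18, 0, 0, tzinfo=timezone.utc)
print(newly_observed(docs, now))  # ['two']
```

User "one" is dropped because document 2 falls inside the 30-day window, which
is the cross-document check that a plain range filter cannot express.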

Regards.

Re: Newly observed Facets

Posted by harish singh <ha...@gmail.com>.
Well, that is the problem I am facing. I am just checking whether there is a
way to compute the diff between the 18th and the 19th.
One option is:
1. Get all the facets for the 19th.
2. Get all the facets for the 18th.
3. Do a diff and eliminate the intersection.

But this isn't optimal, as the number of facets returned by the Solr query can
be huge.
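
The diff option listed above amounts to a set difference on the client. A
minimal sketch, assuming the facet values for each day have already been
fetched into Python lists (the function and variable names are hypothetical,
and the usernames are the ones from the example earlier in this thread):

```python
def new_usernames(today, previous):
    # Usernames faceted for today that do not appear in the earlier window.
    return sorted(set(today) - set(previous))


facets_18th = ["a", "b", "c", "d"]
facets_19th = ["a", "b", "c", "d", "e"]
print(new_usernames(facets_19th, facets_18th))  # ['e']
```

Both lists have to be held in memory on the client, which is the cost
mentioned above when the number of distinct usernames is large.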

Is there any other way around this? Any tool that Solr provides?


Re: Newly observed Facets

Posted by Shawn Heisey <ap...@elyograg.org>.
On 1/20/2015 8:52 AM, harish singh wrote:
> Yes I got that. But I am still stuck at this point. Consider it like this:
> I do not know what are the usernames in all the documents.
> I only know there is time associated with each record.
>
> So Say, I have usernames "a", "b", "c", "d" present in my data for the 18th
> of January.
> And for the 19th, I have usernames "a", "b","c", "d", "e".
> Then my query for newly observed username for today over last two days
> should return me "e"

If you query for only documents that match the 19th, and those documents
contain all five usernames, you're going to get all five usernames in
your facet.  Solr has no way to know that the other four usernames
should be excluded. It can only show you facets for the documents
matching your query; the other documents in your index will have no
bearing on the results.

Thanks,
Shawn


Re: Newly observed Facets

Posted by harish singh <ha...@gmail.com>.
Yes, I got that. But I am still stuck at this point. Consider it like this:
I do not know what the usernames in all the documents are.
I only know there is a time associated with each record.

So say I have usernames "a", "b", "c", "d" present in my data for the 18th
of January.
And for the 19th, I have usernames "a", "b", "c", "d", "e".
Then my query for newly observed usernames for today over the last two days
should return "e".

Keep in mind that I can query only using "starttimeIso".


Re: Newly observed Facets

Posted by Alvaro Cabrerizo <to...@gmail.com>.
Hi Harish,

What I was asking you in my previous mail was to try (yourself) to
understand your data using specific queries. Apart from that, remember that
faceting is done over the indexed terms. So if you have two documents, one
with the name "user A" and the other with "user B", and the field is
tokenized, you will have 3 different facets (not two): *user*, *A* and *B*.
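
A small Python sketch of that effect, using a plain whitespace split as a
stand-in for the field's analyzer (an assumption: a real Solr analyzer may
also lowercase or otherwise normalize terms). The helper name `facet_counts`
is hypothetical.

```python
from collections import Counter


def facet_counts(values, tokenized):
    counts = Counter()
    for value in values:
        # A tokenized field contributes one term per distinct token in the
        # document; an untokenized field contributes the whole value.
        terms = set(value.split()) if tokenized else {value}
        counts.update(terms)
    return dict(counts)


names = ["user A", "user B"]
print(facet_counts(names, tokenized=True))
print(facet_counts(names, tokenized=False))
```

With tokenized=True the term "user" is counted for both documents, so the
facet no longer lists whole usernames. Faceting on an untokenized field keeps
each name intact.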

Regards.


Re: Newly observed Facets

Posted by harish singh <ha...@gmail.com>.
I am not querying for specific usernames.
Each day, there will be many usernames observed at different times.
But there might be some usernames that were never seen in the last 30 days,
but were observed today.
That is the main challenge I am having:

how to identify which usernames from today were not seen in the last 30
days.


Re: Newly observed Facets

Posted by Alvaro Cabrerizo <to...@gmail.com>.
Ok,

So, as commented before, in case your startTimeISO is single-valued you
only need the range clause
startTimeISO:["2015-01-19T00:00:00.000Z" TO "2015-01-20T00:00:00.000Z"].
There is no need for both NOT A AND B, as the documents that satisfy B will
automatically satisfy NOT A.
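
A quick way to convince yourself of this, sketched in Python rather than Solr
syntax. The two ranges are the ones from the original fq; the helper names are
hypothetical. The filter is evaluated once per document, so for a single-valued
field the NOT A clause changes nothing when A and B do not overlap.

```python
from datetime import datetime, timedelta

# Range A (the earlier 30 days) and range B (today), as in the original fq.
A = (datetime(2014, 12, 20), datetime(2015, 1, 18))
B = (datetime(2015, 1, 19), datetime(2015, 1, 20))


def in_range(ts, bounds):
    # Inclusive range check, like Solr's [start TO end].
    return bounds[0] <= ts <= bounds[1]


def not_a_and_b(ts):
    return (not in_range(ts, A)) and in_range(ts, B)


def b_only(ts):
    return in_range(ts, B)


# One timestamp every 6 hours from 2014-12-01 through 2015-01-31.
samples = [datetime(2014, 12, 1) + timedelta(hours=6 * i) for i in range(4 * 62)]
same = all(not_a_and_b(ts) == b_only(ts) for ts in samples)
print(same)  # True
```

Neither form looks at other documents with the same userName, which is why a
user seen last week still shows up in today's facet.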

If you query:

q: username:bla

How many documents do you get? Were they observed at different
startTimeISO values? (I mean, maybe you have different documents with similar
names that were observed at different times.)

Regards.

Re: Newly observed Facets

Posted by harish singh <ha...@gmail.com>.
Every entry in the document has a username, starttimeISO and uuid (which is
not starttimeiso)

So every record has a starttimeISO which is the time when the username was
seen.

The document looks like this:
{
Uuid: xxx
StartTimeISO: 2015-01-18T00:00:00.000Z
Username: abc
}

There are multiple records in the document.
Indexing is applied on "username" and "starttimeiso" fields.

Re: Newly observed Facets

Posted by Alvaro Cabrerizo <to...@gmail.com>.
Is startTimeISO single or multi-valued? (In other words, do you add a new
value to startTimeISO every time a document is observed, or just the first
time?)

The idea is to clarify the data you store in the index. So what does your
document look like:

id:XXX-YYY-ZZZ
name: theName
startTimeISO:[2015-01-17T00:00:00.000Z, 2015-01-17T00:00:00.000Z,
2015-01-19T00:00:00.000Z]

or

id:XXX-YYY-ZZZ
name: theName
startTimeISO:"2015-01-17T00:00:00.000Z"

Again, when you ask for documents with name bla, what do you get, just one
or several?


Regards


Re: Newly observed Facets

Posted by harish singh <ha...@gmail.com>.
Yes. The fieldtype for startTimeISO is date.
My issue is that I need to find all the "newly observed" usernames for a
given day.
So, for today, if I say that username "xyz" is newly observed, then
it means that username "xyz" was not present in the last 30 days of data
and was only observed today. So it is "new".

I hope that explains it well. Ask me if you have any more questions.


Re: Newly observed Facets

Posted by Alvaro Cabrerizo <to...@gmail.com>.
At first impression, everything seems ok.

Anyway, is startTimeISO a single-valued or multivalued field? In case it
is single-valued, the clause
startTimeISO:["2015-01-19T00:00:00.000Z" TO "2015-01-20T00:00:00.000Z"]
is sufficient to exclude other periods of time. I also guess that the
startTimeISO fieldtype is date.

Another option for rewriting your query (just for testing, as I can't find any
problem in the query you presented):

q=*:*
fq=startTimeISO:[NOW/DAY-1DAYS TO NOW]
raw parameters: facet.pivot=userName,startTimeISO

It would also be helpful to see the raw response, just to rule out that the
name bla appears in different documents. For example, using:

query: userName:bla
fq=startTimeISO:[NOW/DAY-1DAYS TO NOW]
fl=id,userName,startTimeISO


Hope it helps.