Posted to solr-user@lucene.apache.org by Andrew Lundgren <lu...@familysearch.org> on 2013/03/15 20:24:36 UTC

Query.toString printing binary in the output...

We use the toString call on the query in our logs.  For some numeric types, the encoded form of the number is being printed instead of the readable form.

This makes tail and some other tools very unhappy...

Here is a partial example of a query.toString() that would have had binary in it.  As a short-term workaround, I replaced all non-printable characters in the string with an '_'.

(collection_id:`__z_[^0.027 collection_id:`__nB+^0.026 collection_id:`__Zl_^0.025 collection_id:`__i49^0.024 collection_id:`__Pq%^0.023 collection_id:`__VCS^0.022 collection_id:`__WbH^0.021 collection_id:`__Yu_^0.02 collection_id:`__UF&^0.019 collection_id:`__I2g^0.018 collection_id:`__PP_^0.016999999 collection_id:`__Ysv^0.015999999 collection_id:`__Oe_^0.014999999 collection_id:`__Ysw^0.013999999 collection_id:`__Wi_^0.012999998 collection_id:`__fLi^0.011999998 collection_id:`__XRk^0.010999998 collection_id:`__Uz[^0.009999998 collection_id:`__SE_^0.008999998 collection_id:`__Ysx^0.007999998 collection_id:`__Ysh^0.0069999974 collection_id:`__fLh^0.0059999973 collection_id:`__f _^0.004999997 collection_id:`__`^C^0.003999997 collection_id:`__fKM^0.002999997 collection_id:`__Szo^0.001999997 collection_id:`__f ]^9.99997E-4)

But, as you can see, that is less than useful...
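
For illustration, here is a minimal sketch of two ways to make such a string safe for log tools. The class and method names are hypothetical. mask() reproduces the lossy '_' replacement described above, assuming a regex such as \p{Cntrl}; escape() is an alternative that keeps the character values visible as hex escapes. Neither recovers the readable number.

```java
final class LogSafe {
    private LogSafe() {}

    // Lossy: every ASCII control character becomes '_'.
    static String mask(String s) {
        return s.replaceAll("\\p{Cntrl}", "_");
    }

    // Lossless for inspection: every ISO control character is written as a
    // backslash, the letter u, and four hex digits. Note that isISOControl
    // also covers 0x7f-0x9f, a slightly wider range than the regex in mask().
    static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isISOControl(c)) {
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Made-up term text containing a backspace and a NUL character.
        String raw = "collection_id:`" + (char) 8 + (char) 0 + "{^0.027";
        System.out.println(mask(raw));
        System.out.println(escape(raw));
    }
}
```

The escaped form is longer, but it stays on one line and does not upset tail.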

I spent some time looking at the source and found that Term does not contain the type of the embedded data.  Is there any solution short of walking the query, looking up the type of each field in the schema, and writing my own print function?

Thanks!

--
Andrew




 NOTICE: This email message is for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message.


Re: Query.toString printing binary in the output...

Posted by Erick Erickson <er...@gmail.com>.
I'm afraid I won't have time to dig into this for a while. Does anyone else
want to chime in?

Erick


On Tue, Mar 19, 2013 at 9:08 AM, Andrew Lundgren
<lu...@familysearch.org> wrote:

> This is perhaps more clear:
>
> Assuming you have a schema where:
>
>   <field name="collection_id" type="integer" indexed="true" stored="false" required="true" omitTermFreqAndPositions="true"/>
>
> Then:
>
>   void testSamplePrint()throws IOException, SAXException, ParserConfigurationException{
>
>       SolrConfig config = new SolrConfig("solrconfig.xml");
>       IndexSchema schema = new IndexSchema(config, "schema.xml", null);
>
>       TermQuery aTerm=new TermQuery(new Term("TestString","123456"));
>       TermQuery bTerm=new TermQuery(new Term("TestString",
>               schema.getField("collection_id").getType().readableToIndexed("123456")));
>
>       System.out.printf("%s\n", aTerm.toString());
>       System.out.printf("%s\n", bTerm.toString());
>
>       assertEquals(aTerm.toString(),bTerm.toString());
>
>   }
>
> The test output is:
>
> java.lang.AssertionError:
> Expected :TestString:123456
> Actual   :TestString:`
>
> I believe that this is because the Term does not know that it contains an
> encoded integer, and thus cannot parse it.  If the TermQuery knew the type,
> it could also decode it.  But w/o a query to the schema, I don't know how
> to get the toString to function correctly.
>
>
> -----Original Message-----
> From: Erick Erickson [mailto:erickerickson@gmail.com]
> Sent: Monday, March 18, 2013 7:55 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Query.toString printing binary in the output...
>
> If you simply attach &debug=all to your URL, you should see the query come
> back in your response, XML, JSON, whatever. If that also shows bizarre
> characters, then that will give you some idea whether it's in Solr or not.
>
> But you haven't given us much info about how/where you call toString. You
> may be getting into trouble with character sets (I'd find that quite odd,
> but it's a possibility).
>
> What I'm really finding confusing is that you're mentioning Term alongside
> query.toString() (at least that's what I think you're saying), which has
> nothing at all to do with Terms, it's just the query string passed in. So
> I'm really puzzled as to what you're doing to get this kind of output, it
> almost looks like you're trying to print out the _results_ of a query, not
> the query.
>
> So some clarification would be helpful...
>
> Best
> Erick
>
>
> On Mon, Mar 18, 2013 at 12:01 PM, Andrew Lundgren
> <lundgren@familysearch.org> wrote:
>
> > I am sorry, I don't follow what you mean by debug=query.  Can you
> > elaborate on that a bit?
> >
> > Thanks!
> >
> > -----Original Message-----
> > From: Erick Erickson [mailto:erickerickson@gmail.com]
> > Sent: Sunday, March 17, 2013 8:09 AM
> > To: solr-user@lucene.apache.org
> > Subject: Re: Query.toString printing binary in the output...
> >
> > Hmmm, without looking at the code, somehow when you specify
> > debug=query you get readable results, maybe that code would be a place
> > to start?
> >
> > And are you looking for the parsed output? Otherwise you could print
> > the original query.
> >
> > Not much help....
> > Erick
> >
> >

Re: Query.toString printing binary in the output...

Posted by Jack Krupansky <ja...@basetechnology.com>.
As Erick was suggesting, add &debugQuery=debug (or &debugQuery=true) to your 
Solr query request, and Solr will display more detail about the parsed 
query.

For example, I see this on a query of an integer field:

curl "http://localhost:8983/solr/select/?q=++(i_i:123)+&debugQuery=true&indent=true"
...
<lst name="debug">
  <str name="rawquerystring">  (i_i:123) </str>
  <str name="querystring">  (i_i:123) </str>
  <str name="parsedquery">i_i:123</str>
  <str name="parsedquery_toString">i_i:`#8;#0;#0;#0;{</str>

The "parsedquery_toString" entry is the result of a Query.toString call, as
you described.

But note that "parsedquery" displays the source term, exactly as you
expected. This is because the Solr debug component uses a Solr utility
method, QueryParsing.toString, which is a hardcoded, schema-aware version of
Query.toString. Query.toString itself is not schema-aware because it is a
Lucene method, and Lucene has no concept of a schema.
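
To make the schema-aware idea concrete, here is a minimal, self-contained sketch. It is not how QueryParsing.toString is implemented; the class, the set of "int fields" standing in for a schema, and the method names are all hypothetical. It assumes the term text is a Lucene-style prefix-coded int at shift 0 (a '`' marker followed by five 7-bit chunks), which is consistent with the `#8;#0;#0;#0;{ output shown above for 123. In real Solr code the decoding would presumably go through the field type rather than a hand-written decoder.

```java
import java.util.HashSet;
import java.util.Set;

final class SchemaAwareTermPrinter {
    // Hypothetical stand-in for a schema: fields whose terms are prefix-coded ints.
    private final Set<String> intFields = new HashSet<>();

    SchemaAwareTermPrinter withIntField(String name) {
        intFields.add(name);
        return this;
    }

    // Assumed layout: marker char '`' (0x60), then five 7-bit chunks,
    // most significant chunk first.
    static int decodeInt(String indexed) {
        if (indexed.length() != 6 || indexed.charAt(0) != '`') {
            throw new IllegalArgumentException("not a shift-0 prefix-coded int");
        }
        int sortableBits = 0;
        for (int i = 1; i < 6; i++) {
            sortableBits = (sortableBits << 7) | (indexed.charAt(i) & 0x7f);
        }
        return sortableBits ^ 0x80000000; // undo the sign-bit flip used for sorting
    }

    // Prints field:value, decoding the term text only when the "schema" says so.
    String print(String field, String termText) {
        if (intFields.contains(field)) {
            return field + ":" + decodeInt(termText);
        }
        return field + ":" + termText;
    }

    public static void main(String[] args) {
        // The encoded term text for 123, as it appears in the debug output.
        String indexed123 = new String(new char[] {'`', 8, 0, 0, 0, '{'});
        SchemaAwareTermPrinter printer = new SchemaAwareTermPrinter().withIntField("i_i");
        System.out.println(printer.print("i_i", indexed123)); // prints i_i:123
    }
}
```

The point of the sketch is only that the printer needs outside knowledge of the field type; the term text alone does not say whether it is an encoded number or an ordinary string.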

-- Jack Krupansky


RE: Query.toString printing binary in the output...

Posted by Andrew Lundgren <lu...@familysearch.org>.
This is perhaps more clear:

Assuming you have a schema where:

  <field name="collection_id" type="integer" indexed="true" stored="false" required="true" omitTermFreqAndPositions="true"/>

Then:

  void testSamplePrint() throws IOException, SAXException, ParserConfigurationException {

      SolrConfig config = new SolrConfig("solrconfig.xml");
      IndexSchema schema = new IndexSchema(config, "schema.xml", null);

      // aTerm holds the readable text; bTerm holds the indexed (encoded) form
      // that the collection_id field type produces for the same value.
      TermQuery aTerm = new TermQuery(new Term("TestString", "123456"));
      TermQuery bTerm = new TermQuery(new Term("TestString",
              schema.getField("collection_id").getType().readableToIndexed("123456")));

      System.out.printf("%s\n", aTerm.toString());
      System.out.printf("%s\n", bTerm.toString());

      // Fails: bTerm.toString() prints the encoded term text, not "123456".
      assertEquals(aTerm.toString(), bTerm.toString());

  }

The test output is: 

java.lang.AssertionError: 
Expected :TestString:123456
Actual   :TestString:`

I believe that this is because the Term does not know that it contains an encoded integer, and thus cannot decode it.  If the TermQuery knew the type, it could decode it as well.  But without consulting the schema, I don't know how to get toString to work correctly.
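
As a rough illustration of why the second term prints as a backtick followed by unprintable characters, the sketch below encodes an int the way Lucene's numeric prefix coding appears to work at shift 0: flip the sign bit, then write a marker character and five 7-bit chunks. This is an assumption about what readableToIndexed produces for this field type, and the class and method names are hypothetical.

```java
final class PrefixCodedInt {
    private PrefixCodedInt() {}

    // Encodes value as '`' (0x60) plus five 7-bit chunks, most significant first.
    static String encode(int value) {
        char[] buf = new char[6];
        buf[0] = '`';
        int sortableBits = value ^ 0x80000000; // flip the sign bit so terms sort numerically
        for (int i = 5; i >= 1; i--) {
            buf[i] = (char) (sortableBits & 0x7f);
            sortableBits >>>= 7;
        }
        return new String(buf);
    }

    public static void main(String[] args) {
        // Print the code points instead of the raw string, since several
        // of them are control characters.
        for (char c : encode(123456).toCharArray()) {
            System.out.print((int) c + " ");
        }
        System.out.println();
    }
}
```

Under this assumption most of the encoded characters for small values fall below 0x20, which is why a terminal shows little more than the leading backtick.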




RE: Query.toString printing binary in the output...

Posted by Andrew Lundgren <lu...@familysearch.org>.
I have not.  Just guessing, but that looks like code that walks a query and uses the schema to figure out what the types should be.

That looks like the call I should be using.  Any idea how much of a performance impact this has compared to just the Query.toString call (that admittedly doesn't always work)?

I haven't used the debug option either, but I don't think that is the right path because we are currently logging all of the queries, and that seems to be targeted more at a one-off operation.  (Still helpful to know for those cases though.  Thank you.)


-----Original Message-----
From: Jack Krupansky [mailto:jack@basetechnology.com] 
Sent: Tuesday, March 19, 2013 5:20 PM
To: solr-user@lucene.apache.org
Subject: Re: Query.toString printing binary in the output...

Did you try QueryParsing.toString? As in:

logger.info("db retrieve time=" + (System.currentTimeMillis() - start) + ", query=" +
         QueryParsing.toString(rb.getQuery(), rb.req.getSchema()) + ", indexIds=" + getIndexIds(rb));

-- Jack Krupansky

-----Original Message-----
From: Andrew Lundgren
Sent: Tuesday, March 19, 2013 11:52 AM
To: solr-user@lucene.apache.org
Subject: RE: Query.toString printing binary in the output...

Thank you for clarifying.

The logging line is this:

logger.info("db retrieve time=" + (System.currentTimeMillis() - start) + ", query=" +
         rb.getQuery().toString().replaceAll("\\p{Cntrl}", "_") + ", indexIds=" + getIndexIds(rb));

(The replaceAll call is used to clean out the binary.)

A complete log entry looks like this (I removed some values and inserted Zs):

2013-03-19 01:36:58,648 INFO
[org.apache.solr.handler.component.DatabaseComponent] (http-8080-19) [] [] 
[] [] [] ip-10-212-91-229/10.212.91.229   db retrieve time=53, 
query=+(+(givenname:ZZZZ^1.8 | givenname_standard:ZZZZ^1.08 |
givenname:?^-3.6179998 | givenname:Z^0.17999999) +(surname:ZZZZ^1.8 |
surname_standard:ZZZZ^1.08) +(birth_year:1855^0.495 | birth_year:1856^0.495 
| (-marriage_year:[1850 TO 1854]^1.0E-4 -death_year:[1850 TO
1854]^1.0E-4 -residence_year:[1850 TO 1854]^1.0E-4 -other_year:[1850 TO
1854]^1.0E-4 +est_birth_year_range:[180 TO 185]^-1.005)) 
+((+(birth_place:amherst,1929953 |
birth_place_ancestors:amherst,1929953^0.99 | birth_place:amherst,6279984 |
birth_place_ancestors:amherst,6279984^0.99 |
birth_place:novascotia,1927164^0.7 |
birth_place_ancestors:novascotia,1927164^0.69 |
birth_place:cumberland,1929953^0.7 |
birth_place_ancestors:cumberland,1929953^0.69 | birth_place:canada,-1^0.2)) 
| (+birth_place:?^-2.01 +((record_place:amherst,1929953^0.7 |
record_place_ancestors:amherst,1929953^0.69299996 |
record_place:amherst,6279984^0.7 |
record_place_ancestors:amherst,6279984^0.69299996 |
record_place:novascotia,1927164^0.48999998 |
record_place_ancestors:novascotia,1927164^0.48299998 |
record_place:cumberland,1929953^0.48999998 |
record_place_ancestors:cumberland,1929953^0.48299998 |
record_place:canada,-1^0.14))))) is_principal:T^0.01
(collection_id:`__z_[^0.027 collection_id:`__nB+^0.026
collection_id:`__Zl_^0.025 collection_id:`__i49^0.024
collection_id:`__Pq%^0.023 collection_id:`__VCS^0.022
collection_id:`__WbH^0.021 collection_id:`__Yu_^0.02
collection_id:`__UF&^0.019 collection_id:`__I2g^0.018
collection_id:`__PP_^0.016999999 collection_id:`__Ysv^0.015999999
collection_id:`__Oe_^0.014999999 collection_id:`__Ysw^0.013999999
collection_id:`__Wi_^0.012999998 collection_id:`__fLi^0.011999998
collection_id:`__XRk^0.010999998 collection_id:`__Uz[^0.009999998
collection_id:`__SE_^0.008999998 collection_id:`__Ysx^0.007999998
collection_id:`__Ysh^0.0069999974 collection_id:`__fLh^0.0059999973 collection_id:`__f _^0.004999997 collection_id:`__`^C^0.003999997
collection_id:`__fKM^0.002999997 collection_id:`__Szo^0.001999997 collection_id:`__f ]^9.99997E-4) record_type:`_____^0.11
record_country:Canada^0.1 record_subcountry:Canada,Nova Scotia^0.1, indexIds=5649621248770, 5649707485955, 5649774056450, 5650368372995, 5650800358658, 40314148353, 17914147586, 77849158944, 77849158945, 77849158946, 77849158947, 77849158948, 77849158949, 77849158950, 77849158951, 77849158952, 77849158953, 77849158954, 77849158955, 77849158956


We have seen these types of issues (though the opposite) when querying with non-encoded ints.

When preparing the query we have to encode the collection IDs like this:

        Query q = new TermQuery(new Term(SolrTag.COLLECTION_ID.getName(),
                type.readableToIndexed(Integer.toString(collectionId))));

So perhaps I used the wrong term when I said "encoded"; maybe it should have been "indexed"?  But that seems to have other meanings that would be potentially more confusing.  These are the Terms printed above that remain in the non-readable format when toString is called.
(Perhaps we should be using something other than readableToIndexed?)


Thanks!




Re: Query.toString printing binary in the output...

Posted by Jack Krupansky <ja...@basetechnology.com>.
Did you try QueryParsing.toString? As in:

logger.info("db retrieve time=" + (System.currentTimeMillis() - start) + ", query=" +
         QueryParsing.toString(rb.getQuery(), rb.req.getSchema()) + ", indexIds=" + getIndexIds(rb));

-- Jack Krupansky

-----Original Message----- 
From: Andrew Lundgren
Sent: Tuesday, March 19, 2013 11:52 AM
To: solr-user@lucene.apache.org
Subject: RE: Query.toString printing binary in the output...

Thank you for clarifying.

The logging line is this:

logger.info("db retrieve time=" + (System.currentTimeMillis() - start) + ", query=" +
         rb.getQuery().toString().replaceAll("\\p{Cntrl}", "_") + ", indexIds=" + getIndexIds(rb));

(The replaceAll call is used to clean out the binary.)

A complete log line looks like this (I removed some values and inserted Zs):

2013-03-19 01:36:58,648 INFO 
[org.apache.solr.handler.component.DatabaseComponent] (http-8080-19) [] [] 
[] [] [] ip-10-212-91-229/10.212.91.229   db retrieve time=53, 
query=+(+(givenname:ZZZZ^1.8 | givenname_standard:ZZZZ^1.08 | 
givenname:?^-3.6179998 | givenname:Z^0.17999999) +(surname:ZZZZ^1.8 | 
surname_standard:ZZZZ^1.08) +(birth_year:1855^0.495 | birth_year:1856^0.495 
| (-marriage_year:[1850 TO 1854]^1.0E-4 -death_year:[1850 TO 
1854]^1.0E-4 -residence_year:[1850 TO 1854]^1.0E-4 -other_year:[1850 TO 
1854]^1.0E-4 +est_birth_year_range:[180 TO 185]^-1.005)) 
+((+(birth_place:amherst,1929953 | 
birth_place_ancestors:amherst,1929953^0.99 | birth_place:amherst,6279984 | 
birth_place_ancestors:amherst,6279984^0.99 | 
birth_place:novascotia,1927164^0.7 | 
birth_place_ancestors:novascotia,1927164^0.69 | 
birth_place:cumberland,1929953^0.7 | 
birth_place_ancestors:cumberland,1929953^0.69 | birth_place:canada,-1^0.2)) 
| (+birth_place:?^-2.01 +((record_place:amherst,1929953^0.7 | 
record_place_ancestors:amherst,1929953^0.69299996 | 
record_place:amherst,6279984^0.7 | 
record_place_ancestors:amherst,6279984^0.69299996 | 
record_place:novascotia,1927164^0.48999998 | 
record_place_ancestors:novascotia,1927164^0.48299998 | 
record_place:cumberland,1929953^0.48999998 | 
record_place_ancestors:cumberland,1929953^0.48299998 | 
record_place:canada,-1^0.14))))) is_principal:T^0.01 
(collection_id:`__z_[^0.027 collection_id:`__nB+^0.026 
collection_id:`__Zl_^0.025 collection_id:`__i49^0.024 
collection_id:`__Pq%^0.023 collection_id:`__VCS^0.022 
collection_id:`__WbH^0.021 collection_id:`__Yu_^0.02 
collection_id:`__UF&^0.019 collection_id:`__I2g^0.018 
collection_id:`__PP_^0.016999999 collection_id:`__Ysv^0.015999999 
collection_id:`__Oe_^0.014999999 collection_id:`__Ysw^0.013999999 
collection_id:`__Wi_^0.012999998 collection_id:`__fLi^0.011999998 
collection_id:`__XRk^0.010999998 collection_id:`__Uz[^0.009999998 
collection_id:`__SE_^0.008999998 collection_id:`__Ysx^0.007999998 
collection_id:`__Ysh^0.0069999974 collection_id:`__fLh^0.0059999973 
collection_id:`__f _^0.004999997 collection_id:`__`^C^0.003999997 
collection_id:`__fKM^0.002999997 collection_id:`__Szo^0.001999997 
collection_id:`__f ]^9.99997E-4) record_type:`_____^0.11 
record_country:Canada^0.1 record_subcountry:Canada,Nova Scotia^0.1, 
indexIds=5649621248770, 5649707485955, 5649774056450, 5650368372995, 
5650800358658, 40314148353, 17914147586, 77849158944, 77849158945, 
77849158946, 77849158947, 77849158948, 77849158949, 77849158950, 
77849158951, 77849158952, 77849158953, 77849158954, 77849158955, 77849158956


We have seen these types of issues (though the opposite) when querying with 
non-encoded ints.

When preparing the query we have to encode the collection IDs like this:

        Query q = new TermQuery(new Term(SolrTag.COLLECTION_ID.getName(), type.readableToIndexed(Integer.toString(collectionId))));

So perhaps I used the wrong term when I said "encoded"; maybe it should
have been "indexed"? But that word has other meanings and would
potentially be more confusing. These are the Terms printed above that
remain in the non-readable format when toString is called.
(Perhaps we should be using something other than readableToIndexed?)


Thanks!




RE: Query.toString printing binary in the output...

Posted by Andrew Lundgren <lu...@familysearch.org>.
Thank you for clarifying.

The logging line is this:

logger.info("db retrieve time=" + (System.currentTimeMillis() - start) + ", query=" +
         rb.getQuery().toString().replaceAll("\\p{Cntrl}", "_") + ", indexIds=" + getIndexIds(rb));

(The replaceAll call is used to clean out the binary.)
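
One drawback of replacing every control character with '_' is that different terms can collapse to the same text. A possible variation on this workaround (a sketch only) writes each control character as a visible escape, so the line stays safe for tail while the original value can still be recovered. The class and method names are made up for illustration; only JDK calls are used. Note that Character.isISOControl also covers 0x7F to 0x9F, which is slightly broader than the regex class above.

```java
// Sketch of a log-safe escaper; names are hypothetical, only JDK calls are used.
final class ControlCharEscaper {

    // Writes each ISO control character as a backslash, the letter u, and four
    // hex digits, and copies every other character through unchanged.
    static String escapeControls(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isISOControl(c)) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A made-up term text containing a backspace and a NUL character.
        String raw = "collection_id:`" + (char) 8 + (char) 0 + "z";
        System.out.println(escapeControls(raw));
    }
}
```

The output is longer than the underscore version, but two different IDs no longer print as the same string.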

A complete log line looks like this (I removed some values and inserted Zs):

2013-03-19 01:36:58,648 INFO  [org.apache.solr.handler.component.DatabaseComponent] (http-8080-19) [] [] [] [] [] ip-10-212-91-229/10.212.91.229   db retrieve time=53, query=+(+(givenname:ZZZZ^1.8 | givenname_standard:ZZZZ^1.08 | givenname:?^-3.6179998 | givenname:Z^0.17999999) +(surname:ZZZZ^1.8 | surname_standard:ZZZZ^1.08) +(birth_year:1855^0.495 | birth_year:1856^0.495 | (-marriage_year:[1850 TO 1854]^1.0E-4 -death_year:[1850 TO 1854]^1.0E-4 -residence_year:[1850 TO 1854]^1.0E-4 -other_year:[1850 TO 1854]^1.0E-4 +est_birth_year_range:[180 TO 185]^-1.005)) +((+(birth_place:amherst,1929953 | birth_place_ancestors:amherst,1929953^0.99 | birth_place:amherst,6279984 | birth_place_ancestors:amherst,6279984^0.99 | birth_place:novascotia,1927164^0.7 | birth_place_ancestors:novascotia,1927164^0.69 | birth_place:cumberland,1929953^0.7 | birth_place_ancestors:cumberland,1929953^0.69 | birth_place:canada,-1^0.2)) | (+birth_place:?^-2.01 +((record_place:amherst,1929953^0.7 | record_place_ancestors:amherst,1929953^0.69299996 | record_place:amherst,6279984^0.7 | record_place_ancestors:amherst,6279984^0.69299996 | record_place:novascotia,1927164^0.48999998 | record_place_ancestors:novascotia,1927164^0.48299998 | record_place:cumberland,1929953^0.48999998 | record_place_ancestors:cumberland,1929953^0.48299998 | record_place:canada,-1^0.14))))) is_principal:T^0.01 (collection_id:`__z_[^0.027 collection_id:`__nB+^0.026 collection_id:`__Zl_^0.025 collection_id:`__i49^0.024 collection_id:`__Pq%^0.023 collection_id:`__VCS^0.022 collection_id:`__WbH^0.021 collection_id:`__Yu_^0.02 collection_id:`__UF&^0.019 collection_id:`__I2g^0.018 collection_id:`__PP_^0.016999999 collection_id:`__Ysv^0.015999999 collection_id:`__Oe_^0.014999999 collection_id:`__Ysw^0.013999999 collection_id:`__Wi_^0.012999998 collection_id:`__fLi^0.011999998 collection_id:`__XRk^0.010999998 collection_id:`__Uz[^0.009999998 collection_id:`__SE_^0.008999998 collection_id:`__Ysx^0.007999998 
collection_id:`__Ysh^0.0069999974 collection_id:`__fLh^0.0059999973 collection_id:`__f _^0.004999997 collection_id:`__`^C^0.003999997 collection_id:`__fKM^0.002999997 collection_id:`__Szo^0.001999997 collection_id:`__f ]^9.99997E-4) record_type:`_____^0.11 record_country:Canada^0.1 record_subcountry:Canada,Nova Scotia^0.1, indexIds=5649621248770, 5649707485955, 5649774056450, 5650368372995, 5650800358658, 40314148353, 17914147586, 77849158944, 77849158945, 77849158946, 77849158947, 77849158948, 77849158949, 77849158950, 77849158951, 77849158952, 77849158953, 77849158954, 77849158955, 77849158956  


We have seen these types of issues (though the opposite) when querying with non-encoded ints.  

When preparing the query we have to encode the collection IDs like this:

        Query q = new TermQuery(new Term(SolrTag.COLLECTION_ID.getName(), type.readableToIndexed(Integer.toString(collectionId))));

So perhaps I used the wrong term when I said "encoded"; maybe it should have been "indexed"? But that word has other meanings and would potentially be more confusing. These are the Terms printed above that remain in the non-readable format when toString is called. (Perhaps we should be using something other than readableToIndexed?)
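
For what it's worth, the shape of the output above (a backquote followed by five characters, several of them unprintable) is consistent with an int packed 7 bits per character behind a one-byte header. The sketch below assumes that layout, as recalled from Lucene 4.x NumericUtils (header byte 0x60 plus shift, sign bit flipped, full precision only). It is a self-contained illustration, not the library code, and real code would presumably go through the field type's indexedToReadable instead. The value 1999067 is a made-up example.

```java
import java.nio.charset.StandardCharsets;

// Illustration only: re-implements an ASSUMED layout, not the Lucene source.
final class PrefixCodedIntSketch {

    // Assumed header byte for a full-precision (shift 0) int term; 0x60 is '`'.
    static final byte SHIFT_START_INT = 0x60;

    // Packs an int into 1 header byte plus 5 payload bytes of 7 bits each.
    static byte[] encode(int value) {
        byte[] out = new byte[6];
        out[0] = SHIFT_START_INT;
        int sortable = value ^ 0x80000000; // flip the sign bit so bytes sort numerically
        for (int i = 5; i >= 1; i--) {
            out[i] = (byte) (sortable & 0x7f);
            sortable >>>= 7;
        }
        return out;
    }

    // Reverses encode(); rejects anything that is not a 6-byte shift-0 term.
    static int decode(byte[] term) {
        if (term.length != 6 || term[0] != SHIFT_START_INT) {
            throw new IllegalArgumentException("not a full-precision int term");
        }
        int sortable = 0;
        for (int i = 1; i < 6; i++) {
            sortable = (sortable << 7) | (term[i] & 0x7f);
        }
        return sortable ^ 0x80000000;
    }

    // Shows the term the way the log workaround would: control chars become '_'.
    static String asLogged(byte[] term) {
        return new String(term, StandardCharsets.ISO_8859_1).replaceAll("\\p{Cntrl}", "_");
    }

    public static void main(String[] args) {
        byte[] term = encode(1999067); // hypothetical collection id
        System.out.println(asLogged(term));
        System.out.println(decode(term));
    }
}
```

With the made-up value 1999067 this prints `__z_[ followed by 1999067, the same shape as the first term in the log. That does not mean the real ID was 1999067, since the underscore hides which control character was there.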


Thanks!


-----Original Message-----
From: Erick Erickson [mailto:erickerickson@gmail.com] 
Sent: Monday, March 18, 2013 7:55 PM
To: solr-user@lucene.apache.org
Subject: Re: Query.toString printing binary in the output...

If you simply attach &debug=all to your URL, you should see the query come back in your response, XML, JSON, whatever. If that also shows bizarre characters, then that will give you some idea whether it's in Solr or not.

But you haven't given us much info about how/where you call toString. You may be getting into trouble with character sets (although I'd find that quite odd, it's a possibility).

What I'm really finding confusing is that you're mentioning Term alongside
query.toString() (at least that's what I think you're saying), which has nothing at all to do with Terms, it's just the query string passed in. So I'm really puzzled as to what you're doing to get this kind of output, it almost looks like you're trying to print out the _results_ of a query, not the query.

So some clarification would be helpful...

Best
Erick




Re: Query.toString printing binary in the output...

Posted by Erick Erickson <er...@gmail.com>.
If you simply attach &debug=all to your URL, you should see the query come
back in your response, XML, JSON, whatever. If that also shows bizarre
characters, then that will give you some idea whether it's in Solr or not.

But you haven't given us much info about how/where you call toString. You
may be getting into trouble with character sets (although I'd find that
quite odd, it's a possibility).

What I'm really finding confusing is that you're mentioning Term alongside
query.toString() (at least that's what I think you're saying), which has
nothing at all to do with Terms, it's just the query string passed in. So
I'm really puzzled as to what you're doing to get this kind of output, it
almost looks like you're trying to print out the _results_ of a query, not
the query.

So some clarification would be helpful...

Best
Erick


On Mon, Mar 18, 2013 at 12:01 PM, Andrew Lundgren <lundgren@familysearch.org
> wrote:

> I am sorry, I don't follow what you mean by debug=query.  Can you
> elaborate on that a bit?
>
> Thanks!

RE: Query.toString printing binary in the output...

Posted by Andrew Lundgren <lu...@familysearch.org>.
I am sorry, I don't follow what you mean by debug=query.  Can you elaborate on that a bit?

Thanks!

-----Original Message-----
From: Erick Erickson [mailto:erickerickson@gmail.com] 
Sent: Sunday, March 17, 2013 8:09 AM
To: solr-user@lucene.apache.org
Subject: Re: Query.toString printing binary in the output...

Hmmm, without looking at the code, somehow when you specify debug=query you get readable results, maybe that code would be a place to start?

And are you looking for the parsed output? Otherwise you could print original query.

Not much help....
Erick




Re: Query.toString printing binary in the output...

Posted by Erick Erickson <er...@gmail.com>.
Hmmm, without looking at the code, somehow when you specify debug=query you
get readable results, maybe that code would be a place to start?

And are you looking for the parsed output? Otherwise you could print
original query.

Not much help....
Erick


On Fri, Mar 15, 2013 at 3:24 PM, Andrew Lundgren
<lu...@familysearch.org>wrote:

> We use the toString call on the query in our logs.  For some numeric
> types, the encoded form of the number is being printed instead of the
> readable form.
>
> This makes tail and some other tools very unhappy...
>
> Here is a partial example of a query.toString() that would have had binary
> in it.  As a short term work around I replaced all non-printable characters
> in the string with an '_'.
>
> (collection_id:`__z_[^0.027 collection_id:`__nB+^0.026
> collection_id:`__Zl_^0.025 collection_id:`__i49^0.024
> collection_id:`__Pq%^0.023 collection_id:`__VCS^0.022
> collection_id:`__WbH^0.021 collection_id:`__Yu_^0.02
> collection_id:`__UF&^0.019 collection_id:`__I2g^0.018
> collection_id:`__PP_^0.016999999 collection_id:`__Ysv^0.015999999
> collection_id:`__Oe_^0.014999999 collection_id:`__Ysw^0.013999999
> collection_id:`__Wi_^0.012999998 collection_id:`__fLi^0.011999998
> collection_id:`__XRk^0.010999998 collection_id:`__Uz[^0.009999998
> collection_id:`__SE_^0.008999998 collection_id:`__Ysx^0.007999998
> collection_id:`__Ysh^0.0069999974 collection_id:`__fLh^0.0059999973
> collection_id:`__f _^0.004999997 collection_id:`__`^C^0.003999997
> collection_id:`__fKM^0.002999997 collection_id:`__Szo^0.001999997
> collection_id:`__f ]^9.99997E-4)
>
> But, as you can see, that is less than useful...
>
> I spent some time looking at the source and found that Term does not
> contain the type of the embedded data.  Any possible solutions to this
> short of walking the query and getting the type of each field from the
> schema and creating my own print function?
>
> Thanks!
>
> --
> Andrew