Posted to solr-user@lucene.apache.org by Alok Dhir <ad...@symplicity.com> on 2008/11/17 18:36:25 UTC

Solr 1.3: bug in phps response writer

Distributed queries:

curl 'http://devxen0:8983/solr/core0/select?shards=search3:0,search3:8983/solr/core2&version=2.2&start=0&rows=10&q=instance%3Arit%5C-csm.symplicity.com+AND+label%3ALogin&wt=php'

curl 'http://devxen0:8983/solr/core0/select?shards=search3:0,search3:8983/solr/core2&version=2.2&start=0&rows=10&q=instance%3Arit%5C-csm.symplicity.com+AND+label%3ALogin&wt=xml'

curl 'http://devxen0:8983/solr/core0/select?shards=search3:0,search3:8983/solr/core2&version=2.2&start=0&rows=10&q=instance%3Arit%5C-csm.symplicity.com+AND+label%3ALogin&wt=json'

All work fine, providing identical results in their respective formats  
(note the change in the wt param).
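
To compare response writers fairly, it can help to generate the requests from one template so that only wt changes. The following is a minimal sketch, not part of the original report: the function name select_url is hypothetical, and the host, core, and query values are simply copied from the commands above.

```python
from urllib.parse import urlencode

# Host and core taken from the curl commands in this post.
BASE = "http://devxen0:8983/solr/core0/select"

def select_url(wt, shards):
    """Build a distributed select URL; only wt and shards vary between calls."""
    params = {
        "shards": shards,
        "version": "2.2",
        "start": 0,
        "rows": 10,
        "q": r"instance:rit\-csm.symplicity.com AND label:Login",
        "wt": wt,
    }
    # safe=":/," leaves the shard list readable; the backslash and the
    # spaces in q are still encoded (as %5C and +).
    return BASE + "?" + urlencode(params, safe=":/,")

if __name__ == "__main__":
    shards = "search3:8983/solr/core0,search3:8983/solr/core2"
    for wt in ("php", "xml", "json", "phps"):
        print("curl '%s'" % select_url(wt, shards))
```

Passing the same shards string for every wt value rules out the shard list as the cause of any difference between the writers.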

curl 'http://devxen0:8983/solr/core0/select?shards=search3:8983/solr/core0,search3:8983/solr/core2&version=2.2&start=0&rows=10&q=instance%3Arit%5C-csm.symplicity.com+AND+label%3ALogin&wt=phps'

fails with:

java.lang.IllegalArgumentException: Map size must not be negative
    at org.apache.solr.request.PHPSerializedWriter.writeMapOpener(PHPSerializedResponseWriter.java:195)
    at org.apache.solr.request.JSONWriter.writeSolrDocument(JSONResponseWriter.java:392)
    at org.apache.solr.request.JSONWriter.writeSolrDocumentList(JSONResponseWriter.java:547)
    at org.apache.solr.request.TextResponseWriter.writeVal(TextResponseWriter.java:147)
    at org.apache.solr.request.JSONWriter.writeNamedListAsMapMangled(JSONResponseWriter.java:150)
    at org.apache.solr.request.PHPSerializedWriter.writeNamedList(PHPSerializedResponseWriter.java:71)
    at org.apache.solr.request.PHPSerializedWriter.writeResponse(PHPSerializedResponseWriter.java:66)
    at org.apache.solr.request.PHPSerializedResponseWriter.write(PHPSerializedResponseWriter.java:47)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:257)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
    at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
    at org.mortbay.jetty.Server.handle(Server.java:285)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
    at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
    at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
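
The top two frames show JSONWriter.writeSolrDocument calling PHPSerializedWriter.writeMapOpener, which rejects the size it was given. PHP's serialized format prefixes every array with its element count (a:<count>:{...}), so a writer needs the real count before it emits the first entry. The sketch below illustrates that constraint only. It is not Solr's code, the function names are hypothetical, and that the JSON code path passes a placeholder size is a guess from the trace.

```python
def write_map_opener(size):
    # A serialized PHP array starts with its element count, so the count
    # has to be known, and non-negative, before any entry is written.
    if size < 0:
        raise ValueError("Map size must not be negative")
    return "a:%d:{" % size

def php_serialize(value):
    """Serialize bools, ints, strings, and dicts in PHP's serialize() format."""
    if isinstance(value, bool):
        return "b:%d;" % int(value)
    if isinstance(value, int):
        return "i:%d;" % value
    if isinstance(value, str):
        # The length prefix counts bytes, not characters.
        return 's:%d:"%s";' % (len(value.encode("utf-8")), value)
    if isinstance(value, dict):
        body = "".join(php_serialize(k) + php_serialize(v)
                       for k, v in value.items())
        return write_map_opener(len(value)) + body + "}"
    raise TypeError("unsupported type: %s" % type(value).__name__)

if __name__ == "__main__":
    print(php_serialize({"id": 7, "label": "Login"}))
```

A format such as JSON only needs an opening brace, which is why a caller written for JSON could get away with a dummy size while a phps writer cannot.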

Questions:

1) Is this known?  I didn't see it in the issue tracker.

2) What's the better course of action: a) download source, fix, submit  
patch, wait for new release; b) drop phps and use json instead?

Thanks



Re: Solr 1.3: bug in phps response writer

Posted by Poohneat <po...@gmail.com>.
Hey Otis, 
I don't think this issue has been solved yet. I am working with the Solr 1.3
release and still get the same exception as in the original post.
I have the Solr 1.3 release with the localsolr jars.

Any advice is helpful. For now I will use the json response writer to
work around this bug.
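
For reference, a minimal sketch of consuming a wt=json response instead of a phps one. The sample body is illustrative only (its field values are made up), and the helper name docs_from_json is hypothetical. The responseHeader/response/docs layout is the standard shape of Solr's JSON output.

```python
import json

# Illustrative response body only; the values are invented for this sketch.
SAMPLE = """{"responseHeader": {"status": 0, "QTime": 12},
 "response": {"numFound": 1, "start": 0,
              "docs": [{"instance": "rit-csm.symplicity.com",
                        "label": "Login"}]}}"""

def docs_from_json(body):
    """Return the list of documents from a wt=json response body."""
    return json.loads(body)["response"]["docs"]

if __name__ == "__main__":
    for doc in docs_from_json(SAMPLE):
        print(doc["instance"], doc["label"])
```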

Thanks 
--
take care


Otis Gospodnetic wrote:
> 
> Hi Alok,
> 
> I don't think it's a known issue and 2. a) sounds like the best and most
> appreciated approach! :)
> 
> 
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
> 
> 
> 



Re: Processing of prx file for phrase queries: Whole position list for term read?

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
Rather than attempt an answer to your questions directly, I'll mention  
how other projects have dealt with the very-common-word issue.  Nutch,  
for example, has a list of high frequency terms and concatenates them  
with the successive word in order to form less-frequent aggregate  
terms.  The original term is also indexed, but during querying in  
phrases, the common terms are again concatenated, thus making querying  
a lot faster.

I may not have explained it entirely accurately, but that's the gist.   
Have a look at Nutch's Analyzer for more details.
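
A rough sketch of the idea described above, under stated assumptions: this is not Nutch's code, the common-word list and the "_" separator are invented for illustration, and the function names are hypothetical. At index time each common word is kept and also emitted joined with its successor at the same position. At query time a phrase uses the joined term in place of the common word.

```python
COMMON = {"the", "of", "a"}  # hypothetical high-frequency term list

def gram(first, second):
    return first + "_" + second

def index_tokens(words, common=COMMON):
    """(position, token) pairs: every word, plus a gram for each common
    word that has a successor."""
    out = []
    for pos, word in enumerate(words):
        out.append((pos, word))
        if word in common and pos + 1 < len(words):
            out.append((pos, gram(word, words[pos + 1])))
    return out

def phrase_tokens(words, common=COMMON):
    """Tokens for a phrase query: a common word is replaced by its gram."""
    out = []
    for pos, word in enumerate(words):
        if word in common and pos + 1 < len(words):
            out.append(gram(word, words[pos + 1]))
        else:
            out.append(word)
    return out

if __name__ == "__main__":
    print(index_tokens(["the", "new", "economics"]))
    print(phrase_tokens(["the", "new", "economics"]))
```

The phrase query then looks up "the_new" rather than "the", so the long position list for "the" is never touched.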

	Erik


On Nov 18, 2008, at 4:00 PM, Burton-West, Tom wrote:

> Hello,
>
> We are working with a very large index and with large documents (300+
> page books).  It appears that the bottleneck on our system is the disk
> I/O involved in reading position information from the .prx file for
> commonly occurring terms.
>
> An example slow query is "the new economics".
>
> To process the above phrase query for the word "the", does the entire
> part of the .prx file for the word "the" need to be read into memory, or
> only the fragments of the entries for the word "the" that contain
> specific doc ids?
>
> In reading the Lucene index file formats document
> (http://lucene.apache.org/java/2_4_0/fileformats.html) it's not clear
> whether the .tis file stores a pointer into the .prx file for a term
> (and therefore the entire list of doc ids and positions for that term
> needs to be read into memory), or whether the .tis file stores a pointer
> to the term **and doc id** in the .prx file, in which case only the
> positions for a given doc id would need to be read, or whether the
> .frq file somehow has information on where to find the doc id in the
> .prx file.
>
> The documentation for the .tis file says that it stores ProxDelta, which
> is based on the term (rather than the term/doc id).  On the other hand,
> the documentation for the .prx file states that Positions entries are
> "ordered by increasing document number (the document number is implicit
> from the .frq file)".
>
> Tom


Processing of prx file for phrase queries: Whole position list for term read?

Posted by "Burton-West, Tom" <tb...@umich.edu>.
Hello,

We are working with a very large index and with large documents (300+
page books).  It appears that the bottleneck on our system is the disk
I/O involved in reading position information from the .prx file for
commonly occurring terms.

An example slow query is "the new economics".

To process the above phrase query for the word "the", does the entire
part of the .prx file for the word "the" need to be read into memory, or
only the fragments of the entries for the word "the" that contain
specific doc ids?

In reading the Lucene index file formats document
(http://lucene.apache.org/java/2_4_0/fileformats.html) it's not clear
whether the .tis file stores a pointer into the .prx file for a term
(and therefore the entire list of doc ids and positions for that term
needs to be read into memory), or whether the .tis file stores a pointer
to the term **and doc id** in the .prx file, in which case only the
positions for a given doc id would need to be read, or whether the
.frq file somehow has information on where to find the doc id in the
.prx file.


The documentation for the .tis file says that it stores ProxDelta, which
is based on the term (rather than the term/doc id).  On the other hand,
the documentation for the .prx file states that Positions entries are
"ordered by increasing document number (the document number is implicit
from the .frq file)".
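
To make the question concrete, here is a toy model of just the two statements quoted from the documentation: one pointer per term, and positions grouped by document in .frq order. It is not Lucene's reader. It uses plain lists with absolute positions instead of delta-encoded VInts, it ignores any skip data, and the name positions_for is hypothetical.

```python
def positions_for(doc_id, frq, prx):
    """Return the positions of one term in one document.

    frq: list of (doc_id, freq) pairs for the term, in increasing doc order.
    prx: flat list of the term's positions, grouped by document in the
         same order as frq.
    """
    offset = 0
    for doc, freq in frq:
        if doc == doc_id:
            return prx[offset:offset + freq]
        # The document number is implicit, so positions belonging to
        # earlier documents have to be stepped over one document at a time.
        offset += freq
    return []

if __name__ == "__main__":
    frq = [(3, 2), (7, 1), (9, 3)]
    prx = [1, 5, 4, 0, 2, 8]
    print(positions_for(9, frq, prx))
```

In this simplified model, reaching a late document means walking past every earlier entry for the term. Whether the real format has to do the same is exactly what the question above asks.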


Tom


Re: Solr 1.3: bug in phps response writer

Posted by James liu <li...@gmail.com>.
I see that the URL is not the same as the others.
-- 
regards
j.L

Re: Solr 1.3: bug in phps response writer

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Hi Alok,

I don't think it's a known issue and 2. a) sounds like the best and most appreciated approach! :)


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



