Posted to solr-user@lucene.apache.org by Alok Dhir <ad...@symplicity.com> on 2008/11/17 18:36:25 UTC
Solr 1.3: bug in phps response writer
Distributed queries:
curl 'http://devxen0:8983/solr/core0/select?shards=search3:0,search3:8983/solr/core2&version=2.2&start=0&rows=10&q=instance%3Arit%5C-csm.symplicity.com+AND+label%3ALogin&wt=php'
curl 'http://devxen0:8983/solr/core0/select?shards=search3:0,search3:8983/solr/core2&version=2.2&start=0&rows=10&q=instance%3Arit%5C-csm.symplicity.com+AND+label%3ALogin&wt=xml'
curl 'http://devxen0:8983/solr/core0/select?shards=search3:0,search3:8983/solr/core2&version=2.2&start=0&rows=10&q=instance%3Arit%5C-csm.symplicity.com+AND+label%3ALogin&wt=json'
All work fine, providing identical results in their respective formats
(note the change in the wt param).
curl 'http://devxen0:8983/solr/core0/select?shards=search3:8983/solr/core0,search3:8983/solr/core2&version=2.2&start=0&rows=10&q=instance%3Arit%5C-csm.symplicity.com+AND+label%3ALogin&wt=phps'
fails with:
java.lang.IllegalArgumentException: Map size must not be negative
at org.apache.solr.request.PHPSerializedWriter.writeMapOpener(PHPSerializedResponseWriter.java:195)
at org.apache.solr.request.JSONWriter.writeSolrDocument(JSONResponseWriter.java:392)
at org.apache.solr.request.JSONWriter.writeSolrDocumentList(JSONResponseWriter.java:547)
at org.apache.solr.request.TextResponseWriter.writeVal(TextResponseWriter.java:147)
at org.apache.solr.request.JSONWriter.writeNamedListAsMapMangled(JSONResponseWriter.java:150)
at org.apache.solr.request.PHPSerializedWriter.writeNamedList(PHPSerializedResponseWriter.java:71)
at org.apache.solr.request.PHPSerializedWriter.writeResponse(PHPSerializedResponseWriter.java:66)
at org.apache.solr.request.PHPSerializedResponseWriter.write(PHPSerializedResponseWriter.java:47)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:257)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
at org.mortbay.jetty.Server.handle(Server.java:285)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
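One possible reading of the trace above (an assumption; nothing in this thread confirms the cause): PHP's serialized format writes the element count before the elements, as in a:2:{...}, while JSON needs no count at all. If the shared JSONWriter.writeSolrDocument code path hands the PHP writer a negative placeholder for an unknown size, writeMapOpener has no choice but to reject it. The Python sketch below only mimics that check. The names php_serialize_scalar and php_serialize_map are hypothetical and are not Solr code.

```python
def php_serialize_scalar(value):
    """Serialize an int or a string in PHP's serialize() notation."""
    if isinstance(value, int):
        return "i:%d;" % value
    text = str(value)
    # PHP records the byte length of the string, not the character count.
    byte_length = len(text.encode("utf-8"))
    return 's:%d:"%s";' % (byte_length, text)


def php_serialize_map(fields, size):
    """Serialize a dict as a PHP array: a:<count>:{<key><value>...}.

    The count is written first, so the caller must know it up front.
    A negative size (an "unknown" placeholder) cannot be represented.
    """
    if size < 0:
        raise ValueError("Map size must not be negative")
    body = "".join(
        php_serialize_scalar(key) + php_serialize_scalar(value)
        for key, value in fields.items()
    )
    return "a:%d:{%s}" % (size, body)


doc = {"id": "doc1", "n": 3}  # made-up document, for illustration only

# Works when the field count is known before the map is opened.
serialized = php_serialize_map(doc, len(doc))

# Fails when a JSON-style caller passes -1 because it never needed a count.
try:
    php_serialize_map(doc, -1)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

Under this reading, a patch would need to count the document's fields before opening the map rather than passing a placeholder.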
Questions:
1) Is this known? I didn't see it in the issue tracker.
2) What's the better course of action: a) download source, fix, submit
patch, wait for new release; b) drop phps and use json instead?
Thanks
Re: Solr 1.3: bug in phps response writer
Posted by Poohneat <po...@gmail.com>.
Hey Otis,
I don't think this issue has been solved yet. I am working with the Solr 1.3
release and I still get the same exception as in the original post.
I have the Solr 1.3 release with the localsolr jars.
Any advice is helpful ... for now I will use the json response writer to
work around this bug.
Thanks
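A minimal sketch of the workaround mentioned above: send the same distributed query with wt=json and parse the JSON body on the client. The host names, shard list and query string are taken from the original post. The helper names build_select_url and extract_docs and the sample response body are made up for illustration, and no request is actually sent.

```python
import json
from urllib.parse import urlencode


def build_select_url(core_url, shards, query, wt="json", start=0, rows=10):
    """Build a distributed /select URL; urlencode handles the escaping."""
    params = {
        "shards": ",".join(shards),
        "version": "2.2",
        "start": start,
        "rows": rows,
        "q": query,
        "wt": wt,
    }
    return core_url + "/select?" + urlencode(params)


def extract_docs(body):
    """Return the document list from a Solr JSON response body."""
    return json.loads(body)["response"]["docs"]


url = build_select_url(
    "http://devxen0:8983/solr/core0",
    ["search3:8983/solr/core0", "search3:8983/solr/core2"],
    "instance:rit\\-csm.symplicity.com AND label:Login",
)

# Made-up response body with the usual responseHeader/response layout.
sample_body = (
    '{"responseHeader": {"status": 0}, '
    '"response": {"numFound": 1, "start": 0, "docs": [{"label": "Login"}]}}'
)
docs = extract_docs(sample_body)
```

A PHP client would do the equivalent with json_decode instead of unserialize.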
--
take care
Otis Gospodnetic wrote:
>
> Hi Alok,
>
> I don't think it's a known issue and 2. a) sounds like the best and most
> appreciated approach! :)
>
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
Re: Processing of prx file for phrase queries: Whole position list for term read?
Posted by Erik Hatcher <er...@ehatchersolutions.com>.
Rather than attempt an answer to your questions directly, I'll mention
how other projects have dealt with the very-common-word issue. Nutch,
for example, has a list of high frequency terms and concatenates them
with the successive word in order to form less-frequent aggregate
terms. The original term is also indexed, but during querying in
phrases, the common terms are again concatenated, thus making querying
a lot faster.
I may not have explained it entirely accurately, but that's the gist.
Have a look at Nutch's Analyzer for more details.
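A rough sketch of the idea described above, not Nutch's actual Analyzer code: at index time every word is kept, and each common word is additionally joined with the word that follows it; in a phrase query, a common word and its successor are replaced by the joined term. The function names, the "_" separator and the one-word common list are assumptions made for illustration.

```python
def index_terms(words, common):
    """Index-time terms as (position, term) pairs.

    Every original word is kept. A common word also emits an aggregate
    term made of itself and the next word, at the same position.
    """
    terms = []
    for pos, word in enumerate(words):
        terms.append((pos, word))
        if word in common and pos + 1 < len(words):
            terms.append((pos, word + "_" + words[pos + 1]))
    return terms


def phrase_terms(words, common):
    """Query-time terms for a phrase, as (position, term) pairs.

    A common word followed by another word is replaced by the rarer
    aggregate term, which also covers the following word.
    """
    terms = []
    pos = 0
    while pos < len(words):
        word = words[pos]
        if word in common and pos + 1 < len(words):
            terms.append((pos, word + "_" + words[pos + 1]))
            pos += 2
        else:
            terms.append((pos, word))
            pos += 1
    return terms


common_words = {"the"}  # hypothetical high-frequency list
phrase = ["the", "new", "economics"]

indexed = index_terms(phrase, common_words)
query = phrase_terms(phrase, common_words)
```

With this sketch the phrase query never touches the position list of "the" itself, only that of the much rarer "the_new".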
Erik
On Nov 18, 2008, at 4:00 PM, Burton-West, Tom wrote:
> Hello,
>
> We are working with a very large index and with large documents (300+
> page books). It appears that the bottleneck on our system is the disk
> IO involved in reading position information from the prx file for
> commonly occurring terms.
>
> An example slow query is "the new economics".
>
> To process the above phrase query for the word "the", does the entire
> part of the .prx file for the word "the" need to be read into memory, or
> only the fragments of the entries for the word "the" that contain
> specific doc ids?
>
> In reading the Lucene index file formats document
> (http://lucene.apache.org/java/2_4_0/fileformats.html) it's not clear
> whether the .tis file stores a pointer into the .prx file for a term
> (and therefore the entire list of doc_ids and positions for that term
> needs to be read into memory), or if the .tis file stores a pointer to
> the term **and doc id** in the prx file, in which case only the
> positions for a given doc id would need to be read. Or if somehow the
> .frq file has information on where to find the doc id in the .prx file.
>
> The documentation for the .tis file says that it stores ProxDelta, which
> is based on the term (rather than the term/doc id). On the other hand,
> the documentation for the .prx file states that Positions entries are
> "ordered by increasing document number (the document number is implicit
> from the .frq file)".
>
>
> Tom
Processing of prx file for phrase queries: Whole position list for term read?
Posted by "Burton-West, Tom" <tb...@umich.edu>.
Hello,
We are working with a very large index and with large documents (300+
page books). It appears that the bottleneck on our system is the disk
IO involved in reading position information from the prx file for
commonly occurring terms.
An example slow query is "the new economics".
To process the above phrase query for the word "the", does the entire
part of the .prx file for the word "the" need to be read into memory, or
only the fragments of the entries for the word "the" that contain
specific doc ids?
In reading the Lucene index file formats document
(http://lucene.apache.org/java/2_4_0/fileformats.html) it's not clear
whether the .tis file stores a pointer into the .prx file for a term
(and therefore the entire list of doc_ids and positions for that term
needs to be read into memory), or if the .tis file stores a pointer to
the term **and doc id** in the prx file, in which case only the
positions for a given doc id would need to be read. Or if somehow the
.frq file has information on where to find the doc id in the .prx file.
The documentation for the .tis file says that it stores ProxDelta, which
is based on the term (rather than the term/doc id). On the other hand,
the documentation for the .prx file states that Positions entries are
"ordered by increasing document number (the document number is implicit
from the .frq file)".
Tom
Re: Solr 1.3: bug in phps response writer
Posted by James liu <li...@gmail.com>.
I notice that the failing URL is not the same as the others.
--
regards
j.L
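One way to check the observation above is to compare the query parameters of a working URL and the failing one. This is a small sketch using only the Python standard library; the helper name differing_params is made up, and the two URLs are copied from the original post.

```python
from urllib.parse import parse_qs, urlsplit


def differing_params(url_a, url_b):
    """Return {name: (values_in_a, values_in_b)} for parameters that differ."""
    params_a = parse_qs(urlsplit(url_a).query)
    params_b = parse_qs(urlsplit(url_b).query)
    names = sorted(set(params_a) | set(params_b))
    return {
        name: (params_a.get(name), params_b.get(name))
        for name in names
        if params_a.get(name) != params_b.get(name)
    }


working = (
    "http://devxen0:8983/solr/core0/select?shards=search3:0,search3:8983/solr/core2"
    "&version=2.2&start=0&rows=10"
    "&q=instance%3Arit%5C-csm.symplicity.com+AND+label%3ALogin&wt=php"
)
failing = (
    "http://devxen0:8983/solr/core0/select?shards=search3:8983/solr/core0,"
    "search3:8983/solr/core2&version=2.2&start=0&rows=10"
    "&q=instance%3Arit%5C-csm.symplicity.com+AND+label%3ALogin&wt=phps"
)

diff = differing_params(working, failing)
for name, (before, after) in diff.items():
    print(name, before, after)
```

As posted, the two URLs differ in the shards value as well as in wt, so the comparison between writers is not like for like.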
Re: Solr 1.3: bug in phps response writer
Posted by Otis Gospodnetic <ot...@yahoo.com>.
Hi Alok,
I don't think it's a known issue and 2. a) sounds like the best and most appreciated approach! :)
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
________________________________
From: Alok Dhir <ad...@symplicity.com>
To: solr-user@lucene.apache.org
Sent: Monday, November 17, 2008 12:36:25 PM
Subject: sole 1.3: bug in phps response writer
Distributed queries:
curl 'http://devxen0:8983/solr/core0/select?shards=search3:0,search3:8983/solr/core2&version=2.2&start=0&rows=10&q=instance%3Arit%5C-csm.symplicity.com+AND+label%3ALogin&wt=php'
curl 'http://devxen0:8983/solr/core0/select?shards=search3:0,search3:8983/solr/core2&version=2.2&start=0&rows=10&q=instance%3Arit%5C-csm.symplicity.com+AND+label%3ALogin&wt=xml'
curl 'http://devxen0:8983/solr/core0/select?shards=search3:0,search3:8983/solr/core2&version=2.2&start=0&rows=10&q=instance%3Arit%5C-csm.symplicity.com+AND+label%3ALogin&wt=json'
All work fine, providing identical results in their respective formats (note the change in the wt param).
curl 'http://devxen0:8983/solr/core0/select?shards=search3:8983/solr/core0,search3:8983/solr/core2&version=2.2&start=0&rows=10&q=instance%3Arit%5C-csm.symplicity.com+AND+label%3ALogin&wt=phps'
fails with:
java.lang.IllegalArgumentException: Map size must not be negative
at org.apache.solr.request.PHPSerializedWriter.writeMapOpener(PHPSerializedResponseWriter.java:195)
at org.apache.solr.request.JSONWriter.writeSolrDocument(JSONResponseWriter.java:392)
at org.apache.solr.request.JSONWriter.writeSolrDocumentList(JSONResponseWriter.java:547)
at org.apache.solr.request.TextResponseWriter.writeVal(TextResponseWriter.java:147)
at org.apache.solr.request.JSONWriter.writeNamedListAsMapMangled(JSONResponseWriter.java:150)
at org.apache.solr.request.PHPSerializedWriter.writeNamedList(PHPSerializedResponseWriter.java:71)
at org.apache.solr.request.PHPSerializedWriter.writeResponse(PHPSerializedResponseWriter.java:66)
at org.apache.solr.request.PHPSerializedResponseWriter.write(PHPSerializedResponseWriter.java:47)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:257)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
at org.mortbay.jetty.Server.handle(Server.java:285)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
Questions:
1) Is this known? I didn't see it in the issue tracker.
2) What's the better course of action: a) download source, fix, submit patch, wait for new release; b) drop phps and use json instead?
Thanks