Posted to solr-user@lucene.apache.org by Doug Steigerwald <ds...@mcclatchyinteractive.com> on 2008/02/20 18:31:47 UTC
YAML update request handler
A few months back I wrote a YAML update request handler to see if we could post documents faster
than with XML. We did see some small speed improvements (I didn't write down the numbers), but the
hacked-together code was probably slowing it down as well. I'm not sure whether there are faster YAML
libraries out there either.
We're not actually using it, since it was just a small proof-of-concept project, but is this
anything people might be interested in?
--
Doug Steigerwald
Re: YAML update request handler
Posted by Noble Paul നോബിള് नोब्ळ् <no...@gmail.com>.
Without breaking the existing stuff, we can add another interface:
public interface BinaryQueryResponse extends QueryResponseWriter {
  public void write(OutputStream out, SolrQueryRequest request,
      SolrQueryResponse response) throws IOException;
}
and in the SolrDispatchFilter do something like this:
QueryResponseWriter responseWriter = core.getQueryResponseWriter(solrReq);
if (responseWriter instanceof BinaryQueryResponse) {
  // Cast to the binary interface so the OutputStream overload is visible.
  BinaryQueryResponse binaryResp = (BinaryQueryResponse) responseWriter;
  binaryResp.write(response.getOutputStream(), solrReq, solrRsp);
} else {
  responseWriter.write(response.getWriter(), solrReq, solrRsp);
}
--Noble
On Fri, Feb 22, 2008 at 8:05 PM, Grant Ingersoll <gs...@apache.org> wrote:
> The DispatchFilter could probably be modified to have the option of
> using the ServletOutputStream instead of the Writer. It would take
> some doing to maintain the proper compatibility, but it can be done, I
> think. Maybe we could have a /binary path or something along those
> lines and SolrJ could use that. QueryResponseWriter could be extended
> to have a write method that takes an OutputStream. Caveat: I
> haven't fully investigated this, but I do believe it makes sense for
> SolrJ to use a binary format by default. The other thing it should do
> is make sure, when sending/receiving XML, that the XML is as "tight"
> as possible, i.e. minimal whitespace, etc.
>
> Just thinking out loud,
> Grant
--
--Noble Paul
Re: YAML update request handler
Posted by Grant Ingersoll <gs...@apache.org>.
The DispatchFilter could probably be modified to have the option of
using the ServletOutputStream instead of the Writer. It would take
some doing to maintain the proper compatibility, but it can be done, I
think. Maybe we could have a /binary path or something along those
lines and SolrJ could use that. QueryResponseWriter could be extended
to have a write method that takes an OutputStream. Caveat: I
haven't fully investigated this, but I do believe it makes sense for
SolrJ to use a binary format by default. The other thing it should do
is make sure, when sending/receiving XML, that the XML is as "tight"
as possible, i.e. minimal whitespace, etc.
Just thinking out loud,
Grant
On Feb 22, 2008, at 8:29 AM, Noble Paul നോബിള്
नोब्ळ् wrote:
> The API forbids use of any non-text format.
>
> The QueryResponseWriter's write() method can take only a Writer. So we
> cannot write any binary stream into that.
>
> --Noble
--------------------------
Grant Ingersoll
http://www.lucenebootcamp.com
Next Training: April 7, 2008 at ApacheCon Europe in Amsterdam
Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ
Re: YAML update request handler
Posted by Noble Paul നോബിള് नोब्ळ् <no...@gmail.com>.
The API forbids the use of any non-text format.
The QueryResponseWriter's write() method accepts only a Writer, so we
cannot write a binary stream to it.
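To illustrate the constraint, here is a minimal Java sketch of what happens when raw bytes are pushed through a character Writer instead of an OutputStream. The class and method names are hypothetical and UTF-8 is assumed as the response encoding; none of this is Solr code. Bytes above 0x7F get re-encoded on the way out, so the payload no longer round-trips.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Solr.
class WriterVsStream {
    // Sends each byte through a character Writer, the only sink a Writer-based API offers.
    static byte[] viaWriter(byte[] payload) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(sink, StandardCharsets.UTF_8)) {
            for (byte b : payload) {
                w.write(b & 0xFF); // the byte becomes a char and is then re-encoded as UTF-8
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    // Sends the same bytes straight to an OutputStream, untouched.
    static byte[] viaStream(byte[] payload) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        sink.write(payload, 0, payload.length);
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        byte[] payload = {0x01, (byte) 0x80, (byte) 0xFF};
        System.out.println(viaWriter(payload).length); // 5: each high byte grew to two bytes
        System.out.println(viaStream(payload).length); // 3: unchanged
    }
}
```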
--Noble
On Fri, Feb 22, 2008 at 12:30 AM, Walter Underwood
<wu...@netflix.com> wrote:
> Python marshal format is worth a try. It is binary and can represent
> the same data as JSON. It should be a good fit to Solr.
>
> We benchmarked that against XML several years ago and it was 2X faster.
> Of course, XML parsers are a lot faster now.
>
> wunder
--
--Noble Paul
Re: YAML update request handler
Posted by Grant Ingersoll <gs...@apache.org>.
See https://issues.apache.org/jira/browse/SOLR-476
On Feb 22, 2008, at 5:17 AM, Noble Paul നോബിള്
नोब्ळ् wrote:
> The SolrJ client is designed with the ResponseParser as an abstract
> class (which is good), but I have no means to plug in my custom
> ResponseParser class.
> Add a setter method, setResponseParser(ResponseParser parser),
> and lazily initialize the ResponseParser with
> if (_processor == null) _processor = new XMLResponseParser();
>
> at the beginning of the request method.
>
> While it is a good idea to use Commons HttpClient, it is a huge ball
> and chain to put those extra jars (commons-httpclient,
> commons-logging, commons-codec) in my simple client application. It
> is too much to ask of a client API which is just supposed to parse an
> XML response.
>
> If HttpClient is not available, we must be able to fall back to new
> URL().openConnection();
>
> --Noble
Re: YAML update request handler
Posted by Noble Paul നോബിള് नोब्ळ् <no...@gmail.com>.
The SolrJ client is designed with the ResponseParser as an abstract
class (which is good), but I have no means to plug in my custom
ResponseParser class.
Add a setter method, setResponseParser(ResponseParser parser),
and lazily initialize the ResponseParser with
if (_processor == null) _processor = new XMLResponseParser();
at the beginning of the request method.
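A rough Java sketch of that setter plus lazy default might look like the following. The ResponseParser and XMLResponseParser classes here are simplified stand-ins with the same names as the SolrJ classes, not the real ones; SketchClient and the name() method are hypothetical and exist only so the example can show which parser was used.

```java
// Stand-in for the abstract parser type; name() is a hypothetical method for this sketch.
abstract class ResponseParser {
    abstract String name();
}

// Stand-in for the default XML parser.
class XMLResponseParser extends ResponseParser {
    @Override
    String name() {
        return "xml";
    }
}

// Hypothetical client showing the proposed setter and lazy initialization.
class SketchClient {
    private ResponseParser _processor; // stays null until a parser is set or first needed

    public void setResponseParser(ResponseParser parser) {
        _processor = parser;
    }

    public String request() {
        // Fall back to the XML parser only if nobody plugged in a custom one.
        if (_processor == null) _processor = new XMLResponseParser();
        return _processor.name();
    }
}
```

With this shape, existing callers that never call the setter keep getting the XML parser, while a caller with a custom format can swap it in before the first request.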
While it is a good idea to use Commons HttpClient, it is a huge ball
and chain to put those extra jars (commons-httpclient,
commons-logging, commons-codec) in my simple client application. It
is too much to ask of a client API which is just supposed to parse an
XML response.
If HttpClient is not available, we must be able to fall back to new
URL().openConnection();
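One way such a fallback could be detected, as a sketch only: probe the classpath for the Commons HttpClient 3.x class and use the JDK's own URL connection when it is absent. TransportChooser and its methods are hypothetical names, and the sketch assumes plain http(s) URLs.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper, not part of SolrJ.
class TransportChooser {
    // Returns true if the named class can be found without initializing it.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, TransportChooser.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Picks a transport label based on whether Commons HttpClient 3.x is present.
    static String chooseTransport() {
        return isOnClasspath("org.apache.commons.httpclient.HttpClient")
                ? "commons-httpclient"
                : "java.net.URL";
    }

    // JDK-only fallback; assumes an http or https URL, otherwise the cast fails.
    static InputStream openWithJdk(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("GET");
        return conn.getInputStream();
    }
}
```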
--Noble
On Fri, Feb 22, 2008 at 9:46 AM, Noble Paul നോബിള് नोब्ळ्
<no...@gmail.com> wrote:
> For the case where we use SolrJ (we control both ends), it is best to resort to a custom binary format. It works fastest and with the least cost/bandwidth. We can use a custom, lightweight object serialization/deserialization mechanism (standard Java serialization is verbose).
>
> I can create a patch for this if you think it is useful.
>
> --Noble
--
--Noble Paul
Re: YAML update request handler
Posted by Noble Paul നോബിള് नोब्ळ् <no...@gmail.com>.
For the case where we use SolrJ (we control both ends), it is best to resort
to a custom binary format. It works fastest and with the least cost/bandwidth.
We can use a custom, lightweight object serialization/deserialization mechanism
(standard Java serialization is verbose).
I can create a patch for this if you think it is
useful.
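As a rough illustration of the verbosity point, the sketch below writes the same tiny object once with standard Java serialization and once with a hand-rolled DataOutputStream layout, then compares sizes. The Doc class and the field layout are hypothetical and are not the format proposed for Solr.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical demo class, not part of Solr or SolrJ.
class SerializationSize {
    static class Doc implements Serializable {
        private static final long serialVersionUID = 1L;
        final int id;
        final String title;

        Doc(int id, String title) {
            this.id = id;
            this.title = title;
        }
    }

    // Standard Java serialization: includes a stream header and a class descriptor.
    static int standardSize(Doc doc) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(doc);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    // Hand-rolled layout: a 4-byte int followed by a length-prefixed string.
    static int customSize(Doc doc) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(doc.id);
            out.writeUTF(doc.title);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        Doc doc = new Doc(1, "solr");
        System.out.println("standard: " + standardSize(doc) + " bytes");
        // 4 bytes (int) + 2 bytes (string length) + 4 bytes ("solr") = 10
        System.out.println("custom:   " + customSize(doc) + " bytes");
    }
}
```

The trade-off is that the hand-rolled layout carries no type information, so both ends have to agree on the field order, which is only practical when we control both ends.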
--Noble
On Fri, Feb 22, 2008 at 12:20 AM, Grant Ingersoll <gs...@apache.org>
wrote:
> XML can be a problem when it is really lengthy (lots of results, large
> results) such that a binary format could be useful in certain cases
> where we control both ends of the pipe (i.e. SolrJ.) I've seen apps
> that deal with really large files wrapped in XML where the XML parsing
> takes a significant amount of time as compared to a more compact
> binary format.
>
> I think it at least warrants profiling/testing.
>
> -Grant
--
--Noble Paul
Re: YAML update request handler
Posted by Walter Underwood <wu...@netflix.com>.
Python marshal format is worth a try. It is binary and can represent
the same data as JSON. It should be a good fit to Solr.
We benchmarked that against XML several years ago and it was 2X faster.
Of course, XML parsers are a lot faster now.
wunder
On 2/21/08 10:50 AM, "Grant Ingersoll" <gs...@apache.org> wrote:
> XML can be a problem when it is really lengthy (lots of results, large
> results) such that a binary format could be useful in certain cases
> where we control both ends of the pipe (i.e. SolrJ.) I've seen apps
> that deal with really large files wrapped in XML where the XML parsing
> takes a significant amount of time as compared to a more compact
> binary format.
>
> I think it at least warrants profiling/testing.
>
> -Grant
Re: YAML update request handler
Posted by Grant Ingersoll <gs...@apache.org>.
XML can be a problem when it is really lengthy (lots of results, large
results) such that a binary format could be useful in certain cases
where we control both ends of the pipe (i.e. SolrJ.) I've seen apps
that deal with really large files wrapped in XML where the XML parsing
takes a significant amount of time as compared to a more compact
binary format.
I think it at least warrants profiling/testing.
-Grant
On Feb 21, 2008, at 12:07 PM, Noble Paul നോബിള്
नोब्ळ् wrote:
> hi,
> The format over the wire is not of great significance because it gets
> unmarshalled into the corresponding language object as soon as it
> comes out
> of the wire. I would say XML/JSON should meet 99% of the requirements
> because all the platforms come with an unmarshaller for both of these.
>
> But if it can offer a good performance improvement, it is worth trying.
> --Noble
Re: YAML update request handler
Posted by Noble Paul നോബിള് नोब्ळ् <no...@gmail.com>.
Hi,
The format over the wire is not of great significance, because it gets
unmarshalled into the corresponding language object as soon as it comes
off the wire. I would say XML/JSON should meet 99% of the requirements,
because all the platforms come with an unmarshaller for both of these.
But if it can offer a good performance improvement, it is worth trying.
--Noble
On Thu, Feb 21, 2008 at 3:41 AM, alexander lind <ma...@webstay.org> wrote:
> Out of simple preference I would love to see a YAML request handler
> just because I like the YAML format. If it's also faster than XML, then
> all the better.
>
> Cheers
> Alec
>
--
--Noble Paul
Re: YAML update request handler
Posted by alexander lind <ma...@webstay.org>.
On Feb 20, 2008, at 9:31 AM, Doug Steigerwald wrote:
> A few months back I wrote a YAML update request handler to see if we
> could post documents faster than with XML. We did see some small
> speed improvements (didn't write down the numbers), but the hacked
> together code was probably making it slower as well. Not sure if
> there are faster YAML libraries out there either.
>
> We're not actually using it, since it was just a small proof of
> concept type of project, but is this anything people might be
> interested in?
>
Out of simple preference I would love to see a YAML request handler
just because I like the YAML format. If it's also faster than XML, then
all the better.
Cheers
Alec