Posted to solr-dev@lucene.apache.org by "Yonik Seeley (JIRA)" <ji...@apache.org> on 2007/10/10 20:11:50 UTC

[jira] Created: (SOLR-377) speed increase for writers

speed increase for writers
--------------------------

                 Key: SOLR-377
                 URL: https://issues.apache.org/jira/browse/SOLR-377
             Project: Solr
          Issue Type: Improvement
            Reporter: Yonik Seeley


When Solr is writing the response of large cached documents, the bottleneck is string encoding.
A buffered writer implementation that doesn't do any synchronization could offer some good speedups.
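
As a rough illustration of the idea above (not the code from the attached patch), a buffered writer without synchronization might look like the following sketch. The class name UnsyncBufferedWriter and the roundTrip helper are hypothetical names chosen for this example. Unlike java.io.BufferedWriter, nothing here takes a lock, so an instance must only be used from one thread.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

/** Hypothetical sketch: a buffered Writer that does no locking of its own. */
final class UnsyncBufferedWriter extends Writer {
    private final Writer sink;
    private final char[] buf;
    private int pos; // number of chars currently held in buf

    UnsyncBufferedWriter(Writer sink, int size) {
        if (size < 1) throw new IllegalArgumentException("size must be >= 1");
        this.sink = sink;
        this.buf = new char[size];
    }

    @Override
    public void write(int c) throws IOException {
        if (pos == buf.length) flushBuffer();
        buf[pos++] = (char) c;
    }

    @Override
    public void write(char[] cbuf, int off, int len) throws IOException {
        if (len >= buf.length) {
            // Large writes bypass the buffer after draining what is pending.
            flushBuffer();
            sink.write(cbuf, off, len);
            return;
        }
        if (len > buf.length - pos) flushBuffer();
        System.arraycopy(cbuf, off, buf, pos, len);
        pos += len;
    }

    @Override
    public void write(String s, int off, int len) throws IOException {
        // Copy from the String into the buffer in chunks, with no temporary array.
        while (len > 0) {
            if (pos == buf.length) flushBuffer();
            int n = Math.min(len, buf.length - pos);
            s.getChars(off, off + n, buf, pos);
            pos += n;
            off += n;
            len -= n;
        }
    }

    /** Pushes buffered chars to the sink without flushing the sink itself. */
    private void flushBuffer() throws IOException {
        if (pos > 0) {
            sink.write(buf, 0, pos);
            pos = 0;
        }
    }

    @Override
    public void flush() throws IOException {
        flushBuffer();
        sink.flush();
    }

    @Override
    public void close() throws IOException {
        flush();
        sink.close();
    }

    /** Demo helper: writes text via the String, char[] and single-char paths. */
    static String roundTrip(String text, int bufSize) {
        StringWriter out = new StringWriter();
        try (Writer w = new UnsyncBufferedWriter(out, bufSize)) {
            w.write(text);
            w.write(text.toCharArray());
            w.write('!');
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }
}
```

Calling UnsyncBufferedWriter.roundTrip("ab", 1) pushes every character across a buffer boundary and should still return the text twice followed by "!".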


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-377) speed increase for writers

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12536343 ] 

Yonik Seeley commented on SOLR-377:
-----------------------------------

FYI, I haven't been able to reproduce any problems along these lines using the Jetty version that's bundled (and I set the FastWriter buffer size artificially low to exercise the boundary handling).
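
A check along these lines could be sketched as below. This is only an illustration: it uses java.io.BufferedWriter as a stand-in because the FastWriter API is not shown in this thread, and the class and method names are made up for the example. It writes a string containing chars > 127 in small slices through deliberately tiny buffers and compares the encoded bytes with the expected UTF-8.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical sketch: exercise buffer boundaries with very small buffer sizes. */
final class BoundarySketch {
    /** Writes s in 3-char slices through a buffer of bufSize chars and checks the bytes. */
    static boolean roundTrips(String s, int bufSize) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer w = new BufferedWriter(
                new OutputStreamWriter(bytes, StandardCharsets.UTF_8), bufSize)) {
            for (int off = 0; off < s.length(); off += 3) {
                int len = Math.min(3, s.length() - off);
                w.write(s, off, len); // the write(String, off, len) path
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Arrays.equals(bytes.toByteArray(), s.getBytes(StandardCharsets.UTF_8));
    }

    /** Repeats the check for every buffer size from 1 to maxBuf. */
    static boolean allSizesOk(String s, int maxBuf) {
        for (int size = 1; size <= maxBuf; size++) {
            if (!roundTrips(s, size)) return false;
        }
        return true;
    }
}
```

To test a different writer, the BufferedWriter construction is the only line that would need to change.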


> speed increase for writers
> --------------------------
>
>                 Key: SOLR-377
>                 URL: https://issues.apache.org/jira/browse/SOLR-377
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Yonik Seeley
>         Attachments: fastwriter.patch, SOLR-377-phpresponsewriter.patch
>
>
> When solr is writing the response of large cached documents, the bottleneck is string encoding.
> a buffered writer implementation that doesn't do any synchronization could offer some good speedups.



[jira] Updated: (SOLR-377) speed increase for writers

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley updated SOLR-377:
------------------------------

    Attachment: fastwriter.patch

Attaching patch. It adds an optimized unsynchronized buffered writer, changes some ResponseWriters' use of strings to characters, removes buffering of strings in JSON, etc.
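
To illustrate what removing the intermediate string buffering can mean, here is a minimal sketch (not the patch itself) that escapes a JSON string value character by character straight into the Writer instead of building the escaped value in a StringBuilder first. JsonStringSketch, writeStr and render are hypothetical names, and the escape set is a simplified assumption.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

/** Hypothetical sketch: escape while writing, with no intermediate StringBuilder. */
final class JsonStringSketch {
    /** Writes val as a quoted JSON string directly to out. */
    static void writeStr(Writer out, String val) throws IOException {
        out.write('"');
        for (int i = 0; i < val.length(); i++) {
            char ch = val.charAt(i);
            switch (ch) {
                case '"':  out.write("\\\""); break;
                case '\\': out.write("\\\\"); break;
                case '\n': out.write("\\n"); break;
                case '\r': out.write("\\r"); break;
                case '\t': out.write("\\t"); break;
                default:
                    if (ch < 0x20) {
                        // Remaining control characters use the \\uXXXX form.
                        out.write(String.format("\\u%04x", (int) ch));
                    } else {
                        out.write(ch);
                    }
            }
        }
        out.write('"');
    }

    /** Demo helper: renders val into a String for inspection. */
    static String render(String val) {
        StringWriter sw = new StringWriter();
        try {
            writeStr(sw, val);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sw.toString();
    }
}
```

Many small write calls like these are cheap only when the underlying Writer is buffered and unsynchronized, which is why the two changes go together.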

Speed differences with *very* large documents:
json: 24% faster
ruby: 500% faster (ruby didn't buffer in a StringBuilder like JSON did)
python: 0% (bottleneck for these huge fields is buffering in the StringBuilder to see if we should prepend a 'u'... always prepending a 'u' and not buffering resulted in a ~20% improvement)
xml: 8% faster
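
For the python case, one alternative to buffering the whole value is to scan it once for characters that need the 'u' prefix and then write directly. The sketch below shows that swapped-in technique; it is not what the patch does (the numbers above describe buffering versus always prepending 'u'). The names, the "char > 127" rule and the escape set are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

/** Hypothetical sketch: pre-scan for the 'u' prefix instead of buffering the output. */
final class PythonStrSketch {
    /** Assumed rule: any char above 127 makes the literal a unicode literal. */
    static boolean needsUnicodePrefix(String val) {
        for (int i = 0; i < val.length(); i++) {
            if (val.charAt(i) > 127) return true;
        }
        return false;
    }

    /** Writes val as a single-quoted Python literal, escaping on the fly. */
    static void writeStr(Writer out, String val) throws IOException {
        if (needsUnicodePrefix(val)) out.write('u');
        out.write('\'');
        for (int i = 0; i < val.length(); i++) {
            char ch = val.charAt(i);
            if (ch == '\'' || ch == '\\') {
                out.write('\\');
                out.write(ch);
            } else if (ch == '\n') {
                out.write("\\n");
            } else if (ch == '\r') {
                out.write("\\r");
            } else if (ch > 127) {
                out.write(String.format("\\u%04x", (int) ch));
            } else {
                out.write(ch);
            }
        }
        out.write('\'');
    }

    /** Demo helper: renders val into a String for inspection. */
    static String render(String val) {
        StringWriter sw = new StringWriter();
        try {
            writeStr(sw, val);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sw.toString();
    }
}
```

The scan reads the value twice but allocates nothing, so whether it beats buffering for huge fields would need to be measured.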

With smaller documents, the speedups are likely to be greater because small writes like value separators would matter more.

If there are no objections, I'll commit in a few days.



[jira] Commented: (SOLR-377) speed increase for writers

Posted by "Dave Lewis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12536350 ] 

Dave Lewis commented on SOLR-377:
---------------------------------

That appears to have been it, trunk works great!  Thanks!




[jira] Commented: (SOLR-377) speed increase for writers

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12536347 ] 

Yonik Seeley commented on SOLR-377:
-----------------------------------

OK, I think it was a lack of flushing the buffer in the FastWriter.
I've checked in a patch... can you try with the trunk version?
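
The general failure mode can be sketched with the standard java.io.BufferedWriter standing in for FastWriter (the actual fix is not shown in this thread, and FlushSketch is a hypothetical name). Data that fits in the buffer never reaches the underlying writer until something flushes it, so a response that is not flushed arrives truncated or empty.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;

/** Hypothetical sketch: buffered output is invisible to the sink until flushed. */
final class FlushSketch {
    /** Returns the sink contents before and after flush(). */
    static String[] beforeAndAfterFlush(String payload) {
        StringWriter sink = new StringWriter();
        BufferedWriter w = new BufferedWriter(sink, 8192);
        try {
            w.write(payload);                // small payload stays in the buffer
            String before = sink.toString(); // nothing has reached the sink yet
            w.flush();                       // pushes the buffered chars through
            String after = sink.toString();
            return new String[] { before, after };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For any payload shorter than the buffer, the first element is empty and the second is the full payload.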



[jira] Commented: (SOLR-377) speed increase for writers

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12535805 ] 

Yonik Seeley commented on SOLR-377:
-----------------------------------

Thanks Pieter, I just committed the PHP changes.



[jira] Commented: (SOLR-377) speed increase for writers

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12536339 ] 

Yonik Seeley commented on SOLR-377:
-----------------------------------

What container are you using?
Jetty used to have a bug where the Writer it returns to the servlet had issues with chars > 127 if you used writer.write(string,off,len).
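
If a container's Writer were affected by a problem like this, one possible workaround (a hypothetical sketch, not something this issue says Solr does) is a thin wrapper that never calls the String-based method on the underlying Writer and routes everything through write(char[],off,len) instead. CharArrayOnlyWriter is an invented name.

```java
import java.io.FilterWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

/** Hypothetical sketch: keep String writes away from the wrapped Writer. */
final class CharArrayOnlyWriter extends FilterWriter {
    CharArrayOnlyWriter(Writer out) {
        super(out);
    }

    @Override
    public void write(String s, int off, int len) throws IOException {
        // Copy the slice into a char[] and use the char[] overload instead.
        char[] chunk = new char[len];
        s.getChars(off, off + len, chunk, 0);
        out.write(chunk, 0, len);
    }

    /** Demo helper: the sink fails loudly if its String overload is ever reached. */
    static String demo(String s) {
        StringWriter sink = new StringWriter() {
            @Override
            public void write(String str, int off, int len) {
                throw new IllegalStateException("String write reached the sink");
            }
        };
        try (Writer w = new CharArrayOnlyWriter(sink)) {
            w.write(s);
            w.write(s, 0, s.length());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toString();
    }
}
```

The copy costs an allocation per call, so a real implementation would more likely copy into an existing buffer, as a buffered writer already does.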




[jira] Resolved: (SOLR-377) speed increase for writers

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley resolved SOLR-377.
-------------------------------

    Resolution: Fixed

committed.



[jira] Updated: (SOLR-377) speed increase for writers

Posted by "Pieter Berkel (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pieter Berkel updated SOLR-377:
-------------------------------

    Attachment: SOLR-377-phpresponsewriter.patch

Sorry I've been a bit slow catching up with this issue.  Please find attached a trivial patch to PHPResponseWriter.java that takes advantage of the new FastWriter code. It should provide speed improvements similar to the JSON writer (perhaps slightly less).

No fastwriter optimisation is necessary for PHPSerializedResponseWriter as there is no need to escape strings before they are written.
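
The reason no escaping is needed is that PHP's serialized format prefixes each string with its length in bytes, so the reader never has to look for a closing quote. A minimal sketch of that encoding, assuming UTF-8 output and using hypothetical names (this is not the PHPSerializedResponseWriter code):

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch: length-prefixed PHP serialized strings need no escaping. */
final class PhpSerializedStringSketch {
    /** Writes s:<byte length>:"<raw value>"; with the value passed through untouched. */
    static void writeStr(Writer out, String val) throws IOException {
        // Assumption: the response is encoded as UTF-8, so the prefix counts UTF-8 bytes.
        int byteLen = val.getBytes(StandardCharsets.UTF_8).length;
        out.write("s:");
        out.write(Integer.toString(byteLen));
        out.write(":\"");
        out.write(val); // quotes and backslashes inside val are written as-is
        out.write("\";");
    }

    /** Demo helper: renders val into a String for inspection. */
    static String render(String val) {
        StringWriter sw = new StringWriter();
        try {
            writeStr(sw, val);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sw.toString();
    }
}
```

Because the prefix is a byte count rather than a character count, a value containing chars > 127 has a prefix larger than its Java string length.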




[jira] Commented: (SOLR-377) speed increase for writers

Posted by "Dave Lewis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12536332 ] 

Dave Lewis commented on SOLR-377:
---------------------------------

After this patch, using PHPSerializedResponseWriter returns output that is unreadable by my PHP application.  I know that doesn't make any sense, but I'm looking into it now.

