Posted to issues@solr.apache.org by "Sean W (Jira)" <ji...@apache.org> on 2022/06/01 21:23:00 UTC

[jira] [Updated] (SOLR-16228) solr 9 docker container searches 400%+ slower than solr 8 docker container

     [ https://issues.apache.org/jira/browse/SOLR-16228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean W updated SOLR-16228:
--------------------------
    Description: 
Hi Everyone, 

I just found that Solr 8 searches are 400%+ faster than Solr 9. 

In case it helps, my setup is Windows 10 with WSL2 and Dropbox, running the official Docker images.

Steps to reproduce, starting with fast Solr 8:

 
{code:java}
NAME=solr8
PORT=8983
CORE=demo  # because that's available in the docker image - yay!
VERSION=8.11.1 

# create a named volume so you can retain the index and mount it to other versions of docker run
docker volume create $NAME
docker run \
            --mount type=volume,source=$NAME,target=/var/solr \
            --restart always -d -p 127.0.0.1:$PORT:8983 \
            --name $NAME -e SOLR_HEAP=8000m -t \
            --memory="9000m" \
            --memory-swap="9000m" \
            --memory-swappiness=0 \
            --mount type=bind,source=/mnt/c/Users/foo,target=/opt/solr/mydata \
            solr:$VERSION \
            solr-demo

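# Solr 8: copy the _default configset into the core's conf directory.
# Double quotes so $CORE expands on the host; the * is still globbed inside the container.
docker exec -ti --user=solr $NAME bash -c \
    "cp -r /opt/solr/server/solr/configsets/_default/conf/* /var/solr/data/$CORE/conf/"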
docker restart $NAME

# seed PDFs -- capitalization MUST match filesystem here!!!
docker exec --user=solr \
            $NAME \
            find \
                 /opt/solr/mydata/Dropbox/PDFs/Shared \
            -not -path \
                '/opt/solr/mydata/Dropbox/PDFs/*/.stversions*' \
            -name '*.pdf' \
            -type f \
            -exec bin/post -c $CORE \
            "{}" ';'
{code}
 

Now the slower Solr 9:
{code:java}
NAME=solr9
PORT=8984
CORE=demo  # because that's available in the docker image - yay!
VERSION=latest

# create a named volume so you can retain the index and mount it to other versions of docker run
docker volume create $NAME
docker run \
            --mount type=volume,source=$NAME,target=/var/solr \
            --restart always -d -p 127.0.0.1:$PORT:8983 \
            --name $NAME -e SOLR_HEAP=8000m -t \
            --memory="9000m" \
            --memory-swap="9000m" \
            --memory-swappiness=0 \
            --mount type=bind,source=/mnt/c/Users/foo,target=/opt/solr/mydata \
            solr:$VERSION \
            solr-demo

# seed PDFs -- capitalization MUST match filesystem here!!!
docker exec --user=solr \
            $NAME \
            find \
                 /opt/solr/mydata/Dropbox/PDFs/Shared \
            -not -path \
                '/opt/solr/mydata/Dropbox/PDFs/*/.stversions*' \
            -name '*.pdf' \
            -type f \
            -exec bin/post -c $CORE \
            "{}" ';'
 {code}
Timing the query with the Linux `time` command, I see:
Solr 8: 0.04 seconds real time

Solr 9: 2.7 seconds real time
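
For scale, the two timings above work out to a gap of roughly 67 times, well beyond the 400% in the summary. A minimal sketch of the arithmetic, using only the two measured values (`slowdown` is a hypothetical helper name, not part of Solr or Docker):

```shell
# slowdown FAST SLOW: print how many times larger SLOW is than FAST,
# and the percentage increase. Both arguments are times in seconds.
slowdown() {
    awk -v fast="$1" -v slow="$2" 'BEGIN {
        printf "ratio: %.1fx\n", slow / fast
        printf "increase: %.0f%%\n", (slow - fast) / fast * 100
    }'
}

# Timings reported above (real time, in seconds): Solr 8, then Solr 9.
slowdown 0.04 2.7
```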

My query:
{code:java}
/usr/bin/time -f %e curl -G --data-urlencode q=xfoo -s "http://localhost:$PORT/solr/$CORE/select?fl=highlighting,score&hl.method=unified&hl=true&hl.simple.pre=&hl.simple.post=&hl.snippets=3&hl.fragsize=100&hl.maxMultiValuedToMatch=2&hl.maxAnalyzedChars=0&hl.mergeContiguous=true&hl.requireFieldMatch=true&rows=9999"
{code}
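
A single `time` reading can be skewed by cold caches or JVM warm-up, so it may be worth repeating the query a few times per container and comparing medians. The sketch below is one possible way to do that, not a definitive benchmark. `median` and `time_query` are hypothetical helper names, and the default of 5 runs is an arbitrary choice. It relies on curl's `-w '%{time_total}'` write-out variable.

```shell
# median: read one number per line on stdin and print the median.
median() {
    sort -n | awk '{ v[NR] = $1 }
        END {
            if (NR == 0) exit 1
            if (NR % 2) print v[(NR + 1) / 2]
            else        print (v[NR / 2] + v[NR / 2 + 1]) / 2
        }'
}

# time_query URL [RUNS]: fetch URL RUNS times (default 5), discard the
# response body, and print the median total request time in seconds.
time_query() {
    url=$1
    runs=${2:-5}
    i=0
    while [ "$i" -lt "$runs" ]; do
        curl -s -o /dev/null -w '%{time_total}\n' "$url"
        i=$((i + 1))
    done | median
}
```

For example, running `time_query "http://localhost:8983/solr/demo/select?q=xfoo&rows=9999"` against Solr 8 and the same URL on port 8984 against Solr 9 would give one median per container to compare.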
Could this be a performance regression from [changing the base image to eclipse-temurin:17-jre|https://issues.apache.org/jira/browse/SOLR-15949?focusedCommentId=17488070&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17488070]?

  was:
Hi Everyone, 

Solr 8 searches are 400%+ faster than Solr 9. 

Steps to reproduce, starting with fast Solr 8:

 
{code:java}
NAME=solr8
PORT=8983
CORE=demo  # because that's available in the docker image - yay!
VERSION=8.11.1 

# create a named volume so you can retain the index and mount it to other versions of docker run
docker volume create $NAME
docker run \
            --mount type=volume,source=$NAME,target=/var/solr \
            --restart always -d -p 127.0.0.1:$PORT:8983 \
            --name $NAME -e SOLR_HEAP=8000m -t \
            --memory="9000m" \
            --memory-swap="9000m" \
            --memory-swappiness=0 \
            --mount type=bind,source=/mnt/c/Users/foo,target=/opt/solr/mydata \
            solr:$VERSION \
            solr-demo

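# Solr 8: copy the _default configset into the core's conf directory.
# Double quotes so $CORE expands on the host; the * is still globbed inside the container.
docker exec -ti --user=solr $NAME bash -c \
    "cp -r /opt/solr/server/solr/configsets/_default/conf/* /var/solr/data/$CORE/conf/"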
docker restart $NAME

# seed PDFs -- capitalization MUST match filesystem here!!!
docker exec --user=solr \
            $NAME \
            find \
                 /opt/solr/mydata/Dropbox/PDFs/Shared \
            -not -path \
                '/opt/solr/mydata/Dropbox/PDFs/*/.stversions*' \
            -name '*.pdf' \
            -type f \
            -exec bin/post -c $CORE \
            "{}" ';'
{code}
 

Now the slower Solr 9:
{code:java}
NAME=solr9
PORT=8984
CORE=demo  # because that's available in the docker image - yay!
VERSION=latest

# create a named volume so you can retain the index and mount it to other versions of docker run
docker volume create $NAME
docker run \
            --mount type=volume,source=$NAME,target=/var/solr \
            --restart always -d -p 127.0.0.1:$PORT:8983 \
            --name $NAME -e SOLR_HEAP=8000m -t \
            --memory="9000m" \
            --memory-swap="9000m" \
            --memory-swappiness=0 \
            --mount type=bind,source=/mnt/c/Users/chones,target=/opt/solr/mydata \
            solr:$VERSION \
            solr-demo

# seed PDFs -- capitalization MUST match filesystem here!!!
docker exec --user=solr \
            $NAME \
            find \
                 /opt/solr/mydata/Dropbox/PDFs/Shared \
            -not -path \
                '/opt/solr/mydata/Dropbox/PDFs/*/.stversions*' \
            -name '*.pdf' \
            -type f \
            -exec bin/post -c $CORE \
            "{}" ';'
 {code}

Timing the query with the Linux `time` command, I see:
Solr 8: 0.04 seconds real time

Solr 9: 2.7 seconds real time

My query:
{code:java}
/usr/bin/time -f %e curl -G --data-urlencode q=xfoo -s "http://localhost:$PORT/solr/$CORE/select?fl=highlighting,score&hl.method=unified&hl=true&hl.simple.pre=&hl.simple.post=&hl.snippets=3&hl.fragsize=100&hl.maxMultiValuedToMatch=2&hl.maxAnalyzedChars=0&hl.mergeContiguous=true&hl.requireFieldMatch=true&rows=9999"
{code}
 


> solr 9 docker container searches 400%+ slower than solr 8 docker container
> --------------------------------------------------------------------------
>
>                 Key: SOLR-16228
>                 URL: https://issues.apache.org/jira/browse/SOLR-16228
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: search
>    Affects Versions: 9.0
>            Reporter: Sean W
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
