Posted to solr-user@lucene.apache.org by "samuele.mattiuzzo" <sa...@gmail.com> on 2011/08/23 18:55:37 UTC

Solr indexing process: keep a persistent Mysql connection throu all the indexing process

I wrote a custom update handler for my Solr installation, using JDBC to
query a MySQL database. Everything works fine: the updater queries the DB,
gets the data I need and updates my documents with it. Fantastic!

The only issue is that I have to open and close a MySQL connection for every
document I read. Since we have something like 10kk indexed documents, I was
thinking about opening a MySQL connection at the very beginning of the
indexing process, keeping it stored somewhere and using it inside my custom
update handler. When the whole indexing process is complete, the connection
should be closed.

Is this possible?

Thanks all in advance!

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-indexing-process-keep-a-persistent-Mysql-connection-throu-all-the-indexing-process-tp3278608p3278608.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr indexing process: keep a persistent Mysql connection throu all the indexing process

Posted by "samuele.mattiuzzo" <sa...@gmail.com>.
Those documents are unrelated to the database. The DB I have just stores
countries - regions - cities, and it's used to refine a specific Solr
field.

Example:

solrField "thetext" with content "Mary comes from London"

The updateHandler queries the database for europe - great britain - london
and writes those values to the correct fields.

Isn't an update handler tied to a single document? At least, that's what
I understood...

Re: Solr indexing process: keep a persistent Mysql connection throu all the indexing process

Posted by Erick Erickson <er...@gmail.com>.
Put it anywhere you want <G>. Here's a good place
to start: http://www.javapractices.com/topic/TopicAction.do?Id=46
where the distributePresents method plays the role of your
method that returns the connection.

Here's a sample class that doesn't do much...
public enum MyEnum {
    INSTANCE;

    // Shared state; stays empty until the first call to doStuff.
    private String _tester = "";

    public void doStuff(String stuff) {
        // One-time initialization: runs only on the first call.
        if (_tester.length() == 0) {
            _tester = stuff;
            System.out.println("In initialization");
        }
        System.out.println("Tester is " + _tester + " Stuff is " + stuff);
    }
}

You can imagine doStuff as getConnection, with the logic that initializes the
connection going where the string _tester is set.

Now you can call it from anywhere the enum is available like this:

public class MyMain {
  public static void main(String[] args) {
    MyEnum.INSTANCE.doStuff("first time");
    MyEnum.INSTANCE.doStuff("second time");
    MyEnum.INSTANCE.doStuff("third time");
  }
}
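
Applied to the JDBC case, the same enum idea might look like the sketch
below. It is only an illustration: the names SharedConnection and
ConnectionOpener, the JDBC URL, and the credentials are all assumptions, not
anything from the original handler. The connection is opened on first use,
reused afterwards, and closed once when indexing is done.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical hook describing how a connection gets opened, so the
// default below can be replaced (for example with a stub).
interface ConnectionOpener {
    Connection open() throws SQLException;
}

// Hypothetical enum singleton holding one JDBC connection for the whole run.
enum SharedConnection {
    INSTANCE;

    // Placeholder URL and credentials: assumptions, adjust to your setup.
    private ConnectionOpener opener = () -> DriverManager.getConnection(
            "jdbc:mysql://localhost:3306/geo", "solr_user", "change_me");

    private Connection connection;

    // Replace the way connections are opened.
    synchronized void setOpener(ConnectionOpener opener) {
        this.opener = opener;
    }

    // Open the connection on first use (or after it was closed), then reuse it.
    synchronized Connection get() throws SQLException {
        if (connection == null || connection.isClosed()) {
            connection = opener.open();
        }
        return connection;
    }

    // Call once when the whole indexing process is complete.
    synchronized void close() throws SQLException {
        if (connection != null) {
            connection.close();
            connection = null;
        }
    }
}
```

The update handler would call SharedConnection.INSTANCE.get() for each
document instead of opening its own connection, and whatever ends the
indexing run would call SharedConnection.INSTANCE.close(). If documents are
processed on several threads at once, sharing a single Connection may not be
safe, and a connection pool is probably the better fit.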

Best
Erick

On Thu, Aug 25, 2011 at 9:43 AM, samuele.mattiuzzo <sa...@gmail.com> wrote:
> since i'm barely new to solr, can you please give some guidelines or provide
> an example i can look at for starters?
>
> i already tought about a singleton implementation, but i'm not sure where i
> have to put it and how should i start coding it

Re: Solr indexing process: keep a persistent Mysql connection throu all the indexing process

Posted by "samuele.mattiuzzo" <sa...@gmail.com>.
Since I'm fairly new to Solr, can you please give me some guidelines or
point me to an example I can look at for starters?

I already thought about a singleton implementation, but I'm not sure where
to put it or how to start coding it.

Re: Solr indexing process: keep a persistent Mysql connection throu all the indexing process

Posted by Erick Erickson <er...@gmail.com>.
Yes, but you can always employ a singleton to open and maintain a DB connection.

Best
Erick

On Tue, Aug 23, 2011 at 9:16 PM, samuele.mattiuzzo <sa...@gmail.com> wrote:
> those documents are unrelated to the database. the db i have is just storing
> countries - region - cities, and it's used to do a refinement on a specific
> solr field
>
> example:
>
> solrField "thetext" with content "Mary comes from London"
>
> updateHandler polls the database for europe - great britain - london and
> updates those values to the correct fields
>
> isnt an update handler relative to a single document? at least, that's what
> i understood...

Re: Solr indexing process: keep a persistent Mysql connection throu all the indexing process

Posted by Tom <sp...@gmail.com>.
10K documents. Why not just batch them?

You could read in 10K from your database, load em into an array of
SolrDocuments, and then post them all at once to the Solr server. Or do em
in 1K increments if they are really big.
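
A rough sketch of that batching loop, kept independent of any Solr client so
it runs on its own: the Batches class and the sink callback are hypothetical
names, and the sink stands in for whatever actually posts a batch to Solr
(with SolrJ that would presumably be a call that adds a collection of
documents in one request).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: hands items to a sink in fixed-size chunks.
final class Batches {
    private Batches() {}

    // Sends `items` to `sink` in lists of at most `batchSize` elements
    // and returns how many batches were sent.
    static <T> int sendInBatches(Iterable<T> items, int batchSize,
                                 Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<T> buffer = new ArrayList<>(batchSize);
        int sent = 0;
        for (T item : items) {
            buffer.add(item);
            if (buffer.size() == batchSize) {
                // Pass a copy so the sink may keep the list after we clear ours.
                sink.accept(new ArrayList<>(buffer));
                buffer.clear();
                sent++;
            }
        }
        // Flush the final, possibly smaller, batch.
        if (!buffer.isEmpty()) {
            sink.accept(new ArrayList<>(buffer));
            sent++;
        }
        return sent;
    }
}
```

With a batch size of 1000, 2500 documents would go out as three posts
instead of 2500 individual ones.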

Re: Solr indexing process: keep a persistent Mysql connection throu all the indexing process

Posted by Gora Mohanty <go...@mimirtech.com>.
On Tue, Aug 23, 2011 at 10:25 PM, samuele.mattiuzzo <sa...@gmail.com> wrote:
> I wrote my custom update handler for my solr installation, using jdbc to
> query a mysql database. Everything works fine: the updater queries the db,
> gets the data i need and update it in my documents! Fantastic!
>
> Only issue is i have to open and close a mysql connection for every document
> i read. Since we have something like 10kk indexed document, i was thinking
> about opening a mysql connection at the very beginning of the indexing
> process, keeping it stored somewhere and use it inside my custom update
> handler. When the whole indexing process is complete, the connection should
> be closed.
[...]

If you are using a custom update handler, then I imagine that
it is up to you to keep a persistent connection open.

You could also consider using the Solr DataImportHandler,
http://wiki.apache.org/solr/DataImportHandler . This can
interface with mysql, and does keep a persistent connection
open.

Regards,
Gora