Posted to solr-user@lucene.apache.org by Spadez <ja...@hotmail.com> on 2012/12/05 17:42:54 UTC

Concern with using external SQL server for DIH

Hi,

I am looking to import entries into my Solr server using the DIH
(DataImportHandler), connecting to an external PostgreSQL server through the
JDBC driver. I will be importing about 50,000 entries each time.

Is connecting to an external SQL server for my data unreliable or risky, or
is it perfectly reasonable?

My alternative is to export the SQL file on the other server, download the
SQL file to my Solr server, import it into my Solr server's copy of PostgreSQL,
and then run the DIH on the local database.

Re: Concern with using external SQL server for DIH

Posted by Gora Mohanty <go...@mimirtech.com>.
On 5 December 2012 22:12, Spadez <ja...@hotmail.com> wrote:
> Hi,
>
> I am looking to import entries into my Solr server using the DIH
> (DataImportHandler), connecting to an external PostgreSQL server through the
> JDBC driver. I will be importing about 50,000 entries each time.

Unless you have a lot of data in each entry, importing 50,000
entries should be pretty trivial.

> Is connecting to an external SQL server for my data unreliable or risky, or
> is it perfectly reasonable?
[...]

Reliability is largely a matter of network connectivity, load on the
database and Solr servers, and, to a lesser extent, the JDBC driver
used (for Microsoft SQL Server, jTDS seemed significantly better to
us). Again, for 50,000 entries, none of these should typically be a
concern.
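
If connectivity is the worry, a DIH run can be started and monitored over
HTTP, so a stalled import against a remote database is easy to spot. The
following is only a minimal Python sketch. The host, port and core name
(localhost:8983, mycore) are hypothetical, the handler is assumed to be
registered at /dataimport, and the status response is assumed to carry a
"status" field plus a "statusMessages" map with a "Total Rows Fetched"
entry. Check these against your own Solr version before relying on them.

```python
from urllib.parse import urlencode


def dih_url(base_url, core, command, handler="dataimport", **params):
    """Build a DIH request URL for a command such as 'full-import' or 'status'.

    base_url, core and handler are assumptions about the local setup.
    """
    query = {"command": command, "wt": "json"}
    query.update(params)
    return f"{base_url.rstrip('/')}/{core}/{handler}?{urlencode(query)}"


def import_finished(status_response):
    """Return True once the handler reports that it is idle again."""
    return status_response.get("status") == "idle"


def rows_fetched(status_response):
    """Return the number of rows read from the database so far (0 if absent)."""
    messages = status_response.get("statusMessages", {})
    return int(messages.get("Total Rows Fetched", 0))


if __name__ == "__main__":
    # Hypothetical host and core name, for illustration only.
    print(dih_url("http://localhost:8983/solr", "mycore", "full-import", clean="true"))
    print(dih_url("http://localhost:8983/solr", "mycore", "status"))
```

Polling the status URL every few seconds and comparing rows_fetched between
polls is one simple way to tell a slow import from one whose connection has
dropped.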

"Risky" in what sense? Are the data sensitive?

Regards,
Gora

Re: Concern with using external SQL server for DIH

Posted by Shawn Heisey <so...@elyograg.org>.
On 12/5/2012 9:42 AM, Spadez wrote:
> I am looking to import entries into my Solr server using the DIH
> (DataImportHandler), connecting to an external PostgreSQL server through the
> JDBC driver. I will be importing about 50,000 entries each time.
>
> Is connecting to an external SQL server for my data unreliable or risky, or
> is it perfectly reasonable?
>
> My alternative is to export the SQL file on the other server, download the
> SQL file to my Solr server, import it into my Solr server's copy of PostgreSQL,
> and then run the DIH on the local database.

I use DIH in situations that require a full reindex.  The MySQL database 
has 78 million records and imports simultaneously to seven Solr shards 
on two servers.  It takes about three hours.
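
To put those numbers next to a 50,000-entry import, here is a rough
back-of-envelope calculation in Python. It assumes the same average
throughput would apply, which is only a guess: the hardware, database,
record size and shard count will all differ, so treat the result as an
order-of-magnitude estimate and nothing more.

```python
def throughput(records, hours):
    """Average records imported per second over the whole run."""
    return records / (hours * 3600)


def estimate_seconds(entries, records_per_second):
    """Time to import `entries` records at a given average rate."""
    return entries / records_per_second


# Figures from the full reindex described above.
rate = throughput(78_000_000, 3)

# Hypothetical extrapolation to a 50,000-entry import at the same rate.
small_import = estimate_seconds(50_000, rate)

if __name__ == "__main__":
    print(f"{rate:.0f} records/s, {small_import:.1f} s for 50,000 entries")
```

At that rate, 50,000 entries would be a matter of seconds rather than minutes.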

The only instability that we ever noticed was on older Solr versions 
(1.4.x) with a low mergeFactor.  We ran into a situation where Solr was 
doing a lot of simultaneous merges and stopped indexing data long enough 
that the JDBC connection timed out.  We increased our mergeFactor, and 
newer Solr versions have better configuration possibilities, so now we 
have more merging threads.

Thanks,
Shawn