Posted to solr-user@lucene.apache.org by Mike Copenhafer <co...@ne.bah.com> on 2010/06/08 00:52:41 UTC

DIH: Indexing multiple datasources with the same schema

Hi, I don't think my problem is unique, but I couldn't find any answers after
an hour of searching...

I have two databases with identical schemas and different data.  I want to
use DIH to index both into a single Solr index (right now, I have them in
separate indexes, but I find this cumbersome).  

So, my data-config.xml looks like this:

<dataSource type="JdbcDataSource" name="db1"
driver="oracle.jdbc.driver.OracleDriver" ... />
<dataSource type="JdbcDataSource" name="db2"
driver="oracle.jdbc.driver.OracleDriver" ... />               

Do I have to create entities for each data source, even though they contain
the same queries and operate on the same schema?  I know the following is
not possible:

    <entity name="event" dataSource="db1,db2" query="select ...

but I would like to avoid having to copy my entities for each data source. 
Am I missing something?  thank you!
-- 
View this message in context: http://lucene.472066.n3.nabble.com/DIH-Indexing-multiple-datasources-with-the-same-schema-tp877781p877781.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: DIH: Indexing multiple datasources with the same schema

Posted by alexei <ac...@gmail.com>.
Sorry for reviving an old thread, but I thought my solution could be
useful.

I also had to deal with multiple data sources. If the data source number
can be queried in one of your parent entities, you can reference it in the
child entity's dataSource attribute with a variable, as follows:

<entity name="ChildEntity" dataSource="db${YourParentEntity.DbId}" ... 

For the above to work, I had to modify the
org.apache.solr.handler.dataimport.ContextImpl.getDataSource() method.
Here is the replacement code for getDataSource:


  public DataSource getDataSource() {
    if (ds != null) return ds;
    if (entity == null) return null;

    // Resolve any variables in the entity's dataSource attribute,
    // e.g. db${YourParentEntity.DbId}, against the current row.
    String dataSourceResolved = this.getResolvedEntityAttribute("dataSource");

    if (entity.dataSrc == null) {
      // First call: open the resolved data source.
      entity.dataSrc = dataImporter.getDataSourceInstance(entity, dataSourceResolved, this);
      entity.dataSource = dataSourceResolved;
    } else if (dataSourceResolved != null && !dataSourceResolved.equals(entity.dataSource)) {
      // The resolved name changed: close the old data source and open the new one.
      // The null check avoids a NullPointerException if the attribute resolves to null.
      entity.dataSrc.close();
      entity.dataSrc = dataImporter.getDataSourceInstance(entity, dataSourceResolved, this);
      entity.dataSource = dataSourceResolved;
    }
    if (entity.dataSrc != null && docBuilder != null && docBuilder.verboseDebug &&
        Context.FULL_DUMP.equals(currentProcess())) {
      // debug is not yet implemented properly for deltas
      entity.dataSrc = docBuilder.writer.getDebugLogger().wrapDs(entity.dataSrc);
    }
    return entity.dataSrc;
  }
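
For readers who want to try the switching logic outside of Solr, here is a minimal standalone sketch of the same idea: resolve the data source name from the parent row, reuse the open connection while the name stays the same, and close and reopen when it changes. Everything in it (the class DataSourceSwitchSketch, the Connection stand-in, the resolve helper, and the simple ${...} substitution) is hypothetical and written only for illustration; it is not the DIH API and makes no claim about how Solr resolves variables internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class DataSourceSwitchSketch {

    /** Hypothetical stand-in for an open connection to one named data source. */
    static final class Connection {
        final String name;
        boolean closed = false;

        Connection(String name) { this.name = name; }

        void close() { closed = true; }
    }

    /** Every connection opened so far, kept so the behaviour can be inspected. */
    final List<Connection> opened = new ArrayList<>();

    /** The connection currently held, or null before the first call. */
    private Connection current;

    /** Replaces each ${key} placeholder with the matching value from the parent row. */
    static String resolve(String template, Map<String, String> parentRow) {
        String result = template;
        for (Map.Entry<String, String> e : parentRow.entrySet()) {
            result = result.replace("${" + e.getKey() + "}", e.getValue());
        }
        return result;
    }

    /** Returns a connection for the resolved name, reopening only when the name changes. */
    Connection getDataSource(String template, Map<String, String> parentRow) {
        String resolved = resolve(template, parentRow);
        if (current == null || !resolved.equals(current.name)) {
            if (current != null) {
                current.close(); // release the previous data source first
            }
            current = new Connection(resolved);
            opened.add(current);
        }
        return current;
    }

    public static void main(String[] args) {
        DataSourceSwitchSketch sketch = new DataSourceSwitchSketch();
        String template = "db${YourParentEntity.DbId}";
        // Three parent rows: the first two point at the same database.
        for (String id : new String[] {"1", "1", "2"}) {
            Connection c = sketch.getDataSource(template, Map.of("YourParentEntity.DbId", id));
            System.out.println("row with DbId=" + id + " uses " + c.name);
        }
        System.out.println("connections opened: " + sketch.opened.size());
    }
}
```

In this sketch the second row reuses the connection opened for the first, and the third row closes it and opens a new one. If parent rows alternate between databases frequently, this pattern reopens a connection on every switch, so ordering the parent query by the data source number may help.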


Cheers,
Alexei

--
View this message in context: http://lucene.472066.n3.nabble.com/DIH-Indexing-multiple-datasources-with-the-same-schema-tp877781p2786599.html