Posted to solr-user@lucene.apache.org by ahammad <ah...@gmail.com> on 2009/04/20 15:45:19 UTC

Using Solr to index a database

Hello,

I've never used Solr before, but I believe that it will suit my current
needs for indexing information from a database.

I downloaded and extracted Solr 1.3 to play around with it. I've been
looking at the following tutorials:
http://www.ibm.com/developerworks/java/library/j-solr-update/index.html
http://wiki.apache.org/solr/DataImportHandler

There are a few things I don't understand. For example, the IBM article
sometimes refers to directories that aren't there, or that are a little
different from what I have in my extracted copy of Solr (e.g.
solr-dw/rss/conf/solrconfig.xml). I tried to follow the steps as best I could,
but as soon as I put the following in solrconfig.xml, the whole thing
breaks:

<requestHandler name="/dataimport"
  class="org.apache.solr.handler.dataimport.DataImportHandler">
<lst name="defaults">
  <str name="config">rss-data-config.xml</str>
</lst>
</requestHandler>

Obviously I replaced the values with my own info. One thing I don't quite get
is the data-config.xml file. What exactly is it? I've seen examples of what it
contains, but since I don't know enough, I couldn't really adjust it. In any
case, this is the error I get, which may be caused by a misconfigured
data-config.xml:

org.apache.solr.handler.dataimport.DataImportHandlerException: Exception occurred while initializing context
    at org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:165)
    at org.apache.solr.handler.dataimport.DataImporter.<init>(DataImporter.java:99)
    at org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:96)
    at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:388)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:571)
    at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:122)
    at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
    at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:221)
    at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:302)
    at org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:78)
    at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3635)
    at org.apache.catalina.core.StandardContext.start(StandardContext.java:4222)
    at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:760)
    at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:740)
    at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:544)
    at org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:831)
    at org.apache.catalina.startup.HostConfig.deployWARs(HostConfig.java:720)
    at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:490)
    at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1150)
    at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:311)
    at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:120)
    at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1022)
    at org.apache.catalina.core.StandardHost.start(StandardHost.java:736)
    at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1014)
    at org.apache.catalina.core.StandardEngine.start(StandardEngine.java:443)
    at org.apache.catalina.core.StandardService.start(StandardService.java:448)
    at org.apache.catalina.core.StandardServer.start(StandardServer.java:700)
    at org.apache.catalina.startup.Catalina.start(Catalina.java:552)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:295)
    at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:433)
Caused by: org.xml.sax.SAXParseException: The element type "document" must be terminated by the matching end-tag "</document>".
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)
    at org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:153)


It's unclear to me what I need to be using, as in what directories/files I
need to implement this. Can someone please point me in the right direction?

BTW, I'm using Tomcat 5.5 because the prepackaged Jetty simply doesn't work
for me. It shows that it "started" in the command line, but it hangs, and
doesn't actually work when I try to hit the Solr admin page (page not found
type error). Jetty itself does start but the project doesn't seem to
deploy...

I apologize for the long post, and for not providing as much information as
I should have. Let me know if you need clarification on anything I said.

Thank you very much.
-- 
View this message in context: http://www.nabble.com/Using-Solr-to-index-a-database-tp23136944p23136944.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Using Solr to index a database

Posted by ahammad <ah...@gmail.com>.
For now it's unclear, as this is sort of an "experiment" to see how much we
can do with it. I am inclined to use the index within Solr though, simply
for the very powerful querying (the stuff I've seen at least). I am not
exactly sure how much of the querying capabilities I'll require though.

I'll take a look at LuSql and see if it can be used for my purposes. I want
to get Solr working though, because I know that later down the road I'm
going to need it for another project...



Glen Newton wrote:
> 
> You have not indicated how you wish to use the index (inside Solr or not).
> 
> It is possible that LuSql might be a preferable alternative to
> Solr/DataImportHandler, depending on your requirements.
> 
> LuSql: http://lab.cisti-icist.nrc-cnrc.gc.ca/cistilabswiki/index.php/LuSql
> 
> Disclaimer: I am the author of LuSql.
> 
> -glen
> 



Re: Using Solr to index a database

Posted by Glen Newton <gl...@gmail.com>.
You have not indicated how you wish to use the index (inside Solr or not).

It is possible that LuSql might be a preferable alternative to
Solr/DataImportHandler, depending on your requirements.

LuSql: http://lab.cisti-icist.nrc-cnrc.gc.ca/cistilabswiki/index.php/LuSql

Disclaimer: I am the author of LuSql.

-glen


Re: Using Solr to index a database

Posted by Amit Nithian <an...@gmail.com>.
Each PRODUCT would be a document in your index with fields number, name and
price. If you wanted to start off simple, your schema.xml could just define
these three fields; however, for a search index, you may want to index name
several ways (e.g. with and without stop words).
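
As a rough sketch of that starting point, the fields section of schema.xml
might look like the following. The field types (string, text, sfloat) and the
extra nameSort copy are assumptions borrowed from the example schema that
ships with Solr 1.3; adjust them to whatever types your own schema.xml
defines.

```xml
<!-- Hypothetical fields for the PRODUCTS table. The types are assumed to be
     declared in the <types> section, as in the Solr 1.3 example schema. -->
<fields>
  <field name="number" type="string" indexed="true" stored="true" required="true"/>
  <field name="name" type="text" indexed="true" stored="true"/>
  <field name="price" type="sfloat" indexed="true" stored="true"/>
  <!-- Second, unanalyzed copy of name, useful for sorting -->
  <field name="nameSort" type="string" indexed="true" stored="false"/>
</fields>

<!-- The product number identifies each document -->
<uniqueKey>number</uniqueKey>

<!-- Fill nameSort from name at index time -->
<copyField source="name" dest="nameSort"/>
```

Using copyField here means the data-config.xml only has to supply name once;
the additional representations are derived inside Solr.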

The DIH is intelligent enough to map DB columns to schema fields when their
names match, so you don't have to add unnecessary <field> elements to the
data-config.xml.

Your query would be "select * from products" and the delta query would only
select those products that have been changed or updated since some previous
date (a bit tangential to this discussion).
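
Putting this together, a minimal data-config.xml for the PRODUCTS table might
look like the sketch below. The JDBC driver, URL and credentials are
placeholders, and LAST_MODIFIED is a hypothetical timestamp column that the
table would need for delta imports. Every opened element, including
<document>, is closed; the SAXParseException earlier in this thread complains
about exactly that.

```xml
<dataConfig>
  <!-- Placeholder JDBC settings: replace driver, url, user and password -->
  <dataSource driver="org.hsqldb.jdbcDriver"
              url="jdbc:hsqldb:/path/to/your/db"
              user="sa"
              password=""/>
  <document>
    <!-- query: used for a full import, indexes every row.
         deltaQuery: used for an incremental import, selects the keys of rows
         changed since the last run (LAST_MODIFIED is an assumed column). -->
    <entity name="product" pk="NUMBER"
            query="select * from PRODUCTS"
            deltaQuery="select NUMBER from PRODUCTS where LAST_MODIFIED &gt; '${dataimporter.last_index_time}'">
      <!-- Explicit mappings, shown for clarity; they can be dropped when
           column names and schema field names already match -->
      <field column="NUMBER" name="number"/>
      <field column="NAME" name="name"/>
      <field column="PRICE" name="price"/>
    </entity>
  </document>
</dataConfig>
```

A full import is then started by requesting the /dataimport handler with
command=full-import, and an incremental one with command=delta-import.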

HTH,
Amit

On Wed, May 6, 2009 at 2:46 AM, uday kumar maddigatla <uk...@mach.com> wrote:

>
> Hi
>
> I have the same question. I would like to understand how Solr indexes and
> searches data that is stored in a database.
>
> For example, let's say we have a table called PRODUCTS, and within that
> table, we have the following columns:
> NUMBER (product number)
> NAME (product name)
> PRICE
>
> How would we index all this information? Here is an example (from the links
> you provided) of the XML (data-config.xml):
>             <entity name="item" pk="ID" query="select * from item"
>    ------->    deltaQuery="select id from item where last_modified > '${dataimporter.last_index_time}'">
>            <field column="NAME" name="name" />
>            <field column="NAME" name="nameSort" />
>            <field column="NAME" name="alphaNameSort" />
>
> I need help with this.

Re: Using Solr to index a database

Posted by uday kumar maddigatla <uk...@mach.com>.
Hi

I have the same question. I would like to understand how Solr indexes and
searches data that is stored in a database.

For example, let's say we have a table called PRODUCTS, and within that
table, we have the following columns:
NUMBER (product number)
NAME (product name)
PRICE

How would we index all this information? Here is an example (from the links
you provided) of the XML (data-config.xml):
            <entity name="item" pk="ID" query="select * from item"
    ------->    deltaQuery="select id from item where last_modified > '${dataimporter.last_index_time}'">
            <field column="NAME" name="name" />
            <field column="NAME" name="nameSort" />
            <field column="NAME" name="alphaNameSort" />

I need help with this.




Noble Paul നോബിള്‍  नोब्ळ् wrote:
> 
> The deltaQuery attribute is for incremental imports.
> 
> Use the 'query' attribute to import data.
> 
> 
> On Tue, Apr 21, 2009 at 7:35 PM, ahammad <ah...@gmail.com> wrote:
>>
>> Thanks for the link...
>>
>> I'm still a bit unclear as to how it goes. For example, let's say I have a
>> table called PRODUCTS, and within that table, I have the following
>> columns:
>> NUMBER (product number)
>> NAME (product name)
>> PRICE
>>
>> How would I index all this information? Here is an example (from the
>> links
>> you provided) of xml that confuses me:
>>
>>            <entity name="item" pk="ID" query="select * from item"
>>    ------->    deltaQuery="select id from item where last_modified >
>> '${dataimporter.last_index_time}'">
>>            <field column="NAME" name="name" />
>>            <field column="NAME" name="nameSort" />
>>            <field column="NAME" name="alphaNameSort" />
>>
>> What is that deltaQuery (or even if it was a regular "query" expression)
>> line for? It seems to me like a sort of filter. What if I don't want to
>> filter anything and just want to index all the rows?
>>
>> Cheers
>>
>>
>>
>>
>> Noble Paul നോബിള്‍  नोब्ळ् wrote:
>>>
>>> On Mon, Apr 20, 2009 at 7:15 PM, ahammad <ah...@gmail.com> wrote:
>>>>
>>>> Obviously I replace with my own info...One thing I don't quite get is
>>>> the
>>>> data-config.xml file. What exactly is it? I've seen examples of what it
>>>> contains but since I don't know enough, I couldn't really adjust it. In
>>>> any
>>>> case, this is the error I get, which may be because of a misconfigured
>>>> data-config.xml...
>>> The data-config.xml file describes how to fetch data from various data
>>> sources and index it into Solr.
>>>
>>> The stacktrace says that your xml is invalid.
>>>
>>> The best bet is to take one of the sample dataconfig xml files and make
>>> changes.
>>>
>>> http://svn.apache.org/viewvc/lucene/solr/trunk/example/example-DIH/solr/db/conf/db-data-config.xml?revision=691151&view=markup
>>>
>>> http://svn.apache.org/viewvc/lucene/solr/trunk/example/example-DIH/solr/rss/conf/rss-data-config.xml?revision=691151&view=markup
>>>
>>>
>>>>
>>>> org.apache.solr.handler.dataimport.DataImportHandlerException:
>>>> Exception
>>>> occurred while initializing context at
>>>> org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:165)
>>>> at
>>>> org.apache.solr.handler.dataimport.DataImporter.<init>(DataImporter.java:99)
>>>> at
>>>> org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:96)
>>>> at
>>>> org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:388)
>>>> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:571) at
>>>> org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:122)
>>>> at
>>>> org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
>>>> at
>>>> org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:221)
>>>> at
>>>> org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:302)
>>>> at
>>>> org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:78)
>>>> at
>>>> org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3635)
>>>> at
>>>> org.apache.catalina.core.StandardContext.start(StandardContext.java:4222)
>>>> at
>>>> org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:760)
>>>> at
>>>> org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:740)
>>>> at
>>>> org.apache.catalina.core.StandardHost.addChild(StandardHost.java:544)
>>>> at
>>>> org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:831)
>>>> at
>>>> org.apache.catalina.startup.HostConfig.deployWARs(HostConfig.java:720)
>>>> at
>>>> org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:490)
>>>> at
>>>> org.apache.catalina.startup.HostConfig.start(HostConfig.java:1150) at
>>>> org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:311)
>>>> at
>>>> org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:120)
>>>> at
>>>> org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1022)
>>>> at
>>>> org.apache.catalina.core.StandardHost.start(StandardHost.java:736) at
>>>> org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1014)
>>>> at
>>>> org.apache.catalina.core.StandardEngine.start(StandardEngine.java:443)
>>>> at
>>>> org.apache.catalina.core.StandardService.start(StandardService.java:448)
>>>> at
>>>> org.apache.catalina.core.StandardServer.start(StandardServer.java:700)
>>>> at
>>>> org.apache.catalina.startup.Catalina.start(Catalina.java:552) at
>>>> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at
>>>> sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source) at
>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) at
>>>> java.lang.reflect.Method.invoke(Unknown Source) at
>>>> org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:295) at
>>>> org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:433) Caused
>>>> by:
>>>> org.xml.sax.SAXParseException: The element type "document" must be
>>>> terminated by the matching end-tag "</document>". at
>>>> com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown
>>>> Source)
>>>> at
>>>> com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown
>>>> Source) at
>>>> org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:153)
>>>>
>>>>
>>>> It's unclear to me what I need to be using, as in what
>>>> directories/files
>>>> I
>>>> need to implement this. Can someone please point me in the right
>>>> direction?
>>>>
>>>> BTW, I'm using Tomcat 5.5 because the prepackaged Jetty simply doesn't
>>>> work
>>>> for me. It shows that it "started" in the command line, but it hangs,
>>>> and
>>>> doesn't actually work when I try to hit the Solr admin page (page not
>>>> found
>>>> type error). Jetty itself does start but the project doesn't seem to
>>>> deploy...
>>>>
>>>> I apologize for the long post and if I didn't provide as much
>>>> information
>>>> as
>>>> I should. Let me know if you need clarification with anything I said.
>>>>
>>>> Thank you very much.
>>>> --
>>>> View this message in context:
>>>> http://www.nabble.com/Using-Solr-to-index-a-database-tp23136944p23136944.html
>>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> --Noble Paul
>>>
>>>
>>
>> --
>> View this message in context:
>> http://www.nabble.com/Using-Solr-to-index-a-database-tp23136944p23156850.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
> 
> 
> 
> -- 
> --Noble Paul
> 
> 

-- 
View this message in context: http://www.nabble.com/Using-Solr-to-index-a-database-tp23136944p23403314.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Using Solr to index a database

Posted by Noble Paul നോബിള്‍ नोब्ळ् <no...@gmail.com>.
The deltaQuery attribute is only for incremental (delta) imports.

Use the 'query' attribute to import all the rows.
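
As a rough sketch of what that could look like for the PRODUCTS table from the
question quoted below: the entity only needs a 'query' attribute, with no
deltaQuery at all. The Solr field names (id, name, price) are assumptions for
illustration and would have to match fields defined in your schema.xml.

```xml
<!-- Sketch only: the Solr field names id, name and price are assumptions
     and must match fields defined in your schema.xml. -->
<entity name="product" query="select NUMBER, NAME, PRICE from PRODUCTS">
    <!-- column = database column, name = Solr field -->
    <field column="NUMBER" name="id" />
    <field column="NAME" name="name" />
    <field column="PRICE" name="price" />
</entity>
```

With the handler registered at /dataimport as in your solrconfig.xml, a full
import is started with command=full-import, for example
http://localhost:8080/solr/dataimport?command=full-import (host and port are
assumed here). The deltaQuery is only consulted for command=delta-import.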


On Tue, Apr 21, 2009 at 7:35 PM, ahammad <ah...@gmail.com> wrote:
>
> Thanks for the link...
>
> I'm still a bit unclear as to how it goes. For example, let's say I have a
> table called PRODUCTS, and within that table, I have the following columns:
> NUMBER (product number)
> NAME (product name)
> PRICE
>
> How would I index all this information? Here is an example (from the links
> you provided) of xml that confuses me:
>
>            <entity name="item" pk="ID" query="select * from item"
>    ------->    deltaQuery="select id from item where last_modified >
> '${dataimporter.last_index_time}'">
>            <field column="NAME" name="name" />
>            <field column="NAME" name="nameSort" />
>            <field column="NAME" name="alphaNameSort" />
>
> What is that deltaQuery (or even if it was a regular "query" expression)
> line for? It seems to me like a sort of filter. What if I don't want to
> filter anything and just want to index all the rows?
>
> Cheers
>
>
>
>
> Noble Paul നോബിള്‍  नोब्ळ् wrote:
>>
>> On Mon, Apr 20, 2009 at 7:15 PM, ahammad <ah...@gmail.com> wrote:
>>>
>>> Hello,
>>>
>>> I've never used Solr before, but I believe that it will suit my current
>>> needs with indexing information from a database.
>>>
>>> I downloaded and extracted Solr 1.3 to play around with it. I've been
>>> looking at the following tutorials:
>>> http://www.ibm.com/developerworks/java/library/j-solr-update/index.html
>>> http://wiki.apache.org/solr/DataImportHandler
>>>
>>> There are a few things I don't understand. For example, the IBM article
>>> sometimes refers to directories that aren't there, or a little different
>>> from what I have in my extracted copy of Solr (ie
>>> solr-dw/rss/conf/solrconfig.xml). I tried to follow the steps as best I
>>> can,
>>> but as soon as I put the following in solrconfig.xml, the whole thing
>>> breaks:
>>>
>>> <requestHandler name="/dataimport"
>>>  class="org.apache.solr.handler.dataimport.DataImportHandler">
>>> <lst name="defaults">
>>>  <str name="config">rss-data-config.xml</str>
>>> </lst>
>>> </requestHandler>
>>>
>>> Obviously I replace with my own info...One thing I don't quite get is the
>>> data-config.xml file. What exactly is it? I've seen examples of what it
>>> contains but since I don't know enough, I couldn't really adjust it. In
>>> any
>>> case, this is the error I get, which may be because of a misconfigured
>>> data-config.xml...
>> The data-config.xml file describes how to fetch data from various data
>> sources and index it into Solr.
>>
>> The stack trace says that your XML is invalid: the <document> element has
>> no matching </document> end tag.
>>
>> Your best bet is to take one of the sample data-config XML files and
>> modify it.
>>
>> http://svn.apache.org/viewvc/lucene/solr/trunk/example/example-DIH/solr/db/conf/db-data-config.xml?revision=691151&view=markup
>>
>> http://svn.apache.org/viewvc/lucene/solr/trunk/example/example-DIH/solr/rss/conf/rss-data-config.xml?revision=691151&view=markup
>>
>>
>>>
>>> org.apache.solr.handler.dataimport.DataImportHandlerException: Exception
>>> occurred while initializing context at
>>> org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:165)
>>> at
>>> org.apache.solr.handler.dataimport.DataImporter.<init>(DataImporter.java:99)
>>> at
>>> org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:96)
>>> at
>>> org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:388)
>>> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:571) at
>>> org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:122)
>>> at
>>> org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
>>> at
>>> org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:221)
>>> at
>>> org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:302)
>>> at
>>> org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:78)
>>> at
>>> org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3635)
>>> at
>>> org.apache.catalina.core.StandardContext.start(StandardContext.java:4222)
>>> at
>>> org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:760)
>>> at
>>> org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:740)
>>> at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:544)
>>> at
>>> org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:831) at
>>> org.apache.catalina.startup.HostConfig.deployWARs(HostConfig.java:720) at
>>> org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:490) at
>>> org.apache.catalina.startup.HostConfig.start(HostConfig.java:1150) at
>>> org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:311)
>>> at
>>> org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:120)
>>> at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1022)
>>> at
>>> org.apache.catalina.core.StandardHost.start(StandardHost.java:736) at
>>> org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1014) at
>>> org.apache.catalina.core.StandardEngine.start(StandardEngine.java:443) at
>>> org.apache.catalina.core.StandardService.start(StandardService.java:448)
>>> at
>>> org.apache.catalina.core.StandardServer.start(StandardServer.java:700) at
>>> org.apache.catalina.startup.Catalina.start(Catalina.java:552) at
>>> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at
>>> sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source) at
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) at
>>> java.lang.reflect.Method.invoke(Unknown Source) at
>>> org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:295) at
>>> org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:433) Caused by:
>>> org.xml.sax.SAXParseException: The element type "document" must be
>>> terminated by the matching end-tag "</document>". at
>>> com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown
>>> Source)
>>> at
>>> com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown
>>> Source) at
>>> org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:153)
>>>
>>>
>>> It's unclear to me what I need to be using, as in what directories/files
>>> I
>>> need to implement this. Can someone please point me in the right
>>> direction?
>>>
>>> BTW, I'm using Tomcat 5.5 because the prepackaged Jetty simply doesn't
>>> work
>>> for me. It shows that it "started" in the command line, but it hangs, and
>>> doesn't actually work when I try to hit the Solr admin page (page not
>>> found
>>> type error). Jetty itself does start but the project doesn't seem to
>>> deploy...
>>>
>>> I apologize for the long post and if I didn't provide as much information
>>> as
>>> I should. Let me know if you need clarification with anything I said.
>>>
>>> Thank you very much.
>>> --
>>> View this message in context:
>>> http://www.nabble.com/Using-Solr-to-index-a-database-tp23136944p23136944.html
>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>
>>>
>>
>>
>>
>> --
>> --Noble Paul
>>
>>
>
> --
> View this message in context: http://www.nabble.com/Using-Solr-to-index-a-database-tp23136944p23156850.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>



-- 
--Noble Paul

Re: Using Solr to index a database

Posted by ahammad <ah...@gmail.com>.
Thanks for the link...

I'm still a bit unclear as to how it goes. For example, let's say I have a
table called PRODUCTS, and within that table, I have the following columns:
NUMBER (product number)
NAME (product name)
PRICE

How would I index all this information? Here is an example (from the links
you provided) of xml that confuses me:

            <entity name="item" pk="ID" query="select * from item"
    ------->    deltaQuery="select id from item where last_modified >
'${dataimporter.last_index_time}'">
            <field column="NAME" name="name" />
            <field column="NAME" name="nameSort" />
            <field column="NAME" name="alphaNameSort" />

What is that deltaQuery (or even if it was a regular "query" expression)
line for? It seems to me like a sort of filter. What if I don't want to
filter anything and just want to index all the rows?

Cheers




Noble Paul നോബിള്‍  नोब्ळ् wrote:
> 
> On Mon, Apr 20, 2009 at 7:15 PM, ahammad <ah...@gmail.com> wrote:
>>
>> Hello,
>>
>> I've never used Solr before, but I believe that it will suit my current
>> needs with indexing information from a database.
>>
>> I downloaded and extracted Solr 1.3 to play around with it. I've been
>> looking at the following tutorials:
>> http://www.ibm.com/developerworks/java/library/j-solr-update/index.html
>> http://wiki.apache.org/solr/DataImportHandler
>>
>> There are a few things I don't understand. For example, the IBM article
>> sometimes refers to directories that aren't there, or a little different
>> from what I have in my extracted copy of Solr (ie
>> solr-dw/rss/conf/solrconfig.xml). I tried to follow the steps as best I
>> can,
>> but as soon as I put the following in solrconfig.xml, the whole thing
>> breaks:
>>
>> <requestHandler name="/dataimport"
>>  class="org.apache.solr.handler.dataimport.DataImportHandler">
>> <lst name="defaults">
>>  <str name="config">rss-data-config.xml</str>
>> </lst>
>> </requestHandler>
>>
>> Obviously I replace with my own info...One thing I don't quite get is the
>> data-config.xml file. What exactly is it? I've seen examples of what it
>> contains but since I don't know enough, I couldn't really adjust it. In
>> any
>> case, this is the error I get, which may be because of a misconfigured
>> data-config.xml...
> The data-config.xml file describes how to fetch data from various data
> sources and index it into Solr.
> 
> The stack trace says that your XML is invalid: the <document> element has
> no matching </document> end tag.
> 
> Your best bet is to take one of the sample data-config XML files and
> modify it.
> 
> http://svn.apache.org/viewvc/lucene/solr/trunk/example/example-DIH/solr/db/conf/db-data-config.xml?revision=691151&view=markup
> 
> http://svn.apache.org/viewvc/lucene/solr/trunk/example/example-DIH/solr/rss/conf/rss-data-config.xml?revision=691151&view=markup
> 
> 
>>
>> org.apache.solr.handler.dataimport.DataImportHandlerException: Exception
>> occurred while initializing context at
>> org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:165)
>> at
>> org.apache.solr.handler.dataimport.DataImporter.<init>(DataImporter.java:99)
>> at
>> org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:96)
>> at
>> org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:388)
>> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:571) at
>> org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:122)
>> at
>> org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
>> at
>> org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:221)
>> at
>> org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:302)
>> at
>> org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:78)
>> at
>> org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3635)
>> at
>> org.apache.catalina.core.StandardContext.start(StandardContext.java:4222)
>> at
>> org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:760)
>> at
>> org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:740)
>> at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:544)
>> at
>> org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:831) at
>> org.apache.catalina.startup.HostConfig.deployWARs(HostConfig.java:720) at
>> org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:490) at
>> org.apache.catalina.startup.HostConfig.start(HostConfig.java:1150) at
>> org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:311)
>> at
>> org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:120)
>> at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1022)
>> at
>> org.apache.catalina.core.StandardHost.start(StandardHost.java:736) at
>> org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1014) at
>> org.apache.catalina.core.StandardEngine.start(StandardEngine.java:443) at
>> org.apache.catalina.core.StandardService.start(StandardService.java:448)
>> at
>> org.apache.catalina.core.StandardServer.start(StandardServer.java:700) at
>> org.apache.catalina.startup.Catalina.start(Catalina.java:552) at
>> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at
>> sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source) at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) at
>> java.lang.reflect.Method.invoke(Unknown Source) at
>> org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:295) at
>> org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:433) Caused by:
>> org.xml.sax.SAXParseException: The element type "document" must be
>> terminated by the matching end-tag "</document>". at
>> com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown
>> Source)
>> at
>> com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown
>> Source) at
>> org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:153)
>>
>>
>> It's unclear to me what I need to be using, as in what directories/files
>> I
>> need to implement this. Can someone please point me in the right
>> direction?
>>
>> BTW, I'm using Tomcat 5.5 because the prepackaged Jetty simply doesn't
>> work
>> for me. It shows that it "started" in the command line, but it hangs, and
>> doesn't actually work when I try to hit the Solr admin page (page not
>> found
>> type error). Jetty itself does start but the project doesn't seem to
>> deploy...
>>
>> I apologize for the long post and if I didn't provide as much information
>> as
>> I should. Let me know if you need clarification with anything I said.
>>
>> Thank you very much.
>> --
>> View this message in context:
>> http://www.nabble.com/Using-Solr-to-index-a-database-tp23136944p23136944.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
> 
> 
> 
> -- 
> --Noble Paul
> 
> 

-- 
View this message in context: http://www.nabble.com/Using-Solr-to-index-a-database-tp23136944p23156850.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Using Solr to index a database

Posted by Noble Paul നോബിള്‍ नोब्ळ् <no...@gmail.com>.
On Mon, Apr 20, 2009 at 7:15 PM, ahammad <ah...@gmail.com> wrote:
>
> Hello,
>
> I've never used Solr before, but I believe that it will suit my current
> needs with indexing information from a database.
>
> I downloaded and extracted Solr 1.3 to play around with it. I've been
> looking at the following tutorials:
> http://www.ibm.com/developerworks/java/library/j-solr-update/index.html
> http://wiki.apache.org/solr/DataImportHandler
>
> There are a few things I don't understand. For example, the IBM article
> sometimes refers to directories that aren't there, or a little different
> from what I have in my extracted copy of Solr (ie
> solr-dw/rss/conf/solrconfig.xml). I tried to follow the steps as best I can,
> but as soon as I put the following in solrconfig.xml, the whole thing
> breaks:
>
> <requestHandler name="/dataimport"
>  class="org.apache.solr.handler.dataimport.DataImportHandler">
> <lst name="defaults">
>  <str name="config">rss-data-config.xml</str>
> </lst>
> </requestHandler>
>
> Obviously I replace with my own info...One thing I don't quite get is the
> data-config.xml file. What exactly is it? I've seen examples of what it
> contains but since I don't know enough, I couldn't really adjust it. In any
> case, this is the error I get, which may be because of a misconfigured
> data-config.xml...
The data-config.xml file describes how to fetch data from various data
sources and index it into Solr.

The stack trace says that your XML is invalid: the <document> element has
no matching </document> end tag.

Your best bet is to take one of the sample data-config XML files and modify
it.

http://svn.apache.org/viewvc/lucene/solr/trunk/example/example-DIH/solr/db/conf/db-data-config.xml?revision=691151&view=markup

http://svn.apache.org/viewvc/lucene/solr/trunk/example/example-DIH/solr/rss/conf/rss-data-config.xml?revision=691151&view=markup
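
For orientation, the overall shape of a database data-config.xml is roughly as
follows. This is a minimal sketch modelled on the db sample above, not a
drop-in file: the driver class, JDBC URL and user are placeholders, and the
table and field names come from the sample rather than from your database.

```xml
<dataConfig>
    <!-- Placeholder connection details: replace the driver class, URL and
         credentials with the ones for your own database. -->
    <dataSource driver="org.hsqldb.jdbcDriver"
                url="jdbc:hsqldb:/path/to/your/db"
                user="sa" />
    <document>
        <!-- One entity per table or query; each row of the root entity
             becomes one Solr document. -->
        <entity name="item" query="select * from item">
            <!-- column = database column, name = field in schema.xml -->
            <field column="NAME" name="name" />
        </entity>
    </document> <!-- the end tag the parser reported as missing -->
</dataConfig>
```

Every element that is opened has to be closed again, and the file name has to
match the value of the "config" entry in the /dataimport request handler in
solrconfig.xml. The JDBC driver jar also needs to be on Solr's classpath.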


>
> org.apache.solr.handler.dataimport.DataImportHandlerException: Exception occurred while initializing context
>   at org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:165)
>   at org.apache.solr.handler.dataimport.DataImporter.<init>(DataImporter.java:99)
>   at org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:96)
>   at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:388)
>   at org.apache.solr.core.SolrCore.<init>(SolrCore.java:571)
>   at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:122)
>   at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
>   at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:221)
>   at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:302)
>   at org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:78)
>   at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3635)
>   at org.apache.catalina.core.StandardContext.start(StandardContext.java:4222)
>   at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:760)
>   at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:740)
>   at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:544)
>   at org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:831)
>   at org.apache.catalina.startup.HostConfig.deployWARs(HostConfig.java:720)
>   at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:490)
>   at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1150)
>   at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:311)
>   at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:120)
>   at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1022)
>   at org.apache.catalina.core.StandardHost.start(StandardHost.java:736)
>   at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1014)
>   at org.apache.catalina.core.StandardEngine.start(StandardEngine.java:443)
>   at org.apache.catalina.core.StandardService.start(StandardService.java:448)
>   at org.apache.catalina.core.StandardServer.start(StandardServer.java:700)
>   at org.apache.catalina.startup.Catalina.start(Catalina.java:552)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>   at java.lang.reflect.Method.invoke(Unknown Source)
>   at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:295)
>   at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:433)
> Caused by: org.xml.sax.SAXParseException: The element type "document" must be terminated by the matching end-tag "</document>".
>   at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
>   at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)
>   at org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:153)
>
>
> It's unclear to me what I need to be using, as in what directories/files I
> need to implement this. Can someone please point me in the right direction?
>
> BTW, I'm using Tomcat 5.5 because the prepackaged Jetty simply doesn't work
> for me. It shows that it "started" in the command line, but it hangs, and
> doesn't actually work when I try to hit the Solr admin page (page not found
> type error). Jetty itself does start but the project doesn't seem to
> deploy...
>
> I apologize for the long post and if I didn't provide as much information as
> I should. Let me know if you need clarification with anything I said.
>
> Thank you very much.
> --
> View this message in context: http://www.nabble.com/Using-Solr-to-index-a-database-tp23136944p23136944.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>



-- 
--Noble Paul