Posted to solr-user@lucene.apache.org by Kevin Penny <kp...@jobs2web.com> on 2008/11/11 05:15:15 UTC

Newbie Question - getting search results from dataimport request handler

My question is: what is the format of a search request that will return data?
Neither /solr/select?q=developer&qt=dataimport nor /solr/dataimport?q=developer works; I get:
"HTTP ERROR: 404
NOT_FOUND
RequestURI=/solr/dataimport"

I have created a 'dataimport' set that contains data from a SQL database.

I can view metadata from this URL: /solr/dataimport
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
  </lst>
  <lst name="initArgs">
    <lst name="defaults">
      <str name="config">data-config.xml</str>
    </lst>
  </lst>
  <str name="status">idle</str>
  <str name="importResponse"/>
  <lst name="statusMessages">
    <str name="Total Requests made to DataSource">1</str>
    <str name="Total Rows Fetched">10</str>
    <str name="Total Documents Skipped">0</str>
    <str name="Full Dump Started">2008-11-10 21:51:40</str>
    <str name="Time taken ">0:0:4.594</str>
  </lst>
  <str name="WARNING">
    This response format is experimental.  It is likely to change in the future.
  </str>
</response>

I can verify that the data is there by going through /solr/admin/dataimport.jsp, setting 'verbose' to true, and running 'debug now'.
It shows me the XML data set on the right, as follows:

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">4594</int>
  </lst>
  <lst name="initArgs">
    <lst name="defaults">
      <str name="config">data-config.xml</str>
    </lst>
  </lst>
  <str name="command">full-import</str>
  <str name="mode">debug</str>
  <arr name="documents">
    <arr>
      <arr>
        <int>87133</int>
      </arr>
    </arr>
    <arr>
      <arr>
        <int>87134</int>
      </arr>
    </arr>
    <arr>
      <arr>
        <int>87135</int>
      </arr>
    </arr>
    <arr>
      <arr>
        <int>87136</int>
      </arr>
    </arr>
    <arr>
      <arr>
        <int>87137</int>
      </arr>
    </arr>
    <arr>
      <arr>
        <int>87138</int>
      </arr>
    </arr>
    <arr>
      <arr>
        <int>87139</int>
      </arr>
    </arr>
    <arr>
      <arr>
        <int>87140</int>
      </arr>
    </arr>
    <arr>
      <arr>
        <int>87141</int>
      </arr>
    </arr>
    <arr>
      <arr>
        <int>87142</int>
      </arr>
    </arr>
  </arr>
  <lst name="verbose-output">
    <lst name="entity:item">
      <lst name="document#1">
        <str name="query">
          SELECT  j.id      , j.title      ,  FROM      dbo.jobs j WITH (NOLOCK)      LEFT  WHERE j.siteid = 46 and j.active = 1
        </str>
        <str name="time-taken">0:0:4.578</str>
        <str>----------- row #1-------------</str>
        <str name="zip"/>
        <str name="urltitle">Operations Software Developer Job</str>
        <str name="altlocation">SAN ANTONIO, TX, 78229</str>
        <str name="alttitle">Ope…


Here is my solrconfig.xml
…
<requestHandler name="dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
  </lst>
</requestHandler>
…
data-config.xml is in the same directory as solrconfig.xml

My data-config.xml is like any other:
<dataConfig>
    <dataSource driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
                url="jdbc:sqlserver://xxxxxxxx:1433;databaseName=xxxxx" user="xxxxx" password="xxxxx" />
    <document name="jobs">
        <entity name="item" pk="id"
                query="SELECT  j.id
                               , j.title
                               …
                       FROM
                               dbo.jobs …
                       WHERE j.siteid = 46 and j.active = 1"
                deltaQuery="select id from dbo.jobs where lastmodified > '${dataimporter.last_index_time}'">
        </entity>
    </document>
</dataConfig>

I'm using Windows XP with Apache, plus Jetty and Solr 1.3.0.

Thanks



RE: Newbie Question - getting search results from dataimport request handler

Posted by Kevin Penny <kp...@jobs2web.com>.
Excellent!

Thanks a bunch, that did the trick. The fields are all defined now and my terms are being returned nicely. schema.xml was the ticket; not sure how I missed that in the docs.

Kevin

-----Original Message-----
From: Shalin Shekhar Mangar [mailto:shalinmangar@gmail.com]
Sent: Monday, November 10, 2008 11:35 PM
To: solr-user@lucene.apache.org
Subject: Re: Newbie Question - getting search results from dataimport request handler

Hi Kevin,

You need to modify the schema which came with Solr to suit your data. There
should be a schema.xml inside the example/solr/conf directory. Once you do that,
re-import your data.

Take a look at http://wiki.apache.org/solr/SchemaXml

On Tue, Nov 11, 2008 at 10:59 AM, Kevin Penny <kp...@jobs2web.com> wrote:

> I can execute: /solr/select?q=id:87133
>
> So there is data there, however I have not defined any 'Fields' in my
> data-config and am hoping my column names are the 'fields', yet I'm not
> seeing any of them being returned in the 'doc' node below :
>
>
> <response>
>   <lst name="responseHeader">
>     <int name="status">0</int>
>     <int name="QTime">0</int>
>     <lst name="params">
>       <str name="q">id:87133</str>
>     </lst>
>   </lst>
>   <result name="response" numFound="1" start="0">
>     <doc>
>       <str name="id">87133</str>
>       <int name="popularity">0</int>
>       <str name="sku">87133</str>
>       <date name="timestamp">2008-11-11T05:25:29Z</date>
>     </doc>
>   </result>
> </response>
>
> Kevin
>
> -----Original Message-----
> From: Noble Paul നോബിള്‍ नोब्ळ् [mailto:noble.paul@gmail.com]
> Sent: Monday, November 10, 2008 11:23 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Newbie Question - getting search results from dataimport
> request handler
>
> Search for *:* and see whether the index indeed has the documents.
> Once you are sure the docs are there, go through the Lucene query syntax
> and check your query.
>
> On Tue, Nov 11, 2008 at 10:07 AM, Kevin Penny <kp...@jobs2web.com> wrote:
> > Ok so I executed a:
> > solr/dataimport?command=full-import
> > then I checked here:
> > solr/dataimport
> >
> > I get a good XML message (figure 1.1) showing me that 125 records have
> > been indexed (good) and I know one of them contains the word 'job'.
> >
> > I should get results from this query string then, right (figure 1.0 is my
> > result - 0 records found)?
> > solr/select?q=job
> >
> >
> > figure 1.0
> > <response>
> >   <lst name="responseHeader">
> >     <int name="status">0</int>
> >     <int name="QTime">0</int>
> >     <lst name="params">
> >       <str name="q">job</str>
> >     </lst>
> >   </lst>
> >   <result name="response" numFound="0" start="0"/>
> > </response>
> >
> > figure 1.1
> > <response>
> >   <lst name="responseHeader">
> >     <int name="status">0</int>
> >     <int name="QTime">0</int>
> >   </lst>
> >   <lst name="initArgs">
> >     <lst name="defaults">
> >       <str name="config">data-config.xml</str>
> >     </lst>
> >   </lst>
> >   <str name="status">idle</str>
> >   <str name="importResponse"/>
> >   <lst name="statusMessages">
> >     <str name="Total Requests made to DataSource">1</str>
> >     <str name="Total Rows Fetched">125</str>
> >     <str name="Total Documents Skipped">0</str>
> >     <str name="Full Dump Started">2008-11-10 22:33:55</str>
> >     <str name="">
> >       Indexing completed. Added/Updated: 125 documents. Deleted 0 documents.
> >     </str>
> >     <str name="Committed">2008-11-10 22:34:00</str>
> >     <str name="Optimized">2008-11-10 22:34:00</str>
> >     <str name="Time taken ">0:0:5.79</str>
> >   </lst>
> >   <str name="WARNING">
> >     This response format is experimental.  It is likely to change in the future.
> >   </str>
> > </response>
> >
> > Kevin
> >
> > -----Original Message-----
> > From: Noble Paul നോബിള്‍ नोब्ळ् [mailto:noble.paul@gmail.com]
> > Sent: Monday, November 10, 2008 10:30 PM
> > To: solr-user@lucene.apache.org
> > Subject: Re: Newbie Question - getting search results from dataimport
> request handler
> >
> > XML is just an intermediate data format. Solr internally has no XML
> > data. When the data comes out, XML is just another representation of
> > the same data.
> >
> > Whether you put in data using XML or a DB (SQL), it all goes into the
> > same index. Queries must be run against that index using the syntax
> > http://localhost:8983/solr/select/?q=<your-query-goes-here>
> >
> > On Tue, Nov 11, 2008 at 9:55 AM, Kevin Penny <kp...@jobs2web.com>
> wrote:
> >> Ok - and what would that be? (query interface)
> >>
> >> I need the URL format that would work in this situation to return data
> >> from my setup.
> >>
> >> I've gone through the tutorial and used execution strings like:
> >> http://localhost:8983/solr/select/?indent=on&q=video&sort=price+desc
> >> etc however I'm working with sql data and not xml data.
> >>
> >> Thanks
> >>
> >> -----Original Message-----
> >> From: Noble Paul നോബിള്‍ नोब्ळ् [mailto:noble.paul@gmail.com]
> >> Sent: Monday, November 10, 2008 10:18 PM
> >> To: solr-user@lucene.apache.org
> >> Subject: Re: Newbie Question - getting search results from dataimport
> request handler
> >>
> >> You cannot query the DIH. It can only do indexing.
> >> After indexing, you must do the querying on the regular query interface.
> >>
> >>
> >>
> >>
> >> --
> >> --Noble Paul
> >>
> >
> >
> >
> > --
> > --Noble Paul
> >
>
>
>
> --
> --Noble Paul
>



--
Regards,
Shalin Shekhar Mangar.

RE: Newbie Question - getting search results from dataimport request handler

Posted by Lance Norskog <go...@gmail.com>.
Comment added about SOLR-853, for those interested:
https://issues.apache.org/jira/browse/SOLR-853 

-----Original Message-----
From: Noble Paul നോബിള്‍ नोब्ळ् [mailto:noble.paul@gmail.com] 
Sent: Friday, November 21, 2008 8:12 PM
To: solr-user@lucene.apache.org
Subject: Re: Newbie Question - getting search results from dataimport request handler

On Sat, Nov 22, 2008 at 3:10 AM, Chris Hostetter <ho...@fucit.org> wrote:
>
> : > it might be worth considering a new @attribute for <fields> to indicate
> : > that they are going to be used purely as "component" fields (ie: your
> : > first-name/last-name example) and then have DIH pass all non-component
> : > fields along and error if undefined in the schema just like other updating
> : > RequestHandlers do.
> : >
> : > either that, or require that people declare indexed="false"
> : > stored="false" fields in the schema for these intermediate component
> : > fields so that we can properly warn them when DIH is getting data it
> : > doesn't know what to do with -- protecting people from field name typos
> : > and returning errors instead of silently ignoring unexpected input is
> : > fairly important behavior -- especially for new users.
>
> : Actually it is done by DIH. When the dataconfig is loaded, DIH reports
> : this information on the console. Though it is limited, it helps to a
> : certain extent.
>
> Hmmm.
>
> Logging an error and returning successfully (without adding any docs) is
> still inconsistent with the way all other RequestHandlers work: fail the
> request.
>
> I know DIH isn't a typical RequestHandler, but some things (like failing
> on failure) seem like they should be a given.
See SOLR-842.
DIH is an ETL tool pretending to be a RequestHandler. Originally it was built to run outside of Solr using SolrJ. For better integration and ease of use we changed it later.

SOLR-853 aims to achieve the original goal.

The goal of DIH is to become a full-featured ETL tool.



>
>
>
> -Hoss
>
>



--
--Noble Paul


Re: Newbie Question - getting search results from dataimport request handler

Posted by Noble Paul നോബിള്‍ नोब्ळ् <no...@gmail.com>.
On Mon, Nov 24, 2008 at 7:25 AM, Chris Hostetter
<ho...@fucit.org> wrote:
>
> : > Logging an error and returning successfully (without adding any docs) is
> : > still inconsistent with the way all other RequestHandlers work: fail the
> : > request.
> : >
> : > I know DIH isn't a typical RequestHandler, but some things (like failing
> : > on failure) seem like they should be a given.
> : See SOLR-842.
> : DIH is an ETL tool pretending to be a RequestHandler. Originally it
> : was built to run outside of Solr using SolrJ. For better integration
> : and ease of use we changed it later.
> :
> : SOLR-853 aims to achieve the original goal.
> :
> : The goal of DIH is to become a full-featured ETL tool.
>
> Understood ... but shouldn't ETL Tools "fail on failure" ?
>
> I mean forget Solr for a minute:   If i've got a standalone ETL Tool that
> runs as a daemon, and on startup it logs some error messages because i've
> got bad configs (and it can tell the fields i've listed for my
> 'target' system don't exist there) should it report "success" every time i
> push data to it?
>
> Based on this thread, that's what it sounds like DIH is doing right now in
> situations like this.
>
> If nothing else, we could give DIH a way to check the global
> <abortOnConfigurationError> value from solrconfig.xml and make its
> decision that way
We considered these. The severity of an error is very much specific to
the source of data. It is very unlikely that a DB source throws
errors. With XML data sources, if say one or two out of x URLs are wrong,
would the user wish to ignore them or abort the entire import?


So we decided to give more options and leave the implementation to
the EntityProcessor. Moreover, the default is set to onError=abort.
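
As a rough sketch of what this could look like in data-config.xml: the fragment below assumes the onError attribute from SOLR-842 is set per entity and accepts abort, skip and continue (values recalled from the DIH documentation, and I believe the attribute postdates the 1.3.0 release, so check it against your version). The entity is the one from earlier in this thread with a simplified query; choosing skip is purely illustrative.

```xml
<!-- Sketch only: assumes onError (SOLR-842) is available in your DIH version -->
<entity name="item" pk="id"
        query="SELECT j.id, j.title FROM dbo.jobs j WHERE j.siteid = 46 AND j.active = 1"
        onError="skip">
    <!-- abort (the default): fail the whole import
         skip: drop the failing document and carry on
         continue: carry on as if the error had not happened -->
</entity>
```

Because the attribute sits on the entity, a DB-backed entity can keep the strict default while an entity reading from a list of URLs can be made more tolerant.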


>
>
>
> -Hoss
>
>



-- 
--Noble Paul

Re: Newbie Question - getting search results from dataimport request handler

Posted by Chris Hostetter <ho...@fucit.org>.
: > Logging an error and returning successfully (without adding any docs) is
: > still inconsistent with the way all other RequestHandlers work: fail the
: > request.
: >
: > I know DIH isn't a typical RequestHandler, but some things (like failing
: > on failure) seem like they should be a given.
: See SOLR-842.
: DIH is an ETL tool pretending to be a RequestHandler. Originally it
: was built to run outside of Solr using SolrJ. For better integration
: and ease of use we changed it later.
: 
: SOLR-853 aims to achieve the original goal.
: 
: The goal of DIH is to become a full-featured ETL tool.

Understood ... but shouldn't ETL Tools "fail on failure" ?

I mean forget Solr for a minute:   If i've got a standalone ETL Tool that
runs as a daemon, and on startup it logs some error messages because i've
got bad configs (and it can tell the fields i've listed for my
'target' system don't exist there) should it report "success" every time i
push data to it?

Based on this thread, that's what it sounds like DIH is doing right now in
situations like this.

If nothing else, we could give DIH a way to check the global
<abortOnConfigurationError> value from solrconfig.xml and make its
decision that way.



-Hoss


RE: Newbie Question - getting search results from dataimport request handler

Posted by "Norskog, Lance" <la...@divvio.com>.
As part of the ETL effort, please consider how to integrate with these two open-source ETL systems. I'm not asking for an implementation, just suggesting that having a concrete context will help you in the architecture phase.

http://kettle.pentaho.org/
http://www.talend.com/products-data-integration/talend-open-studio.php 

Thanks,

Lance

-----Original Message-----
From: Noble Paul നോബിള്‍ नोब्ळ् [mailto:noble.paul@gmail.com] 
Sent: Friday, November 21, 2008 8:12 PM
To: solr-user@lucene.apache.org
Subject: Re: Newbie Question - getting search results from dataimport request handler

On Sat, Nov 22, 2008 at 3:10 AM, Chris Hostetter <ho...@fucit.org> wrote:
>
> : > it might be worth considering a new @attribute for <fields> to indicate
> : > that they are going to be used purely as "component" fields (ie: your
> : > first-name/last-name example) and then have DIH pass all non-component
> : > fields along and error if undefined in the schema just like other updating
> : > RequestHandlers do.
> : >
> : > either that, or require that people declare indexed="false"
> : > stored="false" fields in the schema for these intermediate component
> : > fields so that we can properly warn them when DIH is getting data it
> : > doesn't know what to do with -- protecting people from field name typos
> : > and returning errors instead of silently ignoring unexpected input is
> : > fairly important behavior -- especially for new users.
>
> : Actually it is done by DIH. When the dataconfig is loaded, DIH reports
> : this information on the console. Though it is limited, it helps to a
> : certain extent.
>
> Hmmm.
>
> Logging an error and returning successfully (without adding any docs) is
> still inconsistent with the way all other RequestHandlers work: fail the
> request.
>
> I know DIH isn't a typical RequestHandler, but some things (like failing
> on failure) seem like they should be a given.
See SOLR-842.
DIH is an ETL tool pretending to be a RequestHandler. Originally it was built to run outside of Solr using SolrJ. For better integration and ease of use we changed it later.

SOLR-853 aims to achieve the original goal.

The goal of DIH is to become a full-featured ETL tool.



>
>
>
> -Hoss
>
>



--
--Noble Paul

Re: Newbie Question - getting search results from dataimport request handler

Posted by Noble Paul നോബിള്‍ नोब्ळ् <no...@gmail.com>.
On Sat, Nov 22, 2008 at 3:10 AM, Chris Hostetter
<ho...@fucit.org> wrote:
>
> : > it might be worth considering a new @attribute for <fields> to indicate
> : > that they are going to be used purely as "component" fields (ie: your
> : > first-name/last-name example) and then have DIH pass all non-component
> : > fields along and error if undefined in the schema just like other updating
> : > RequestHandlers do.
> : >
> : > either that, or require that people declare indexed="false"
> : > stored="false" fields in the schema for these intermediate component
> : > fields so that we can properly warn them when DIH is getting data it
> : > doesn't know what to do with -- protecting people from field name typos
> : > and returning errors instead of silently ignoring unexpected input is
> : > fairly important behavior -- especially for new users.
>
> : Actually it is done by DIH. When the dataconfig is loaded, DIH reports
> : this information on the console. Though it is limited, it helps to a
> : certain extent.
>
> Hmmm.
>
> Logging an error and returning successfully (without adding any docs) is
> still inconsistent with the way all other RequestHandlers work: fail the
> request.
>
> I know DIH isn't a typical RequestHandler, but some things (like failing
> on failure) seem like they should be a given.
See SOLR-842.
DIH is an ETL tool pretending to be a RequestHandler. Originally it
was built to run outside of Solr using SolrJ. For better integration
and ease of use we changed it later.

SOLR-853 aims to achieve the original goal.

The goal of DIH is to become a full-featured ETL tool.



>
>
>
> -Hoss
>
>



-- 
--Noble Paul

Re: Newbie Question - getting search results from dataimport request handler

Posted by Chris Hostetter <ho...@fucit.org>.
: > it might be worth considering a new @attribute for <fields> to indicate
: > that they are going to be used purely as "component" fields (ie: your
: > first-name/last-name example) and then have DIH pass all non-component
: > fields along and error if undefined in the schema just like other updating
: > RequestHandlers do.
: >
: > either that, or require that people declare indexed="false"
: > stored="false" fields in the schema for these intermediate component
: > fields so that we can properly warn them when DIH is getting data it
: > doesn't know what to do with -- protecting people from field name typos
: > and returning errors instead of silently ignoring unexpected input is
: > fairly important behavior -- especially for new users.

: Actually it is done by DIH. When the dataconfig is loaded, DIH reports
: this information on the console. Though it is limited, it helps to a
: certain extent.

Hmmm. 

Logging an error and returning successfully (without adding any docs) is 
still inconsistent with the way all other RequestHandlers work: fail the 
request.

I know DIH isn't a typical RequestHandler, but some things (like failing 
on failure) seem like they should be a given.



-Hoss


Re: Newbie Question - getting search results from dataimport request handler

Posted by Noble Paul നോബിള്‍ नोब्ळ् <no...@gmail.com>.
On Sat, Nov 15, 2008 at 6:33 AM, Chris Hostetter
<ho...@fucit.org> wrote:
>
> : > Is there a bug in DIH that caused these unrecognized fields to be ignored,
> : > or is it possible the errors were logged (by DUH2 maybe? ... it's been a
> : > while since i looked at the update code) but DIH didn't notice them and
> : > reported success anyway?
> :
> :
> : If the data contains a field name which is not defined in the schema.xml,
> : then DIH ignores it. This is a very common use-case where you may want to
> : process intermediate data and add it to a completely new field. For example,
> : if you have first-name and last-name coming in from DB and you want to
> : combine them into a new field "name" with TemplateTransformer.
>
> Ahhhh.... i see.  so it's a feature that sometimes acts like a bug :)
>
> it might be worth considering a new @attribute for <fields> to indicate
> that they are going to be used purely as "component" fields (ie: your
> first-name/last-name example) and then have DIH pass all non-component
> fields along and error if undefined in the schema just like other updating
> RequestHandlers do.
>
> either that, or require that people declare indexed="false"
> stored="false" fields in the schema for these intermediate component
> fields so that we can properly warn them when DIH is getting data it
> doesn't know what to do with -- protecting people from field name typos
> and returning errors instead of silently ignoring unexpected input is
> fairly important behavior -- especially for new users.
>
Actually it is done by DIH. When the dataconfig is loaded, DIH reports
this information on the console. Though it is limited, it helps to a
certain extent.
> -Hoss
>
>



-- 
--Noble Paul

Re: Newbie Question - getting search results from dataimport request handler

Posted by Chris Hostetter <ho...@fucit.org>.
: > Is there a bug in DIH that caused these unrecognized fields to be ignored,
: > or is it possible the errors were logged (by DUH2 maybe? ... it's been a
: > while since i looked at the update code) but DIH didn't notice them and
: > reported success anyway?
: 
: 
: If the data contains a field name which is not defined in the schema.xml,
: then DIH ignores it. This is a very common use-case where you may want to
: process intermediate data and add it to a completely new field. For example,
: if you have first-name and last-name coming in from DB and you want to
: combine them into a new field "name" with TemplateTransformer.

Ahhhh.... i see.  so it's a feature that sometimes acts like a bug :)

it might be worth considering a new @attribute for <fields> to indicate 
that they are going to be used purely as "component" fields (ie: your 
first-name/last-name example) and then have DIH pass all non-component 
fields along and error if undefined in the schema just like other updating 
RequestHandlers do.

either that, or require that people declare indexed="false"
stored="false" fields in the schema for these intermediate component
fields so that we can properly warn them when DIH is getting data it
doesn't know what to do with -- protecting people from field name typos
and returning errors instead of silently ignoring unexpected input is
fairly important behavior -- especially for new users.

-Hoss


Re: Newbie Question - getting search results from dataimport request handler

Posted by Shalin Shekhar Mangar <sh...@gmail.com>.
On Thu, Nov 13, 2008 at 3:52 AM, Chris Hostetter
<ho...@fucit.org> wrote:

>
> : You need to modify the schema which came with Solr to suit your data.
> There
>
> If i'm understanding this thread correctly, DIH ran "successfully", docs
> were created, some fields were stored and indexed (because they did exist
> in the schema) but other fields the user was attempting to create didn't
> exist...
>
> : > So there is data there, however I have not defined any 'Fields' in my
> : > data-config and am hoping my column names are the 'fields', yet I'm not
> : > seeing any of them being returned in the 'doc' node below
>
> ...presumably the example schema.xml was being used, where there is no
> <dynamicField name="*" ... /> ... so shouldn't the unrecognized field
> names have generated an error during indexing?
>
> Is there a bug in DIH that caused these unrecognized fields to be ignored,
> or is it possible the errors were logged (by DUH2 maybe? ... it's been a
> while since i looked at the update code) but DIH didn't notice them and
> reported success anyway?


If the data contains a field name which is not defined in the schema.xml,
then DIH ignores it. This supports a very common use-case where you want to
process intermediate data and add it to a completely new field. For example,
you might have first-name and last-name coming in from the DB and want to
combine them into a new field "name" with TemplateTransformer.
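
A minimal sketch of that case in data-config.xml follows. It assumes hypothetical columns first_name and last_name in a hypothetical people table, and a "name" field declared in schema.xml; the entity name and query are made up for illustration only.

```xml
<entity name="person" transformer="TemplateTransformer"
        query="SELECT id, first_name, last_name FROM people">
    <!-- first_name and last_name are assumed to be absent from schema.xml,
         so DIH leaves them out of the document; only the combined "name"
         field (and id) ends up in the index -->
    <field column="name" template="${person.first_name} ${person.last_name}" />
</entity>
```

If DIH raised an error for every column missing from the schema, the two intermediate columns here would have to be declared as fields even though they are never meant to be searched or stored.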

-- 
Regards,
Shalin Shekhar Mangar.

Re: Newbie Question - getting search results from dataimport request handler

Posted by Noble Paul നോബിള്‍ नोब्ळ् <no...@gmail.com>.
On Thu, Nov 13, 2008 at 3:52 AM, Chris Hostetter
<ho...@fucit.org> wrote:
>
> : You need to modify the schema which came with Solr to suit your data. There
>
> If i'm understanding this thread correctly, DIH ran "successfully", docs
> were created, some fields were stored and indexed (because they did exist
> in the schema) but other fields the user was attempting to create didn't
> exist...
>
> : > So there is data there, however I have not defined any 'Fields' in my
> : > data-config and am hoping my column names are the 'fields', yet I'm not
> : > seeing any of them being returned in the 'doc' node below
>
> ...presumably the example schema.xml was being used, where there is no
> <dynamicField name="*" ... /> ... so shouldn't the unrecognized field
> names have generated an error during indexing?
>
> Is there a bug in DIH that caused these unrecognized fields to be ignored,
> or is it possible the errors were logged (by DUH2 maybe? ... it's been a
> while since i looked at the update code) but DIH didn't notice them and
> reported success anyway?

DIH in the 1.3 release had a bug with handling dynamic fields (SOLR-742)
>
> I smell a bug somewhere, i just don't know enough about the code to know
> where the smell is coming from.
>
> -Hoss
>
>



-- 
--Noble Paul

Re: Newbie Question - getting search results from dataimport request handler

Posted by Chris Hostetter <ho...@fucit.org>.
: You need to modify the schema which came with Solr to suit your data. There

If i'm understanding this thread correctly, DIH ran "successfully", docs 
were created, some fields were stored and indexed (because they did exist 
in the schema) but other fields the user was attempting to create didn't 
exist...

: > So there is data there, however I have not defined any 'Fields' in my
: > data-config and am hoping my column names are the 'fields', yet I'm not
: > seeing any of them being returned in the 'doc' node below 

...presumably the example schema.xml was being used, where there is no 
<dynamicField name="*" ... /> ... so shouldn't the unrecognized field 
names have generated an error during indexing?
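
For reference, a catch-all declaration of the kind mentioned above might look
like the sketch below. This is an illustration only: the type name "ignored"
is an assumption and may or may not already be defined in a given schema.xml.
With such a rule in place, unrecognized field names would be silently dropped
rather than rejected.

```xml
<!-- Sketch only; assumes a field type that neither indexes nor stores its input. -->
<!-- The type name "ignored" is an assumption; define it if your schema lacks it. -->
<fieldType name="ignored" class="solr.StrField" indexed="false" stored="false"/>

<!-- Catch-all: any field name not declared elsewhere matches this pattern. -->
<dynamicField name="*" type="ignored" multiValued="true"/>
```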

Is there a bug in DIH that caused these unrecognized fields to be ignored,
or is it possible the errors were logged (by DUH2 maybe? ... it's been a 
while since i looked at the update code) but DIH didn't notice them and 
reported success anyway?

I smell a bug somewhere, i just don't know enough about the code to know 
where the smell is coming from.

-Hoss


Re: Newbie Question - getting search results from dataimport request handler

Posted by Shalin Shekhar Mangar <sh...@gmail.com>.
Hi Kevin,

You need to modify the schema which came with Solr to suit your data. There
should be a schema.xml inside the example/solr/conf directory. Once you have
done that, re-import your data.
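
For illustration, here is a minimal sketch of how explicit field mappings
could look in data-config.xml. It assumes the 'item' entity from your config
and the column names shown in your debug output (title, urltitle,
altlocation). The query is left elided as in your mail. Whether you need
explicit mappings at all depends on whether the column names already match
the schema field names.

```xml
<!-- Sketch only: explicit column-to-field mappings inside the existing entity. -->
<!-- The query text is elided here, and deltaQuery is omitted for brevity. -->
<entity name="item" pk="id" query="...">
  <!-- column = name returned by the SQL query, name = field declared in schema.xml -->
  <field column="id"          name="id"/>
  <field column="title"       name="title"/>
  <field column="urltitle"    name="urltitle"/>
  <field column="altlocation" name="altlocation"/>
</entity>
```

After changing the schema, the import can be re-run with
solr/dataimport?command=full-import, as used elsewhere in this thread.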

Take a look at http://wiki.apache.org/solr/SchemaXml
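
As a rough sketch of what the schema change could look like, the fragment
below declares fields named after the columns in your debug output. The type
names "string" and "text" and the default search field "text" are assumed to
be unchanged from the example schema; please check them against your copy.
If a field of the same name already exists, adjust it rather than adding a
duplicate.

```xml
<!-- Sketch only: goes inside the <fields> section of schema.xml. -->
<!-- Field names mirror the column names from the DIH debug output. -->
<field name="title"       type="text"   indexed="true" stored="true"/>
<field name="urltitle"    type="text"   indexed="true" stored="true"/>
<field name="alttitle"    type="text"   indexed="true" stored="true"/>
<field name="altlocation" type="text"   indexed="true" stored="true"/>
<field name="zip"         type="string" indexed="true" stored="true"/>

<!-- Outside <fields>: copy into the default search field so that a bare -->
<!-- query such as q=job has something to match against. -->
<copyField source="title"    dest="text"/>
<copyField source="urltitle" dest="text"/>
```

A bare query like solr/select?q=job searches only the default search field,
which would explain getting 0 results while q=id:87133 finds a document.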

On Tue, Nov 11, 2008 at 10:59 AM, Kevin Penny <kp...@jobs2web.com> wrote:

> I can execute: /solr/select?q=id:87133
>
> So there is data there, however I have not defined any 'Fields' in my
> data-config and am hoping my column names are the 'fields', yet I'm not
> seeing any of them being returned in the 'doc' node below :
>
>
> <response>
> -
> <lst name="responseHeader">
> <int name="status">0</int>
> <int name="QTime">0</int>
> -
> <lst name="params">
> <str name="q">id:87133</str>
> </lst>
> </lst>
> -
> <result name="response" numFound="1" start="0">
> -
> <doc>
> <str name="id">87133</str>
> <int name="popularity">0</int>
> <str name="sku">87133</str>
> <date name="timestamp">2008-11-11T05:25:29Z</date>
> </doc>
> </result>
> </response>
>
> Kevin
>
> -----Original Message-----
> From: Noble Paul നോബിള്‍ नोब्ळ् [mailto:noble.paul@gmail.com]
> Sent: Monday, November 10, 2008 11:23 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Newbie Question - getting search results from dataimport
> request handler
>
> Search for *:* and see if the index indeed has the documents.
> Once you are sure the docs are there, go through the Lucene query syntax
> and check your query.
>
> On Tue, Nov 11, 2008 at 10:07 AM, Kevin Penny <kp...@jobs2web.com> wrote:
> > Ok so I executed a:
> > solr/dataimport?command=full-import
> > then I checked here:
> > solr/dataimport
> >
> > I get a good xml message (figure 1.1) showing me that 125 records have
> > been indexed (good) and I know one of them contains the word 'job'.
> >
> > I should get results from this query string then, right (figure 1.0 is my
> > result - 0 records found)?
> > solr/select?q=job
> >
> >
> > figure 1.0
> > <response>
> > −
> > <lst name="responseHeader">
> > <int name="status">0</int>
> > <int name="QTime">0</int>
> > −
> > <lst name="params">
> > <str name="q">job</str>
> > </lst>
> > </lst>
> > <result name="response" numFound="0" start="0"/>
> > </response>
> >
> > figure 1.1
> > <response>
> > −
> > <lst name="responseHeader">
> > <int name="status">0</int>
> > <int name="QTime">0</int>
> > </lst>
> > −
> > <lst name="initArgs">
> > −
> > <lst name="defaults">
> > <str name="config">data-config.xml</str>
> > </lst>
> > </lst>
> > <str name="status">idle</str>
> > <str name="importResponse"/>
> > −
> > <lst name="statusMessages">
> > <str name="Total Requests made to DataSource">1</str>
> > <str name="Total Rows Fetched">125</str>
> > <str name="Total Documents Skipped">0</str>
> > <str name="Full Dump Started">2008-11-10 22:33:55</str>
> > −
> > <str name="">
> > Indexing completed. Added/Updated: 125 documents. Deleted 0 documents.
> > </str>
> > <str name="Committed">2008-11-10 22:34:00</str>
> > <str name="Optimized">2008-11-10 22:34:00</str>
> > <str name="Time taken ">0:0:5.79</str>
> > </lst>
> > −
> > <str name="WARNING">
> > This response format is experimental.  It is likely to change in the
> > future.
> > </str>
> > </response>
> >
> > Kevin
> >
> > -----Original Message-----
> > From: Noble Paul നോബിള്‍ नोब्ळ् [mailto:noble.paul@gmail.com]
> > Sent: Monday, November 10, 2008 10:30 PM
> > To: solr-user@lucene.apache.org
> > Subject: Re: Newbie Question - getting search results from dataimport
> > request handler
> >
> > XML is just an intermediate data format; Solr internally has no XML
> > data. When the data comes out, XML is just another representation of
> > the same data.
> >
> > Whether you put in data using XML or DB (SQL), it all goes into the
> > same index. Queries must be run against that index using the syntax
> > http://localhost:8983/solr/select/?q=<your-query-goes-here>
> >
> > On Tue, Nov 11, 2008 at 9:55 AM, Kevin Penny <kp...@jobs2web.com>
> > wrote:
> >> Ok - and what would that be? (query interface)
> >>
> >> I need the URL format that would work in this situation to return data
> >> from my setup.
> >>
> >> I've gone through the tutorial and used execution strings like:
> >> http://localhost:8983/solr/select/?indent=on&q=video&sort=price+desc
> >> etc however I'm working with sql data and not xml data.
> >>
> >> Thanks
> >>
> >> -----Original Message-----
> >> From: Noble Paul നോബിള്‍ नोब्ळ् [mailto:noble.paul@gmail.com]
> >> Sent: Monday, November 10, 2008 10:18 PM
> >> To: solr-user@lucene.apache.org
> >> Subject: Re: Newbie Question - getting search results from dataimport
> >> request handler
> >>
> >> You cannot query the DIH. It can only do indexing.
> >> After indexing, you must do the querying on the regular query interface.
> >>
> >>
> >>
> >>
> >> --
> >> --Noble Paul
> >>
> >
> >
> >
> > --
> > --Noble Paul
> >
>
>
>
> --
> --Noble Paul
>



-- 
Regards,
Shalin Shekhar Mangar.

RE: Newbie Question - getting search results from dataimport request handler

Posted by Kevin Penny <kp...@jobs2web.com>.
I can execute: /solr/select?q=id:87133

So there is data there. However, I have not defined any 'Fields' in my data-config and am hoping my column names become the 'fields', yet I'm not seeing any of them returned in the 'doc' node below:


<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
    <lst name="params">
      <str name="q">id:87133</str>
    </lst>
  </lst>
  <result name="response" numFound="1" start="0">
    <doc>
      <str name="id">87133</str>
      <int name="popularity">0</int>
      <str name="sku">87133</str>
      <date name="timestamp">2008-11-11T05:25:29Z</date>
    </doc>
  </result>
</response>

Kevin

-----Original Message-----
From: Noble Paul നോബിള്‍ नोब्ळ् [mailto:noble.paul@gmail.com]
Sent: Monday, November 10, 2008 11:23 PM
To: solr-user@lucene.apache.org
Subject: Re: Newbie Question - getting search results from dataimport request handler

Search for *:* and see if the index indeed has the documents.
Once you are sure the docs are there, go through the Lucene query syntax
and check your query.

On Tue, Nov 11, 2008 at 10:07 AM, Kevin Penny <kp...@jobs2web.com> wrote:
> Ok so I executed a:
> solr/dataimport?command=full-import
> then I checked here:
> solr/dataimport
>
> I get a good xml message (figure 1.1) showing me that 125 records have been indexed (good) and I know one of them contains the word 'job'.
>
> I should get results from this query string then, right (figure 1.0 is my result - 0 records found)?
> solr/select?q=job
>
>
> figure 1.0
> <response>
> −
> <lst name="responseHeader">
> <int name="status">0</int>
> <int name="QTime">0</int>
> −
> <lst name="params">
> <str name="q">job</str>
> </lst>
> </lst>
> <result name="response" numFound="0" start="0"/>
> </response>
>
> figure 1.1
> <response>
> −
> <lst name="responseHeader">
> <int name="status">0</int>
> <int name="QTime">0</int>
> </lst>
> −
> <lst name="initArgs">
> −
> <lst name="defaults">
> <str name="config">data-config.xml</str>
> </lst>
> </lst>
> <str name="status">idle</str>
> <str name="importResponse"/>
> −
> <lst name="statusMessages">
> <str name="Total Requests made to DataSource">1</str>
> <str name="Total Rows Fetched">125</str>
> <str name="Total Documents Skipped">0</str>
> <str name="Full Dump Started">2008-11-10 22:33:55</str>
> −
> <str name="">
> Indexing completed. Added/Updated: 125 documents. Deleted 0 documents.
> </str>
> <str name="Committed">2008-11-10 22:34:00</str>
> <str name="Optimized">2008-11-10 22:34:00</str>
> <str name="Time taken ">0:0:5.79</str>
> </lst>
> −
> <str name="WARNING">
> This response format is experimental.  It is likely to change in the future.
> </str>
> </response>
>
> Kevin
>
> -----Original Message-----
> From: Noble Paul നോബിള്‍ नोब्ळ् [mailto:noble.paul@gmail.com]
> Sent: Monday, November 10, 2008 10:30 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Newbie Question - getting search results from dataimport request handler
>
> XML is just an intermediate data format; Solr internally has no XML
> data. When the data comes out, XML is just another representation of
> the same data.
>
> Whether you put in data using XML or DB (SQL), it all goes into the
> same index. Queries must be run against that index using the syntax
> http://localhost:8983/solr/select/?q=<your-query-goes-here>
>
> On Tue, Nov 11, 2008 at 9:55 AM, Kevin Penny <kp...@jobs2web.com> wrote:
>> Ok - and what would that be? (query interface)
>>
>> I need the URL format that would work in this situation to return data from my setup.
>>
>> I've gone through the tutorial and used execution strings like:
>> http://localhost:8983/solr/select/?indent=on&q=video&sort=price+desc
>> etc however I'm working with sql data and not xml data.
>>
>> Thanks
>>
>> -----Original Message-----
>> From: Noble Paul നോബിള്‍ नोब्ळ् [mailto:noble.paul@gmail.com]
>> Sent: Monday, November 10, 2008 10:18 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Newbie Question - getting search results from dataimport request handler
>>
>> You cannot query the DIH. It can only do indexing.
>> After indexing, you must do the querying on the regular query interface.
>>
>>
>>
>>
>> --
>> --Noble Paul
>>
>
>
>
> --
> --Noble Paul
>



--
--Noble Paul

Re: Newbie Question - getting search results from dataimport request handler

Posted by Noble Paul നോബിള്‍ नोब्ळ् <no...@gmail.com>.
Search for *:* and see if the index indeed has the documents.
Once you are sure the docs are there, go through the Lucene query syntax
and check your query.

On Tue, Nov 11, 2008 at 10:07 AM, Kevin Penny <kp...@jobs2web.com> wrote:
> Ok so I executed a:
> solr/dataimport?command=full-import
> then I checked here:
> solr/dataimport
>
> I get a good xml message (figure 1.1) showing me that 125 records have been indexed (good) and I know one of them contains the word 'job'.
>
> I should get results from this query string then, right (figure 1.0 is my result - 0 records found)?
> solr/select?q=job
>
>
> figure 1.0
> <response>
> −
> <lst name="responseHeader">
> <int name="status">0</int>
> <int name="QTime">0</int>
> −
> <lst name="params">
> <str name="q">job</str>
> </lst>
> </lst>
> <result name="response" numFound="0" start="0"/>
> </response>
>
> figure 1.1
> <response>
> −
> <lst name="responseHeader">
> <int name="status">0</int>
> <int name="QTime">0</int>
> </lst>
> −
> <lst name="initArgs">
> −
> <lst name="defaults">
> <str name="config">data-config.xml</str>
> </lst>
> </lst>
> <str name="status">idle</str>
> <str name="importResponse"/>
> −
> <lst name="statusMessages">
> <str name="Total Requests made to DataSource">1</str>
> <str name="Total Rows Fetched">125</str>
> <str name="Total Documents Skipped">0</str>
> <str name="Full Dump Started">2008-11-10 22:33:55</str>
> −
> <str name="">
> Indexing completed. Added/Updated: 125 documents. Deleted 0 documents.
> </str>
> <str name="Committed">2008-11-10 22:34:00</str>
> <str name="Optimized">2008-11-10 22:34:00</str>
> <str name="Time taken ">0:0:5.79</str>
> </lst>
> −
> <str name="WARNING">
> This response format is experimental.  It is likely to change in the future.
> </str>
> </response>
>
> Kevin
>
> -----Original Message-----
> From: Noble Paul നോബിള്‍ नोब्ळ् [mailto:noble.paul@gmail.com]
> Sent: Monday, November 10, 2008 10:30 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Newbie Question - getting search results from dataimport request handler
>
> XML is just an intermediate data format; Solr internally has no XML
> data. When the data comes out, XML is just another representation of
> the same data.
>
> Whether you put in data using XML or DB (SQL), it all goes into the
> same index. Queries must be run against that index using the syntax
> http://localhost:8983/solr/select/?q=<your-query-goes-here>
>
> On Tue, Nov 11, 2008 at 9:55 AM, Kevin Penny <kp...@jobs2web.com> wrote:
>> Ok - and what would that be? (query interface)
>>
>> I need the URL format that would work in this situation to return data from my setup.
>>
>> I've gone through the tutorial and used execution strings like:
>> http://localhost:8983/solr/select/?indent=on&q=video&sort=price+desc
>> etc however I'm working with sql data and not xml data.
>>
>> Thanks
>>
>> -----Original Message-----
>> From: Noble Paul നോബിള്‍ नोब्ळ् [mailto:noble.paul@gmail.com]
>> Sent: Monday, November 10, 2008 10:18 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Newbie Question - getting search results from dataimport request handler
>>
>> You cannot query the DIH. It can only do indexing.
>> After indexing, you must do the querying on the regular query interface.
>>
>>
>>
>>
>> --
>> --Noble Paul
>>
>
>
>
> --
> --Noble Paul
>



-- 
--Noble Paul

RE: Newbie Question - getting search results from dataimport request handler

Posted by Kevin Penny <kp...@jobs2web.com>.
Ok so I executed a:
solr/dataimport?command=full-import
then I checked here:
solr/dataimport

I get a good xml message (figure 1.1) showing me that 125 records have been indexed (good) and I know one of them contains the word 'job'.

I should get results from this query string then, right (figure 1.0 is my result - 0 records found)?
solr/select?q=job


figure 1.0
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
    <lst name="params">
      <str name="q">job</str>
    </lst>
  </lst>
  <result name="response" numFound="0" start="0"/>
</response>

figure 1.1
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
  </lst>
  <lst name="initArgs">
    <lst name="defaults">
      <str name="config">data-config.xml</str>
    </lst>
  </lst>
  <str name="status">idle</str>
  <str name="importResponse"/>
  <lst name="statusMessages">
    <str name="Total Requests made to DataSource">1</str>
    <str name="Total Rows Fetched">125</str>
    <str name="Total Documents Skipped">0</str>
    <str name="Full Dump Started">2008-11-10 22:33:55</str>
    <str name="">
    Indexing completed. Added/Updated: 125 documents. Deleted 0 documents.
    </str>
    <str name="Committed">2008-11-10 22:34:00</str>
    <str name="Optimized">2008-11-10 22:34:00</str>
    <str name="Time taken ">0:0:5.79</str>
  </lst>
  <str name="WARNING">
  This response format is experimental.  It is likely to change in the future.
  </str>
</response>

Kevin

-----Original Message-----
From: Noble Paul നോബിള്‍ नोब्ळ् [mailto:noble.paul@gmail.com]
Sent: Monday, November 10, 2008 10:30 PM
To: solr-user@lucene.apache.org
Subject: Re: Newbie Question - getting search results from dataimport request handler

XML is just an intermediate data format; Solr internally has no XML
data. When the data comes out, XML is just another representation of
the same data.

Whether you put in data using XML or DB (SQL), it all goes into the
same index. Queries must be run against that index using the syntax
http://localhost:8983/solr/select/?q=<your-query-goes-here>

On Tue, Nov 11, 2008 at 9:55 AM, Kevin Penny <kp...@jobs2web.com> wrote:
> Ok - and what would that be? (query interface)
>
> I need the URL format that would work in this situation to return data from my setup.
>
> I've gone through the tutorial and used execution strings like:
> http://localhost:8983/solr/select/?indent=on&q=video&sort=price+desc
> etc however I'm working with sql data and not xml data.
>
> Thanks
>
> -----Original Message-----
> From: Noble Paul നോബിള്‍ नोब्ळ् [mailto:noble.paul@gmail.com]
> Sent: Monday, November 10, 2008 10:18 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Newbie Question - getting search results from dataimport request handler
>
> You cannot query the DIH. It can only do indexing.
> After indexing, you must do the querying on the regular query interface.
>
>
>
>
> --
> --Noble Paul
>



--
--Noble Paul

Re: Newbie Question - getting search results from dataimport request handler

Posted by Noble Paul നോബിള്‍ नोब्ळ् <no...@gmail.com>.
XML is just an intermediate data format. Internally, Solr holds no XML
data; when results come back, XML is just another representation of
the same data.

Whether you load the data from XML or from a DB (SQL), it all goes into the
same index. Queries must be run against that index using the syntax
http://localhost:8983/solr/select/?q=<your-query-goes-here>
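
As a rough illustration of that syntax, here is a minimal Python sketch that builds such a select URL. The host and port are copied from the address above; the helper name build_select_url and the search term "developer" are only assumptions for the example, not part of Solr.

```python
from urllib.parse import urlencode

# Assumed base address, copied from the URL above; adjust for your setup.
SOLR_SELECT = "http://localhost:8983/solr/select/"


def build_select_url(query, **params):
    """Return a /select URL for the query plus any extra request parameters."""
    all_params = {"q": query}
    all_params.update(params)
    # urlencode escapes the values and joins the pairs with '&'
    return SOLR_SELECT + "?" + urlencode(all_params)


if __name__ == "__main__":
    # The same kind of request as the tutorial URL, with your own term.
    print(build_select_url("developer", indent="on"))
```

Opening the printed URL in a browser should return matching documents only if the import has been committed and the searched field is indexed in your schema.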

On Tue, Nov 11, 2008 at 9:55 AM, Kevin Penny <kp...@jobs2web.com> wrote:
> Ok - and what would that be? (query interface)
>
> I need the URL format that would work in this situation to return data from my setup.
>
> I've gone through the tutorial and used execution strings like:
> http://localhost:8983/solr/select/?indent=on&q=video&sort=price+desc
> etc however I'm working with sql data and not xml data.
>
> Thanks
>
> -----Original Message-----
> From: Noble Paul നോബിള്‍ नोब्ळ् [mailto:noble.paul@gmail.com]
> Sent: Monday, November 10, 2008 10:18 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Newbie Question - getting search results from dataimport request handler
>
> You cannot query the DIH; it can only do indexing.
> After indexing, you must run your queries through the regular query interface.
>
> --
> --Noble Paul
>



-- 
--Noble Paul

RE: Newbie Question - getting search results from dataimport request handler

Posted by Kevin Penny <kp...@jobs2web.com>.
Ok - and what would that be? (query interface)

I need the URL format that would work in this situation to return data from my setup.

I've gone through the tutorial and used execution strings like:
http://localhost:8983/solr/select/?indent=on&q=video&sort=price+desc
etc.; however, I'm working with SQL data and not XML data.

Thanks

-----Original Message-----
From: Noble Paul നോബിള്‍ नोब्ळ् [mailto:noble.paul@gmail.com]
Sent: Monday, November 10, 2008 10:18 PM
To: solr-user@lucene.apache.org
Subject: Re: Newbie Question - getting search results from dataimport request handler

You cannot query the DIH; it can only do indexing.
After indexing, you must run your queries through the regular query interface.




--
--Noble Paul

Re: Newbie Question - getting search results from dataimport request handler

Posted by Noble Paul നോബിള്‍ नोब्ळ् <no...@gmail.com>.
You cannot query the DIH; it can only do indexing.
After indexing, you must run your queries through the regular query interface.
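
To make the split between the two handlers concrete, here is a small Python sketch of the two kinds of request. It assumes Solr is at localhost:8983 and that the handlers are reachable at /dataimport and /select; the function names are made up for the example. The command value full-import is the one shown in the debug output of the original post.

```python
from urllib.parse import urlencode

BASE = "http://localhost:8983/solr"  # assumed host and port


def import_url(command="full-import"):
    """URL for the DataImportHandler: runs or reports on an import only."""
    return BASE + "/dataimport?" + urlencode({"command": command})


def search_url(query):
    """URL for the regular query interface: the only place search hits come from."""
    return BASE + "/select?" + urlencode({"q": query})


if __name__ == "__main__":
    print(import_url())             # step 1: load the SQL rows into the index
    print(search_url("developer"))  # step 2: search the index
```

The first URL never returns search results; it only starts the import and reports its status. The second one is where a term such as "developer" gets matched against the indexed documents.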




-- 
--Noble Paul