Posted to solr-user@lucene.apache.org by Carrie Coy <cc...@ssww.com> on 2012/08/24 22:34:44 UTC

More debugging DIH - URLDataSource

I'm trying to write a DIH configuration to incorporate page view metrics
from an XML feed into our index. The DIH makes a single request and
updates 0 documents. I set the log level to "finest" for the entire
dataimport section, but I still can't tell what's wrong. I suspect the
XPath. http://localhost:8080/solr/core1/admin/dataimport.jsp?handler=/dataimport
returns 404. Any suggestions on how I can debug this?

solr-spec: 4.0.0.2012.08.06.22.50.47


The XML data:

<?xml version='1.0' encoding='UTF-8'?>
<ReportDataResponse>
  <Data>
    <Rows>
      <Row rowKey="P#PRODUCT: BURLAP POTATO SACKS  (PACK OF 12) (W4537)#N/A#550000000016196614" rowActionAvailability="0 0 0">
        <Value columnId="PAGE_NAME" comparisonSpecifier="A">PRODUCT: BURLAP POTATO SACKS  (PACK OF 12) (W4537)</Value>
        <Value columnId="PAGE_VIEWS" comparisonSpecifier="A">2388</Value>
      </Row>
      <Row rowKey="P#PRODUCT: OPAQUE PONY BEADS 6X9MM  (BAG OF 850) (BE9000)#N/A#550000000021976460" rowActionAvailability="0 0 0">
        <Value columnId="PAGE_NAME" comparisonSpecifier="A">PRODUCT: OPAQUE PONY BEADS 6X9MM  (BAG OF 850) (BE9000)</Value>
        <Value columnId="PAGE_VIEWS" comparisonSpecifier="A">1313</Value>
      </Row>
    </Rows>
  </Data>
</ReportDataResponse>

My DIH:

<dataConfig>
  <dataSource name="coremetrics"
              type="URLDataSource"
              encoding="UTF-8"
              connectionTimeout="5000"
              readTimeout="10000"/>

  <document>
    <entity name="coremetrics"
            dataSource="coremetrics"
            pk="id"
            url="https://welcome.coremetrics.com/analyticswebapp/api/1.0/report-data/contentcategory/bypage.ftl?clientId=******&amp;username=****&amp;format=XML&amp;userAuthKey=****&amp;language=en_US&amp;viewID=9475540&amp;period_a=M20110930"
            processor="XPathEntityProcessor"
            stream="true"
            forEach="/ReportDataResponse/Data/Rows/Row"
            logLevel="fine"
            transformer="RegexTransformer">

      <field column="part_code" name="id" xpath="/ReportDataResponse/Data/Rows/Row/Value[@columnId='PAGE_NAME']" regex="/^PRODUCT:.*\((.*?)\)$/" replaceWith="$1"/>
      <field column="page_views" xpath="/ReportDataResponse/Data/Rows/Row/Value[@columnId='PAGE_VIEWS']"/>
    </entity>
  </document>
</dataConfig>

This little test Perl script correctly extracts the data:

use strict;
use warnings;
use XML::XPath;
use XML::XPath::XMLParser;

# Parse the saved feed and select every Row element.
my $xp = XML::XPath->new(filename => 'cm.xml');
my $nodeset = $xp->find('/ReportDataResponse/Data/Rows/Row');
foreach my $node ($nodeset->get_nodelist) {
    # These paths are relative to the current Row node.
    my $page_name  = $node->findvalue('Value[@columnId="PAGE_NAME"]');
    my $page_views = $node->findvalue('Value[@columnId="PAGE_VIEWS"]');
    # Reduce the page name to the part code in the trailing parentheses.
    $page_name =~ s/^PRODUCT:.*\((.*?)\)$/$1/;
    print "$page_name\t$page_views\n";
}

From the logs:

INFO: Loading DIH Configuration: data-config.xml
Aug 24, 2012 3:53:10 PM org.apache.solr.handler.dataimport.DataImporter 
loadDataConfig
INFO: Data Configuration loaded successfully
Aug 24, 2012 3:53:10 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=full-import} 
status=0 QTime=2
Aug 24, 2012 3:53:10 PM org.apache.solr.handler.dataimport.DataImporter 
doFullImport
INFO: Starting Full Import
Aug 24, 2012 3:53:10 PM 
org.apache.solr.handler.dataimport.SimplePropertiesWriter 
readIndexerProperties
INFO: Read dataimport.properties
Aug 24, 2012 3:53:10 PM org.apache.solr.update.DirectUpdateHandler2 
deleteAll
INFO: [ssww] REMOVING ALL DOCUMENTS FROM INDEX
Aug 24, 2012 3:53:10 PM org.apache.solr.handler.dataimport.URLDataSource 
getData
FINE: Accessing URL: 
https://welcome.coremetrics.com/analyticswebapp/api/1.0/report-data/contentcategory/bypage.ftl?clientId=*****&username=***&format=XML&userAuthKey=******&language=en_US&viewID=9475540&period_a=M20110930
Aug 24, 2012 3:53:10 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} 
status=0 QTime=0
Aug 24, 2012 3:53:12 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} 
status=0 QTime=1
Aug 24, 2012 3:53:14 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} 
status=0 QTime=1
Aug 24, 2012 3:53:16 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} 
status=0 QTime=0
Aug 24, 2012 3:53:18 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} 
status=0 QTime=0
Aug 24, 2012 3:53:20 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} 
status=0 QTime=0
Aug 24, 2012 3:53:22 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} 
status=0 QTime=0
Aug 24, 2012 3:53:24 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} 
status=0 QTime=0
Aug 24, 2012 3:53:27 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} 
status=0 QTime=0
Aug 24, 2012 3:53:28 PM org.apache.solr.handler.dataimport.DocBuilder finish
INFO: Import completed successfully
Aug 24, 2012 3:53:28 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start 
commit{flags=0,_version_=0,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false}
Aug 24, 2012 3:53:28 PM org.apache.solr.core.SolrDeletionPolicy onCommit
INFO: SolrDeletionPolicy.onCommit: commits:num=2
         
commit{dir=/var/lib/tomcat6/solr/apache-solr-4.0.0-BETA/core1/data/index,segFN=segments_2b,generation=83,filenames=[segments_2b]
         
commit{dir=/var/lib/tomcat6/solr/apache-solr-4.0.0-BETA/core1/data/index,segFN=segments_2c,generation=84,filenames=[segments_2c]
Aug 24, 2012 3:53:28 PM org.apache.solr.core.SolrDeletionPolicy 
updateCommits
INFO: newest commit = 84
Aug 24, 2012 3:53:28 PM org.apache.solr.search.SolrIndexSearcher <init>
INFO: Opening Searcher@ff33d42 main
Aug 24, 2012 3:53:28 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: end_commit_flush
Aug 24, 2012 3:53:28 PM org.apache.solr.core.QuerySenderListener newSearcher
INFO: QuerySenderListener sending requests to Searcher@ff33d42 
main{StandardDirectoryReader(segments_2c:323)}
Aug 24, 2012 3:53:28 PM org.apache.solr.core.QuerySenderListener newSearcher
INFO: QuerySenderListener done.
Aug 24, 2012 3:53:28 PM org.apache.solr.core.SolrCore registerSearcher
INFO: [ssww] Registered new searcher Searcher@ff33d42 
main{StandardDirectoryReader(segments_2c:323)}
Aug 24, 2012 3:53:28 PM 
org.apache.solr.handler.dataimport.SimplePropertiesWriter 
readIndexerProperties
INFO: Read dataimport.properties
Aug 24, 2012 3:53:28 PM 
org.apache.solr.handler.dataimport.SimplePropertiesWriter persist
INFO: Wrote last indexed time to dataimport.properties
Aug 24, 2012 3:53:28 PM org.apache.solr.handler.dataimport.DocBuilder 
execute
INFO: Time taken = 0:0:17.918
Aug 24, 2012 3:53:28 PM 
org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: [ssww] webapp=/solr path=/dataimport params={command=full-import} 
status=0 QTime=2 {deleteByQuery=*:*,commit=} 0 2


Re: More debugging DIH - URLDataSource (solved)

Posted by Carrie Coy <cc...@ssww.com>.
Thank you for these suggestions. The real problem was incorrect syntax
for the primary key column in data-config.xml. Once I corrected that,
the data loaded fine.

Wrong:

<field column="part_code" name="id" xpath="/ReportDataResponse/Data/Rows/Row/Value[@columnId='PAGE_NAME']" regex="/^PRODUCT:.*\((.*?)\)$/" replaceWith="$1"/>

Right:

<field column="id" xpath="/ReportDataResponse/Data/Rows/Row/Value[@columnId='PAGE_NAME']" regex="/^PRODUCT:.*\((.*?)\)$/" replaceWith="$1"/>
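
For context, here is a sketch of how the corrected field might sit in the full entity. Everything except the column="id" change is carried over from the config in the original message, with the credentials still masked. One difference is an assumption rather than something confirmed in this thread: RegexTransformer uses Java regular expressions, which have no /.../ delimiters, so the sketch drops the surrounding slashes from the regex attribute.

```xml
<document>
  <entity name="coremetrics"
          dataSource="coremetrics"
          pk="id"
          url="https://welcome.coremetrics.com/analyticswebapp/api/1.0/report-data/contentcategory/bypage.ftl?clientId=******&amp;username=****&amp;format=XML&amp;userAuthKey=****&amp;language=en_US&amp;viewID=9475540&amp;period_a=M20110930"
          processor="XPathEntityProcessor"
          stream="true"
          forEach="/ReportDataResponse/Data/Rows/Row"
          logLevel="fine"
          transformer="RegexTransformer">

    <!-- column="id" with no name attribute: the extracted value is stored under id -->
    <!-- Assumption: slashes removed from the regex, since Java regex has no delimiters -->
    <field column="id"
           xpath="/ReportDataResponse/Data/Rows/Row/Value[@columnId='PAGE_NAME']"
           regex="^PRODUCT:.*\((.*?)\)$"
           replaceWith="$1"/>
    <field column="page_views"
           xpath="/ReportDataResponse/Data/Rows/Row/Value[@columnId='PAGE_VIEWS']"/>
  </entity>
</document>
```

If the id values still come through as full page names rather than part codes, the regex is the first thing to check.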



On 08/25/2012 08:52 PM, Lance Norskog wrote:
> About XPaths: the XPath engine does a limited range of xpaths. The doc
> says that your paths are covered.
>
> About logs: You only have the RegexTransformer listed. You need to add
> LogTransformer to the transformer list:
> http://wiki.apache.org/solr/DataImportHandler#LogTransformer
>
> Having xml entity codes in the url string seems right. Can you verify
> the url that goes to the remote site? Can you read the logs at the
> remote site? Can you run this code through a proxy and watch the data?

Re: More debugging DIH - URLDataSource

Posted by Lance Norskog <go...@gmail.com>.
About XPaths: the DIH XPath engine supports only a limited subset of
XPath. The documentation says that your paths are covered.

About logs: You only have the RegexTransformer listed. You need to add
LogTransformer to the transformer list:
http://wiki.apache.org/solr/DataImportHandler#LogTransformer
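
As a rough sketch of that change: the entity's transformer attribute takes a comma-separated list, applied in the order given, and LogTransformer takes its message from a logTemplate attribute on the entity. The template text and the choice of logLevel="info" below are illustrative assumptions; the entity and column names are the ones from the config in the original message.

```xml
<entity name="coremetrics"
        dataSource="coremetrics"
        pk="id"
        url="https://welcome.coremetrics.com/analyticswebapp/api/1.0/report-data/contentcategory/bypage.ftl?clientId=******&amp;username=****&amp;format=XML&amp;userAuthKey=****&amp;language=en_US&amp;viewID=9475540&amp;period_a=M20110930"
        processor="XPathEntityProcessor"
        stream="true"
        forEach="/ReportDataResponse/Data/Rows/Row"
        transformer="RegexTransformer,LogTransformer"
        logTemplate="row: part_code=${coremetrics.part_code} page_views=${coremetrics.page_views}"
        logLevel="info">

  <!-- Fields unchanged from the original config -->
  <field column="part_code" name="id"
         xpath="/ReportDataResponse/Data/Rows/Row/Value[@columnId='PAGE_NAME']"
         regex="/^PRODUCT:.*\((.*?)\)$/"
         replaceWith="$1"/>
  <field column="page_views"
         xpath="/ReportDataResponse/Data/Rows/Row/Value[@columnId='PAGE_VIEWS']"/>
</entity>
```

With LogTransformer listed after RegexTransformer, each logged row should show the values as they stand after the regex has run. If no rows are logged at all, the problem is upstream of the transformers.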

Having XML entity codes in the URL string seems right. Can you verify
the URL that goes to the remote site? Can you read the logs at the
remote site? Can you run this code through a proxy and watch the data?




-- 
Lance Norskog
goksron@gmail.com