Posted to solr-user@lucene.apache.org by Carrie Coy <cc...@ssww.com> on 2012/08/24 22:34:44 UTC
More debugging DIH - URLDataSource
I'm trying to write a DIH to incorporate page view metrics from an XML
feed into our index. The DIH makes a single request, and updates 0
documents. I set log level to "finest" for the entire dataimport
section, but I still can't tell what's wrong. I suspect the XPath.
http://localhost:8080/solr/core1/admin/dataimport.jsp?handler=/dataimport returns
404. Any suggestions on how I can debug this?
solr-spec 4.0.0.2012.08.06.22.50.47
The XML data:
<?xml version='1.0' encoding='UTF-8'?>
<ReportDataResponse>
  <Data>
    <Rows>
      <Row rowKey="P#PRODUCT: BURLAP POTATO SACKS (PACK OF 12) (W4537)#N/A#550000000016196614" rowActionAvailability="0 0 0">
        <Value columnId="PAGE_NAME" comparisonSpecifier="A">PRODUCT: BURLAP POTATO SACKS (PACK OF 12) (W4537)</Value>
        <Value columnId="PAGE_VIEWS" comparisonSpecifier="A">2388</Value>
      </Row>
      <Row rowKey="P#PRODUCT: OPAQUE PONY BEADS 6X9MM (BAG OF 850) (BE9000)#N/A#550000000021976460" rowActionAvailability="0 0 0">
        <Value columnId="PAGE_NAME" comparisonSpecifier="A">PRODUCT: OPAQUE PONY BEADS 6X9MM (BAG OF 850) (BE9000)</Value>
        <Value columnId="PAGE_VIEWS" comparisonSpecifier="A">1313</Value>
      </Row>
    </Rows>
  </Data>
</ReportDataResponse>
My DIH:
<dataConfig>
  <dataSource name="coremetrics"
              type="URLDataSource"
              encoding="UTF-8"
              connectionTimeout="5000"
              readTimeout="10000"/>
  <document>
    <entity name="coremetrics"
            dataSource="coremetrics"
            pk="id"
            url="https://welcome.coremetrics.com/analyticswebapp/api/1.0/report-data/contentcategory/bypage.ftl?clientId=******&amp;username=****&amp;format=XML&amp;userAuthKey=****&amp;language=en_US&amp;viewID=9475540&amp;period_a=M20110930"
            processor="XPathEntityProcessor"
            stream="true"
            forEach="/ReportDataResponse/Data/Rows/Row"
            logLevel="fine"
            transformer="RegexTransformer">
      <field column="part_code" name="id"
             xpath="/ReportDataResponse/Data/Rows/Row/Value[@columnId='PAGE_NAME']"
             regex="/^PRODUCT:.*\((.*?)\)$/" replaceWith="$1"/>
      <field column="page_views"
             xpath="/ReportDataResponse/Data/Rows/Row/Value[@columnId='PAGE_VIEWS']"/>
    </entity>
  </document>
</dataConfig>
This little test Perl script correctly extracts the data:

use XML::XPath;
use XML::XPath::XMLParser;

my $xp = XML::XPath->new(filename => 'cm.xml');
my $nodeset = $xp->find('/ReportDataResponse/Data/Rows/Row');
foreach my $node ($nodeset->get_nodelist) {
    my $page_name = $node->findvalue('Value[@columnId="PAGE_NAME"]');
    my $page_views = $node->findvalue('Value[@columnId="PAGE_VIEWS"]');
    $page_name =~ s/^PRODUCT:.*\((.*?)\)$/$1/;
}
From logs:
INFO: Loading DIH Configuration: data-config.xml
Aug 24, 2012 3:53:10 PM org.apache.solr.handler.dataimport.DataImporter loadDataConfig
INFO: Data Configuration loaded successfully
Aug 24, 2012 3:53:10 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=full-import} status=0 QTime=2
Aug 24, 2012 3:53:10 PM org.apache.solr.handler.dataimport.DataImporter doFullImport
INFO: Starting Full Import
Aug 24, 2012 3:53:10 PM org.apache.solr.handler.dataimport.SimplePropertiesWriter readIndexerProperties
INFO: Read dataimport.properties
Aug 24, 2012 3:53:10 PM org.apache.solr.update.DirectUpdateHandler2 deleteAll
INFO: [ssww] REMOVING ALL DOCUMENTS FROM INDEX
Aug 24, 2012 3:53:10 PM org.apache.solr.handler.dataimport.URLDataSource getData
FINE: Accessing URL: https://welcome.coremetrics.com/analyticswebapp/api/1.0/report-data/contentcategory/bypage.ftl?clientId=*****&username=***&format=XML&userAuthKey=******&language=en_US&viewID=9475540&period_a=M20110930
Aug 24, 2012 3:53:10 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} status=0 QTime=0
Aug 24, 2012 3:53:12 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} status=0 QTime=1
Aug 24, 2012 3:53:14 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} status=0 QTime=1
Aug 24, 2012 3:53:16 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} status=0 QTime=0
Aug 24, 2012 3:53:18 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} status=0 QTime=0
Aug 24, 2012 3:53:20 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} status=0 QTime=0
Aug 24, 2012 3:53:22 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} status=0 QTime=0
Aug 24, 2012 3:53:24 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} status=0 QTime=0
Aug 24, 2012 3:53:27 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} status=0 QTime=0
Aug 24, 2012 3:53:28 PM org.apache.solr.handler.dataimport.DocBuilder finish
INFO: Import completed successfully
Aug 24, 2012 3:53:28 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start commit{flags=0,_version_=0,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false}
Aug 24, 2012 3:53:28 PM org.apache.solr.core.SolrDeletionPolicy onCommit
INFO: SolrDeletionPolicy.onCommit: commits:num=2
commit{dir=/var/lib/tomcat6/solr/apache-solr-4.0.0-BETA/core1/data/index,segFN=segments_2b,generation=83,filenames=[segments_2b]
commit{dir=/var/lib/tomcat6/solr/apache-solr-4.0.0-BETA/core1/data/index,segFN=segments_2c,generation=84,filenames=[segments_2c]
Aug 24, 2012 3:53:28 PM org.apache.solr.core.SolrDeletionPolicy updateCommits
INFO: newest commit = 84
Aug 24, 2012 3:53:28 PM org.apache.solr.search.SolrIndexSearcher <init>
INFO: Opening Searcher@ff33d42 main
Aug 24, 2012 3:53:28 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: end_commit_flush
Aug 24, 2012 3:53:28 PM org.apache.solr.core.QuerySenderListener newSearcher
INFO: QuerySenderListener sending requests to Searcher@ff33d42 main{StandardDirectoryReader(segments_2c:323)}
Aug 24, 2012 3:53:28 PM org.apache.solr.core.QuerySenderListener newSearcher
INFO: QuerySenderListener done.
Aug 24, 2012 3:53:28 PM org.apache.solr.core.SolrCore registerSearcher
INFO: [ssww] Registered new searcher Searcher@ff33d42 main{StandardDirectoryReader(segments_2c:323)}
Aug 24, 2012 3:53:28 PM org.apache.solr.handler.dataimport.SimplePropertiesWriter readIndexerProperties
INFO: Read dataimport.properties
Aug 24, 2012 3:53:28 PM org.apache.solr.handler.dataimport.SimplePropertiesWriter persist
INFO: Wrote last indexed time to dataimport.properties
Aug 24, 2012 3:53:28 PM org.apache.solr.handler.dataimport.DocBuilder execute
INFO: Time taken = 0:0:17.918
Aug 24, 2012 3:53:28 PM org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: [ssww] webapp=/solr path=/dataimport params={command=full-import} status=0 QTime=2 {deleteByQuery=*:*,commit=} 0 2
Re: More debugging DIH - URLDataSource (solved)
Posted by Carrie Coy <cc...@ssww.com>.
Thank you for these suggestions. The real problem was incorrect syntax
for the primary key column in data-config.xml. Once I corrected that,
the data loaded fine.
Wrong:
<field column="part_code" name="id"
xpath="/ReportDataResponse/Data/Rows/Row/Value[@columnId='PAGE_NAME']" regex="/^PRODUCT:.*\((.*?)\)$/" replaceWith="$1"/>
Right:
<field column="id"
xpath="/ReportDataResponse/Data/Rows/Row/Value[@columnId='PAGE_NAME']" regex="/^PRODUCT:.*\((.*?)\)$/" replaceWith="$1"/>
On 08/25/2012 08:52 PM, Lance Norskog wrote:
> About XPaths: the XPath engine does a limited range of xpaths. The doc
> says that your paths are covered.
>
> About logs: You only have the RegexTransformer listed. You need to add
> LogTransformer to the transformer list:
> http://wiki.apache.org/solr/DataImportHandler#LogTransformer
>
> Having xml entity codes in the url string seems right. Can you verify
> the url that goes to the remote site? Can you read the logs at the
> remote site? Can you run this code through a proxy and watch the data?
>
Re: More debugging DIH - URLDataSource
Posted by Lance Norskog <go...@gmail.com>.
About XPaths: the XPath engine does a limited range of xpaths. The doc
says that your paths are covered.
About logs: You only have the RegexTransformer listed. You need to add
LogTransformer to the transformer list:
http://wiki.apache.org/solr/DataImportHandler#LogTransformer
Having xml entity codes in the url string seems right. Can you verify
the url that goes to the remote site? Can you read the logs at the
remote site? Can you run this code through a proxy and watch the data?
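
For reference, a minimal sketch of the LogTransformer suggestion above. It assumes the entity from the original config, with LogTransformer appended after RegexTransformer so that the logged values are the ones left after the regex has run. The logTemplate text is illustrative only, and the ${coremetrics.id} and ${coremetrics.page_views} references assume field columns named id and page_views; adjust them to whatever columns your field elements declare.

```xml
<!-- Sketch only: url, dataSource, pk and stream are unchanged from the
     original entity and are omitted here for brevity. -->
<entity name="coremetrics"
        processor="XPathEntityProcessor"
        forEach="/ReportDataResponse/Data/Rows/Row"
        transformer="RegexTransformer,LogTransformer"
        logTemplate="row: id=${coremetrics.id} page_views=${coremetrics.page_views}"
        logLevel="info">
  <!-- field definitions as in the original config -->
</entity>
```

With this in place, each row that forEach matches should produce one log line, which makes it easier to tell whether the XPaths matched nothing or whether rows were extracted and then dropped later.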
--
Lance Norskog
goksron@gmail.com