Posted to solr-user@lucene.apache.org by Theodor Tolstoy <Th...@sub.su.se> on 2010/11/04 13:13:08 UTC
ContentStreamDataSource
Hi!
I am trying to get the ContentStreamDataSource to work properly, but there are not many examples out there.
What I have done is copy my HttpDataSource config and replace <dataSource type="HttpDataSource" with <dataSource type="ContentStreamDataSource".
If I understand everything correctly, I should be able to use the same URL syntax as with HttpDataSource and supply the XML file as POST data.
I have tried to post the data to the URL as binary, as a file, and as a string, but nothing happens.
This is the log file:
2010-nov-04 12:32:17 org.apache.solr.handler.dataimport.DataImporter doFullImport
INFO: Starting Full Import
2010-nov-04 12:32:17 org.apache.solr.handler.dataimport.SolrWriter readIndexerProperties
VARNING: Unable to read: datapush.properties
2010-nov-04 12:32:17 org.apache.solr.handler.dataimport.DocBuilder execute
INFO: Time taken = 0:0:0.0
2010-nov-04 12:32:17 org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/datapush params={clean=false&entity=suLIBRIS&command=full-import} status=0 QTime=0
What am I doing wrong?
Regards
Theodor Tolstoy
Developer Stockholm university library
Re: ContentStreamDataSource
Posted by Noble Paul നോബിള് नोब्ळ् <no...@gmail.com>.
For ContentStreamDataSource to work, you must post the stream in the request.
--
-----------------------------------------------------
Noble Paul | Systems Architect| AOL | http://aol.com
Re: ContentStreamDataSource
Posted by Theodor Tolstoy <Th...@sub.su.se>.
I got it to work. There was an error in the requestHandler section of the solrconfig. Unfortunately, I had to try almost every possible way of making HTTP POST requests in .NET before realizing that.
For future reference, here is my solution:
// Requires: using System.IO; using System.Net;
// Example url: http://solrserver/solr/datapush?command=full-import&clean=false
// filePath is the local XML file to send as the request body.
protected void PostToSolr(string url, string filePath)
{
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
    request.Accept = "text/xml";
    request.Method = "POST";

    // Copy the file into the request body in 1 KB chunks.
    using (FileStream fileStream = File.OpenRead(filePath))
    using (Stream requestStream = request.GetRequestStream())
    {
        int bufferSize = 1024;
        byte[] buffer = new byte[bufferSize];
        int byteCount = 0;
        while ((byteCount = fileStream.Read(buffer, 0, bufferSize)) > 0)
        {
            requestStream.Write(buffer, 0, byteCount);
        }
    }

    // Read the response body returned by Solr; log or inspect it as needed.
    string result;
    using (WebResponse response = request.GetResponse())
    using (StreamReader reader = new StreamReader(response.GetResponseStream()))
    {
        result = reader.ReadToEnd();
    }
}
Here is my DIH config (dataconfigpush.xml):
<dataConfig>
  <dataSource type="ContentStreamDataSource" connectionTimeout="300000" readTimeout="400000" />
  <document>
    <entity name="suMARC"
            processor="XPathEntityProcessor"
            stream="false"
            forEach="/collection/record"
            onError="continue"
            transformer="DateFormatTransformer, TemplateTransformer">
      <field column="id" xpath="/collection/record/field1" />
      <field column="titlePrimary" xpath="/collection/record/field2" />
    </entity>
  </document>
</dataConfig>
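For anyone trying to reproduce this, here is a rough sketch of what a posted XML file would need to look like to match the forEach and xpath expressions in that config. The element names come straight from the xpath attributes; the values are made up for illustration, and it assumes that id and titlePrimary are defined as fields in your schema.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical payload: each <record> matches forEach="/collection/record" -->
<collection>
  <record>
    <field1>record-001</field1>            <!-- mapped to column "id" -->
    <field2>An example title</field2>      <!-- mapped to column "titlePrimary" -->
  </record>
  <record>
    <field1>record-002</field1>
    <field2>Another example title</field2>
  </record>
</collection>
```

With a file shaped like this as the POST body, each record element should become one Solr document.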
And the relevant requestHandler part in solrconfig:
<requestHandler name="/datapush"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">dataconfigpush.xml</str>
  </lst>
</requestHandler>
Thank you for your help. As you can see, the commands are sent as query-string parameters in the URL of the POST request, so it is not necessary to put them in the config.
-----Original message-----
From: Lance Norskog [mailto:goksron@gmail.com]
Sent: 6 November 2010 05:09
To: solr-user@lucene.apache.org
Subject: Re: ContentStreamDataSource
What program do you use to POST?
How do you give parameters to Solr? Are you doing multipart upload?
You might have to add all of your parameters to a custom requestHandler, like the /dataimport requestHandler.
Post your DIH config file, if you can.
--
Lance Norskog
goksron@gmail.com
Re: ContentStreamDataSource
Posted by Lance Norskog <go...@gmail.com>.
What program do you use to POST?
How do you give parameters to Solr? Are you doing multipart upload?
You might have to add all of your parameters to a custom
requestHandler, like the /dataimport requestHandler.
Post your DIH config file, if you can.
--
Lance Norskog
goksron@gmail.com