Posted to solr-user@lucene.apache.org by Eric_Peng <sa...@gmail.com> on 2014/02/12 16:21:50 UTC

Question about how to upload XML by using SolrJ Client Java Code

I am trying to use the SolrJ client to import XML data into a Solr server.
The SolrJ wiki says "SolrJ lets you upload content in XML and Binary
format".

I realized there is an XML parser in Solr (we can use the DataImportHandler
from the "Dataimport" section of a core in the default Solr UI).

So I was wondering how to use the Solr XML parser directly to upload XML
from SolrJ Java code. I could use another open-source XML parser, but I
really want to know if there is a way to call the Solr parser library.

Would you mind sending me a simple code example if possible? It would be really appreciated.
Thanks in advance.

solr/4.6.1
     



--
View this message in context: http://lucene.472066.n3.nabble.com/Question-about-how-to-upload-XML-by-using-SolrJ-Client-Java-Code-tp4116901.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Question about how to upload XML by using SolrJ Client Java Code

Posted by Eric_Peng <sa...@gmail.com>.
Thanks a lot, I learnt a lot from it.




Re: Question about how to upload XML by using SolrJ Client Java Code

Posted by Shawn Heisey <so...@elyograg.org>.

When the docs say that SolrJ lets you upload data in XML and binary
format, what they actually mean is that SolrJ will create an update
request that is formatted using XML, not that it will let you send
arbitrary XML data.  It is referring to the specific XML format shown here:

http://wiki.apache.org/solr/UpdateXmlMessages#add.2Freplace_documents
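
To make that format concrete, here is a minimal sketch that builds one
add message by hand using only the JDK. The class name, the helper methods
and the field names ("id", "title") are made up for illustration; they are
not part of Solr or SolrJ. In practice SolrJ generates this XML for you when
you add a SolrInputDocument.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Solr or SolrJ.
class UpdateXmlSketch {

    // Escape the characters that are special in XML text and attribute values.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Render one document as an <add><doc>...</doc></add> update message.
    static String toAddMessage(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("<add><doc>");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            sb.append("<field name=\"").append(escape(e.getKey())).append("\">")
              .append(escape(e.getValue()))
              .append("</field>");
        }
        return sb.append("</doc></add>").toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("id", "book-1");            // assumed field names and values
        fields.put("title", "Solr & Lucene");
        System.out.println(toAddMessage(fields));
    }
}
```

Only XML in this shape is accepted by the regular update handler; anything
else has to be converted into it first.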

As for an XML parser ... SolrJ's "XMLResponseParser" is a class that
accepts XML *responses* from Solr and translates them into the Java
response object.  There is also BinaryResponseParser.

The only things that I am aware of in Solr that will deal with XML as
the data source are the XPathEntityProcessor in the dataimport handler
and the ExtractingRequestHandler which uses Apache Tika.  Both of these
are actually contrib modules -- jar files for these features are in the
download, but not built into Solr or SolrJ.

If you are using the extracting request handler, you could probably use
the DirectXmlRequest object, where 'xml' is a String with the xml in it:

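  import org.apache.solr.client.solrj.request.DirectXmlRequest;
  import org.apache.solr.common.params.ModifiableSolrParams;
  import org.apache.solr.common.util.NamedList;

  // Send the raw XML string to the extracting handler as the request body.
  DirectXmlRequest req = new DirectXmlRequest( "/update/extract", xml );
  ModifiableSolrParams params = new ModifiableSolrParams();
  params.set("someParam", "someValue");  // placeholder handler parameter
  req.setParams(params);
  NamedList<Object> response = solrServer.request(req);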

I hope that you are right and there actually is an XML parser built into
SolrJ.  We would both learn something.

Thanks,
Shawn


Re: Question about how to upload XML by using SolrJ Client Java Code

Posted by Jack Krupansky <ja...@basetechnology.com>.
There is also an "XSLT" update handler option to transform raw XML to Solr 
XML on the fly. If anybody here has used it, feel free to chime in.

See:
http://wiki.apache.org/solr/XsltUpdateRequestHandler
and
https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Index+Handlers#UploadingDatawithIndexHandlers-UsingXSLTtoTransformXMLIndexUpdates
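
To give a rough idea of what such a transform does, here is a sketch that
applies a stylesheet on the client side with the JDK's javax.xml.transform
API. This is not the server-side update handler itself, and the stylesheet,
the class name and the input layout (<books>/<book> with <id> and <title>)
are assumptions invented for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, not part of Solr or SolrJ.
class XsltToSolrXmlSketch {

    // Assumed stylesheet: maps each <book> record to a Solr <doc> element.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/books'><add><xsl:apply-templates select='book'/></add></xsl:template>"
      + "<xsl:template match='book'><doc>"
      + "<field name='id'><xsl:value-of select='id'/></field>"
      + "<field name='title'><xsl:value-of select='title'/></field>"
      + "</doc></xsl:template>"
      + "</xsl:stylesheet>";

    // Apply the stylesheet to raw XML and return the resulting Solr update XML.
    static String transform(String rawXml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(rawXml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("XSLT transform failed", e);
        }
    }

    public static void main(String[] args) {
        String raw = "<books><book><id>1</id><title>First</title></book></books>";
        System.out.println(transform(raw));
    }
}
```

With the update handler option, the same kind of stylesheet would live on
the Solr side, so the client can post the raw XML unchanged.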

-- Jack Krupansky



Re: Question about how to upload XML by using SolrJ Client Java Code

Posted by Eric_Peng <sa...@gmail.com>.
Thank you so much Erick, I will try to write my own XML parser.





Re: Question about how to upload XML by using SolrJ Client Java Code

Posted by Erick Erickson <er...@gmail.com>.
Hmmm, before going there let's be sure you're trying to do
what you think you are.

Solr does _not_ index arbitrary XML. There is a very
specific XML format that describes Solr documents
that _can_ be indexed, but random XML is not
supported. See the files in example/exampledocs
for the XML form of Solr docs.

So if you have arbitrary XML, you need to parse it and then
construct Solr documents. One way would be to use
SolrJ: parse the docs using your favorite Java parser and
construct SolrInputDocuments, which you then add to the index
using one of the SolrServer classes (e.g. CloudSolrServer).
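
A rough sketch of the parsing half, using the DOM parser that ships with the
JDK. The class name, the record tag and the sample XML are all assumptions
for illustration, and the SolrJ hand-off is only outlined in comments so the
sketch runs without the SolrJ jar on the classpath.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Solr or SolrJ.
class ArbitraryXmlSketch {

    // Turn each <recordTag> element into a field map: child element name -> text.
    static List<Map<String, String>> parse(String xml, String recordTag) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document dom = builder.parse(new InputSource(new StringReader(xml)));
            NodeList records = dom.getElementsByTagName(recordTag);
            List<Map<String, String>> docs = new ArrayList<>();
            for (int i = 0; i < records.getLength(); i++) {
                Map<String, String> fields = new LinkedHashMap<>();
                NodeList children = records.item(i).getChildNodes();
                for (int j = 0; j < children.getLength(); j++) {
                    Node child = children.item(j);
                    if (child.getNodeType() == Node.ELEMENT_NODE) {
                        fields.put(child.getNodeName(), child.getTextContent().trim());
                    }
                }
                docs.add(fields);
            }
            return docs;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<books>"
                + "<book><id>1</id><title>First</title></book>"
                + "<book><id>2</id><title>Second</title></book>"
                + "</books>";
        for (Map<String, String> fields : parse(xml, "book")) {
            // With SolrJ available, each map would become a SolrInputDocument:
            //   SolrInputDocument doc = new SolrInputDocument();
            //   doc.addField(name, value);   // once per map entry
            //   solrServer.add(doc);         // then solrServer.commit() at the end
            System.out.println(fields);
        }
    }
}
```

The element names must match fields defined in your schema, so real code
would usually map or rename them before building the documents.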

There really is no "Solr XML parser" that I know of; Solr just
uses one of the standard XML parsers (e.g. SAX)...

Best,
Erick

