Posted to user@spark.apache.org by "VON RUEDEN, Jonathan" <jo...@sap.com> on 2016/07/15 12:44:17 UTC
XML
Hi everyone,
I want to read an XML file with multiple attributes per tag and need some help. I can read and process the sample files, but I can't find a solution for my own XML.
Here's the file structure:
<?xml version="1.0" encoding="UTF-8"?>
<report format="1.0">
<creationTime millis="1468158875331" readable="2016-07-10 13:54:35 +0000" />
<project artifactid="fin.ap.balances.display" gitUrl="ssh://git.wdf.sap.corp:2/path/path/path" groupid="com.sap.prod.prod" parentArtifactId="name.name" parentVersion="1.12.2" version="4.0.7-SNAPSHOT">
<check columnNumber="0" context="4.0.6" errorType="PREVIOUS_PROJECT_VERSION" filePath="/hompath/path/path" lineNumber="0" message="Reporting :: Previous version checked for compatibility
For details, see: https://githudoc.doc.doc.doc.docm.md" severity="Info" />
<check columnNumber="0" context="Directories in '/src/main/webapp': [WEB-INF, model, view, util, css, img, i18n]" errorType="PROJECT_OLD_STRUCTURE" filePath="/hpathpathpath/ath/webapp" lineNumber="0" message="Reporting :: Using old project structure
For details, see: https://github.wdf.sap.coath.oath/pathpath.nmd" severity="Info" />
</project>
</report>
--> Is there any way I can have com.databricks.spark.xml write all the attributes into one cell as a string, so that I can come up with my own way of splitting and transforming this into a table? Does anyone know how I can read in such a file?
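To make the "all attributes in one cell, then split it myself" idea concrete, here is a minimal sketch that uses only Python's standard library (xml.etree.ElementTree) rather than com.databricks.spark.xml. It is an illustration under assumptions, not a tested answer for the real file: the sample is a shortened copy of the report above, and the names SAMPLE, attrs_as_string, flatten_checks, the "project_" prefix, and the "check_attrs" column are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical copy of the report above (XML declaration and long
# attribute values omitted to keep the sketch small).
SAMPLE = (
    '<report format="1.0">'
    '<creationTime millis="1468158875331" readable="2016-07-10 13:54:35 +0000" />'
    '<project artifactid="fin.ap.balances.display" version="4.0.7-SNAPSHOT">'
    '<check columnNumber="0" context="4.0.6" errorType="PREVIOUS_PROJECT_VERSION"'
    ' lineNumber="0" severity="Info" />'
    '<check columnNumber="0" errorType="PROJECT_OLD_STRUCTURE"'
    ' lineNumber="0" severity="Info" />'
    '</project>'
    '</report>'
)


def attrs_as_string(elem, sep=";"):
    """Join all attributes of one element into a single 'key=value' string."""
    # Assumption: attribute values do not contain the separator themselves.
    return sep.join(f"{key}={value}" for key, value in elem.attrib.items())


def flatten_checks(xml_text):
    """Return one dict per <check>, carrying the parent <project> attributes along."""
    root = ET.fromstring(xml_text)
    rows = []
    for project in root.iter("project"):
        for check in project.iter("check"):
            # Prefix project attributes so they cannot collide with check attributes.
            row = {f"project_{key}": value for key, value in project.attrib.items()}
            row.update(check.attrib)
            # The "one cell as a string" variant asked about in the question.
            row["check_attrs"] = attrs_as_string(check)
            rows.append(row)
    return rows


rows = flatten_checks(SAMPLE)
for row in rows:
    print(row["errorType"], "|", row["check_attrs"])
```

Each dict is one table row, so the list could be handed to whatever builds the table afterwards. Within Spark itself, setting the spark-xml rowTag option to "check" may also be worth trying, since that option selects which element becomes a row; whether it copes with the multi-line message attributes in this particular file has not been verified here.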
thanks much,
best,
jonathan
Jonathan von Rüden
Enterprise Analytics
SAP France | Paris
Mobile: +33 68 221-2425
Email: Jonathan.von.rueden@sap.com