Posted to issues@nifi.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2020/07/17 09:36:00 UTC
[jira] [Commented] (NIFI-7493) XML Schema Inference can infer a type of String when it should be Record
[ https://issues.apache.org/jira/browse/NIFI-7493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17159810#comment-17159810 ]
ASF subversion and git services commented on NIFI-7493:
-------------------------------------------------------
Commit 7e09e0db339a07670eba47c138959d79663400ee in nifi's branch refs/heads/main from Mark Payne
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=7e09e0d ]
NIFI-7493: When inferring schema for XML data, if we find a text element that also has attributes, infer it as a Record type, in order to match how the data will be read when using the XML Reader
Signed-off-by: Pierre Villard <pi...@gmail.com>
This closes #4375.
> XML Schema Inference can infer a type of String when it should be Record
> ------------------------------------------------------------------------
>
> Key: NIFI-7493
> URL: https://issues.apache.org/jira/browse/NIFI-7493
> Project: Apache NiFi
> Issue Type: Bug
> Components: Extensions
> Reporter: Mark Payne
> Assignee: Mark Payne
> Priority: Major
> Fix For: 1.12.0
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> From the mailing list:
> {quote}I have configured an XMLReader to use the Infer Schema. The other issue is that I have problems converting sub records. My records look something like this:
> <RootLabel>
>   <Part1>
>     <name>John Doe</name>
>     <adress>some there</adress>
>   </Part1>
>   <Part2>
>     <Job>workingman</Job>
>   </Part2>
>   <Part3>
>     <Details>
>       <additionalInfo name="Location">New York</additionalInfo>
>       <additionalInfo name="Company">A Company</additionalInfo>
>     </Details>
>   </Part3>
> </RootLabel>
>
> The issues are with the subrecords in part 3. I have configured the XMLReader property "Field Name for Content" = value
>
> When the data is being converted via a XMLWriter the output for the additionalInfo fields looks like this:
> <Part3>
>   <Details>
>     <additionalInfo>MapRecord[\{name=Location, value=New York}]</additionalInfo>
>     <additionalInfo>MapRecord[\{name=Company, value=A Company}]</additionalInfo>
>   </Details>
> </Part3>
>
>
> If I use a JSONWriter I get this:
> "Part3": {
>   "Details": {
>     "additionalInfo": [ "MapRecord[\{name=Location, value=New York}]", "MapRecord[\{name=Company, value=A Company}]" ]
>   }
> }{quote}
> The issue appears to be that "additionalInfo" is being inferred as a String, but the XML Reader is returning a Record.
>
> This is probably because the "additionalInfo" element contains String content and no child nodes. However, it does have attributes. As a result, the XML Reader will return a Record. I'm guessing, though, that attributes are not taken into account in the schema inference: since "additionalInfo" has no child nodes but has textual content, the inference concludes that it must be a String.
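
The mismatch described above can be illustrated with a minimal Python sketch. This is not NiFi's implementation: the function infer_field_type, its attributes_matter flag, and the dictionary-based type description are hypothetical names chosen for illustration. The only detail taken from the report is the content field name "value". With attributes ignored, a leaf element with text is inferred as a string; with attributes considered, the same element is inferred as a record, which matches what the reader returns.

```python
import xml.etree.ElementTree as ET


def infer_field_type(element, content_field_name="value", attributes_matter=True):
    """Return a simplified type description for one XML element.

    Hypothetical helper for illustration only; it is not part of NiFi.
    Repeated child tags (arrays) are deliberately not modelled here.
    """
    children = list(element)
    has_attributes = attributes_matter and bool(element.attrib)

    # A leaf with no (considered) attributes is plain text.
    if not children and not has_attributes:
        return "string"

    # Otherwise the element becomes a record: one field per attribute...
    fields = {name: "string" for name in element.attrib} if has_attributes else {}

    if children:
        # ...one field per child element...
        for child in children:
            fields[child.tag] = infer_field_type(
                child, content_field_name, attributes_matter
            )
    elif (element.text or "").strip():
        # ...or, for a text element with attributes, a field holding the text.
        fields[content_field_name] = "string"

    return {"record": fields}


sample = ET.fromstring('<additionalInfo name="Location">New York</additionalInfo>')

# Behaviour suspected in the report: attributes ignored, so a String is inferred.
print(infer_field_type(sample, attributes_matter=False))

# Behaviour after the fix: a text element with attributes is inferred as a Record.
print(infer_field_type(sample))
```

Under these assumptions the first call yields a string type, while the second yields a record with a "name" field and a "value" field, so the inferred schema and the data produced by the reader agree.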
--
This message was sent by Atlassian Jira
(v8.3.4#803005)