Posted to users@daffodil.apache.org by Deepak Shukla <de...@gmail.com> on 2023/03/14 02:04:26 UTC

Help Needed - daffodil

> 
> Hi Team,
> 
> I am trying to use Daffodil to read a text file that has multiple lines separated by a newline delimiter (CRLF). A few records contain a dot (.) at the 5th position and the other records don't. How can I select only the records that have a dot (.) at the 5th position in my output?
> 
> Your help is appreciated. Let me know if you need more details.
> 
> Sent from my iPhone

Re: Help Needed - daffodil

Posted by Mike Beckerle <mb...@apache.org>.
Deepak,

I suggest something like this:

<group name="lookAheadForDot">
  <sequence>
    <annotation><appinfo source="http://www.ogf.org/dfdl/">
      <!-- Look ahead for 5th character to be a literal "." -->
      <dfdl:discriminator testKind='pattern' testPattern='....\.' />
    </appinfo></annotation>
  </sequence>
</group>

Then, in the part of your schema that handles the lines (assuming they're
just strings), something like this:

<choice>
  <!-- Branch 1: taken only when the look-ahead discriminator succeeds -->
  <sequence>
    <group ref="p:lookAheadForDot"/>
    <element name="lineWithDot" type="xs:string" dfdl:terminator="%NL;"/>
  </sequence>
  <!-- Branch 2: any other line is parsed, but hidden from the infoset -->
  <sequence dfdl:hiddenGroupRef="p:lineNoDot"/>
</choice>

<group name="lineNoDot">
  <sequence>
    <element name="lineNoDot" type="xs:string" dfdl:terminator="%NL;"/>
  </sequence>
</group>

This parses both the lines with a dot and the lines without one, but the
lines without the dot are in a hidden group, so they don't appear
in the infoset at all.
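
For context, here is a rough sketch of how those fragments might sit inside a
complete schema. It is untested and rests on several assumptions that are not
part of the suggestion above: the "urn:example:lines" namespace bound to the
p: prefix, the wrapper elements "file" and "line", the use of the
GeneralFormat definition that ships with recent Daffodil releases, and the
xs: prefix on the schema elements (the fragments above are written without
one).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only. The namespace, the "file" and "line" element names, and the
     format settings are assumptions made for illustration. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/"
           xmlns:p="urn:example:lines"
           targetNamespace="urn:example:lines">

  <!-- Base format definition bundled with Daffodil (path assumes a 3.x release) -->
  <xs:include schemaLocation="org/apache/daffodil/xsd/DFDLGeneralFormat.dfdl.xsd"/>

  <xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/">
      <dfdl:format ref="p:GeneralFormat"/>
    </xs:appinfo>
  </xs:annotation>

  <!-- Hypothetical root: one "line" occurrence per input line -->
  <xs:element name="file">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="line" maxOccurs="unbounded">
          <xs:complexType>
            <xs:choice>
              <!-- Lines whose 5th character is "." are kept -->
              <xs:sequence>
                <xs:group ref="p:lookAheadForDot"/>
                <xs:element name="lineWithDot" type="xs:string"
                            dfdl:terminator="%NL;"/>
              </xs:sequence>
              <!-- All other lines are consumed but hidden -->
              <xs:sequence dfdl:hiddenGroupRef="p:lineNoDot"/>
            </xs:choice>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <xs:group name="lookAheadForDot">
    <xs:sequence>
      <xs:annotation><xs:appinfo source="http://www.ogf.org/dfdl/">
        <!-- Any 4 characters followed by a literal "." -->
        <dfdl:discriminator testKind="pattern" testPattern="....\." />
      </xs:appinfo></xs:annotation>
    </xs:sequence>
  </xs:group>

  <xs:group name="lineNoDot">
    <xs:sequence>
      <xs:element name="lineNoDot" type="xs:string" dfdl:terminator="%NL;"/>
    </xs:sequence>
  </xs:group>
</xs:schema>
```

The wrapper element is there because DFDL allows occurrence counts on
elements but not on a choice or sequence. One likely consequence of that
assumption: a line without a dot would still leave an empty "line" element
in the infoset, since only its content is hidden. The sketch also only
considers parsing; for unparsing, elements inside a hidden group generally
need a default value or dfdl:outputValueCalc.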


On Wed, Mar 15, 2023 at 11:22 AM Deepak Shukla <de...@gmail.com>
wrote:

> Hi Mike,
>
> Hope you’re doing well. Did you get a chance to check?
>
> Sent from my iPhone
>
> On Mar 14, 2023, at 8:14 AM, Deepak Shukla <de...@gmail.com>
> wrote:
>
> I have not parsed the data yet. During parsing, I need the output to
> contain only the lines that have a dot (.) at the 5th position.
>
> Sent from my iPhone
>
> On Mar 14, 2023, at 7:47 AM, Mike Beckerle <mb...@apache.org> wrote:
>
> 
> Deepak,
>
> I do need some more details before I can respond meaningfully.
>
> Are you parsing the data and you want to distinguish the records with the
> "." at parsing time?
>
> Or have you already parsed the data, and now want to extract only specific
> elements in some sort of post-processing of the infoset?
>
> -mikeb
>
> On Mon, Mar 13, 2023 at 10:04 PM Deepak Shukla <de...@gmail.com>
> wrote:
>
>> > Hi Team,
>> >
>> > I am trying to use daffodil to read text file which have multiple lines
>> > separated by new line delimiter (crlf). I have few records containing dot
>> > (.) at 5th position and other records doesn’t have dot. How can I select
>> > only record in my output which have dot(.) at 5th position .
>> >
>> > Your help is appreciated. Let me know if you need more details.
>> >
>> > Sent from my iPhone
>

Re: Help Needed - daffodil

Posted by Deepak Shukla <de...@gmail.com>.
Hi Mike,

Hope you’re doing well. Did you get a chance to check?

Sent from my iPhone

  



Re: Help Needed - daffodil

Posted by Deepak Shukla <de...@gmail.com>.
I have not parsed the data yet. During parsing, I need the output to contain
only the lines that have a dot (.) at the 5th position.

Sent from my iPhone

  



Re: Help Needed - daffodil

Posted by Mike Beckerle <mb...@apache.org>.
Deepak,

I do need some more details before I can respond meaningfully.

Are you parsing the data and you want to distinguish the records with the
"." at parsing time?

Or have you already parsed the data, and now want to extract only specific
elements in some sort of post-processing of the infoset?

-mikeb
