Posted to user@tika.apache.org by Peter Kronenberg <pe...@torch.ai> on 2021/04/12 02:27:31 UTC

Error building TIKA - complaining about CRLF

Just tried to do a full build and it fails with all sorts of error messages about CRLF not allowed.

Here's a representative sample. I got a new computer a few weeks ago and I'm not sure if this is the first time I'm trying this on the new computer. So it's certainly possible that it's something on my end.

[Inline images not preserved in the archive: image002.png, image003.png, image004.png]

Peter Kronenberg  |  Senior AI Analytic ENGINEER
C: 703.887.5623
[Torch AI]<http://www.torch.ai/>
4303 W. 119th St., Leawood, KS 66209
WWW.TORCH.AI<http://www.torch.ai/>



RE: Error building TIKA - complaining about CRLF

Posted by Peter Kronenberg <pe...@torch.ai>.
It looks like that is part of .gitattributes, which is a checked-in file. If you want to enforce this, Tika needs to change its version of .gitattributes.
But is it really necessary for me to retain the line endings when checking out, as long as they are committed properly?



From: Tim Allison <ta...@apache.org>
Sent: Monday, April 12, 2021 5:42 AM
To: user@tika.apache.org
Subject: Re: Error building TIKA - complaining about CRLF

I took OpenNLP’s Checkstyle configuration as the basis for ours, largely without much thought.

By default, Git on Windows converts line endings to \r\n on checkout and converts them back to \n on commit. You can get Git to keep LF line endings with the .gitattributes setting: text eol=lf

https://docs.github.com/en/github/getting-started-with-github/configuring-git-to-handle-line-endings


On Sun, Apr 11, 2021 at 10:48 PM Peter Kronenberg <pe...@torch.ai> wrote:
Actually, I think all these lines need to be removed

<module name="NewlineAtEndOfFile">
  <property name="lineSeparator" value="lf"/>
</module>

<module name="RegexpMultiline">
  <property name="format" value="\r\n"/>
  <property name="message" value="CRLF line endings are prohibited"/>
</module>


Not sure why both sets are needed. They seem redundant.




From: Peter Kronenberg
Sent: Sunday, April 11, 2021 10:45 PM
To: user@tika.apache.org
Subject: RE: Error building TIKA - complaining about CRLF

Appears to be caused by these lines in the checkstyle.xml config file


<module name="RegexpMultiline">
  <property name="format" value="\r\n"/>
  <property name="message" value="CRLF line endings are prohibited"/>
</module>

But this won’t work on Windows. I suggest removing this and relying on Git to handle the translation between LF and CRLF.



From: Peter Kronenberg
Sent: Sunday, April 11, 2021 10:28 PM
To: user@tika.apache.org
Subject: Error building TIKA - complaining about CRLF

Just tried to do a full build and it fails with all sorts of error messages about CRLF not allowed.

Here’s a representative sample. I got a new computer a few weeks ago and I’m not sure if this is the first time I’m trying this on the new computer. So it’s certainly possible that it’s something on my end.




Re: Error building TIKA - complaining about CRLF

Posted by Tim Allison <ta...@apache.org>.
I took OpenNLP’s Checkstyle configuration as the basis for ours, largely
without much thought.

By default, Git on Windows converts line endings to \r\n on checkout and
converts them back to \n on commit. You can get Git to keep LF line
endings with the .gitattributes setting: text eol=lf



https://docs.github.com/en/github/getting-started-with-github/configuring-git-to-handle-line-endings
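
For reference, a minimal sketch of what such a .gitattributes entry might look like. The `*` pattern and the `text=auto` variant are assumptions for illustration and are not taken from Tika's actual .gitattributes; `text=auto` lets Git skip files it detects as binary, whereas a bare `text` would treat every matched file as text.

```
# .gitattributes (illustrative sketch, not Tika's actual file)
# Normalize text files to LF in the repository and check them out
# with LF on every platform, including Windows.
* text=auto eol=lf
```

Files already checked out with CRLF are not rewritten by this change alone; they pick up LF endings the next time Git checks them out, for example in a fresh clone.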



RE: Error building TIKA - complaining about CRLF

Posted by Peter Kronenberg <pe...@torch.ai>.
Actually, I think all these lines need to be removed

<module name="NewlineAtEndOfFile">
  <property name="lineSeparator" value="lf"/>
</module>

<module name="RegexpMultiline">
  <property name="format" value="\r\n"/>
  <property name="message" value="CRLF line endings are prohibited"/>
</module>


Not sure why both sets are needed. They seem redundant.





RE: Error building TIKA - complaining about CRLF

Posted by Peter Kronenberg <pe...@torch.ai>.
Appears to be caused by these lines in the checkstyle.xml config file


<module name="RegexpMultiline">
  <property name="format" value="\r\n"/>
  <property name="message" value="CRLF line endings are prohibited"/>
</module>

But this won't work on Windows. I suggest removing this and relying on Git to handle the translation between LF and CRLF.
