Posted to dev@lucene.apache.org by "Hoss Man (Jira)" <ji...@apache.org> on 2019/09/13 17:00:00 UTC

[jira] [Reopened] (SOLR-13622) Add FileStream Streaming Expression

     [ https://issues.apache.org/jira/browse/SOLR-13622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man reopened SOLR-13622:
-----------------------------

Uwe's Jenkins servers weren't being included in my reports for over a month, which is why you didn't see any StreamExpressionTest failures in my reports when you looked last month.

In reality, Uwe's Windows builds started picking up a new type of failure: file handles are being leaked (and thus the test framework can't remove the files)...

{noformat}
   [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=StreamExpressionTest -Dtests.seed=607225F2726A5625 -Dtests.slow=true -Dtests.locale=ar-PS -Dtests.timezone=Kwajalein -Dtests.asserts=true -Dtests.file.encoding=UTF-8
   [junit4] ERROR   0.00s J1 | StreamExpressionTest (suite) <<<
   [junit4]    > Throwable #1: java.io.IOException: Could not remove the following files (in the order of attempts):
   [junit4]    >    C:\Users\jenkins\workspace\Lucene-Solr-master-Windows\solr\build\solr-solrj\test\J1\temp\solr.client.solrj.io.stream.StreamExpressionTest_607225F2726A5625-001\tempDir-001\node2\userfiles\directory1\secondLevel2.txt: java.nio.file.FileSystemException: C:\Users\jenkins\workspace\Lucene-Solr-master-Windows\solr\build\solr-solrj\test\J1\temp\solr.client.solrj.io.stream.StreamExpressionTest_607225F2726A5625-001\tempDir-001\node2\userfiles\directory1\secondLevel2.txt: The process cannot access the file because it is being used by another process.
   [junit4]    >    C:\Users\jenkins\workspace\Lucene-Solr-master-Windows\solr\build\solr-solrj\test\J1\temp\solr.client.solrj.io.stream.StreamExpressionTest_607225F2726A5625-001\tempDir-001\node2\userfiles\directory1: java.nio.file.DirectoryNotEmptyException: C:\Users\jenkins\workspace\Lucene-Solr-master-Windows\solr\build\solr-solrj\test\J1\temp\solr.client.solrj.io.stream.StreamExpressionTest_607225F2726A5625-001\tempDir-001\node2\userfiles\directory1
   [junit4]    >    C:\Users\jenkins\workspace\Lucene-Solr-master-Windows\solr\build\solr-solrj\test\J1\temp\solr.client.solrj.io.stream.StreamExpressionTest_607225F2726A5625-001\tempDir-001\node2\userfiles: java.nio.file.DirectoryNotEmptyException: C:\Users\jenkins\workspace\Lucene-Solr-master-Windows\solr\build\solr-solrj\test\J1\temp\solr.client.solrj.io.stream.StreamExpressionTest_607225F2726A5625-001\tempDir-001\node2\userfiles
   [junit4]    >    C:\Users\jenkins\workspace\Lucene-Solr-master-Windows\solr\build\solr-solrj\test\J1\temp\solr.client.solrj.io.stream.StreamExpressionTest_607225F2726A5625-001\tempDir-001\node2: java.nio.file.DirectoryNotEmptyException: C:\Users\jenkins\workspace\Lucene-Solr-master-Windows\solr\build\solr-solrj\test\J1\temp\solr.client.solrj.io.stream.StreamExpressionTest_607225F2726A5625-001\tempDir-001\node2
   [junit4]    >    C:\Users\jenkins\workspace\Lucene-Solr-master-Windows\solr\build\solr-solrj\test\J1\temp\solr.client.solrj.io.stream.StreamExpressionTest_607225F2726A5625-001\tempDir-001: java.nio.file.DirectoryNotEmptyException: C:\Users\jenkins\workspace\Lucene-Solr-master-Windows\solr\build\solr-solrj\test\J1\temp\solr.client.solrj.io.stream.StreamExpressionTest_607225F2726A5625-001\tempDir-001
   [junit4]    >    C:\Users\jenkins\workspace\Lucene-Solr-master-Windows\solr\build\solr-solrj\test\J1\temp\solr.client.solrj.io.stream.StreamExpressionTest_607225F2726A5625-001: java.nio.file.DirectoryNotEmptyException: C:\Users\jenkins\workspace\Lucene-Solr-master-Windows\solr\build\solr-solrj\test\J1\temp\solr.client.solrj.io.stream.StreamExpressionTest_607225F2726A5625-001
   [junit4]    >        at __randomizedtesting.SeedInfo.seed([607225F2726A5625]:0)
   [junit4]    >        at org.apache.lucene.util.IOUtils.rm(IOUtils.java:319)
   [junit4]    >        at java.base/java.lang.Thread.run(Thread.java:835)
{noformat}

I've only ever seen {{secondLevel2.txt}} show up as the problem. Based on how the test works, that suggests _either_ that the multi-file usage (i.e. {{cat("topLevel1.txt,directory1\secondLevel2.txt")}}) causes its _second_ arg to be leaked, _or_ that the single-arg directory usage (i.e. {{cat("directory1")}}) causes the last file in the directory to be leaked (*or both*).

Skimming the code in CatStream, this doesn't seem too surprising. AFAICT the only time {{currentFileLines}} gets closed is when {{maxLines}} gets exceeded, or when {{allFilesToCrawl.hasNext()}} is true. If there are no more files to crawl, or a file is 0 bytes (i.e. {{currentFileLines.hasNext()}} never returns true), then the current / "last" file will never be closed.
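
To make the suspected pattern concrete, here is a minimal sketch of a multi-file line reader that closes the current handle as soon as a file is exhausted, regardless of whether another file follows. It is not the actual CatStream code: the class {{MultiFileLineReader}}, its methods, and the use of {{BufferedReader}} are all assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.Closeable;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for a multi-file line reader; NOT the real CatStream. */
class MultiFileLineReader implements Closeable {
  private final Iterator<Path> filesToRead;
  private BufferedReader current; // handle for the file being read, or null

  MultiFileLineReader(List<Path> files) {
    this.filesToRead = files.iterator();
  }

  /** Returns the next line across all files, or null once every file is exhausted. */
  String readLine() throws IOException {
    while (true) {
      if (current == null) {
        if (!filesToRead.hasNext()) {
          return null; // nothing left to read, and no handle is open
        }
        current = Files.newBufferedReader(filesToRead.next(), StandardCharsets.UTF_8);
      }
      String line = current.readLine();
      if (line != null) {
        return line;
      }
      // The current file is exhausted (or was 0 bytes): close it unconditionally,
      // instead of only closing it when another file happens to follow.
      closeCurrent();
    }
  }

  /** True while a file handle is still held open. */
  boolean hasOpenHandle() {
    return current != null;
  }

  private void closeCurrent() throws IOException {
    if (current != null) {
      try {
        current.close();
      } finally {
        current = null;
      }
    }
  }

  /** Covers callers that stop early, e.g. after hitting a line limit. */
  @Override
  public void close() throws IOException {
    closeCurrent();
  }
}
```

With this shape, the last file and any empty file are released as soon as they yield no more lines, and {{close()}} handles a reader that is abandoned mid-file. Whether the same restructuring fits CatStream depends on details of that class I haven't verified.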


> Add FileStream Streaming Expression
> -----------------------------------
>
>                 Key: SOLR-13622
>                 URL: https://issues.apache.org/jira/browse/SOLR-13622
>             Project: Solr
>          Issue Type: New Feature
>          Components: streaming expressions
>            Reporter: Joel Bernstein
>            Assignee: Jason Gerlowski
>            Priority: Major
>             Fix For: 8.3
>
>         Attachments: SOLR-13622.patch, SOLR-13622.patch
>
>
> The FileStream will read files from a local filesystem and Stream back each line of the file as a tuple.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
