Posted to dev@lucene.apache.org by "Jason Gerlowski (JIRA)" <ji...@apache.org> on 2019/07/15 01:46:00 UTC

[jira] [Commented] (SOLR-13622) Add FileStream Streaming Expression

    [ https://issues.apache.org/jira/browse/SOLR-13622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16884802#comment-16884802 ] 

Jason Gerlowski commented on SOLR-13622:
----------------------------------------

Joel and I discussed this a bit offline and had some initial thoughts about what this should look like:

* file specification could take either files or directories (which would then be processed recursively).  Ideally the file parameter would allow a comma-delimited list of files/directories to process.
* received filepaths would have to be evaluated relative to a designated data directory (to avoid the security issue of allowing arbitrary files on the Solr box to be read).  To the same end, we'd need to sanitize the file paths that users provide to ensure they can't escape the sandbox we set up for them.
* each emitted tuple could contain the filename/path of the file it came from, to allow differentiation of lines from multiple files.
* we could add a numeric parameter to cap the number of lines that get emitted if users just want to see the first N lines of a large file (or group of files)
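
To make the points above concrete, here is a rough sketch in plain Java (not the attached patch, and not using Solr's streaming classes). The class name FileStreamSketch, the method names, the "file" and "line" field names, and the convention that a negative cap means "no limit" are all assumptions made up for illustration. It only shows one possible way to do the sandbox check, the recursive walk over a comma-delimited list, and the line cap.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration only; names and behavior are assumptions, not Solr's API.
public class FileStreamSketch {

  // Resolve a user-supplied path against the data directory and reject anything
  // that ends up outside it (e.g. "../x" or an absolute path).
  // Note: normalize() is purely lexical and does not follow symlinks, so a real
  // implementation would likely need an extra check for symlinks.
  static Path resolveInSandbox(Path root, String userPath) {
    Path base = root.toAbsolutePath().normalize();
    Path candidate = base.resolve(userPath.trim()).normalize();
    if (!candidate.startsWith(base)) {
      throw new IllegalArgumentException("Path escapes the data directory: " + userPath);
    }
    return candidate;
  }

  // Read lines from a comma-delimited list of files/directories under root.
  // Directories are walked recursively. Each "tuple" is a map holding the path
  // (relative to root) and the line. A negative maxLines means no cap (assumption).
  static List<Map<String, Object>> readLines(Path root, String fileParam, int maxLines)
      throws IOException {
    Path base = root.toAbsolutePath().normalize();
    List<Map<String, Object>> tuples = new ArrayList<>();
    for (String entry : fileParam.split(",")) {
      Path start = resolveInSandbox(base, entry);
      List<Path> files;
      try (Stream<Path> walk = Files.walk(start)) {
        // Sort so the output order is deterministic across runs.
        files = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
      }
      for (Path file : files) {
        try (BufferedReader reader = Files.newBufferedReader(file)) {
          String line;
          while ((line = reader.readLine()) != null) {
            if (maxLines >= 0 && tuples.size() >= maxLines) {
              return tuples; // cap reached; stop without reading the rest
            }
            Map<String, Object> tuple = new LinkedHashMap<>();
            tuple.put("file", base.relativize(file).toString());
            tuple.put("line", line);
            tuples.add(tuple);
          }
        }
      }
    }
    return tuples;
  }
}
```

With a data directory containing a.txt and a sub/ directory, a call such as FileStreamSketch.readLines(dataDir, "a.txt,sub", 10) would return at most 10 tuples, while a parameter like "../solr.xml" would be rejected before any file is opened.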

> Add FileStream Streaming Expression
> -----------------------------------
>
>                 Key: SOLR-13622
>                 URL: https://issues.apache.org/jira/browse/SOLR-13622
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: streaming expressions
>            Reporter: Joel Bernstein
>            Assignee: Jason Gerlowski
>            Priority: Major
>         Attachments: SOLR-13622.patch
>
>
> The FileStream will read files from a local filesystem and stream back each line of the file as a tuple.
