Posted to issues@spark.apache.org by "Steve Loughran (JIRA)" <ji...@apache.org> on 2017/02/03 10:17:52 UTC
[jira] [Comment Edited] (SPARK-19407) defaultFS is used FileSystem.get instead of getting it from uri scheme
[ https://issues.apache.org/jira/browse/SPARK-19407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15851290#comment-15851290 ]
Steve Loughran edited comment on SPARK-19407 at 2/3/17 10:17 AM:
-----------------------------------------------------------------
Yes, looks like {{StreamMetadata.read()}} is getting it wrong. Which is funny, as {{StreamingQueryManager.createQuery()}} gets it right
{code}
/** Read the metadata from file if it exists */
def read(metadataFile: Path, hadoopConf: Configuration): Option[StreamMetadata] = {
  val fs = FileSystem.get(hadoopConf) /* HERE */
  if (fs.exists(metadataFile)) {
    var input: FSDataInputStream = null
    try {
{code}
when it should be
{code}
val fs = FileSystem.get(metadataFile, hadoopConf)
{code}
The hard part will be testing this.
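A minimal pure-Scala sketch of why the one-argument overload goes wrong (this deliberately does not use Hadoop; the {{file}} default and all names here are illustrative assumptions, not Hadoop's actual implementation): {{FileSystem.get(conf)}} always resolves against {{fs.defaultFS}}, while {{FileSystem.get(uri, conf)}} resolves against the path's own scheme, falling back to the default only when the URI has none.

```scala
import java.net.URI

// Toy model of the two FileSystem.get overloads (illustration only,
// not Hadoop code). defaultFS stands in for the fs.defaultFS setting.
object SchemeResolution {
  // assumption: the cluster's default filesystem is file://
  val defaultFS = "file"

  // models FileSystem.get(conf): ignores the path entirely
  def getFromConfOnly(): String = defaultFS

  // models FileSystem.get(uri, conf): uses the path's scheme,
  // falling back to the default only for scheme-less URIs
  def getFromUri(path: String): String =
    Option(new URI(path).getScheme).getOrElse(defaultFS)

  def main(args: Array[String]): Unit = {
    val checkpoint = "s3a://bucket/checkpoint/metadata"
    println(getFromConfOnly())      // "file" -> wrong FS for an s3a path
    println(getFromUri(checkpoint)) // "s3a"  -> matches the path's scheme
  }
}
```

This is exactly the mismatch in the stack trace below: the metadata path is {{s3a://}} but the resolved filesystem is {{file:///}}.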
> defaultFS is used FileSystem.get instead of getting it from uri scheme
> ----------------------------------------------------------------------
>
> Key: SPARK-19407
> URL: https://issues.apache.org/jira/browse/SPARK-19407
> Project: Spark
> Issue Type: Bug
> Components: Structured Streaming
> Affects Versions: 2.1.0
> Reporter: Amit Assudani
> Labels: checkpoint, filesystem, starter, streaming
>
> Caused by: java.lang.IllegalArgumentException: Wrong FS: s3a://**************/checkpoint/7b2231a3-d845-4740-bfa3-681850e5987f/metadata, expected: file:///
> at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:649)
> at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:82)
> at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:606)
> at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:824)
> at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:601)
> at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:421)
> at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1426)
> at org.apache.spark.sql.execution.streaming.StreamMetadata$.read(StreamMetadata.scala:51)
> at org.apache.spark.sql.execution.streaming.StreamExecution.<init>(StreamExecution.scala:100)
> at org.apache.spark.sql.streaming.StreamingQueryManager.createQuery(StreamingQueryManager.scala:232)
> at org.apache.spark.sql.streaming.StreamingQueryManager.startQuery(StreamingQueryManager.scala:269)
> at org.apache.spark.sql.streaming.DataStreamWriter.start(DataStreamWriter.scala:262)
> Can easily replicate on a Spark standalone cluster by providing a checkpoint location with any URI scheme other than "file://" and not overriding fs.defaultFS in the config.
> Workaround: pass --conf spark.hadoop.fs.defaultFS=s3a://somebucket, or set it in SparkConf or spark-defaults.conf.
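The workaround above, sketched as a submit invocation (the bucket name is a placeholder; this only masks the bug by making the default filesystem match the checkpoint scheme, it does not fix the resolution in StreamMetadata.read):

```shell
# Workaround sketch: force fs.defaultFS to the checkpoint's scheme
# so FileSystem.get(hadoopConf) happens to resolve to the right FS.
# "somebucket" is a placeholder, not a real bucket.
spark-submit \
  --conf spark.hadoop.fs.defaultFS=s3a://somebucket \
  my-streaming-app.jar
```

The same setting can go in spark-defaults.conf as a {{spark.hadoop.fs.defaultFS}} line instead of on the command line.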
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org