You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@beam.apache.org by "Matt Darwin (JIRA)" <ji...@apache.org> on 2018/01/08 16:00:00 UTC

[jira] [Created] (BEAM-3429) TextIO.Write not handling URI properly

Matt Darwin created BEAM-3429:
---------------------------------

             Summary: TextIO.Write not handling URI properly
                 Key: BEAM-3429
                 URL: https://issues.apache.org/jira/browse/BEAM-3429
             Project: Beam
          Issue Type: Bug
          Components: beam-model
    Affects Versions: 2.2.0
         Environment: Mac OS X
            Reporter: Matt Darwin
            Assignee: Kenneth Knowles


I think I have found a bug in TextIO, in the way it handles URIs.  I am told that the TextIO.Write.to(String) method should take a URI string, but this doesn't seem to work for me.

Test case inline:

{{ @Test
    public void testBeamTextIO() throws IOException {
        List<String> words = Arrays.asList("tom", "huck", "polly");
        Create.Values<String> source = Create.of(words);
        PCollection<String> coll = this.pipeline.apply(source);
        String fileName = "test" + System.currentTimeMillis() + ".txt";
        String fileURI = new File(fileName).toPath().toUri().toString();
        // ie file:///full/unix/path/to/test.file

        coll.apply(TextIO.write().to(fileURI));
        //test passes if we use the line below instead of the above
//        coll.apply(TextIO.write().to(fileName));
        pipeline.run();

        //read all sharded files:
        Path dir = Paths.get(URI.create(fileURI)).getParent();

        List<Path> files = new ArrayList<>();
        Files.newDirectoryStream(dir, new DirectoryStream.Filter<Path>() {
            @Override public boolean accept(Path entry) throws IOException {
                return entry.toString().contains(fileName);
            }
        }).forEach(path -> files.add(path));

        assertFalse("no files produced!", files.isEmpty());
        List<String> fileContents = new ArrayList<>();
        for (Path f : files) {
            fileContents.addAll(Files.readAllLines(f));
        }
        assertTrue(fileContents.contains("tom"));
        assertTrue(fileContents.contains("polly"));

    }}}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)