Posted to issues@beam.apache.org by "Anonymous (Jira)" <ji...@apache.org> on 2023/04/13 11:07:00 UTC

[jira] [Updated] (BEAM-12715) SnowflakeWrite fails in batch mode when the number of shards is > 1000

     [ https://issues.apache.org/jira/browse/BEAM-12715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anonymous updated BEAM-12715:
-----------------------------
    Status: Triage Needed  (was: Resolved)

> SnowflakeWrite fails in batch mode when the number of shards is > 1000
> ----------------------------------------------------------------------
>
>                 Key: BEAM-12715
>                 URL: https://issues.apache.org/jira/browse/BEAM-12715
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-snowflake
>            Reporter: Daniel Mateus Pires
>            Priority: P2
>             Fix For: 2.33.0
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> When writing to Snowflake in batch mode, the load fails if there are more than 1,000 files to import.
> From the Snowflake docs:
> {quote}Of the three options for identifying/specifying data files to load from a stage, providing a discrete list of files is generally the fastest; however, the FILES parameter supports a maximum of 1,000 files, meaning a COPY command executed with the FILES parameter can only load up to 1,000 files.
> {quote}
> I noticed that the Snowflake write in batch mode ignores the number of shards set by the user; as a first step, the write should honor the user-configured shard count so that the number of staged files can be kept under the limit.
> Longer term, should Beam issue multiple COPY statements, each with a distinct list of at most 1,000 files, when the total exceeds 1,000? Perhaps inside a single transaction (a BEGIN; ...; COMMIT; block); see the sketch below.
>  
> Also, I wanted to set the Jira issue component to io-java-snowflake, but it does not exist.
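> 
> A minimal sketch of the chunked-COPY idea above, assuming a plain JDBC connection to Snowflake; the class ChunkedSnowflakeCopy, the method copyInChunks, and the table/stage/file-name parameters are hypothetical illustrations, not part of SnowflakeIO:
> {code:java}
> import java.sql.Connection;
> import java.sql.Statement;
> import java.util.List;
> import java.util.stream.Collectors;
> 
> public class ChunkedSnowflakeCopy {
> 
>   // Snowflake's FILES parameter accepts at most 1,000 entries per COPY.
>   private static final int MAX_FILES_PER_COPY = 1000;
> 
>   /**
>    * Loads the staged files in chunks of at most 1,000, one COPY INTO per
>    * chunk, all inside a single explicit transaction so that either every
>    * chunk loads or none does. Identifiers and file names are assumed to be
>    * trusted, internally generated values (no escaping is done here).
>    */
>   static void copyInChunks(Connection conn, String table, String stage, List<String> files)
>       throws Exception {
>     try (Statement stmt = conn.createStatement()) {
>       stmt.execute("BEGIN");
>       for (int start = 0; start < files.size(); start += MAX_FILES_PER_COPY) {
>         List<String> chunk =
>             files.subList(start, Math.min(start + MAX_FILES_PER_COPY, files.size()));
>         // FILES expects paths relative to the stage, quoted and comma-separated.
>         String fileList =
>             chunk.stream().map(f -> "'" + f + "'").collect(Collectors.joining(", "));
>         stmt.execute("COPY INTO " + table + " FROM @" + stage + " FILES = (" + fileList + ")");
>       }
>       stmt.execute("COMMIT");
>     } catch (Exception e) {
>       try (Statement rollback = conn.createStatement()) {
>         rollback.execute("ROLLBACK"); // undo any chunks that already loaded
>       }
>       throw e;
>     }
>   }
> }
> {code}
> Since Snowflake accepts BEGIN/COMMIT/ROLLBACK as ordinary statements, the chunks either all load or are all rolled back, so the write stays atomic from the pipeline's point of view.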



--
This message was sent by Atlassian Jira
(v8.20.10#820010)