Posted to commits@hudi.apache.org by "Yao Zhang (Jira)" <ji...@apache.org> on 2022/07/29 02:54:00 UTC
[jira] [Comment Edited] (HUDI-4485) Hudi cli got empty result for command show fsview all

    [ https://issues.apache.org/jira/browse/HUDI-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17572719#comment-17572719 ] 

Yao Zhang edited comment on HUDI-4485 at 7/29/22 2:53 AM:
----------------------------------------------------------

Hi all,

After further investigation, I found that Spring Shell 1.2.0 processes block comments in the command line. The relevant code is:

org.springframework.shell.core.AbstractShell::executeCommand
{code:java}
// We support simple block comments; ie a single pair per line
if (!inBlockComment && line.contains("/*") && line.contains("*/")) {
    blockCommentBegin();
    String lhs = line.substring(0, line.lastIndexOf("/*"));
    if (line.contains("*/")) {
        line = lhs + line.substring(line.lastIndexOf("*/") + 2);
        blockCommentFinish();
    } else {
        line = lhs;
    }
}
if (inBlockComment) {
    if (!line.contains("*/")) {
        return new CommandResult(true);
    }
    blockCommentFinish();
    line = line.substring(line.lastIndexOf("*/") + 2);
}
// We also support inline comments (but only at start of line, otherwise valid
// command options like http://www.helloworld.com will fail as per ROO-517)
if (!inBlockComment && (line.trim().startsWith("//") || line.trim().startsWith("#"))) { // # support in ROO-1116
    line = "";
}
{code}

The code above removes the last occurrence of "/* xxx \*/" inside a command line string. That's why, when we pass '*/*/*' to pathRegex, we end up with '\*/\*\*'. Moreover, the block comment removal logic above is buggy: in the case of '\*/\*/\*', the begin comment identifier is the parenthesized part of '\*/\*(/\*)', while the end comment identifier is the parenthesized part of '\*/(\*/)\*'. The two identifiers overlap, and the characters before the begin identifier and after the end identifier are kept. That's why we get '\*/\*\*'.
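
The index arithmetic can be checked with a small standalone sketch. It copies only the single-line branch of the logic quoted above and leaves out the inBlockComment state; the class and method names are hypothetical and are not part of Spring Shell.

```java
// Standalone sketch of the single-line block comment stripping quoted above.
// BlockCommentDemo and stripBlockComment are hypothetical names, not Spring Shell API.
public class BlockCommentDemo {

    // Mirrors the first branch of the quoted logic for a line that contains
    // both markers: keep the text before the last begin marker and the text
    // after the last end marker.
    static String stripBlockComment(String line) {
        if (line.contains("/*") && line.contains("*/")) {
            String lhs = line.substring(0, line.lastIndexOf("/*"));
            return lhs + line.substring(line.lastIndexOf("*/") + 2);
        }
        return line;
    }

    public static void main(String[] args) {
        // In the 5-character pattern, the last begin marker starts at index 3
        // and the last end marker starts at index 2, so they share the '/' at
        // index 3. Indexes 0..2 and index 4 survive.
        System.out.println(stripBlockComment("*/*/*")); // prints */**
    }
}
```

Only the slash at index 3 is dropped, which turns the three-level glob into a two-level one.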

Finally, I suggest we disable block comment removal in the Hudi CLI command line. Unfortunately, Spring Shell 1.2.0 does not provide a configuration option to disable block comment processing. I also tried a converter that appends '/*\*/' to every command string, but it did not work because Spring Shell processes block comments before invoking converters.
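
For reference, here is a sketch of why appending an empty comment looked promising at the string level. Since the stripping only touches the last begin marker and the last end marker, a trailing empty comment would absorb the removal. This is an illustration under assumptions rather than a fix: the names are hypothetical, only the single-line branch is re-implemented, and the append would have to happen before executeCommand sees the line, which a converter cannot do.

```java
// Sketch of the "trailing empty comment" idea at the string level only.
// SentinelCommentDemo and its methods are hypothetical names, not Spring Shell API.
public class SentinelCommentDemo {

    // Same simplified single-line stripping as in the quoted executeCommand code.
    static String stripBlockComment(String line) {
        if (line.contains("/*") && line.contains("*/")) {
            String lhs = line.substring(0, line.lastIndexOf("/*"));
            return lhs + line.substring(line.lastIndexOf("*/") + 2);
        }
        return line;
    }

    // Appends an empty block comment so that the last begin and end markers
    // both belong to the appended suffix.
    static String withSentinel(String line) {
        return line + "/**/";
    }

    public static void main(String[] args) {
        String pattern = "*/*/*";
        // Without the suffix, the pattern itself is mangled.
        System.out.println(stripBlockComment(pattern));
        // With the suffix, only the suffix is removed and the pattern survives.
        System.out.println(stripBlockComment(withSentinel(pattern)));
    }
}
```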


was (Author: paul8263):
Hi all,

After further investigation, I found that Spring Shell 1.2.0 processes block comments in the command line. The relevant code is:

org.springframework.shell.core.AbstractShell::executeCommand
{code:java}
// We support simple block comments; ie a single pair per line
if (!inBlockComment && line.contains("/*") && line.contains("*/")) {
    blockCommentBegin();
    String lhs = line.substring(0, line.lastIndexOf("/*"));
    if (line.contains("*/")) {
        line = lhs + line.substring(line.lastIndexOf("*/") + 2);
        blockCommentFinish();
    } else {
        line = lhs;
    }
}
if (inBlockComment) {
    if (!line.contains("*/")) {
        return new CommandResult(true);
    }
    blockCommentFinish();
    line = line.substring(line.lastIndexOf("*/") + 2);
}
// We also support inline comments (but only at start of line, otherwise valid
// command options like http://www.helloworld.com will fail as per ROO-517)
if (!inBlockComment && (line.trim().startsWith("//") || line.trim().startsWith("#"))) { // # support in ROO-1116
    line = "";
}
{code}

The code above removes the last occurrence of "/* xxx */" inside a command line string. That's why, when we pass '*/*/*' to pathRegex, we end up with '*/**'. Moreover, the block comment removal logic above is buggy: in the case of '*/*/*', the begin comment identifier is the parenthesized part of '*/*(/*)', while the end comment identifier is the parenthesized part of '*/(*/)*'. The two identifiers overlap, and the characters before the begin identifier and after the end identifier are kept. That's why we get '*/**'.

Finally, I suggest we disable block comment removal in the Hudi CLI command line. Unfortunately, Spring Shell 1.2.0 does not provide a configuration option to disable block comment processing. I also tried a converter that appends '/**/' to every command string, but it did not work because Spring Shell processes block comments before invoking converters.

> Hudi cli got empty result for command show fsview all
> -----------------------------------------------------
>
>                 Key: HUDI-4485
>                 URL: https://issues.apache.org/jira/browse/HUDI-4485
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: cli
>    Affects Versions: 0.11.1
>         Environment: Hudi version : 0.11.1
> Spark version : 3.1.1
> Hive version : 3.1.0
> Hadoop version : 3.1.1
>            Reporter: Yao Zhang
>            Priority: Minor
>             Fix For: 0.12.0
>
>
> This issue is from: [[SUPPORT] Hudi cli got empty result for command show fsview all · Issue #6177 · apache/hudi (github.com)|https://github.com/apache/hudi/issues/6177]
> **Describe the problem you faced**
> Hudi CLI returned an empty result after running the command `show fsview all`.
> ![image](https://user-images.githubusercontent.com/7007327/180346750-6a55f472-45ac-46cf-8185-3c4fc4c76434.png)
> The type of table t1 is COW, and I am sure that the parquet files are actually generated inside the data folder. Also, the parquet files are not damaged, as the data can be retrieved correctly by reading it as a Hudi table or by reading each parquet file directly (using Spark).
> **To Reproduce**
> Steps to reproduce the behavior:
> 1. Enter Flink SQL client.
> 2. Execute the SQL and check the data was written successfully.
> ```sql
> CREATE TABLE t1(
>   uuid VARCHAR(20),
>   name VARCHAR(10),
>   age INT,
>   ts TIMESTAMP(3),
>   `partition` VARCHAR(20)
> )
> PARTITIONED BY (`partition`)
> WITH (
>   'connector' = 'hudi',
>   'path' = 'hdfs:///path/to/table/',
>   'table.type' = 'COPY_ON_WRITE'
> );
> -- insert data using values
> INSERT INTO t1 VALUES
>   ('id1','Danny',23,TIMESTAMP '1970-01-01 00:00:01','par1'),
>   ('id2','Stephen',33,TIMESTAMP '1970-01-01 00:00:02','par1'),
>   ('id3','Julian',53,TIMESTAMP '1970-01-01 00:00:03','par2'),
>   ('id4','Fabian',31,TIMESTAMP '1970-01-01 00:00:04','par2'),
>   ('id5','Sophia',18,TIMESTAMP '1970-01-01 00:00:05','par3'),
>   ('id6','Emma',20,TIMESTAMP '1970-01-01 00:00:06','par3'),
>   ('id7','Bob',44,TIMESTAMP '1970-01-01 00:00:07','par4'),
>   ('id8','Han',56,TIMESTAMP '1970-01-01 00:00:08','par4');
> ```
> 3. Enter Hudi cli and execute `show fsview all`
> **Expected behavior**
> `show fsview all` in Hudi cli should return all file slices.
> **Environment Description**
> * Hudi version : 0.11.1
> * Spark version : 3.1.1
> * Hive version : 3.1.0
> * Hadoop version : 3.1.1
> * Storage (HDFS/S3/GCS..) : HDFS
> * Running on Docker? (yes/no) : no
> **Additional context**
> No.
> **Stacktrace**
> N/A
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)