Posted to commits@dolphinscheduler.apache.org by GitBox <gi...@apache.org> on 2020/09/14 07:59:47 UTC

[GitHub] [incubator-dolphinscheduler] liuzx8888 opened a new issue #3739: [Bug][service] sql result set is converted to json, there is data loss

liuzx8888 opened a new issue #3739:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/3739


   **Describe the bug**
   When a SQL result set is converted to JSON, some of the data is lost.
   
   **To Reproduce**
   
   1. SQL script:
    `select
   DBS.`NAME`,
   PARTITIONS.PART_ID,
   TBLS.TBL_NAME,
   PARTITIONS.PART_NAME,
   SDS.LOCATION,
   PARTITION_FILENUM.PARAM_VALUE AS FILE_NUM,
   PARTITION_TOTAL_Size.PARAM_VALUE AS FILE_SIZE,
    CEILING(PARTITION_TOTAL_Size.PARAM_VALUE/(128*1024*1024)) AS CLAC_FILE_NUM
   from 
   DBS,SDS,TBLS,PARTITIONS,
   (select * from PARTITION_PARAMS where PARAM_KEY='numFiles') as PARTITION_FILENUM,
   (select * from PARTITION_PARAMS where PARAM_KEY='totalSize') as PARTITION_TOTAL_Size
   where DBS.DB_ID = TBLS.DB_ID
   and PARTITIONS.SD_ID = SDS.SD_ID
   and TBLS.TBL_ID=PARTITIONS.TBL_ID
   and PARTITIONS.PART_ID =PARTITION_FILENUM.PART_ID
   and PARTITIONS.PART_ID =PARTITION_TOTAL_Size.PART_ID
   AND PARTITION_FILENUM.PARAM_VALUE>CEILING(PARTITION_TOTAL_Size.PARAM_VALUE/(128*1024*1024)) 
   limit 1
   `
   Result:
   
   ![image](https://user-images.githubusercontent.com/10862577/93058748-5e924300-f6a2-11ea-9eb5-9da43fdecf9b.png)
   
   
   2. Code location: class `SqlTask`, method `resultProcess`
   ![image](https://user-images.githubusercontent.com/10862577/93058544-1f63f200-f6a2-11ea-9a3d-23230ffc02ec.png)
   
   Generated JSON result:
   [{
   	"NAME": "ods",
   	"PART_ID": 28980,
   	"TBL_NAME": "xxxxx",
   	"PART_NAME": "partition_col=201511",
   	"LOCATION": "hdfs://cdh1:8020/user/hive/warehouse/ods.db/xxxxx/partition_col=201511",
   	"PARAM_VALUE": "6320486",
   	"CLAC_FILE_NUM": 1.0
   }]
   
   **Expected behavior**
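   Comparing the query result with the generated JSON shows:
   1. The column aliases have no effect: the JSON keys are the original column names.
   2. Because the original column names are used, FILE_NUM and FILE_SIZE (both read from PARAM_VALUE) end up under a single PARAM_VALUE key, so one of the two values is missing.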
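
The loss in point 2 can be reproduced without a database. The sketch below assumes each row is collected into a map keyed by column name, which is what the generated JSON suggests. The class and method names (`DuplicateKeyDemo`, `buildRow`) and the placeholder values are hypothetical and are not taken from DolphinScheduler.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DuplicateKeyDemo {

    // Builds one row keyed by the given names.
    // A repeated key silently overwrites the value stored earlier.
    public static Map<String, Object> buildRow(String[] keys, Object[] values) {
        Map<String, Object> row = new LinkedHashMap<>();
        for (int i = 0; i < keys.length; i++) {
            row.put(keys[i], values[i]);
        }
        return row;
    }

    public static void main(String[] args) {
        // Placeholder values, not real query output.
        Object[] values = {"numFiles-value", "totalSize-value"};

        // Keyed by original column name: both columns are called PARAM_VALUE.
        Map<String, Object> byName =
                buildRow(new String[] {"PARAM_VALUE", "PARAM_VALUE"}, values);

        // Keyed by alias: the two columns stay distinct.
        Map<String, Object> byAlias =
                buildRow(new String[] {"FILE_NUM", "FILE_SIZE"}, values);

        System.out.println("keyed by name:  " + byName);
        System.out.println("keyed by alias: " + byAlias);
    }
}
```

With name-based keys the map keeps only the value written last, which matches the single PARAM_VALUE entry in the JSON above.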
   
   **Which version of Dolphin Scheduler:**
    -[1.3.1-preview]
   
   **Requirement or improvement**
   
   1. Modify the code to use the column alias instead of the original column name.
    I think it should be changed to this:
   ![image](https://user-images.githubusercontent.com/10862577/93058671-428ea180-f6a2-11ea-81cb-d36e33c332b9.png)
   
   2. Add a feature that saves the execution result of the previous task node so that it can be passed as a parameter to the next task node. This scenario is very common.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-dolphinscheduler] xingchun-chen edited a comment on issue #3739: [Bug][service] sql result set is converted to json, there is data loss

Posted by GitBox <gi...@apache.org>.
xingchun-chen edited a comment on issue #3739:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/3739#issuecomment-693138850


   > 1.Modify the code to take the database field alias instead of the original name.
   
   Item 1 is the same as issue #3549, which is resolved in the dev branch.





[GitHub] [incubator-dolphinscheduler] liuzx8888 closed issue #3739: [Bug][service] sql result set is converted to json, there is data loss

Posted by GitBox <gi...@apache.org>.
liuzx8888 closed issue #3739:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/3739


   





[GitHub] [incubator-dolphinscheduler] xingchun-chen commented on issue #3739: [Bug][service] sql result set is converted to json, there is data loss

Posted by GitBox <gi...@apache.org>.
xingchun-chen commented on issue #3739:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/3739#issuecomment-693138850


   > 1.Modify the code to take the database field alias instead of the original name.
   
   Item 1 is the same as issue #3549, which is resolved in the dev branch.





[GitHub] [incubator-dolphinscheduler] felix-thinkingdata commented on issue #3739: [Bug][service] sql result set is converted to json, there is data loss

Posted by GitBox <gi...@apache.org>.
felix-thinkingdata commented on issue #3739:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/3739#issuecomment-693370363


   > > 1.Modify the code to take the database field alias instead of the original name.
   > 
   > item 1 is the same as the issue #3549, which is resolved in dev branch
   
   This issue can be closed.





[GitHub] [incubator-dolphinscheduler] felix-thinkingdata commented on issue #3739: [Bug][service] sql result set is converted to json, there is data loss

Posted by GitBox <gi...@apache.org>.
felix-thinkingdata commented on issue #3739:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/3739#issuecomment-692098126


   You're right. You can add a check: if getColumnLabel and getColumnName return different values, use getColumnLabel, so that the alias takes priority over the column name. Can you fix this problem and submit a pull request?
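
The check suggested above could look roughly like the following sketch. It is not the actual `SqlTask` code: the class and method names (`ResultSetRows`, `keyFor`, `toRows`) are hypothetical, and only standard JDBC calls are used. According to the JDBC Javadoc, `getColumnLabel` returns the same value as `getColumnName` when no `AS` clause is given, so preferring the label also covers columns without an alias. The fallback for an empty label is an extra precaution and an assumption about driver behavior.

```java
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ResultSetRows {

    // Prefers the alias (column label); falls back to the column name
    // only when the driver reports no label at all.
    public static String keyFor(String columnLabel, String columnName) {
        return (columnLabel == null || columnLabel.isEmpty()) ? columnName : columnLabel;
    }

    // Converts every row of the result set into a map keyed by alias.
    public static List<Map<String, Object>> toRows(ResultSet rs) throws SQLException {
        ResultSetMetaData md = rs.getMetaData();
        int columnCount = md.getColumnCount();
        List<Map<String, Object>> rows = new ArrayList<>();
        while (rs.next()) {
            Map<String, Object> row = new LinkedHashMap<>();
            // JDBC column indexes start at 1.
            for (int i = 1; i <= columnCount; i++) {
                String key = keyFor(md.getColumnLabel(i), md.getColumnName(i));
                row.put(key, rs.getObject(i));
            }
            rows.add(row);
        }
        return rows;
    }
}
```

Keyed this way, the two PARAM_VALUE columns from the query would be stored under FILE_NUM and FILE_SIZE and no longer overwrite each other.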





[GitHub] [incubator-dolphinscheduler] felix-thinkingdata commented on issue #3739: [Bug][service] sql result set is converted to json, there is data loss

Posted by GitBox <gi...@apache.org>.
felix-thinkingdata commented on issue #3739:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/3739#issuecomment-692084262


   select DBS.NAME                                                        as 'DB_NAME',
          PARTITIONS.PART_ID,
          TBLS.TBL_NAME,
          PARTITIONS.PART_NAME,
          SDS.LOCATION,
          PARTITION_FILENUM.PARAM_VALUE                                   AS FILE_NUM,
          PARTITION_TOTAL_Size.PARAM_VALUE                                AS FILE_SIZE,
          CEILING(PARTITION_TOTAL_Size.PARAM_VALUE / (128 * 1024 * 1024)) AS CLAC_FILE_NUM
   from DBS,
        SDS,
        TBLS,
        PARTITIONS,
        (select * from PARTITION_PARAMS where PARAM_KEY = 'numFiles') as PARTITION_FILENUM,
        (select * from PARTITION_PARAMS where PARAM_KEY = 'totalSize') as PARTITION_TOTAL_Size
   where DBS.DB_ID = TBLS.DB_ID
     and PARTITIONS.SD_ID = SDS.SD_ID
     and TBLS.TBL_ID = PARTITIONS.TBL_ID
     and PARTITIONS.PART_ID = PARTITION_FILENUM.PART_ID
     and PARTITIONS.PART_ID = PARTITION_TOTAL_Size.PART_ID
     AND PARTITION_FILENUM.PARAM_VALUE > CEILING(PARTITION_TOTAL_Size.PARAM_VALUE / (128 * 1024 * 1024))
   limit 1
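
The filter in this query keeps partitions whose file count exceeds the number of 128 MiB files needed to hold their total size. A minimal sketch of that arithmetic follows; the class and method names (`ExpectedFileCount`, `expectedFiles`, `hasTooManyFiles`) are hypothetical, and the inputs are assumed to be non-negative byte and file counts already parsed from PARAM_VALUE.

```java
public class ExpectedFileCount {

    // 128 MiB, the divisor used in the query: 128 * 1024 * 1024.
    public static final long TARGET_FILE_BYTES = 128L * 1024 * 1024;

    // Integer equivalent of CEILING(totalSize / (128*1024*1024)) for non-negative sizes.
    public static long expectedFiles(long totalSizeBytes) {
        return (totalSizeBytes + TARGET_FILE_BYTES - 1) / TARGET_FILE_BYTES;
    }

    // Mirrors the query's filter: numFiles > CEILING(totalSize / 128 MiB).
    public static boolean hasTooManyFiles(long numFiles, long totalSizeBytes) {
        return numFiles > expectedFiles(totalSizeBytes);
    }

    public static void main(String[] args) {
        // 6320486 is the PARAM_VALUE from the reported JSON, assumed here to be totalSize.
        System.out.println(expectedFiles(6320486L));
    }
}
```

If 6320486 is indeed the total size in bytes, it fits in a single 128 MiB file, which agrees with the CLAC_FILE_NUM of 1.0 in the reported JSON.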
   

