Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/04/13 19:15:49 UTC

[GitHub] [druid] vaibhav-imply opened a new issue #9694: Druid-0.17: Bad ingestion spec generation from Data-loader UI

URL: https://github.com/apache/druid/issues/9694
 
 
   When ingesting raw data from a JSON data source, the Data Loader UI offers an option in the ConfigureSchema step to choose, add, or delete dimensions and metrics.
   
   **Test Data:**
   
   ```
   {"timestamp":"2019-10-21T21:31:01.498Z","dim1":"dim1","dim2":"dim2","dim3":"dim3","m1":10}
   {"timestamp":"2019-10-21T21:31:01.498Z","dim1":"dim1-2","dim2":"dim2-1","dim3":"dim3-1","m1":20}
   {"timestamp":"2019-10-21T21:31:01.498Z","dim1":"dim1-2","dim2":"dim2-2","dim3":"dim3-2","m1":5}
   {"timestamp":"2019-10-21T21:31:01.498Z","dim1":"dim1-3","dim2":"dim2-3","dim3":"dim3-3","m1":10}
   ```
   Let's consider two cases:
   
   **case-1)** If ROLLUP=FALSE: all the fields (dim1, dim2, dim3, m1) are ingested as dimensions.
   
   **case-2)** If ROLLUP=TRUE: the Data Loader UI automatically chooses sum_m1 (longSum) on field m1 as a metric and generates the ingestion spec accordingly, which is perfect.
   
   However, when re-indexing from an already existing data source (one that was ingested as per case-1, with rollup disabled), the ConfigureSchema step reads all fields as dimensions while rollup is disabled.
   
   When rollup is enabled here, the Data Loader UI chooses sum_m1 (longSum) on field m1 as a metric (same as for indexing),
   meaning the user has decided to use `m1` as a metric. But this does not work after re-ingestion: `sum_m1` turns out to be zero, even though the Data Loader UI shows it perfectly:
   
   
   ![Uploading Re-indexing-RollUp_True.png…]()
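
For reference, the values one would expect `sum_m1` to hold for the test data can be worked out by hand. The Python sketch below only mimics a `longSum` rollup (group rows by timestamp and dimensions, then sum `m1`). It is not Druid code: the function name `rollup_long_sum` is hypothetical, and it assumes the timestamp is not truncated by a query granularity.

```python
from collections import defaultdict

# The four test rows from this report.
ROWS = [
    {"timestamp": "2019-10-21T21:31:01.498Z", "dim1": "dim1", "dim2": "dim2", "dim3": "dim3", "m1": 10},
    {"timestamp": "2019-10-21T21:31:01.498Z", "dim1": "dim1-2", "dim2": "dim2-1", "dim3": "dim3-1", "m1": 20},
    {"timestamp": "2019-10-21T21:31:01.498Z", "dim1": "dim1-2", "dim2": "dim2-2", "dim3": "dim3-2", "m1": 5},
    {"timestamp": "2019-10-21T21:31:01.498Z", "dim1": "dim1-3", "dim2": "dim2-3", "dim3": "dim3-3", "m1": 10},
]


def rollup_long_sum(rows, dimensions, field):
    """Hypothetical helper: sum `field` per (timestamp, dimension values) group."""
    totals = defaultdict(int)
    for row in rows:
        key = (row["timestamp"],) + tuple(row[d] for d in dimensions)
        totals[key] += row[field]
    return dict(totals)


expected = rollup_long_sum(ROWS, ["dim1", "dim2", "dim3"], "m1")
for key, sum_m1 in expected.items():
    print(key[1:], sum_m1)
```

Every row has a distinct combination of dim1, dim2 and dim3, so under these assumptions the rollup keeps four rows and `sum_m1` simply equals each row's `m1`. None of the expected values is zero.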
   
   
   To make it work, one needs to add metrics to the inputSource in the re-indexing ingestion spec, e.g.:
   
   ```
   "inputSource": {
       "type": "druid",
       "metrics": [
         "m1"        <--=============---
       ],
       "dataSource": "orig",
       "interval": "2000/3000"
     }
   ```
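
Until the Data Loader generates this itself, the same workaround could be applied to a generated spec before submitting it. The Python sketch below is only an illustration: the helper name `add_reindex_metrics` and the target data source name `orig_rollup` are hypothetical, and it assumes the task JSON keeps the input source under `spec.ioConfig.inputSource` and the aggregators under `spec.dataSchema.metricsSpec`.

```python
import copy


def add_reindex_metrics(task):
    """Hypothetical helper: return a copy of the task whose druid inputSource
    lists every field that the metricsSpec aggregates over."""
    patched = copy.deepcopy(task)
    spec = patched["spec"]
    input_source = spec["ioConfig"]["inputSource"]
    if input_source.get("type") != "druid":
        return patched  # only re-indexing specs need the workaround

    metrics = list(input_source.get("metrics", []))
    for aggregator in spec["dataSchema"].get("metricsSpec", []):
        field = aggregator.get("fieldName")  # e.g. "count" has no fieldName
        if field is not None and field not in metrics:
            metrics.append(field)
    if metrics:
        input_source["metrics"] = metrics
    return patched


# Assumed shape of a spec generated for re-indexing with rollup enabled.
task = {
    "type": "index_parallel",
    "spec": {
        "ioConfig": {
            "type": "index_parallel",
            "inputSource": {"type": "druid", "dataSource": "orig", "interval": "2000/3000"},
        },
        "dataSchema": {
            "dataSource": "orig_rollup",  # hypothetical target name
            "metricsSpec": [{"type": "longSum", "name": "sum_m1", "fieldName": "m1"}],
        },
    },
}

patched = add_reindex_metrics(task)
print(patched["spec"]["ioConfig"]["inputSource"])
```

The helper works on a deep copy, so the generated spec itself is left untouched and the two can be compared before submission.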
   **Shouldn't the Data Loader automatically generate the JSON with the required metrics under inputSource? This seems to be a Data Loader UI issue.**
   
   
