Posted to dev@atlas.apache.org by Radhika Kundam via Review Board <no...@reviews.apache.org> on 2022/04/11 15:50:41 UTC

Re: Review Request 73926: ATLAS-4571 : Impala Hook : Indexed string field (solr.StrField) which is too large ERROR

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73926/#review224294
-----------------------------------------------------------


Ship it!

- Radhika Kundam


On April 7, 2022, 9:57 p.m., Snehal Ambavkar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/73926/
> -----------------------------------------------------------
> 
> (Updated April 7, 2022, 9:57 p.m.)
> 
> 
> Review request for atlas, Jayendra Parab, Mandar Ambawane, Pinal Shah, Radhika Kundam, and Sidharth Mishra.
> 
> 
> Bugs: ATLAS-4571
>     https://issues.apache.org/jira/browse/ATLAS-4571
> 
> 
> Repository: atlas
> 
> 
> Description
> -------
> 
> ERROR :
> 
> Exception writing document id test1 to the index; possible analysis error: Document contains at least one immense term in field="test_s" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped.  Please correct the analyzer to not produce such terms.
> 
> RCA : 
> 
> Impala process entities created by ImpalaHook store the query string in the name field.
> Because the query string can be very large, the indexed name can exceed the maximum term length (32766 bytes in UTF-8), which triggers the error above.
> 
> Fix :
> 
> Store qualifiedName in the name field instead of the query string.
> 
> 
> Diffs
> -----
> 
>   addons/impala-bridge/src/main/java/org/apache/atlas/impala/hook/events/BaseImpalaEvent.java 32efb8321 
>   addons/impala-bridge/src/test/java/org/apache/atlas/impala/ImpalaLineageToolIT.java 53e9b1224 
>   addons/impala-bridge/src/test/java/org/apache/atlas/impala/hook/ImpalaLineageHookIT.java 56d74fee3 
> 
> 
> Diff: https://reviews.apache.org/r/73926/diff/1/
> 
> 
> Testing
> -------
> 
> Created tables and inserted data as per jira scenario
> Created smaller hive tables through impala-shell and verified presence in Atlas through hook
> Applied and removed classifications on impala-generated entities and hive tables
> Assigned terms to impala-generated entities and hive tables
> Created and assigned business metadata to impala-generated entities and hive tables
> Deleted and purged impala-generated entities
> (Checked the above in both old and new UI)
> 
> Eg : 
> Created two hive tables with 4000 columns.
> Performed query as follows 
> insert into table_2 select <4000 column names> from table_1;
> 
> Also tested with other queries that would initiate creation of impala_process and impala_process_execution entities.
> 
> 
> Precommit : https://ci-builds.apache.org/job/Atlas/job/PreCommit-ATLAS-Build-Test/1068/
> 
> 
> Thanks,
> 
> Snehal Ambavkar
> 
>
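
For readers following the RCA quoted above, here is a minimal sketch of why a 4000-column insert query exceeds the 32766-byte term limit from the error message, while a short identifier does not. It is not the code from the patch. The class name, the method names, the generated column names, and the sample qualifiedName value are all hypothetical and chosen only for illustration.

```java
import java.nio.charset.StandardCharsets;

public class ProcessNameSketch {
    // Limit quoted in the error message: maximum UTF-8 length of one indexed term.
    static final int MAX_TERM_BYTES = 32766;

    // True when the value's UTF-8 encoding is longer than the limit.
    static boolean exceedsTermLimit(String value) {
        return value.getBytes(StandardCharsets.UTF_8).length > MAX_TERM_BYTES;
    }

    // Builds a query shaped like the test scenario; column names are made up.
    static String buildInsertQuery(int columnCount) {
        StringBuilder sb = new StringBuilder("insert into table_2 select ");
        for (int i = 0; i < columnCount; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append("column_name_").append(i);
        }
        return sb.append(" from table_1").toString();
    }

    public static void main(String[] args) {
        String query = buildInsertQuery(4000);
        // Hypothetical value; the real qualifiedName format is not shown in this thread.
        String qualifiedName = "hypothetical_process_qualified_name@cluster";

        System.out.println("query too large for one term: " + exceedsTermLimit(query));
        System.out.println("qualifiedName too large for one term: " + exceedsTermLimit(qualifiedName));
    }
}
```

The check measures encoded bytes rather than characters because the limit in the error message is stated in terms of the UTF-8 encoding, so a string with non-ASCII characters can hit the limit with fewer than 32766 characters.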