Posted to dev@atlas.apache.org by Radhika Kundam via Review Board <no...@reviews.apache.org> on 2022/04/11 15:50:41 UTC
Re: Review Request 73926: ATLAS-4571 : Impala Hook : Indexed string field (solr.StrField) which is too large ERROR
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73926/#review224294
-----------------------------------------------------------
Ship it!
- Radhika Kundam
On April 7, 2022, 9:57 p.m., Snehal Ambavkar wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/73926/
> -----------------------------------------------------------
>
> (Updated April 7, 2022, 9:57 p.m.)
>
>
> Review request for atlas, Jayendra Parab, Mandar Ambawane, Pinal Shah, Radhika Kundam, and Sidharth Mishra.
>
>
> Bugs: ATLAS-4571
> https://issues.apache.org/jira/browse/ATLAS-4571
>
>
> Repository: atlas
>
>
> Description
> -------
>
> ERROR :
>
> Exception writing document id test1 to the index; possible analysis error: Document contains at least one immense term in field="test_s" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped. Please correct the analyzer to not produce such terms.
>
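The 32766 limit in the message above applies to the UTF-8 byte length of a single indexed term, not to the character count. A minimal sketch of that check follows; the class and method names are hypothetical and are not part of Atlas or Solr.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not Atlas or Solr code.
final class TermLengthCheck {
    // Maximum UTF-8 length of one indexed term, as quoted in the error message.
    static final int MAX_TERM_BYTES = 32766;

    // True when the value's UTF-8 encoding is longer than the limit.
    static boolean exceedsTermLimit(String value) {
        return value.getBytes(StandardCharsets.UTF_8).length > MAX_TERM_BYTES;
    }

    public static void main(String[] args) {
        StringBuilder longValue = new StringBuilder();
        for (int i = 0; i < 40000; i++) {
            longValue.append('x'); // one byte each in UTF-8
        }
        System.out.println(exceedsTermLimit("select 1"));           // false
        System.out.println(exceedsTermLimit(longValue.toString())); // true
    }
}
```

Non-ASCII characters take more than one byte in UTF-8, so a string can cross the limit with fewer than 32766 characters.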
> RCA :
>
> Impala process entities created by ImpalaHook store the query string in the name field.
> Because the query string can be very large, indexing the name field fails with the "longer than the max length" error shown above.
>
> Fix :
>
> Store qualifiedName in the name field instead of the query string.
>
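The idea of the fix can be sketched roughly as below. This is not the actual change to BaseImpalaEvent.java: the class, the method, and the plain attribute map are hypothetical stand-ins, and the sketch assumes the qualifiedName is much shorter than the query string.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; the real change lives in BaseImpalaEvent.java and may look different.
final class ProcessNameSketch {
    // Builds the two attributes discussed in the review for a process entity.
    static Map<String, Object> processAttributes(String qualifiedName, String queryString) {
        Map<String, Object> attrs = new LinkedHashMap<>();
        attrs.put("qualifiedName", qualifiedName);
        // Before the fix (per the RCA): attrs.put("name", queryString);
        // After the fix: reuse qualifiedName, assumed here to stay well under the term limit.
        attrs.put("name", qualifiedName);
        return attrs;
    }
}
```

With this shape, the indexed name field no longer grows with the size of the query.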
>
> Diffs
> -----
>
> addons/impala-bridge/src/main/java/org/apache/atlas/impala/hook/events/BaseImpalaEvent.java 32efb8321
> addons/impala-bridge/src/test/java/org/apache/atlas/impala/ImpalaLineageToolIT.java 53e9b1224
> addons/impala-bridge/src/test/java/org/apache/atlas/impala/hook/ImpalaLineageHookIT.java 56d74fee3
>
>
> Diff: https://reviews.apache.org/r/73926/diff/1/
>
>
> Testing
> -------
>
> Created tables and inserted data as per the Jira scenario.
> Created smaller Hive tables through impala-shell and verified their presence in Atlas through the hook.
> Applied and removed classifications on Impala-generated entities and Hive tables.
> Assigned terms to Impala-generated entities and Hive tables.
> Created and assigned business metadata to Impala-generated entities and Hive tables.
> Deleted and purged Impala-generated entities.
> (Checked the above in both the old and the new UI.)
>
> E.g.:
> Created two Hive tables with 4000 columns.
> Ran the following query:
> insert into table_2 select <4000 column names> from table_1;
>
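To reproduce a query of this size, the statement can be generated rather than typed. The sketch below assumes hypothetical column names of the form col_0 ... col_3999; only table_1 and table_2 come from the scenario above. With names of that length the statement ends up longer than the 32766-byte term limit, though the exact size depends on the real column names.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical generator for a wide insert-select statement; column names are assumed.
final class WideQueryBuilder {
    // Builds "insert into table_2 select col_0, col_1, ... from table_1".
    static String buildInsertSelect(int columnCount) {
        StringBuilder sql = new StringBuilder("insert into table_2 select ");
        for (int i = 0; i < columnCount; i++) {
            if (i > 0) {
                sql.append(", ");
            }
            sql.append("col_").append(i);
        }
        return sql.append(" from table_1").toString();
    }

    public static void main(String[] args) {
        String sql = buildInsertSelect(4000);
        int bytes = sql.getBytes(StandardCharsets.UTF_8).length;
        System.out.println("query length in UTF-8 bytes: " + bytes);
    }
}
```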
> Also tested with other queries that initiate creation of impala_process and impala_process_execution entities.
>
>
> Precommit : https://ci-builds.apache.org/job/Atlas/job/PreCommit-ATLAS-Build-Test/1068/
>
>
> Thanks,
>
> Snehal Ambavkar
>
>