You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@trafodion.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/10/19 03:46:01 UTC

[jira] [Commented] (TRAFODION-2714) odb does not load data correctly

    [ https://issues.apache.org/jira/browse/TRAFODION-2714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210532#comment-16210532 ] 

ASF GitHub Bot commented on TRAFODION-2714:
-------------------------------------------

GitHub user SuJinpei opened a pull request:

    https://github.com/apache/incubator-trafodion/pull/1270

    [TRAFODION-2714] odb does not load data correctly

    1. The field enclosed in double quotes was not handled correctly before, fixed it.
    2. field index now start from 1.
    3. fix some other potential bug.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/SuJinpei/incubator-trafodion trafodion-2714

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-trafodion/pull/1270.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1270
    
----
commit 04c0c46ee79a832c7e653bbd1b83fcc920dffb8f
Author: SuJinpei <87...@qq.com>
Date:   2017-10-19T03:25:47Z

    fix TRAFODION-2714

----


> odb does not load data correctly
> --------------------------------
>
>                 Key: TRAFODION-2714
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-2714
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: db-utility-odb
>    Affects Versions: 2.3-incubating
>         Environment: Internal Esgyn system nap043 and Windows laptop
>            Reporter: Rohit Jain
>
> I was trying to load a trafodion table via odb but I had rearranged the sequence of the columns to be loaded.  I used a map file to load the data.  I tried to load just 10 rows and verified that the source and target data were in fact the same, before loading the entire table.  To my surprise odb had loaded all columns correctly but two.  It had set the values of one of those two columns to zero and had set the value of the first column to what should have been the value in the other column.
> When I have the columns in the table arranged in the same sequence as those in the csv file, and don't have a mapping table, since it is not needed in that case, everything loads correctly.
> On nap043 in /disk1/rohit you will find movie_metadata.csv.
> The table I loaded the data into is:
> invoke movies_denorm;
> ..
> -- Definition of Trafodion table TRAFODION.MOVIES.MOVIES_DENORM
> -- Definition current  Tue Aug 15 13:15:13 2017
>   (
>     SYSKEY                           LARGEINT NO DEFAULT NOT NULL NOT DROPPABLE
>       NOT SERIALIZED
>   , MOVIE_TITLE                      CHAR(100) CHARACTER SET ISO88591 COLLATE
>       DEFAULT DEFAULT NULL NOT SERIALIZED
>   , TITLE_YEAR                       SMALLINT DEFAULT NULL NOT SERIALIZED
>   , CONTENT_RATING                   CHAR(10) CHARACTER SET ISO88591 COLLATE
>       DEFAULT DEFAULT NULL NOT SERIALIZED
>   , IMDB_SCORE                       NUMERIC(3, 1) DEFAULT NULL NOT SERIALIZED
>   , NUM_CRITIC_FOR_REVIEWS           SMALLINT DEFAULT NULL NOT SERIALIZED
>   , NUM_USER_FOR_REVIEWS             SMALLINT DEFAULT NULL NOT SERIALIZED
>   , NUM_VOTED_USERS                  INT DEFAULT NULL NOT SERIALIZED
>   , MOVIE_FACEBOOK_LIKES             INT DEFAULT NULL NOT SERIALIZED
>   , CAST_TOTAL_FACEBOOK_LIKES        INT DEFAULT NULL NOT SERIALIZED
>   , DURATION                         SMALLINT DEFAULT NULL NOT SERIALIZED
>   , BUDGET                           LARGEINT DEFAULT NULL NOT SERIALIZED
>   , GROSS                            INT DEFAULT NULL NOT SERIALIZED
>   , COLOR                            CHAR(16) CHARACTER SET ISO88591 COLLATE
>       DEFAULT DEFAULT NULL NOT SERIALIZED
>   , ASPECT_RATIO                     NUMERIC(4, 2) DEFAULT NULL NOT SERIALIZED
>   , COUNTRY                          CHAR(20) CHARACTER SET ISO88591 COLLATE
>       DEFAULT DEFAULT NULL NOT SERIALIZED
>   , LANGUAGE                         CHAR(10) CHARACTER SET ISO88591 COLLATE
>       DEFAULT DEFAULT NULL NOT SERIALIZED
>   , FACENUMBER_IN_POSTER             SMALLINT DEFAULT NULL NOT SERIALIZED
>   , GENRES                           CHAR(100) CHARACTER SET ISO88591 COLLATE
>       DEFAULT DEFAULT NULL NOT SERIALIZED
>   , PLOT_KEYWORDS                    CHAR(150) CHARACTER SET ISO88591 COLLATE
>       DEFAULT DEFAULT NULL NOT SERIALIZED
>   , MOVIE_IMDB_LINK                  CHAR(100) CHARACTER SET ISO88591 COLLATE
>       DEFAULT DEFAULT NULL NOT SERIALIZED
>   , ACTOR_1_NAME                     CHAR(35) CHARACTER SET ISO88591 COLLATE
>       DEFAULT DEFAULT NULL NOT SERIALIZED
>   , ACTOR_1_FACEBOOK_LIKES           INT DEFAULT NULL NOT SERIALIZED
>   , ACTOR_2_NAME                     CHAR(35) CHARACTER SET ISO88591 COLLATE
>       DEFAULT DEFAULT NULL NOT SERIALIZED
>   , ACTOR_2_FACEBOOK_LIKES           INT DEFAULT NULL NOT SERIALIZED
>   , ACTOR_3_NAME                     CHAR(35) CHARACTER SET ISO88591 COLLATE
>       DEFAULT DEFAULT NULL NOT SERIALIZED
>   , ACTOR_3_FACEBOOK_LIKES           INT DEFAULT NULL NOT SERIALIZED
>   , DIRECTOR_NAME                    CHAR(35) CHARACTER SET ISO88591 COLLATE
>       DEFAULT DEFAULT NULL NOT SERIALIZED
>   , DIRECTOR_FACEBOOK_LIKES          INT DEFAULT NULL NOT SERIALIZED
>   )
> I used the latest version of odb from my Windows laptop using the latest version of the JDBC driver.  Here is the load command:
> odb -u trafodion -p traf123 -d esgyndb -l src=movie_metadata.csv:tgt=TRAFODION.MOVIES.MOVIES_ORIG:skip=1:loadcmd=UL:truncate:map=movies_odb_column_map.txt
> mapping:
> COLOR:0
> DIRECTOR_NAME:1
> NUM_CRITIC_FOR_REVIEWS:2
> DURATION:3
> DIRECTOR_FACEBOOK_LIKES:4
> ACTOR_3_FACEBOOK_LIKES:5
> ACTOR_2_NAME:6
> ACTOR_1_FACEBOOK_LIKES:7
> GROSS:8
> GENRES:9
> ACTOR_1_NAME:10
> MOVIE_TITLE:11
> NUM_VOTED_USERS:12
> CAST_TOTAL_FACEBOOK_LIKES:13
> ACTOR_3_NAME:14
> FACENUMBER_IN_POSTER:15
> PLOT_KEYWORDS:16
> MOVIE_IMDB_LINK:17
> NUM_USER_FOR_REVIEWS:18
> LANGUAGE:19
> COUNTRY:20
> CONTENT_RATING:21
> BUDGET:22
> TITLE_YEAR:23
> ACTOR_2_FACEBOOK_LIKES:24
> IMDB_SCORE:25
> ASPECT_RATIO:26
> MOVIE_FACEBOOK_LIKES:27
> In the csv file you can see that the 3rd column in the first two rows is num_critic_for_reviews and has the values 723 and 302.  odb uses a column offset starting from zero, which for most folks like me is very confusing.  This column after the load has zeros in all the 10 sample rows I loaded.
> Also,  the last entry in the map list is movie_facebook_likes.  This column has the values 33,000 and 0 in the csv column.  In the table this column has the values 723 and 302.
> So, the values that should have been in the column num_critics_for_review are in movie_facebook_likes, and the num_critics_for_review has the value zero in all rows.  All other columns have the correct values in them.
> color,director_name,num_critic_for_reviews,duration,director_facebook_likes,actor_3_facebook_likes,actor_2_name,actor_1_facebook_likes,gross,genres,actor_1_name,movie_title,num_voted_users,cast_total_facebook_likes,actor_3_name,facenumber_in_poster,plot_keywords,movie_imdb_link,num_user_for_reviews,language,country,content_rating,budget,title_year,actor_2_facebook_likes,imdb_score,aspect_ratio,movie_facebook_likes
> Color,James Cameron,723,178,0,855,Joel David Moore,1000,760505847,Action|Adventure|Fantasy|Sci-Fi,CCH Pounder,AvatarĀ ,886204,4834,Wes Studi,0,avatar|future|marine|native|paraplegic,http://www.imdb.com/title/tt0499549/?ref_=fn_tt_tt_1,3054,English,USA,PG-13,237000000,2009,936,7.9,1.78,33000
> Color,Gore Verbinski,302,169,563,1000,Orlando Bloom,40000,309404152,Action|Adventure|Fantasy,Johnny Depp,Pirates of the Caribbean: At World's EndĀ ,471220,48350,Jack Davenport,0,goddess|marriage ceremony|marriage proposal|pirate|singapore,http://www.imdb.com/title/tt0449088/?ref_=fn_tt_tt_1,1238,English,USA,PG-13,300000000,2007,5000,7.1,2.35,0



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)