You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@trafodion.apache.org by "Rohit Jain (JIRA)" <ji...@apache.org> on 2017/08/15 21:59:01 UTC

[jira] [Created] (TRAFODION-2714) odb does not load data correctly

Rohit Jain created TRAFODION-2714:
-------------------------------------

             Summary: odb does not load data correctly
                 Key: TRAFODION-2714
                 URL: https://issues.apache.org/jira/browse/TRAFODION-2714
             Project: Apache Trafodion
          Issue Type: Bug
          Components: db-utility-odb
    Affects Versions: 2.3-incubating
         Environment: Internal Esgyn system nap043 and Windows laptop
            Reporter: Rohit Jain


I was trying to load a trafodion table via odb but I had rearranged the sequence of the columns to be loaded.  I used a map file to load the data.  I tried to load just 10 rows and verified that the source and target data were in fact the same, before loading the entire table.  To my surprise odb had loaded all columns correctly but two.  It had set the values of one of those two columns to zero and had set the value of the first column to what should have been the value in the other column.

When I have the columns in the table arranged in the same sequence as those in the csv file, and don't have a mapping table, since it is not needed in that case, everything loads correctly.

On nap043 in /disk1/rohit you will find movie_metadata.csv.

The table I loaded the data into is:
invoke movies_denorm;
..

-- Definition of Trafodion table TRAFODION.MOVIES.MOVIES_DENORM
-- Definition current  Tue Aug 15 13:15:13 2017

  (
    SYSKEY                           LARGEINT NO DEFAULT NOT NULL NOT DROPPABLE
      NOT SERIALIZED
  , MOVIE_TITLE                      CHAR(100) CHARACTER SET ISO88591 COLLATE
      DEFAULT DEFAULT NULL NOT SERIALIZED
  , TITLE_YEAR                       SMALLINT DEFAULT NULL NOT SERIALIZED
  , CONTENT_RATING                   CHAR(10) CHARACTER SET ISO88591 COLLATE
      DEFAULT DEFAULT NULL NOT SERIALIZED
  , IMDB_SCORE                       NUMERIC(3, 1) DEFAULT NULL NOT SERIALIZED
  , NUM_CRITIC_FOR_REVIEWS           SMALLINT DEFAULT NULL NOT SERIALIZED
  , NUM_USER_FOR_REVIEWS             SMALLINT DEFAULT NULL NOT SERIALIZED
  , NUM_VOTED_USERS                  INT DEFAULT NULL NOT SERIALIZED
  , MOVIE_FACEBOOK_LIKES             INT DEFAULT NULL NOT SERIALIZED
  , CAST_TOTAL_FACEBOOK_LIKES        INT DEFAULT NULL NOT SERIALIZED
  , DURATION                         SMALLINT DEFAULT NULL NOT SERIALIZED
  , BUDGET                           LARGEINT DEFAULT NULL NOT SERIALIZED
  , GROSS                            INT DEFAULT NULL NOT SERIALIZED
  , COLOR                            CHAR(16) CHARACTER SET ISO88591 COLLATE
      DEFAULT DEFAULT NULL NOT SERIALIZED
  , ASPECT_RATIO                     NUMERIC(4, 2) DEFAULT NULL NOT SERIALIZED
  , COUNTRY                          CHAR(20) CHARACTER SET ISO88591 COLLATE
      DEFAULT DEFAULT NULL NOT SERIALIZED
  , LANGUAGE                         CHAR(10) CHARACTER SET ISO88591 COLLATE
      DEFAULT DEFAULT NULL NOT SERIALIZED
  , FACENUMBER_IN_POSTER             SMALLINT DEFAULT NULL NOT SERIALIZED
  , GENRES                           CHAR(100) CHARACTER SET ISO88591 COLLATE
      DEFAULT DEFAULT NULL NOT SERIALIZED
  , PLOT_KEYWORDS                    CHAR(150) CHARACTER SET ISO88591 COLLATE
      DEFAULT DEFAULT NULL NOT SERIALIZED
  , MOVIE_IMDB_LINK                  CHAR(100) CHARACTER SET ISO88591 COLLATE
      DEFAULT DEFAULT NULL NOT SERIALIZED
  , ACTOR_1_NAME                     CHAR(35) CHARACTER SET ISO88591 COLLATE
      DEFAULT DEFAULT NULL NOT SERIALIZED
  , ACTOR_1_FACEBOOK_LIKES           INT DEFAULT NULL NOT SERIALIZED
  , ACTOR_2_NAME                     CHAR(35) CHARACTER SET ISO88591 COLLATE
      DEFAULT DEFAULT NULL NOT SERIALIZED
  , ACTOR_2_FACEBOOK_LIKES           INT DEFAULT NULL NOT SERIALIZED
  , ACTOR_3_NAME                     CHAR(35) CHARACTER SET ISO88591 COLLATE
      DEFAULT DEFAULT NULL NOT SERIALIZED
  , ACTOR_3_FACEBOOK_LIKES           INT DEFAULT NULL NOT SERIALIZED
  , DIRECTOR_NAME                    CHAR(35) CHARACTER SET ISO88591 COLLATE
      DEFAULT DEFAULT NULL NOT SERIALIZED
  , DIRECTOR_FACEBOOK_LIKES          INT DEFAULT NULL NOT SERIALIZED
  )

I used the latest version of odb from my Windows laptop using the latest version of the JDBC driver.  Here is the load command:

odb -u trafodion -p traf123 -d esgyndb -l src=movie_metadata.csv:tgt=TRAFODION.MOVIES.MOVIES_ORIG:skip=1:loadcmd=UL:truncate:map=movies_odb_column_map.txt

mapping:
COLOR:0
DIRECTOR_NAME:1
NUM_CRITIC_FOR_REVIEWS:2
DURATION:3
DIRECTOR_FACEBOOK_LIKES:4
ACTOR_3_FACEBOOK_LIKES:5
ACTOR_2_NAME:6
ACTOR_1_FACEBOOK_LIKES:7
GROSS:8
GENRES:9
ACTOR_1_NAME:10
MOVIE_TITLE:11
NUM_VOTED_USERS:12
CAST_TOTAL_FACEBOOK_LIKES:13
ACTOR_3_NAME:14
FACENUMBER_IN_POSTER:15
PLOT_KEYWORDS:16
MOVIE_IMDB_LINK:17
NUM_USER_FOR_REVIEWS:18
LANGUAGE:19
COUNTRY:20
CONTENT_RATING:21
BUDGET:22
TITLE_YEAR:23
ACTOR_2_FACEBOOK_LIKES:24
IMDB_SCORE:25
ASPECT_RATIO:26
MOVIE_FACEBOOK_LIKES:27

In the csv file you can see that the 3rd column in the first two rows is num_critic_for_reviews and has the values 723 and 302.  odb uses a column offset starting from zero, which for most folks like me is very confusing.  This column after the load has zeros in all the 10 sample rows I loaded.

Also,  the last entry in the map list is movie_facebook_likes.  This column has the values 33,000 and 0 in the csv column.  In the table this column has the values 723 and 302.

So, the values that should have been in the column num_critics_for_review are in movie_facebook_likes, and the num_critics_for_review has the value zero in all rows.  All other columns have the correct values in them.

color,director_name,num_critic_for_reviews,duration,director_facebook_likes,actor_3_facebook_likes,actor_2_name,actor_1_facebook_likes,gross,genres,actor_1_name,movie_title,num_voted_users,cast_total_facebook_likes,actor_3_name,facenumber_in_poster,plot_keywords,movie_imdb_link,num_user_for_reviews,language,country,content_rating,budget,title_year,actor_2_facebook_likes,imdb_score,aspect_ratio,movie_facebook_likes

Color,James Cameron,723,178,0,855,Joel David Moore,1000,760505847,Action|Adventure|Fantasy|Sci-Fi,CCH Pounder,AvatarĀ ,886204,4834,Wes Studi,0,avatar|future|marine|native|paraplegic,http://www.imdb.com/title/tt0499549/?ref_=fn_tt_tt_1,3054,English,USA,PG-13,237000000,2009,936,7.9,1.78,33000

Color,Gore Verbinski,302,169,563,1000,Orlando Bloom,40000,309404152,Action|Adventure|Fantasy,Johnny Depp,Pirates of the Caribbean: At World's EndĀ ,471220,48350,Jack Davenport,0,goddess|marriage ceremony|marriage proposal|pirate|singapore,http://www.imdb.com/title/tt0449088/?ref_=fn_tt_tt_1,1238,English,USA,PG-13,300000000,2007,5000,7.1,2.35,0




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)