Posted to commits@cassandra.apache.org by "Alex Liu (JIRA)" <ji...@apache.org> on 2013/09/30 23:36:26 UTC

[jira] [Commented] (CASSANDRA-6114) Pig with widerows=true and batch size = 1 works incorrectly

    [ https://issues.apache.org/jira/browse/CASSANDRA-6114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13782282#comment-13782282 ] 

Alex Liu commented on CASSANDRA-6114:
-------------------------------------

If widerows=true, the reader uses get_paged_slice, which sets the page size (the number of columns per page) to {code} cassandra.range.batch.size {code}, so setting it to 1 is not a good idea. For efficiency, it needs to be at least 100, which is the default number of columns per page.

We shouldn't use a batch size of 1 in combination with widerows=true; it should be at least 100.

The reason it ends with one column is this check in ColumnFamilyRecordReader:
{code}
                if (wideColumns.hasNext() && wideColumns.peek().right.keySet().iterator().next().equals(lastColumn))
                    wideColumns.next();
                if (!wideColumns.hasNext())
                    rows = null;
{code}

If the batch size is set to 1, wideColumns returns only 1 column per page. Once that column is skipped as a repeat of lastColumn, nothing is left, which then ends the iterator.
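
To illustrate the effect, here is a minimal, self-contained sketch. It is not the actual ColumnFamilyRecordReader code: fetchPage is a hypothetical stand-in for get_paged_slice, assumed to return a page that starts at the last column already seen (inclusive), and readRow mimics the check quoted above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PagedSliceSketch {
    // Hypothetical stand-in for get_paged_slice (an assumption, not the real API):
    // returns up to pageSize columns, starting at startColumn inclusive,
    // or at the first column of the row when startColumn is null.
    public static List<Integer> fetchPage(List<Integer> row, Integer startColumn, int pageSize) {
        int from = (startColumn == null) ? 0 : row.indexOf(startColumn);
        int to = Math.min(row.size(), from + pageSize);
        return new ArrayList<>(row.subList(from, to));
    }

    // Reads one wide row page by page, mimicking the quoted check.
    public static List<Integer> readRow(List<Integer> row, int pageSize) {
        List<Integer> result = new ArrayList<>();
        Integer lastColumn = null;
        while (true) {
            List<Integer> page = fetchPage(row, lastColumn, pageSize);
            // Skip the first column if it repeats the last column already returned.
            if (!page.isEmpty() && page.get(0).equals(lastColumn)) {
                page.remove(0);
            }
            // Nothing left after the skip: stop (corresponds to rows = null).
            if (page.isEmpty()) {
                break;
            }
            result.addAll(page);
            lastColumn = page.get(page.size() - 1);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> row = Arrays.asList(17, 18, 31, 34, 37);
        System.out.println(readRow(row, 1));   // [17]
        System.out.println(readRow(row, 2));   // [17, 18, 31, 34, 37]
        System.out.println(readRow(row, 100)); // [17, 18, 31, 34, 37]
    }
}
```

With a page size of 1, the second page contains only the column that was already returned, so the skip leaves it empty and the loop stops after a single column. Any page size of 2 or more always leaves at least one new column after the skip until the row is exhausted.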

I will throw an exception if widerows=true and the batch size is set to 1.
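
A rough sketch of what such a guard could look like. The class name, method name, and message text are hypothetical and only illustrate the idea; this is not the actual patch.

```java
class WideRowConfigCheck {
    // Property name as mentioned in this comment.
    static final String BATCH_SIZE_KEY = "cassandra.range.batch.size";

    // Hypothetical helper: rejects the combination that breaks wide-row paging.
    public static void validate(boolean wideRows, int batchSize) {
        if (wideRows && batchSize < 2) {
            throw new IllegalArgumentException(
                BATCH_SIZE_KEY + " must be greater than 1 when widerows=true, got " + batchSize);
        }
    }

    public static void main(String[] args) {
        validate(false, 1);   // allowed: wide rows disabled
        validate(true, 100);  // allowed: wide rows with a reasonable page size
        try {
            validate(true, 1); // rejected
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```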

> Pig with widerows=true and batch size = 1 works incorrectly
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-6114
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6114
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Alex Liu
>            Assignee: Alex Liu
>            Priority: Minor
>
> If I run the demo pig scripts, I end up with a column family with 6 fairly wide rows.  If I load and dump those rows with either widerows=true or cassandra.range.batch.size=1 set, the dump returns the correct values.  However, if I set both of those, it does not.  So in the case of a batch size of 1, wide rows support is broken.
> So it's relatively simple to reproduce from the demo data:
> {code}
> grunt> SET cassandra.range.batch.size 1                                                
> grunt> rows = LOAD 'cassandra://PigDemo/Scores' using CassandraStorage();                
> grunt> dump rows;
> ...
> (sylvain,{(4,),(7,),(10,),(21,),(24,),(46,),(47,),(49,),(51,),(52,),(67,),(68,),(72,),(73,),(82,),(83,),(86,),(98,),(101,),(105,),(108,),(112,),(114,),(124,),(125,),(136,),(139,),(145,),(150,),(151,),(153,),(165,),(167,),(171,),(178,),(182,),(202,),(211,),(212,),(215,),(226,),(237,),(242,),(243,),(255,),(261,),(273,),(282,),(300,),(307,),(308,),(311,),(312,),(313,),(316,),(317,),(332,),(337,),(338,),(348,),(355,),(360,),(361,),(373,),(375,),(377,),(384,),(401,),(404,),(412,),(418,),(429,),(436,),(441,),(451,),(453,),(461,),(473,),(478,),(483,),(485,),(486,),(489,),(509,),(511,),(516,),(517,),(521,),(536,),(541,),(543,),(545,),(550,),(583,),(587,),(592,),(611,),(613,),(622,),(625,),(627,),(633,),(648,),(649,),(651,),(659,),(665,),(668,),(670,),(672,),(679,),(688,),(692,),(700,),(703,),(707,),(709,),(730,),(731,),(738,),(740,),(744,),(750,),(759,),(764,),(766,),(768,),(774,),(776,),(778,),(779,),(788,),(795,),(796,),(813,),(821,),(825,),(830,),(831,),(835,),(843,),(846,),(847,),(848,),(851,),(862,),(863,),(872,),(878,),(881,),(883,),(884,),(888,),(905,),(906,),(916,),(921,),(926,),(928,),(944,),(946,),(947,),(952,),(954,),(972,),(973,),(974,),(976,),(978,),(982,),(991,)})
> (brandon,{(6,),(7,),(14,),(15,),(25,),(36,),(37,),(38,),(46,),(53,),(57,),(65,),(74,),(75,),(84,),(91,),(104,),(120,),(128,),(137,),(148,),(159,),(171,),(174,),(176,),(179,),(183,),(192,),(195,),(201,),(205,),(210,),(216,),(222,),(223,),(243,),(255,),(264,),(271,),(287,),(290,),(308,),(309,),(326,),(343,),(347,),(356,),(359,),(360,),(363,),(367,),(368,),(378,),(398,),(400,),(402,),(410,),(412,),(419,),(427,),(429,),(447,),(449,),(462,),(464,),(468,),(470,),(472,),(480,),(482,),(506,),(511,),(520,),(521,),(522,),(524,),(535,),(548,),(553,),(565,),(569,),(571,),(573,),(575,),(583,),(584,),(595,),(597,),(606,),(608,),(634,),(646,),(650,),(654,),(667,),(673,),(677,),(686,),(690,),(692,),(713,),(715,),(721,),(723,),(736,),(737,),(752,),(753,),(758,),(759,),(764,),(766,),(767,),(776,),(778,),(786,),(812,),(816,),(818,),(823,),(826,),(832,),(838,),(842,),(860,),(873,),(879,),(918,),(919,),(935,),(941,),(942,),(948,),(956,),(961,),(966,),(973,),(974,),(977,),(979,),(983,),(984,),(986,),(995,),(997,)})
> (jake,{(1,),(7,),(10,),(14,),(29,),(52,),(54,),(65,),(67,),(78,),(82,),(83,),(89,),(97,),(100,),(115,),(126,),(140,),(141,),(145,),(214,),(221,),(230,),(231,),(232,),(241,),(245,),(247,),(265,),(266,),(269,),(271,),(282,),(286,),(288,),(299,),(316,),(323,),(331,),(332,),(335,),(338,),(348,),(353,),(355,),(364,),(367,),(371,),(379,),(398,),(409,),(420,),(428,),(429,),(439,),(443,),(450,),(454,),(467,),(477,),(482,),(488,),(490,),(502,),(503,),(512,),(520,),(521,),(535,),(536,),(541,),(548,),(552,),(557,),(560,),(596,),(600,),(604,),(606,),(611,),(613,),(621,),(624,),(630,),(635,),(641,),(647,),(655,),(660,),(665,),(674,),(676,),(690,),(693,),(694,),(704,),(719,),(720,),(724,),(731,),(749,),(751,),(763,),(765,),(767,),(771,),(779,),(782,),(784,),(789,),(793,),(797,),(798,),(801,),(802,),(806,),(820,),(825,),(839,),(845,),(848,),(856,),(865,),(866,),(867,),(870,),(876,),(887,),(891,),(901,),(905,),(908,),(922,),(929,),(944,),(960,),(964,),(980,),(988,),(996,)})
> (eric,{(14,),(17,),(23,),(25,),(26,),(34,),(42,),(43,),(57,),(64,),(68,),(80,),(88,),(93,),(100,),(114,),(131,),(132,),(134,),(143,),(146,),(147,),(156,),(157,),(170,),(171,),(172,),(177,),(186,),(197,),(198,),(206,),(209,),(223,),(224,),(233,),(236,),(241,),(251,),(252,),(255,),(263,),(266,),(267,),(268,),(272,),(277,),(280,),(289,),(293,),(294,),(297,),(301,),(306,),(310,),(312,),(321,),(326,),(333,),(334,),(335,),(345,),(357,),(362,),(363,),(370,),(380,),(389,),(392,),(393,),(401,),(420,),(431,),(462,),(464,),(465,),(471,),(484,),(486,),(490,),(493,),(504,),(505,),(509,),(515,),(521,),(534,),(538,),(547,),(554,),(557,),(561,),(564,),(572,),(573,),(578,),(582,),(584,),(590,),(598,),(599,),(603,),(605,),(609,),(618,),(634,),(636,),(639,),(648,),(656,),(661,),(667,),(671,),(674,),(675,),(687,),(713,),(721,),(733,),(736,),(763,),(767,),(776,),(785,),(787,),(809,),(813,),(826,),(829,),(830,),(832,),(840,),(841,),(844,),(846,),(854,),(855,),(876,),(890,),(892,),(902,),(910,),(930,),(934,),(938,),(940,),(943,),(955,),(959,),(965,),(966,),(968,),(972,),(980,),(985,),(989,)})
> (jonathan,{(17,),(18,),(31,),(34,),(37,),(40,),(67,),(69,),(75,),(93,),(111,),(124,),(127,),(128,),(137,),(142,),(168,),(178,),(190,),(193,),(194,),(207,),(211,),(216,),(221,),(229,),(237,),(242,),(252,),(253,),(264,),(265,),(267,),(270,),(272,),(274,),(276,),(278,),(280,),(283,),(297,),(299,),(300,),(302,),(303,),(309,),(311,),(318,),(323,),(329,),(330,),(332,),(344,),(346,),(351,),(354,),(358,),(361,),(363,),(366,),(367,),(374,),(378,),(379,),(386,),(389,),(392,),(395,),(398,),(404,),(424,),(426,),(429,),(434,),(439,),(443,),(445,),(448,),(472,),(477,),(494,),(500,),(504,),(522,),(525,),(538,),(539,),(541,),(548,),(553,),(557,),(560,),(563,),(566,),(567,),(578,),(591,),(593,),(595,),(599,),(605,),(610,),(626,),(635,),(636,),(640,),(642,),(644,),(649,),(660,),(662,),(663,),(667,),(674,),(690,),(706,),(708,),(712,),(716,),(723,),(733,),(741,),(747,),(758,),(765,),(797,),(798,),(801,),(822,),(827,),(828,),(837,),(850,),(863,),(867,),(894,),(895,),(896,),(904,),(911,),(917,),(932,),(949,),(951,),(952,),(958,),(969,),(974,),(983,),(985,),(988,),(989,),(996,),(1000,)})
> (gary,{(3,),(13,),(21,),(23,),(33,),(36,),(44,),(45,),(48,),(62,),(65,),(68,),(75,),(80,),(81,),(90,),(111,),(113,),(119,),(123,),(137,),(149,),(152,),(153,),(157,),(161,),(166,),(178,),(179,),(180,),(183,),(184,),(188,),(189,),(191,),(197,),(199,),(200,),(204,),(212,),(221,),(229,),(239,),(265,),(270,),(272,),(276,),(279,),(282,),(295,),(296,),(304,),(305,),(314,),(326,),(329,),(335,),(342,),(345,),(346,),(362,),(370,),(371,),(375,),(380,),(382,),(387,),(389,),(390,),(393,),(399,),(403,),(406,),(414,),(417,),(424,),(428,),(445,),(458,),(462,),(486,),(490,),(492,),(495,),(499,),(500,),(507,),(514,),(520,),(542,),(550,),(551,),(570,),(571,),(572,),(574,),(577,),(588,),(604,),(614,),(619,),(626,),(634,),(640,),(648,),(659,),(663,),(684,),(687,),(690,),(694,),(715,),(741,),(750,),(765,),(772,),(776,),(781,),(782,),(783,),(785,),(789,),(802,),(806,),(812,),(816,),(820,),(829,),(836,),(843,),(850,),(855,),(868,),(873,),(875,),(889,),(900,),(904,),(922,),(928,),(929,),(935,),(946,),(949,),(954,),(956,),(959,),(960,),(962,),(992,)})
> {code}
> {code}
> grunt> SET cassandra.range.batch.size 1                                                
> grunt> rows = LOAD 'cassandra://PigDemo/Scores?widerows=true' using CassandraStorage();
> grunt> dump rows;
> ...
> (jonathan,{(17,)})
> (sylvain,{(4,)})
> {code}
> When I set the batch size to something higher than 1, it works fine.



--
This message was sent by Atlassian JIRA
(v6.1#6144)