Posted to dev@lucy.apache.org by Peter Karman <pe...@peknet.com> on 2010/01/18 20:45:19 UTC

64-bit linux errors with t/core/032-string_helper.t

On 64-bit CentOS 5 Linux with Perl 5.8.9 I get several failures for 
t/core/032-string_helper.t.

The issue seems to be related to the u8_t type of the i and max vars. It's as if 
they are being evaluated as signed rather than unsigned. However, even if I 
hardcode 'unsigned char' instead of the Charmonized 'u8_t', I still get the same 
errors (below).

Changing them to int fixes it, but it makes no sense to me why an unsigned 
char wouldn't work.
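
For reference, the negative numbers in the test labels below (-128 through -2) 
are what one would see if a byte above 127 were widened to int through a signed 
8-bit type instead of an unsigned one. A minimal sketch of the two conversions; 
the function names are made up for illustration and nothing here is taken from 
the KinoSearch source:

```c
/* Hypothetical helpers, not part of KinoSearch. */

/* Widen a byte to int through an unsigned 8-bit type.
 * The promotion is value-preserving, so the result is 0..255. */
int
widen_unsigned(unsigned char byte)
{
    return byte;
}

/* Widen the same byte to int through a signed 8-bit type.
 * Values above 127 do not fit in signed char, so the conversion is
 * implementation-defined; GCC documents it as reduction modulo 256,
 * which turns 128 into -128 and 254 into -2. */
int
widen_signed(unsigned char byte)
{
    signed char s = (signed char)byte;
    return s;
}
```

With GCC on x86-64, widen_unsigned(128) gives 128 while widen_signed(128) gives 
-128, which matches the range of labels in the failing tests.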

Patch after the errors.

$ perl -Mblib t/core/032-string_helper.t
<snipped passing tests>

not ok 255 - Expected '1', got '7'
     UTF8_SKIP bogus -128
not ok 256 - Expected '7', got '112'
     UTF8_TRAILING bogus -128
not ok 257 - Expected '1', got '7'
     UTF8_SKIP bogus -127
not ok 258 - Expected '7', got '117'
     UTF8_TRAILING bogus -127
not ok 259 - Expected '1', got '7'
     UTF8_SKIP bogus -126
not ok 260 - Expected '7', got '116'
     UTF8_TRAILING bogus -126
not ok 261 - Expected '1', got '7'
     UTF8_SKIP bogus -125
not ok 262 - Expected '7', got '0'
     UTF8_TRAILING bogus -125
not ok 263 - Expected '1', got '7'
     UTF8_SKIP bogus -124
not ok 264 - Expected '7', got '0'
     UTF8_TRAILING bogus -124
not ok 265 - Expected '1', got '7'
     UTF8_SKIP bogus -123
not ok 266 - Expected '7', got '0'
     UTF8_TRAILING bogus -123
not ok 267 - Expected '1', got '7'
     UTF8_SKIP bogus -122
not ok 268 - Expected '7', got '0'
     UTF8_TRAILING bogus -122
not ok 269 - Expected '1', got '7'
     UTF8_SKIP bogus -121
not ok 270 - Expected '7', got '0'
     UTF8_TRAILING bogus -121
not ok 271 - Expected '1', got '7'
     UTF8_SKIP bogus -120
not ok 272 - Expected '7', got '0'
     UTF8_TRAILING bogus -120
not ok 273 - Expected '1', got '7'
     UTF8_SKIP bogus -119
not ok 274 - Expected '7', got '0'
     UTF8_TRAILING bogus -119
not ok 275 - Expected '1', got '7'
     UTF8_SKIP bogus -118
not ok 276 - Expected '7', got '0'
     UTF8_TRAILING bogus -118
not ok 277 - Expected '1', got '7'
     UTF8_SKIP bogus -117
not ok 278 - Expected '7', got '0'
     UTF8_TRAILING bogus -117
not ok 279 - Expected '1', got '7'
     UTF8_SKIP bogus -116
not ok 280 - Expected '7', got '0'
     UTF8_TRAILING bogus -116
not ok 281 - Expected '1', got '7'
     UTF8_SKIP bogus -115
not ok 282 - Expected '7', got '0'
     UTF8_TRAILING bogus -115
not ok 283 - Expected '1', got '7'
     UTF8_SKIP bogus -114
not ok 284 - Expected '7', got '0'
     UTF8_TRAILING bogus -114
not ok 285 - Expected '1', got '7'
     UTF8_SKIP bogus -113
not ok 286 - Expected '7', got '0'
     UTF8_TRAILING bogus -113
not ok 287 - Expected '1', got '7'
     UTF8_SKIP bogus -112
not ok 288 - Expected '7', got '0'
     UTF8_TRAILING bogus -112
not ok 289 - Expected '1', got '7'
     UTF8_SKIP bogus -111
not ok 290 - Expected '7', got '0'
     UTF8_TRAILING bogus -111
not ok 291 - Expected '1', got '7'
     UTF8_SKIP bogus -110
not ok 292 - Expected '7', got '0'
     UTF8_TRAILING bogus -110
not ok 293 - Expected '1', got '7'
     UTF8_SKIP bogus -109
not ok 294 - Expected '7', got '0'
     UTF8_TRAILING bogus -109
not ok 295 - Expected '1', got '7'
     UTF8_SKIP bogus -108
not ok 296 - Expected '7', got '0'
     UTF8_TRAILING bogus -108
not ok 297 - Expected '1', got '7'
     UTF8_SKIP bogus -107
not ok 298 - Expected '7', got '0'
     UTF8_TRAILING bogus -107
not ok 299 - Expected '1', got '7'
     UTF8_SKIP bogus -106
not ok 300 - Expected '7', got '0'
     UTF8_TRAILING bogus -106
not ok 301 - Expected '1', got '7'
     UTF8_SKIP bogus -105
not ok 302 - Expected '7', got '0'
     UTF8_TRAILING bogus -105
not ok 303 - Expected '1', got '7'
     UTF8_SKIP bogus -104
not ok 304 - Expected '7', got '0'
     UTF8_TRAILING bogus -104
not ok 305 - Expected '1', got '7'
     UTF8_SKIP bogus -103
not ok 306 - Expected '7', got '0'
     UTF8_TRAILING bogus -103
not ok 307 - Expected '1', got '7'
     UTF8_SKIP bogus -102
not ok 308 - Expected '7', got '0'
     UTF8_TRAILING bogus -102
not ok 309 - Expected '1', got '7'
     UTF8_SKIP bogus -101
not ok 310 - Expected '7', got '0'
     UTF8_TRAILING bogus -101
not ok 311 - Expected '1', got '7'
     UTF8_SKIP bogus -100
not ok 312 - Expected '7', got '0'
     UTF8_TRAILING bogus -100
not ok 313 - Expected '1', got '7'
     UTF8_SKIP bogus -99
not ok 314 - Expected '7', got '0'
     UTF8_TRAILING bogus -99
not ok 315 - Expected '1', got '7'
     UTF8_SKIP bogus -98
not ok 316 - Expected '7', got '0'
     UTF8_TRAILING bogus -98
not ok 317 - Expected '1', got '7'
     UTF8_SKIP bogus -97
not ok 318 - Expected '7', got '0'
     UTF8_TRAILING bogus -97
not ok 319 - Expected '1', got '7'
     UTF8_SKIP bogus -96
not ok 320 - Expected '7', got '107'
     UTF8_TRAILING bogus -96
not ok 321 - Expected '1', got '7'
     UTF8_SKIP bogus -95
not ok 322 - Expected '7', got '105'
     UTF8_TRAILING bogus -95
not ok 323 - Expected '1', got '7'
     UTF8_SKIP bogus -94
not ok 324 - Expected '7', got '110'
     UTF8_TRAILING bogus -94
not ok 325 - Expected '1', got '7'
     UTF8_SKIP bogus -93
not ok 326 - Expected '7', got '111'
     UTF8_TRAILING bogus -93
not ok 327 - Expected '1', got '7'
     UTF8_SKIP bogus -92
not ok 328 - Expected '7', got '95'
     UTF8_TRAILING bogus -92
not ok 329 - Expected '1', got '7'
     UTF8_SKIP bogus -91
not ok 330 - Expected '7', got '73'
     UTF8_TRAILING bogus -91
not ok 331 - Expected '1', got '7'
     UTF8_SKIP bogus -90
not ok 332 - Expected '7', got '120'
     UTF8_TRAILING bogus -90
not ok 333 - Expected '1', got '7'
     UTF8_SKIP bogus -89
not ok 334 - Expected '7', got '70'
     UTF8_TRAILING bogus -89
not ok 335 - Expected '1', got '7'
     UTF8_SKIP bogus -88
not ok 336 - Expected '7', got '105'
     UTF8_TRAILING bogus -88
not ok 337 - Expected '1', got '7'
     UTF8_SKIP bogus -87
not ok 338 - Expected '7', got '108'
     UTF8_TRAILING bogus -87
not ok 339 - Expected '1', got '7'
     UTF8_SKIP bogus -86
not ok 340 - Expected '7', got '101'
     UTF8_TRAILING bogus -86
not ok 341 - Expected '1', got '7'
     UTF8_SKIP bogus -85
not ok 342 - Expected '7', got '78'
     UTF8_TRAILING bogus -85
not ok 343 - Expected '1', got '7'
     UTF8_SKIP bogus -84
not ok 344 - Expected '7', got '97'
     UTF8_TRAILING bogus -84
not ok 345 - Expected '1', got '7'
     UTF8_SKIP bogus -83
not ok 346 - Expected '7', got '109'
     UTF8_TRAILING bogus -83
not ok 347 - Expected '1', got '7'
     UTF8_SKIP bogus -82
not ok 348 - Expected '7', got '101'
     UTF8_TRAILING bogus -82
not ok 349 - Expected '1', got '7'
     UTF8_SKIP bogus -81
not ok 350 - Expected '7', got '115'
     UTF8_TRAILING bogus -81
not ok 351 - Expected '1', got '7'
     UTF8_SKIP bogus -80
not ok 352 - Expected '7', got '95'
     UTF8_TRAILING bogus -80
not ok 353 - Expected '1', got '7'
     UTF8_SKIP bogus -79
not ok 354 - Expected '7', got '108'
     UTF8_TRAILING bogus -79
not ok 355 - Expected '1', got '7'
     UTF8_SKIP bogus -78
not ok 356 - Expected '7', got '97'
     UTF8_TRAILING bogus -78
not ok 357 - Expected '1', got '7'
     UTF8_SKIP bogus -77
not ok 358 - Expected '7', got '116'
     UTF8_TRAILING bogus -77
not ok 359 - Expected '1', got '7'
     UTF8_SKIP bogus -76
not ok 360 - Expected '7', got '101'
     UTF8_TRAILING bogus -76
not ok 361 - Expected '1', got '7'
     UTF8_SKIP bogus -75
not ok 362 - Expected '7', got '115'
     UTF8_TRAILING bogus -75
not ok 363 - Expected '1', got '7'
     UTF8_SKIP bogus -74
not ok 364 - Expected '7', got '116'
     UTF8_TRAILING bogus -74
not ok 365 - Expected '1', got '7'
     UTF8_SKIP bogus -73
not ok 366 - Expected '7', got '95'
     UTF8_TRAILING bogus -73
not ok 367 - Expected '1', got '7'
     UTF8_SKIP bogus -72
not ok 368 - Expected '7', got '115'
     UTF8_TRAILING bogus -72
not ok 369 - Expected '1', got '7'
     UTF8_SKIP bogus -71
not ok 370 - Expected '7', got '110'
     UTF8_TRAILING bogus -71
not ok 371 - Expected '1', got '7'
     UTF8_SKIP bogus -70
not ok 372 - Expected '7', got '97'
     UTF8_TRAILING bogus -70
not ok 373 - Expected '1', got '7'
     UTF8_SKIP bogus -69
not ok 374 - Expected '7', got '112'
     UTF8_TRAILING bogus -69
not ok 375 - Expected '1', got '7'
     UTF8_SKIP bogus -68
not ok 376 - Expected '7', got '115'
     UTF8_TRAILING bogus -68
not ok 377 - Expected '1', got '7'
     UTF8_SKIP bogus -67
not ok 378 - Expected '7', got '104'
     UTF8_TRAILING bogus -67
not ok 379 - Expected '1', got '7'
     UTF8_SKIP bogus -66
not ok 380 - Expected '7', got '111'
     UTF8_TRAILING bogus -66
not ok 381 - Expected '1', got '7'
     UTF8_SKIP bogus -65
not ok 382 - Expected '7', got '116'
     UTF8_TRAILING bogus -65
not ok 383 - Expected '1', got '7'
     UTF8_SKIP bogus -64
not ok 384 - Expected '7', got '0'
     UTF8_TRAILING bogus -64
not ok 385 - Expected '2', got '1'
     UTF8_SKIP two-byte -62
not ok 386 - Expected '1', got '0'
     UTF8_TRAILING two-byte -62
not ok 387 - Expected '2', got '1'
     UTF8_SKIP two-byte -61
not ok 388 - Expected '1', got '0'
     UTF8_TRAILING two-byte -61
not ok 389 - Expected '2', got '1'
     UTF8_SKIP two-byte -60
not ok 390 - Expected '1', got '0'
     UTF8_TRAILING two-byte -60
not ok 391 - Expected '2', got '1'
     UTF8_SKIP two-byte -59
not ok 392 - Expected '1', got '0'
     UTF8_TRAILING two-byte -59
not ok 393 - Expected '2', got '1'
     UTF8_SKIP two-byte -58
not ok 394 - Expected '1', got '0'
     UTF8_TRAILING two-byte -58
not ok 395 - Expected '2', got '1'
     UTF8_SKIP two-byte -57
not ok 396 - Expected '1', got '0'
     UTF8_TRAILING two-byte -57
not ok 397 - Expected '2', got '1'
     UTF8_SKIP two-byte -56
not ok 398 - Expected '1', got '46'
     UTF8_TRAILING two-byte -56
not ok 399 - Expected '2', got '1'
     UTF8_SKIP two-byte -55
not ok 400 - Expected '1', got '46'
     UTF8_TRAILING two-byte -55
not ok 401 - Expected '2', got '1'
     UTF8_SKIP two-byte -54
not ok 402 - Expected '1', got '47'
     UTF8_TRAILING two-byte -54
not ok 403 - Expected '2', got '1'
     UTF8_SKIP two-byte -53
not ok 404 - Expected '1', got '99'
     UTF8_TRAILING two-byte -53
not ok 405 - Expected '2', got '1'
     UTF8_SKIP two-byte -52
not ok 406 - Expected '1', got '111'
     UTF8_TRAILING two-byte -52
not ok 407 - Expected '2', got '1'
     UTF8_SKIP two-byte -51
not ok 408 - Expected '1', got '114'
     UTF8_TRAILING two-byte -51
not ok 409 - Expected '2', got '1'
     UTF8_SKIP two-byte -50
not ok 410 - Expected '1', got '101'
     UTF8_TRAILING two-byte -50
not ok 411 - Expected '2', got '1'
     UTF8_SKIP two-byte -49
not ok 412 - Expected '1', got '47'
     UTF8_TRAILING two-byte -49
not ok 413 - Expected '2', got '1'
     UTF8_SKIP two-byte -48
not ok 414 - Expected '1', got '75'
     UTF8_TRAILING two-byte -48
not ok 415 - Expected '2', got '1'
     UTF8_SKIP two-byte -47
not ok 416 - Expected '1', got '105'
     UTF8_TRAILING two-byte -47
not ok 417 - Expected '2', got '1'
     UTF8_SKIP two-byte -46
not ok 418 - Expected '1', got '110'
     UTF8_TRAILING two-byte -46
not ok 419 - Expected '2', got '1'
     UTF8_SKIP two-byte -45
not ok 420 - Expected '1', got '111'
     UTF8_TRAILING two-byte -45
not ok 421 - Expected '2', got '1'
     UTF8_SKIP two-byte -44
not ok 422 - Expected '1', got '83'
     UTF8_TRAILING two-byte -44
not ok 423 - Expected '2', got '1'
     UTF8_SKIP two-byte -43
not ok 424 - Expected '1', got '101'
     UTF8_TRAILING two-byte -43
not ok 425 - Expected '2', got '1'
     UTF8_SKIP two-byte -42
not ok 426 - Expected '1', got '97'
     UTF8_TRAILING two-byte -42
not ok 427 - Expected '2', got '1'
     UTF8_SKIP two-byte -41
not ok 428 - Expected '1', got '114'
     UTF8_TRAILING two-byte -41
not ok 429 - Expected '2', got '1'
     UTF8_SKIP two-byte -40
not ok 430 - Expected '1', got '99'
     UTF8_TRAILING two-byte -40
not ok 431 - Expected '2', got '1'
     UTF8_SKIP two-byte -39
not ok 432 - Expected '1', got '104'
     UTF8_TRAILING two-byte -39
not ok 433 - Expected '2', got '1'
     UTF8_SKIP two-byte -38
not ok 434 - Expected '1', got '47'
     UTF8_TRAILING two-byte -38
not ok 435 - Expected '2', got '1'
     UTF8_SKIP two-byte -37
not ok 436 - Expected '1', got '85'
     UTF8_TRAILING two-byte -37
not ok 437 - Expected '2', got '1'
     UTF8_SKIP two-byte -36
not ok 438 - Expected '1', got '116'
     UTF8_TRAILING two-byte -36
not ok 439 - Expected '2', got '1'
     UTF8_SKIP two-byte -35
not ok 440 - Expected '1', got '105'
     UTF8_TRAILING two-byte -35
not ok 441 - Expected '2', got '1'
     UTF8_SKIP two-byte -34
not ok 442 - Expected '1', got '108'
     UTF8_TRAILING two-byte -34
not ok 443 - Expected '3', got '2'
     UTF8_SKIP three-byte -32
not ok 444 - Expected '2', got '73'
     UTF8_TRAILING three-byte -32
not ok 445 - Expected '3', got '2'
     UTF8_SKIP three-byte -31
not ok 446 - Expected '2', got '110'
     UTF8_TRAILING three-byte -31
not ok 447 - Expected '3', got '2'
     UTF8_SKIP three-byte -30
not ok 448 - Expected '2', got '100'
     UTF8_TRAILING three-byte -30
not ok 449 - Expected '3', got '2'
     UTF8_SKIP three-byte -29
not ok 450 - Expected '2', got '101'
     UTF8_TRAILING three-byte -29
not ok 451 - Expected '3', got '2'
     UTF8_SKIP three-byte -28
not ok 452 - Expected '2', got '120'
     UTF8_TRAILING three-byte -28
not ok 453 - Expected '3', got '2'
     UTF8_SKIP three-byte -27
not ok 454 - Expected '2', got '70'
     UTF8_TRAILING three-byte -27
not ok 455 - Expected '3', got '2'
     UTF8_SKIP three-byte -26
not ok 456 - Expected '2', got '105'
     UTF8_TRAILING three-byte -26
not ok 457 - Expected '3', got '2'
     UTF8_SKIP three-byte -25
not ok 458 - Expected '2', got '108'
     UTF8_TRAILING three-byte -25
not ok 459 - Expected '3', got '2'
     UTF8_SKIP three-byte -24
not ok 460 - Expected '2', got '101'
     UTF8_TRAILING three-byte -24
not ok 461 - Expected '3', got '2'
     UTF8_SKIP three-byte -23
not ok 462 - Expected '2', got '78'
     UTF8_TRAILING three-byte -23
not ok 463 - Expected '3', got '2'
     UTF8_SKIP three-byte -22
not ok 464 - Expected '2', got '97'
     UTF8_TRAILING three-byte -22
not ok 465 - Expected '3', got '2'
     UTF8_SKIP three-byte -21
not ok 466 - Expected '2', got '109'
     UTF8_TRAILING three-byte -21
not ok 467 - Expected '3', got '2'
     UTF8_SKIP three-byte -20
not ok 468 - Expected '2', got '101'
     UTF8_TRAILING three-byte -20
not ok 469 - Expected '3', got '2'
     UTF8_SKIP three-byte -19
not ok 470 - Expected '2', got '115'
     UTF8_TRAILING three-byte -19
not ok 471 - Expected '3', got '2'
     UTF8_SKIP three-byte -18
not ok 472 - Expected '2', got '46'
     UTF8_TRAILING three-byte -18
not ok 473 - Expected '4', got '3'
     UTF8_SKIP four-byte -16
not ok 474 - Expected '3', got '0'
     UTF8_TRAILING four-byte -16
not ok 475 - Expected '4', got '3'
     UTF8_SKIP four-byte -15
not ok 476 - Expected '3', got '0'
     UTF8_TRAILING four-byte -15
not ok 477 - Expected '4', got '3'
     UTF8_SKIP four-byte -14
not ok 478 - Expected '3', got '0'
     UTF8_TRAILING four-byte -14
not ok 479 - Expected '4', got '3'
     UTF8_SKIP four-byte -13
not ok 480 - Expected '3', got '0'
     UTF8_TRAILING four-byte -13
ok 481 - UTF8_SKIP bogus but no memory problems -11
not ok 482 - UTF8_TRAILING bogus but no memory problems -11
ok 483 - UTF8_SKIP bogus but no memory problems -10
not ok 484 - UTF8_TRAILING bogus but no memory problems -10
ok 485 - UTF8_SKIP bogus but no memory problems -9
not ok 486 - UTF8_TRAILING bogus but no memory problems -9
ok 487 - UTF8_SKIP bogus but no memory problems -8
not ok 488 - UTF8_TRAILING bogus but no memory problems -8
ok 489 - UTF8_SKIP bogus but no memory problems -7
not ok 490 - UTF8_TRAILING bogus but no memory problems -7
ok 491 - UTF8_SKIP bogus but no memory problems -6
not ok 492 - UTF8_TRAILING bogus but no memory problems -6
ok 493 - UTF8_SKIP bogus but no memory problems -5
not ok 494 - UTF8_TRAILING bogus but no memory problems -5
ok 495 - UTF8_SKIP bogus but no memory problems -4
not ok 496 - UTF8_TRAILING bogus but no memory problems -4
ok 497 - UTF8_SKIP bogus but no memory problems -3
not ok 498 - UTF8_TRAILING bogus but no memory problems -3
ok 499 - UTF8_SKIP bogus but no memory problems -2
not ok 500 - UTF8_TRAILING bogus but no memory problems -2


-------
patch.

Index: core/KinoSearch/Test/Util/TestStringHelper.c
===================================================================
--- core/KinoSearch/Test/Util/TestStringHelper.c	(revision 5705)
+++ core/KinoSearch/Test/Util/TestStringHelper.c	(working copy)
@@ -7,46 +7,46 @@
  static void
  test_SKIP_and_TRAILING(TestBatch *batch)
  {
-    u8_t i, max;
+    int i, max;

      /* Some of the upper max boundaries are skipped (e.g. 127)
       * because they may not appear as initial bytes in legal UTF-8.
       */
      for (i = 0, max = 127; i < max; i++) {
          ASSERT_INT_EQ(batch, StrHelp_UTF8_SKIP[i], 1,
-            "UTF8_SKIP ascii %d", (int)i);
+            "UTF8_SKIP ascii %d", i);
          ASSERT_INT_EQ(batch, StrHelp_UTF8_TRAILING[i], 0,
-            "UTF8_TRAILING ascii %d", (int)i);
+            "UTF8_TRAILING ascii %d", i);
      }
      for (i = 128, max = 193; i < max; i++) {
          ASSERT_INT_EQ(batch, StrHelp_UTF8_SKIP[i], 1,
-            "UTF8_SKIP bogus %d", (int)i);
+            "UTF8_SKIP bogus %d", i);
          ASSERT_INT_EQ(batch, StrHelp_UTF8_TRAILING[i], 7,
-            "UTF8_TRAILING bogus %d", (int)i);
+            "UTF8_TRAILING bogus %d", i);
      }
      for (i = 194, max = 223; i < max; i++) {
          ASSERT_INT_EQ(batch, StrHelp_UTF8_SKIP[i], 2,
-            "UTF8_SKIP two-byte %d", (int)i);
+            "UTF8_SKIP two-byte %d", i);
          ASSERT_INT_EQ(batch, StrHelp_UTF8_TRAILING[i], 1,
-            "UTF8_TRAILING two-byte %d", (int)i);
+            "UTF8_TRAILING two-byte %d", i);
      }
      for (i = 224, max = 239; i < max; i++) {
          ASSERT_INT_EQ(batch, StrHelp_UTF8_SKIP[i], 3,
-            "UTF8_SKIP three-byte %d", (int)i);
+            "UTF8_SKIP three-byte %d", i);
          ASSERT_INT_EQ(batch, StrHelp_UTF8_TRAILING[i], 2,
-            "UTF8_TRAILING three-byte %d", (int)i);
+            "UTF8_TRAILING three-byte %d", i);
      }
      for (i = 240, max = 244; i < max; i++) {
          ASSERT_INT_EQ(batch, StrHelp_UTF8_SKIP[i], 4,
-            "UTF8_SKIP four-byte %d", (int)i);
+            "UTF8_SKIP four-byte %d", i);
          ASSERT_INT_EQ(batch, StrHelp_UTF8_TRAILING[i], 3,
-            "UTF8_TRAILING four-byte %d", (int)i);
+            "UTF8_TRAILING four-byte %d", i);
      }
      for (i = 245, max = 255; i < max; i++) {
-        ASSERT_TRUE(batch, StrHelp_UTF8_SKIP[i] > 0,
-            "UTF8_SKIP bogus but no memory problems %d", (int)i);
-        ASSERT_TRUE(batch, StrHelp_UTF8_TRAILING[i] == 7,
-            "UTF8_TRAILING bogus but no memory problems %d", (int)i);
+        ASSERT_TRUE(batch, (StrHelp_UTF8_SKIP[i] > 0),
+            "UTF8_SKIP bogus but no memory problems %d", i);
+        ASSERT_INT_EQ(batch, StrHelp_UTF8_TRAILING[i], 7,
+            "UTF8_TRAILING bogus but no memory problems %d", i);
      }
  }

-- 
Peter Karman  .  http://peknet.com/  .  peter@peknet.com

Re: 64-bit linux errors with t/core/032-string_helper.t

Posted by Peter Karman <pe...@peknet.com>.
Marvin Humphrey wrote on 1/18/10 11:05 PM:
> On Mon, Jan 18, 2010 at 10:34:46PM -0600, Peter Karman wrote:
> 
>> alas, that didn't change the output.
> 
> Well, I don't really understand what's going on at this point.
> 

I take some small solace in our mutual confusion. :/

> Changing the test to use an int doesn't address the problem, because the whole
> point of UTF8_SKIP is to use it on header bytes in a UTF-8 sequence, which
> will be unsigned char.

agreed.

> 
> It seems messed up that an unsigned type gets promoted to a negative signed
> type when used as an array subscript.  You're not supposed to use "char" on
> its own as an array subscript, because whether "char" is signed or unsigned is
> implementation defined, but either "signed char" or "unsigned char" are
> allowed.

agreed.
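
To make the subscript point concrete, here is a small sketch of a lookup that 
is safe regardless of whether plain "char" is signed. It assumes a 256-entry 
table indexed by the header byte; demo_skip, demo_skip_init and skip_for are 
hypothetical names, and the table is filled from the leading bits of each byte 
rather than copied from StringHelper, so the values are illustrative only:

```c
/* Hypothetical 256-entry table, standing in for something like UTF8_SKIP. */
static unsigned char demo_skip[256];

/* Fill the table from the leading bits of each possible header byte.
 * Illustrative values only; not the contents of StrHelp_UTF8_SKIP. */
void
demo_skip_init(void)
{
    int b;
    for (b = 0; b < 256; b++) {
        if      (b < 0xC0) { demo_skip[b] = 1; } /* ASCII or continuation */
        else if (b < 0xE0) { demo_skip[b] = 2; } /* two-byte header */
        else if (b < 0xF0) { demo_skip[b] = 3; } /* three-byte header */
        else               { demo_skip[b] = 4; } /* four-byte header or bogus */
    }
}

/* Look up the skip value for the byte at ptr.  Plain char may be signed,
 * so the byte is converted to unsigned char before it is used as a
 * subscript; the index is then always in the range 0..255. */
int
skip_for(const char *ptr)
{
    return demo_skip[(unsigned char)*ptr];
}
```

After calling demo_skip_init() once, skip_for("\xC3\xA9") looks up index 195 
rather than a negative offset, even on platforms where char is signed.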

> 
> So the next steps are investigatory.  What are the typedefs for u8_t in
> charmony.h?  

[pkarman@pijdev02:/tmp/ks-trunk/perl]$ grep u8_t charmony.h
typedef unsigned char chy_u8_t;
   #define u8_t chy_u8_t
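
A quick way to double-check that typedef at run time is sketched below. The 
typedef is repeated as a stand-in for charmony.h so the snippet compiles on its 
own, and u8_is_unsigned_8bit is a made-up name:

```c
#include <limits.h>

/* Stand-in for the Charmonizer typedef shown above. */
typedef unsigned char chy_u8_t;

/* Returns 1 if chy_u8_t is an unsigned type exactly 8 bits wide.
 * Casting -1 to an unsigned type yields that type's maximum value,
 * so for an unsigned 8-bit type the result must be 255. */
int
u8_is_unsigned_8bit(void)
{
    return CHAR_BIT == 8
        && sizeof(chy_u8_t) == 1
        && (chy_u8_t)-1 == 255;
}
```

If this returns 1, the type itself is not the source of the negative 
subscripts, and the problem has to be somewhere else.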

> What happens when we swap out u8_t for uint8_t when declaring
> UTF8_SKIP in StringHelper.bp?  Does uint8_t work as a subscript?
>

No. Same thing (negative subscripts when > 127).

Here's some more evidence. I added this to the test just for some sanity 
checking. Output below.

     for (i=0, max=255; i < max; i++) {
         printf("i == %d\n", i);
         printf("UTF8_SKIP[%d] == %d\n", i, StrHelp_UTF8_SKIP[i]);
     }
     for (i=0, max=255; i < max; i++) {
         printf("i == %d\n", i);
         printf("UTF8_TRAILING[%d] == %d\n", i, StrHelp_UTF8_TRAILING[i]);
     }
     for (i=0, max=255; i < max; i++) {
         printf("i == %d\n", i);
         printf("UTF8_SKIP[%d] == %d\n", i, StrHelp_UTF8_SKIP[i]);
         printf("UTF8_TRAILING[%d] == %d\n", i, StrHelp_UTF8_TRAILING[i]);
         printf("i == %d\n", i);
     }
     printf("finally: i == %d\n", i);


outputs:

i == 0
UTF8_SKIP[0] == 1
i == 1
UTF8_SKIP[1] == 1
i == 2
UTF8_SKIP[2] == 1
i == 3
UTF8_SKIP[3] == 1
i == 4
UTF8_SKIP[4] == 1
i == 5
UTF8_SKIP[5] == 1
i == 6
UTF8_SKIP[6] == 1
i == 7
UTF8_SKIP[7] == 1
i == 8
UTF8_SKIP[8] == 1
i == 9
UTF8_SKIP[9] == 1
i == 10
UTF8_SKIP[10] == 1
i == 11
UTF8_SKIP[11] == 1
i == 12
UTF8_SKIP[12] == 1
i == 13
UTF8_SKIP[13] == 1
i == 14
UTF8_SKIP[14] == 1
i == 15
UTF8_SKIP[15] == 1
i == 16
UTF8_SKIP[16] == 1
i == 17
UTF8_SKIP[17] == 1
i == 18
UTF8_SKIP[18] == 1
i == 19
UTF8_SKIP[19] == 1
i == 20
UTF8_SKIP[20] == 1
i == 21
UTF8_SKIP[21] == 1
i == 22
UTF8_SKIP[22] == 1
i == 23
UTF8_SKIP[23] == 1
i == 24
UTF8_SKIP[24] == 1
i == 25
UTF8_SKIP[25] == 1
i == 26
UTF8_SKIP[26] == 1
i == 27
UTF8_SKIP[27] == 1
i == 28
UTF8_SKIP[28] == 1
i == 29
UTF8_SKIP[29] == 1
i == 30
UTF8_SKIP[30] == 1
i == 31
UTF8_SKIP[31] == 1
i == 32
UTF8_SKIP[32] == 1
i == 33
UTF8_SKIP[33] == 1
i == 34
UTF8_SKIP[34] == 1
i == 35
UTF8_SKIP[35] == 1
i == 36
UTF8_SKIP[36] == 1
i == 37
UTF8_SKIP[37] == 1
i == 38
UTF8_SKIP[38] == 1
i == 39
UTF8_SKIP[39] == 1
i == 40
UTF8_SKIP[40] == 1
i == 41
UTF8_SKIP[41] == 1
i == 42
UTF8_SKIP[42] == 1
i == 43
UTF8_SKIP[43] == 1
i == 44
UTF8_SKIP[44] == 1
i == 45
UTF8_SKIP[45] == 1
i == 46
UTF8_SKIP[46] == 1
i == 47
UTF8_SKIP[47] == 1
i == 48
UTF8_SKIP[48] == 1
i == 49
UTF8_SKIP[49] == 1
i == 50
UTF8_SKIP[50] == 1
i == 51
UTF8_SKIP[51] == 1
i == 52
UTF8_SKIP[52] == 1
i == 53
UTF8_SKIP[53] == 1
i == 54
UTF8_SKIP[54] == 1
i == 55
UTF8_SKIP[55] == 1
i == 56
UTF8_SKIP[56] == 1
i == 57
UTF8_SKIP[57] == 1
i == 58
UTF8_SKIP[58] == 1
i == 59
UTF8_SKIP[59] == 1
i == 60
UTF8_SKIP[60] == 1
i == 61
UTF8_SKIP[61] == 1
i == 62
UTF8_SKIP[62] == 1
i == 63
UTF8_SKIP[63] == 1
i == 64
UTF8_SKIP[64] == 1
i == 65
UTF8_SKIP[65] == 1
i == 66
UTF8_SKIP[66] == 1
i == 67
UTF8_SKIP[67] == 1
i == 68
UTF8_SKIP[68] == 1
i == 69
UTF8_SKIP[69] == 1
i == 70
UTF8_SKIP[70] == 1
i == 71
UTF8_SKIP[71] == 1
i == 72
UTF8_SKIP[72] == 1
i == 73
UTF8_SKIP[73] == 1
i == 74
UTF8_SKIP[74] == 1
i == 75
UTF8_SKIP[75] == 1
i == 76
UTF8_SKIP[76] == 1
i == 77
UTF8_SKIP[77] == 1
i == 78
UTF8_SKIP[78] == 1
i == 79
UTF8_SKIP[79] == 1
i == 80
UTF8_SKIP[80] == 1
i == 81
UTF8_SKIP[81] == 1
i == 82
UTF8_SKIP[82] == 1
i == 83
UTF8_SKIP[83] == 1
i == 84
UTF8_SKIP[84] == 1
i == 85
UTF8_SKIP[85] == 1
i == 86
UTF8_SKIP[86] == 1
i == 87
UTF8_SKIP[87] == 1
i == 88
UTF8_SKIP[88] == 1
i == 89
UTF8_SKIP[89] == 1
i == 90
UTF8_SKIP[90] == 1
i == 91
UTF8_SKIP[91] == 1
i == 92
UTF8_SKIP[92] == 1
i == 93
UTF8_SKIP[93] == 1
i == 94
UTF8_SKIP[94] == 1
i == 95
UTF8_SKIP[95] == 1
i == 96
UTF8_SKIP[96] == 1
i == 97
UTF8_SKIP[97] == 1
i == 98
UTF8_SKIP[98] == 1
i == 99
UTF8_SKIP[99] == 1
i == 100
UTF8_SKIP[100] == 1
i == 101
UTF8_SKIP[101] == 1
i == 102
UTF8_SKIP[102] == 1
i == 103
UTF8_SKIP[103] == 1
i == 104
UTF8_SKIP[104] == 1
i == 105
UTF8_SKIP[105] == 1
i == 106
UTF8_SKIP[106] == 1
i == 107
UTF8_SKIP[107] == 1
i == 108
UTF8_SKIP[108] == 1
i == 109
UTF8_SKIP[109] == 1
i == 110
UTF8_SKIP[110] == 1
i == 111
UTF8_SKIP[111] == 1
i == 112
UTF8_SKIP[112] == 1
i == 113
UTF8_SKIP[113] == 1
i == 114
UTF8_SKIP[114] == 1
i == 115
UTF8_SKIP[115] == 1
i == 116
UTF8_SKIP[116] == 1
i == 117
UTF8_SKIP[117] == 1
i == 118
UTF8_SKIP[118] == 1
i == 119
UTF8_SKIP[119] == 1
i == 120
UTF8_SKIP[120] == 1
i == 121
UTF8_SKIP[121] == 1
i == 122
UTF8_SKIP[122] == 1
i == 123
UTF8_SKIP[123] == 1
i == 124
UTF8_SKIP[124] == 1
i == 125
UTF8_SKIP[125] == 1
i == 126
UTF8_SKIP[126] == 1
i == 127
UTF8_SKIP[127] == 1
i == 128
UTF8_SKIP[128] == 1
i == 129
UTF8_SKIP[129] == 1
i == 130
UTF8_SKIP[130] == 1
i == 131
UTF8_SKIP[131] == 1
i == 132
UTF8_SKIP[132] == 1
i == 133
UTF8_SKIP[133] == 1
i == 134
UTF8_SKIP[134] == 1
i == 135
UTF8_SKIP[135] == 1
i == 136
UTF8_SKIP[136] == 1
i == 137
UTF8_SKIP[137] == 1
i == 138
UTF8_SKIP[138] == 1
i == 139
UTF8_SKIP[139] == 1
i == 140
UTF8_SKIP[140] == 1
i == 141
UTF8_SKIP[141] == 1
i == 142
UTF8_SKIP[142] == 1
i == 143
UTF8_SKIP[143] == 1
i == 144
UTF8_SKIP[144] == 1
i == 145
UTF8_SKIP[145] == 1
i == 146
UTF8_SKIP[146] == 1
i == 147
UTF8_SKIP[147] == 1
i == 148
UTF8_SKIP[148] == 1
i == 149
UTF8_SKIP[149] == 1
i == 150
UTF8_SKIP[150] == 1
i == 151
UTF8_SKIP[151] == 1
i == 152
UTF8_SKIP[152] == 1
i == 153
UTF8_SKIP[153] == 1
i == 154
UTF8_SKIP[154] == 1
i == 155
UTF8_SKIP[155] == 1
i == 156
UTF8_SKIP[156] == 1
i == 157
UTF8_SKIP[157] == 1
i == 158
UTF8_SKIP[158] == 1
i == 159
UTF8_SKIP[159] == 1
i == 160
UTF8_SKIP[160] == 1
i == 161
UTF8_SKIP[161] == 1
i == 162
UTF8_SKIP[162] == 1
i == 163
UTF8_SKIP[163] == 1
i == 164
UTF8_SKIP[164] == 1
i == 165
UTF8_SKIP[165] == 1
i == 166
UTF8_SKIP[166] == 1
i == 167
UTF8_SKIP[167] == 1
i == 168
UTF8_SKIP[168] == 1
i == 169
UTF8_SKIP[169] == 1
i == 170
UTF8_SKIP[170] == 1
i == 171
UTF8_SKIP[171] == 1
i == 172
UTF8_SKIP[172] == 1
i == 173
UTF8_SKIP[173] == 1
i == 174
UTF8_SKIP[174] == 1
i == 175
UTF8_SKIP[175] == 1
i == 176
UTF8_SKIP[176] == 1
i == 177
UTF8_SKIP[177] == 1
i == 178
UTF8_SKIP[178] == 1
i == 179
UTF8_SKIP[179] == 1
i == 180
UTF8_SKIP[180] == 1
i == 181
UTF8_SKIP[181] == 1
i == 182
UTF8_SKIP[182] == 1
i == 183
UTF8_SKIP[183] == 1
i == 184
UTF8_SKIP[184] == 1
i == 185
UTF8_SKIP[185] == 1
i == 186
UTF8_SKIP[186] == 1
i == 187
UTF8_SKIP[187] == 1
i == 188
UTF8_SKIP[188] == 1
i == 189
UTF8_SKIP[189] == 1
i == 190
UTF8_SKIP[190] == 1
i == 191
UTF8_SKIP[191] == 1
i == 192
UTF8_SKIP[192] == 1
i == 193
UTF8_SKIP[193] == 2
i == 194
UTF8_SKIP[194] == 2
i == 195
UTF8_SKIP[195] == 2
i == 196
UTF8_SKIP[196] == 2
i == 197
UTF8_SKIP[197] == 2
i == 198
UTF8_SKIP[198] == 2
i == 199
UTF8_SKIP[199] == 2
i == 200
UTF8_SKIP[200] == 2
i == 201
UTF8_SKIP[201] == 2
i == 202
UTF8_SKIP[202] == 2
i == 203
UTF8_SKIP[203] == 2
i == 204
UTF8_SKIP[204] == 2
i == 205
UTF8_SKIP[205] == 2
i == 206
UTF8_SKIP[206] == 2
i == 207
UTF8_SKIP[207] == 2
i == 208
UTF8_SKIP[208] == 2
i == 209
UTF8_SKIP[209] == 2
i == 210
UTF8_SKIP[210] == 2
i == 211
UTF8_SKIP[211] == 2
i == 212
UTF8_SKIP[212] == 2
i == 213
UTF8_SKIP[213] == 2
i == 214
UTF8_SKIP[214] == 2
i == 215
UTF8_SKIP[215] == 2
i == 216
UTF8_SKIP[216] == 2
i == 217
UTF8_SKIP[217] == 2
i == 218
UTF8_SKIP[218] == 2
i == 219
UTF8_SKIP[219] == 2
i == 220
UTF8_SKIP[220] == 2
i == 221
UTF8_SKIP[221] == 2
i == 222
UTF8_SKIP[222] == 2
i == 223
UTF8_SKIP[223] == 2
i == 224
UTF8_SKIP[224] == 3
i == 225
UTF8_SKIP[225] == 3
i == 226
UTF8_SKIP[226] == 3
i == 227
UTF8_SKIP[227] == 3
i == 228
UTF8_SKIP[228] == 3
i == 229
UTF8_SKIP[229] == 3
i == 230
UTF8_SKIP[230] == 3
i == 231
UTF8_SKIP[231] == 3
i == 232
UTF8_SKIP[232] == 3
i == 233
UTF8_SKIP[233] == 3
i == 234
UTF8_SKIP[234] == 3
i == 235
UTF8_SKIP[235] == 3
i == 236
UTF8_SKIP[236] == 3
i == 237
UTF8_SKIP[237] == 3
i == 238
UTF8_SKIP[238] == 3
i == 239
UTF8_SKIP[239] == 3
i == 240
UTF8_SKIP[240] == 4
i == 241
UTF8_SKIP[241] == 4
i == 242
UTF8_SKIP[242] == 4
i == 243
UTF8_SKIP[243] == 4
i == 244
UTF8_SKIP[244] == 4
i == 245
UTF8_SKIP[245] == 4
i == 246
UTF8_SKIP[246] == 4
i == 247
UTF8_SKIP[247] == 4
i == 248
UTF8_SKIP[248] == 5
i == 249
UTF8_SKIP[249] == 5
i == 250
UTF8_SKIP[250] == 5
i == 251
UTF8_SKIP[251] == 5
i == 252
UTF8_SKIP[252] == 6
i == 253
UTF8_SKIP[253] == 6
i == 254
UTF8_SKIP[254] == 7
i == 0
UTF8_TRAILING[0] == 0
i == 1
UTF8_TRAILING[1] == 0
i == 2
UTF8_TRAILING[2] == 0
i == 3
UTF8_TRAILING[3] == 0
i == 4
UTF8_TRAILING[4] == 0
i == 5
UTF8_TRAILING[5] == 0
i == 6
UTF8_TRAILING[6] == 0
i == 7
UTF8_TRAILING[7] == 0
i == 8
UTF8_TRAILING[8] == 0
i == 9
UTF8_TRAILING[9] == 0
i == 10
UTF8_TRAILING[10] == 0
i == 11
UTF8_TRAILING[11] == 0
i == 12
UTF8_TRAILING[12] == 0
i == 13
UTF8_TRAILING[13] == 0
i == 14
UTF8_TRAILING[14] == 0
i == 15
UTF8_TRAILING[15] == 0
i == 16
UTF8_TRAILING[16] == 0
i == 17
UTF8_TRAILING[17] == 0
i == 18
UTF8_TRAILING[18] == 0
i == 19
UTF8_TRAILING[19] == 0
i == 20
UTF8_TRAILING[20] == 0
i == 21
UTF8_TRAILING[21] == 0
i == 22
UTF8_TRAILING[22] == 0
i == 23
UTF8_TRAILING[23] == 0
i == 24
UTF8_TRAILING[24] == 0
i == 25
UTF8_TRAILING[25] == 0
i == 26
UTF8_TRAILING[26] == 0
i == 27
UTF8_TRAILING[27] == 0
i == 28
UTF8_TRAILING[28] == 0
i == 29
UTF8_TRAILING[29] == 0
i == 30
UTF8_TRAILING[30] == 0
i == 31
UTF8_TRAILING[31] == 0
i == 32
UTF8_TRAILING[32] == 0
i == 33
UTF8_TRAILING[33] == 0
i == 34
UTF8_TRAILING[34] == 0
i == 35
UTF8_TRAILING[35] == 0
i == 36
UTF8_TRAILING[36] == 0
i == 37
UTF8_TRAILING[37] == 0
i == 38
UTF8_TRAILING[38] == 0
i == 39
UTF8_TRAILING[39] == 0
i == 40
UTF8_TRAILING[40] == 0
i == 41
UTF8_TRAILING[41] == 0
i == 42
UTF8_TRAILING[42] == 0
i == 43
UTF8_TRAILING[43] == 0
i == 44
UTF8_TRAILING[44] == 0
i == 45
UTF8_TRAILING[45] == 0
i == 46
UTF8_TRAILING[46] == 0
i == 47
UTF8_TRAILING[47] == 0
i == 48
UTF8_TRAILING[48] == 0
i == 49
UTF8_TRAILING[49] == 0
i == 50
UTF8_TRAILING[50] == 0
i == 51
UTF8_TRAILING[51] == 0
i == 52
UTF8_TRAILING[52] == 0
i == 53
UTF8_TRAILING[53] == 0
i == 54
UTF8_TRAILING[54] == 0
i == 55
UTF8_TRAILING[55] == 0
i == 56
UTF8_TRAILING[56] == 0
i == 57
UTF8_TRAILING[57] == 0
i == 58
UTF8_TRAILING[58] == 0
i == 59
UTF8_TRAILING[59] == 0
i == 60
UTF8_TRAILING[60] == 0
i == 61
UTF8_TRAILING[61] == 0
i == 62
UTF8_TRAILING[62] == 0
i == 63
UTF8_TRAILING[63] == 0
i == 64
UTF8_TRAILING[64] == 0
i == 65
UTF8_TRAILING[65] == 0
i == 66
UTF8_TRAILING[66] == 0
i == 67
UTF8_TRAILING[67] == 0
i == 68
UTF8_TRAILING[68] == 0
i == 69
UTF8_TRAILING[69] == 0
i == 70
UTF8_TRAILING[70] == 0
i == 71
UTF8_TRAILING[71] == 0
i == 72
UTF8_TRAILING[72] == 0
i == 73
UTF8_TRAILING[73] == 0
i == 74
UTF8_TRAILING[74] == 0
i == 75
UTF8_TRAILING[75] == 0
i == 76
UTF8_TRAILING[76] == 0
i == 77
UTF8_TRAILING[77] == 0
i == 78
UTF8_TRAILING[78] == 0
i == 79
UTF8_TRAILING[79] == 0
i == 80
UTF8_TRAILING[80] == 0
i == 81
UTF8_TRAILING[81] == 0
i == 82
UTF8_TRAILING[82] == 0
i == 83
UTF8_TRAILING[83] == 0
i == 84
UTF8_TRAILING[84] == 0
i == 85
UTF8_TRAILING[85] == 0
i == 86
UTF8_TRAILING[86] == 0
i == 87
UTF8_TRAILING[87] == 0
i == 88
UTF8_TRAILING[88] == 0
i == 89
UTF8_TRAILING[89] == 0
i == 90
UTF8_TRAILING[90] == 0
i == 91
UTF8_TRAILING[91] == 0
i == 92
UTF8_TRAILING[92] == 0
i == 93
UTF8_TRAILING[93] == 0
i == 94
UTF8_TRAILING[94] == 0
i == 95
UTF8_TRAILING[95] == 0
i == 96
UTF8_TRAILING[96] == 0
i == 97
UTF8_TRAILING[97] == 0
i == 98
UTF8_TRAILING[98] == 0
i == 99
UTF8_TRAILING[99] == 0
i == 100
UTF8_TRAILING[100] == 0
i == 101
UTF8_TRAILING[101] == 0
i == 102
UTF8_TRAILING[102] == 0
i == 103
UTF8_TRAILING[103] == 0
i == 104
UTF8_TRAILING[104] == 0
i == 105
UTF8_TRAILING[105] == 0
i == 106
UTF8_TRAILING[106] == 0
i == 107
UTF8_TRAILING[107] == 0
i == 108
UTF8_TRAILING[108] == 0
i == 109
UTF8_TRAILING[109] == 0
i == 110
UTF8_TRAILING[110] == 0
i == 111
UTF8_TRAILING[111] == 0
i == 112
UTF8_TRAILING[112] == 0
i == 113
UTF8_TRAILING[113] == 0
i == 114
UTF8_TRAILING[114] == 0
i == 115
UTF8_TRAILING[115] == 0
i == 116
UTF8_TRAILING[116] == 0
i == 117
UTF8_TRAILING[117] == 0
i == 118
UTF8_TRAILING[118] == 0
i == 119
UTF8_TRAILING[119] == 0
i == 120
UTF8_TRAILING[120] == 0
i == 121
UTF8_TRAILING[121] == 0
i == 122
UTF8_TRAILING[122] == 0
i == 123
UTF8_TRAILING[123] == 0
i == 124
UTF8_TRAILING[124] == 0
i == 125
UTF8_TRAILING[125] == 0
i == 126
UTF8_TRAILING[126] == 0
i == 127
UTF8_TRAILING[127] == 0
i == 128
UTF8_TRAILING[128] == 7
i == 129
UTF8_TRAILING[129] == 7
i == 130
UTF8_TRAILING[130] == 7
i == 131
UTF8_TRAILING[131] == 7
i == 132
UTF8_TRAILING[132] == 7
i == 133
UTF8_TRAILING[133] == 7
i == 134
UTF8_TRAILING[134] == 7
i == 135
UTF8_TRAILING[135] == 7
i == 136
UTF8_TRAILING[136] == 7
i == 137
UTF8_TRAILING[137] == 7
i == 138
UTF8_TRAILING[138] == 7
i == 139
UTF8_TRAILING[139] == 7
i == 140
UTF8_TRAILING[140] == 7
i == 141
UTF8_TRAILING[141] == 7
i == 142
UTF8_TRAILING[142] == 7
i == 143
UTF8_TRAILING[143] == 7
i == 144
UTF8_TRAILING[144] == 7
i == 145
UTF8_TRAILING[145] == 7
i == 146
UTF8_TRAILING[146] == 7
i == 147
UTF8_TRAILING[147] == 7
i == 148
UTF8_TRAILING[148] == 7
i == 149
UTF8_TRAILING[149] == 7
i == 150
UTF8_TRAILING[150] == 7
i == 151
UTF8_TRAILING[151] == 7
i == 152
UTF8_TRAILING[152] == 7
i == 153
UTF8_TRAILING[153] == 7
i == 154
UTF8_TRAILING[154] == 7
i == 155
UTF8_TRAILING[155] == 7
i == 156
UTF8_TRAILING[156] == 7
i == 157
UTF8_TRAILING[157] == 7
i == 158
UTF8_TRAILING[158] == 7
i == 159
UTF8_TRAILING[159] == 7
i == 160
UTF8_TRAILING[160] == 7
i == 161
UTF8_TRAILING[161] == 7
i == 162
UTF8_TRAILING[162] == 7
i == 163
UTF8_TRAILING[163] == 7
i == 164
UTF8_TRAILING[164] == 7
i == 165
UTF8_TRAILING[165] == 7
i == 166
UTF8_TRAILING[166] == 7
i == 167
UTF8_TRAILING[167] == 7
i == 168
UTF8_TRAILING[168] == 7
i == 169
UTF8_TRAILING[169] == 7
i == 170
UTF8_TRAILING[170] == 7
i == 171
UTF8_TRAILING[171] == 7
i == 172
UTF8_TRAILING[172] == 7
i == 173
UTF8_TRAILING[173] == 7
i == 174
UTF8_TRAILING[174] == 7
i == 175
UTF8_TRAILING[175] == 7
i == 176
UTF8_TRAILING[176] == 7
i == 177
UTF8_TRAILING[177] == 7
i == 178
UTF8_TRAILING[178] == 7
i == 179
UTF8_TRAILING[179] == 7
i == 180
UTF8_TRAILING[180] == 7
i == 181
UTF8_TRAILING[181] == 7
i == 182
UTF8_TRAILING[182] == 7
i == 183
UTF8_TRAILING[183] == 7
i == 184
UTF8_TRAILING[184] == 7
i == 185
UTF8_TRAILING[185] == 7
i == 186
UTF8_TRAILING[186] == 7
i == 187
UTF8_TRAILING[187] == 7
i == 188
UTF8_TRAILING[188] == 7
i == 189
UTF8_TRAILING[189] == 7
i == 190
UTF8_TRAILING[190] == 7
i == 191
UTF8_TRAILING[191] == 7
i == 192
UTF8_TRAILING[192] == 7
i == 193
UTF8_TRAILING[193] == 1
i == 194
UTF8_TRAILING[194] == 1
i == 195
UTF8_TRAILING[195] == 1
i == 196
UTF8_TRAILING[196] == 1
i == 197
UTF8_TRAILING[197] == 1
i == 198
UTF8_TRAILING[198] == 1
i == 199
UTF8_TRAILING[199] == 1
i == 200
UTF8_TRAILING[200] == 1
i == 201
UTF8_TRAILING[201] == 1
i == 202
UTF8_TRAILING[202] == 1
i == 203
UTF8_TRAILING[203] == 1
i == 204
UTF8_TRAILING[204] == 1
i == 205
UTF8_TRAILING[205] == 1
i == 206
UTF8_TRAILING[206] == 1
i == 207
UTF8_TRAILING[207] == 1
i == 208
UTF8_TRAILING[208] == 1
i == 209
UTF8_TRAILING[209] == 1
i == 210
UTF8_TRAILING[210] == 1
i == 211
UTF8_TRAILING[211] == 1
i == 212
UTF8_TRAILING[212] == 1
i == 213
UTF8_TRAILING[213] == 1
i == 214
UTF8_TRAILING[214] == 1
i == 215
UTF8_TRAILING[215] == 1
i == 216
UTF8_TRAILING[216] == 1
i == 217
UTF8_TRAILING[217] == 1
i == 218
UTF8_TRAILING[218] == 1
i == 219
UTF8_TRAILING[219] == 1
i == 220
UTF8_TRAILING[220] == 1
i == 221
UTF8_TRAILING[221] == 1
i == 222
UTF8_TRAILING[222] == 1
i == 223
UTF8_TRAILING[223] == 1
i == 224
UTF8_TRAILING[224] == 2
i == 225
UTF8_TRAILING[225] == 2
i == 226
UTF8_TRAILING[226] == 2
i == 227
UTF8_TRAILING[227] == 2
i == 228
UTF8_TRAILING[228] == 2
i == 229
UTF8_TRAILING[229] == 2
i == 230
UTF8_TRAILING[230] == 2
i == 231
UTF8_TRAILING[231] == 2
i == 232
UTF8_TRAILING[232] == 2
i == 233
UTF8_TRAILING[233] == 2
i == 234
UTF8_TRAILING[234] == 2
i == 235
UTF8_TRAILING[235] == 2
i == 236
UTF8_TRAILING[236] == 2
i == 237
UTF8_TRAILING[237] == 2
i == 238
UTF8_TRAILING[238] == 2
i == 239
UTF8_TRAILING[239] == 2
i == 240
UTF8_TRAILING[240] == 3
i == 241
UTF8_TRAILING[241] == 3
i == 242
UTF8_TRAILING[242] == 3
i == 243
UTF8_TRAILING[243] == 3
i == 244
UTF8_TRAILING[244] == 3
i == 245
UTF8_TRAILING[245] == 7
i == 246
UTF8_TRAILING[246] == 7
i == 247
UTF8_TRAILING[247] == 7
i == 248
UTF8_TRAILING[248] == 7
i == 249
UTF8_TRAILING[249] == 7
i == 250
UTF8_TRAILING[250] == 7
i == 251
UTF8_TRAILING[251] == 7
i == 252
UTF8_TRAILING[252] == 7
i == 253
UTF8_TRAILING[253] == 7
i == 254
UTF8_TRAILING[254] == 7
i == 0
UTF8_SKIP[0] == 1
UTF8_TRAILING[0] == 0
finally: i == 255

Yes, when I output both _SKIP and _TRAILING in the same loop, the 3rd loop just
exits after one iteration.


-- 
Peter Karman  .  http://peknet.com/  .  peter@peknet.com

Re: 64-bit linux errors with t/core/032-string_helper.t

Posted by Peter Karman <pe...@peknet.com>.
Marvin Humphrey wrote on 01/19/2010 01:35 PM:

> I think the next step is to isolate the behavior of this platform in a minimal
> test app.  

Here's another one.

#include <stdio.h>

typedef unsigned char u8_t;

const u8_t StrHelp_UTF8_SKIP[] = {
    1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
    1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
    1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
    1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
    1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
    1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
    1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
    3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,4,4,4,4,4,4,4,4,5,5,5,5,6,6,7,7
};

const u8_t StrHelp_UTF8_TRAILING[] = {
    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
    7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,
    7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,
    7,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
    2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,3,3,3,3,3,7,7,7,7,7,7,7,7,7,7,7
};

int main() {
    u8_t i, max;

    for (i=127, max=255; i<max; i++) {
        printf("[%d] TRAILING[%d] SKIP[%d]\n",
               i, StrHelp_UTF8_TRAILING[i], StrHelp_UTF8_SKIP[i]);
    }
}

// eof

output:
[127] TRAILING[0] SKIP[1]
[128] TRAILING[7] SKIP[1]
[129] TRAILING[7] SKIP[1]
[130] TRAILING[7] SKIP[1]
[131] TRAILING[7] SKIP[1]
[132] TRAILING[7] SKIP[1]
[133] TRAILING[7] SKIP[1]
[134] TRAILING[7] SKIP[1]
[135] TRAILING[7] SKIP[1]
[136] TRAILING[7] SKIP[1]
[137] TRAILING[7] SKIP[1]
[138] TRAILING[7] SKIP[1]
[139] TRAILING[7] SKIP[1]
[140] TRAILING[7] SKIP[1]
[141] TRAILING[7] SKIP[1]
[142] TRAILING[7] SKIP[1]
[143] TRAILING[7] SKIP[1]
[144] TRAILING[7] SKIP[1]
[145] TRAILING[7] SKIP[1]
[146] TRAILING[7] SKIP[1]
[147] TRAILING[7] SKIP[1]
[148] TRAILING[7] SKIP[1]
[149] TRAILING[7] SKIP[1]
[150] TRAILING[7] SKIP[1]
[151] TRAILING[7] SKIP[1]
[152] TRAILING[7] SKIP[1]
[153] TRAILING[7] SKIP[1]
[154] TRAILING[7] SKIP[1]
[155] TRAILING[7] SKIP[1]
[156] TRAILING[7] SKIP[1]
[157] TRAILING[7] SKIP[1]
[158] TRAILING[7] SKIP[1]
[159] TRAILING[7] SKIP[1]
[160] TRAILING[7] SKIP[1]
[161] TRAILING[7] SKIP[1]
[162] TRAILING[7] SKIP[1]
[163] TRAILING[7] SKIP[1]
[164] TRAILING[7] SKIP[1]
[165] TRAILING[7] SKIP[1]
[166] TRAILING[7] SKIP[1]
[167] TRAILING[7] SKIP[1]
[168] TRAILING[7] SKIP[1]
[169] TRAILING[7] SKIP[1]
[170] TRAILING[7] SKIP[1]
[171] TRAILING[7] SKIP[1]
[172] TRAILING[7] SKIP[1]
[173] TRAILING[7] SKIP[1]
[174] TRAILING[7] SKIP[1]
[175] TRAILING[7] SKIP[1]
[176] TRAILING[7] SKIP[1]
[177] TRAILING[7] SKIP[1]
[178] TRAILING[7] SKIP[1]
[179] TRAILING[7] SKIP[1]
[180] TRAILING[7] SKIP[1]
[181] TRAILING[7] SKIP[1]
[182] TRAILING[7] SKIP[1]
[183] TRAILING[7] SKIP[1]
[184] TRAILING[7] SKIP[1]
[185] TRAILING[7] SKIP[1]
[186] TRAILING[7] SKIP[1]
[187] TRAILING[7] SKIP[1]
[188] TRAILING[7] SKIP[1]
[189] TRAILING[7] SKIP[1]
[190] TRAILING[7] SKIP[1]
[191] TRAILING[7] SKIP[1]
[192] TRAILING[7] SKIP[1]
[193] TRAILING[1] SKIP[2]
[194] TRAILING[1] SKIP[2]
[195] TRAILING[1] SKIP[2]
[196] TRAILING[1] SKIP[2]
[197] TRAILING[1] SKIP[2]
[198] TRAILING[1] SKIP[2]
[199] TRAILING[1] SKIP[2]
[200] TRAILING[1] SKIP[2]
[201] TRAILING[1] SKIP[2]
[202] TRAILING[1] SKIP[2]
[203] TRAILING[1] SKIP[2]
[204] TRAILING[1] SKIP[2]
[205] TRAILING[1] SKIP[2]
[206] TRAILING[1] SKIP[2]
[207] TRAILING[1] SKIP[2]
[208] TRAILING[1] SKIP[2]
[209] TRAILING[1] SKIP[2]
[210] TRAILING[1] SKIP[2]
[211] TRAILING[1] SKIP[2]
[212] TRAILING[1] SKIP[2]
[213] TRAILING[1] SKIP[2]
[214] TRAILING[1] SKIP[2]
[215] TRAILING[1] SKIP[2]
[216] TRAILING[1] SKIP[2]
[217] TRAILING[1] SKIP[2]
[218] TRAILING[1] SKIP[2]
[219] TRAILING[1] SKIP[2]
[220] TRAILING[1] SKIP[2]
[221] TRAILING[1] SKIP[2]
[222] TRAILING[1] SKIP[2]
[223] TRAILING[1] SKIP[2]
[224] TRAILING[2] SKIP[3]
[225] TRAILING[2] SKIP[3]
[226] TRAILING[2] SKIP[3]
[227] TRAILING[2] SKIP[3]
[228] TRAILING[2] SKIP[3]
[229] TRAILING[2] SKIP[3]
[230] TRAILING[2] SKIP[3]
[231] TRAILING[2] SKIP[3]
[232] TRAILING[2] SKIP[3]
[233] TRAILING[2] SKIP[3]
[234] TRAILING[2] SKIP[3]
[235] TRAILING[2] SKIP[3]
[236] TRAILING[2] SKIP[3]
[237] TRAILING[2] SKIP[3]
[238] TRAILING[2] SKIP[3]
[239] TRAILING[2] SKIP[3]
[240] TRAILING[3] SKIP[4]
[241] TRAILING[3] SKIP[4]
[242] TRAILING[3] SKIP[4]
[243] TRAILING[3] SKIP[4]
[244] TRAILING[3] SKIP[4]
[245] TRAILING[7] SKIP[4]
[246] TRAILING[7] SKIP[4]
[247] TRAILING[7] SKIP[4]
[248] TRAILING[7] SKIP[5]
[249] TRAILING[7] SKIP[5]
[250] TRAILING[7] SKIP[5]
[251] TRAILING[7] SKIP[5]
[252] TRAILING[7] SKIP[6]
[253] TRAILING[7] SKIP[6]
[254] TRAILING[7] SKIP[7]

-- 
Peter Karman  .  http://peknet.com/  .  peter@peknet.com

Re: 64-bit linux errors with t/core/032-string_helper.t

Posted by Marvin Humphrey <ma...@rectangular.com>.
On Wed, Jan 20, 2010 at 12:25:01PM -0600, Peter Karman wrote:
> Marvin Humphrey wrote on 01/20/2010 11:22 AM:
> 
> > Do you have valgrind on these boxen?  It would be interesting to see valgrind
> > analysis of the small program with and without -O2.  However, I'm not sure
> > we'll see anything -- Valgrind's ability to detect stack corruption is
> > limited:
> 
> http://peknet.com/~karpet/char-with-O2.txt
> http://peknet.com/~karpet/char-without-O2.txt

For the email archive record... there's almost no difference between these two
files.  Valgrind wasn't able to help.

Marvin Humphrey


Re: 64-bit linux errors with t/core/032-string_helper.t

Posted by Peter Karman <pe...@peknet.com>.
Marvin Humphrey wrote on 01/20/2010 11:22 AM:

> Do you have valgrind on these boxen?  It would be interesting to see valgrind
> analysis of the small program with and without -O2.  However, I'm not sure
> we'll see anything -- Valgrind's ability to detect stack corruption is
> limited:

http://peknet.com/~karpet/char-with-O2.txt
http://peknet.com/~karpet/char-without-O2.txt

-- 
Peter Karman  .  http://peknet.com/  .  peter@peknet.com

Re: [Lucy] Re: 64-bit linux errors with t/core/032-string_helper.t

Posted by Marvin Humphrey <ma...@rectangular.com>.
On Wed, Jan 20, 2010 at 08:56:35AM -0600, Peter Karman wrote:

> although changing the main() function to this, and compiling with -O2,
> still works ok, which makes me think Marvin's comment about the variadic
> function printf() might be on target.

Well, that seems to demonstrate that it's variadic argument passing
misbehaving under -O2, but it's still misbehaving.  I don't see anything in
the C code that's invalid.  Kudos to Nate if he spots something I've missed.

Do you have valgrind on these boxen?  It would be interesting to see valgrind
analysis of the small program with and without -O2.  However, I'm not sure
we'll see anything -- Valgrind's ability to detect stack corruption is
limited:

    http://www.valgrind.org/docs/manual/faq.html#faq.overruns

Marvin Humphrey


Re: 64-bit linux errors with t/core/032-string_helper.t

Posted by Peter Karman <pe...@peknet.com>.
Peter Karman wrote on 01/20/2010 01:46 PM:
> Nathan Kurz wrote on 01/20/2010 01:35 PM:
> 
>> If we are seeing success with GCC 4.2.4 and failure with GCC 3.4.6,
>> and nothing else obviously wrong, this would seem to indicate a
>> compiler bug or library bug.   Are there other GCC versions that have
>> shown problems?
> 
> No other compilers I know of.
> 
> I'm currently downloading gcc 3.4.6 (the last of the 3.4.x releases
> fwiw, from 2006), and will try duplicating the problem on a different
> platform with the same compiler version.
> 
> 

gcc 3.2.3 on the same platform works ok. So I think we can safely call
it a compiler bug.

-- 
Peter Karman  .  http://peknet.com/  .  peter@peknet.com

Re: 64-bit linux errors with t/core/032-string_helper.t

Posted by Peter Karman <pe...@peknet.com>.
Nathan Kurz wrote on 01/20/2010 01:35 PM:

> If we are seeing success with GCC 4.2.4 and failure with GCC 3.4.6,
> and nothing else obviously wrong, this would seem to indicate a
> compiler bug or library bug.   Are there other GCC versions that have
> shown problems?

No other compilers I know of.

I'm currently downloading gcc 3.4.6 (the last of the 3.4.x releases
fwiw, from 2006), and will try duplicating the problem on a different
platform with the same compiler version.


-- 
Peter Karman  .  http://peknet.com/  .  peter@peknet.com

Re: [KinoSearch] 64-bit linux errors with t/core/032-string_helper.t

Posted by Nathan Kurz <na...@verse.com>.
On Wed, Jan 20, 2010 at 5:43 AM, Peter Karman <pe...@peknet.com> wrote:
> Nathan Kurz wrote on 1/20/10 1:35 AM:
>
>> Could you attach your failing standalone test case so I can take a
>> quick look at it?  I tried the inline one above, but saw nothing
>> strange with GCC 4.2.4 on Slamd64.
>>
>
> http://rectangular.com/pipermail/kinosearch/2010-January/007228.html

That's the one I tried, but I didn't see anything odd when running it
with or without -O2.   The symptom of failure would be the loop
skipping directly from i = 0 to i = 255?

If we are seeing success with GCC 4.2.4 and failure with GCC 3.4.6,
and nothing else obviously wrong, this would seem to indicate a
compiler bug or library bug.   Are there other GCC versions that have
shown problems?

Nathan Kurz
nate@verse.com

Re: [Lucy] Re: 64-bit linux errors with t/core/032-string_helper.t

Posted by Peter Karman <pe...@peknet.com>.
Peter Karman wrote on 01/20/2010 07:43 AM:
> Nathan Kurz wrote on 1/20/10 1:35 AM:
> 
>> Could you attach your failing standalone test case so I can take a
>> quick look at it?  I tried the inline one above, but saw nothing
>> strange with GCC 4.2.4 on Slamd64.
>>
> 
> http://rectangular.com/pipermail/kinosearch/2010-January/007228.html
> 

although changing the main() function to this, and compiling with -O2,
still works ok, which makes me think Marvin's comment about the variadic
function printf() might be on target.

int main() {
    u8_t i, max;
    int j;
    for (j=0, i=0, max=255; i<max; i++, j++) {
        printf("[%d] TRAILING[%d] SKIP[%d]\n",
               j,
               (int)StrHelp_UTF8_TRAILING[i],
               (int)StrHelp_UTF8_SKIP[i]
        );
    }
}


-- 
Peter Karman  .  http://peknet.com/  .  peter@peknet.com

Re: 64-bit linux errors with t/core/032-string_helper.t

Posted by Peter Karman <pe...@peknet.com>.
Nathan Kurz wrote on 1/20/10 1:35 AM:

> Could you attach your failing standalone test case so I can take a
> quick look at it?  I tried the inline one above, but saw nothing
> strange with GCC 4.2.4 on Slamd64.
> 

http://rectangular.com/pipermail/kinosearch/2010-January/007228.html

-- 
Peter Karman  .  http://peknet.com/  .  peter@peknet.com

Re: [KinoSearch] 64-bit linux errors with t/core/032-string_helper.t

Posted by Nathan Kurz <na...@verse.com>.
On Tue, Jan 19, 2010 at 10:39 PM, Marvin Humphrey
<ma...@rectangular.com> wrote:
> On Tue, Jan 19, 2010 at 11:33:00PM -0600, Peter Karman wrote:
>
>> >Was this standalone file compiled with the same set of flags?  Maybe
>> >there's some aggressive optimizer gone whack?
>>
>> bingo.
>>
>> When I compiled with -O2 I got the standalone to break.
>
> Rokk. :)
>
> I think we stop here.  It's not worth working around this.  The test will
> prevent silent failure, and if people care about compiling on systems that
> exhibit the bug, the workaround of disabling optimization is straightforward.

Marvin might be right, but inexplicable bugs have a nasty way of
biting back.  Yes, there are real GCC bugs, but more often than not
there's something else going on.

Could you attach your failing standalone test case so I can take a
quick look at it?  I tried the inline one above, but saw nothing
strange with GCC 4.2.4 on Slamd64.

Nathan Kurz
nate@verse.com

Re: 64-bit linux errors with t/core/032-string_helper.t

Posted by Marvin Humphrey <ma...@rectangular.com>.
On Tue, Jan 19, 2010 at 11:33:00PM -0600, Peter Karman wrote:

> >Was this standalone file compiled with the same set of flags?  Maybe
> >there's some aggressive optimizer gone whack?
> 
> bingo.
> 
> When I compiled with -O2 I got the standalone to break.

Rokk. :)

I think we stop here.  It's not worth working around this.  The test will
prevent silent failure, and if people care about compiling on systems that
exhibit the bug, the workaround of disabling optimization is straightforward.

Marvin Humphrey


Re: 64-bit linux errors with t/core/032-string_helper.t

Posted by Peter Karman <pe...@peknet.com>.
Marvin Humphrey wrote on 1/19/10 11:17 PM:

> Was this standalone file compiled with the same set of flags?  Maybe there's
> some aggressive optimizer gone whack?
> 

bingo.

When I compiled with -O2 I got the standalone to break.


>> I'm going to go drink a beer and try not to think about this madness for 
>> awhile and hope that the answer just comes to me in my sleep.
> 
> Can we tag out?  If you can get me an account on one of these boxen, or if we
> can duplicate this behavior on an Amazon EC2 instance, I'd like to take a
> crack at it.

I'll work on that and mail you offlist.


-- 
Peter Karman  .  http://peknet.com/  .  peter@peknet.com

Re: 64-bit linux errors with t/core/032-string_helper.t

Posted by Marvin Humphrey <ma...@rectangular.com>.
On Tue, Jan 19, 2010 at 09:13:35PM -0600, Peter Karman wrote:

> Notice that 'i' just skips straight from 0 to 255.

That's certainly puzzling and troublesome.  It sounds like a stack corruption
bug -- as in the memory where the stack variable "i" resides is being
overwritten somehow.  I don't suppose valgrind turns anything up...

> When I comment out either of the UTF8_*[..] calls, then it works fine. It's 
> the combination of the two that causes the problem.

Oi.  Shades of that damn NetBSD nightmare...

> So there are no macros to affect that _local() function, and no vararg 
> oddities.

Well, don't shoot me but StrHelp_UTF8_TRAILING is actually a macro alias for
kino_StrHelp_UTF8_TRAILING, and printf is a variadic function so char arguments
such as StrHelp_UTF8_TRAILING[i] automatically get promoted to int.  We should
try to remove all variadic functions and all macro expansions from the
equation.  

Clearly stuff like that *shouldn't* make a difference, but our local char
array works, and there's not much difference between what's happening there
and what's happening in these tests.

> However, when I put the same code into a standalone file and run it, it 
> works (see the test app I sent earlier in this thread with the UTF8 arrays 
> hardcoded).

Was this standalone file compiled with the same set of flags?  Maybe there's
some aggressive optimizer gone whack?

> I'm going to go drink a beer and try not to think about this madness for 
> awhile and hope that the answer just comes to me in my sleep.

Can we tag out?  If you can get me an account on one of these boxen, or if we
can duplicate this behavior on an Amazon EC2 instance, I'd like to take a
crack at it.

Marvin Humphrey


Re: 64-bit linux errors with t/core/032-string_helper.t

Posted by Peter Karman <pe...@peknet.com>.
Marvin Humphrey wrote on 1/19/10 2:45 PM:
> On Tue, Jan 19, 2010 at 02:14:17PM -0600, Peter Karman wrote:
> 
>> I do get the same result.
> 
> Good, 'cause working around that class of bug would have been a huge PITA -- I
> threw up my hands at that NetBSD bug.
> 
> Maybe it's macro expansion weirdness?  Try changing "ASSERT_INT_EQ" to
> "kino_TestBatch_int_equals" and "StrHelp_UTF8_SKIP" to
> "kino_StrHelp_UTF8_SKIP".
> 
> I'm also not totally satisfied that we've ruled out vararg argument passing.
> What if we do something like this?
> 
>     bool_t condition = StrHelp_UTF8_SKIP[i] == 1 ? true : false;
>     ASSERT_TRUE(batch, condition, "UTF8_SKIP ascii %d", (int)i);
> 

I added a new function to the core/KinoSearch/Test/Util/TestStringHelper.c file 
as below:

static void
test_SKIP_and_TRAILING_local()
{
     u8_t i, max;

     for (i=0, max=255; i < max; i++) {
         printf("i == %d\n", i);
         printf("UTF8_SKIP[%d] == %d\n", i, StrHelp_UTF8_SKIP[i]);
         printf("i == %d\n", i);
         printf("UTF8_TRAILING[%d] == %d\n", i, StrHelp_UTF8_TRAILING[i]);
         printf("i == %d size %d\n", i, (int)sizeof(i));
     }
     printf("finally: i == %d\n", i);
}

output:

$ perl -Mblib t/core/032-string_helper.t
<...snip regular test output...>
i == 0
UTF8_SKIP[0] == 1
i == 0
UTF8_TRAILING[0] == 0
i == 0 size 1
finally: i == 255

Notice that 'i' just skips straight from 0 to 255.

When I comment out either of the UTF8_*[..] calls, then it works fine. It's the 
combination of the two that causes the problem.

So there are no macros to affect that _local() function, and no vararg oddities.

However, when I put the same code into a standalone file and run it, it works 
(see the test app I sent earlier in this thread with the UTF8 arrays hardcoded).

I have tried this on two different RHEL 4 boxes, both with gcc version 3.4.6 
20060404 (Red Hat 3.4.6-3).

I'm going to go drink a beer and try not to think about this madness for awhile 
and hope that the answer just comes to me in my sleep.
-- 
Peter Karman  .  http://peknet.com/  .  peter@peknet.com

Re: 64-bit linux errors with t/core/032-string_helper.t

Posted by Marvin Humphrey <ma...@rectangular.com>.
On Tue, Jan 19, 2010 at 02:14:17PM -0600, Peter Karman wrote:

> I do get the same result.

Good, 'cause working around that class of bug would have been a huge PITA -- I
threw up my hands at that NetBSD bug.

Maybe it's macro expansion weirdness?  Try changing "ASSERT_INT_EQ" to
"kino_TestBatch_int_equals" and "StrHelp_UTF8_SKIP" to
"kino_StrHelp_UTF8_SKIP".

I'm also not totally satisfied that we've ruled out vararg argument passing.
What if we do something like this?

    bool_t condition = StrHelp_UTF8_SKIP[i] == 1 ? true : false;
    ASSERT_TRUE(batch, condition, "UTF8_SKIP ascii %d", (int)i);

Marvin Humphrey


Re: 64-bit linux errors with t/core/032-string_helper.t

Posted by Peter Karman <pe...@peknet.com>.
Marvin Humphrey wrote on 01/19/2010 01:51 PM:
> On Tue, Jan 19, 2010 at 01:37:12PM -0600, Peter Karman wrote:
> 
>> ok - 0 == 0
>> ok - 1 == 1
>> ok - 100 == 100
>> ok - 126 == 126
>> ok - 127 == 127
>> ok - 128 == 128
>> ok - 129 == 129
>> ok - 250 == 250
>> ok - 254 == 254
>> ok - 255 == 255
> 
> Whew.  If that's the case, we ought to be able to make this work.
> 
> If you dump that same code into the TestStringHelper.c file, do you get the
> same result?  Checking for compiler flag weirdness or bizarre interactions
> from include libraries... (KinoSearch's tests once tickled a NetBSD bug that
> changed how doubles and long longs were converted to each other when a
> particular library was included...)
> 

I do get the same result.

Patch below (in case I messed it up):


Index: core/KinoSearch/Test/Util/TestStringHelper.c
===================================================================
--- core/KinoSearch/Test/Util/TestStringHelper.c        (revision 5710)
+++ core/KinoSearch/Test/Util/TestStringHelper.c        (working copy)
@@ -4,7 +4,20 @@
 #include "KinoSearch/Test/Util/TestStringHelper.h"
 #include "KinoSearch/Util/StringHelper.h"
 
+static unsigned char numbers[256];
+
 static void
+check_subscript(unsigned char subscript)
+{
+    if (subscript == numbers[subscript]) {
+        printf("ok - %d == %d\n", subscript, numbers[subscript]);
+    }
+    else {
+        printf("not ok - %d == %d\n", subscript, numbers[subscript]);
+    }
+}
+
+static void
 test_SKIP_and_TRAILING(TestBatch *batch)
 {
     u8_t i, max;
@@ -48,6 +61,37 @@
         ASSERT_TRUE(batch, StrHelp_UTF8_TRAILING[i] == 7,
             "UTF8_TRAILING bogus but no memory problems %d", (int)i);
     }
+    for (i=0, max=255; i < max; i++) {
+        printf("i == %d\n", i);
+        printf("UTF8_SKIP[%d] == %d\n", i, StrHelp_UTF8_SKIP[i]);
+    }
+    for (i=0, max=255; i < max; i++) {
+        printf("i == %d\n", i);
+        printf("UTF8_TRAILING[%d] == %d\n", i, StrHelp_UTF8_TRAILING[i]);
+    }
+    for (i=0, max=255; i < max; i++) {
+        printf("i == %d\n", i);
+        printf("UTF8_SKIP[%d] == %d\n", i, StrHelp_UTF8_SKIP[i]);
+        printf("UTF8_TRAILING[%d] == %d\n", i, StrHelp_UTF8_TRAILING[i]);
+        printf("i == %d\n", i);
+    }
+    printf("finally: i == %d\n", i);
+
+    int j;
+
+    for (j = 0; j < 256; j++) {
+        numbers[j] = j;
+    }
+    check_subscript(0);
+    check_subscript(1);
+    check_subscript(100);
+    check_subscript(126);
+    check_subscript(127);
+    check_subscript(128);
+    check_subscript(129);
+    check_subscript(250);
+    check_subscript(254);
+    check_subscript(255);
 }

-- 
Peter Karman  .  http://peknet.com/  .  peter@peknet.com

Re: [KinoSearch] 64-bit linux errors with t/core/032-string_helper.t

Posted by Marvin Humphrey <ma...@rectangular.com>.
On Tue, Jan 19, 2010 at 01:37:12PM -0600, Peter Karman wrote:

> ok - 0 == 0
> ok - 1 == 1
> ok - 100 == 100
> ok - 126 == 126
> ok - 127 == 127
> ok - 128 == 128
> ok - 129 == 129
> ok - 250 == 250
> ok - 254 == 254
> ok - 255 == 255

Whew.  If that's the case, we ought to be able to make this work.

If you dump that same code into the TestStringHelper.c file, do you get the
same result?  Checking for compiler flag weirdness or bizarre interactions
from include libraries... (KinoSearch's tests once tickled a NetBSD bug that
changed how doubles and long longs were converted to each other when a
particular library was included...)

Marvin Humphrey


Re: 64-bit linux errors with t/core/032-string_helper.t

Posted by Peter Karman <pe...@peknet.com>.
Marvin Humphrey wrote on 01/19/2010 01:35 PM:

> I think the next step is to isolate the behavior of this platform in a minimal
> test app.  Peter, what output do you see when you run the program below?
> 


ok - 0 == 0
ok - 1 == 1
ok - 100 == 100
ok - 126 == 126
ok - 127 == 127
ok - 128 == 128
ok - 129 == 129
ok - 250 == 250
ok - 254 == 254
ok - 255 == 255


-- 
Peter Karman  .  http://peknet.com/  .  peter@peknet.com

Re: [KinoSearch] 64-bit linux errors with t/core/032-string_helper.t

Posted by Marvin Humphrey <ma...@rectangular.com>.
On Mon, Jan 18, 2010 at 09:53:47PM -0800, Eric Howe wrote:
> On 2010-01-18, at 21:05 , Marvin Humphrey wrote:
> [...]
> > It seems messed up that an unsigned type gets promoted to a negative signed
> > type when used as an array subscript.  You're not supposed to use "char" on
> > its own as an array subscript, because whether "char" is signed or unsigned is
> > implementation defined, but either "signed char" or "unsigned char" are
> > allowed.
> 
> Arrays in C are just convenient notation for pointer arithmetic so array
> indices in C are not char, nor are they signed char nor unsigned char,
> they're probably int. Are you getting tripped up by an implicit "a[(int)i]"
> cast on the index?
> 
> Just a wild guess.

GCC will warn under -Wchar-subscripts (enabled by -Wall) if you use a naked
"char" variable as an array subscript.  It won't warn under either "signed
char" or "unsigned char", because those are both unambiguous.  That's what I
was referring to; it was imprecise of me to use "supposed to" and "allowed"
because it's the compiler rather than the C standard that imposes the
restriction, and it's only a warning not an error.

I'm still not sure exactly what's going on.  If it's just promotion, an int
should be wide enough to hold all the values of an unsigned char, so the
converted value should be the same.  So should values converted to whatever
64-bit type is used internally to perform pointer math on this system.

I think the next step is to isolate the behavior of this platform in a minimal
test app.  Peter, what output do you see when you run the program below?

Marvin Humphrey


#include <stdio.h>

unsigned char numbers[256];

static void
check_subscript(unsigned char subscript)
{
    if (subscript == numbers[subscript]) {
        printf("ok - %d == %d\n", subscript, numbers[subscript]);
    }
    else {
        printf("not ok - %d == %d\n", subscript, numbers[subscript]);
    }
}


int main ()
{
    int i;

    for (i = 0; i < 256; i++) {
        numbers[i] = i;
    }

    check_subscript(0);
    check_subscript(1);
    check_subscript(100);
    check_subscript(126);
    check_subscript(127);
    check_subscript(128);
    check_subscript(129);
    check_subscript(250);
    check_subscript(254);
    check_subscript(255);

    return 0;
}




Re: [KinoSearch] 64-bit linux errors with t/core/032-string_helper.t

Posted by Eric Howe <er...@pieinsky.ca>.
On 2010-01-18, at 21:05 , Marvin Humphrey wrote:
[...]
> It seems messed up that an unsigned type gets promoted to a negative signed
> type when used as an array subscript.  You're not supposed to use "char" on
> its own as an array subscript, because whether "char" is signed or unsigned is
> implementation defined, but either "signed char" or "unsigned char" are
> allowed.

Arrays in C are just convenient notation for pointer arithmetic so array
indices in C are not char, nor are they signed char nor unsigned char,
they're probably int. Are you getting tripped up by an implicit "a[(int)i]"
cast on the index?

Just a wild guess.

Eric Howe
eric@pieinsky.ca


Re: 64-bit linux errors with t/core/032-string_helper.t

Posted by Marvin Humphrey <ma...@rectangular.com>.
On Mon, Jan 18, 2010 at 10:34:46PM -0600, Peter Karman wrote:

> alas, that didn't change the output.

Well, I don't really understand what's going on at this point.

Changing the test to use an int doesn't address the problem, because the whole
point of UTF8_SKIP is to use it on header bytes in a UTF-8 sequence, which
will be unsigned char.

It seems messed up that an unsigned type gets promoted to a negative signed
type when used as an array subscript.  You're not supposed to use "char" on
its own as an array subscript, because whether "char" is signed or unsigned is
implementation defined, but either "signed char" or "unsigned char" are
allowed.

So the next steps are investigatory.  What are the typedefs for u8_t in
charmony.h?  What happens when we swap out u8_t for uint8_t when declaring
UTF8_SKIP in StringHelper.bp?  Does uint8_t work as a subscript?

Marvin Humphrey


Re: 64-bit linux errors with t/core/032-string_helper.t

Posted by Peter Karman <pe...@peknet.com>.
Marvin Humphrey wrote on 1/18/10 5:42 PM:
> On Mon, Jan 18, 2010 at 01:45:19PM -0600, Peter Karman wrote:
>> On 64-bit Centos 5 Linux with Perl 5.8.9 I get several failures for 
>> t/core/032-string_helper.t.
>>
>> The issue seems to be related to the u8_t size of the i and max vars. It's 
>> as if they are being evaluated as signed rather than unsigned. However, 
>> even if I hardcode 'unsigned char' instead of the Charmonized 'u8_t' I 
>> still get the same error (below).
>>
>> Changing to an int fixes it. But it doesn't make any sense to me why an 
>> unsigned char wouldn't work.
> 
> I'll bet that what's actually going wrong is the promotion of the arguments
> from "unsigned char" to "long" as they are being passed to ASSERT_INT_EQ.
> 
> To verify, try changing one of the tests from...
> 
>     ASSERT_INT_EQ(batch, StrHelp_UTF8_SKIP[i], 1, 
>         "UTF8_SKIP ascii %d", (int)i);
> 
> ...to:
> 
>     ASSERT_TRUE(batch, StrHelp_UTF8_SKIP[i] == 1, 
>         "UTF8_SKIP ascii %d", (int)i);
> 

alas, that didn't change the output.


-- 
Peter Karman  .  http://peknet.com/  .  peter@peknet.com

Re: 64-bit linux errors with t/core/032-string_helper.t

Posted by Marvin Humphrey <ma...@rectangular.com>.
On Mon, Jan 18, 2010 at 01:45:19PM -0600, Peter Karman wrote:
> On 64-bit Centos 5 Linux with Perl 5.8.9 I get several failures for 
> t/core/032-string_helper.t.
> 
> The issue seems to be related to the u8_t size of the i and max vars. It's 
> as if they are being evaluated as signed rather than unsigned. However, 
> even if I hardcode 'unsigned char' instead of the Charmonized 'u8_t' I 
> still get the same error (below).
> 
> Changing to an int fixes it. But it doesn't make any sense to me why an 
> unsigned char wouldn't work.

I'll bet that what's actually going wrong is the promotion of the arguments
from "unsigned char" to "long" as they are being passed to ASSERT_INT_EQ.

To verify, try changing one of the tests from...

    ASSERT_INT_EQ(batch, StrHelp_UTF8_SKIP[i], 1, 
        "UTF8_SKIP ascii %d", (int)i);

...to:

    ASSERT_TRUE(batch, StrHelp_UTF8_SKIP[i] == 1, 
        "UTF8_SKIP ascii %d", (int)i);

If that's indeed the problem, I'd prefer not to fix it that way, as it will
cut down on the usefulness of verbose test output and make smoke test reports
harder to analyze.  Instead, we should add the following:

    TEST_I8_EQUALS
    TEST_U8_EQUALS
    TEST_I16_EQUALS
    TEST_U16_EQUALS
    TEST_I32_EQUALS
    TEST_U32_EQUALS
    TEST_I64_EQUALS
    TEST_U64_EQUALS

That's easier now that we're no longer using Charmonizer's test suite.

The choice of "TEST_*" rather than "ASSERT_*" is driven by the fact that
unlike typical xUnit tests, these return false on failure rather than throw
exceptions.
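
One possible shape for a pair of these, as a sketch only.  The TestBatch
struct and the function names below are hypothetical; the real batch type
and the eventual macro signatures are not shown in this thread.  Each
comparison takes arguments of its own exact type, so no widening happens
before the values are compared, and it returns false on failure instead
of throwing.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical minimal batch; the real test batch type may differ. */
typedef struct {
    unsigned num_tests;
    unsigned num_failed;
} TestBatch;

/* Compare two uint8_t values.  On failure, print both values and
 * return false rather than throwing. */
static bool test_u8_equals(TestBatch *batch, uint8_t got, uint8_t expected,
                           const char *label) {
    batch->num_tests++;
    if (got == expected) {
        printf("ok %u - %s\n", batch->num_tests, label);
        return true;
    }
    batch->num_failed++;
    printf("not ok %u - Expected '%u', got '%u'\n    %s\n",
           batch->num_tests, (unsigned)expected, (unsigned)got, label);
    return false;
}

/* Same idea for int8_t; the values are printed as signed ints. */
static bool test_i8_equals(TestBatch *batch, int8_t got, int8_t expected,
                           const char *label) {
    batch->num_tests++;
    if (got == expected) {
        printf("ok %u - %s\n", batch->num_tests, label);
        return true;
    }
    batch->num_failed++;
    printf("not ok %u - Expected '%d', got '%d'\n    %s\n",
           batch->num_tests, (int)expected, (int)got, label);
    return false;
}
```

A TEST_U8_EQUALS macro could then be a thin wrapper around
test_u8_equals, and the verbose "Expected ..., got ..." output is kept
for smoke test reports.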

Marvin Humphrey