Posted to commits@lucy.apache.org by nw...@apache.org on 2011/12/05 21:55:04 UTC
[lucy-commits] svn commit: r1210619 - in /incubator/lucy/branches/LUCY-196-uax-tokenizer:
core/Lucy/Analysis/WordBreak.tab devel/bin/UnicodeTable.pm
devel/bin/gen_word_break_tables.pl devel/conf/rat-excludes
Author: nwellnhof
Date: Mon Dec 5 20:55:03 2011
New Revision: 1210619
URL: http://svn.apache.org/viewvc?rev=1210619&view=rev
Log:
Add a perl script to generate Unicode word break tables
Added:
incubator/lucy/branches/LUCY-196-uax-tokenizer/core/Lucy/Analysis/WordBreak.tab
incubator/lucy/branches/LUCY-196-uax-tokenizer/devel/bin/UnicodeTable.pm
incubator/lucy/branches/LUCY-196-uax-tokenizer/devel/bin/gen_word_break_tables.pl (with props)
Modified:
incubator/lucy/branches/LUCY-196-uax-tokenizer/devel/conf/rat-excludes
Added: incubator/lucy/branches/LUCY-196-uax-tokenizer/core/Lucy/Analysis/WordBreak.tab
URL: http://svn.apache.org/viewvc/incubator/lucy/branches/LUCY-196-uax-tokenizer/core/Lucy/Analysis/WordBreak.tab?rev=1210619&view=auto
==============================================================================
--- incubator/lucy/branches/LUCY-196-uax-tokenizer/core/Lucy/Analysis/WordBreak.tab (added)
+++ incubator/lucy/branches/LUCY-196-uax-tokenizer/core/Lucy/Analysis/WordBreak.tab Mon Dec 5 20:55:03 2011
@@ -0,0 +1,469 @@
+/*
+
+This file is generated with devel/bin/gen_word_break_tables.pl. DO NOT EDIT!
+The contents of this file are derived from the Unicode Character Database,
+version 6.0.0, available from http://www.unicode.org/Public/6.0.0/ucd/.
+The Unicode copyright and permission notice follows.
+
+Copyright (c) 1991-2011 Unicode, Inc. All rights reserved. Distributed under
+the Terms of Use in http://www.unicode.org/copyright.html.
+
+Permission is hereby granted, free of charge, to any person obtaining a copy of
+the Unicode data files and any associated documentation (the "Data Files") or
+Unicode software and any associated documentation (the "Software") to deal in
+the Data Files or Software without restriction, including without limitation
+the rights to use, copy, modify, merge, publish, distribute, and/or sell copies
+of the Data Files or Software, and to permit persons to whom the Data Files or
+Software are furnished to do so, provided that (a) the above copyright
+notice(s) and this permission notice appear with all copies of the Data Files
+or Software, (b) both the above copyright notice(s) and this permission notice
+appear in associated documentation, and (c) there is clear notice in each
+modified Data File or in the Software as well as in the documentation
+associated with the Data File(s) or Software that the data or software has been
+modified.
+
+THE DATA FILES AND SOFTWARE ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OF THIRD
+PARTY RIGHTS. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR HOLDERS INCLUDED IN
+THIS NOTICE BE LIABLE FOR ANY CLAIM, OR ANY SPECIAL INDIRECT OR CONSEQUENTIAL
+DAMAGES, OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS,
+WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING
+OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THE DATA FILES OR
+SOFTWARE.
+
+Except as contained in this notice, the name of a copyright holder shall not be
+used in advertising or otherwise to promote the sale, use or other dealings in
+these Data Files or Software without prior written authorization of the
+copyright holder.
+
+*/
+
+#define WB_TABLE1_SHIFT 6
+#define WB_TABLE1_MASK 63
+#define WB_TABLE1_SIZE 1793
+
+static const uint8_t wb_table1[1793] = {
+ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
+ 17, 17, 17, 19, 20, 21, 22, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23,
+ 24, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23,
+ 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23,
+ 23, 23, 23, 25, 26, 26, 27, 28, 29, 30, 26, 26, 26, 26, 26, 26, 26, 26, 26,
+ 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 31, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 32, 33, 34, 35, 36, 37, 38, 17, 39,
+ 40, 41, 17, 42, 17, 17, 17, 17, 17, 17, 17, 26, 43, 44, 17, 17, 17, 17, 17,
+ 26, 26, 45, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 26, 46, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 47, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 48, 49, 50, 51, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23,
+ 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23,
+ 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23,
+ 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23,
+ 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 52, 23, 23,
+ 23, 23, 23, 23, 23, 53, 54, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 23, 54, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 55
+};
+
+#define WB_TABLE2_SHIFT 3
+#define WB_TABLE2_MASK 7
+#define WB_TABLE2_SIZE 3584
+
+static const uint8_t wb_table2[3584] = {
+ 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 6, 7, 5, 6, 6,
+ 8, 0, 0, 0, 0, 0, 9, 10, 11, 6, 6, 12, 6, 6, 6,
+ 12, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 13, 6,
+ 14, 0, 15, 16, 0, 0, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 18, 19, 20, 21, 6, 6, 22, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 23, 6, 6, 6, 6, 6, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 6, 6, 24, 25, 6, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
+ 0, 5, 6, 6, 6, 12, 26, 5, 6, 6, 6, 6, 27, 28, 17,
+ 17, 17, 17, 29, 30, 0, 6, 6, 6, 8, 31, 0, 32, 33, 17,
+ 34, 6, 6, 6, 6, 6, 35, 17, 17, 3, 36, 37, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 6, 38, 29, 39, 40, 3, 41, 0,
+ 42, 43, 6, 6, 6, 17, 17, 17, 44, 6, 6, 6, 6, 6, 6,
+ 6, 6, 6, 6, 45, 17, 46, 0, 3, 47, 6, 6, 6, 35, 48,
+ 49, 6, 6, 45, 50, 51, 52, 0, 0, 6, 6, 6, 53, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 54, 6, 6, 6, 6, 6, 6, 55, 17, 17, 56, 6,
+ 57, 3, 5, 5, 58, 59, 60, 6, 6, 61, 62, 63, 64, 65, 42,
+ 66, 57, 3, 14, 0, 58, 67, 60, 6, 6, 61, 68, 69, 70, 71,
+ 72, 73, 74, 3, 75, 0, 58, 23, 22, 6, 6, 61, 76, 63, 29,
+ 77, 78, 0, 57, 3, 0, 0, 58, 59, 60, 6, 6, 61, 76, 63,
+ 64, 71, 79, 66, 57, 3, 26, 0, 80, 81, 82, 83, 84, 81, 6,
+ 85, 86, 87, 88, 0, 74, 3, 0, 0, 58, 18, 61, 6, 6, 61,
+ 89, 90, 91, 87, 92, 14, 57, 3, 0, 0, 93, 18, 61, 6, 6,
+ 61, 89, 63, 91, 87, 92, 94, 57, 3, 95, 0, 93, 18, 61, 6,
+ 6, 6, 6, 96, 91, 97, 42, 0, 57, 3, 0, 98, 93, 6, 12,
+ 98, 6, 6, 22, 99, 12, 100, 101, 17, 0, 0, 102, 0, 103, 104,
+ 104, 104, 104, 104, 105, 34, 106, 107, 3, 108, 0, 0, 0, 0, 109,
+ 110, 111, 103, 112, 113, 105, 114, 115, 116, 3, 117, 0, 0, 0, 0,
+ 78, 0, 0, 118, 3, 108, 119, 120, 6, 5, 6, 6, 6, 15, 28,
+ 17, 91, 121, 17, 28, 17, 17, 17, 122, 123, 0, 0, 0, 0, 0,
+ 0, 0, 104, 104, 104, 104, 104, 124, 17, 125, 3, 108, 126, 127, 128,
+ 129, 130, 104, 131, 132, 3, 133, 6, 6, 6, 6, 134, 0, 6, 6,
+ 6, 6, 6, 135, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
+ 82, 12, 82, 6, 6, 6, 6, 6, 82, 6, 6, 6, 6, 82, 12,
+ 82, 6, 12, 6, 6, 6, 6, 6, 6, 6, 82, 6, 6, 6, 6,
+ 6, 6, 6, 6, 136, 0, 0, 0, 0, 6, 6, 0, 0, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 6, 15, 0, 5, 6, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 59, 6, 6,
+ 5, 6, 6, 8, 6, 6, 6, 6, 6, 6, 6, 6, 6, 81, 78,
+ 0, 6, 18, 137, 0, 6, 6, 137, 0, 6, 6, 138, 0, 6, 18,
+ 139, 0, 104, 104, 104, 104, 104, 104, 140, 17, 17, 17, 141, 142, 3,
+ 108, 0, 0, 0, 143, 3, 108, 6, 6, 6, 6, 6, 6, 6, 6,
+ 6, 6, 6, 0, 6, 6, 6, 6, 6, 144, 6, 6, 6, 6, 6,
+ 6, 6, 6, 134, 0, 6, 6, 6, 15, 17, 32, 17, 32, 74, 3,
+ 104, 104, 104, 145, 146, 0, 104, 104, 104, 104, 104, 147, 17, 17, 148,
+ 118, 3, 108, 0, 0, 0, 0, 6, 6, 149, 32, 104, 104, 104, 104,
+ 104, 104, 150, 107, 17, 17, 17, 64, 3, 108, 3, 108, 151, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 152, 6, 6, 6, 6, 6,
+ 153, 17, 152, 154, 3, 108, 0, 155, 32, 0, 156, 6, 6, 6, 56,
+ 157, 3, 108, 6, 6, 6, 6, 45, 17, 32, 0, 6, 6, 6, 6,
+ 153, 17, 17, 0, 3, 158, 3, 47, 6, 6, 6, 134, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 159, 17, 17, 160, 161, 0, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 17, 17, 17, 17, 107, 0, 0, 162,
+ 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
+ 6, 6, 6, 6, 134, 134, 6, 6, 6, 6, 134, 134, 6, 163, 6,
+ 6, 6, 134, 6, 6, 6, 6, 6, 6, 18, 164, 165, 15, 166, 154,
+ 6, 15, 165, 15, 0, 162, 0, 167, 168, 169, 0, 170, 171, 0, 172,
+ 0, 122, 173, 26, 174, 0, 0, 6, 15, 0, 0, 0, 0, 0, 0,
+ 17, 17, 17, 17, 175, 0, 176, 98, 99, 177, 16, 178, 6, 179, 180,
+ 181, 0, 0, 6, 6, 6, 6, 6, 78, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 182, 6, 6, 6, 6, 6, 6, 14, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 6, 6, 6, 6, 6, 12, 6, 6, 6, 6, 6, 12, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 15,
+ 183, 118, 0, 6, 6, 6, 6, 134, 0, 6, 6, 6, 6, 6, 6,
+ 134, 174, 0, 42, 6, 6, 12, 0, 12, 12, 12, 12, 12, 12, 12,
+ 12, 17, 17, 17, 17, 0, 0, 0, 0, 0, 174, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 184, 0, 0, 0, 103, 131,
+ 185, 186, 103, 104, 104, 104, 104, 104, 104, 104, 104, 104, 187, 188, 189,
+ 189, 189, 189, 189, 189, 189, 189, 189, 189, 189, 190, 180, 6, 6, 6,
+ 6, 134, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 12, 0,
+ 0, 6, 6, 6, 8, 0, 0, 0, 0, 0, 0, 189, 189, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 189, 189, 189, 189, 189, 191,
+ 189, 189, 189, 189, 189, 189, 189, 189, 189, 189, 189, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104,
+ 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104,
+ 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104,
+ 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104,
+ 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104,
+ 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104,
+ 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104,
+ 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104,
+ 145, 0, 0, 0, 0, 0, 0, 0, 0, 0, 104, 104, 104, 104, 104,
+ 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104,
+ 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104,
+ 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104,
+ 104, 104, 104, 104, 104, 104, 104, 147, 0, 0, 0, 0, 0, 0, 6,
+ 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
+ 6, 6, 6, 6, 6, 15, 0, 0, 0, 0, 0, 0, 0, 0, 6,
+ 6, 6, 6, 6, 134, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 6, 15, 6, 6, 3, 192, 0, 0,
+ 6, 6, 6, 6, 6, 149, 34, 193, 6, 6, 6, 0, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 118, 0, 0, 0, 174, 6, 98, 6,
+ 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 194, 14, 0, 6,
+ 14, 0, 0, 0, 0, 0, 0, 0, 0, 0, 98, 195, 196, 6, 6,
+ 35, 0, 0, 0, 6, 6, 6, 6, 6, 6, 154, 0, 25, 6, 6,
+ 6, 6, 6, 153, 17, 122, 0, 3, 108, 17, 17, 25, 197, 3, 47,
+ 6, 6, 45, 116, 6, 6, 149, 17, 32, 0, 6, 6, 6, 15, 54,
+ 6, 6, 6, 6, 6, 35, 17, 175, 174, 3, 108, 0, 0, 0, 0,
+ 6, 6, 6, 6, 6, 56, 107, 0, 196, 198, 3, 108, 104, 104, 187,
+ 199, 104, 104, 104, 104, 104, 104, 128, 200, 201, 0, 0, 202, 0, 0,
+ 0, 0, 203, 203, 203, 0, 12, 12, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, 6, 6, 6,
+ 35, 204, 3, 108, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 154, 0, 6, 6,
+ 12, 205, 6, 6, 6, 6, 6, 154, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 104, 104, 104, 104, 104,
+ 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104,
+ 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104,
+ 104, 104, 145, 104, 104, 104, 104, 104, 104, 104, 145, 104, 104, 104, 104,
+ 104, 104, 104, 104, 104, 104, 104, 104, 104, 206, 0, 0, 0, 0, 12,
+ 0, 205, 207, 6, 61, 12, 164, 208, 6, 6, 6, 6, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 14, 0, 0, 0, 205, 6, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 134, 0, 0, 6, 6,
+ 6, 6, 6, 6, 6, 6, 98, 6, 6, 6, 6, 6, 6, 0, 0,
+ 0, 0, 0, 6, 154, 17, 17, 209, 0, 107, 0, 210, 0, 0, 211,
+ 212, 0, 0, 0, 18, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 213, 1, 2, 0, 214, 5, 6, 6, 7,
+ 5, 6, 6, 8, 215, 189, 189, 189, 189, 189, 189, 216, 6, 6, 6,
+ 12, 98, 98, 98, 217, 0, 0, 0, 218, 6, 89, 6, 6, 12, 6,
+ 6, 219, 6, 134, 6, 134, 0, 0, 0, 0, 6, 6, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 8, 0, 0, 0, 0,
+ 0, 0, 0, 0, 6, 6, 6, 6, 6, 6, 15, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 220, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6,
+ 6, 6, 15, 6, 6, 6, 6, 6, 6, 78, 0, 0, 0, 0, 0,
+ 6, 6, 6, 12, 0, 0, 6, 6, 6, 8, 0, 0, 0, 0, 0,
+ 0, 6, 6, 6, 134, 6, 6, 6, 6, 154, 6, 177, 0, 0, 0,
+ 0, 0, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 134, 3, 108, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 134, 61, 6, 6, 6, 6, 23, 221, 6,
+ 6, 134, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 6, 6, 134, 0, 6, 6, 6,
+ 14, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 222, 162, 89, 5, 6,
+ 6, 154, 70, 0, 0, 0, 0, 6, 6, 6, 15, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, 6, 6,
+ 6, 6, 6, 134, 0, 6, 6, 134, 0, 6, 6, 8, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6,
+ 6, 6, 6, 6, 6, 6, 6, 6, 78, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 156, 6, 6, 6, 6, 6, 6, 17, 107, 0, 0, 0,
+ 74, 3, 0, 0, 156, 6, 6, 6, 6, 6, 17, 223, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 6, 6, 6, 6, 6, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 12, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 6, 8, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 6, 6, 6, 6, 6, 12, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 6, 6, 6, 6, 6, 6, 6, 78, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 224, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 225, 226, 17, 17,
+ 227, 32, 0, 0, 0, 228, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 229, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
+ 18, 6, 6, 6, 6, 6, 6, 6, 6, 18, 230, 231, 6, 232, 89,
+ 6, 6, 6, 6, 6, 6, 6, 23, 233, 18, 18, 6, 6, 6, 234,
+ 164, 98, 61, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 134,
+ 6, 6, 6, 61, 6, 6, 235, 6, 6, 6, 235, 6, 6, 18, 6,
+ 6, 6, 18, 6, 6, 12, 6, 6, 6, 12, 6, 6, 6, 61, 6,
+ 6, 6, 61, 6, 6, 235, 236, 3, 3, 3, 3, 3, 3, 104, 104,
+ 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104,
+ 104, 104, 104, 104, 104, 104, 104, 104, 104, 187, 0, 0, 0, 0, 0,
+ 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104,
+ 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104,
+ 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104,
+ 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104,
+ 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 146, 0, 104, 104, 104,
+ 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 104,
+ 104, 104, 104, 104, 104, 104, 104, 104, 104, 145, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 72, 0, 0, 0, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,
+ 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 0, 0
+};
+
+#define WB_TABLE3_SIZE 1896
+
+static const uint8_t wb_table3[1896] = {
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 7, 0, 0, 0, 0, 9, 0, 7, 0, 3,
+ 3, 3, 3, 3, 3, 3, 3, 3, 3, 8, 9, 0, 0, 0, 0, 0, 2, 2, 2, 2, 2, 2, 2, 2, 2,
+ 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0, 0, 5, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 2,
+ 0, 0, 6, 0, 0, 0, 0, 0, 0, 0, 2, 0, 8, 0, 0, 2, 0, 0, 0, 0, 0, 2, 2, 2, 2,
+ 2, 2, 2, 0, 2, 2, 0, 0, 0, 0, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0, 2, 2, 2, 2, 2,
+ 0, 0, 0, 0, 0, 0, 0, 2, 0, 2, 0, 6, 6, 6, 6, 6, 6, 6, 6, 2, 2, 2, 2, 2, 0,
+ 2, 2, 0, 0, 2, 2, 2, 2, 9, 0, 0, 0, 0, 0, 0, 0, 2, 8, 2, 2, 2, 0, 2, 0, 2,
+ 2, 2, 2, 0, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 2, 2, 2, 0, 6, 6, 6, 6, 6,
+ 6, 6, 2, 2, 2, 2, 2, 2, 0, 2, 0, 0, 0, 0, 0, 0, 0, 9, 0, 0, 0, 0, 0, 0, 0,
+ 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 0, 6, 0, 6, 6, 0, 6, 6, 0, 6, 2, 2,
+ 2, 2, 8, 0, 0, 0, 6, 6, 6, 6, 0, 0, 0, 0, 0, 0, 0, 0, 9, 9, 0, 0, 6, 6, 6,
+ 0, 0, 0, 0, 0, 2, 2, 2, 6, 6, 6, 6, 6, 3, 3, 0, 3, 9, 0, 2, 2, 6, 2, 2, 2,
+ 2, 2, 2, 2, 2, 2, 2, 2, 0, 2, 6, 6, 6, 6, 6, 6, 6, 2, 2, 6, 6, 0, 6, 6, 6,
+ 6, 2, 2, 3, 3, 2, 2, 2, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 6, 2, 6, 2, 2, 2, 2,
+ 2, 2, 6, 6, 6, 0, 0, 2, 2, 2, 2, 2, 2, 2, 2, 2, 6, 6, 6, 2, 0, 0, 0, 0, 0,
+ 0, 3, 3, 2, 2, 2, 2, 2, 2, 6, 6, 6, 6, 2, 2, 0, 0, 9, 0, 2, 0, 0, 0, 0, 0,
+ 6, 6, 2, 6, 6, 6, 6, 6, 6, 6, 6, 6, 2, 6, 6, 6, 2, 6, 6, 6, 6, 6, 0, 0, 2,
+ 6, 6, 6, 0, 0, 0, 0, 6, 6, 6, 6, 2, 2, 2, 2, 2, 2, 6, 6, 6, 2, 6, 6, 2, 6,
+ 6, 6, 6, 6, 6, 6, 2, 2, 6, 6, 0, 0, 3, 3, 0, 6, 6, 6, 0, 2, 2, 2, 2, 2, 2,
+ 2, 2, 0, 0, 2, 2, 0, 0, 2, 2, 2, 2, 2, 2, 0, 2, 2, 2, 2, 2, 2, 2, 0, 2, 0,
+ 0, 0, 2, 2, 2, 2, 0, 0, 6, 2, 6, 6, 6, 6, 6, 6, 6, 0, 0, 6, 6, 0, 0, 6, 6,
+ 6, 2, 0, 0, 0, 0, 0, 2, 2, 0, 2, 2, 2, 2, 0, 0, 0, 0, 2, 2, 0, 2, 2, 0, 2,
+ 2, 0, 2, 2, 0, 0, 6, 0, 6, 6, 6, 6, 6, 0, 0, 0, 0, 6, 6, 0, 0, 6, 6, 6, 0,
+ 0, 0, 6, 0, 0, 0, 0, 0, 0, 0, 2, 2, 2, 2, 0, 2, 0, 0, 0, 0, 0, 0, 0, 3, 3,
+ 6, 6, 2, 2, 2, 6, 0, 0, 2, 0, 2, 2, 0, 2, 2, 2, 6, 6, 0, 6, 6, 6, 0, 0, 2,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, 6, 0, 0, 6, 2, 0, 2, 2, 2, 2, 2,
+ 2, 0, 0, 0, 2, 2, 2, 0, 2, 2, 2, 2, 0, 0, 0, 2, 2, 0, 2, 0, 2, 2, 0, 0, 0,
+ 2, 2, 0, 0, 0, 2, 2, 0, 0, 0, 0, 6, 6, 6, 6, 6, 0, 0, 0, 6, 6, 6, 0, 6, 6,
+ 6, 6, 0, 0, 2, 0, 0, 0, 0, 0, 0, 6, 2, 2, 2, 2, 0, 2, 2, 2, 2, 2, 0, 0, 0,
+ 2, 6, 6, 6, 6, 6, 6, 6, 0, 6, 6, 0, 0, 0, 0, 0, 6, 6, 0, 0, 0, 6, 6, 0, 2,
+ 2, 2, 0, 0, 0, 0, 0, 0, 2, 0, 0, 2, 2, 0, 0, 0, 0, 0, 2, 2, 2, 0, 0, 2, 6,
+ 6, 6, 0, 6, 6, 6, 6, 2, 0, 0, 0, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 2, 0, 0,
+ 0, 0, 6, 0, 0, 0, 0, 6, 6, 6, 6, 6, 6, 0, 6, 0, 0, 0, 6, 6, 0, 0, 0, 0, 0,
+ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 6, 1, 1, 6, 6, 6, 6, 1, 1,
+ 1, 1, 1, 1, 1, 6, 6, 6, 6, 6, 6, 6, 6, 0, 3, 3, 0, 0, 0, 0, 0, 0, 0, 1, 1,
+ 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1,
+ 0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 6, 6, 0, 6, 6, 1, 0, 0, 1, 1, 1, 1, 1,
+ 0, 1, 0, 6, 6, 6, 6, 6, 6, 0, 0, 3, 3, 0, 0, 1, 1, 0, 0, 6, 6, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 6, 0, 6, 0, 6, 0, 0, 0, 0, 6, 6, 2, 2, 2, 2, 2, 6, 6,
+ 6, 6, 6, 6, 6, 6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, 0, 1, 1, 1, 6, 6, 6, 6, 6,
+ 6, 6, 6, 6, 6, 6, 6, 1, 1, 1, 1, 1, 1, 1, 6, 6, 6, 6, 1, 1, 1, 1, 6, 6, 6,
+ 1, 6, 6, 6, 1, 1, 6, 6, 6, 6, 6, 6, 6, 1, 1, 1, 6, 6, 6, 6, 1, 1, 1, 1, 1,
+ 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 1, 6, 3, 3, 6, 6, 6, 6, 0, 0, 2, 2, 2,
+ 2, 2, 2, 0, 0, 2, 2, 2, 0, 2, 0, 0, 0, 2, 2, 2, 0, 0, 6, 6, 6, 2, 2, 6, 6,
+ 6, 0, 0, 0, 2, 2, 6, 6, 0, 0, 0, 0, 2, 0, 6, 6, 0, 0, 0, 0, 1, 1, 1, 1, 6,
+ 6, 6, 6, 6, 6, 6, 6, 0, 0, 0, 1, 0, 0, 0, 0, 1, 6, 0, 0, 0, 0, 0, 6, 6, 6,
+ 0, 0, 2, 6, 2, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0,
+ 0, 1, 1, 1, 1, 0, 0, 0, 0, 6, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 6,
+ 1, 1, 1, 1, 1, 6, 6, 6, 0, 0, 0, 0, 0, 0, 0, 1, 6, 6, 6, 6, 6, 2, 2, 2, 2,
+ 2, 2, 2, 6, 6, 6, 6, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 6, 6, 6, 6, 6, 6, 6,
+ 6, 2, 2, 2, 2, 2, 6, 6, 6, 0, 0, 0, 2, 2, 3, 3, 0, 0, 0, 2, 2, 2, 6, 6, 6,
+ 0, 6, 6, 6, 6, 6, 2, 2, 2, 2, 6, 2, 2, 2, 2, 6, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 6, 6, 6, 6, 0, 2, 0, 2, 0, 2, 0, 2, 2, 2, 2, 2, 2, 0, 2, 0, 0, 0, 2, 2, 2,
+ 0, 2, 2, 2, 2, 2, 2, 0, 0, 2, 2, 7, 7, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 7, 0,
+ 0, 8, 0, 0, 6, 6, 6, 6, 6, 0, 0, 0, 0, 0, 0, 0, 0, 5, 5, 0, 0, 0, 9, 0, 0,
+ 0, 0, 0, 0, 0, 5, 0, 0, 0, 0, 0, 6, 6, 6, 6, 6, 6, 0, 0, 0, 0, 0, 0, 0, 2,
+ 6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 2, 0, 2, 2, 2, 2, 2, 0, 0, 2,
+ 0, 2, 2, 2, 2, 0, 2, 2, 2, 0, 0, 2, 2, 2, 2, 0, 0, 0, 0, 0, 2, 2, 2, 2, 2,
+ 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 2, 2, 0, 0, 0, 2, 2, 2, 2, 6, 0, 0, 0,
+ 0, 0, 2, 1, 1, 0, 4, 4, 4, 4, 4, 0, 0, 1, 1, 1, 2, 2, 0, 0, 0, 1, 1, 1, 1,
+ 1, 1, 1, 0, 0, 6, 6, 4, 4, 1, 1, 1, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 0, 4,
+ 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 0, 3, 3, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 6, 6,
+ 0, 2, 2, 0, 0, 2, 2, 2, 2, 0, 2, 2, 6, 2, 2, 2, 6, 2, 2, 2, 2, 6, 2, 2, 2,
+ 2, 0, 0, 0, 2, 0, 0, 0, 0, 2, 2, 2, 2, 6, 6, 0, 0, 0, 0, 1, 6, 0, 0, 0, 0,
+ 6, 1, 1, 1, 1, 1, 6, 6, 1, 6, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0,
+ 2, 2, 2, 2, 2, 2, 0, 6, 6, 6, 0, 6, 6, 0, 0, 0, 0, 0, 2, 2, 2, 2, 2, 1, 1,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 6, 2, 2, 2, 0, 2, 2, 0, 2, 2, 9, 0, 0,
+ 8, 9, 0, 0, 0, 0, 0, 0, 5, 5, 0, 0, 0, 0, 0, 0, 0, 0, 5, 5, 5, 9, 0, 7, 0,
+ 9, 8, 0, 0, 2, 2, 2, 2, 2, 0, 0, 6, 0, 0, 8, 9, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 4, 4, 4, 4, 4, 4, 4, 4, 6, 6, 0, 0, 2, 2, 2, 0, 0, 0, 0, 6, 6, 6, 0, 0,
+ 0, 0, 2, 2, 2, 0, 2, 2, 0, 2, 0, 0, 0, 0, 0, 6, 0, 0, 2, 0, 0, 0, 2, 0, 0,
+ 2, 2, 6, 6, 6, 0, 6, 6, 0, 6, 6, 6, 0, 0, 6, 0, 0, 4, 1, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 6, 6, 6, 6, 6, 0, 0, 0, 6, 6, 6, 6, 6, 6, 0, 0, 6, 6, 6, 0,
+ 0, 6, 6, 6, 6, 0, 0, 0, 0, 6, 6, 6, 0, 0, 0, 0, 0, 2, 0, 0, 2, 2, 0, 0, 2,
+ 2, 2, 2, 0, 2, 2, 2, 2, 0, 2, 0, 2, 2, 2, 2, 2, 2, 0, 0, 2, 2, 2, 2, 2, 0,
+ 2, 2, 2, 2, 0, 2, 2, 2, 0, 2, 2, 2, 2, 2, 2, 2, 2, 0, 0, 3, 3
+};
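
The commit does not include the C code that consumes these tables. One reading consistent with the sizes above (3584 = 56 * 64 for wb_table2, 1896 = 237 * 8 for wb_table3) is a three-level lookup in which the value found at each level selects a block in the next table. The following is a minimal sketch of that scheme using small made-up tables. The toy_ names, the TOY_ macros, and the lookup order itself are assumptions for illustration, not code from Lucy.

```c
#include <stdint.h>

/* Hypothetical toy tables covering code points 0..63; NOT the real data. */
#define TOY_TABLE1_SHIFT 2   /* 4 index entries per level-2 block */
#define TOY_TABLE1_MASK  3
#define TOY_TABLE1_SIZE  4

/* Level 1: maps (cp >> 4) to a block number in toy_table2. */
static const uint8_t toy_table1[TOY_TABLE1_SIZE] = { 0, 1, 2, 1 };

#define TOY_TABLE2_SHIFT 2   /* 4 property values per level-3 block */
#define TOY_TABLE2_MASK  3

/* Level 2: three unique blocks of 4 entries, each entry a block number
 * in toy_table3. */
static const uint8_t toy_table2[12] = {
    0, 0, 1, 1,   /* block 0 */
    0, 0, 0, 0,   /* block 1 */
    1, 2, 0, 0    /* block 2 */
};

/* Level 3: three unique blocks of 4 property values. */
static const uint8_t toy_table3[12] = {
    0, 0, 0, 0,   /* block 0 */
    2, 2, 2, 2,   /* block 1 */
    0, 3, 3, 0    /* block 2 */
};

/* Return the property value for a code point, or 0 if it lies beyond
 * the range covered by the tables. */
uint8_t
toy_lookup(uint32_t cp) {
    uint32_t i2 = cp >> TOY_TABLE2_SHIFT;   /* index of the level-3 block */
    uint32_t i1 = i2 >> TOY_TABLE1_SHIFT;   /* index of the level-2 block */

    if (i1 >= TOY_TABLE1_SIZE) { return 0; }

    uint32_t b1 = toy_table1[i1];
    uint32_t b2 = toy_table2[(b1 << TOY_TABLE1_SHIFT) | (i2 & TOY_TABLE1_MASK)];
    return toy_table3[(b2 << TOY_TABLE2_SHIFT) | (cp & TOY_TABLE2_MASK)];
}
```

With these toy tables, toy_lookup(37) reads toy_table1[2], then toy_table2[9], then toy_table3[9], and returns 3. Any code point past the end of toy_table1 falls back to 0. Because identical blocks are stored once, the 64 conceptual entries shrink to 4 + 12 + 12 bytes here; the same idea, applied to the full code point range, would account for the table sizes in the file above.
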
Added: incubator/lucy/branches/LUCY-196-uax-tokenizer/devel/bin/UnicodeTable.pm
URL: http://svn.apache.org/viewvc/incubator/lucy/branches/LUCY-196-uax-tokenizer/devel/bin/UnicodeTable.pm?rev=1210619&view=auto
==============================================================================
--- incubator/lucy/branches/LUCY-196-uax-tokenizer/devel/bin/UnicodeTable.pm (added)
+++ incubator/lucy/branches/LUCY-196-uax-tokenizer/devel/bin/UnicodeTable.pm Mon Dec 5 20:55:03 2011
@@ -0,0 +1,409 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+package UnicodeTable;
+use strict;
+
+=head1 NAME
+
+UnicodeTable - Create compressed Unicode tables for C programs
+
+=head1 SYNOPSIS
+
+ my $table = UnicodeTable->read(
+ filename => $filename,
+ type => 'Enumerated',
+ map => \%map,
+ );
+
+ my $comp = $table->compress($shift);
+
+ $comp->dump;
+
+=head1 DESCRIPTION
+
+This module creates compressed tables used to look up Unicode properties
+in C programs. To compress a table, it's split into blocks of a fixed
+size. Identical blocks are discovered and only unique blocks are written to
+the compressed table. An additional index table is created to map original
+block numbers to new ones.
+
+The index tables can then be compressed again using the same algorithm.
+
+Powers of two are used as block sizes, so the table indices needed to look up
+values can be computed using bit operations.
+
+=head1 METHODS
+
+=head2 new
+
+ my $table = UnicodeTable->new(
+ table => \@table,
+ max => $max,
+ shift => $shift,
+ index => $index,
+ );
+
+\@table is an arrayref with the table values and $max is the maximum value.
+$shift and $index are used for compressed tables.
+
+=cut
+
+sub new {
+ my $class = shift;
+
+ my $opts = @_ == 1 ? $_[0] : {@_};
+ my $self = bless( {}, $class );
+
+ for my $name (qw(table max shift index)) {
+ $self->{$name} = $opts->{$name};
+ }
+
+ $self->{mask} = ( 1 << $self->{shift} ) - 1
+ if defined( $self->{shift} );
+
+ return $self;
+}
+
+=head2 read
+
+ my $table = UnicodeTable->read(
+ filename => $filename,
+ type => $type,
+ map => \%map,
+ );
+
+Reads a table from a Unicode data text file. $type is either 'Enumerated'
+or 'Boolean'. \%map is a hashref that maps property values to integers.
+For booleans, these integers are ORed.
+
+=cut
+
+sub read {
+ my $class = shift;
+
+ my $opts = @_ == 1 ? $_[0] : {@_};
+ my $max = 0;
+ my @table;
+
+ my $filename = $opts->{filename};
+ die('filename missing') if !defined($filename);
+ my $type = $opts->{type} or die('type missing');
+ my $map = $opts->{map} or die('map missing');
+ $type = lc($type);
+
+ open( my $file, '<', $filename )
+ or die("$filename: $!\n");
+
+ while ( my $line = <$file> ) {
+ $line =~ s/\s*(#.*)?\z//s;
+ next if $line eq '';
+ my ( $chars, $prop ) = split( /\s*;\s*/, $line );
+ my $val = $map->{$prop};
+
+ if ( !defined($val) ) {
+ if ( $type eq 'boolean' ) {
+ next;
+ }
+ else {
+ die("unknown property '$prop'");
+ }
+ }
+
+ $max = $val if $val > $max;
+
+ if ( $chars =~ /^[0-9A-Fa-f]+\z/ ) {
+ my $i = hex($chars);
+ if ( $type eq 'boolean' ) {
+ $table[$i] |= $val;
+ }
+ else {
+ $table[$i] = $val;
+ }
+ }
+ elsif ( $chars =~ /^(\w+)\.\.(\w+)\z/ ) {
+ my ( $l, $r ) = ( hex($1), hex($2) );
+ die("invalid range '$chars'") if $l > $r;
+
+ for ( my $i = $l; $i <= $r; ++$i ) {
+ if ( $type eq 'boolean' ) {
+ $table[$i] |= $val;
+ }
+ else {
+ $table[$i] = $val;
+ }
+ }
+ }
+ else {
+ die("invalid range '$chars'");
+ }
+ }
+
+ close($file);
+
+ return bless(
+ { table => \@table,
+ max => $max,
+ },
+ $class
+ );
+}
+
+=head2 shift
+
+=head2 mask
+
+=head2 max
+
+=head2 index
+
+Accessors. index also sets the index table if given an argument (returns the old one).
+
+=cut
+
+sub shift {
+ return $_[0]->{shift};
+}
+
+sub mask {
+ return $_[0]->{mask};
+}
+
+sub max {
+ return $_[0]->{max};
+}
+
+sub index {
+ my $self = $_[0];
+ my $r = $self->{index};
+ $self->{index} = $_[1] if @_ > 1;
+ return $r;
+}
+
+=head2 set
+
+ $table->set($i, $value);
+
+Set entry at index $i to $value. Don't use with compressed tables.
+
+=cut
+
+sub set {
+ my ( $self, $i, $value ) = @_;
+ $self->{table}[$i] = $value;
+}
+
+=head2 size
+
+ my $size = $table->size;
+
+Storage size of the table in bytes.
+
+=cut
+
+sub size {
+ my $self = CORE::shift;
+
+ my $max = $self->{max};
+ my $bytes = $max < 0x100 ? 1 : $max < 0x10000 ? 2 : 4;
+
+ return @{ $self->{table} } * $bytes;
+}
+
+=head2 lookup
+
+ my $value = $table->lookup($i);
+
+Look up the value at index $i. Also works with compressed tables.
+
+=cut
+
+sub lookup {
+ my ( $self, $i ) = @_;
+
+ my $index = $self->{index};
+
+ if ($index) {
+ $i = $index->mangle_index($i);
+ return 0 if !defined($i);
+ return $self->{table}->[$i];
+ }
+ else {
+ return $self->{table}->[$i] || 0;
+ }
+}
+
+=head2 mangle_index
+
+ my $index = $index_table->mangle_index($i);
+
+Returns a mangled index to be used with a compressed table.
+
+=cut
+
+sub mangle_index {
+ my ( $self, $i ) = @_;
+
+ my $table = $self->{table};
+ my $shift = $self->{shift};
+ my $hi = $i >> $shift;
+ my $index = $self->{index};
+
+ if ($index) {
+ $hi = $index->mangle_index($hi);
+ return undef if !defined($hi);
+ }
+ else {
+ return undef if $hi >= @$table;
+ }
+
+ return ( $table->[$hi] << $shift ) | ( $i & $self->{mask} );
+}
+
+=head2 compress
+
+ my $compressed_table = $table->compress($shift);
+
+Returns a compressed version of this table which is linked to a second
+index table. Blocks of size (1 << $shift) are used.
+
+=cut
+
+sub compress {
+ my ( $self, $shift ) = @_;
+
+ my $table = $self->{table};
+ my $block_size = 1 << $shift;
+ my $block_count = 0;
+ my ( @compressed, @index, %blocks );
+
+ for ( my $start = 0; $start < @$table; $start += $block_size ) {
+ my @block;
+
+ for ( my $i = $start; $i < $start + $block_size; ++$i ) {
+ push( @block, $table->[$i] || 0 );
+ }
+
+ my $str = join( '|', @block );
+ my $block = $blocks{$str};
+
+ if ( !defined($block) ) {
+ $block = $block_count;
+ $blocks{$str} = $block;
+ ++$block_count;
+ push( @compressed, @block );
+ }
+
+ push( @index, $block );
+ }
+
+ my $index = UnicodeTable->new(
+ table => \@index,
+ max => $block_count - 1,
+ shift => $shift,
+ );
+
+ return UnicodeTable->new(
+ table => \@compressed,
+ max => $self->{max},
+ shift => $self->{shift},
+ index => $index,
+ );
+}
+
+=head2 dump
+
+ $table->dump($file, $name);
+
+Dump the table as C code to filehandle $file. The table name is $name.
+
+=cut
+
+sub dump {
+ my ( $self, $file, $name ) = @_;
+
+ my $table = $self->{table};
+ my $size = @$table;
+ my $uc_name = uc($name);
+
+ print $file (<<"EOF") if $self->{shift};
+#define ${uc_name}_SHIFT $self->{shift}
+#define ${uc_name}_MASK $self->{mask}
+EOF
+ print $file (<<"EOF");
+#define ${uc_name}_SIZE $size
+
+EOF
+
+ my $max = $self->{max};
+ my $bits = $max < 0x100 ? 8 : $max < 0x10000 ? 16 : 32;
+ my $pad = length($max);
+ my $vals_per_line = int( 76 / ( $pad + 2 ) );
+
+ print $file ("static const uint${bits}_t $name\[$size] = {\n");
+
+ my $i = 0;
+
+ while ( $i < $size ) {
+ printf $file ( " \%${pad}d", $table->[$i] );
+
+ my $max = $i + $vals_per_line;
+ $max = $size if $max > $size;
+
+ while ( ++$i < $max ) {
+ printf $file ( ", \%${pad}d", $table->[$i] );
+ }
+
+ print $file (',') if $i < $size;
+ print $file ("\n");
+ }
+
+ print $file ("};\n");
+}
+
+sub calc_sizes {
+ my ( $self, $range2, $range1 ) = @_;
+
+ for ( my $shift2 = $range2->[0]; $shift2 <= $range2->[1]; ++$shift2 ) {
+ my $comp = $self->compress($shift2);
+ my $index = $comp->index;
+ my $size3 = $comp->size;
+
+ for ( my $shift1 = $range1->[0]; $shift1 <= $range1->[1]; ++$shift1 )
+ {
+ my $comp_index = $index->compress($shift1);
+
+ my $size1 = $comp_index->index->size;
+ my $size2 = $comp_index->size;
+
+ printf(
+ "shift %2d %2d: %6d + %6d + %6d = %7d bytes, %4d %4d\n",
+ $shift1, $shift2, $size1, $size2, $size3,
+ $size1 + $size2 + $size3,
+ $comp_index->index->max, $comp_index->max,
+ );
+ }
+
+ print("\n");
+ }
+}
+
+=head1 AUTHOR
+
+Nick Wellnhofer <we...@aevum.de>
+
+=cut
+
+1;
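Editorial sketch (not part of the commit): the block deduplication done by
compress() can be illustrated in C, the language the generated tables target.
All names below (toy_compress, toy_compress_demo, toy_src, toy_blocks,
toy_index) are hypothetical. Unlike the Perl code, which pads a trailing
partial block with zeros and keys blocks by a joined string, this sketch
assumes the input length is a multiple of the block size, at most 256 unique
blocks, and a simple linear search.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sample data: six blocks of four values each. */
static const uint8_t toy_src[24] = {
    0, 0, 0, 0,  2, 2, 2, 2,  0, 0, 0, 0,
    2, 2, 2, 2,  0, 0, 0, 0,  2, 2, 3, 3
};
static uint8_t toy_blocks[24]; /* unique blocks; worst case as large as toy_src */
static uint8_t toy_index[6];   /* one entry per original block */

/* Deduplicate blocks of size (1 << shift) taken from src[0..n-1].
 * n must be a multiple of the block size. Unique blocks are appended to
 * `blocks`; index[b] receives the position of original block b among the
 * unique blocks. Returns the number of unique blocks. */
static size_t
toy_compress(const uint8_t *src, size_t n, unsigned shift,
             uint8_t *blocks, uint8_t *index) {
    size_t block_size = (size_t)1 << shift;
    size_t num_unique = 0;

    for (size_t start = 0; start < n; start += block_size) {
        size_t found = num_unique;

        /* Linear search through the unique blocks seen so far. */
        for (size_t u = 0; u < num_unique; ++u) {
            if (memcmp(blocks + u * block_size, src + start,
                       block_size) == 0) {
                found = u;
                break;
            }
        }

        /* Not seen before: store it as a new unique block. */
        if (found == num_unique) {
            memcpy(blocks + num_unique * block_size, src + start,
                   block_size);
            ++num_unique;
        }

        index[start >> shift] = (uint8_t)found;
    }

    return num_unique;
}

/* Usage: compress the sample data with blocks of four (shift = 2). */
static size_t
toy_compress_demo(void) {
    return toy_compress(toy_src, sizeof(toy_src), 2, toy_blocks, toy_index);
}
```

With this sample data the demo finds three unique blocks and the index
{0, 1, 0, 1, 0, 2}, so 12 values are stored instead of 24. Applying the same
step to the index yields the second level described in the DESCRIPTION.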
Added: incubator/lucy/branches/LUCY-196-uax-tokenizer/devel/bin/gen_word_break_tables.pl
URL: http://svn.apache.org/viewvc/incubator/lucy/branches/LUCY-196-uax-tokenizer/devel/bin/gen_word_break_tables.pl?rev=1210619&view=auto
==============================================================================
--- incubator/lucy/branches/LUCY-196-uax-tokenizer/devel/bin/gen_word_break_tables.pl (added)
+++ incubator/lucy/branches/LUCY-196-uax-tokenizer/devel/bin/gen_word_break_tables.pl Mon Dec 5 20:55:03 2011
@@ -0,0 +1,150 @@
+#!/usr/bin/perl
+
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+use strict;
+
+use Getopt::Std;
+use UnicodeTable;
+
+my $out_filename = '../../core/Lucy/Analysis/WordBreak.tab';
+
+my %wb_map = (
+ CR => 0,
+ LF => 0,
+ Newline => 0,
+ ALetter => 2,
+ Numeric => 3,
+ Katakana => 4,
+ ExtendNumLet => 5,
+ Extend => 6,
+ Format => 6,
+ MidNumLet => 7,
+ MidLetter => 8,
+ MidNum => 9,
+);
+
+my %opts;
+if ( !getopts( 'c', \%opts ) || @ARGV != 1 ) {
+ print STDERR (<<'EOF');
+Usage: gen_word_break_tables.pl [-c] UNICODE_SRC_DIR
+
+UNICODE_SRC_DIR should point to a directory containing the files
+WordBreakProperty.txt and DerivedCoreProperties.txt from
+http://www.unicode.org/Public/6.0.0/ucd/
+
+Options:
+-c Show total table size for different shift values
+EOF
+ exit 1;
+}
+
+my $src_dir = $ARGV[0];
+
+my $wb = UnicodeTable->read(
+ filename => "$src_dir/WordBreakProperty.txt",
+ type => 'Enumerated',
+ map => \%wb_map,
+);
+my $alpha = UnicodeTable->read(
+ filename => "$src_dir/DerivedCoreProperties.txt",
+ type => 'Boolean',
+ map => { Alphabetic => 1 },
+);
+
+# Set characters in Alphabetic but not in Word_Break to WB_ASingle = 1
+for ( my $i = 0; $i < 0x30000; ++$i ) {
+ if ( !$wb->lookup($i) && $alpha->lookup($i) ) {
+ $wb->set( $i, 1 );
+ }
+}
+
+if ( $opts{c} ) {
+ $wb->calc_sizes( [ 2, 5 ], [ 3, 9 ] );
+}
+else {
+ # These give the smallest size
+ my $shift1 = 6;
+ my $shift2 = 3;
+
+ my $table3 = $wb->compress($shift2);
+ my $table2 = $table3->index->compress($shift1);
+ my $table1 = $table2->index;
+ $table3->index($table2);
+
+ for ( my $i = 0; $i < 0x110000; ++$i ) {
+ my $v1 = $wb->lookup($i);
+ my $v2 = $table3->lookup($i);
+ die("test for code point $i failed, want $v1, got $v2")
+ if $v1 != $v2;
+ }
+
+ open( my $out_file, '>', $out_filename )
+ or die("$out_filename: $!\n");
+
+ print $out_file (<DATA>);
+
+ $table1->dump( $out_file, 'wb_table1' );
+ print $out_file ("\n");
+ $table2->dump( $out_file, 'wb_table2' );
+ print $out_file ("\n");
+ $table3->dump( $out_file, 'wb_table3' );
+
+ close($out_file);
+}
+
+__DATA__
+/*
+
+This file is generated with devel/bin/gen_word_break_tables.pl. DO NOT EDIT!
+The contents of this file are derived from the Unicode Character Database,
+version 6.0.0, available from http://www.unicode.org/Public/6.0.0/ucd/.
+The Unicode copyright and permission notice follows.
+
+Copyright (c) 1991-2011 Unicode, Inc. All rights reserved. Distributed under
+the Terms of Use in http://www.unicode.org/copyright.html.
+
+Permission is hereby granted, free of charge, to any person obtaining a copy of
+the Unicode data files and any associated documentation (the "Data Files") or
+Unicode software and any associated documentation (the "Software") to deal in
+the Data Files or Software without restriction, including without limitation
+the rights to use, copy, modify, merge, publish, distribute, and/or sell copies
+of the Data Files or Software, and to permit persons to whom the Data Files or
+Software are furnished to do so, provided that (a) the above copyright
+notice(s) and this permission notice appear with all copies of the Data Files
+or Software, (b) both the above copyright notice(s) and this permission notice
+appear in associated documentation, and (c) there is clear notice in each
+modified Data File or in the Software as well as in the documentation
+associated with the Data File(s) or Software that the data or software has been
+modified.
+
+THE DATA FILES AND SOFTWARE ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OF THIRD
+PARTY RIGHTS. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR HOLDERS INCLUDED IN
+THIS NOTICE BE LIABLE FOR ANY CLAIM, OR ANY SPECIAL INDIRECT OR CONSEQUENTIAL
+DAMAGES, OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS,
+WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING
+OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THE DATA FILES OR
+SOFTWARE.
+
+Except as contained in this notice, the name of a copyright holder shall not be
+used in advertising or otherwise to promote the sale, use or other dealings in
+these Data Files or Software without prior written authorization of the
+copyright holder.
+
+*/
+
Propchange: incubator/lucy/branches/LUCY-196-uax-tokenizer/devel/bin/gen_word_break_tables.pl
------------------------------------------------------------------------------
svn:executable = *
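Editorial sketch (not part of the commit): one way a C consumer might read the
three tables the script writes. The script compresses the value table with
blocks of (1 << $shift2), then compresses the resulting index with blocks of
(1 << $shift1), so a lookup mirrors mangle_index() in UnicodeTable.pm. The
tables and shifts below are small hypothetical stand-ins (TOY_SHIFT1 = 1,
TOY_SHIFT2 = 2), not the generated wb_table1, wb_table2 and wb_table3, and
toy_wb_lookup is an assumed name.

```c
#include <stdint.h>

#define TOY_SHIFT1      1
#define TOY_MASK1       ((1u << TOY_SHIFT1) - 1)
#define TOY_SHIFT2      2
#define TOY_MASK2       ((1u << TOY_SHIFT2) - 1)
#define TOY_TABLE1_SIZE 3

/* Stand-ins for a 24-entry property table compressed in two stages. */
static const uint8_t toy_table1[TOY_TABLE1_SIZE] = { 0, 0, 1 };
static const uint8_t toy_table2[4]  = { 0, 1, 0, 2 };
static const uint8_t toy_table3[12] = {
    0, 0, 0, 0,  2, 2, 2, 2,  2, 2, 3, 3
};

/* Return the property value for code point cp, or 0 if cp lies beyond
 * the range covered by the tables. */
static unsigned
toy_wb_lookup(uint32_t cp) {
    uint32_t i2 = cp >> TOY_SHIFT2; /* block number in the uncompressed table */
    uint32_t i1 = i2 >> TOY_SHIFT1; /* block number in the uncompressed index */

    if (i1 >= TOY_TABLE1_SIZE) {
        return 0;
    }

    /* Stage 1: map the index block to its deduplicated position. */
    i2 = ((uint32_t)toy_table1[i1] << TOY_SHIFT1) | (i2 & TOY_MASK1);

    /* Stage 2: map the value block to its deduplicated position. */
    uint32_t i3 = ((uint32_t)toy_table2[i2] << TOY_SHIFT2) | (cp & TOY_MASK2);

    return toy_table3[i3];
}
```

The bounds check on the first-level index plays the role of the
"return undef if $hi >= @$table" branch in mangle_index(), which lookup()
turns into 0.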
Modified: incubator/lucy/branches/LUCY-196-uax-tokenizer/devel/conf/rat-excludes
URL: http://svn.apache.org/viewvc/incubator/lucy/branches/LUCY-196-uax-tokenizer/devel/conf/rat-excludes?rev=1210619&r1=1210618&r2=1210619&view=diff
==============================================================================
--- incubator/lucy/branches/LUCY-196-uax-tokenizer/devel/conf/rat-excludes (original)
+++ incubator/lucy/branches/LUCY-196-uax-tokenizer/devel/conf/rat-excludes Mon Dec 5 20:55:03 2011
@@ -49,7 +49,9 @@ modules/analysis/snowstem/source/test/te
modules/analysis/snowstop/source/snowball_stoplists.c
# The Unicode license as applied to utf8proc was dealt with in LEGAL-110.
+# The word break tables are also derived from data under the Unicode license.
modules/unicode/utf8proc/utf8proc_data.c
+core/Lucy/Analysis/WordBreak.tab
# For whatever reason, RAT does not recognize the MIT license of utf8proc.h
# and utf8proc.c.