Posted to commits@opennlp.apache.org by bg...@apache.org on 2016/11/16 09:11:21 UTC

[25/51] [partial] opennlp-sandbox git commit: merge from bgalitsky's own git repo

http://git-wip-us.apache.org/repos/asf/opennlp-sandbox/blob/1f97041b/opennlp-similarity/src/test/resources/style_recognizer/txt/News/80NewsGoalcom_MessiTop50_EN.txt.txt
----------------------------------------------------------------------
diff --git a/opennlp-similarity/src/test/resources/style_recognizer/txt/News/80NewsGoalcom_MessiTop50_EN.txt.txt b/opennlp-similarity/src/test/resources/style_recognizer/txt/News/80NewsGoalcom_MessiTop50_EN.txt.txt
new file mode 100644
index 0000000..579c641
--- /dev/null
+++ b/opennlp-similarity/src/test/resources/style_recognizer/txt/News/80NewsGoalcom_MessiTop50_EN.txt.txt
@@ -0,0 +1,2 @@
+
+Welcome to the Goal . com 50 ! In this special series , Goal . com editors worldwide vote for the top 50 players of 2010-11 . We count down to the announcement of the winner on August 22 with profiles of each and every player who made it into the top 50 ... Do n't Miss Goal . com 50 : Xavi ( 3 ) Goal . com 50 : Andres Iniesta ( 4 ) Goal . com 50 : Radamel Falcao ( 5 ) Goal . com 50 : Nemanja Vidic ( 6 ) On Wednesday night , Lionel Messi produced one of the finest individual performances seen on a football field in recent memory . Two fantastic goals and one quite wonderful assist swung a captivating Clasico against Real Madrid in Barcelona\u2019s favour . And yet , after the game , nobody was talking about the Argentine forward . There were other talking points , of course . There was Marcelo\u2019s foul on Cesc Fabregas , Jose Mourinho poking Barca assistant coach Tito Vilanova in the eye , David Villa\u2019s slap on Mesut Ozil and a whole host of other ill-tempered incidents , not to menti
 on an epic encounter between two superb sides . Nevertheless , Messi\u2019s masterclass had been the difference between a magnificent Madrid and a Barca team struggling to keep the pace with their biggest rivals . Having netted a vital goal in the first leg at the Santiago Bernabeu , Leo won the return almost single-handedly . It was a performance worthy of acclaim and accolades aplenty , of extreme eulogy . Quite simply , though , there is now little left to say about Messi\u2019s magic and marvel . As Pep Guardiola has oft opined : \u201c We are running out of words to describe Leo . \u201d Indeed , decisive displays have become the norm when it comes to this exceptional young man ; peerless performances are not only hoped for by Barcelona players and their fans , but expected . " There are no words to describe Messi . You have to see it - it is something you cannot describe because you have to see it to believe it . " - Barcelona boss Pep Guardiola The Camp Nou crowds have become accustomed 
 to special stars . Diego Maradona , Johan Cruyff , Ronaldo , Rivaldo and Romario have all graced the turf at Barca 's famous old stadium . But none have done so quite as brilliantly \u2013 or consistently \u2013 as Messi . The new season has barely begun and Leo already has three goals in two games , having only just returned from his holidays \u2013 and a disappointing Copa America campaign with Argentina \u2013 ahead of the Supercopa\u2019s first leg in Madrid . Last term , he hit 53 in just 55 games , racking up an incredible 24 assists along the way . MOMENT OF THE SEASON CHAMPIONS LEAGUE SEMI-FINAL L1 REAL MADRID 0-2 BARCELONA Just as he did in last week 's Spanish Supercopa , Messi proved the difference between Barca and Madrid as he decided this tie with a brilliant brace at the Bernabeu , including a stunning second which saw him beat no less than four Real players on a trademark slalom run towards goal and angled finish past Iker Casillas . Messi missed out as Barca began the 2010-11 camp
 aign with a 3-1 defeat in the first leg of the Supercopa in Seville , having been away on international duty earlier in the week . Back for the Camp Nou clash , however , the Argentine blew away the Andalusians with a brilliant hat-trick to ensure Barca started the season as they have become accostumed of late \u2013 by winning trophies . Leo then took just three minutes to make his mark in La Liga , netting his side\u2019s first of the campaign in a 3-0 win at Racing Santander . Be it Supercopa , Liga , Copa del Rey or Champions League , Messi made his mark . A brilliant brace against Panathinaikos was accompanied by two assists and the Argentine came within a whisker of his hat-trick , rattling the woodwork on two occasions . A treble did come in a 5-0 thrashing of Betis in the Copa del Rey , though , while another arrived in the historic 8-0 humiliation of Almeria . Those were part of the forward\u2019s most prolific scoring run as he netted in nine consecutive games . He was unable to ma
 ke it a perfect 10 in his side\u2019s next fixture , but will hardly have cared as the Catalans trounced Real Madrid 5-0 . In that match , Messi demonstrated to the watching world just how much of a complete player he has become . There were no goals , nor mazy runs , but two glorious assists for Villa and a breathtaking display of pressure and passing to inflict upon Mourinho his worst ever result as coach . The goals continued to fly in after that , with two more at Osasuna and another brace against Real Sociedad , capping a sensational 2010 as he beat off team-mates Andres Iniesta and Xavi to the Fifa Ballon d\u2019Or . But he would be even more decisive in 2011. When Barca were in need of inspiration , there was Messi to provide it . Two goals , including a sensational strike to open the scoring , saw the Catalan club overcome a 2-1 first-leg deficit in the last 16 of the Champions League against Arsenal , while a hat-trick against Atletico Madrid saw the Catalans achieve a 16th succe
 ssive victory , breaking a record held by Alfredo Di Stefano\u2019s brilliant Real side of the 1960s . He also bagged the only goal of the game in a vital league win at Valencia and surpassed his previous mark of 47 goals \u2013 which he had tied with Ronaldo the season before \u2013 with another strike in the Champions League quarter-final against Shakhtar Donetsk . The best , however , was yet to come . After the disappointment of losing the final of the Copa del Rey to Madrid , Messi soon erased the pain with both goals \u2013 including a stunning second \u2013 as Guardiola\u2019s side erased memories of that defeat with a 2-0 win at the Santiago Bernabeu , which sealed their passage - following a 1-1 draw at Camp Nou in the second leg - to the European showpiece against Manchester United at Wembley . " While Ronaldo remains selfish at times , Leo 's decision-making is impeccable ; he knows when to shoot , when to pass and even when to return , conserving his energies for quick , intuitive bursts 
 and sprints . The results are often devastasting . " For the second time in three years , Barca and United squared up in the final . And Messi , who had converted with a superb header in a 2-0 win in 2009 , beat Edwin Van der Sar this time with a thunderous left-footed drive which gave the Dutch keeper no time to react on its way in from outside the box . It turned out to be the game-winning goal and was a fitting end to a fairytale season for Barca 's talisman . His 53 strikes saw him tied with Cristiano Ronaldo at the end of the campaign , but while both players were lauded for their sensational scoring form , Messi\u2019s were more often decisive . The Argentine\u2019s assist total also shows he provided more than the Portuguese . While Ronaldo remains selfish at times , Leo\u2019s decision-making is impeccable ; he knows when to shoot , when to pass and even when to run , conserving his energies for quick , intuitive bursts and sprints . The results are often devastating : Lionel Andres 
 Messi is the best player in the world\u2019s best team , still the finest footballer on the planet and our overwhelming choice as Goal . com\u2019s prime performer from the last 12 months . With few adjectives left to describe his talents , perhaps the title of the Tina Turner track that blasts out of the tannoys before Barca\u2019s matches at Camp Nou is all that is required . Goal . com 50\u2019s worthy winner , Lionel Messi : Simply The Best . 
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/opennlp-sandbox/blob/1f97041b/opennlp-similarity/src/test/resources/style_recognizer/txt/News/81NewsGueyeA_BlackPete_EN.txt.txt
----------------------------------------------------------------------
diff --git a/opennlp-similarity/src/test/resources/style_recognizer/txt/News/81NewsGueyeA_BlackPete_EN.txt.txt b/opennlp-similarity/src/test/resources/style_recognizer/txt/News/81NewsGueyeA_BlackPete_EN.txt.txt
new file mode 100644
index 0000000..21740cb
--- /dev/null
+++ b/opennlp-similarity/src/test/resources/style_recognizer/txt/News/81NewsGueyeA_BlackPete_EN.txt.txt
@@ -0,0 +1,2 @@
+
+The Netherlands : A Holiday Season of Festivities , Costumes \u2026 and Racism ? In the winter season in The Netherlands a character named Zwarte Piet ( Black Pete ) accompanies Sinterklaas ( Saint Nicholas , the original inspiration for Santa Claus ) for a yearly feast that is celebrated on the evening of December 5 or morning of December 6 with sweets and presents for all good children . This traditional holiday rivals Christmas in importance . In recent years the role of Zwarte Piet has become part of a recurring debate in The Netherlands as some citizens take offense at holiday costumes with black painted faces . The story goes that the companions of Saint Nicholas are Moors who help carry the presents brought to children when he arrives by boat from Spain . The tradition continues to be popular , though some have been moved to protest against what they see as racist imagery . On November 12 , 2011 a protester wearing a t-shirt that said \u2018 Zwarte Piet is racisme \u2019 ( Black Pete 
 is racism ) was arrested in Dordrecht amidst accusations of police brutality . The t-shirt campaign has its own Tumblr blog with photos and a Facebook page with more than 800 followers . The blogger at Stuff Dutch People Like wrote in 2010 about the tradition of Zwarte Piet : You know it\u2019s that time of the year again in Holland , when you are greeted by some Dutch person on the street , whose face is painted completely black and is sporting an afro wig , bright red lips and a ridiculous clown-like costume . Sinterklaas and Zwarte Piet , The Hague , The Netherlands , November 2008 , by Zemistor ( CC-BY-ND ) Dutch graffiti artist and blogger BNE posts some photos of Zwarte Piet , and asks : Is The Dutch Holiday Of Sinterklaas\u2019s Tradition Of \u201c Black Pete \u201d Racist ? : This \u201c tradition \u201d has evolved throughout the years , partially due to increasing protests from groups that find these depictions offensive . Nowadays , it is claimed that the Black face is due to the fact that
  the helpers have gone through chimneys and as a result , their faces are covered in soot . What again , nobody can clearly explain , is what kind of soot leaves such a uniform and evenly spread residue . Or worse , why these \u201c chimney dwellers \u201d speak in a fake accent that parodies the Black population of the Dutch former colony of Suriname . Anthropologist and blogger Martijn de Koning of CLOSER explains in Jolly Black Servant \u2013 Tradition and Racism in the Netherlands : I dont expect a change in this tradition very soon . It should be clear however that Black Pete is a construction , and invention that has already changed in history . The current tradition has lost many of negative connotations which is partly positive but the negative side is that this makes the racism more hidden . Nevertheless , I think this Dutch tradition lends itself perfectly for teaching young children about racism , colonialism and religion throughout history . Maybe that would be a starting point f
 or some change in the future ? On travel website Off Track Planet , Anna Starostinetskaya gives this answer to the question , What the F*ck is Zwarte Piet ? : So is Pete a children\u2019s tale or a racist figure ? We promise no definitive answer exists . We\u2019re not saying this tradition is not objectifying black people in a racist way and it is understandable that Americans have the strongest feelings on the topic because Zwarte Piet is visually too close to what our racist roots look like . But Americans must also realize that our own history drives us to apply what we know about our own racist past on traditions that may not have anything in common but black face paint . Although it may be racist in some way , we cannot just superimpose our own racist history atop another country\u2019s tradition and say it\u2019s the same . Either way , we hope a happy medium exists that doesn\u2019t involve smurfs , midgets or complete Americanization of world traditions . Sinterklaas arrives by boat in Ar
 nhem , November 2011 , by Bas Boerman ( CC-BY-NC ) On the blog Tiger Beatdown , beneath Flavia\u2019s post \u201c If you protest racism during Black Face season in The Netherlands , you will be beaten up and arrested \u201d a comment by Elfe echoes the above : I read your post because I needed to understand why I do not find this tradition racist \u2026The \u201c slaves \u201d or \u201c helpers \u201d are you refer to them are not ridicule : these are pages not clowns and they are wearing nice clothes , they are not parading around half naked with a bone across their nostrils like some savages ( or like Josephine Baker and her banana skirt ) . \u2026 Like \u201c Tintin in the Congo \u201d the Zwarte Piets are a reminder of the past . \u2026 I know it is very insulting for Blacks in America to see White people with their face painted in black ( but it took me to live in the US to understand why : a period when black were not even allowed to play their own role in theater ) . \u2026Like the rappers who have decided to own 
 the N word we can just ignore this tradition if it annoys us , personally I could not care less . Being African I don\u2019t see the Zwarte Piets as Blacks ( they don\u2019t look like me or like any African I know ) \u2026 To feel insulted by them you really need to have a really poor self-esteem . Sorry for being politically incorrect \u2026 Written by Anna Gueye 
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/opennlp-sandbox/blob/1f97041b/opennlp-similarity/src/test/resources/style_recognizer/txt/News/82NewsLeM_OrbanGoldmanSachs_EN.txt.txt
----------------------------------------------------------------------
diff --git a/opennlp-similarity/src/test/resources/style_recognizer/txt/News/82NewsLeM_OrbanGoldmanSachs_EN.txt.txt b/opennlp-similarity/src/test/resources/style_recognizer/txt/News/82NewsLeM_OrbanGoldmanSachs_EN.txt.txt
new file mode 100644
index 0000000..b6f53e5
--- /dev/null
+++ b/opennlp-similarity/src/test/resources/style_recognizer/txt/News/82NewsLeM_OrbanGoldmanSachs_EN.txt.txt
@@ -0,0 +1,2 @@
+
+Hungary : \u201c We Voted For Orbán , Not For Goldman Sachs \u201d Last Saturday , after several protests organized by citizen movements and opposition forces against the politics of the ruling Fidesz-KDNP government , Hungarians supporting those in power decided to express their opinion at a rally called Peace March . According to the Hungarian Interior Ministry 's report [ hu ] , some 400,000 people expressed their support for the government at the peaceful - and cheerful - event . \u2018 We are the Hungarian people and we stand for Orbán 's government ! ' . Photo by Redjade , used with permission . Those who thought Fidesz-KDNP had lost the trust of the Hungarian citizens , drawing the consequence from the extensive foreign media coverage of the opposition protests in Budapest , were challenged this time by the fact that the government elected in 2010 with a two-thirds majority still enjoyed the support of many . Fidelitas , a youth group derived from Fidesz , shared some 360-degree panoramic photos of the march . The protesters marched from Heroes ' Square to Kossuth Square by the Parliament , where brief speeches were delivered . The main organizers of the event were Zsolt Bayer , author of opinion pieces at the conservative daily Magyar Hírlap , Gábor Széles , a wealthy Hungarian entrepreneur and owner of Magyar Hírlap , and András Bencsik , editor-in-chief of Magyar Demokrata [ hu ] , also a Hungarian conservative daily . The right-leaning blog Mandiner has been very critical of the government recently , and , at first , their blogger , Dobray , who visited the Peace March , also had some doubts regarding the event [ hu ] : [ \u2026 ] Compared to what I had anticipated , the march came off even better : the mass of 400,000 ( probably fewer than that , the protest maths [ competition of whose protest had more attendees ] was started by Bencsik at Kossuth Square when he said , referring to a television report , that they were 1 million , which was evidently an unreal figure ) walked the distance , and , as no other options were listed on the program , no lame events happened . The puritan minimalism goes hand in hand with a portion of boredom well known from the first , eventless left-wing rallies . But it 's hard to pick at that . And that there were some groups with Arpad 's striped flags [ a symbol of the far right ] was not a big deal , we are used to that , they do n't do any trouble . We will worry about some Arpad stripes protesters in a mass of a couple of hundred thousands if the left wing expels from their community the comrades parading in the USSR and Che t-shirts . [ \u2026 ] The fact that describes the complex situation in Hungary best is that the government 's supporters oppose the talks and future agreements on the bailout from the EU and the IMF , while the opposition is in favor of reaching the agreements as soon as possible , in order to strengthen Hungary 's volatile economy . Pro-government protesters criticized EU/ECB/IMF for pressure on the government to take more bailout loans . Photo by Redjade , used with permission . Many protesters arrived from outside the capital . The blog of the city of Ócsa wrote [ hu ] about why they considered it important to participate in the march : People set off from almost every settlement of the country to express their solidarity with the government elected with the two-thirds majority , with its leader Viktor Orbán and with everyone who has been attacked in the past days. The marchers stand up for the sovereignity of Hungary and stick to the achievements of democracy , they ca n't stand that foreign politicians , businessmen , banks are willing to administer their lives . [ \u2026 ] Véleményvezér pointed out [ hu ] that most of the protesters were elderly : [ \u2026 ] it was very striking that most of the marchers were aged 50 or older . They are the ones whose private pension savings were not taken away , almost none of them has a foreign currency loan , and the government specifically tried to support them , through measures like the one-time 8-percent pension makeup or by implementing the institution of securing employment for older persons . [ \u2026 ] 'We voted for Orbán and not for Goldman Sachs ' . Photo by Redjade , used with permission . Dobray hints at the rumours about paid protesters and organized travel to the rally location , the accusations raised by opposition members : [ \u2026 ] So now we are even , now really each and everyone has brought politics to the street . And it 's funny that at any sort of protest the actual side opposing the protesters tries every method to discredit the other 's event ; and tries to find those whose travel has been paid for , who were paid to come and who were cheated , etc. Everyone is generous when it 's about their protest , but if it 's about the other 's , they turn petty and suspicious . The neighbour 's lawn is always wilted . I also would be happy if the Peace March did n't get listed among the ultimate arguments of Fidesz government allowing them to knock down all the opposing opinions . [ \u2026 ] Zoltán Ruzsbaczky of Mos Maiorum blog published a guest post [ hu ] on Konzervatórium blog , noting that the huge number of the pro-government supporters may signify the arrival of a new stage of democracy in Hungary , with a lot of people daring to stand up for their opinion : [ \u2026 ] Of course , this needs a government that applies this trust and successfully navigates the tempestuous sea of international politics and with its economic policy it sets Hungary on the track of growth . Besides this , one ca n't get by those masses who still oppose the politics of the government . We will learn only later what the long-term effects [ of this march ] will be , [ and whether there will be any ] . 
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/opennlp-sandbox/blob/1f97041b/opennlp-similarity/src/test/resources/style_recognizer/txt/News/83NewsMendesFrancoJ_HaitiBeyondCapital_EN.txt.txt
----------------------------------------------------------------------
diff --git a/opennlp-similarity/src/test/resources/style_recognizer/txt/News/83NewsMendesFrancoJ_HaitiBeyondCapital_EN.txt.txt b/opennlp-similarity/src/test/resources/style_recognizer/txt/News/83NewsMendesFrancoJ_HaitiBeyondCapital_EN.txt.txt
new file mode 100644
index 0000000..4f0580c
--- /dev/null
+++ b/opennlp-similarity/src/test/resources/style_recognizer/txt/News/83NewsMendesFrancoJ_HaitiBeyondCapital_EN.txt.txt
@@ -0,0 +1,2 @@
+
+Haiti : Beyond the Capital Posted 17 January 2010 22:42 GMT Much of the coverage of the destruction in the earthquake 's aftermath has been focused in and around Haiti 's capital city , Port-au-Prince . But many other areas close to the \u2018 quake 's epicentre have also been affected , as bloggers are quick to point out \u2026 Jacmel , located about twenty-five miles south of Port-au-Prince , is \u201c stranded and increasingly desperate \u201d , according to Repeating Islands ' republication of an excerpt from \u201c the award-winning team of reporters from the Miami Herald : Residents of Jacmel , a quaint , historic Caribbean port city that suffered widespread damage and has been cut off from Port-au-Prince to the north , complain they have been forgotten . Four days after the quake struck Jacmel with equal force , they say they are still awaiting food , water , medical supplies and relief workers . Despite the blog 's discontent \u201c about the nature of the coverage of the earthquake in Haiti 
 on American television and other media \u201d , in another post it follows the Herald team as it reports on another area that is receiving little media attention , Carrefour : This town , which on Tuesday was the epicenter of the earthquake , is living in the epicenter of oblivion . Pwoje Espwa - Hope in Haiti , meanwhile , reports on the relief efforts taking place in Les Cayes : In contrast to the situation in PAP , the UN is guiding the relief efforts in les Cayes , and will be coordinating and providing a platform for the efforts of all the NGOs working in the area . There is no fuel left for purchase in Cayes and the UN has very little left . The UN folks are not sure when food and fuel will be delivered . All of us are nervous about this . There was a commercial flight on Tortugair this afternoon from Cap Haitian to Cayes , and they delivered a group of 8 orthopedic surgeons to work at the hospital . As the people arrive here from the destroyed capitol we will assist them in any 
 way we can . Some need money to go on to family on the coast or inland ; some require medical attention ; all are hungry and thirsty ; almost all need clothing and shoes along with personal hygiene items . A simple thing like letting this young woman use my cell phone to call her mother and tell her she was all right and in Cayes was momentous for her and her mother . Konbit Pou Ayiti says that \u201c Haiti KONPAY has been playing a critical role coordinating a rapid response to the crisis in both Jacmel and Port-au-Prince\u2026pursuing two major strategies \u201d : 1. Delivering immediate support to people on the ground in Jacmel and Port-au-Prince by coordinating the transport of supplies and volunteers . Carefully design volunteer interventions to avoid exacerbating the developing food and water shortages . 2. Encouraging the evacuation of Port-au-Prince and establishing the resources necessary to assist victims when they arrive in the countryside by assessing existing resources in outlying areas and sending teams and equipment to clinics . The post goes on to quote a report \u201c from Amber Munger on the ground in Port-au-Prince \u201d : These are some details of the damage in Jacmel , which is a city of 34,000 : • 1,785 homes completely destroyed • 4410 homes partially destroyed • 87 commercial businesses destroyed • 54 schools destroyed • 24 hotels destroyed • 26 churches destroyed • 5730 families displaced • Death count approaching 3,000 , nearly 10 % of the population ( Reported by Gwenn Mangine , www . mangine . org ) Mangine also posts images from the ( severely damaged ) general hospital , with a further update on Sunday 16 : \u2026we noticed that the main pharmacy in town was open . And so we went in and bought them out of everything they had from the list \u2014 alcohol , hand sanitizer , peroxide , wound care items , meds \u2026 ( another truckload . ) Yesterday we were expecting a big shipment of supplies , but we got one box . Still \u2013 we rushed over to the 
 hospital with it . Mostly antibiotics and trauma care supplies \u2013 both were desperately needed . The doctors were thrilled . Pye 's in Haiti discusses the \u201c crazy busy \u201d situation at the local airstrip : We had a plane full of supplies ready to come , however the San Juan airport would not let the plane leave with the medical supplies \u2026 . We are hoping that flights start today of supplies and medicine . And Darren Tyler of Conduit Mission , who has been trying to send emergency supplies to Jacmel by boat , shares an update from a member of his organisation on the ground : The port can be used , cruise ships can come in there . We need help bad here is the city . What kind of supplies are on the boat ? How fast can they get here ? We are starting to feel people get frustrated and scared \u2026 . Updates are being posted regularly on Twitter . @melindayiti noted ( 15 January ) that \u201c Jacmel is a mess - we have planes and boats but US coordinators wo n't give us clearance to get 
 in ! \u201d And a few hours later added : \u201c 2 boats on the way , still no clearance to land plane w/critical medical teams \u201d . Meanwhile , @RescueJacmel , a new Twitter account , is attempting to ensure that international rescue efforts do not overlook the small city . Video bloggers are also chronicling their experiences , with clips from Les Cayes and Jacmel getting lots of attention on YouTube and other video sharing websites . The Cine Institute in Jacmel also posted eyewitness accounts of the earthquake . Lougou Corner is one of the blogs eager to supply information from their community : # We last communicated with Ginette last Thursday evening and she said that an exodus of people left Port-au-Prince already and came back to the provinces and rural areas . # We communicated via email with some ministries in Cayes and they reported that hospitals in Cayes are flooded with patients returning from Port and other areas affected in the provinces . We have seen firsthand in Lougou
  how an entire community has changed when residents have a say on the issues that affect their own lives . They have the best knowledge on what can and should be done to meet their pressing needs and bring lasting change in their community . And finally , from a U. S. -based Caribbean diaspora blogger , comes a stirring account of her friend 's quest to find his mother , most likely in a neighbourhood of Port-au-Prince : \u2018 As we came up on the block , I first walked right by the house because a good portion of it was totally demolished . The fitness center across the street was also completely demolished with a really strong smell coming from in between the bricks . When I asked people if they knew my mom , they shook their heads until I mentioned her by her nickname , Tita . And they were like , \u201c Oh , yeah , \u201d with joy in their eyes . \u201c She 's right there in the house next door . \u201d \u2018 I opened the door . Her back was to me . I tapped her on the shoulder . The surprise ,
  the tears , the hug so hard to explain . It was an unbelievable moment . She squeezed me so hard , crazy with joy . She paraded me down the street . \u201c Meet my fourth son . He came for me , \u201d she said . \u201c He came for me . \u201d \u2018 We can only hope for similar stories coming out of other affected areas . For more on the earthquake in Haiti , visit our Special Coverage page . 
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/opennlp-sandbox/blob/1f97041b/opennlp-similarity/src/test/resources/style_recognizer/txt/News/84NewsMillerH_FrankensteinTradition_EN.txt.txt
----------------------------------------------------------------------
diff --git a/opennlp-similarity/src/test/resources/style_recognizer/txt/News/84NewsMillerH_FrankensteinTradition_EN.txt.txt b/opennlp-similarity/src/test/resources/style_recognizer/txt/News/84NewsMillerH_FrankensteinTradition_EN.txt.txt
new file mode 100644
index 0000000..4dfdba2
--- /dev/null
+++ b/opennlp-similarity/src/test/resources/style_recognizer/txt/News/84NewsMillerH_FrankensteinTradition_EN.txt.txt
@@ -0,0 +1,2 @@
+
+Understanding the Frankenstein Tradition Henry I. Miller 2010-11-03 PALO ALTO \u2013 \u201c It\u2019s alive , it\u2019s moving , it\u2019s alive ... IT\u2019S ALIVE ! \u201d So said Dr. Victor Frankenstein when his \u201c creation \u201d was complete . Researchers have long been fascinated with trying to create life , but mainly they have had to settle for crafting variations of living organisms via mutation or other techniques of genetic engineering . In May , researchers at the J. Craig Venter Institute , led by Venter himself , synthesized the genome of a bacterium from scratch using chemical building blocks , and inserted it into the cell of a different variety of bacteria . The new genetic information \u201c rebooted \u201d its host cell and got it to function , replicate , and take on the characteristics of the \u201c donor . \u201d In other words , a sort of synthetic organism had been created . Reactions in the scientific community ranged from \u201c slight novelty \u201d to \u201c looming apocalypse . \u201d The former is m
 ore apt : Venter\u2019s creation is evolutionary , not revolutionary . The goal of \u201c synthetic biology , \u201d as the field is known , is to move microbiology and cell biology closer to the approach of engineering , so that standardized parts can be mixed , matched , and assembled \u2013 just as off-the-shelf chassis , engines , transmissions , and so on can be combined to build a hot-rod . Achieving this goal could offer scientists unprecedented opportunities for innovation , and better enable them to craft bespoke microorganisms and plants that produce pharmaceuticals , clean up toxic wastes , and obtain ( or \u201c fix \u201d ) nitrogen from the air ( obviating the need for chemical fertilizers ) . During the past half-century , genetic engineers , using increasingly powerful and precise tools and resources , have achieved breakthroughs that are opening up new opportunities in a broad array of fields . The Venter lab\u2019s achievement builds on similar work that began decades ago . In 1967 , a
  research group from Stanford Medical School and Caltech demonstrated the infectiousness of the genome of a bacterial virus called \u03a6\u03a7174 , whose DNA had been synthesized with an enzyme using the intact viral DNA as a template , or blueprint . That feat was hailed as \u201c life in a test tube . \u201d In 2002 , a research group at the State University of New York , Stony Brook , created a functional , infectious poliovirus solely from basic , off-the-shelf chemical building blocks . Their only blueprint for engineering the genome was the known sequence of RNA ( which comprises the viral genome and is chemically very similar to DNA ) . Similar to the 1967 experiments , the infectious RNA was synthesized enzymatically . It was able to direct the synthesis of viral proteins in the absence of a natural template . Once again , scientists had , in effect , created life in a test tube . Venter\u2019s group did much the same thing in the recently reported research , except that they used chemical 
 synthesis instead of enzymes to make the DNA . But some of the hype that surrounded the publication of the ensuing article in the journal Nature was disproportionate . Along with the Venter paper , Nature published eight commentaries on the significance of the work . The \u201c real \u201d scientists were aware of the incremental nature of the work , and questioned whether the Venter group had created a genuine \u201c synthetic cell , \u201d while the social scientists tended to exaggerate the implications of the work . Mark Bedau , a professor of philosophy at Reed College , wrote that the technology\u2019s \u201c new powers create new responsibilities . Nobody can be sure about the consequences of making new forms of life , and we must expect the unexpected and the unintended . This calls for fundamental innovations in precautionary thinking and risk analysis . \u201d But , with increasing sophistication , genetic engineers using old and new techniques have been creating organisms with novel or enhanc
 ed properties for decades . Regulations and standards of good practice already effectively address organisms that may be pathogenic or that threaten the natural environment . ( If anything , these standards are excessively burdensome . ) On the other hand , Swiss bioengineer Martin Fussenegger correctly observed that the Venter achievement \u201c is a technical advance , not a conceptual one . \u201d Other scientists noted that the organism is really only \u201c semi-synthetic , \u201d because the synthetic DNA ( which comprises only about 1 % of the dry weight of the cell ) was introduced into a normal , or non-synthetic , bacterium . Understanding the history of synthetic biology is important , because recognizing the correct paradigm has critical implications for how governments regulate it , which in turn affects the potential application and diffusion of the technology . Thirty-five years ago , the US National Institutes of Health adopted overly risk-averse guidelines for research using re
 combinant DNA , or \u201c genetic engineering , \u201d techniques . Those guidelines , based on what has proved to be an idiosyncratic and largely invalid set of assumptions , sent a powerful message that scientists and the federal government were taking seriously speculative , exaggerated risk scenarios \u2013 a message that has afflicted the technology\u2019s development worldwide ever since . Synthetic biology offers the prospect of powerful new tools for research and development in innumerable fields . But its potential can be fulfilled only if regulatory oversight is based on science , sound risk analysis , and an appreciation of the mistakes of history . 
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/opennlp-sandbox/blob/1f97041b/opennlp-similarity/src/test/resources/style_recognizer/txt/News/85NewsRabinovichI_IranNuclear_EN.txt.txt
----------------------------------------------------------------------
diff --git a/opennlp-similarity/src/test/resources/style_recognizer/txt/News/85NewsRabinovichI_IranNuclear_EN.txt.txt b/opennlp-similarity/src/test/resources/style_recognizer/txt/News/85NewsRabinovichI_IranNuclear_EN.txt.txt
new file mode 100644
index 0000000..b2e6a74
--- /dev/null
+++ b/opennlp-similarity/src/test/resources/style_recognizer/txt/News/85NewsRabinovichI_IranNuclear_EN.txt.txt
@@ -0,0 +1,2 @@
+
+The Iranian Nuclear Threat Goes Global Itamar Rabinovich 2012-02-16 TEL AVIV \u2013 The current drive to prevent Iran from developing a nuclear arsenal reflects two important , and interrelated , changes . From Israel\u2019s perspective , these changes are to be welcomed , though its government must remain cautious about the country\u2019s own role . The first change is the escalation of efforts by the United States and its Western allies to abort the Iranian regime\u2019s nuclear quest . This was instigated in part by the International Atomic Energy Agency\u2019s finding in November 2011 that Iran is indeed developing a nuclear weapon , and that it is getting perilously close to crossing the \u201c red line \u201d \u2013 the point beyond which its progress could no longer be stopped . Moreover , the US and its allies understand that failure to take serious action might prompt Israel to launch its own unilateral military offensive . The second change is the perception that Iran\u2019s nuclear capacity would t
 hreaten not only Israel . In a speech to the Union for Reform Judaism in December , US President Barack Obama stated that \u201c another threat to the security of Israel , the US , and the world is Iran\u2019s nuclear program . \u201d But , by this February , Obama was saying of Iran that \u201c my number-one priority continues to be the security of the US , but also the security of Israel , and we continue to work in lockstep as we proceed to try to solve this \u2026 \u201d That choice of words was no accident ; rather , it was a sign that the US is changing tack when it comes to Iran . For more than a decade , the question \u201c Whose issue is it ? \u201d has been part of the policy debate about Iran\u2019s nuclear ambitions . Israel\u2019s former prime minister , Ariel Sharon , used to caution his colleagues against \u201c rushing to the head of the line \u201d on Iran . He argued that if Israel were to take the lead in sounding the alarm on Iran\u2019s nuclear ambitions , the issue would be perceived as yet another 
 \u201c Israeli problem . \u201d Indeed , Israel\u2019s critics were already arguing that this was another case of the tail wagging the dog \u2013 that Israel and its American lobby were trying to push the US into serving Israel\u2019s interests rather than its own . The most egregious examples of this view were statements made by the political scientists John Mearsheimer and Stephen Walt . In a paper published prior to the release of their much-debated book The Israel Lobby , they argued : \u201c \u2026 Iran\u2019s nuclear ambitions do not pose an existential threat to the US . If Washington could live with a nuclear Soviet Union , a nuclear China , and even a nuclear North Korea , then it can live with a nuclear Iran . And that is why the [ Israel ] Lobby must keep constant pressure on US politicians to confront Tehran . \u201d Israel\u2019s current prime minister , Benjamin Netanyahu , has been less worried than Sharon was about Israel\u2019s perceived role . He is too busy being directly engaged in the attempt t
 o eliminate the deadly threat that a nuclear-armed Iran would pose to the Jewish state . Prior to the 2009 election that brought him to power , Netanyahu campaigned on the Iranian danger , and his government made the issue its cardinal concern . Together with his defense minister , Ehud Barak , Netanyahu succeeded in persuading Obama and the rest of the world that Israel was preparing a military attack as a last resort , should the US and its allies fail to stop the Iranian program in time . That policy has been effective , but it has also drawn attention to Israel\u2019s influence on the Iran question . Curiously , this has not been held against Israel , at least not so far , partly because Obama and other leaders now regard Iran as a more serious threat , and therefore feel the need to take appropriate action . The international community must underscore that its members are acting in the service of their national interests , and not simply for Israel\u2019s sake . But their willingness
  to engage could wane , particularly if sanctions exact a high financial price or military action causes a large number of casualties . Israel would therefore be wise to remember Sharon\u2019s cautionary words , and reinforce its pressure on the US administration with a broader diplomatic campaign . Like it or not , Israel must urge the world to remember that Iran is everyone\u2019s problem . 
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/opennlp-sandbox/blob/1f97041b/opennlp-similarity/src/test/resources/style_recognizer/txt/News/86NewsRian_IranCutsOil_EN.txt.txt
----------------------------------------------------------------------
diff --git a/opennlp-similarity/src/test/resources/style_recognizer/txt/News/86NewsRian_IranCutsOil_EN.txt.txt b/opennlp-similarity/src/test/resources/style_recognizer/txt/News/86NewsRian_IranCutsOil_EN.txt.txt
new file mode 100644
index 0000000..e728122
--- /dev/null
+++ b/opennlp-similarity/src/test/resources/style_recognizer/txt/News/86NewsRian_IranCutsOil_EN.txt.txt
@@ -0,0 +1,2 @@
+
+Iran Cuts Oil Exports to UK , France Iran has stopped oil exports to British and French companies , the Islamic republic 's oil ministry said on Sunday . A statement on the ministry 's website said that Iran , OPEC 's second biggest oil producer after Saudi Arabia , would sell oil to " new customers . " Iran 's English-language television station Press TV said the move was " in line with the decision to end crude exports to six European states . " The European Union said last month it would stop importing Iranian crude from July 1 in a bid to force Iran to agree to halt its nuclear program . Western powers suspect Iran of seeking to create a nuclear bomb but Tehran insists its program is peaceful . Iranian media reported on Wednesday that Iran had cut oil exports to the Netherlands , Greece , France , Portugal , Spain and Italy in response to the EU oil embargo , but the country 's oil ministry later denied this . The 27-nation bloc currently buys about 20 percent of Iran 's oil exports . 
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/opennlp-sandbox/blob/1f97041b/opennlp-similarity/src/test/resources/style_recognizer/txt/News/87NewsRian_MedvedevDismisses_EN.txt.txt
----------------------------------------------------------------------
diff --git a/opennlp-similarity/src/test/resources/style_recognizer/txt/News/87NewsRian_MedvedevDismisses_EN.txt.txt b/opennlp-similarity/src/test/resources/style_recognizer/txt/News/87NewsRian_MedvedevDismisses_EN.txt.txt
new file mode 100644
index 0000000..f5e88a7
--- /dev/null
+++ b/opennlp-similarity/src/test/resources/style_recognizer/txt/News/87NewsRian_MedvedevDismisses_EN.txt.txt
@@ -0,0 +1,2 @@
+
+Medvedev dismisses EU criticism of Duma elections Russian President Dmitry Medvedev on Thursday said a European Parliament resolution calling for new State Duma elections \u201c means nothing . \u201d \u201c I have nothing to say on this resolution , because these are our elections . The European Parliament has no relation to them . They can comment on anything they want . I will not comment on their decisions as they mean nothing to me , \u201d the Russian president said at a joint press conference with European Commission President Jose Manuel Barroso and European Council President Herman Van Rompuy . Russia\u2019s parliamentary parties have severely criticized the European Parliament\u2019s resolution , labeling it interference in the country\u2019s domestic affairs , Medvedev added . They have also demanded that it stop such \u201c escapades . \u201d 
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/opennlp-sandbox/blob/1f97041b/opennlp-similarity/src/test/resources/style_recognizer/txt/News/88NewsWiki_KaradzicArrest_EN.txt.txt
----------------------------------------------------------------------
diff --git a/opennlp-similarity/src/test/resources/style_recognizer/txt/News/88NewsWiki_KaradzicArrest_EN.txt.txt b/opennlp-similarity/src/test/resources/style_recognizer/txt/News/88NewsWiki_KaradzicArrest_EN.txt.txt
new file mode 100644
index 0000000..c1c6136
--- /dev/null
+++ b/opennlp-similarity/src/test/resources/style_recognizer/txt/News/88NewsWiki_KaradzicArrest_EN.txt.txt
@@ -0,0 +1,2 @@
+
+Alleged war criminal Radovan Karad\u017ei\u0107 caught in Serbia Tuesday , July 22 , 2008 Radovan Karad\u017ei\u0107 Alleged war criminal Radovan Karad\u017ei\u0107 was caught yesterday in Serbia by Serbian security forces after almost 13 years on the run from authorities . Last night he was questioned by an inquisitor of the War Crimes Court in Belgrade . Karad\u017ei\u0107 has been accused by the International Criminal Tribunal for the former Yugoslavia of genocide , war crimes and crimes against humanity during the Bosnian War from 1992 to 1995. The Srebrenica massacre , in which about 8,000 Muslims were killed , is among the most serious of his alleged crimes . The massacre was categorised as genocide by both the International Court of Justice and the International Criminal Tribunal for the former Yugoslavia . Internationally , the arrest was unanimously welcomed . Along with former military chief Ratko Mladi\u0107 , Karad\u017ei\u0107 has been one of the most sought-after war criminals of the Balkan conflict . Described 
 by the BBC 's Kate Adie as a " smart , rather vain man " , his capture found him with a long white beard working in a clinic and practicing alternative medicine under the name Dragan Dabic . 
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/opennlp-sandbox/blob/1f97041b/opennlp-similarity/src/test/resources/style_recognizer/txt/Opac/89OpacStallman_FreeSoft_EN.txt.txt
----------------------------------------------------------------------
diff --git a/opennlp-similarity/src/test/resources/style_recognizer/txt/Opac/89OpacStallman_FreeSoft_EN.txt.txt b/opennlp-similarity/src/test/resources/style_recognizer/txt/Opac/89OpacStallman_FreeSoft_EN.txt.txt
new file mode 100644
index 0000000..cde10d6
--- /dev/null
+++ b/opennlp-similarity/src/test/resources/style_recognizer/txt/Opac/89OpacStallman_FreeSoft_EN.txt.txt
@@ -0,0 +1,2 @@
+
+\ufeff The Free Software Definition We maintain this free software definition to show clearly what must be true about a particular software program for it to be considered free software . From time to time we revise this definition to clarify it . If you would like to review the changes we 've made , please see the History section below for more information . \u201c Free software \u201d is a matter of liberty , not price . To understand the concept , you should think of \u201c free \u201d as in \u201c free speech , \u201d not as in \u201c free beer . \u201d Free software is a matter of the users ' freedom to run , copy , distribute , study , change and improve the software . More precisely , it means that the program 's users have the four essential freedoms : \u2022 The freedom to run the program , for any purpose ( freedom 0 ) . \u2022 The freedom to study how the program works , and change it so it does your computing as you wish ( freedom 1 ) . Access to the source code is a precondition for this . \u2022 The free
 dom to redistribute copies so you can help your neighbor ( freedom 2 ) . \u2022 The freedom to distribute copies of your modified versions to others ( freedom 3 ) . By doing this you can give the whole community a chance to benefit from your changes . Access to the source code is a precondition for this . A program is free software if users have all of these freedoms . Thus , you should be free to redistribute copies , either with or without modifications , either gratis or charging a fee for distribution , to anyone anywhere . Being free to do these things means ( among other things ) that you do not have to ask or pay for permission to do so . You should also have the freedom to make modifications and use them privately in your own work or play , without even mentioning that they exist . If you do publish your changes , you should not be required to notify anyone in particular , or in any particular way . The freedom to run the program means the freedom for any kind of person or orga
 nization to use it on any kind of computer system , for any kind of overall job and purpose , without being required to communicate about it with the developer or any other specific entity . In this freedom , it is the user 's purpose that matters , not the developer 's purpose ; you as a user are free to run the program for your purposes , and if you distribute it to someone else , she is then free to run it for her purposes , but you are not entitled to impose your purposes on her . The freedom to redistribute copies must include binary or executable forms of the program , as well as source code , for both modified and unmodified versions . ( Distributing programs in runnable form is necessary for conveniently installable free operating systems . ) It is OK if there is no way to produce a binary or executable form for a certain program ( since some languages do n't support that feature ) , but you must have the freedom to redistribute such forms should you find or develop a way to
  make them . In order for freedoms 1 and 3 ( the freedom to make changes and the freedom to publish improved versions ) to be meaningful , you must have access to the source code of the program . Therefore , accessibility of source code is a necessary condition for free software . Obfuscated \u201c source code \u201d is not real source code and does not count as source code . Freedom 1 includes the freedom to use your changed version in place of the original . If the program is delivered in a product designed to run someone else 's modified versions but refuse to run yours \u2014 a practice known as \u201c tivoization \u201d or \u201c lockdown \u201d , or ( in its practitioners ' perverse terminology ) as \u201c secure boot \u201d \u2014 freedom 1 becomes a theoretical fiction rather than a practical freedom . This is not sufficient . In other words , these binaries are not free software even if the source code they are compiled from is free . One important way to modify a program is by merging in available free
  subroutines and modules . If the program 's license says that you cannot merge in a suitably licensed existing module \u2014 for instance , if it requires you to be the copyright holder of any code you add \u2014 then the license is too restrictive to qualify as free . Freedom 3 includes the freedom to release your modified versions as free software . A free license may also permit other ways of releasing them ; in other words , it does not have to be a copyleft license . However , a license that requires modified versions to be nonfree does not qualify as a free license . In order for these freedoms to be real , they must be permanent and irrevocable as long as you do nothing wrong ; if the developer of the software has the power to revoke the license , or retroactively change its terms , without your doing anything wrong to give cause , the software is not free . However , certain kinds of rules about the manner of distributing free software are acceptable , when they do n't conflict w
 ith the central freedoms . For example , copyleft ( very simply stated ) is the rule that when redistributing the program , you cannot add restrictions to deny other people the central freedoms . This rule does not conflict with the central freedoms ; rather it protects them . \u201c Free software \u201d does not mean \u201c noncommercial . \u201d A free program must be available for commercial use , commercial development , and commercial distribution . Commercial development of free software is no longer unusual ; such free commercial software is very important . You may have paid money to get copies of free software , or you may have obtained copies at no charge . But regardless of how you got your copies , you always have the freedom to copy and change the software , even to sell copies . Whether a change constitutes an improvement is a subjective matter . If your modifications are limited , in substance , to changes that someone else considers an improvement , that is not freedom . However
  , rules about how to package a modified version are acceptable , if they do n't substantively limit your freedom to release modified versions , or your freedom to make and use modified versions privately . Thus , it is acceptable for the license to require that you change the name of the modified version , remove a logo , or identify your modifications as yours . As long as these requirements are not so burdensome that they effectively hamper you from releasing your changes , they are acceptable ; you 're already making other changes to the program , so you wo n't have trouble making a few more . Rules that \u201c if you make your version available in this way , you must make it available in that way also \u201d can be acceptable too , on the same condition . An example of such an acceptable rule is one saying that if you have distributed a modified version and a previous developer asks for a copy of it , you must send one . ( Note that such a rule still leaves you the choice of whether 
 to distribute your version at all . ) Rules that require release of source code to the users for versions that you put into public use are also acceptable . In the GNU project , we use copyleft to protect these freedoms legally for everyone . But noncopylefted free software also exists . We believe there are important reasons why it is better to use copyleft , but if your program is noncopylefted free software , it is still basically ethical . ( See Categories of Free Software for a description of how \u201c free software , \u201d \u201c copylefted software \u201d and other categories of software relate to each other . ) Sometimes government export control regulations and trade sanctions can constrain your freedom to distribute copies of programs internationally . Software developers do not have the power to eliminate or override these restrictions , but what they can and must do is refuse to impose them as conditions of use of the program . In this way , the restrictions will not affect activi
 ties and people outside the jurisdictions of these governments . Thus , free software licenses must not require obedience to any export regulations as a condition of any of the essential freedoms . Most free software licenses are based on copyright , and there are limits on what kinds of requirements can be imposed through copyright . If a copyright-based license respects freedom in the ways described above , it is unlikely to have some other sort of problem that we never anticipated ( though this does happen occasionally ) . However , some free software licenses are based on contracts , and contracts can impose a much larger range of possible restrictions . That means there are many possible ways such a license could be unacceptably restrictive and nonfree . We ca n't possibly list all the ways that might happen . If a contract-based license restricts the user in an unusual way that copyright-based licenses cannot , and which is n't mentioned here as legitimate , we will have to th
 ink about it , and we will probably conclude it is nonfree . When talking about free software , it is best to avoid using terms like \u201c give away \u201d or \u201c for free , \u201d because those terms imply that the issue is about price , not freedom . Some common terms such as \u201c piracy \u201d embody opinions we hope you wo n't endorse . See Confusing Words and Phrases that are Worth Avoiding for a discussion of these terms . We also have a list of proper translations of \u201c free software \u201d into various languages . Finally , note that criteria such as those stated in this free software definition require careful thought for their interpretation . To decide whether a specific software license qualifies as a free software license , we judge it based on these criteria to determine whether it fits their spirit as well as the precise words . If a license includes unconscionable restrictions , we reject it , even if we did not anticipate the issue in these criteria . Sometimes a license requirem
 ent raises an issue that calls for extensive thought , including discussions with a lawyer , before we can decide if the requirement is acceptable . When we reach a conclusion about a new issue , we often update these criteria to make it easier to see why certain licenses do or do n't qualify . If you are interested in whether a specific license qualifies as a free software license , see our list of licenses . If the license you are concerned with is not listed there , you can ask us about it by sending us email at 
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/opennlp-sandbox/blob/1f97041b/opennlp-similarity/src/test/resources/style_recognizer/txt/Opac/90OpacTeam_Berlin_EN.txt.txt
----------------------------------------------------------------------
diff --git a/opennlp-similarity/src/test/resources/style_recognizer/txt/Opac/90OpacTeam_Berlin_EN.txt.txt b/opennlp-similarity/src/test/resources/style_recognizer/txt/Opac/90OpacTeam_Berlin_EN.txt.txt
new file mode 100644
index 0000000..5b77713
--- /dev/null
+++ b/opennlp-similarity/src/test/resources/style_recognizer/txt/Opac/90OpacTeam_Berlin_EN.txt.txt
@@ -0,0 +1,2 @@
+
+\ufeff Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities The Internet has fundamentally changed the practical and economic realities of distributing scientific knowledge and cultural heritage . For the first time ever , the Internet now offers the chance to constitute a global and interactive representation of human knowledge , including cultural heritage and the guarantee of worldwide access . We , the undersigned , feel obliged to address the challenges of the Internet as an emerging functional medium for distributing knowledge . Obviously , these developments will be able to significantly modify the nature of scientific publishing as well as the existing system of quality assurance . In accordance with the spirit of the Declaration of the Budapest Open Access Initiative , the ECHO Charter and the Bethesda Statement on Open Access Publishing , we have drafted the Berlin Declaration to promote the Internet as a functional instrument for a global scientific 
 knowledge base and human reflection and to specify measures which research policy makers , research institutions , funding agencies , libraries , archives and museums need to consider . Goals Our mission of disseminating knowledge is only half complete if the information is not made widely and readily available to society . New possibilities of knowledge dissemination not only through the classical form but also and increasingly through the open access paradigm via the Internet have to be supported . We define open access as a comprehensive source of human knowledge and cultural heritage that has been approved by the scientific community . In order to realize the vision of a global and accessible representation of knowledge , the future Web has to be sustainable , interactive , and transparent . Content and software tools must be openly accessible and compatible . Definition of an Open Access Contribution Establishing open access as a worthwhile procedure ideally requires the active
  commitment of each and every individual producer of scientific knowledge and holder of cultural heritage . Open access contributions include original scientific research results , raw data and metadata , source materials , digital representations of pictorial and graphical materials and scholarly multimedia material . Preface Open access contributions must satisfy two conditions : 1. The author(s ) and right holder(s ) of such contributions grant(s ) to all users a free , irrevocable , worldwide , right of access to , and a license to copy , use , distribute , transmit and display the work publicly and to make and distribute derivative works , in any digital medium for any responsible purpose , subject to proper attribution of authorship ( community standards , will continue to provide the mechanism for enforcement of proper attribution and responsible use of the published work , as they do now ) , as well as the right to make small numbers of printed copies for their personal use 
 . 2. A complete version of the work and all supplemental materials , including a copy of the permission as stated above , in an appropriate standard electronic format is deposited ( and thus published ) in at least one online repository using suitable technical standards ( such as the Open Archive definitions ) that is supported and maintained by an academic institution , scholarly society , government agency , or other well established organization that seeks to enable open access , unrestricted distribution , inter operability , and long-term archiving . Supporting the Transition to the Electronic Open Access Paradigm Our organizations are interested in the further promotion of the new open access paradigm to gain the most benefit for science and society . Therefore , we intend to make progress by \u2022 encouraging our researchers/grant recipients to publish their work according to the principles of the open access paradigm . \u2022 encouraging the holders of cultural heritage to support
  open access by providing their resources on the Internet . \u2022 developing means and ways to evaluate open access contributions and online journals in order to maintain the standards of quality assurance and good scientific practice . \u2022 advocating that open access publication be recognized in promotion and tenure evaluation . \u2022 advocating the intrinsic merit of contributions to an open access infrastructure by software tool development , content provision , metadata creation , or the publication of individual articles . We realize that the process of moving to open access changes the dissemination of knowledge with respect to legal and financial aspects . Our organizations aim to find solutions that support further development of the existing legal and financial frameworks in order to facilitate optimal use and access . 
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/opennlp-sandbox/blob/1f97041b/opennlp-similarity/src/test/resources/style_recognizer/txt/Opac/91OpacTeam_Budapest_EN.txt.txt
----------------------------------------------------------------------
diff --git a/opennlp-similarity/src/test/resources/style_recognizer/txt/Opac/91OpacTeam_Budapest_EN.txt.txt b/opennlp-similarity/src/test/resources/style_recognizer/txt/Opac/91OpacTeam_Budapest_EN.txt.txt
new file mode 100644
index 0000000..d91932b
--- /dev/null
+++ b/opennlp-similarity/src/test/resources/style_recognizer/txt/Opac/91OpacTeam_Budapest_EN.txt.txt
@@ -0,0 +1,2 @@
+
+\ufeff Budapest Open Access Initiative An old tradition and a new technology have converged to make possible an unprecedented public good . The old tradition is the willingness of scientists and scholars to publish the fruits of their research in scholarly journals without payment , for the sake of inquiry and knowledge . The new technology is the internet . The public good they make possible is the world-wide electronic distribution of the peer-reviewed journal literature and completely free and unrestricted access to it by all scientists , scholars , teachers , students , and other curious minds . Removing access barriers to this literature will accelerate research , enrich education , share the learning of the rich with the poor and the poor with the rich , make this literature as useful as it can be , and lay the foundation for uniting humanity in a common intellectual conversation and quest for knowledge . For various reasons , this kind of free and unrestricted online availabilit
 y , which we will call open access , has so far been limited to small portions of the journal literature . But even in these limited collections , many different initiatives have shown that open access is economically feasible , that it gives readers extraordinary power to find and make use of relevant literature , and that it gives authors and their works vast and measurable new visibility , readership , and impact . To secure these benefits for all , we call on all interested institutions and individuals to help open up access to the rest of this literature and remove the barriers , especially the price barriers , that stand in the way . The more who join the effort to advance this cause , the sooner we will all enjoy the benefits of open access . The literature that should be freely accessible online is that which scholars give to the world without expectation of payment . Primarily , this category encompasses their peer-reviewed journal articles , but it also includes any unrevi
 ewed preprints that they might wish to put online for comment or to alert colleagues to important research findings . There are many degrees and kinds of wider and easier access to this literature . By " open access " to this literature , we mean its free availability on the public internet , permitting any users to read , download , copy , distribute , print , search , or link to the full texts of these articles , crawl them for indexing , pass them as data to software , or use them for any other lawful purpose , without financial , legal , or technical barriers other than those inseparable from gaining access to the internet itself . The only constraint on reproduction and distribution , and the only role for copyright in this domain , should be to give authors control over the integrity of their work and the right to be properly acknowledged and cited . While the peer-reviewed journal literature should be accessible online without cost to readers , it is not costless to produce .
  However , experiments show that the overall costs of providing open access to this literature are far lower than the costs of traditional forms of dissemination . With such an opportunity to save money and expand the scope of dissemination at the same time , there is today a strong incentive for professional associations , universities , libraries , foundations , and others to embrace open access as a means of advancing their missions . Achieving open access will require new cost recovery models and financing mechanisms , but the significantly lower overall cost of dissemination is a reason to be confident that the goal is attainable and not merely preferable or utopian . To achieve open access to scholarly journal literature , we recommend two complementary strategies . I. Self-Archiving : First , scholars need the tools and assistance to deposit their refereed journal articles in open electronic archives , a practice commonly called , self-archiving . When these archives conform 
 to standards created by the Open Archives Initiative , then search engines and other tools can treat the separate archives as one . Users then need not know which archives exist or where they are located in order to find and make use of their contents . II . Open-access Journals : Second , scholars need the means to launch a new generation of journals committed to open access , and to help existing journals that elect to make the transition to open access . Because journal articles should be disseminated as widely as possible , these new journals will no longer invoke copyright to restrict access to and use of the material they publish . Instead they will use copyright and other tools to ensure permanent open access to all the articles they publish . Because price is a barrier to access , these new journals will not charge subscription or access fees , and will turn to other methods for covering their expenses . There are many alternative sources of funds for this purpose , includin
 g the foundations and governments that fund research , the universities and laboratories that employ researchers , endowments set up by discipline or institution , friends of the cause of open access , profits from the sale of add-ons to the basic texts , funds freed up by the demise or cancellation of journals charging traditional subscription or access fees , or even contributions from the researchers themselves . There is no need to favor one of these solutions over the others for all disciplines or nations , and no need to stop looking for other , creative alternatives . Open access to peer-reviewed journal literature is the goal . Self-archiving ( I. ) and a new generation of open-access journals ( II . ) are the ways to attain this goal . They are not only direct and effective means to this end , they are within the reach of scholars themselves , immediately , and need not wait on changes brought about by markets or legislation . While we endorse the two strategies just outlin
 ed , we also encourage experimentation with further ways to make the transition from the present methods of dissemination to open access . Flexibility , experimentation , and adaptation to local circumstances are the best ways to assure that progress in diverse settings will be rapid , secure , and long-lived . The Open Society Institute , the foundation network founded by philanthropist George Soros , is committed to providing initial help and funding to realize this goal . It will use its resources and influence to extend and promote institutional self-archiving , to launch new open-access journals , and to help an open-access journal system become economically self-sustaining . While the Open Society Institute 's commitment and resources are substantial , this initiative is very much in need of other organizations to lend their effort and resources . We invite governments , universities , libraries , journal editors , publishers , foundations , learned societies , professional as
 sociations , and individual scholars who share our vision to join us in the task of removing the barriers to open access and building a future in which research and education in every part of the world are that much more free to flourish . February 14 , 2002 Budapest , Hungary 
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/opennlp-sandbox/blob/1f97041b/opennlp-similarity/src/test/resources/style_recognizer/txt/Tedi/100TediOConnellA_Quantum_EN.txt.txt
----------------------------------------------------------------------
diff --git a/opennlp-similarity/src/test/resources/style_recognizer/txt/Tedi/100TediOConnellA_Quantum_EN.txt.txt b/opennlp-similarity/src/test/resources/style_recognizer/txt/Tedi/100TediOConnellA_Quantum_EN.txt.txt
new file mode 100644
index 0000000..90ff9b3
--- /dev/null
+++ b/opennlp-similarity/src/test/resources/style_recognizer/txt/Tedi/100TediOConnellA_Quantum_EN.txt.txt
@@ -0,0 +1,2 @@
+
+\ufeff This is a representation of your brain . And your brain can be broken into two parts . There 's the left half , which is the logical side , and then the right half , which is the intuitive . And so if we had a scale to measure the aptitude of each hemisphere , then we can plot our brain . And for example , this would be somebody who 's completely logical . This would be someone who 's entirely intuitive . So where would you put your brain on this scale ? Some of us may have opted for one of these extremes , but I think for most people in the audience , your brain is something like this -- with a high aptitude in both hemispheres at the same time . It 's not like they 're mutually exclusive or anything . You can be logical and intuitive . And so I consider myself one of these people , along with most of the other experimental quantum physicists , who need a good deal of logic to string together these complex ideas . But at the same time , we need a good deal of intuition to actua
 lly make the experiments work . How do we develop this intuition ? Well we like to play with stuff . So we go out and play with it , and then we see how it acts . And then we develop our intuition from there . And really you do the same thing . So some intuition that you may have developed over the years is that one thing is only in one place at a time . I mean , it can sound weird to think about one thing being in two different places at the same time , but you were n't born with this notion , you developed it . And I remember watching a kid playing on a car stop . He was just a toddler and he was n't very good at it , and he kept falling over . But I bet playing with this car stop taught him a really valuable lesson , and that 's that large things do n't let you get right past them , and that they stay in one place . And so this is a great conceptual model to have of the world , unless you 're a particle physicist . It 'd be a terrible model for a particle physicist , because they
  do n't play with car stops , they play with these little weird particles . And when they play with their particles , they find they do all sorts of really weird things -- like they can fly right through walls , or they can be in two different places at the same time . And so they wrote down all these observations , and they called it the theory of quantum mechanics . And so that 's where physics was at a few years ago ; you needed quantum mechanics to describe little , tiny particles . But you did n't need it to describe the large , everyday objects around us . This did n't really sit well with my intuition , and maybe it 's just because I do n't play with particles very often . Well , I play with them sometimes , but not very often . And I 've never seen them . I mean , nobody 's ever seen a particle . But it did n't sit well with my logical side either . Because if everything is made up of little particles and all the little particles follow quantum mechanics , then should n't ev
 erything just follow quantum mechanics ? I do n't see any reason why it should n't . And so I 'd feel a lot better about the whole thing if we could somehow show that an everyday object also follows quantum mechanics . So a few years ago , I set off to do just that . So I made one . This is the first object that you can see that has been in a mechanical quantum superposition . So what we 're looking at here is a tiny computer chip . And you can sort of see this green dot right in the middle . And that 's this piece of metal I 'm going to be talking about in a minute . This is a photograph of the object . And here I 'll zoom-in a little bit . We 're looking right there in the center . And then here 's a really , really big close-up of the little piece of metal . So what we 're looking at is a little chunk of metal , and it 's shaped like a diving board , and it 's sticking out over a ledge . And so I made this thing in nearly the same way as you make a computer chip . I went into a c
 lean room with a fresh silicon wafer , and then I just cranked away at all the big machines for about 100 hours . For the last stuff , I had to build my own machine -- to make this swimming pool-shaped hole underneath the device . This device has the ability to be in a quantum superposition , but it needs a little help to do it . Here , let me give you an analogy . You know how uncomfortable it is to be in a crowded elevator ? I mean , when I 'm in an elevator all alone , I do all sorts of weird things , but then other people get on board and I stop doing those things , because I do n't want to bother them , or , frankly , scare them . So quantum mechanics says that inanimate objects feel the same way . The fellow passengers for inanimate objects are not just people , but it 's also the light shining on it and the wind blowing past it and the heat of the room . And so we knew , if we wanted to see this piece of metal behave quantum mechanically , we 're going to have to kick out all
  the other passengers . And so that 's what we did . We turned off the lights , and then we put it in a vacuum and sucked out all the air , and then we cooled it down to just a fraction of a degree above absolute zero . Now , all alone in the elevator , the little chunk of metal is free to act however it wanted . And so we measured its motion . We found it was moving in really weird ways . Instead of just sitting perfectly still , it was vibrating . And the way it was vibrating was breathing something like this -- like expanding and contracting bellows . And by giving it a gentle nudge , we were able to make it both vibrate and not vibrate at the same time -- something that 's only allowed with quantum mechanics . So what I 'm telling you here is something truly fantastic . What does it mean for one thing to be both vibrating and not vibrating at the same time ? So let 's think about the atoms . So one case : all the trillions of atoms that make up that chunk of metal are sitting st
 ill and at the same time those same atoms are moving up and down . Now it 's only at precise times when they align . The rest of the time they 're delocalized . That means that every atom is in two different places at the same time , which in turn means the entire chunk of metal is in two different places . I think this is really cool . ( Laughter ) Really . ( Applause ) It was worth locking myself in a clean room to do this for all those years . Because , check this out , the difference in scale between a single atom and that chunk of metal is about the same as the difference between that chunk of metal and you . So if a single atom can be in two different places at the same time , that chunk of metal can be in two different places , then why not you ? I mean , this is just my logical side talking . So imagine if you 're in multiple places at the same time , what would that be like ? How would your consciousness handle your body being delocalized in space ? There 's one more part t
 o the story . It 's when we warmed it up , and we turned on the lights and looked inside the box , we saw that the piece metal was still there in one piece . And so I had to develop this new intuition , that it seems like all the objects in the elevator are really just quantum objects just crammed into a tiny space . You hear a lot of talk about how quantum mechanics says that everything is all interconnected . Well , that 's not quite right ; it 's more than that , it 's deeper . It 's that those connections , your connections to all the things around you , literally define who you are . And that 's the profound weirdness of quantum mechanics . Thank you . ( Applause ) 
\ No newline at end of file