Posted to commits@opennlp.apache.org by bg...@apache.org on 2016/11/16 09:11:04 UTC

[08/51] [partial] opennlp-sandbox git commit: merge from bgalitsky's own git repo

http://git-wip-us.apache.org/repos/asf/opennlp-sandbox/blob/1f97041b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/170ted_robert_thurman_on_compassion.txt
----------------------------------------------------------------------
diff --git a/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/170ted_robert_thurman_on_compassion.txt b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/170ted_robert_thurman_on_compassion.txt
new file mode 100644
index 0000000..def743c
--- /dev/null
+++ b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/170ted_robert_thurman_on_compassion.txt
@@ -0,0 +1,2 @@
+
+I want to open by quoting Einstein 's wonderful statement , just so people will feel at ease that the great scientist of the 20th century also agrees with us , and also calls us to this action . He said , " A human being is a part of the whole , called by us , universe , a part limited in time and space . He experiences himself , his thoughts and feelings , as something separated from the rest , a kind of optical delusion of his consciousness , that separation . This delusion is a kind of prison for us , restricting us to our personal desires and to affection for a few persons nearest to us . Our task must be to free ourselves from this prison by widening our circle of compassion , to embrace all living creatures and the whole of nature in its beauty . " This insight of Einstein 's is uncannily close to that of Buddhist psychology , wherein compassion , karuna , it is called , is defined as , " The sensitivity to another 's suffering and the corresponding will to free the other from
  that suffering . " It pairs closely with love . Which is the will for the other to be happy . Which requires , of course , that one feels some happiness oneself and wishes to share it . This is perfect in that it clearly opposes self-centeredness and selfishness to compassion , the concern for others , and , further , it indicates that those caught in the cycle of self-concern , suffer helplessly , while the compassionate are more free and implicitly more happy . The Dalai Lama often states that compassion is his best friend . It helps him when he is overwhelmed with grief and despair . Compassion helps him turn away from the feeling of his suffering as the most absolute , most terrible suffering anyone has ever had and broadens his awareness of the sufferings of others , even of the perpetrators of his misery and the whole mass of beings . In fact , suffering is so huge and enormous , his own becomes less and less monumental . And he begins to move beyond his self-concern into the
  broader concern for others . And this immediately cheers him up , as his courage is stimulated to rise to the occasion . Thus , he uses his own suffering as a doorway to widening his circle of compassion . He is a very good colleague of Einstein 's , we must say . Now , I want to tell a story , which is a very famous story in the Indian and Buddhist tradition , of the great Saint Asanga who lived -- contemporary of Augustine in the West and was sort of like the Buddhist Augustine . And Asanga lived 800 years after the Buddha 's time . And he was discontented with the state of people 's practice of the Buddhist religion in India at that time . And so he said , " I 'm sick of all this . Nobody 's really living the doctrine . They 're talking about love and compassion and wisdom and enlightenment , but they are acting selfish and pathetic . So Buddha 's teaching has lost its momentum . I know next Buddha will come a few thousand years from now , but exists currently in a certain heave
 n , that 's Maitreya . So , I 'm going to go on a retreat , and I 'm going to meditate and pray until the Buddha Maitreya reveals himself to me , and gives me a teaching or something to revive the practice of compassion in the world today . " So he went on this retreat . And he meditated for three years and he did not see the future Buddha Maitreya . And he left in disgust . And as he was leaving , he saw a man -- a funny little man -- sitting sort of part way down the mountain . And he had a lump of iron . And he was rubbing it with a cloth . And he became became interested in that . He said , " Well what are you doing ? " And the man said , " I 'm making a needle . " And he said , " That 's ridiculous . you ca n't make a needle by rubbing a lump of iron with a cloth . " And the man said , " Really ? " And he showed him a dish full of needles . So he said , " Okay , I get the point . " He went back to his cave . He meditated again . Another three years , no vision . He leaves again
  . This time , he comes down . And as he 's leaving , he sees a bird making a nest on a cliff ledge . And where it 's landing to bring the twigs to the cliff , its feathers brushes the rock , and it had cut the rock in , inches , six to eight inches in , there was a cleft in the rock by the brushing of the feathers of generations of the birds . So he said , " All right . I get the point . " He went back . Another three years . Again , no vision of Maitreya after nine years . And , he again leaves , and this time water dripping , making a giant bowl in the rock where it drips in a stream . And so , again , he goes back . And after 12 there is still no vision . And he 's freaked out . And he wo n't even look left or right to see any encouraging vision . And he comes to the town . He 's a broken person . And there , in the town , he 's approached by a dog who comes like this -- one of these terrible dogs you can see in some poor countries , even in America , I think , in some areas -- 
 and he 's looking just terrible . And he becomes interested in this dog because it 's so pathetic , and it 's trying to attract his attention . And he sits down looking at the dog . And the dog 's whole hindquarters are a complete open sore . And some of it is like gangrenous . And there 's like maggots in the flesh . And it 's terrible . He thinks , " What can I do to fix up this dog ? Well , at least I can clean this wound and wash it . " So he takes it to some water , he 's about to clean , then his awareness focuses on the maggots . And he sees the maggots , and the maggots are kind of looking a little cute . And they 're maggoting happily in the dog 's hindquarters there . " Well , if I clean the dog , I 'll kill the maggots . So how can that be ? That 's it . I 'm a useless person and there 's no Buddha , no Maitreya , and everything is all hopeless . And now I 'm going to kill the maggots ? " So , he had a brilliant idea . And he took a shard of something , and cut a piece of
  flesh from his thigh , and he placed it on ground . He was not really thinking too carefully about the ASPCA . He was just immediately caught with the situation . So he thought , " I will take the maggots and put them on this piece of flesh , then clean the dog 's wounds , and then , you know , I 'll figure out what to do with the maggots . " So he starts to do that . He ca n't grab the maggots . Apparently they wriggle around . They 're kind of hard to grab , these maggots . So he says , " Well , I 'll put my tongue on the dog 's flesh . And then the maggots will jump on my warmer tongue . The dog is kind of used up . And then I 'll spit them one by one down on the thing . " So he goes down , and he 's sticking his tongue out like this . And he had to close his eyes , it 's so disgusting , and the smell and everything . And then , suddenly , there 's a pfft , a noise like that . He jumps back and there , of course , is the future Buddha Maitreya . In a beautiful vision like rainbo
 w lights , golden , jeweled , plasma body , like exquisite mystic vision , he sees . And he says , " Oh . " He bows . But , being human , he 's immediately thinking of his next complaint . So as he comes up from his first bow he says , " My Lord , I 'm so happy to see you , but where have you been for 12 years ? What is this ? " And Maitreya says , " I was with you . Who do you think was making needles and making nests and dripping on rocks for you , mister dense ? " ( Laughter ) " Looking for the Buddha in person . " he said . And he said , " You did n't have , until this moment , real compassion . And , until you have real compassion , you cannot recognize love . " Maitreya means love , the loving one , you know , in Sanskrit . And so he looked very dubious , Asanga did . And he said , " If you do n't believe me , just take me with you . " And so he took the Maitreya -- it shrunk into a globe , a ball -- took him on his shoulder . And he ran into town in the marketplace , and he s
 aid , " Rejoice . Rejoice . The future Buddha has come ahead of all predictions . Here he is . " And then pretty soon they started throwing rocks and stones at him -- It was n't Chautauqua . It was some other town -- because they saw a demented looking , scrawny looking yogi man , like some kind of hippie , with a bleeding leg and a rotten dog on his shoulder , shouting that the future Buddha had come . So , naturally , they chased him out of town . But on the edge of town , one elderly lady , a char woman in the charnel ground , saw a jeweled foot on a jeweled lotus on his shoulder and then the dog , but she saw the jewel foot of the Maitreya , and she offered a flower . So that encouraged him , and he went with Maitreya . With Maitreya then took him to a certain heaven , the way the Buddhist myth unfolds in a typical way . And Maitreya then kept him in heaven for five years , dictating to him five complicated tomes of the methodology of how you cultivate compassion . And then I th
 ought I would share with you what that method is , or one of them . Famous one , it 's called the " Sevenfold Causal Method of Developing Compassion . " And it begins first by one meditating and visualizing that all beings are with one , and all -- even animals too -- but everyone is in human form . The animals are in one of their human lives . The humans are human . And then , among them , you think of your friends and loved ones , the circle at the table . And you think of your enemies , and you think of the neutral ones . And then you try to say , " Well , the loved ones I love . But , you know , after all , they 're nice to me . I had fights with them . Sometimes they were unfriendly . I got mad . Brothers can fight . Parents and children can fight . So , in a way , I like them so much because they 're nice to me . While the neutral ones I do n't know . They could all be just fine . And then the enemies I do n't like because they 're mean to me . But they are nice to somebody . 
 I could be them . " And then the Buddhists , of course , think , because we 've all had infinite previous lives , the Buddhists think that we 've all been each other 's relatives , actually , and everyone , therefore all of you , in the Buddhist view in some previous life , although you do n't remember it and neither do I , have been my mother , for which I do apologize for the trouble I caused you . And also , actually , I 've been your mother . I 've been female , and I 've been every single one of you , your mother in a previous life , the way the Buddhists reflect . So , my mother is this life is really great . But all of you in a way are part of the eternal mother . You gave me that expression , the eternal mama , you said . That 's wonderful . So , that 's the way the Buddhists do it . A theist , Christian , can think that all beings , even my enemies , are God 's children . So , in that sense , we 're related . So , they first create this foundation of equality . So , we sort
  of reduce a little of the clinging to the ones we love -- just in the meditation -- and we open our mind to those we do n't know . And we definitely reduce the hostility and " I do n't want to be compassionate to them " to the ones we think of as the bad guys , the ones we hate and we do n't like . And we do n't hate anyone therefore . So we equalize . That 's very important . And then the next thing we do is what is called mother recognition . And that is , we think of every being as familiar , as family . We expand . We take the feeling about remembering a mama , and we defuse that to all beings in this meditation . And we see the mother in every being . We see that look that the mother has on her face , this looking at this child that is a miracle that she has produced from her own body , being a mammal , where she has true compassion , truly is the other , and identifies completely . Often the life of that other will be more important to her than her own life . And that 's why 
 it 's the most powerful form of -- altruism . The mother is what is the model of all altruism for human beings , in spiritual traditions . And so , we reflect until we can sort of see that motherly expression in all beings . People laugh at me because , you know , I used to say that I used to meditate on mama Cheney as my mom , when , of course , I was annoyed with him about all of his evil doings in Iraq . I used to meditate on George Bush . He 's quite a cute mom in a female form . Has his little ears and he smiles and he rocks you in his arms . And you think of him as nursing you . And then Saddam Hussein 's serious mustache is a problem . But you think of him as a mom . And this is the way you do it . You take any being who looks weird to you , and you see how they could be familiar to you . And you do that for awhile until you really feel that . You can feel the familiarity of all beings . Nobody seems alien . They 're not " other . " You reduce the feeling of otherness about b
 eings . Then you move from there to remembering the kindness of mothers in general , if you can remember the kindness of your own mother , if you can remember the kindness of your spouse , or , if you are a mother yourself , how you were with your children . And you begin to get very sentimental , you cultivate sentimentality intensely . You will even weep , perhaps , with gratitude and kindness . And then you connect that with your feeling that everyone has that motherly possibility . Every being , even the most mean looking ones , can be motherly . And then , third , you step from there to what is called a feeling of gratitude . You want to repay that kindness that all beings have shown to you . And then the fourth step , you go to what is called lovely love . In each one of these you can take some weeks , or months , or days depending on how you do it , or you can do them in a run , this meditation . And then you think of how lovely beings are when they are happy , when they are 
 satisfied . And every being looks beautiful when they are internally feeling a happiness . Their face does n't look like this . When they 're angry , they look ugly , every being , but when they 're happy they look beautiful . And so you see beings in their potential happiness . And you feel a love toward them that you want them to be happy , even the enemy . And , actually , it 's very logical to want to -- we think Jesus is being unrealistic when he says love thine enemy . He does say that , and we think he 's being unrealistic and sort of spiritual and highfalutin and , " Nice for him to say it , but I ca n't do that . " But , actually , that 's practical . If you love your enemy that means you want your enemy to be happy . If your enemy was really happy , why would they bother to be your enemy ? How boring to run around chasing you . They would be relaxing somewhere having a good time . So it makes sense to want your enemy to be happy because they 'll stop being your enemy becau
 se that 's too much trouble . But anyway , that 's the lovely love . And then finally , the fifth step is compassion , universal compassion . And that is where you then look at the reality of all the beings you can think of . And you look at them , and you see how they are . And you realize how unhappy they are actually , mostly , most of the time . You see that furrowed brow in people . And then you realize they do n't even have compassion on themselves . They 're driven by this duty and this obligation . " I have to get that . I need more . I 'm not worthy . And I should do something . " And they 're rushing around all stressed out . And they think of it as somehow macho , hard discipline on themselves . But actually they are cruel to themselves . And , of course , they are cruel and ruthless toward others . And they , then , never get any positive feedback . And the more they succeed , and the more power they have , the more unhappy they are . And this is where you feel real comp
 assion for them . And you then feel you must act . And it 's the motivation -- And the choice of action , of course , hopefully will be more practical than poor Asanga who was fixing the maggots on the dog , because he had that motivation , and whoever was in front of him , he wanted to help . But , of course , that is impractical . He should have founded the ASPCA in the town and gotten some scientific help for dogs and maggots . And I 'm sure he did that later . But that just indicates the state of mind , you know . And so the next step -- the sixth step beyond universal compassion -- which then is this thing where you 're linked with the needs of others in a true way , and you have compassion for yourself also , and you do n't -- it is n't sentimental only . You might be in fear of something . Some bad guy is making himself more and more unhappy being more and more mean to other people and getting punished in the future for it in various ways . And in Buddhism , they catch it in 
 the future life . Of course in theistic religion they 're punished by God or whatever . And materialism , they think they get out of it just by not existing , by dying , but they do n't . And so they get reborn as whatever , you know . Never mind . I wo n't get into that . But the next step is called universal responsibility . And that is very important -- the Charter of Compassion must lead us to develop through true compassion , what is called universal responsibility . And that means that the great teaching of his holiness , the Dalai Lama , that he always teaches everywhere , and he says that is the common religion of humanity , kindness , But kindness means universal responsibility . And that means whatever happens to other beings is happening to us , that we are responsible for that , and we should take it and do whatever we can at whatever little level and small level that we can do it . We absolutely must do that . There is no way not to do it . And then , finally , that lea
 ds to a new orientation in life where we live equally for ourselves and others , and we realize that happiness for ourselves -- and we are joyful and happy . One thing we must n't think is compassion makes you miserable . Compassion makes you happy . The first person who is happy , when you get great compassion , is yourself , even if you have n't done anything yet for anybody else . Although , the change in your mind already does something for other beings . They can sense this new quality in yourself , and it helps them already , and gives them an example . And that uncompassionate clock has just showed me that it 's all over . So , practice compassion , read the charter , disseminate it and develop it within yourself . Do n't just think , oh well , I 'm compassionate , or I 'm not compassionate , and sort of think you 're stuck there . You can develop this . You can diminish the non-compassion , the cruelty , the callousness , the neglect of others . Take universal responsibility
  for them , and then , not only will God smile and the eternal mama will smile , but Karen Armstrong will smile . Thank you very much . 
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/opennlp-sandbox/blob/1f97041b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/171ted_rory_sutherland_life_lessons_from_an_ad_man.txt
----------------------------------------------------------------------
diff --git a/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/171ted_rory_sutherland_life_lessons_from_an_ad_man.txt b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/171ted_rory_sutherland_life_lessons_from_an_ad_man.txt
new file mode 100644
index 0000000..6d0bcf8
--- /dev/null
+++ b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/171ted_rory_sutherland_life_lessons_from_an_ad_man.txt
@@ -0,0 +1,2 @@
+
+This is my first time at TED . Normally , as an advertising man , I actually speak at TED Evil , which is TED 's secret sister organization -- the one that pays all the bills . It 's held every two years in Burma . And I particularly remember a really good speech by Kim Jong Il on how to get teens smoking again . ( Laughter ) But , actually , it 's suddenly come to me after years working in the business , that what we create in advertising , which is intangible value -- you might call it perceived value , you might call it badge value , subjective value , intangible value of some kind -- gets rather a bad rap . If you think about it , if you want to live in a world in the future where there are fewer material goods , you basically have two choices . You can either live in a world which is poorer , which people in general do n't like . Or you can live in a world where actually intangible value constitutes a greater part of overall value , that actually intangible value , in many ways
  is a very , very fine substitute for using up labor or limited resources in the creation of things . Here is one example . This is a train which goes from London to Paris . The question was given to a bunch of engineers , about 15 years ago , " How do we make the journey to Paris better ? " And they came up with a very good engineering solution , which was to spend six billion pounds building completely new tracks from London to the coast , and knocking about 40 minutes off a three-and-half-hour journey time . Now , call me Mister Picky . I 'm just an ad man ... ... but it strikes me as a slightly unimaginative way of improving a train journey merely to make it shorter . Now what is the hedonic opportunity cost on spending six billion pounds on those railway tracks ? Here is my naive advertising man 's suggestion . What you should in fact do is employ all of the world 's top male and female supermodels , pay them to walk the length of the train , handing out free Chateau Petrus for
  the entire duration of the journey . ( Laughter ) ( Applause ) Now , you 'll still have about three billion pounds left in change , and people will ask for the trains to be slowed down . ( Laughter ) Now , here is another naive advertising man 's question again . And this shows that engineers , medical people , scientific people , have an obsession with solving the problems of reality , when actually most problems , once you reach a basic level of wealth in society , most problems are actually problems of perception . So I 'll ask you another question . What on earth is wrong with placebos ? The seem fantastic to me . They cost very little to develop . They work extraordinarily well . They have no side effects , or if they do , they 're imaginary , so you can safely ignore them . ( Laughter ) So I was discussing this . And I actually went to the Marginal Revolution blog by Tyler Cowen . I do n't know if anybody knows it . Someone was actually suggesting that you can take this conce
 pt further , and actually produce placebo education . The point is that education does n't actually work by teaching you things . It actually works by giving you the impression that you 've had a very good education , which gives you an insane sense of unwarranted self confidence , which then makes you very , very successful in later life . So , welcome to Oxford , ladies and gentlemen . ( Laughter ) ( Applause ) But , actually , the point of placebo education is interesting . How many problems of life can be solved actually by tinkering with perception , rather than that tedious , hardworking and messy business of actually trying to change reality ? Here 's a great example from history . I 've heard this attributed to several other kings , but doing a bit of historical research it seems to be Fredrick the Great . Fredrick the Great of Prussia was very very keen for the Germans to adopt the potato , and to eat it . Because he realized that if you had two sources of carbohydrate , wh
 eat and potatoes , you get less price volatility in bread . And you get a far lower risk of famine , because you actually had two crops to fall back on , not one . The only problem is : potatoes , if you think about it , look pretty disgusting . And also , 18th century Prussians ate very , very few vegetables -- rather like contemporary Scottish people . ( Laughter ) So , actually , he tried making it compulsory . The Prussian peasantry said , " We ca n't even get the dogs to eat these damn things . They are absolutely disgusting and they 're good for nothing . " There are even records of people being executed for refusing to grow potatoes . So he tried plan B. He tried the marketing solution , which is he declared the potato as a royal vegetable . And none but the royal family could consume it . And he planted it in a royal potato patch , with guards who had instructions to guard over it , night and day , but with secret instructions not to guard it very well . ( Laughter ) Now 18t
 h century peasants know that there is one pretty safe rule in life , which is if something is worth guarding , it 's worth stealing . Before long , there was a massive underground potato-growing operation in Germany . What he 'd effectively done is he 'd re-branded the potato . It was an absolute masterpiece . I told this story and a gentleman from Turkey came up to me and said , " Very , very good marketer , Fredrick the Great . But not a patch on Ataturk . " Ataturk , rather like Nicolas Sarkozy , was very keen to discourage the wearing of a veil , in Turkey , to modernize it . Now , boring people would have just simply banned the veil . But that would have ended up with a lot of awful kickback and a hell of a lot of resistance . Ataturk was a lateral thinker . He made it compulsory for prostitutes to wear the veil . ( Laughter ) ( Applause ) I ca n't verify that fully . But it does not matter . There is your environmental problem solved , by the way , guys : All convicted child m
 olesters have to drive a Porsche Cayenne . ( Laughter ) What Ataturk realized actually is two very fundamental things . Which is that , actually , first one , all value is actually relative . All value is perceived value . For those of you who do n't speak Spanish , jugo de naranja -- it 's actually the Spanish for " orange juice . " Because actually it 's not the dollar . It 's actually the peso in Buenos Aires . Very clever Buenos Aires street vendors decided to practice price discrimination to the detriment to any passing gringo tourists . As an advertising man , I have to admire that . But the first thing this all shows is that all value is subjective . Second point is that persuasion is often better than compulsion . These funny signs that flash your speed at you , some of the new ones , on the bottom right , now actually show a smiley face or a frowny face , to act as an emotional trigger . What 's fascinating about these signs is they cost about 10 percent of the running cost
  of a conventional speed camera . But they prevent twice as many accidents . So , the bizarre thing which is baffling to conventional , classically trained economists , is that a weird little smiley face has a better effect on changing your behavior than the threat of a £60 fine and three penalty points . Tiny little behavioral economics detail : in Italy , penalty points go backwards . You start with 12 and they take them away . Because the found that loss aversion is a more powerful influence on people 's behavior . In Britain we tend to feel , " Whoa ! Got another three ! " Not so in Italy . Another fantastic case of creating intangible value to replace actual or material value , which remember , is what , after all , the environmental movement needs to be about : This , again , is from Prussia , from , I think , about 1812 , 1813. The wealthy Prussians , to help in war against the French , were encouraged to give in all their jewelry . And it was replaced with replica jewelry m
 ade of cast iron . Here 's one : " Gold gab ich für Eisen , 1813. " The interesting thing is that for 50 years hence , the highest status jewelry you could wear in Prussia was n't made of gold or diamonds . It was made of cast iron . Because actually , never mind the actual intrinsic value of having gold jewelry . This actually had symbolic value , badge value . It said that your family had made a great sacrifice in the past . So , the modern equivalent would of course be this . ( Laughter ) But , actually , there is a thing , just as there are Veblen goods , where the value of the good depends on it being expensive and rare -- there are opposite kind of things where actually the value in them depends on them being ubiquitous , classless and minimalistic . If you think about it , Shakerism was a proto-environmental movement . Adam Smith talks about 18th century America where the prohibition against visible displays of wealth was so great , it was almost a block in the economy in Ne
 w England , because even wealthy farmers could find nothing to spend their money on , without incurring the displeasure of their neighbors . It 's perfectly possible to create these social pressures which lead to more egalitarian societies . What 's also interesting , if you look at products that have a high component of what you might call messaging value , a high component of intangible value , versus their intrinsic value : They are often quite egalitarian . In terms of dress , denim is perhaps the perfect example of something which replaces material value with symbolic value . Coca-Cola . A bunch of you may be a load of pinkos , and you may not like the Coca-Cola company . But it 's worth remembering Andy Warhol 's point about Coke . What Warhol said about Coke is , he said , " What I really like about Coca-Cola is the president of the United States ca n't get a better Coke than the bum on the corner of the street . " Now , that is , actually , when you think about it , we take 
 it for granted -- it 's actually a remarkable achievement , to produce something that 's so democratic . Now , we basically have to change our views slightly . There is a basic view that real value involves making things , involves labor . It involves engineering . It involves limited raw materials . And that what we add on top is kind of false . It 's a fake version . And there is a reason for some suspicion and uncertainly about it . It patently veers toward propaganda . However , what we do have now is a much more variegated media ecosystem in which to kind of create this kind of value . And it 's much fairer . When I grew up , this was basically the media environment of my childhood as translated into food . You had a monopoly supplier . On the left , you have Rupert Murdoch , or the BBC . ( Laughter ) And on your right you have a dependent public which is pathetically grateful for anything you give it . ( Laughter ) Nowadays , the user is actually involved . This is actually wh
 at 's called , in the digital world , " user-generated content . " Although it 's called agriculture , in the world of food . ( Laughter ) This is actually called a mash-up , where you take content that someone else has produced and you do something new with it . In the world of food we call it cooking . This is food 2.0 , which is food you produce for the purpose of sharing it with other people . This is mobile food . British are very good at that . Fish and chips in newspaper , the Cornish Pastie , the pie , the sandwich . We invented the whole lot of them . We 're not very good at food in general . Italians do great food , but it 's not very portable , generally . ( Laughter ) I only learned this the other day . The Earl of Sandwich did n't invent the sandwich . He actually invented the toasty . But then , the Earl of Toasty would be a ridiculous name . ( Laughter ) Finally , we have contextual communication . Now , the reason I show you Pernod -- it 's only one example . Every c
 ountry has a contextual alcoholic drink . In France it 's Pernod . It tastes great within the borders of that country . But absolute shite if you take it anywhere else . ( Laughter ) Unicum in Hungary , for example . The Greeks have actually managed to produce something called Retsina , which even tastes shite when you 're in Greece . ( Laughter ) But so much communication now is contextual that the capacity for actually nudging people , for giving them better information -- B. J. Fogg , at the University of Stanford , makes the point that actually the mobile phone is -- He 's invented the phrase , " persuasive technologies . " He believes the mobile phone , by being location-specific , contextual , timely and immediate , is simply the greatest persuasive technology device ever invented . Now , if we have all these tools at our disposal , we simply have to ask the question , and Thaler and Sunstein have , of how we can use these more intelligently . I 'll give you one example . If y
 ou had a large red button of this kind , on the wall of your home , and every time you pressed it it saved 50 dollars for you , put 50 dollars into your pension , you would save a lot more . The reason is that the interface fundamentally determines the behavior . Okay ? Now , marketing has done a very very good job of creating opportunities for impulse buying . Yet we 've never created the opportunity for impulse saving . If you did this , more people would save more . It 's simply a question of changing the interface by which people make decisions . And the very nature of the decisions changes . Obviously , I do n't want people to do this , because as an advertising man I tend to regard saving as just consumerism needlessly postponed . ( Laughter ) But if anybody did want to do that , that 's the kind of thing we need to be thinking about , actually : fundamental opportunities to change human behavior . Now , I 've got an example here from Canada . There was a young intern at Ogilv
 y Canada called Hunter Somerville , who was working in improv in Toronto , and got a part-time job in advertising , and was given the job of advertising Shreddies . Now this is the most perfect case of creating intangible added value , without changing the product in the slightest . Shreddies is a strange , square , whole-grain cereal , only available in New Zealand , Canada and Britain . It 's Kraft 's peculiar way of rewarding loyalty to the crown . ( Laughter ) In working out how you could relaunch Shreddies , he came up with this . Video : ( Buzzer ) Man : Shreddies is supposed to be square . ( Laughter ) Woman : Have any of these diamond shapes gone out ? ( Laughter ) Voiceover : New Diamond Shreddies cereal . Same 100 percent whole-grain wheat in a delicious diamond shape . ( Applause ) Rory Sutherland : I 'm not sure this is n't the most perfect example of intangible value creation . All it requires is photons , neurons , and a great idea to create this thing . I would say it
  's a work of genius . But , naturally , you ca n't do this kind of thing without a little bit of market research . Man : So , Shreddies is actually producing a new product , which is something very exciting for them . So they are introducing new Diamond Shreddies . ( Laughter ) So I just want to get your first impressions when you see that , when you see the Diamond Shreddies box there . ( Laughter ) Woman : Were n't they square ? Woman #2 : I 'm a little bit confused . Woman #3 : They look like the squares to me . Man : They -- Yeah , it 's all in the appearance . But it 's kind of like flipping a six or a nine like a six . If you flip it over it looks like a nine . But a six is very different from a nine . Woman # 3 : Or an " M " and a " W " . Man : An " M " and a " W " , exactly . Man #2 : [ unclear ] You just looked like you turned it on its end . But when you see it like that it 's more interesting looking . Man : Just try both of them . Take a square one there , first . ( Lau
 ghter ) Man : Which one did you prefer ? Man #2 : The first one . Man : The first one ? ( Laughter ) Rory Sutherland : Now , naturally , a debate raged . There were conservative elements in Canada , unsurprisingly , who actually resented this intrusion . So , eventually , the manufacturers actually arrived at a compromise , which was the combo pack . ( Laughter ) ( Applause ) ( Laughter ) If you think it 's funny , bear in mind there is an organization called the American Institute of Wine Economics , which actually does extensive research into perception of things , and discovers that except for among perhaps five or ten percent of the most knowledgeable people , there is no correlation between quality and enjoyment in wine , except when you tell the people how expensive it is , in which case they tend to enjoy the more expensive stuff more . So drink your wine blind in the future . But this is both hysterically funny -- but I think an important philosophical point , which is , goi
 ng forward , we need more of this kind of value . We need to spend more time appreciating what already exists , and less time agonizing over what else we can do . Two quotations to more or less end with . One of them is , " Poetry is when you make new things familiar and familiar things new . " Which is n't a bad definition of what our job is , to help people appreciate what is unfamiliar , but also to gain a greater appreciation , and place a far higher value on those things which are already existing . There is some evidence , by the way , that things like social networking help do that . Because they help people share news . They give badge value to everyday little trivial activities . So they actually reduce the need for actually spending great money on display , and increase the kind of third-party enjoyment you can get from the smallest , simplest things in life . Which is magic . The second one is the second G. K. Chesterton quote of this session , which is , " We are perishi
 ng for want of wonder , not for want of wonders , " which I think for anybody involved in technology , is perfectly true . And a final thing : When you place a value on things like health , love , sex and other things , and learn to place a material value on what you 've previously discounted for being merely intangible , a thing not seen , you realize you 're much much wealthier than you ever imagined . Thank you very much indeed . ( Applause ) 
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/opennlp-sandbox/blob/1f97041b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/172ted_sean_gourley_on_the_mathematics_of_war.txt
----------------------------------------------------------------------
diff --git a/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/172ted_sean_gourley_on_the_mathematics_of_war.txt b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/172ted_sean_gourley_on_the_mathematics_of_war.txt
new file mode 100644
index 0000000..7d554ee
--- /dev/null
+++ b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/172ted_sean_gourley_on_the_mathematics_of_war.txt
@@ -0,0 +1,2 @@
+
+We look around the media , as we see on the news from Iraq , Afghanistan , Sierra Leone , and the conflict seems incomprehensible to us . And that 's certainly how it seemed to me when I started this project . But as a physicist , I thought , well if you give me some data , I could maybe understand this . You know , give us a go . So as a naive New Zealander I thought , well I 'll go to the Pentagon . Can you get me some information ? ( Laughter ) No. So I had to think a little harder . And I was watching the news one night in Oxford . And I looked down at the chattering heads on my channel of choice . And I saw that there was information there . There was data within the streams of news that we consume . All this noise around us actually has information . So what I started thinking was , perhaps there is something like open source intelligence here . If we can get enough of these streams of information together we can perhaps start to understand the war . So this is exactly what I 
 did . We started bringing a team together , an interdisciplinary team of scientists , of economists , mathematicians . We brought these guys together and we started to try and solve this . We did it in three steps . The first step we did was to collect . We did 130 different sources of information -- from NGO reports to newspapers and cable news . We brought this raw data in and we filtered it . We extracted the key bits of information to build the database . That database contained the timing of attacks , the location , the size and the weapons used . It 's all in the streams of information we consume daily , we just have to know how to pull it out . And once we had this we could start doing some cool stuff . What if we were to look at the distribution of the sizes of attacks ? What would that tell us ? So we started doing this . And you can see here on the horizontal axis you 've got the number of people killed in an attack or the size of the attack . And on the vertical axis you 
 've got the number of attacks . So we plot data for sample on this . You see some sort of random distribution -- perhaps 67 attacks , one person was killed , or 47 attacks where seven people were killed . We did this exact same thing for Iraq . And we did n't know , for Iraq what we were going to find . It turns out what we found was pretty surprising . You take all of the conflict , all of the chaos , all of the noise , and out of that comes this precise mathematical distribution of the way attacks are ordered in this conflict . This blew our mind . Why should a conflict like Iraq have this as its fundamental signature ? Why should there be order in war ? We did n't really understand that . We thought maybe there is something special about Iraq . So we looked at a few more conflicts . We looked at Colombia , we looked at Afghanistan , and we looked at Senegal . And the same pattern emerged in each conflict . This was n't supposed to happen . These are different wars , with differen
 t religious factions , different political factions , and different socioeconomic problems . And yet the fundamental patterns underlying them are the same . So we went a little wider . We looked around the world at all the data we could get our hands on . From Peru to Indonesia , we studied this same pattern again . And we found that not only were the distributions these straight lines , but the slope of these lines , they clustered around this value of Alpha equals 2.5 . And we could generate an equation that could predict the likelihood of an attack . What we 're saying here is the probability of an attack killing X number of people in a country like Iraq , is equal to a constant , times the size of that attack , raised to the power of negative Alpha . And negative Alpha is the slope of that line I showed you before . So what ? This is data , statistics . What does it tell us about these conflicts ? That was a challenge we had to face as physicists . How do we explain this ? And w
 hat we really found was that Alpha if we really think about it , is the organizational structure of the insurgency . Alpha is the distribution of the sizes of attacks , which is really the distribution of the group strength carrying out the attacks . So we look at a process of group dynamics -- coalescence and fragmentation . Groups coming together . Groups breaking apart . And we start running the numbers on this . Can we simulate it ? Can we create the kind of patterns that we 're seeing in places like Iraq ? Turns out we kind of do a reasonable job . We can run these simulations . We can recreate this using a process of group dynamics to explain the patterns that we see all around the conflicts around the world . So what 's going on ? Why should these different -- seemingly different conflicts have the same patterns ? Now what I believe is going on is that the insurgent forces , they evolve over time . They adapt . And it turns out there is only one solution to fight a much stron
 ger enemy . And if you do n't find that solution as an insurgent force , you do n't exist . So every insurgent force that is ongoing , every conflict that is ongoing , it 's going to look something like this . And that is what we think is happening . Taking it forward , how do we change it ? How do we end a war like Iraq ? What does it look like ? Alpha is the structure . It 's got a stable state at 2.5 . This is what wars look like when they continue . We 've got to change that . We can push it up . The forces become more fragmented . There is more of them , but they are weaker . Or we push it down . They 're more robust . There is less groups . But perhaps you can sit and talk to them . So this graph here , I 'm going to show you now . No one has seen this before . This is literally stuff that we 've come through last week . And we see the evolution of Alpha through time . We see it start . And we see it grow up to the stable state the wars around the world look like . And it stay
 s there through the invasion of Fallujah until the Samarra bombings in the Iraqi elections of '06 . And the system gets perturbed . It moves upwards to a fragmented state . This is when the surge happens . And depending on who you ask , the surge was supposed to push it up even further . The opposite happened . The groups became stronger . They became more robust . And so I 'm thinking , right , great , it 's going to keep going down . We can talk to them . We can get a solution . The opposite happened . It 's moved up again . The groups are more fragmented . And this tells me one of two things . Either we 're back where we started , and the surge has had no effect . Or finally the groups have been fragmented to the extent that we can start to think about maybe moving out . I do n't know what the answer is to that . But I know that we should be looking at the structure of the insurgency to answer that question . Thank you . ( Applause ) 
\ No newline at end of file
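
Note on the transcript above: the speaker describes the probability of an attack killing X people as a constant times X raised to the power of negative Alpha, with Alpha clustering around 2.5, and says negative Alpha is the slope of the straight line on his plot. A minimal Python sketch of that relation follows. It assumes the straight line is drawn on log-log axes (the talk does not say so explicitly), sets the constant to 1.0 purely for illustration, and uses hypothetical function names that are not part of opennlp-similarity or of the talk.

```python
import math

ALPHA = 2.5  # value the talk says the slopes cluster around


def attack_probability(size, alpha=ALPHA, constant=1.0):
    """Return constant * size ** (-alpha), the power-law form described in the talk.

    The default constant of 1.0 is an illustrative assumption, not a fitted value.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    return constant * size ** (-alpha)


def log_log_slope(size_a, size_b, alpha=ALPHA, constant=1.0):
    """Slope of the line through two points of the power law on log-log axes."""
    log_p_a = math.log(attack_probability(size_a, alpha, constant))
    log_p_b = math.log(attack_probability(size_b, alpha, constant))
    return (log_p_b - log_p_a) / (math.log(size_b) - math.log(size_a))
```

Under these assumptions, log_log_slope returns negative Alpha for any two distinct sizes, which matches the speaker's remark that negative Alpha is the slope of the line.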

http://git-wip-us.apache.org/repos/asf/opennlp-sandbox/blob/1f97041b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/173ted_siegfried_woldhek_shows_how_he_found_the_true_face_of_leonardo.txt
----------------------------------------------------------------------
diff --git a/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/173ted_siegfried_woldhek_shows_how_he_found_the_true_face_of_leonardo.txt b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/173ted_siegfried_woldhek_shows_how_he_found_the_true_face_of_leonardo.txt
new file mode 100644
index 0000000..5871249
--- /dev/null
+++ b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/173ted_siegfried_woldhek_shows_how_he_found_the_true_face_of_leonardo.txt
@@ -0,0 +1,2 @@
+
+Good morning . Let 's look for a minute at the greatest icon of all , Leonardo da Vinci . We 're all familiar with his fantastic work -- his drawings , his paintings , his inventions , his writings . But we do not know his face . Thousands of books have been written about him , but there 's controversy , and it remains , about his looks . Even this well-known portrait is not accepted by many art historians . So what do you think ? Is this the face of Leonardo da Vinci or is n't it ? Let 's find out . Leonardo was a man that drew everything around him . He drew people , anatomy , plants , animals , landscapes , buildings , water , everything . But no faces ? I find that hard to believe . His contemporaries made faces , like the ones you see here . En face or three quarters . So surely a passionate drawer like Leonardo must have made self-portraits from time to time . So let 's try to find them . I think that if we were to scan all of his work and look for self-portraits , we would fi
 nd his face looking at us . So I looked at all of his drawings , more than 700 , and looked for male portraits . There are about 120 , you see them here . Which ones of these could be self-portraits ? Well , for that they have to be done as we just saw , en face or three-quarters . So we can eliminate all the profiles . It also has to be sufficiently detailed . So we can also eliminate the ones that are very vague or very stylized . And we know from his contemporaries that Leonardo was a very handsome , even beautiful man . So we can also eliminate the ugly ones or the caricatures . ( Laughter ) And look what happens -- only three candidates remain that fit the bill . And here they are . Yes indeed , the old man is there , as is this famous pen drawing of the Homo Vitruvianus . And lastly , the only portrait of a male that Leonardo painted , " The Musician . " Before we go into these faces , I should explain why I have some right to talk about them . I 've made more than 1,100 portr
 aits myself for newspapers , over the course of 300 -- 30 years , sorry , 30 years only . ( Laughter ) But there are 1,100 , and very few artists have drawn so many faces . So I know a little about drawing and analyzing faces . OK , now let 's look at these three portraits . And hold onto your seats , because if we zoom in on those faces , remark how they have the same broad forehead , the horizontal eyebrows , the long nose , the curved lips and the small , well-developed chin . I could n't believe my eyes when I first saw that . There is no reason why these portraits should look alike . All we did was look for portraits that had the characteristics of a self-portrait , and look , they are very similar . Now , are they made in the right order ? The young man should be made first . And as you see here from the years that they were created , it is indeed the case . They are made in the right order . What was the age of Leonardo at the time ? Does that fit ? Yes it does . He was 33 , 
 38 and 63 when these were made . So we have three pictures , potentially of the same person of the same age as Leonardo at the time . But how do we know it 's him , and not someone else ? Well , we need a reference . And here 's the only picture of Leonardo that 's widely accepted . It 's a statue made by Verrocchio , of David , for which Leonardo posed as a boy of 15. And if we now compare the face of the statue , with the face of the musician , you see the very same features again . The statue is the reference , and it connects the identity of Leonardo to those three faces . Ladies and gentlemen , this story has not yet been published . It 's only proper that you here at TED hear and see it first . The icon of icons finally has a face . Here he is -- Leonardo da Vinci . ( Applause ) 
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/opennlp-sandbox/blob/1f97041b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/174ted_stephen_wolfram_computing_a_theory_of_everything.txt
----------------------------------------------------------------------
diff --git a/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/174ted_stephen_wolfram_computing_a_theory_of_everything.txt b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/174ted_stephen_wolfram_computing_a_theory_of_everything.txt
new file mode 100644
index 0000000..97cf3ef
--- /dev/null
+++ b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/174ted_stephen_wolfram_computing_a_theory_of_everything.txt
@@ -0,0 +1,2 @@
+
+So I want to talk today about an idea . It 's a big idea . Actually , I think it 'll eventually be seen as probably the single biggest idea that 's emerged in the past century . It 's the idea of computation . Now , of course , that idea has brought us all of the computer technology we have today and so on . But there 's actually a lot more to computation than that . It 's really a very deep , very powerful , very fundamental idea , whose effects we 've only just begun to see . Well , I myself have spent the past 30 years of my life working on three large projects that really try to take the idea of computation seriously . So I started off at a young age as a physicist using computers as tools . Then , I started sort of drilling down , thinking about the computations I might want to do , trying to figure out what primitives they could be built up from and how they could be automated as much as possible . Eventually , I created a whole structure based on symbolic programming and so o
 n that let me build Mathematica . And for the past 23 years , at an increasing rate , we 've been pouring more and more ideas and capabilities and so on into Mathematica , and I 'm happy to say that 's led to many good things in R and D and education , lots of other areas . Well , I have to admit , actually , that I also had a very selfish reason for building Mathematica . I wanted to use it myself , a bit like Galileo got to use his telescope 400 years ago . But I wanted to look , not at the astronomical universe , but at the computational universe . So we normally think of programs as being complicated things that we build for very specific purposes . But what about the space of all possible programs ? Here 's a representation of a really simple program . So , if we run this program , this is what we get . Very simple . So let 's try changing the rule for this program a little bit . Now we get another result , still very simple . Try changing it again . You get something a little 
 bit more complicated , but if we keep running this for a while , we find out that , although the pattern we get is very intricate , it has a very regular structure . So the question is : Can anything else happen ? Well , we can do a little experiment . Let 's just do a little mathematical experiment , try and find out . Let 's just run all possible programs of the particular type that we 're looking at . They 're called cellular automata . You can see a lot of diversity in the behavior here . Most of them do very simple things . But if you look along all these different pictures , at rule number 30 , you start to see something interesting going on . So let 's take a closer look at rule number 30 here . So here it is . We 're just following this very simple rule at the bottom here , but we 're getting all this amazing stuff . It 's not at all what we 're used to , and I must say that , when I first saw this , it came as a huge shock to my intuition , and , in fact , to understand it ,
  I eventually had to create a whole new kind of science . ( Laughter ) This science is different , more general , than the mathematics-based science that we 've had for the past 300 or so years . You know , it 's always seemed like a big mystery how nature , seemingly so effortlessly manages to produce so much that seems to us so complex . Well , I think we 've found its secret . It 's just sampling what 's out there in the computational universe and quite often getting things like Rule 30 or like this . And knowing that , starts to explain a lot of long-standing mysteries in science . It also brings up new issues though , like computational irreducibility . I mean , we 're used to having science let us predict things , but something like this is fundamentally irreducible . The only way to find its outcome is , effectively , just to watch it evolve . It 's connected to , what I call , the principle of computational equivalence , which tells us that even incredibly simple systems can
  do computations as sophisticated as anything . It does n't take lots of technology or biological evolution to be able to do arbitrary computation , just something that happens , naturally , all over the place . Things with rules as simple as these can do it . Well , this has deep implications about the limits of science , about predictability and controllability of things like biological processes or economies , about intelligence in the universe , about questions like free will and about creating technology . You know , working on this science for many years , I kept wondering , " What will be its first killer app ? " Well , ever since I was a kid , I 'd been thinking about systematizing knowledge and somehow making it computable . People like Leibniz had wondered about that too 300 years earlier . But I 'd always assumed that to make progress , I 'd essentially have to replicate a whole brain . Well , now I got to thinking : This scientific paradigm of mine suggests something dif
 ferent . And , by the way , I 've now got huge computation capabilities in Mathematica , and I 'm a CEO with some worldly resources to do large , seemingly crazy , projects . So I decided to just try to see how much of the systematic knowledge that 's out there in the world we can make computable . So , it 's been a big , very complex project , which I was not sure was going to work at all . But I 'm happy to say that it 's actually going really well . And last year we were able to release the first website version of Wolfram Alpha . Its purpose is to be a serious knowledge engine that computes answers to questions . So let 's give it a try . Let 's start off with something really easy . Hope for the best . Very good . Okay . So far so good . ( Laughter ) Let 's try something a little bit harder . Let 's say ... Let 's do some mathy thing and with luck it 'll work out the answer and try and tell us some interesting things about related math . We could ask it something about
  the real world . Let 's say -- I do n't know -- What 's the GDP of Spain ? And it should be able to tell us that . Now we could compute something related to this , let 's say the GDP of Spain divided by , I do n't know , the -- hmmm ... let 's say the revenue of Microsoft . ( Laughter ) The idea is that we can sort of just type this in , this kind of question in however we think of it . So let 's try asking a question , like a health related question . So let 's say we have a lab finding that -- you know , we have an LDL level of 140 for a male aged 50. So let 's type that in , and now Wolfram Alpha will go and use available public health data and try to figure out what part of the population that corresponds to and so on . Or let 's try asking about , I do n't know , the international space station . And what 's happening here is that Wolfram Alpha is not just looking up something ; it 's computing , in real time , where the international space station is right now , at this momen
 t , how fast it 's going and so on . So Wolfram Alpha knows about lots and lots of kinds of things . It 's got by now , pretty good coverage of everything you might find in a standard reference library and so on . But the goal is to go much further and , very broadly , to democratize all of this kind of knowledge , and to try and be an authoritative source in all areas , to be able to compute answers to specific questions that people have , not by searching what other people may have written down before , but by using built in knowledge to compute fresh new answers to specific questions . Now , of course , Wolfram Alpha is a monumentally huge , long term project with lots and lots of challenges . For a start , one has to curate a zillion different sources of facts and data , and we built quite a pipeline of Mathematica automation and human domain experts for doing this . But that 's just the beginning . Given raw facts or data to actually answer questions , one has to compute , one h
 as to implement all those methods and models and algorithms and so on that science and other areas have built up over the centuries . Well , even starting from Mathematica , this is still a huge amount of work . So far , there are about 8 million lines of Mathematica code in Wolfram Alpha built by experts from many , many different fields . Well , a crucial idea of Wolfram Alpha is that you can just ask it questions using ordinary human language , which means that we 've got to be able to take all those strange utterances that people type into the input field and understand them . And I must say that I thought that step might just be plain impossible . Two big things happened . First , a bunch of new ideas about linguistics that came from studying the computational universe . And second , the realization that having actual computable knowledge completely changes how one can set about understanding language . And , of course , now with Wolfram Alpha actually out in the wild , we can 
 learn from its actual usage . And , in fact , there 's been an interesting coevolution that 's been going on between Wolfram Alpha and its human users . And it 's really encouraging . Right now , if we look at web queries , more than 80 percent of them get handled successfully the first time . And if you look at things like the iPhone app , the fraction is considerably larger . So , I 'm pretty pleased with it all . But , in many ways , we 're still at the very beginning with Wolfram Alpha . I mean , everything is scaling up very nicely . We 're getting more confident . You can expect to see Wolfram Alpha technology showing up in more and more places , working both with this kind of public data , like on the website , and with private knowledge for people and companies and so on . You know , I 've realized that Wolfram Alpha actually gives one a sort of whole new kind of computing that one can call knowledge-based computing , in which one 's starting , not just from raw computation 
 , but from a vast amount of built-in knowledge . And when one does that , one really changes the economics of delivering computational things , whether it 's on the web or elsewhere . You know , we have a fairly interesting situation right now . On the one hand , we have Mathematica , with its sort of precise , formal language and a huge network of carefully designed capabilities able to get a lot done in just a few lines . Let me show you a couple of examples here . So here 's a trivial piece of Mathematica programming . Here 's something where we 're sort of integrating a bunch of different capabilities here . Here we 'll just create in this line a little user interface that allows us to do something fun there . If you go on , that 's a slightly more complicated program that 's now doing all sorts of algorithmic things and creating user interface and so on . But it 's something that 's very precise stuff . It 's a precise specification with a precise formal language that causes Ma
 thematica to know what to do here . Well , then on the other hand , we have Wolfram Alpha , with all the sort of messiness of the world and human language and so on built into it . So what happens when you put these things together ? I think it 's actually rather wonderful . With Wolfram Alpha inside Mathematica , you can , for example , make precise programs that call on real-world data . Here 's a really simple example . You can also just sort of give vague input and then try and have Wolfram Alpha figure out what you 're talking about . Let 's try this here . But actually I think sort of the most exciting thing about this is that it really gives one the chance to democratize programming . I mean , anyone will be able to just sort of say what they want in plain language , then , the idea is , that Wolfram Alpha will be able to figure out what precise pieces of code can do what they 're asking for and then show them examples that will let them pick what they need to build up bigger
  and bigger , precise programs . So , sometimes , Wolfram Alpha will be able to do the whole thing immediately and just give back a whole big program that you can then compute with . So here 's a big website where we 've been collecting lots of educational and other demonstrations about lots of kinds of things . So , I do n't know , I 'll show you one example , maybe here . This is just an example of one of these computable documents . This is probably a fairly small piece of Mathematica code that 's able to be run here . Okay . Let 's zoom out again . So , given our new kind of science , is there a general way to use it to make technology ? So , with physical materials , we 're used to kind of going around the world and discovering that particular materials are useful for particular technological purposes and so on . Well , it turns out , we can do very much the same kind of thing in the computational universe . There 's an inexhaustible supply of programs out there . The challenge
  is to see how to harness them for human purposes . Something like Rule 30 , for example , turns out to be a really good randomness generator . Other simple programs are good models for processes in the natural or social world . And , for example , Wolfram Alpha and Mathematica are actually now full of algorithms that we discovered by searching the computational universe . And , for example , this -- we go back here -- This has become surprisingly popular among composers finding musical forms by searching the computational universe . In a sense , we can use the computational universe to get mass customized creativity . I 'm hoping we can , for example , use that even to get Wolfram Alpha to routinely sort of do invention and discovery on the fly and to find all sorts of wonderful stuff that no engineer and no process of incremental evolution would ever come up with . Well , so , that leads to sort of an ultimate question . Could it be that someplace out there in the computational un
 iverse we might find our physical universe ? Perhaps there 's even some quite simple rule , some simple program for our universe . Well , the history of physics would have us believe that the rule for the universe must be pretty complicated . But in the computational universe we 've now seen how rules that are incredibly simple can produce incredibly rich and complex behavior . So could that be what 's going on with our whole universe ? If the rules for the universe are simple , it 's kind of inevitable that they have to be very abstract and very low level , operating , for example , far below the level of space or time , which makes it hard to represent things . But in at least a large class of cases , one can think of the universe as being like some kind of network , which , when it gets big enough , behaves like continuous space in much the same way as having lots of molecules can behave like a continuous fluid . Well , then the universe has to evolve by applying little rules tha
 t progressively update this network . And each possible rule , in a sense , corresponds to a candidate universe . Actually , I have n't shown these before , but here are a few of the candidate universes that I 've looked at . Some of these are hopeless universes , completely sterile , with other kinds of pathologies like no notion of space , no notion of time , no matter , other problems like that . But the exciting thing that I 've found in the last few years is that you actually do n't have to go very far in the computational universe before you start finding candidate universes that are n't obviously not our universe . Here 's the problem : Any serious candidate for our universe , is inevitably full of computational irreducibility , which means that it is irreducibly difficult to find out how it will really behave , and whether it matches our physical universe . A few years ago , I was pretty excited to discover that there are candidate universes with incredibly simple rules that
  successfully reproduce special relativity and even general relativity and gravitation and at least give hints of quantum mechanics . So , will we find the whole of physics ? I do n't know for sure . But I think at this point it 's sort of almost embarrassing not to at least try . Not an easy project . One has got to build a lot of technology . One 's got to build a structure that 's probably at least as deep as existing physics . And I 'm not sure what the best way to organize the whole thing is . Build a team , open it up , offer prizes and so on . But I 'll tell you here today that I 'm committed to seeing this project done , to see if , within this decade , we can finally hold in our hands the rule for our universe and know where our universe lies in the space of all possible universes -- and be able to type into Wolfram Alpha " the theory of the universe , " and have it tell us . ( Laughter ) So I 've been working on the idea of computation now for more than 30 years , building
  tools and methods and turning sort of intellectual ideas into millions of lines of code and grist for server farms and so on . With every passing year , I realize how much more powerful the idea of computation really is . It 's taken us a long way already , but there 's so much more to come . From the foundations of science to the limits of technology to the very definition of the human condition , I think computation is destined to be the defining idea of our future . Thank you . ( Applause ) Chris Anderson : That was astonishing . Stay here . I 've got a question . ( Applause ) So , that was , fair to say , an astonishing talk . Are you able to say in a sentence or two how this type of thinking could integrate at some point to things like string theory or the kind of things that people think of as the fundamental explanations of the universe ? Stephen Wolfram : Well , the parts of physics that we kind of know to be true , things like the standard model of physics . What I 'm tryi
 ng to do had better reproduce the standard model of physics or it 's simply wrong . The things that people have tried to do in the last 25 years or so with string theory and so on have been an interesting exploration that has tried to get back to the standard model , but has n't quite gotten there . My guess is that some great simplifications of what I 'm doing may actually have considerable resonance with what 's been done in string theory , but that 's a complicated math thing that I do n't yet know how it 's going to work out . CA : Benoit Mandelbrot is in the audience . He has also shown how complexity can arise from a simple start . Does your work relate to his ? SW : I think so . I view Benoit Mandelbrot 's work as kind of one of the founding contributions to this kind of area . Benoit has been particularly interested in nested patterns , in fractals and so on , where the structure is something that 's kind of tree-like , and where there 's sort of a big branch that makes little b
 ranches , and even smaller branches and so on . That 's kind of one of the ways that you get towards true complexity . I think things like the Rule 30 cellular automaton get us to a different level . In fact , in a very precise way they get us to a different level because they seem to be things that are capable of complexity that 's sort of as great as complexity can ever get ... I could go on about this at great length , but I wo n't . CA : Stephen Wolfram , thank you . ( Applause ) 
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/opennlp-sandbox/blob/1f97041b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/175ted_tom_wujec_build_a_tower.txt
----------------------------------------------------------------------
diff --git a/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/175ted_tom_wujec_build_a_tower.txt b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/175ted_tom_wujec_build_a_tower.txt
new file mode 100644
index 0000000..ad09734
--- /dev/null
+++ b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/175ted_tom_wujec_build_a_tower.txt
@@ -0,0 +1,2 @@
+
+Several years ago , here at TED , Peter Skillman introduced a design challenge called the marshmallow challenge . And the idea 's pretty simple . Teams of four have to build the tallest free-standing structure out of 20 sticks of spaghetti , one yard of tape , one yard of string and a marshmallow . The marshmallow has to be on top . And , though it seems really simple , it 's actually pretty hard , because it forces people to collaborate very quickly . And so I thought that this was an interesting idea , and I incorporated it into a design workshop . And it was a huge success . And since then , I 've conducted about 70 design workshops across the world with students and designers and architects , even the CTOs of the Fortune 50 , and there 's something about this exercise that reveals very deep lessons about the nature of collaboration , and I 'd like to share some of them with you . So , normally , most people begin by orienting themselves to the task . They talk about it , they fi
 gure out what it 's going to look like , they jockey for power , then they spend some time planning , organizing . They sketch and they lay out spaghetti . They spend the majority of their time assembling the sticks into ever-growing structures and then , finally , just as they 're running out of time , someone takes out the marshmallow , and then they gingerly put it on top , and then they stand back , and Ta-da ! they admire their work . But what really happens , most of the time , is that the " ta-da " turns into an " uh-oh , " because the weight of the marshmallow causes the entire structure to buckle and to collapse . So there are a number of people who have a lot more " uh-oh " moments than others , and among the worst are recent graduates of business school . ( Laughter ) They lie , they cheat , they get distracted , and they produce really lame structures . And of course there are teams that have a lot more " ta-da " structures , and , among the best , are recent graduates of 
 kindergarten . ( Laughter ) And it 's pretty amazing . As Peter tells us , not only do they produce the tallest structures , but they 're the most interesting structures of them all . So the question you want to ask is : How come ? Why ? What is it about them ? And Peter likes to say that , " None of the kids spend any time trying to be CEO of Spaghetti Inc. " Right . They do n't spend time jockeying for power . But there 's another reason as well . And the reason is that business students are trained to find the single right plan , right . And then they execute on it . And then what happens is , when they put the marshmallow on the top , they run out of time , and what happens ? It 's a crisis . Sound familiar ? Right . What kindergarteners do differently , is that they start with the marshmallow , and they build prototypes , successive prototypes , always keeping the marshmallow on top , so they have multiple times to fix ill built prototypes along the way . So designers recognize
  this type of collaboration as the essence of the iterative process . And with each version , kids get instant feedback about what works and what does n't work . So the capacity to play in prototype is really essential , but let 's look at how different teams perform . So the average for most people is around 20 inches , business school students , about half of that , lawyers , a little better , but not much better than that , kindergarteners , better than most adults . Who does the very best ? Architects and engineers , thankfully . ( Laughter ) 39 inches is the tallest structure I 've seen . And why is it ? Because they understand triangles and self-reinforcing geometrical patterns are the key to building stable structures . So CEOs , a little bit better than average . But here 's where it gets interesting . If you put an executive admin . on the team , they get significantly better . ( Laughter ) It 's incredible . You know , you look around , you go , " Oh , that team 
 's going to win . " You can just tell beforehand . And why is that ? Because they have special skills of facilitation . They manage the process , they understand the process . And any team who manages and pays close attention to work will significantly improve the team 's performance . Specialized skills and facilitation skills are the combination [ that ] leads to strong success . If you have 10 teams that typically perform , you 'll get maybe six or so that have standing structures . And I tried something interesting . I thought , let 's up the ante once . So I offered a 10,000 dollar prize of software to the winning team . So what do you think happened to these design students ? What was the result ? Here 's what happened . Not one team had a standing structure . If anyone had built , say , a one inch structure , they could have taken home the prize . So , is n't it interesting that high stakes have a strong impact ? We did the exercise again with the same students . What do yo
 u think happened then ? So now they understand the value of prototyping . So the same team went from being the very worst to being among the very best . They produced the tallest structures in the least amount of time . So there 's deep lessons for us about the nature of incentives and success . So , you might ask : Why would anyone actually spend time writing a marshmallow challenge ? And the reason is , I help create digital tools and processes to help teams build cars and video games and visual effects . And what the marshmallow challenge does is it helps them identify the hidden assumptions . Because , frankly , every project has its own marshmallow , does n't it . The challenge provides a shared experience , a common language , common stance to build the right prototype . And so , this is the value of the experience , of this so simple exercise . And those of you who are interested , may want to go to marshmallowchallenge . com . It 's a blog that you can look at how to build t
 he marshmallows . There 's step-by-step instructions on this . There are crazy examples from around the world of how people tweak and adjust the system . There 's world records on this as well . And the fundamental lesson , I believe , is that design truly is a contact sport . It demands that we bring all of our senses to the task , and that we apply the very best of our thinking , our feeling and our doing to the challenge that we have at hand . And , sometimes , a little prototype of this experience is all that it takes to turn us from an " uh-oh " moment to a " ta-da " moment . And that can make a big difference . Thank you very much . ( Applause ) 
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/opennlp-sandbox/blob/1f97041b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/176ted_tom_wujec_on_3_ways_the_brain_creates_meaning.txt
----------------------------------------------------------------------
diff --git a/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/176ted_tom_wujec_on_3_ways_the_brain_creates_meaning.txt b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/176ted_tom_wujec_on_3_ways_the_brain_creates_meaning.txt
new file mode 100644
index 0000000..44a13c7
--- /dev/null
+++ b/opennlp-similarity/src/test/resources/style_recognizer/txt/ted/176ted_tom_wujec_on_3_ways_the_brain_creates_meaning.txt
@@ -0,0 +1,2 @@
+
+Last year at TED we aimed to try to clarify the overwhelming complexity and richness that we experience at the conference in a project called Big Viz . And the Big Viz is a collection of 650 sketches that were made by two visual artists . David Sibbet from The Grove , and Kevin Richards from Autodesk made 650 sketches that strive to capture the essence of each presenter 's ideas . And the consensus was , it really worked . These sketches brought to life the key ideas , the portraits , the magic moments that we all experienced last year . This year we were thinking , " Why does it work ? " What is it about animation , graphics , illustrations , that create meaning ? And this is an important question to ask and answer because the more we understand how the brain creates meaning , the better we can communicate , and I also think , the better we can think and collaborate together . So this year we 're going to visualize how the brain visualizes . Cognitive psychologists now tell us that
  the brain does n't actually see the world as it is , but instead , creates a series of mental models through a collection of " Ah-ha moments , " or moments of discovery , through various processes . The processing , of course , begins with the eyes . Light enters , hits the back of the retina , and is circulated , most of which is streamed to the very back of the brain , at the primary visual cortex . And primary visual cortex sees just simple geometry , just the simplest of shapes . But it also acts like a kind of relay station that re-radiates and redirects information to many other parts of the brain . As many as 30 other parts that selectively make more sense , create more meaning through the kind of " Ah-ha " experiences . We 're only going to talk about three of them . So the first one is called the ventral stream . It 's on this side of the brain . And this is the part of the brain that will recognize what something is . It 's the " what " detector . Look at a hand . Look at
  a remote control . Chair . Book . So that 's the part of the brain that is activated when you give a word to something . A second part of the brain is called the dorsal stream . And what it does is locates the object in physical body space . So if you look around the stage here you 'll create a kind of mental map of the stage . And if you closed your eyes you 'd be able to mentally navigate it . You 'd be activating the dorsal stream if you did that . The third part that I 'd like to talk about is the limbic system . And this is deep inside of the brain . It 's very old , evolutionarily . And it 's the part that feels . It 's the kind of gut center , where you see an image and you go , " Oh ! I have a strong or emotional reaction to whatever I 'm seeing . " So the combination of these processing centers help us make meaning in very different ways . So what can we learn about this ? How can we apply this insight ? Well , again , the schematic view is that the eye visually interrogat
 es what we look at . The brain processes this in parallel , the figments of information asking a whole bunch of questions to create a unified mental model . So , for example , when you look at this image a good graphic invites the eye to dart around , to selectively create a visual logic . So the act of engaging , and looking at the image creates the meaning . It 's the selective logic . Now we 've augmented this and spatialized this information . Many of you may remember the magic wall that we built in conjunction with Perceptive Pixel where we quite literally create an infinite wall . And so we can compare and contrast the big ideas . So the act of engaging and creating interactive imagery enriches meaning . It activates a different part of the brain . And then the limbic system is activated when we see motion , when we see color , and there are primary shapes and pattern detectors that we 've heard about before . So the point of this is what ? We make meaning by seeing , by an ac
 t of visual interrogation . The lessons for us are three-fold . First , use images to clarify what we 're trying to communicate . Secondly make those images interactive so that we engage much more fully . And the third is to augment memory by creating a visual persistence . These are techniques that can be used to be -- that can be applied in a wide range of problem solving . So the low-tech version looks like this . And , by the way , this is the way in which we develop and formulate strategy within Autodesk , in some of our organizations and some of our divisions . What we literally do is have the teams draw out the entire strategic plan on one giant wall . And it 's very powerful because everyone gets to see everything else . There 's always a room , always a place to be able to make sense of all of the components in the strategic plan . This is a time-lapse view of it . You can ask the question , " Who 's the boss ? " You 'll be able to figure that out . So the act of collective
 ly and collaboratively building the image transforms the collaboration . No PowerPoint is used in two days . But instead the entire team creates a shared mental model that they can all agree on and move forward on . And this can be enhanced and augmented with some emerging digital technology . And this is our great unveiling for today . And this is an emerging set of technologies that use large-screen displays with intelligent calculation in the background to make the invisible visible . Here what we can do is look at sustainability , quite literally . So a team can actually look at all the key components that heat the structure and make choices and then see the end result that is visualized on this screen . So making images meaningful has three components . The first again , is making ideas clear by visualizing them . Secondly , making them interactive . And then thirdly , making them persistent . And I believe that these three principles can be applied to solving some of the very t
 ough problems that we face in the world today . Thanks so much . ( Applause ) 
\ No newline at end of file