Posted to commits@stanbol.apache.org by og...@apache.org on 2012/05/11 18:34:40 UTC

svn commit: r1337268 - /incubator/stanbol/trunk/enhancer/topic-web/tools/README.md

Author: ogrisel
Date: Fri May 11 16:34:39 2012
New Revision: 1337268

URL: http://svn.apache.org/viewvc?rev=1337268&view=rev
Log:
STANBOL-197: more documentation on building your own classification model from DBpedia dumps

Modified:
    incubator/stanbol/trunk/enhancer/topic-web/tools/README.md

Modified: incubator/stanbol/trunk/enhancer/topic-web/tools/README.md
URL: http://svn.apache.org/viewvc/incubator/stanbol/trunk/enhancer/topic-web/tools/README.md?rev=1337268&r1=1337267&r2=1337268&view=diff
==============================================================================
--- incubator/stanbol/trunk/enhancer/topic-web/tools/README.md (original)
+++ incubator/stanbol/trunk/enhancer/topic-web/tools/README.md Fri May 11 16:34:39 2012
@@ -48,9 +48,28 @@ IPTC topics to text documents.
 ## Using DBpedia categories
 
 A subset of Wikipedia / DBpedia categories can be used as a classifier. To
-extract such a taxonomy of topics you can use [dbpediakit][3]:
+extract such a taxonomy of topics you can use [dbpediakit][3] (you
+will need Python and PostgreSQL for this to run):
 
-[3] https://github.com/ogrisel/dbpediakit
+    git clone https://github.com/ogrisel/dbpediakit
+    cd dbpediakit
+
+Create the dbpediakit database on the PostgreSQL server by following the
+instructions in:
+
+    https://github.com/ogrisel/dbpediakit/blob/master/dbpediakit/postgres.py
+
+You can now run the extraction (this downloads the required dumps and loads
+them into PostgreSQL, so it can take a long time):
 
-    python dbpediacategories.py topics.tsv examples.tsv \
+    python examples/topics/build_taxonomy.py --max-depth=2
+
+Back in this folder, import the taxonomy and training set into Stanbol to
+build the classifier model:
+
+    python dbpediacategories.py \
+        /path/to/dbpediakit/dbpedia-taxonomy.tsv \
+        /path/to/dbpediakit/dbpedia-examples.tsv.bz2 \
         http://localhost:8080/topic/model
+
+[3]: https://github.com/ogrisel/dbpediakit
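
The two commands documented in this change can be chained from a small driver script. The following is a minimal sketch and not part of the commit: the function names `build_commands` and `run_pipeline`, the `tools_dir` parameter, and the `python_exe` parameter are hypothetical, and the sketch assumes the extraction writes `dbpedia-taxonomy.tsv` and `dbpedia-examples.tsv.bz2` into the root of the dbpediakit checkout, as the paths in the README suggest. The driver itself uses `subprocess.run` (Python 3.5+); the interpreter needed by the tools themselves may differ.

```python
import subprocess
from pathlib import Path

# Model endpoint taken from the README example.
DEFAULT_MODEL_URL = "http://localhost:8080/topic/model"


def build_commands(dbpediakit_dir, model_url=DEFAULT_MODEL_URL, python_exe="python"):
    """Return the extraction and import commands as argv lists.

    Assumption: the extraction output files sit in the root of the
    dbpediakit checkout, matching the paths shown in the README.
    """
    kit = Path(dbpediakit_dir)
    extract = [python_exe, "examples/topics/build_taxonomy.py", "--max-depth=2"]
    import_cmd = [
        python_exe,
        "dbpediacategories.py",
        str(kit / "dbpedia-taxonomy.tsv"),
        str(kit / "dbpedia-examples.tsv.bz2"),
        model_url,
    ]
    return extract, import_cmd


def run_pipeline(dbpediakit_dir, tools_dir, model_url=DEFAULT_MODEL_URL):
    """Run the extraction in the dbpediakit checkout, then the import
    from the folder that contains dbpediacategories.py (tools_dir)."""
    extract, import_cmd = build_commands(dbpediakit_dir, model_url)
    # check=True stops the pipeline if the long-running extraction fails.
    subprocess.run(extract, cwd=str(dbpediakit_dir), check=True)
    subprocess.run(import_cmd, cwd=str(tools_dir), check=True)
```

Calling `run_pipeline("/path/to/dbpediakit", "/path/to/topic-web/tools")` would run both steps in order; the PostgreSQL database still has to be created beforehand as described in `postgres.py`.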