Posted to user@nutch.apache.org by "Shah, Nishant" <ni...@amazon.com> on 2013/05/25 04:11:32 UTC

Garbage values when stored in MySql

Hi everyone,

I followed an online tutorial that helped me set up Nutch 2.1 with MySQL. I used blobs as the storage mechanism for headers, metadata, parseProrocol, etc.
When I try to convert the blobs to strings, I get garbage values like '? 0002(In a square)' in the middle of the text. This makes extracting meta tags very difficult. I have the parser-meta plugin enabled and the scoring-opic plugin disabled. I want to extract fields such as description, keywords, language, and header information. Any tips would be helpful.
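
To illustrate the symptom, here is a minimal Python sketch of what happens when a blob containing non-text bytes is decoded as a string. The byte values and the helper names (blob_to_text, visible) are made up for illustration and are not taken from my actual table; a character such as U+0002 is typically what gets drawn as a small box with "0002" inside.

```python
# Hypothetical blob contents: these byte values are invented for illustration,
# not copied from an actual Nutch/MySQL table.
blob = b"\x02\x16description\x00\x02My page"


def blob_to_text(data: bytes) -> str:
    """Decode a blob as Latin-1, which maps every byte to exactly one character."""
    return data.decode("latin-1")


def visible(text: str) -> str:
    """Replace non-printable characters with a <U+XXXX> marker so they are easy to spot."""
    return "".join(
        ch if ch.isprintable() else f"<U+{ord(ch):04X}>" for ch in text
    )


text = blob_to_text(blob)
print(visible(text))
```

Printing the decoded text this way at least shows exactly which characters are interleaved with the readable fields.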

Thanks.

Nishant