Posted to users@spamassassin.apache.org by decoder <de...@own-hero.net> on 2006/08/25 23:17:14 UTC

FuzzyOcr 2.3b released, fixes bugs and improves stability

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello,


I just uploaded FuzzyOcr 2.3b to the download site. If you find bugs
or run into problems, please mail back :)

The major changes are:

- - Added a configurable timeout (maximum runtime) for the plugin, to
avoid any lockups/unwanted delays
- - The default matching threshold (set in the config file) can now be
overridden on a per-word basis in the wordlist

    An example, wordlist contains:

    word1
    word2::0
    word3::0.2


    Then word1 is matched with the default threshold set in the config
    file, word2 must be an exact match (threshold 0), and word3 is
    matched with a threshold of 0.2.

    This is especially useful for words which trigger false positives
    very often like: "penis", "money" or "news".

    Note that the tendency to produce a FP is not directly connected
    to the word length.
    The word "buy" produces very few FP compared to "penis", when both
    are being matched with the same threshold.

    The FuzzyOcr.words.sample contains some suggestions for word
    specific thresholds which I recommend.

- - The experimental MD5 database has been replaced by a custom hash
database which is able to match very similar images.

    Often, you get the same image twice, or all your customers get the
    same spam mail. But even though the pictures look the same, they
    are not identical. That is why MD5 was useless. The newly
    introduced hash (self invented) is able to recognize almost
    identical images based on features that I won't explain here as it
    would make it easier for spammers :)
    If a message contains a picture previously registered in the
    database, the original score is reread from the database, the
    message is immediately tagged with this score, and the plugin ends.

- - Some non-alpha->alpha translations are now applied to the gocr
output to fix common misreads, such as "i" being read as ";" or "a"
as "8".

- - There are now 2 scores for broken images. One is used when the
picture is recognized as broken but giffix was able to correct the
errors and produce output that can be scanned. The other is used if
the image is unfixable (either too broken, or interlaced/animated and
broken). The first is set lower than the second (2.5 vs. 5).

- -Various bugfixes
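
To make the per-word threshold syntax from the changelog concrete, here is a minimal sketch in Python (the plugin itself is Perl). It only illustrates the "word::threshold" format. The default value, the comment handling, and the matching rule (edit distance divided by word length) are assumptions for illustration, not the plugin's actual code.

```python
DEFAULT_THRESHOLD = 0.25  # hypothetical default; the real one is set in the config file


def parse_wordlist(lines, default=DEFAULT_THRESHOLD):
    """Map each word to its threshold; entries without '::value' get the default."""
    thresholds = {}
    for raw in lines:
        entry = raw.strip()
        if not entry or entry.startswith("#"):  # skipping comments is an assumption
            continue
        word, sep, value = entry.partition("::")
        thresholds[word.lower()] = float(value) if sep else default
    return thresholds


def edit_distance(a, b):
    """Plain Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def fuzzy_match(word, candidate, threshold):
    """Assumed rule: accept when distance / len(word) does not exceed the threshold."""
    return edit_distance(word, candidate) / len(word) <= threshold


words = parse_wordlist(["word1", "word2::0", "word3::0.2"])
```

Under this assumed rule, a threshold of 0 accepts only exact matches, which is how a word prone to false positives can be pinned down without changing the default for every other word.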

TODO:

- -Write an external program to manage the database (add, remove and
verify given pictures).
- -Rewrite the temp file system to do all external program operations on
files (saves memory).


Another wish: I'd like to create a database to ship with the plugin so
it can be used out of the box, but I do not have many samples here. It
would be nice if you sent samples of common picture spam you receive
to my mail address, with "[picture sample]" in the subject. I will
post here again once I have enough :).


Thanks to Jorge Valdes, Michael Alan Dorman and UxBoD for finding bugs
and sending improvement suggestions for this version

Chris
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFE72jaJQIKXnJyDxURApfeAJ47JcACEeIaYtEA8z6wDdFxGPhrUgCZAZSE
sdWROYeF8IFdbUX0njAdV+o=
=y7XM
-----END PGP SIGNATURE-----


Re: FuzzyOcr 2.3b released, fixes bugs and improves stability

Posted by decoder <de...@own-hero.net>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

John Andersen wrote:
> On Friday 25 August 2006 13:39, decoder wrote:
>> Maybe it would. But this kind of hash is no real "hash". It is
>> just a combination of picture features that I invented... but it
>> seems reliable in my tests so far.
>
> Not sure it matters a whole lot what the actual content is when
> using Razor.  If enough (trusted) people report a message with a
> given text content, it builds a razor confidence level fairly
> quickly.
>
> So what you report could be a simple hex dump of your hash,
> whatever that hash may look like.
>
> I'm betting this could be done with razor without any action on
> their part, (not that I'm recommending going around them).
>
The problem is, this is not a hash that is simply matched against
another hash 1:1.

When comparing two hashes, a small percentage of difference is allowed
on the values for better results. Sometimes it is a 100% match, but it
might also be a 99% match. So matching two hashes is rather complex.
If you know perl, feel free to check out the routines :)
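
The tolerance-based comparison described above could look roughly like the following Python sketch. The plugin's real features and routines are not published in this thread, so the feature tuple, the 1% relative tolerance, and the function names here are all hypothetical.

```python
def similar(hash_a, hash_b, tolerance=0.01):
    """True when every pair of feature values differs by at most `tolerance` (relative)."""
    if len(hash_a) != len(hash_b):
        return False
    for a, b in zip(hash_a, hash_b):
        scale = max(abs(a), abs(b), 1)  # avoid dividing by zero for tiny values
        if abs(a - b) / scale > tolerance:
            return False
    return True


def lookup(db, candidate, tolerance=0.01):
    """Linear scan: return the stored score of the first similar entry, else None."""
    for stored_hash, score in db:
        if similar(stored_hash, candidate, tolerance):
            return score
    return None


# Hypothetical database entry: (feature tuple, score recorded for that image)
db = [((640, 480, 12034, 87), 12.5)]
```

Because a near match has to be tested against each stored entry, a lookup like this cannot be reduced to an exact key comparison, which is the difficulty with reporting such a value to a service that matches hashes 1:1.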


Chris
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFE73FaJQIKXnJyDxURAqW7AJ9yJ+9yPQIYOWQl8xZpT8Mf3q2YygCeLae8
HJZm5YWEk19RuOCGRS0sJ7A=
=Sv3C
-----END PGP SIGNATURE-----


Re: FuzzyOcr 2.3b released, fixes bugs and improves stability

Posted by John Andersen <js...@pen.homeip.net>.
On Friday 25 August 2006 13:39, decoder wrote:
> Maybe it would. But this kind of hash is no real "hash". It is just a
> combination of picture features that I invented... but it seems
> reliable in my tests so far.

Not sure it matters a whole lot what the actual content is when using
Razor.  If enough (trusted) people report a message with a given text
content, it builds a razor confidence level fairly quickly.

So what you report could be a simple hex dump of your hash, whatever
that hash may look like.

I'm betting this could be done with razor without any action on
their part, (not that I'm recommending going around them).

-- 
_____________________________________
John Andersen

Re: FuzzyOcr 2.3b released, fixes bugs and improves stability

Posted by decoder <de...@own-hero.net>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

John Andersen wrote:
> On Friday 25 August 2006 13:17, decoder wrote:
>> Another wish: I'd like to create a database to ship with the
>> plugin so it can be used out of the box but I do not have much
>> samples here, so it would be nice if you sent me picture samples
>> of common picture spam you get with "[picture sample]" in the
>> subject to my mail address. I will post here again if I got
>> enough :).
>
> Wouldn't it be more productive to the community to work with SURBL
>  to enable the centralized storage of these hashes?
>
> Or perhaps with Razor2?
>
> I'm not an expert on Razor, but my limited understanding of it is
> that it generates hashes of (portions of) message bodies and stores
>  that hash for future comparison.
>
> It would seem that once someone decides something is spam, one could
> take your hash and wrap a minimal message around it and report THAT
> to razor.
>
> Then your engine could examine an image, generate your hash, and
> wrap it in the same minimal message and Query Razor.  Presumably
> getting a hit.
>
> No local database is needed, because a world wide one would be
> substituted. That way, if you get this spam and report it, It will
> already be known by the time I get the spam.
>
Maybe it would. But this kind of hash is no real "hash". It is just a
combination of picture features that I invented... but it seems
reliable in my tests so far. Once it has been tested in public, such a
cooperation with SURBL or Razor might be possible


Chris
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFE724mJQIKXnJyDxURAuW6AKClt1V0/faPEJaTwjLRXChXqhtTkwCfc9Yp
UBsuigcaOac6pOZz2EP7Gkk=
=LJEa
-----END PGP SIGNATURE-----


Re: FuzzyOcr 2.3b released, fixes bugs and improves stability

Posted by decoder <de...@own-hero.net>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

John D. Hardin wrote:
> On Fri, 25 Aug 2006, John Andersen wrote:
>
>> On Friday 25 August 2006 13:17, decoder wrote:
>>> Another wish: I'd like to create a database to ship with the
>>> plugin so it can be used out of the box but I do not have much
>>> samples here, so it would be nice if you sent me picture
>>> samples of common picture spam you get with "[picture sample]"
>>> in the subject to my mail address. I will post here again if I
>>> got enough :).
>> Wouldn't it be more productive to the community to work with
>> SURBL to enable the centralized storage of these hashes?
>
> I think he was speaking of word lists.
>
> I agree with the other poster, the best solution would be a way to
> append some extra text to the PerMsgStatus object and have the body
>  rules process that as well as the real message body.
No, I was actually speaking about hashes... Most spam seems recurring
so it might be a good idea to ship the plugin with a prebuilt database.

Just my thoughts... other opinions are welcome...

Chris
>
> -- John Hardin KA7OHZ    ICQ#15735746
> http://www.impsec.org/~jhardin/ jhardin@impsec.org    FALaholic
> #11174    pgpk -a jhardin@impsec.org key: 0xB8732E79 - 2D8C 34F4
> 6411 F507 136C  AF76 D822 E6E6 B873 2E79
> -----------------------------------------------------------------------
>  The fetters imposed on liberty at home have ever been forged out
> of the weapons provided for defense against real, pretended, or
> imaginary dangers from abroad.               -- James Madison, 1799
> 
> -----------------------------------------------------------------------
>  25 days until Talk Like a Pirate day
>

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFE73YTJQIKXnJyDxURAlMbAKCDyuFBb4RYVsG6ICIw8MbqZO/ExwCgl3GN
dGYobKLzcV6OVioMVCTgnno=
=OWVS
-----END PGP SIGNATURE-----


Re: FuzzyOcr 2.3b released, fixes bugs and improves stability

Posted by "John D. Hardin" <jh...@impsec.org>.
On Fri, 25 Aug 2006, John D. Hardin wrote:

> I think he was speaking of word lists.

Sigh. That's what I get for reading and responding in sequence.

--
 John Hardin KA7OHZ    ICQ#15735746    http://www.impsec.org/~jhardin/
 jhardin@impsec.org    FALaholic #11174    pgpk -a jhardin@impsec.org
 key: 0xB8732E79 - 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  The fetters imposed on liberty at home have ever been forged out
  of the weapons provided for defense against real, pretended, or
  imaginary dangers from abroad.               -- James Madison, 1799
-----------------------------------------------------------------------
 25 days until Talk Like a Pirate day


Re: FuzzyOcr 2.3b released, fixes bugs and improves stability

Posted by "John D. Hardin" <jh...@impsec.org>.
On Fri, 25 Aug 2006, John Andersen wrote:

> On Friday 25 August 2006 13:17, decoder wrote:
> > Another wish: I'd like to create a database to ship with the plugin so
> > it can be used out of the box but I do not have much samples here, so
> > it would be nice if you sent me picture samples of common picture spam
> > you get with "[picture sample]" in the subject to my mail address. I
> > will post here again if I got enough :).
> 
> Wouldn't it be more productive to the community to work with SURBL 
> to enable the centralized storage of these hashes?

I think he was speaking of word lists.

I agree with the other poster, the best solution would be a way to
append some extra text to the PerMsgStatus object and have the body
rules process that as well as the real message body.

--
 John Hardin KA7OHZ    ICQ#15735746    http://www.impsec.org/~jhardin/
 jhardin@impsec.org    FALaholic #11174    pgpk -a jhardin@impsec.org
 key: 0xB8732E79 - 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  The fetters imposed on liberty at home have ever been forged out
  of the weapons provided for defense against real, pretended, or
  imaginary dangers from abroad.               -- James Madison, 1799
-----------------------------------------------------------------------
 25 days until Talk Like a Pirate day


Re: FuzzyOcr 2.3b released, fixes bugs and improves stability

Posted by John Andersen <js...@pen.homeip.net>.
On Friday 25 August 2006 13:17, decoder wrote:
> Another wish: I'd like to create a database to ship with the plugin so
> it can be used out of the box but I do not have much samples here, so
> it would be nice if you sent me picture samples of common picture spam
> you get with "[picture sample]" in the subject to my mail address. I
> will post here again if I got enough :).

Wouldn't it be more productive to the community to work with SURBL 
to enable the centralized storage of these hashes?

Or perhaps with Razor2?

I'm not an expert on Razor, but my limited understanding of it is
that it generates hashes of (portions of) message bodies and stores
that hash for future comparison.

It would seem that once someone decides something is spam, one could take your 
hash and wrap a minimal message around it and report THAT to razor.

Then your engine could examine an image, generate your hash, and wrap
it in the same minimal message and Query Razor.  Presumably getting a hit.

No local database is needed, because a world wide one would be substituted.
That way, if you get this spam and report it, It will already be known
by the time I get the spam.

-- 
_____________________________________
John Andersen

Re: FuzzyOcr 2.3b release, broken with SA 3.1.0

Posted by Loren Wilton <lw...@earthlink.net>.
I believe there is at least one other plugin that includes its own copy of 
M:SA:Timeout when installed on a backlevel version of SA.  Probably a 
reasonable solution.

        Loren


Re: FuzzyOcr 2.3b release, broken with SA 3.1.0

Posted by "Daryl C. W. O'Shea" <sp...@dostech.ca>.
decoder wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Hello,
> 
> I was just informed that the latest FuzzyOcr version, 2.3b, includes a
> function (module from SA) which is only available in 3.1.4, not in
> 3.1.0. The missing module is Mail::SpamAssassin::Timeout. Currently,
> the only way to fix this is to upgrade to 3.1.4.

That's not accurate.  M::SA::Timeout is available in 3.1.1+.  I highly 
suggest upgrading any 3.1.0 installation.  Any half decent distro should 
have at least 3.1.1 available by now.


> I am still unsure
> whether I should add my own timeout stuff with alarm() only to support
> 3.1.0.
> 
> Maybe someone else here has a better idea :)

Unless you're really, really, really careful, and look out for all the 
weird things that can happen to alarms and timers under high load I 
wouldn't even think about rolling your own code.


Daryl

Re: FuzzyOcr 2.3b release, broken with SA 3.1.0

Posted by decoder <de...@own-hero.net>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello,

I was just informed that the latest FuzzyOcr version, 2.3b, includes a
function (module from SA) which is only available in 3.1.4, not in
3.1.0. The missing module is Mail::SpamAssassin::Timeout. Currently,
the only way to fix this is to upgrade to 3.1.4. I am still unsure
whether I should add my own timeout stuff with alarm() only to support
3.1.0.

Maybe someone else here has a better idea :)



Chris



-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFE8JRFJQIKXnJyDxURAgY1AJ97hGp6zw94H+eUCeH2lay9T2mVDgCdFWEE
4VOwP8X4yVlPguHD6S1m9tI=
=ufN9
-----END PGP SIGNATURE-----


Re: [Devel-spam] FuzzyOcr 2.3b released,fixes bugs and improves stability

Posted by decoder <de...@own-hero.net>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

jdow wrote:
> From: "decoder" <de...@own-hero.net>
>> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
>>
>> Expertsites, Inc. wrote:
>>> From: "decoder" <de...@own-hero.net>
>>>
>>>> Hello,
>>>>
>>>>
>>>> I just uploaded FuzzyOcr 2.3b to the download site. If you
>>>> find bugs or run into problems, please mail back :)
>>>
>>> This release failed to recognize the sample png.eml file with
>>> logfile error message: Debug mode: Image type not recognized,
>>> unknown format. Skipping this image...
>>>
>>> I resolved this problem by changing one line in FuzzyOcr.pm
>>>
>>> Changed:
>>> elsif ( substr($picture_data,0,5) eq "\x89\x50\x4e\x47" ) {
>>> To read:
>>> elsif ( substr($picture_data,0,4) eq "\x89\x50\x4e\x47" ) {
>>>
>>> Tom Green -- Expertsites, Inc.
>>>
>>>
>>
>> Thank you for reporting this... seems I can't count bytes anymore
>> ;)
>>
>> For anyone who is downloading this past this message, the tarball
>> has been updated...
>
> As someone else pointed out - it has not been updated. I just
> checked, Chris.
>
> {^_^}
Hrm.... what the hell... I am 1000% sure I uploaded it.... -_-

ok NOW it is fixed... if not, then there is some kind of gremlin in
our server...


Chris
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFE8YOyJQIKXnJyDxURApY6AJsGyauiMoSbKvgAGQVUxr1iUqXASgCfd09k
bE/7zCyzwI8wGCFw9TZSwIw=
=OfOj
-----END PGP SIGNATURE-----


Re: [Devel-spam] FuzzyOcr 2.3b released,fixes bugs and improves stability

Posted by jdow <jd...@earthlink.net>.
From: "decoder" <de...@own-hero.net>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Expertsites, Inc. wrote:
>> From: "decoder" <de...@own-hero.net>
>>
>>> Hello,
>>>
>>>
>>> I just uploaded FuzzyOcr 2.3b to the download site. If you find
>>> bugs or run into problems, please mail back :)
>>
>> This release failed to recognize the sample png.eml file with
>> logfile error message: Debug mode: Image type not recognized,
>> unknown format. Skipping this image...
>>
>> I resolved this problem by changing one line in FuzzyOcr.pm
>>
>> Changed:
>> elsif ( substr($picture_data,0,5) eq "\x89\x50\x4e\x47" ) {
>> To read:
>> elsif ( substr($picture_data,0,4) eq "\x89\x50\x4e\x47" ) {
>>
>> Tom Green -- Expertsites, Inc.
>>
>>
> 
> Thank you for reporting this... seems I can't count bytes anymore ;)
> 
> For anyone who is downloading this past this message, the tarball has
> been updated...

As someone else pointed out - it has not been updated. I just checked,
Chris.

{^_^}

Re: [Devel-spam] FuzzyOcr 2.3b released,fixes bugs and improves stability

Posted by decoder <de...@own-hero.net>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Gary V wrote:
>> Hello,
>>
>>
>> I just uploaded FuzzyOcr 2.3b to the download site. If you find
>> bugs or run into problems, please mail back :)
>
> The jpeg.eml and png.eml samples failed to provide FuzzyOcr hits on
> my system because the messages scored higher than the default
> focr_autodisable_score. You should mention in the README file in the
> samples directory that you may need to temporarily raise the
> focr_autodisable_score while testing.
Ah thanks... I didn't think about that... earlier, the score was 50 by
default and I lowered it to 10 without redoing the tests :)

Chris
>
> Gary V
>
>

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFE8B+CJQIKXnJyDxURAvgMAJ9+zygJtk0qHNWjOoNwkKxfQMOanACeImox
I2+dh0H9UAtHxmkyHurPtfo=
=0TIT
-----END PGP SIGNATURE-----


Re: [Devel-spam] FuzzyOcr 2.3b released,fixes bugs and improves stability

Posted by Gary V <mr...@hotmail.com>.
>Hello,
>
>
>I just uploaded FuzzyOcr 2.3b to the download site. If you find
>bugs or run into problems, please mail back :)

The jpeg.eml and png.eml samples failed to provide FuzzyOcr hits on my 
system because the messages scored higher than the default 
focr_autodisable_score. You should mention in the README file in the samples 
directory that you may need to temporarily raise the focr_autodisable_score 
while testing.
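
The behaviour described here can be pictured with a small Python sketch. It assumes the plugin simply skips its OCR pass when the message's score so far is above focr_autodisable_score. The function name and the exact comparison are assumptions; the cutoffs of 10 and 50 are the ones mentioned in this thread.

```python
def should_run_ocr(current_score, autodisable_score=10.0):
    """Assumed rule: skip the OCR pass once the message already scores above the cutoff."""
    return current_score <= autodisable_score
```

With a cutoff of 10, a sample that already scores 14.7 from other rules would never reach the OCR step, while raising the cutoff to 50 for testing lets it through.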

Gary V



Re: [Devel-spam] FuzzyOcr 2.3b released,fixes bugs and improves stability

Posted by decoder <de...@own-hero.net>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Expertsites, Inc. wrote:
> From: "decoder" <de...@own-hero.net>
>
>> Hello,
>>
>>
>> I just uploaded FuzzyOcr 2.3b to the download site. If you find
>> bugs or run into problems, please mail back :)
>
> This release failed to recognize the sample png.eml file with
> logfile error message: Debug mode: Image type not recognized,
> unknown format. Skipping this image...
>
> I resolved this problem by changing one line in FuzzyOcr.pm
>
> Changed:
> elsif ( substr($picture_data,0,5) eq "\x89\x50\x4e\x47" ) {
> To read:
> elsif ( substr($picture_data,0,4) eq "\x89\x50\x4e\x47" ) {
>
> Tom Green -- Expertsites, Inc.
>
>

Thank you for reporting this... seems I can't count bytes anymore ;)

For anyone who is downloading this past this message, the tarball has
been updated...

For all others, please change the line :)


Chris
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFE754FJQIKXnJyDxURAv1BAJ9KHh9VcKtCN4NWmPoWDg4Tp6m4nQCggOKT
aInWSnQgKlh0YhvE0YZclxs=
=nAbb
-----END PGP SIGNATURE-----


Re: [Devel-spam] FuzzyOcr 2.3b released,fixes bugs and improves stability

Posted by "Expertsites, Inc." <sp...@expertsites.com>.
From: "decoder" <de...@own-hero.net>

> Hello,
>
>
> I just uploaded FuzzyOcr 2.3b to the download site. If you find bugs
> or run into problems, please mail back :)

This release failed to recognize the sample png.eml file with logfile error 
message:
Debug mode: Image type not recognized, unknown format. Skipping this 
image...

I resolved this problem by changing one line in FuzzyOcr.pm

Changed:
elsif ( substr($picture_data,0,5) eq "\x89\x50\x4e\x47" ) {
To read:
elsif ( substr($picture_data,0,4) eq "\x89\x50\x4e\x47" ) {
                                                   ^
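
Why the one-character change matters can be shown with a short Python analogue of the Perl comparison (a sketch only; the function names are made up for illustration). A five-byte prefix of a real image can never equal a four-byte string, so the original test rejected every PNG file.

```python
PNG_MAGIC = b"\x89\x50\x4e\x47"  # first four bytes of the PNG signature: \x89 'P' 'N' 'G'


def is_png_buggy(data):
    # Mirrors substr($picture_data,0,5): a 5-byte prefix compared to a 4-byte string.
    return data[:5] == PNG_MAGIC


def is_png_fixed(data):
    # Mirrors substr($picture_data,0,4): prefix length matches the magic string.
    return data[:len(PNG_MAGIC)] == PNG_MAGIC


# Full 8-byte PNG signature followed by padding that stands in for image data
sample = b"\x89PNG\r\n\x1a\n" + b"\x00" * 8
```

Deriving the prefix length from the magic string itself, as is_png_fixed does, avoids this kind of off-by-one mismatch.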

Tom Green
--
Expertsites, Inc.



Re: Fuzzy 2.3b and PNG

Posted by decoder <de...@own-hero.net>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Gary V wrote:
>> Rose, Bobby wrote:
>>> What am I missing?  I updated but now png isn't working.  If I
>>> switch to debug logging 2 I see in the log when I run the
>>> sample thru.
>>>
>>> [2006-08-26 18:16:40] Debug mode: Analyzing file with
>>> content-type "image/png"
>>> [2006-08-26 18:16:40] Debug mode: Image type not recognized,
>>> unknown format. Skipping this image...
>>>
>>> Thanks Bobby
>> Yes, I already posted this in this thread, there is a bug in this
>>  line:
>>
>> elsif ( substr($picture_data,0,5) eq "\x89\x50\x4e\x47" )
>>
>> correct is:
>>
>> elsif ( substr($picture_data,0,4) eq "\x89\x50\x4e\x47" )
>>
>>
>> The tarball which is available for download has been fixed
>> already...
>>
>>
>> Chris
>
> I just downloaded it from
> http://users.own-hero.net/~decoder/fuzzyocr/ and line 733 says:
>
> elsif ( substr($picture_data,0,5) eq "\x89\x50\x4e\x47" ) {
>
> Gary V

Yea my problem.... it seems like the tarball was not uploaded... now
it should be... ;)

Chris
>
>
>
>

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFE8YPfJQIKXnJyDxURApZaAJ9c3DmDnJyBWM/7kUCGf0s2pCBlMQCfbBj8
C0yO4KQrMU3UIPrfNeyowtE=
=unf7
-----END PGP SIGNATURE-----


Re: Fuzzy 2.3b and PNG

Posted by Gary V <mr...@hotmail.com>.
>Rose, Bobby wrote:
> > What am I missing?  I updated but now png isn't working.  If I switch to
> > debug logging 2 I see in the log when I run the sample thru.
> >
> > [2006-08-26 18:16:40] Debug mode: Analyzing file with content-type
> > "image/png"
> > [2006-08-26 18:16:40] Debug mode: Image type not recognized, unknown
> > format. Skipping this image...
> >
> > Thanks
> > Bobby
>Yes, I already posted this in this thread, there is a bug in this line:
>
>elsif ( substr($picture_data,0,5) eq "\x89\x50\x4e\x47" )
>
>correct is:
>
>elsif ( substr($picture_data,0,4) eq "\x89\x50\x4e\x47" )
>
>
>The tarball which is available for download has been fixed already...
>
>
>Chris

I just downloaded it from http://users.own-hero.net/~decoder/fuzzyocr/ and 
line 733 says:

elsif ( substr($picture_data,0,5) eq "\x89\x50\x4e\x47" ) {

Gary V



Re: Fuzzy 2.3b and PNG

Posted by decoder <de...@own-hero.net>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Rose, Bobby wrote:
> What am I missing?  I updated but now png isn't working.  If I switch to
> debug logging 2 I see in the log when I run the sample thru.
>
> [2006-08-26 18:16:40] Debug mode: Analyzing file with content-type
> "image/png"
> [2006-08-26 18:16:40] Debug mode: Image type not recognized, unknown
> format. Skipping this image...
>
> Thanks
> Bobby
Yes, I already posted this in this thread, there is a bug in this line:

elsif ( substr($picture_data,0,5) eq "\x89\x50\x4e\x47" )

correct is:

elsif ( substr($picture_data,0,4) eq "\x89\x50\x4e\x47" )


The tarball which is available for download has been fixed already...


Chris


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFE8MoeJQIKXnJyDxURAiVFAKCleKLAkgiklWw1yZdsWPmmXvibOgCfQa5K
eIWLLQcS1Lch1Rcd41tjB38=
=jYbC
-----END PGP SIGNATURE-----


Fuzzy 2.3b and PNG

Posted by "Rose, Bobby" <br...@med.wayne.edu>.
What am I missing?  I updated but now png isn't working.  If I switch to
debug logging 2 I see in the log when I run the sample thru. 

[2006-08-26 18:16:40] Debug mode: Analyzing file with content-type
"image/png"
[2006-08-26 18:16:40] Debug mode: Image type not recognized, unknown
format. Skipping this image...

Thanks
Bobby