Posted to users@spamassassin.apache.org by Alex <my...@gmail.com> on 2017/08/22 18:55:01 UTC

Identifying PDF phish docs

Hi,

We've been hit a number of times lately by phishing attacks using PDF
documents with a link in them. Has anyone had any success in blocking
these PDFs?

You can download one such example here:
https://www.dropbox.com/s/b97pcvl1wm1oocq/pdf-phish.pdf?dl=0

I know there was a PDF OCR plugin of some sort, but I don't recall it
being all that effective. Ideas greatly appreciated.

Re: Identifying PDF phish docs

Posted by Kevin Golding <kp...@caomhin.org>.
On Wed, 23 Aug 2017 02:02:58 +0100, Alex <my...@gmail.com> wrote:

> John wrote:
>> clamav?
>
> It's too slow to react, particularly when the PDFs are written
> specifically to reach a domain. Sometimes the PDF will never be
> detected by any of the antivirus scanners because of this.

http://blog.adamsweet.org/?p=250

Re: Identifying PDF phish docs

Posted by Alex <my...@gmail.com>.
Hi,

On Tue, Aug 22, 2017 at 8:46 PM, Dianne Skoll <df...@roaringpenguin.com> wrote:
> On Tue, 22 Aug 2017 20:19:06 -0400
> Alex <my...@gmail.com> wrote:
>
>> > Take a look at podofopdfinfo.  It can extract URLs from PDF docs
>> > and you can trigger on those.
>
>> Thank you. It didn't work on this one :-(
>
> It worked for me:
>
> $ podofopdfinfo pdf-phish.pdf
> Document Info
> -------------
>         File: pdf-phish.pdf
>         PDF Version: 1.5
> [... much output deleted ...]
>
>         Annotation 0
> [... much output deleted ...]
>
>                 Link Target: 1
>                 Action URI: http://dabanlar.com/west/scan.html

Ah, thank you. I used podofotxtextract.

John wrote:
>> Are there any current solutions for those of us with spamassassin and amavisd?

> clamav?

It's too slow to react, particularly when the PDFs are written
specifically to reach a domain. Sometimes the PDF will never be
detected by any of the antivirus scanners because of this.

Of course I'm further analyzing the other characteristics of the
message to build rules to stop them the next time, but this is still
after two emails were received with this PDF, each of which had like
40 recips. It only took a slight adjustment to my custom rules to
block these for next time, but an additional advantage with being able
to process the URLs would be nice.

>> I also don't see a way to use it with amavisd.
>
> Right.  I use MIMEDefang which is a little more flexible in how you
> glue the bits and pieces together.
>
>> "strings" was able to extract the URL.
>
> That works this time, but generally speaking, PDF documents may be
> compressed in which case "strings" won't be useful.
>
> I reported the URL to Google as fraudulent.

Thank you. What other steps can be taken with more of an automated
approach? Would a plugin that does a reverse lookup on the domain then
check the various RBLs be conceivable? Or would you somehow re-inject
the URL back into SA somehow? How much programming is involved with
doing something like this?
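
The lookup asked about here can be sketched in a few lines of Python. This is a minimal illustration, not an existing SpamAssassin or amavisd plugin: the function names are hypothetical, the zone zen.spamhaus.org is assumed to be the "zen" list mentioned elsewhere in the thread, and only IPv4 is handled. A DNSBL is queried by reversing the octets of the IP address and prepending them to the zone name; any answer to that A-record query is treated as a listing here, whereas a production check should inspect the returned 127.0.0.x code.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone="zen.spamhaus.org"):
    """Build the DNS name to query for an IPv4 address in a DNSBL zone."""
    # Validates the address and normalises it to dotted-quad form.
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def host_is_listed(hostname, zone="zen.spamhaus.org"):
    """Resolve a hostname taken from a URL and check its IP against a DNSBL.

    Returns False when the host does not resolve or the DNSBL has no
    record for the address (hypothetical helper, simplified handling).
    """
    try:
        ip = socket.gethostbyname(hostname)
        socket.gethostbyname(dnsbl_query_name(ip, zone))
    except OSError:
        return False
    return True


if __name__ == "__main__":
    # 127.0.0.2 is the conventional DNSBL test address.
    print(dnsbl_query_name("127.0.0.2"))
```

The hostname would come from whatever tool extracted the URL from the PDF; how the result is fed back into scoring depends on the mail filter in use.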

Re: Identifying PDF phish docs

Posted by Dianne Skoll <df...@roaringpenguin.com>.
On Tue, 22 Aug 2017 20:19:06 -0400
Alex <my...@gmail.com> wrote:

> > Take a look at podofopdfinfo.  It can extract URLs from PDF docs
> > and you can trigger on those.  

> Thank you. It didn't work on this one :-(

It worked for me:

$ podofopdfinfo pdf-phish.pdf 
Document Info
-------------
        File: pdf-phish.pdf
        PDF Version: 1.5
[... much output deleted ...]

        Annotation 0
[... much output deleted ...]

                Link Target: 1
                Action URI: http://dabanlar.com/west/scan.html

My version of podofopdfinfo is 0.9.3 as packaged on Debian Jessie.
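
As a rough sketch of how a filter could "trigger on" that output, the Python below pulls the "Action URI:" lines out of podofopdfinfo's text. It assumes the output format shown above (as produced by version 0.9.3); other versions may label the field differently. The function names are hypothetical, and podofopdfinfo is assumed to be on the PATH.

```python
import re
import subprocess

# Matches lines such as "    Action URI: http://example.com/x"
ACTION_URI_RE = re.compile(r"^\s*Action URI:\s*(\S+)", re.MULTILINE)


def extract_action_uris(info_text):
    """Return every 'Action URI:' value found in podofopdfinfo output."""
    return ACTION_URI_RE.findall(info_text)


def uris_from_pdf(path):
    """Run podofopdfinfo on a file and return the link targets it reports."""
    result = subprocess.run(
        ["podofopdfinfo", path],
        capture_output=True, text=True, errors="replace", check=False,
    )
    return extract_action_uris(result.stdout)


if __name__ == "__main__":
    sample = (
        "        Annotation 0\n"
        "                Link Target: 1\n"
        "                Action URI: http://dabanlar.com/west/scan.html\n"
    )
    print(extract_action_uris(sample))
```

The returned URLs could then be checked against URI blacklists or matched by local rules, depending on how the mail filter is glued together.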

> I also don't see a way to use it with amavisd.

Right.  I use MIMEDefang which is a little more flexible in how you
glue the bits and pieces together.

> "strings" was able to extract the URL.

That works this time, but generally speaking, PDF documents may be
compressed in which case "strings" won't be useful.
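
To illustrate the point, here is a small Python sketch, assuming the common case where the compression is Flate (zlib). It looks for /URI entries in the raw bytes, which is roughly what "strings" would find, and then also inside any stream that inflates successfully. The regular expressions are simplifications invented for this example: they ignore other filters, encryption, and escaped parentheses inside PDF strings, so a real PDF parser remains the more reliable option.

```python
import re
import zlib

# Simplified patterns; real PDF syntax is more permissive than this.
URI_RE = re.compile(rb"/URI\s*\((.*?)\)")
STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\nendstream", re.DOTALL)


def find_uris(pdf_bytes):
    """Collect /URI values from raw bytes and from Flate-compressed streams."""
    found = set(URI_RE.findall(pdf_bytes))
    for match in STREAM_RE.finditer(pdf_bytes):
        try:
            # decompressobj tolerates a slightly truncated stream, which can
            # happen when the boundary regex trims a trailing byte.
            inflated = zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            continue  # not a Flate stream, or not valid zlib data
        found.update(URI_RE.findall(inflated))
    return sorted(uri.decode("latin-1") for uri in found)


if __name__ == "__main__":
    with open("pdf-phish.pdf", "rb") as handle:
        for uri in find_uris(handle.read()):
            print(uri)
```

A URL stored inside a compressed object stream is invisible to a plain byte search but turns up once the stream is inflated.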

I reported the URL to Google as fraudulent.

Regards,

Dianne.

Re: Identifying PDF phish docs

Posted by John Hardin <jh...@impsec.org>.
On Tue, 22 Aug 2017, Alex wrote:

> Are there any current solutions for those of us with spamassassin and amavisd?

clamav?

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   Judicial Activism (n): interpreting the Constitution to grant the
   government powers that are popularly felt to be "needed" but that
   are not explicitly provided for therein (common definition);
   interpreting the Constitution as it is written (Brady definition)
-----------------------------------------------------------------------
  2 days until the 1938th anniversary of the destruction of Pompeii

Re: Identifying PDF phish docs

Posted by Alex <my...@gmail.com>.
Hi,

On Tue, Aug 22, 2017 at 3:14 PM, Dianne Skoll <df...@roaringpenguin.com> wrote:
> On Tue, 22 Aug 2017 14:55:01 -0400
> Alex <my...@gmail.com> wrote:
>
>> I know there was a PDF OCR plugin of some sort, but I don't recall it
>> being all that effective. Ideas greatly appreciated.
>
> Take a look at podofopdfinfo.  It can extract URLs from PDF docs and you
> can trigger on those.

Thank you. It didn't work on this one :-( I also don't see a way to
use it with amavisd. I'm recalling now that Axb once said this wasn't
a spamassassin problem, but I'm hoping with all the phishing attacks
these days that we can reconsider that - the malicious PDF is part of
the email message that spamassassin scans.

"strings" was able to extract the URL. The URL in the message is
http://dabanlar.com/west/scan.html and is still active. The domain
isn't listed on any major blacklist, but the IP address is listed on
zen. This sounds like something that would need to be done in a
plugin, if in spamassassin at all.

Are there any current solutions for those of us with spamassassin and amavisd?

Re: Identifying PDF phish docs

Posted by Dianne Skoll <df...@roaringpenguin.com>.
On Tue, 22 Aug 2017 14:55:01 -0400
Alex <my...@gmail.com> wrote:

> I know there was a PDF OCR plugin of some sort, but I don't recall it
> being all that effective. Ideas greatly appreciated.

Take a look at podofopdfinfo.  It can extract URLs from PDF docs and you
can trigger on those.

Regards,

Dianne.

Re: Identifying PDF phish docs

Posted by Alex <my...@gmail.com>.
Hi,

On Thu, Aug 24, 2017 at 8:00 PM, Alex <my...@gmail.com> wrote:
> Hi,
>
> On Wed, Aug 23, 2017 at 3:01 PM, Matus UHLAR - fantomas
> <uh...@fantomas.sk> wrote:
>> On 22.08.17 14:55, Alex wrote:
>>>
>>> We've been hit a number of times lately by phishing attacks using PDF
>>> documents with a link in them. Has anyone had any success in blocking
>>> these PDFs?
>>>
>>> You can download one such example here:
>>> https://www.dropbox.com/s/b97pcvl1wm1oocq/pdf-phish.pdf?dl=0
>>>
>>> I know there was a PDF OCR plugin of some sort, but I don't recall it
>>> being all that effective. Ideas greatly appreciated.
>>
>>
>> I think you mean PDFassassin, but I'd prefer ExtractText
>> both described at
>> https://wiki.apache.org/spamassassin/UnmaintainedCustomPlugins
>
> Both links to download ExtractText are dead :-( Given it's from 2007,
> is there any reasonable expectation it would even come close to
> working anyway?

Much to my surprise, I've managed to find it and actually make it
(almost) work. Does someone feel like helping me figure out the
rest of it?
https://github.com/DavidGoodwin/ExtractText

The plugin consists of an "ExtractText" part to extract text from PDFs
and an OpenXML part that extracts text from Word docs.

I'm having a problem with the OpenXML.pm plugin. It's lacking a new() function:
Aug 26 16:01:53.512 [18151] warn: plugin: failed to create instance of
plugin Mail::SpamAssassin::Plugin::OpenXML: Can't locate object method
"new" via package "Mail::SpamAssassin::Plugin::OpenXML" at (eval 2566)
line 1.

I'm not very good at OO perl. Would someone have some ideas? I've
pasted it here.
https://pastebin.com/Ac8fHJ3X

Re: Identifying PDF phish docs

Posted by Alex <my...@gmail.com>.
Hi,

On Wed, Aug 23, 2017 at 3:01 PM, Matus UHLAR - fantomas
<uh...@fantomas.sk> wrote:
> On 22.08.17 14:55, Alex wrote:
>>
>> We've been hit a number of times lately by phishing attacks using PDF
>> documents with a link in them. Has anyone had any success in blocking
>> these PDFs?
>>
>> You can download one such example here:
>> https://www.dropbox.com/s/b97pcvl1wm1oocq/pdf-phish.pdf?dl=0
>>
>> I know there was a PDF OCR plugin of some sort, but I don't recall it
>> being all that effective. Ideas greatly appreciated.
>
>
> I think you mean PDFassassin, but I'd prefer ExtractText
> both described at
> https://wiki.apache.org/spamassassin/UnmaintainedCustomPlugins

Both links to download ExtractText are dead :-( Given it's from 2007,
is there any reasonable expectation it would even come close to
working anyway?

Re: Identifying PDF phish docs

Posted by Matus UHLAR - fantomas <uh...@fantomas.sk>.
On 22.08.17 14:55, Alex wrote:
>We've been hit a number of times lately by phishing attacks using PDF
>documents with a link in them. Has anyone had any success in blocking
>these PDFs?
>
>You can download one such example here:
>https://www.dropbox.com/s/b97pcvl1wm1oocq/pdf-phish.pdf?dl=0
>
>I know there was a PDF OCR plugin of some sort, but I don't recall it
>being all that effective. Ideas greatly appreciated.

I think you mean PDFassassin, but I'd prefer ExtractText
both described at
https://wiki.apache.org/spamassassin/UnmaintainedCustomPlugins

-- 
Matus UHLAR - fantomas, uhlar@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
"Where do you want to go to die?" [Microsoft]