Posted to issues@ponymail.apache.org by sebbASF <gi...@git.apache.org> on 2016/12/12 01:40:47 UTC

[GitHub] incubator-ponymail issue #288: Bug: does not parse multipart/mixed mails

GitHub user sebbASF opened an issue:

    https://github.com/apache/incubator-ponymail/issues/288

    Bug: does not parse multipart/mixed mails

    The archiver fails to parse a multipart/mixed email [1].
    This looks like a bug in the Python email parser: the email is not detected as multipart, even though it has 3 parts.
    
    
    
    [1] http://mail-archives.apache.org/mod_mbox/tomcat-users/200301.mbox/%3CAB3C10C46F698F42B3836BAD91CC0C3F054B21@MARCG-EVS.marcgroup.mail%3E

---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-ponymail issue #288: Bug: does not parse multipart/mixed mails

Posted by sebbASF <gi...@git.apache.org>.

Github user sebbASF commented on the issue:

    https://github.com/apache/incubator-ponymail/issues/288
  
    It looks like the problem is caused by utils.unquote, which treats < and > as quoting characters and removes them.
    Given that < and > are not valid boundary characters, it's hard to class that as a bug.



[GitHub] incubator-ponymail issue #288: Bug: does not parse multipart/mixed mails wit...

Posted by sebbASF <gi...@git.apache.org>.

Github user sebbASF commented on the issue:

    https://github.com/apache/incubator-ponymail/issues/288
  
    Further investigation shows that the problem is due to unquote being called twice.
    Since both "" and <> are treated as quotes, both pairs get removed.
    This still appears to be a bug in Python 3.5.2 (I am using 3.4.1).
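
A minimal sketch of the effect described above, using the boundary value quoted elsewhere in this thread. It only calls email.utils.unquote directly; the variable names are illustrative, and it does not reproduce the parser's internal call sequence.

```python
from email.utils import unquote

# Boundary parameter value as it appears in the problem message's header.
raw_value = '"<<001-3e1dcd5a-119e>>"'

once = unquote(raw_value)   # removes the surrounding double quotes
twice = unquote(once)       # also removes one layer of angle brackets

print(once)
print(twice)
```

After the second call the value no longer matches the delimiter lines in the body, which still start with "--<<".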



[GitHub] incubator-ponymail issue #288: Bug: does not parse multipart/mixed mails wit...

Posted by sebbASF <gi...@git.apache.org>.

Github user sebbASF commented on the issue:

    https://github.com/apache/incubator-ponymail/issues/288
  
    In theory it should be possible to override the Message#get_boundary method.
    This is quite easy to do in the archiver by passing the amended class to the email#message_from_bytes() method.
    
    However, I was unable to find out how to do the same for the importer.
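
The override idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the Pony Mail change: the class name SingleUnquoteMessage and the sample message RAW are hypothetical, and the override simply reads the raw parameter and removes only the double quotes. It covers the archiver-style call through email.message_from_bytes() only, not the importer.

```python
import email
from email.message import Message


class SingleUnquoteMessage(Message):
    """Hypothetical Message subclass that unquotes the boundary only once."""

    def get_boundary(self, failobj=None):
        missing = object()
        # unquote=False returns the parameter value with its quotes intact.
        raw = self.get_param("boundary", missing, unquote=False)
        if raw is missing:
            return failobj
        if isinstance(raw, tuple):
            # RFC 2231 encoded value: defer to the stock implementation.
            return super().get_boundary(failobj)
        if len(raw) > 1 and raw.startswith('"') and raw.endswith('"'):
            # Strip the double quotes only; leave any angle brackets alone.
            raw = raw[1:-1].replace("\\\\", "\\").replace('\\"', '"')
        return raw.rstrip()


# Hypothetical two-part message using the boundary from this issue.
RAW = (
    b'Content-Type: multipart/mixed; boundary="<<001-3e1dcd5a-119e>>"\r\n'
    b"MIME-Version: 1.0\r\n"
    b"\r\n"
    b"--<<001-3e1dcd5a-119e>>\r\n"
    b"Content-Type: text/plain\r\n"
    b"\r\n"
    b"first part\r\n"
    b"--<<001-3e1dcd5a-119e>>\r\n"
    b"Content-Type: text/plain\r\n"
    b"\r\n"
    b"second part\r\n"
    b"--<<001-3e1dcd5a-119e>>--\r\n"
)

# _class tells the parser which class to instantiate for every message part.
msg = email.message_from_bytes(RAW, _class=SingleUnquoteMessage)
print(msg.is_multipart(), len(msg.get_payload()))
```

Because the parser asks the message object for its boundary while splitting the body, the subclass changes how the parts are detected without touching the parser itself.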



[GitHub] incubator-ponymail issue #288: Bug: does not parse multipart/mixed mails

Posted by sebbASF <gi...@git.apache.org>.

Github user sebbASF commented on the issue:

    https://github.com/apache/incubator-ponymail/issues/288
  
    The message uses some invalid characters in the boundary: "<" and ">".
    Experimentation shows that the Python parser does not like the ">".
    The boundary in the message is:
    Content-Type: multipart/mixed; boundary="<<001-3e1dcd5a-119e>>"
    Further tests show that the parser strips off the outer <> when extracting the boundary, so it's not surprising that the parts are not detected. That does seem like a bug, as the parser handles other invalid chars OK.
    
    A simple workaround would be to treat the entire body as text.
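
One possible shape for that fallback, as a hedged sketch: the helper name fallback_text and both sample messages are hypothetical. It relies on the standard parser storing the unsplit body as a string payload when the declared boundary is never found; the sample deliberately declares a boundary that does not occur in the body, so it does not depend on the unquote behaviour.

```python
import email


def fallback_text(msg):
    """Return the unsplit body when a declared multipart has no detected parts."""
    declared_multipart = msg.get_content_maintype() == "multipart"
    if declared_multipart and not msg.is_multipart():
        payload = msg.get_payload()
        if isinstance(payload, str):
            return payload
    return None  # parts were detected, or the message is not multipart


# Declares one boundary but uses another, so no parts are found.
BROKEN = (
    b'Content-Type: multipart/mixed; boundary="declared"\r\n'
    b"\r\n"
    b"--other\r\n"
    b"Content-Type: text/plain\r\n"
    b"\r\n"
    b"first part\r\n"
    b"--other--\r\n"
)

# Well-formed counterpart: the fallback should not trigger here.
GOOD = BROKEN.replace(b"--other", b"--declared")

broken_text = fallback_text(email.message_from_bytes(BROKEN))
good_text = fallback_text(email.message_from_bytes(GOOD))
print(broken_text)
```

The returned text still contains the delimiter lines and part headers, so it is a last resort for indexing rather than a clean rendering.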

