Posted to commits@ponymail.apache.org by se...@apache.org on 2019/09/06 17:20:44 UTC
[incubator-ponymail] branch master updated: Document stability issue
This is an automated email from the ASF dual-hosted git repository.
sebb pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-ponymail.git
The following commit(s) were added to refs/heads/master by this push:
new 7ef7dc7 Document stability issue
7ef7dc7 is described below
commit 7ef7dc7f675ab4f44fcd620f9655c9881c0fee04
Author: Sebb <se...@apache.org>
AuthorDate: Fri Sep 6 18:20:40 2019 +0100
Document stability issue
---
tools/generators.py | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/tools/generators.py b/tools/generators.py
index 9a84693..528ca16 100644
--- a/tools/generators.py
+++ b/tools/generators.py
@@ -52,6 +52,10 @@ def medium(msg, body, lid, _attachments):
(does not generate sufficiently unique ids)
Also the lid is included in the hash; this causes problems if the listname needs to be changed.
+ N.B. The id is not guaranteed stable - i.e. it may change if the message is reparsed.
+ The id depends on the parsed body, which depends on the exact method used to parse the mail.
+ For example, are invalid characters ignored or replaced; is html parsing used?
+
The following message fields are concatenated to form the hash input:
- body: if bytes as is else encoded ascii, ignoring invalid characters; if the body is null an Exception is thrown
- lid
@@ -141,6 +145,10 @@ def cluster(msg, body, lid, attachments):
For mails with a valid Message-ID this is likely to be unique
In other cases it is better than the medium generator as it uses several extra fields
+ N.B. The id is not guaranteed stable - i.e. it may change if the message is reparsed.
+ The id depends on the parsed body, which depends on the exact method used to parse the mail.
+ For example, are invalid characters ignored or replaced; is html parsing used?
+
The following message fields are concatenated to form the hash input:
- body as is if bytes else encoded ascii, ignoring invalid characters; if the body is null it is treated as an empty string
(currently trailing whitespace is dropped)
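
The stability note added above can be illustrated with a minimal sketch. It assumes only what the docstrings state: the body is hashed as is when it is bytes, and otherwise encoded as ascii with invalid characters ignored, and the lid is part of the hash input. The helper name sketch_id, the choice of SHA-256, and the sample list id are hypothetical and are not taken from tools/generators.py.

```python
import hashlib


def sketch_id(body, lid):
    """Hypothetical stand-in for a generator: hash the body followed by the lid."""
    if isinstance(body, bytes):
        data = body  # bytes are used as is
    else:
        data = body.encode("ascii", "ignore")  # invalid characters are dropped
    return hashlib.sha256(data + lid.encode("ascii")).hexdigest()


lid = "<dev.example.org>"  # hypothetical list id
raw = b"caf\xe9 au lait"   # Latin-1 bytes, not valid UTF-8

# Parse 1: the parser hands the body over as raw bytes.
id_from_bytes = sketch_id(raw, lid)

# Parse 2: the parser decodes the body first and ignores invalid characters.
id_from_text = sketch_id(raw.decode("utf-8", errors="ignore"), lid)

# Same mail, different parse, different id.
print(id_from_bytes == id_from_text)
```

Under these assumptions the two ids differ, because the first hash input still contains the 0xE9 byte while the second does not. Reparsing an archive with a different parser or decoder setting could therefore assign new ids to existing messages.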