Posted to server-dev@james.apache.org by Jerry Malcolm <te...@malcolms.com> on 2019/10/06 19:40:30 UTC
3.4.0 AWS S3 Storage
This is a chain of several offline email exchanges between Matthieu and
me regarding the S3 blob storage project. I'm bringing it into this
forum to include anyone else who might be interested in this topic.
There are still a few open questions listed below. So if anyone can
assist with those, jump on in.
Jerry
Matthieu,
On 10/2/2019 4:33 AM, Matthieu Baechler wrote:
> Hi,
>
> On Tue, 2019-10-01 at 11:41 -0500, Jerry Malcolm wrote:
>
> [...]
>
>>>> Two initial questions:
>>>>
>>>> 1) is this enabled simply by adding blob.properties into the conf
>>>> directory?
>>> When using cassandra+guice, yes.
>>>
>>> With Spring, everything is much more complex.
>>> You probably need to rework Spring configuration to inject the right
>>> blob-store implementation.
>> I knew sooner or later I was going to have to learn something about
>> Spring.... If you can tell me the cassandra-guice classes that
>> perform the same functions, I'll start looking into making Spring do
>> the same thing.
> Honestly, I don't see the value in learning Spring 3, which has been
> obsolete for years. If you plan to work more frequently on James, I can
> only advise you to learn Guice, which is way easier to work with.
Since I am not live yet on my 3.4.x installation, now is the right time
to move to all of the recommended 'future direction' configurations. So
I'll figure out what I need to do to move from Spring to Guice and begin
with that base on my 3.4.x installation.
>
>>> More important: JPA is currently not able to use a blob-store at
>>> all.
>>>
>>> That means, if you want that feature, that you should:
>>>
>>> * implement blob-api by moving some code out of mailbox-jpa to a
>>> blob-jpa module
>>> * allow mailbox-jpa to use any blob-api implementation
>> I'm not using JPA at all. I'm using direct JDBC to MySQL currently.
>> So JPA will be on the back burner initially for me.
> Could you explain to me the exact setup you have right now? It's not
> clear to me what you have, so it's impossible for me to describe
> what you should do starting from there.
>
> [...]
Good idea. Let me tell you exactly what I have now, and we can go
from there:
Big picture... I have had a hosting company since 2002 or 2003 with a
dedicated server (on Pier1, I think, but they keep changing names). My
background at IBM was OS/2 and Windows-based. So my dedicated server OS
is Windows Server 2016. However, I don't use Windows internet servers.
I'm all open source. Apache HTTP, Tomcat, MySQL, James, ISC BIND. In
July I decided it was time to move from dedicated server to AWS.
Currently, all of my web services, DNS, and web database are fully
migrated to AWS. I host a video company with huge web galleries. So in
the migration process, I moved the galleries to S3, which gave me the
opportunity to learn the S3 APIs. James is my lone holdout on getting
to AWS and killing my dedicated server. I've had some struggles. But
I'm getting very close now to throwing the switch to AWS for James as well.
I started with James I believe in 02 or 03 with some version of 2.x
(SAR_INF, etc). For my entire career, I've always looked for ways to
make a program do something I needed that it didn't do. James'
matcher/mailet architecture was a dream come true for me. I did a couple
of minor version upgrades to James 2.x over the years. But my main
concentration was my custom matchers/mailets. I never tried building
James. In 2014 I really wanted the IMAP and fast-fail support that 3.x
had. So I began the migration process. A few customizations I needed
forced some base James tweaks. So I figured out how to build it. Then
after getting 3.0b5 up and moving all clients to IMAP, I began writing
IMAP utilities to maintain the IMAP accounts, such as pruning spam
folders, archiving mail to archive folders, etc.
My current configuration is AWS EC2 Linux2, James 3.3.something (built
from the 3.3 branch, but I'll get it to the latest master branch
shortly). I use straight JDBC to an AWS RDS MySQL database (tried
Aurora, but had issues... so reverted at least temporarily back to
MySQL). I've added my mailets. But I haven't changed any base config.
So I am assuming I'm using Spring.
Summary: Currently James 3.3.? Spring JDBC MySQL, with a bunch of
separate IMAP utilities.
After building James, I copied server/app/target/appassembler/lib/* to
my James lib folder. I'm assuming there is a different target lib
folder that I should copy to get the Guice version (??).
I see many references to Cassandra-Guice-JPA almost as a single entity
configuration. Are they tightly coupled? Do you enable each
independently? Is there a build target folder for all combinations? Or
does it not matter? Should I just go with Cassandra-Guice-JPA as the
future direction and move on?
>> My only concern with 2 databases and an imapsync utility is how long the
>> migration might take (in large db cases like I have) and having to keep
>> both db's up to date with the absolute latest mail entries.
> Moving a large database between two systems is going to take some time
> anyway.
>
> First, with imapsync, keeping two servers in sync is quite easy and
> efficient. You can run it first to move most of the data, then another
> time one day before the server switch and finally run it during the
> switch with minimal downtime.
>
> [...]
I was not aware of an actual imapsync utility until now. I thought you
were just referring to writing a utility that 'synced imap'. If the
imapsync utility I found on Google is reliable, it's obviously the best
answer.
>> Is there some process I need to go through to get approved as a
>> contributor? I'm ready. (Also have a couple of base mailet
>> enhancements I've added that others might be interested in).
> The first step is to propose pull requests on GitHub.
>
> People gain contributor status based on their previous contributions
> (it's a kind of meritocracy).
>
> You could start by proposing a pull request for each mailet you have?
Many of my mailets are specific to certain clients. But I do have a
couple of fairly minor code changes I'll create pull requests for just
to get my feet wet.
>
>> Also, I mentioned that I'm still just learning the specifics of Git. I
>> went to Git expecting to find a 3.4.x branch. There's not one (yet). I
>> thought maybe 'master' would work. But I got tons of build errors when
>> I cloned master. So right now I've got the 3.3.x branch and it builds.
>> What branch should I be working with in order to guarantee I'm working
>> with the latest code base?
> The master branch. It should be working. We changed the requirements for
> the build phase: you now need JDK 11 to build it. We will soon require
> JDK 11 to run it, too.
JDK 11 could definitely be a problem. I was on 9, and moved back to 8
thinking there were problems with 9. Obviously I was moving in the wrong
direction. I'll install 11 and see if the master branch acts nicer.
>
>
> Finally, it's important we go back to the mailing list for our discussion,
> as it is of general interest. We can keep the part about your current
> setup private, but everything else should be handled publicly.
I subscribed to the james-dev list, got my 'confirm-request' email and
replied. But no response as yet. I'll post this entire chain when I
get authorized. (I have no confidentiality issues with my configuration
/ setup / etc.)
Cheers,