Posted to notifications@accumulo.apache.org by GitBox <gi...@apache.org> on 2022/11/04 16:06:21 UTC

[GitHub] [accumulo] keith-turner commented on issue #3065: Support read only Accumulo snapshots that can run in other data centers.

keith-turner commented on issue #3065:
URL: https://github.com/apache/accumulo/issues/3065#issuecomment-1303813313

> This seems similar to the replication feature that has been unmaintained / abandoned. What makes this different?
> And, would it be better to advocate users create an ingest pipeline outside of Accumulo for cross-data center snapshots/replication/copies? Like, have data come in via Kafka or NiFi, and have multiple consumers/destinations for the different target data centers?

The main difference from the replication feature and from multiple ingest pipelines is that each of those requires redundant data maintenance operations. I am a bit uncertain whether centralized data maintenance makes this feature worthwhile, though. Is doing the computation cheaper than the copy? Is the consistency provided by doing data maintenance in one place worth the overhead of implementing this feature? I suspect it may be worthwhile, but I am not completely sure.

This feature could not be done without first doing system snapshots, which I think have a lot of benefits of their own, independent of this. I think this feature would mainly be about making scan servers able to read from a system snapshot.
