You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@accumulo.apache.org by "ctubbsii (via GitHub)" <gi...@apache.org> on 2023/04/28 00:07:49 UTC

[GitHub] [accumulo] ctubbsii opened a new pull request, #3355: Remove deprecated reserveMapFileReader

ctubbsii opened a new pull request, #3355:
URL: https://github.com/apache/accumulo/pull/3355

   Remove deprecated reserveMapFileReader from IteratorEnvironment. This change affects the behavior of the MapFileIterator's deepCopy, using the implementation from the DefaultIteratorEnvironment that opens a new file reader on the given map file instead of the TabletIteratorEnvironment's implementation that would retrieve a reference to an already open file reader that was cached.
   
   For context, map files have fallen out of favor, and RFiles have been in use for a long time as their replacement (since at least Accumulo 1.3). I do not know if this MapFileIterator code is likely to even work, as I don't know of anybody still using map files. The support for reading map files should probably be removed in their entirety, the property to control the file extension removed (or at least, currently restricted to only accept RFile.EXTENSION as valid), and upgrade should be blocked until all map files are compacted away. This commit is much more narrowly focused than that, but should help move towards that end goal.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [accumulo] keith-turner commented on pull request #3355: Remove deprecated reserveMapFileReader

Posted by "keith-turner (via GitHub)" <gi...@apache.org>.
keith-turner commented on PR #3355:
URL: https://github.com/apache/accumulo/pull/3355#issuecomment-1527753486

   > @keith-turner Do you see any pitfalls due to not using the TabletIteratorEnvironment's implementation? (See previous comments)
   
   Before this change it was calling the following code which could potentially add the ProblemReportingIterator and TimeSettingIterator.
   
   https://github.com/apache/accumulo/blob/9c459ce8f06d9d9e04280d020ea091fedf13749b/server/base/src/main/java/org/apache/accumulo/server/fs/FileManager.java#L503


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [accumulo] keith-turner commented on pull request #3355: Remove deprecated reserveMapFileReader

Posted by "keith-turner (via GitHub)" <gi...@apache.org>.
keith-turner commented on PR #3355:
URL: https://github.com/apache/accumulo/pull/3355#issuecomment-1527800231

   Looking into a bit more the initial MapFileIterator is at the lowest level s of the stack.  For scan and compaction it would have iterators like the ProblemReportingIterator and TimeSettingIterator layered on top of the initial MapFileITerator.  If deep copy is called on the stack it will go down through ProblemReportingIterator and TimeSettingIterator, so there is no need to set those up at the lowest level because they are already at higher levels.  Since its so low in the stack it should be doing very minimal and I think this changes looks good.  Would keep the config passed into the constructor.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [accumulo] keith-turner commented on a diff in pull request #3355: Remove deprecated reserveMapFileReader

Posted by "keith-turner (via GitHub)" <gi...@apache.org>.
keith-turner commented on code in PR #3355:
URL: https://github.com/apache/accumulo/pull/3355#discussion_r1180603488


##########
core/src/main/java/org/apache/accumulo/core/iteratorsImpl/system/MapFileIterator.java:
##########
@@ -124,14 +125,15 @@ public Value getTopValue() {
   @Override
   public SortedKeyValueIterator<Key,Value> deepCopy(IteratorEnvironment env) {
     try {
-      @SuppressWarnings("deprecation")
-      SortedKeyValueIterator<Key,Value> other = env.reserveMapFileReader(dirName);
-      ((InterruptibleIterator) other).setInterruptFlag(interruptFlag);
+      Configuration hadoopConf = new Configuration();
+      FileSystem fs = FileSystem.get(hadoopConf);

Review Comment:
   Should keep teh config and filesystem objects passed into the constructor instead of creating new ones here.



##########
core/src/main/java/org/apache/accumulo/core/iteratorsImpl/system/MapFileIterator.java:
##########
@@ -124,14 +125,15 @@ public Value getTopValue() {
   @Override
   public SortedKeyValueIterator<Key,Value> deepCopy(IteratorEnvironment env) {
     try {
-      @SuppressWarnings("deprecation")
-      SortedKeyValueIterator<Key,Value> other = env.reserveMapFileReader(dirName);
-      ((InterruptibleIterator) other).setInterruptFlag(interruptFlag);
+      Configuration hadoopConf = new Configuration();
+      FileSystem fs = FileSystem.get(hadoopConf);

Review Comment:
   Should keep the config and filesystem objects passed into the constructor instead of creating new ones here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [accumulo] ctubbsii commented on pull request #3355: Remove deprecated reserveMapFileReader

Posted by "ctubbsii (via GitHub)" <gi...@apache.org>.
ctubbsii commented on PR #3355:
URL: https://github.com/apache/accumulo/pull/3355#issuecomment-1527647818

   @keith-turner Do you see any pitfalls due to not using the TabletIteratorEnvironment's implementation? (See previous comments)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [accumulo] ctubbsii merged pull request #3355: Remove deprecated reserveMapFileReader

Posted by "ctubbsii (via GitHub)" <gi...@apache.org>.
ctubbsii merged PR #3355:
URL: https://github.com/apache/accumulo/pull/3355


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [accumulo] ctubbsii commented on pull request #3355: Remove deprecated reserveMapFileReader

Posted by "ctubbsii (via GitHub)" <gi...@apache.org>.
ctubbsii commented on PR #3355:
URL: https://github.com/apache/accumulo/pull/3355#issuecomment-1526799184

   I have no idea how to test this change, because I don't even know where I'd start trying to create a bunch of old map files in a recent version of Accumulo and then to try to force a deepCopy in the iterator stack with this reader active. Using the DefaultIteratorEnvironment implementation may be perfectly acceptable in all cases. I'm not sure if there are any pitfalls to be aware of because of the TabletIteratorEnvironment's implementation no longer being used. (And it's not possible to use that without a much bigger refactor that moves this class from the core module to the server module, or by doing a bunch of reflection to try to use the implementation from the server code.)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [accumulo] ctubbsii commented on a diff in pull request #3355: Remove deprecated reserveMapFileReader

Posted by "ctubbsii (via GitHub)" <gi...@apache.org>.
ctubbsii commented on code in PR #3355:
URL: https://github.com/apache/accumulo/pull/3355#discussion_r1180763997


##########
core/src/main/java/org/apache/accumulo/core/iteratorsImpl/system/MapFileIterator.java:
##########
@@ -124,14 +125,15 @@ public Value getTopValue() {
   @Override
   public SortedKeyValueIterator<Key,Value> deepCopy(IteratorEnvironment env) {
     try {
-      @SuppressWarnings("deprecation")
-      SortedKeyValueIterator<Key,Value> other = env.reserveMapFileReader(dirName);
-      ((InterruptibleIterator) other).setInterruptFlag(interruptFlag);
+      Configuration hadoopConf = new Configuration();
+      FileSystem fs = FileSystem.get(hadoopConf);

Review Comment:
   That makes sense. I must've overlooked those. I thought I had checked for them, but I guess I got distracted by the fact that it was using the ones from the environment implementation. I will update that before merging.
   
   Thanks for the thorough review.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [accumulo] ctubbsii commented on a diff in pull request #3355: Remove deprecated reserveMapFileReader

Posted by "ctubbsii (via GitHub)" <gi...@apache.org>.
ctubbsii commented on code in PR #3355:
URL: https://github.com/apache/accumulo/pull/3355#discussion_r1180775950


##########
core/src/main/java/org/apache/accumulo/core/iteratorsImpl/system/MapFileIterator.java:
##########
@@ -124,14 +125,15 @@ public Value getTopValue() {
   @Override
   public SortedKeyValueIterator<Key,Value> deepCopy(IteratorEnvironment env) {
     try {
-      @SuppressWarnings("deprecation")
-      SortedKeyValueIterator<Key,Value> other = env.reserveMapFileReader(dirName);
-      ((InterruptibleIterator) other).setInterruptFlag(interruptFlag);
+      Configuration hadoopConf = new Configuration();
+      FileSystem fs = FileSystem.get(hadoopConf);

Review Comment:
   I made that change, and also made some of the fields final that could be. I noticed that topKey and topValue are never actually assigned... which seems completely incorrect to me. I opened a discussion in Slack about that, to discuss before I try to do anything about that.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [accumulo] ctubbsii commented on pull request #3355: Remove deprecated reserveMapFileReader

Posted by "ctubbsii (via GitHub)" <gi...@apache.org>.
ctubbsii commented on PR #3355:
URL: https://github.com/apache/accumulo/pull/3355#issuecomment-1527772760

   @keith-turner wrote:
   > Before this change it was calling the following code which could potentially add the ProblemReportingIterator and TimeSettingIterator.
   
   Right. I think it also limited the number of open readers (because there is a configurable max open) and caches the open readers for re-use. We're bypassing all of that, and I'm not sure if that's safe to do. It's very unlikely we're going to hit this code at all, given that nobody I know uses Mapfiles at all anymore. But I'm just not that familiar with the file reader code at the bottom of the iterator stack on the server side, and don't feel qualified to proceed with this if those things are important enough to not bypass.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [accumulo] ctubbsii commented on pull request #3355: Remove deprecated reserveMapFileReader

Posted by "ctubbsii (via GitHub)" <gi...@apache.org>.
ctubbsii commented on PR #3355:
URL: https://github.com/apache/accumulo/pull/3355#issuecomment-1526888969

   See related #3359


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org