You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@jena.apache.org by afs <gi...@git.apache.org> on 2018/08/10 15:56:16 UTC

[GitHub] jena pull request #458: JENA-1481, JENA-1586: Delete databases and expel fro...

GitHub user afs opened a pull request:

    https://github.com/apache/jena/pull/458

    JENA-1481, JENA-1586: Delete databases and expel from JVM

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/afs/jena fuseki-db-delete

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/jena/pull/458.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #458
    
----
commit f5197905d57eee6500b7d4ed6cd87f2bd9ae738e
Author: Andy Seaborne <an...@...>
Date:   2018-08-10T15:21:39Z

    JENA-1481, JENA-1586: Delete databases and expel from JVM

----


---

[GitHub] jena pull request #458: JENA-1481, JENA-1586: Delete databases and expel fro...

Posted by ajs6f <gi...@git.apache.org>.
Github user ajs6f commented on a diff in the pull request:

    https://github.com/apache/jena/pull/458#discussion_r209307965
  
    --- Diff: jena-base/src/main/java/org/apache/jena/atlas/io/IO.java ---
    @@ -364,4 +369,31 @@ public static String uniqueFilename(String directory, String base, String ext) {
                 return null ;
             }
         }
    +    
    +    /** Delete everything from a {@code Path} start point, including the path itself.
    +     * This function works on files or directories.
    +     * This function does not follow symbolic links.
    +     */  
    +    public static void deleteAll(Path start) {
    +        // Walks down the tree and deletes directories on the way backup.
    +        try { 
    +            Files.walkFileTree(start, new SimpleFileVisitor<Path>() {
    --- End diff --
    
    Could this `new SimpleFileVisitor` be broken out as a `static` inner class? 


---

[GitHub] jena pull request #458: JENA-1481, JENA-1586: Delete databases and expel fro...

Posted by afs <gi...@git.apache.org>.
Github user afs commented on a diff in the pull request:

    https://github.com/apache/jena/pull/458#discussion_r209313977
  
    --- Diff: jena-base/src/main/java/org/apache/jena/atlas/io/IO.java ---
    @@ -364,4 +369,31 @@ public static String uniqueFilename(String directory, String base, String ext) {
                 return null ;
             }
         }
    +    
    +    /** Delete everything from a {@code Path} start point, including the path itself.
    +     * This function works on files or directories.
    +     * This function does not follow symbolic links.
    +     */  
    +    public static void deleteAll(Path start) {
    +        // Walks down the tree and deletes directories on the way backup.
    +        try { 
    +            Files.walkFileTree(start, new SimpleFileVisitor<Path>() {
    --- End diff --
    
    Could do though I prefer to put the action code near to its usage, all in the method. (It is also an idiom (e.g. stackoverflow) to inline the visitor.)


---

[GitHub] jena pull request #458: JENA-1481, JENA-1586: Delete databases and expel fro...

Posted by ajs6f <gi...@git.apache.org>.
Github user ajs6f commented on a diff in the pull request:

    https://github.com/apache/jena/pull/458#discussion_r209309616
  
    --- Diff: jena-fuseki2/jena-fuseki-core/src/main/java/org/apache/jena/fuseki/mgt/ActionDatasets.java ---
    @@ -288,28 +294,89 @@ protected void execDeleteItem(HttpAction action) {
             if ( ! action.getDataAccessPointRegistry().isRegistered(name) )
                 ServletOps.errorNotFound("No such dataset registered: "+name);
     
    +        // This acts as a lock. 
             systemDSG.begin(ReadWrite.WRITE) ;
             boolean committed = false ;
    +
             try {
                 // Here, go offline.
                 // Need to reference count operations when they drop to zero
                 // or a timer goes off, we delete the dataset.
     
                 DataAccessPoint ref = action.getDataAccessPointRegistry().get(name) ;
    +            
                 // Redo check inside transaction.
                 if ( ref == null )
                     ServletOps.errorNotFound("No such dataset registered: "+name);
    +            
    +            String filename = name.startsWith("/") ? name.substring(1) : name;
    +            List<String> configurationFiles = FusekiSystem.existingConfigurationFile(filename);
    +            if  ( configurationFiles.size() != 1 ) {
    +                // This should not happen.
    +                action.log.warn(format("[%d] There are %d configuration files, not one.", action.id, configurationFiles.size()));
    +                ServletOps.errorOccurred(
    +                    format(
    +                        "There are %d configuration files, not one. Delete not performed; clearup of the filesystem needed.",
    +                        action.id, configurationFiles.size()));
    +            }
    +            
    +            String cfgPathname = configurationFiles.get(0);
    +            
    +            // Delete configuration file.
    +            // Once deleted, server restart will not have the database. 
    +            FileOps.deleteSilent(cfgPathname);
     
    -            // Make it invisible to the outside.
    +            // Get before removing.
    +            DatasetGraph database = ref.getDataService().getDataset();
    +            // Make it invisible in this running server.
                 action.getDataAccessPointRegistry().remove(name);
    -            // Delete configuration file.
    -            // Should be only one, undo damage if multiple.
    -            String filename = name.startsWith("/") ? name.substring(1) : name;
    -            FusekiSystem.existingConfigurationFile(filename).stream().forEach(FileOps::deleteSilent);
    -            // Leave the database in place. if it is in /databases/, recreating the
    -            // configuration will make the database reappear. This is intentional.
    -            // Deleting a large database by accident is a major mistake.
     
    +            // JENA-1481 & JENA-1586 : Delete the database.
    +            // Delete the database for real only when it is in the server "run/databases"
    +            // area. Don't delete databases that reside elsewhere. We do delete the
    +            // configuration file, so the databases will not be associated with the server
    +            // anymore.
    +
    +            boolean tryToRemoveFiles = true;
    +            
    +            // Text databases.
    +            // Close the in-JVM objects for Lucene index and databases. 
    +            // Do not delete files; at least for the lucene index, they are likely outside the run/databases. 
    +            if ( database instanceof DatasetGraphText ) {
    +                DatasetGraphText dbtext = (DatasetGraphText)database;
    +                database = dbtext.getBase();
    +                dbtext.getTextIndex().close();
    +                tryToRemoveFiles = false ;
    +            }
    +
    +            boolean isTDB1 = org.apache.jena.tdb.sys.TDBInternal.isTDB1(database);
    +            boolean isTDB2 = org.apache.jena.tdb2.sys.TDBInternal.isTDB2(database);
    +            
    +            if ( ( isTDB1 || isTDB2 ) ) {
    +                // JENA-1586: Remove database from the process.
    +                if ( isTDB1 )
    +                    org.apache.jena.tdb.sys.TDBInternal.expel(database);
    +                if ( isTDB2 )
    +                    org.apache.jena.tdb2.sys.TDBInternal.expel(database);
    +            
    +                // JENA-1481: Really delete files.
    +                // Find the database files (may not be any - e.g. in-memory).
    +                Path pDatabase = FusekiSystem.dirDatabases.resolve(filename);
    +                if ( tryToRemoveFiles && Files.exists(pDatabase)) {
    +                    try { 
    +                        if ( Files.isSymbolicLink(pDatabase)) {
    +                            action.log.info(format("[%d] Database is a symbolic link, not removing files", action.id, pDatabase)) ;
    +                        } else {
    +                            IO.deleteAll(pDatabase);
    +                            action.log.info(format("[%d] Deleted database files %s", action.id, pDatabase)) ;
    +                        }
    +                    } catch (RuntimeIOException e) {
    +                        e.printStackTrace();
    --- End diff --
    
    Should this be sent to a logger?


---

[GitHub] jena pull request #458: JENA-1481, JENA-1586: Delete databases and expel fro...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/jena/pull/458


---

[GitHub] jena pull request #458: JENA-1481, JENA-1586: Delete databases and expel fro...

Posted by ajs6f <gi...@git.apache.org>.
Github user ajs6f commented on a diff in the pull request:

    https://github.com/apache/jena/pull/458#discussion_r209314854
  
    --- Diff: jena-base/src/main/java/org/apache/jena/atlas/io/IO.java ---
    @@ -364,4 +369,31 @@ public static String uniqueFilename(String directory, String base, String ext) {
                 return null ;
             }
         }
    +    
    +    /** Delete everything from a {@code Path} start point, including the path itself.
    +     * This function works on files or directories.
    +     * This function does not follow symbolic links.
    +     */  
    +    public static void deleteAll(Path start) {
    +        // Walks down the tree and deletes directories on the way backup.
    +        try { 
    +            Files.walkFileTree(start, new SimpleFileVisitor<Path>() {
    --- End diff --
    
    I'm not worried about it. I'm always a bit suspicious on anon inner classes because of the occasional GC problems, but I think I'm letting my suspicions run away from me.


---

[GitHub] jena pull request #458: JENA-1481, JENA-1586: Delete databases and expel fro...

Posted by afs <gi...@git.apache.org>.
Github user afs commented on a diff in the pull request:

    https://github.com/apache/jena/pull/458#discussion_r209313461
  
    --- Diff: jena-fuseki2/jena-fuseki-core/src/main/java/org/apache/jena/fuseki/mgt/ActionDatasets.java ---
    @@ -288,28 +294,89 @@ protected void execDeleteItem(HttpAction action) {
             if ( ! action.getDataAccessPointRegistry().isRegistered(name) )
                 ServletOps.errorNotFound("No such dataset registered: "+name);
     
    +        // This acts as a lock. 
             systemDSG.begin(ReadWrite.WRITE) ;
             boolean committed = false ;
    +
             try {
                 // Here, go offline.
                 // Need to reference count operations when they drop to zero
                 // or a timer goes off, we delete the dataset.
     
                 DataAccessPoint ref = action.getDataAccessPointRegistry().get(name) ;
    +            
                 // Redo check inside transaction.
                 if ( ref == null )
                     ServletOps.errorNotFound("No such dataset registered: "+name);
    +            
    +            String filename = name.startsWith("/") ? name.substring(1) : name;
    +            List<String> configurationFiles = FusekiSystem.existingConfigurationFile(filename);
    +            if  ( configurationFiles.size() != 1 ) {
    +                // This should not happen.
    +                action.log.warn(format("[%d] There are %d configuration files, not one.", action.id, configurationFiles.size()));
    +                ServletOps.errorOccurred(
    +                    format(
    +                        "There are %d configuration files, not one. Delete not performed; clearup of the filesystem needed.",
    +                        action.id, configurationFiles.size()));
    +            }
    +            
    +            String cfgPathname = configurationFiles.get(0);
    +            
    +            // Delete configuration file.
    +            // Once deleted, server restart will not have the database. 
    +            FileOps.deleteSilent(cfgPathname);
     
    -            // Make it invisible to the outside.
    +            // Get before removing.
    +            DatasetGraph database = ref.getDataService().getDataset();
    +            // Make it invisible in this running server.
                 action.getDataAccessPointRegistry().remove(name);
    -            // Delete configuration file.
    -            // Should be only one, undo damage if multiple.
    -            String filename = name.startsWith("/") ? name.substring(1) : name;
    -            FusekiSystem.existingConfigurationFile(filename).stream().forEach(FileOps::deleteSilent);
    -            // Leave the database in place. if it is in /databases/, recreating the
    -            // configuration will make the database reappear. This is intentional.
    -            // Deleting a large database by accident is a major mistake.
     
    +            // JENA-1481 & JENA-1586 : Delete the database.
    +            // Delete the database for real only when it is in the server "run/databases"
    +            // area. Don't delete databases that reside elsewhere. We do delete the
    +            // configuration file, so the databases will not be associated with the server
    +            // anymore.
    +
    +            boolean tryToRemoveFiles = true;
    +            
    +            // Text databases.
    +            // Close the in-JVM objects for Lucene index and databases. 
    +            // Do not delete files; at least for the lucene index, they are likely outside the run/databases. 
    +            if ( database instanceof DatasetGraphText ) {
    +                DatasetGraphText dbtext = (DatasetGraphText)database;
    +                database = dbtext.getBase();
    +                dbtext.getTextIndex().close();
    +                tryToRemoveFiles = false ;
    +            }
    +
    +            boolean isTDB1 = org.apache.jena.tdb.sys.TDBInternal.isTDB1(database);
    +            boolean isTDB2 = org.apache.jena.tdb2.sys.TDBInternal.isTDB2(database);
    +            
    +            if ( ( isTDB1 || isTDB2 ) ) {
    +                // JENA-1586: Remove database from the process.
    +                if ( isTDB1 )
    +                    org.apache.jena.tdb.sys.TDBInternal.expel(database);
    +                if ( isTDB2 )
    +                    org.apache.jena.tdb2.sys.TDBInternal.expel(database);
    +            
    +                // JENA-1481: Really delete files.
    +                // Find the database files (may not be any - e.g. in-memory).
    +                Path pDatabase = FusekiSystem.dirDatabases.resolve(filename);
    +                if ( tryToRemoveFiles && Files.exists(pDatabase)) {
    +                    try { 
    +                        if ( Files.isSymbolicLink(pDatabase)) {
    +                            action.log.info(format("[%d] Database is a symbolic link, not removing files", action.id, pDatabase)) ;
    +                        } else {
    +                            IO.deleteAll(pDatabase);
    +                            action.log.info(format("[%d] Deleted database files %s", action.id, pDatabase)) ;
    +                        }
    +                    } catch (RuntimeIOException e) {
    +                        e.printStackTrace();
    --- End diff --
    
    Yes, done.


---