You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@commons.apache.org by "Georg Henzler (Created) (JIRA)" <ji...@apache.org> on 2011/10/18 22:43:10 UTC

[jira] [Created] (IO-288) Supply a ReverseFileReader

Supply a ReverseFileReader
--------------------------

                 Key: IO-288
                 URL: https://issues.apache.org/jira/browse/IO-288
             Project: Commons IO
          Issue Type: New Feature
          Components: Utilities
            Reporter: Georg Henzler
         Attachments: ReverseFileReader.java

I needed to analyse a log file today and I was looking for a ReverseFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing is huge and it is not a good idea to load the whole content in the memory. 

So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used follows:

int blockSize = 4096; // only that much memory is needed, no matter how big the file is
ReverseFileReader reverseFileReader = new ReverseFileReader(myFile, blockSize, "UTF-8");
String line = null;
while((line=reverseFileReader.readLine())!=null) {
  ... // use the line
  if(enoughLinesSeen) {
     break;  
  }

}
reverseFileReader.close();

I believe this could be useful for other people as well!


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (IO-288) Supply a ReversedLinesFileReader

Posted by "Henri Yandell (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Yandell updated IO-288:
-----------------------------

    Fix Version/s: 2.2
    
> Supply a ReversedLinesFileReader 
> ---------------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>             Fix For: 2.2
>
>         Attachments: ReversedLinesFileReader.zip
>
>
> I needed to analyse a log file today and I was looking for a ReversedLinesFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReversedLinesFileReader reversedLinesFileReader = new ReversedLinesFileReader (myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reversedLinesFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reversedLinesFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (IO-288) Supply a ReversedLinesFileReader

Posted by "Niall Pemberton (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13181778#comment-13181778 ] 

Niall Pemberton commented on IO-288:
------------------------------------

ReversedLinesFileReader should be in the org.apache.commons.io.input package along with all the other InputStreams & Readers, so I have moved it.

Also fixed some checkstyle issues (removed tabs, javadocs etc)
                
> Supply a ReversedLinesFileReader 
> ---------------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>             Fix For: 2.2
>
>         Attachments: ReversedLinesFileReader0.3.zip
>
>
> I needed to analyse a log file today and I was looking for a ReversedLinesFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReversedLinesFileReader reversedLinesFileReader = new ReversedLinesFileReader (myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reversedLinesFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reversedLinesFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (IO-288) Supply a ReverseFileReader

Posted by "Georg Henzler (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13130101#comment-13130101 ] 

Georg Henzler commented on IO-288:
----------------------------------

I will provide you with Unit tests if you're interested - I tested the class with a main class and different input files but it's easy to make Unit tests from the main class. 

Regarding your questions:
- The Tailer class listens to file changes (as the unix tail does) and notifies a provided Listener passing the added line. The ReverseFileReader starts at the last line of a file and moves towards the start of the file (ignoring added lines after it has instantiated).
- We could subclass FileReader but I'm not sure how to implement e.g. read(char[] cbuf, int off, int len)... implementing this going backward would be hard. Mixing going forward and backward is probably not really intuitive. I would suggest that if we implement FileReader, we throw a UnsupportedOperationException for most of the Reader inferface's methods. 
- I'm not sure of the Filename... is BufferedReverseFileReader a better name to emphasize on the fact that it's all about the method readLine()? Any other name suggestions? 
 


                
> Supply a ReverseFileReader
> --------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>         Attachments: ReverseFileReader.java
>
>
> I needed to analyse a log file today and I was looking for a ReverseFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReverseFileReader reverseFileReader = new ReverseFileReader(myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reverseFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reverseFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Issue Comment Edited] (IO-288) Supply a ReversedLinesFileReader

Posted by "Lars Kolb (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210212#comment-13210212 ] 

Lars Kolb edited comment on IO-288 at 2/17/12 12:41 PM:
--------------------------------------------------------

* [Web|http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LineReader.java?view=log]
* [SVN|http://hadoop.apache.org/common/version_control.html]

Basically, it would be required to support:
{code}
FSDataInputStream is= FileSystem.get(conf);
is.seek(offset);

//constructor takes (positioned) Input stream
public ReversedLinesReader(InputStream is)
{
   //do not seek to end of file, simply start reading from is
}

public int readLine(Text text)
{
   //return bytes read and store line in text
   //alternatively one could return a Pair<String,Integer> to not depend on org.apache.hadoop.io.Text
}
{code}

                
      was (Author: weihnachtsmann):
    * [Web|http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LineReader.java?view=log]
* [SVN|http://hadoop.apache.org/common/version_control.html]

Basically, it would be required to support:
{code}
FSDataInputStream is= FileSystem.get(conf);
is.seek(offset);

//constructor takes (positioned) Input stream
public ReversedLinesFileReader(InputStream is)
{
   //do not seek to end of file, simply start reading from is
}

public int readLine(Text text)
{
   //return bytes read and store line in text
   //alternatively one could return a Pair<String,Integer> to not depend on org.apache.hadoop.io.Text
}
{code}

                  
> Supply a ReversedLinesFileReader 
> ---------------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>             Fix For: 2.2
>
>         Attachments: ReversedLinesFileReader0.3.zip
>
>
> I needed to analyse a log file today and I was looking for a ReversedLinesFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReversedLinesFileReader reversedLinesFileReader = new ReversedLinesFileReader (myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reversedLinesFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reversedLinesFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (IO-288) Supply a ReversedLinesFileReader

Posted by "Lars Kolb (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210212#comment-13210212 ] 

Lars Kolb commented on IO-288:
------------------------------

* [Web|http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LineReader.java?view=log]
* [SVN|http://hadoop.apache.org/common/version_control.html]

Basically, it would be required to support:
{code}
FSDataInputStream is= FileSystem.get(conf);
is.seek(offset);

//constructor takes (positioned) Input stream
public ReversedLinesFileReader(InputStream is)
{
   //do not seek to end of file, simply start reading from is
}

public int readLine(Text text)
{
   //return bytes read and store line in text
   //alternatively one could return a Pair<String,Integer> to not depend on org.apache.hadoop.io.Text
}
{code}

                
> Supply a ReversedLinesFileReader 
> ---------------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>             Fix For: 2.2
>
>         Attachments: ReversedLinesFileReader0.3.zip
>
>
> I needed to analyse a log file today and I was looking for a ReversedLinesFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReversedLinesFileReader reversedLinesFileReader = new ReversedLinesFileReader (myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reversedLinesFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reversedLinesFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Issue Comment Edited] (IO-288) Supply a ReverseFileReader

Posted by "Georg Henzler (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13130101#comment-13130101 ] 

Georg Henzler edited comment on IO-288 at 10/18/11 10:54 PM:
-------------------------------------------------------------

I will provide you with Unit tests if you're interested - I tested the class with a main class and different input files but it's easy to create Unit tests from the main class. 

Regarding your questions:
- The Tailer class listens to file changes (as the unix tail does) and notifies a provided Listener passing the added line. The ReverseFileReader starts at the last line of a file and moves towards the start of the file (ignoring added lines after it has been instantiated).
- We could subclass FileReader but I'm not sure how to implement e.g. read(char[] cbuf, int off, int len)... implementing this going backward would be hard. Mixing going forward and backward is probably not really intuitive. I would suggest that if we implement FileReader, we throw a UnsupportedOperationException for most of the Reader inferface's methods. 
- I'm not sure of the Filename... is BufferedReverseFileReader a better name to emphasize on the fact that it's all about the method readLine()? Any other name suggestions? 
 


                
      was (Author: henzlerg):
    I will provide you with Unit tests if you're interested - I tested the class with a main class and different input files but it's easy to make Unit tests from the main class. 

Regarding your questions:
- The Tailer class listens to file changes (as the unix tail does) and notifies a provided Listener passing the added line. The ReverseFileReader starts at the last line of a file and moves towards the start of the file (ignoring added lines after it has instantiated).
- We could subclass FileReader but I'm not sure how to implement e.g. read(char[] cbuf, int off, int len)... implementing this going backward would be hard. Mixing going forward and backward is probably not really intuitive. I would suggest that if we implement FileReader, we throw a UnsupportedOperationException for most of the Reader inferface's methods. 
- I'm not sure of the Filename... is BufferedReverseFileReader a better name to emphasize on the fact that it's all about the method readLine()? Any other name suggestions? 
 


                  
> Supply a ReverseFileReader
> --------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>         Attachments: ReverseFileReader.java
>
>
> I needed to analyse a log file today and I was looking for a ReverseFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReverseFileReader reverseFileReader = new ReverseFileReader(myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reverseFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reverseFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (IO-288) Supply a ReversedLinesFileReader

Posted by "Georg Henzler (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Georg Henzler updated IO-288:
-----------------------------

    Attachment: ReversedLinesFileReader0.2.zip

I didn't really like the "inefficient" idea of using the BufferedReader to get around the encoding issue... so I read up about encodings in general and I think the solution as provided works for all one byte encodings, UTF-8, UTF-16BE/UTF-16LE and Shift-JIS. For other encodings at the moment an exception is thrown to be on the safe side (but this can be easily extended).

Also, now \r alone is treated as newline as well.

PS: Added Apache headers and removed author tag.
                
> Supply a ReversedLinesFileReader 
> ---------------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>             Fix For: 2.2
>
>         Attachments: ReversedLinesFileReader0.2.zip
>
>
> I needed to analyse a log file today and I was looking for a ReversedLinesFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReversedLinesFileReader reversedLinesFileReader = new ReversedLinesFileReader (myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reversedLinesFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reversedLinesFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (IO-288) Supply a ReversedLinesFileReader

Posted by "Sebb (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13150124#comment-13150124 ] 

Sebb commented on IO-288:
-------------------------

Good to know that it's easy to unambiguously detect CR and LF.

There seems to be a lot of spurious files in the zip archive.

I'm not sure that the getNewLineMatchByteCount() is as efficient as BufferedReader.readLine() - it seems to process characters multiple times. It could probably be improved by just checking current and previous chars. Also, I don't think it's necessary to encode \n or \r - just use the appropriate characters.

There are no tests for multi-block files where there may be lines spanning blocks.
Indeed the CRLF pair may span blocks; I'm not convinced that the code handles that correctly.
In order for getNewLineMatchByteCount() to detect all CRLF pairs, it generally needs at least 2 characters to be present; this does not seem to be guaranteed.

Note: could use a smaller block size to make the test files smaller; probably sensible to compare the results with a forward line reader. It would then be simple to have a directory of various different test files - read the file forward and store the lines; ensure that the reverse reader matches the reversed lines.

The field totalBlockCount needs to be a long, not an int.

Might simplify the code to use empty arrays rather than null.
                
> Supply a ReversedLinesFileReader 
> ---------------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>             Fix For: 2.2
>
>         Attachments: ReversedLinesFileReader0.2.zip
>
>
> I needed to analyse a log file today and I was looking for a ReversedLinesFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReversedLinesFileReader reversedLinesFileReader = new ReversedLinesFileReader (myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reversedLinesFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reversedLinesFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Issue Comment Edited] (IO-288) Supply a ReversedLinesFileReader

Posted by "Lars Kolb (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210212#comment-13210212 ] 

Lars Kolb edited comment on IO-288 at 2/17/12 12:57 PM:
--------------------------------------------------------

* [Web|http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LineReader.java?view=log]
* [SVN|http://hadoop.apache.org/common/version_control.html]

Basically, it would be required to support:
{code}
Text str= new Text();
FSDataInputStream is= FileSystem.get(conf);
is.seek(offset);

ReversedLinesReader reader= new ReversedLinesReader(is);
int bytesConsumed;
long bytesConsumedTotal=0L;

while((bytesConsumed=reader.readLine(str))>0 && bytesConsumedTotal<treshold)
{
   //...
   bytesConsumedTotal+= bytesConsumed;
}


public class ReversedLinesReader
{
   public ReversedLinesReader(InputStream is)
   {
      //simply start reading from (positioned) is
   }

   public ReversedLinesReader(File file)
   {
      //current behaviour seek to end of file
   }

   public int readLine(Text text)
   {
      //return bytes read and store line in text
      //alternatively one could return a Pair<String,Integer> to not depend on org.apache.hadoop.io.Text
   }

   public String readLine()
   {
      //current behaviour 
   }
}
{code}

                
      was (Author: weihnachtsmann):
    * [Web|http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LineReader.java?view=log]
* [SVN|http://hadoop.apache.org/common/version_control.html]

Basically, it would be required to support:
{code}
Text str= new Text();
FSDataInputStream is= FileSystem.get(conf);
is.seek(offset);
ReversedLinesReader reader= new ReversedLinesReader(is);
int bytesConsumed;
while((bytesConsumed=reader.readLine(str))>0)
{
   ...
}

public class ReversedLinesReader
{
   public ReversedLinesReader(InputStream is)
   {
      //simply start reading from (positioned) is
   }

   public ReversedLinesReader(File file)
   {
      //current behaviour seek to end of file
   }

   public int readLine(Text text)
   {
      //return bytes read and store line in text
      //alternatively one could return a Pair<String,Integer> to not depend on org.apache.hadoop.io.Text
   }

   public String readLine()
   {
      //current behaviour 
   }
}
{code}

                  
> Supply a ReversedLinesFileReader 
> ---------------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>             Fix For: 2.2
>
>         Attachments: ReversedLinesFileReader0.3.zip
>
>
> I needed to analyse a log file today and I was looking for a ReversedLinesFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReversedLinesFileReader reversedLinesFileReader = new ReversedLinesFileReader (myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reversedLinesFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reversedLinesFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (IO-288) Supply a ReversedLinesFileReader

Posted by "Sebb (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148939#comment-13148939 ] 

Sebb commented on IO-288:
-------------------------

BufferedReader.readLine() allows for LF, CRLF and CR line termination; perhaps this should too?

What about multi-byte encodings?
Can these ever have CR or LF as part of a multi-byte character?
Won't the processing fail in such cases?
If so, is that a restriction, or can it be fixed?

I'm wondering whether it would be possible to use BufferedReader.readLine() to scan forward through the buffer, saving up lines as it goes, and then return the lines in reverse order. This would use a bit more memory, but would re-use the readLine processing which is well-tested. It can also allow for the encoding, though there is still a potential issue where a buffer happens to start in the middle of a multi-byte character. It might be possible to check the first few bytes of the buffer and adjust the start offset if necessary.

We don't use @author tags in code; contributors are credited on the website instead.
Also, files should have AL headers please.
                
> Supply a ReversedLinesFileReader 
> ---------------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>             Fix For: 2.2
>
>         Attachments: ReversedLinesFileReader.zip
>
>
> I needed to analyse a log file today and I was looking for a ReversedLinesFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReversedLinesFileReader reversedLinesFileReader = new ReversedLinesFileReader (myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reversedLinesFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reversedLinesFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (IO-288) Supply a ReverseFileReader

Posted by "Georg Henzler (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Georg Henzler updated IO-288:
-----------------------------

    Attachment:     (was: ReverseFileReader.java)
    
> Supply a ReverseFileReader
> --------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>         Attachments: ReversedLinesFileReader.zip
>
>
> I needed to analyse a log file today and I was looking for a ReverseFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReverseFileReader reverseFileReader = new ReverseFileReader(myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reverseFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reverseFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (IO-288) Supply a ReversedLinesFileReader

Posted by "Georg Henzler (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Georg Henzler updated IO-288:
-----------------------------

    Summary: Supply a ReversedLinesFileReader   (was: Supply a ReverseFileReader)
    
> Supply a ReversedLinesFileReader 
> ---------------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>         Attachments: ReversedLinesFileReader.zip
>
>
> I needed to analyse a log file today and I was looking for a ReversedLinesFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReversedLinesFileReader reversedLinesFileReader = new ReversedLinesFileReader (myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reversedLinesFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reversedLinesFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (IO-288) Supply a ReverseFileReader

Posted by "Georg Henzler (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Georg Henzler updated IO-288:
-----------------------------

    Attachment: ReversedLinesFileReader.zip

Attached simple maven project with java class and unit tests.
                
> Supply a ReverseFileReader
> --------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>         Attachments: ReversedLinesFileReader.zip
>
>
> I needed to analyse a log file today and I was looking for a ReverseFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReverseFileReader reverseFileReader = new ReverseFileReader(myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reverseFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reverseFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (IO-288) Supply a ReversedLinesFileReader

Posted by "Georg Henzler (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Georg Henzler updated IO-288:
-----------------------------

    Attachment:     (was: ReversedLinesFileReader0.2.zip)
    
> Supply a ReversedLinesFileReader 
> ---------------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>             Fix For: 2.2
>
>         Attachments: ReversedLinesFileReader0.3.zip
>
>
> I needed to analyse a log file today and I was looking for a ReversedLinesFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReversedLinesFileReader reversedLinesFileReader = new ReversedLinesFileReader (myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reversedLinesFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reversedLinesFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (IO-288) Supply a ReversedLinesFileReader

Posted by "Lars Kolb (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210186#comment-13210186 ] 

Lars Kolb commented on IO-288:
------------------------------

Works like a charm, thanks Georg.

I had a similar requirement. I wanted to "chunk-wise" read a HDFS file backwards to allow file browsing similar to the Hadoop namenode's web interface. By clicking a button a user triggers to fetch the previous N lines starting from a specific offset.

With a few changes to your ReversedLinesFileReader implementation I was able to implement this functionality. I would suggest to extend your ReversedLinesFileReader to be able to operate on InputStreams and to return the number of consumed bytes ((i.e. the number of bytes actually read for "line construction" not the number of buffered)). This actually results in a Reverse org.apache.hadoop.util.LineReader.
                
> Supply a ReversedLinesFileReader 
> ---------------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>             Fix For: 2.2
>
>         Attachments: ReversedLinesFileReader0.3.zip
>
>
> I needed to analyse a log file today and I was looking for a ReversedLinesFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReversedLinesFileReader reversedLinesFileReader = new ReversedLinesFileReader (myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reversedLinesFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reversedLinesFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Issue Comment Edited] (IO-288) Supply a ReversedLinesFileReader

Posted by "Lars Kolb (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210186#comment-13210186 ] 

Lars Kolb edited comment on IO-288 at 2/17/12 11:20 AM:
--------------------------------------------------------

Works like a charm, thanks Georg.

I had a similar requirement. I wanted to "chunk-wise" read a HDFS file backwards to allow file browsing similar to the Hadoop namenode's web interface. By clicking a button a user triggers to fetch the previous N lines starting from a specific offset.

With a few changes to your ReversedLinesFileReader implementation I was able to implement this functionality. I would suggest to extend your ReversedLinesFileReader to be able to operate on InputStreams and to return the number of consumed bytes ((i.e. the number of bytes actually read for "line construction" not the number of buffered bytes). This actually results in a Reverse org.apache.hadoop.util.LineReader.
                
      was (Author: weihnachtsmann):
    Works like a charm, thanks Georg.

I had a similar requirement. I wanted to "chunk-wise" read a HDFS file backwards to allow file browsing similar to the Hadoop namenode's web interface. By clicking a button a user triggers to fetch the previous N lines starting from a specific offset.

With a few changes to your ReversedLinesFileReader implementation I was able to implement this functionality. I would suggest to extend your ReversedLinesFileReader to be able to operate on InputStreams and to return the number of consumed bytes ((i.e. the number of bytes actually read for "line construction" not the number of buffered)). This actually results in a Reverse org.apache.hadoop.util.LineReader.
                  
> Supply a ReversedLinesFileReader 
> ---------------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>             Fix For: 2.2
>
>         Attachments: ReversedLinesFileReader0.3.zip
>
>
> I needed to analyse a log file today and I was looking for a ReversedLinesFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReversedLinesFileReader reversedLinesFileReader = new ReversedLinesFileReader (myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reversedLinesFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reversedLinesFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (IO-288) Supply a ReversedLinesFileReader

Posted by "Georg Henzler (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Georg Henzler updated IO-288:
-----------------------------

    Attachment: ReversedLinesFileReader.zip

Improved javadoc and unit tests. Fixed empty file bug. 
                
> Supply a ReversedLinesFileReader 
> ---------------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>             Fix For: 2.2
>
>         Attachments: ReversedLinesFileReader.zip
>
>
> I needed to analyse a log file today and I was looking for a ReversedLinesFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReversedLinesFileReader reversedLinesFileReader = new ReversedLinesFileReader (myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reversedLinesFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reversedLinesFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (IO-288) Supply a ReversedLinesFileReader

Posted by "Georg Henzler (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Georg Henzler updated IO-288:
-----------------------------

    Attachment: ReversedLinesFileReader.zip
    
> Supply a ReversedLinesFileReader 
> ---------------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>         Attachments: ReversedLinesFileReader.zip
>
>
> I needed to analyse a log file today and I was looking for a ReversedLinesFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReversedLinesFileReader reversedLinesFileReader = new ReversedLinesFileReader (myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reversedLinesFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reversedLinesFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (IO-288) Supply a ReversedLinesFileReader

Posted by "Sebb (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13150426#comment-13150426 ] 

Sebb commented on IO-288:
-------------------------

Sorry, you're correct about needing to convert CR and LF. 
I was forgetting that BufferedReader.readLine() works on the decoded values, so does not need to encode them for comparison.
Whereas your code works on bytes, and decodes later.

AFAICT, the code depends on the line end byte arrays being sorted order of descending length. 
This should be documented. Hopefully it's not possible for an encoding to use different lengths for CR and LF!
                
> Supply a ReversedLinesFileReader 
> ---------------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>             Fix For: 2.2
>
>         Attachments: ReversedLinesFileReader0.2.zip
>
>
> I needed to analyse a log file today and I was looking for a ReversedLinesFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReversedLinesFileReader reversedLinesFileReader = new ReversedLinesFileReader (myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reversedLinesFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reversedLinesFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (IO-288) Supply a ReverseFileReader

Posted by "Georg Henzler (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Georg Henzler updated IO-288:
-----------------------------

    Attachment: ReverseFileReader.java

Sorry, I didn't notice the radio buttons when I first uploaded...
                
> Supply a ReverseFileReader
> --------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>         Attachments: ReverseFileReader.java
>
>
> I needed to analyse a log file today and I was looking for a ReverseFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReverseFileReader reverseFileReader = new ReverseFileReader(myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reverseFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reverseFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Issue Comment Edited] (IO-288) Supply a ReversedLinesFileReader

Posted by "Lars Kolb (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210212#comment-13210212 ] 

Lars Kolb edited comment on IO-288 at 2/17/12 12:46 PM:
--------------------------------------------------------

* [Web|http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LineReader.java?view=log]
* [SVN|http://hadoop.apache.org/common/version_control.html]

Basically, it would be required to support:
{code}
Text str= new Text();
FSDataInputStream is= FileSystem.get(conf);
is.seek(offset);
ReversedLinesReader reader= new ReversedLinesReader(is);
int bytesConsumed;
while((bytesConsumed=reader.readLine(str))>0)
{
   ...
}

public class ReversedLinesReader
{
   //constructor takes (positioned) Input stream
   public ReversedLinesReader(InputStream is)
   {
      //do not seek to end of file, simply start reading from is
   }

   public int readLine(Text text)
   {
      //return bytes read and store line in text
      //alternatively one could return a Pair<String,Integer> to not depend on org.apache.hadoop.io.Text
   }
}
{code}

                
      was (Author: weihnachtsmann):
    * [Web|http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LineReader.java?view=log]
* [SVN|http://hadoop.apache.org/common/version_control.html]

Basically, it would be required to support:
{code}
FSDataInputStream is= FileSystem.get(conf);
is.seek(offset);

//constructor takes (positioned) Input stream
public ReversedLinesReader(InputStream is)
{
   //do not seek to end of file, simply start reading from is
}

public int readLine(Text text)
{
   //return bytes read and store line in text
   //alternatively one could return a Pair<String,Integer> to not depend on org.apache.hadoop.io.Text
}
{code}

                  
> Supply a ReversedLinesFileReader 
> ---------------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>             Fix For: 2.2
>
>         Attachments: ReversedLinesFileReader0.3.zip
>
>
> I needed to analyse a log file today and I was looking for a ReversedLinesFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReversedLinesFileReader reversedLinesFileReader = new ReversedLinesFileReader (myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reversedLinesFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reversedLinesFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Issue Comment Edited] (IO-288) Supply a ReversedLinesFileReader

Posted by "Lars Kolb (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210212#comment-13210212 ] 

Lars Kolb edited comment on IO-288 at 2/17/12 12:50 PM:
--------------------------------------------------------

* [Web|http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LineReader.java?view=log]
* [SVN|http://hadoop.apache.org/common/version_control.html]

Basically, it would be required to support:
{code}
Text str= new Text();
FSDataInputStream is= FileSystem.get(conf);
is.seek(offset);
ReversedLinesReader reader= new ReversedLinesReader(is);
int bytesConsumed;
while((bytesConsumed=reader.readLine(str))>0)
{
   ...
}

public class ReversedLinesReader
{
   //constructor takes (positioned) Input stream
   public ReversedLinesReader(InputStream is)
   {
      //do not seek to end of file, simply start reading from is
   }

   //current behaviour
   public ReversedLinesReader(File file)
   {
      //seek to end of file
   }

   public int readLine(Text text)
   {
      //return bytes read and store line in text
      //alternatively one could return a Pair<String,Integer> to not depend on org.apache.hadoop.io.Text
   }
}
{code}

                
      was (Author: weihnachtsmann):
    * [Web|http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LineReader.java?view=log]
* [SVN|http://hadoop.apache.org/common/version_control.html]

Basically, it would be required to support:
{code}
Text str= new Text();
FSDataInputStream is= FileSystem.get(conf);
is.seek(offset);
ReversedLinesReader reader= new ReversedLinesReader(is);
int bytesConsumed;
while((bytesConsumed=reader.readLine(str))>0)
{
   ...
}

public class ReversedLinesReader
{
   //constructor takes (positioned) Input stream
   public ReversedLinesReader(InputStream is)
   {
      //do not seek to end of file, simply start reading from is
   }

   //constructor takes (positioned) Input stream
   public ReversedLinesReader(File file)
   {
      //seek to end of file
   }

   public int readLine(Text text)
   {
      //return bytes read and store line in text
      //alternatively one could return a Pair<String,Integer> to not depend on org.apache.hadoop.io.Text
   }
}
{code}

                  
> Supply a ReversedLinesFileReader 
> ---------------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>             Fix For: 2.2
>
>         Attachments: ReversedLinesFileReader0.3.zip
>
>
> I needed to analyse a log file today and I was looking for a ReversedLinesFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReversedLinesFileReader reversedLinesFileReader = new ReversedLinesFileReader (myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reversedLinesFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reversedLinesFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Issue Comment Edited] (IO-288) Supply a ReversedLinesFileReader

Posted by "Lars Kolb (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210186#comment-13210186 ] 

Lars Kolb edited comment on IO-288 at 2/17/12 12:58 PM:
--------------------------------------------------------

Works like a charm, thanks Georg.

I had a similar requirement. I wanted to "chunk-wise" read a HDFS file backwards to allow file browsing similar to the Hadoop namenode's web interface. By clicking a button a user triggers to fetch the previous N lines starting from a specific offset.

With a few changes to your ReversedLinesFileReader implementation I was able to implement this functionality. I would suggest to extend your ReversedLinesFileReader to be able to operate on InputStreams and to return the number of consumed bytes (i.e. the number of bytes actually read for "line construction" not the number of buffered bytes). This actually results in a Reverse org.apache.hadoop.util.LineReader.
                
      was (Author: weihnachtsmann):
    Works like a charm, thanks Georg.

I had a similar requirement. I wanted to "chunk-wise" read a HDFS file backwards to allow file browsing similar to the Hadoop namenode's web interface. By clicking a button a user triggers to fetch the previous N lines starting from a specific offset.

With a few changes to your ReversedLinesFileReader implementation I was able to implement this functionality. I would suggest to extend your ReversedLinesFileReader to be able to operate on InputStreams and to return the number of consumed bytes ((i.e. the number of bytes actually read for "line construction" not the number of buffered bytes). This actually results in a Reverse org.apache.hadoop.util.LineReader.
                  
> Supply a ReversedLinesFileReader 
> ---------------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>             Fix For: 2.2
>
>         Attachments: ReversedLinesFileReader0.3.zip
>
>
> I needed to analyse a log file today and I was looking for a ReversedLinesFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReversedLinesFileReader reversedLinesFileReader = new ReversedLinesFileReader (myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reversedLinesFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reversedLinesFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (IO-288) Supply a ReverseFileReader

Posted by "Georg Henzler (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Georg Henzler updated IO-288:
-----------------------------

    Description: 
I needed to analyse a log file today and I was looking for a ReversedLinesFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 

So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:

int blockSize = 4096; // only that much memory is needed, no matter how big the file is
ReversedLinesFileReader reversedLinesFileReader = new ReversedLinesFileReader (myFile, blockSize, "UTF-8"); // encoding is supported
String line = null;
while((line=reversedLinesFileReader.readLine())!=null) {
  ... // use the line
  if(enoughLinesSeen) {
     break;  
  }

}
reversedLinesFileReader.close();

I believe this could be useful for other people as well!


  was:
I needed to analyse a log file today and I was looking for a ReverseFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 

So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:

int blockSize = 4096; // only that much memory is needed, no matter how big the file is
ReverseFileReader reverseFileReader = new ReverseFileReader(myFile, blockSize, "UTF-8"); // encoding is supported
String line = null;
while((line=reverseFileReader.readLine())!=null) {
  ... // use the line
  if(enoughLinesSeen) {
     break;  
  }

}
reverseFileReader.close();

I believe this could be useful for other people as well!


    
> Supply a ReverseFileReader
> --------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>         Attachments: ReversedLinesFileReader.zip
>
>
> I needed to analyse a log file today and I was looking for a ReversedLinesFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReversedLinesFileReader reversedLinesFileReader = new ReversedLinesFileReader (myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reversedLinesFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reversedLinesFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (IO-288) Supply a ReverseFileReader

Posted by "Georg Henzler (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Georg Henzler updated IO-288:
-----------------------------

    Description: 
I needed to analyse a log file today and I was looking for a ReverseFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 

So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:

int blockSize = 4096; // only that much memory is needed, no matter how big the file is
ReverseFileReader reverseFileReader = new ReverseFileReader(myFile, blockSize, "UTF-8"); // encoding is supported
String line = null;
while((line=reverseFileReader.readLine())!=null) {
  ... // use the line
  if(enoughLinesSeen) {
     break;  
  }

}
reverseFileReader.close();

I believe this could be useful for other people as well!


  was:
I needed to analyse a log file today and I was looking for a ReverseFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing is huge and it is not a good idea to load the whole content in the memory. 

So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used follows:

int blockSize = 4096; // only that much memory is needed, no matter how big the file is
ReverseFileReader reverseFileReader = new ReverseFileReader(myFile, blockSize, "UTF-8");
String line = null;
while((line=reverseFileReader.readLine())!=null) {
  ... // use the line
  if(enoughLinesSeen) {
     break;  
  }

}
reverseFileReader.close();

I believe this could be useful for other people as well!


    
> Supply a ReverseFileReader
> --------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>         Attachments: ReverseFileReader.java
>
>
> I needed to analyse a log file today and I was looking for a ReverseFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReverseFileReader reverseFileReader = new ReverseFileReader(myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reverseFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reverseFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Issue Comment Edited] (IO-288) Supply a ReversedLinesFileReader

Posted by "Lars Kolb (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210212#comment-13210212 ] 

Lars Kolb edited comment on IO-288 at 2/17/12 12:51 PM:
--------------------------------------------------------

* [Web|http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LineReader.java?view=log]
* [SVN|http://hadoop.apache.org/common/version_control.html]

Basically, it would be required to support:
{code}
Text str= new Text();
FSDataInputStream is= FileSystem.get(conf);
is.seek(offset);
ReversedLinesReader reader= new ReversedLinesReader(is);
int bytesConsumed;
while((bytesConsumed=reader.readLine(str))>0)
{
   ...
}

public class ReversedLinesReader
{
   //constructor takes (positioned) Input stream
   public ReversedLinesReader(InputStream is)
   {
      //simply start reading from is
   }

   //current behaviour
   public ReversedLinesReader(File file)
   {
      //seek to end of file
   }

   public int readLine(Text text)
   {
      //return bytes read and store line in text
      //alternatively one could return a Pair<String,Integer> to not depend on org.apache.hadoop.io.Text
   }
}
{code}

                
      was (Author: weihnachtsmann):
    * [Web|http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LineReader.java?view=log]
* [SVN|http://hadoop.apache.org/common/version_control.html]

Basically, it would be required to support:
{code}
Text str= new Text();
FSDataInputStream is= FileSystem.get(conf);
is.seek(offset);
ReversedLinesReader reader= new ReversedLinesReader(is);
int bytesConsumed;
while((bytesConsumed=reader.readLine(str))>0)
{
   ...
}

public class ReversedLinesReader
{
   //constructor takes (positioned) Input stream
   public ReversedLinesReader(InputStream is)
   {
      //do not seek to end of file, simply start reading from is
   }

   //current behaviour
   public ReversedLinesReader(File file)
   {
      //seek to end of file
   }

   public int readLine(Text text)
   {
      //return bytes read and store line in text
      //alternatively one could return a Pair<String,Integer> to not depend on org.apache.hadoop.io.Text
   }
}
{code}

                  
> Supply a ReversedLinesFileReader 
> ---------------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>             Fix For: 2.2
>
>         Attachments: ReversedLinesFileReader0.3.zip
>
>
> I needed to analyse a log file today and I was looking for a ReversedLinesFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReversedLinesFileReader reversedLinesFileReader = new ReversedLinesFileReader (myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reversedLinesFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reversedLinesFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Issue Comment Edited] (IO-288) Supply a ReversedLinesFileReader

Posted by "Lars Kolb (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210212#comment-13210212 ] 

Lars Kolb edited comment on IO-288 at 2/17/12 12:50 PM:
--------------------------------------------------------

* [Web|http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LineReader.java?view=log]
* [SVN|http://hadoop.apache.org/common/version_control.html]

Basically, it would be required to support:
{code}
Text str= new Text();
FSDataInputStream is= FileSystem.get(conf);
is.seek(offset);
ReversedLinesReader reader= new ReversedLinesReader(is);
int bytesConsumed;
while((bytesConsumed=reader.readLine(str))>0)
{
   ...
}

public class ReversedLinesReader
{
   //constructor takes (positioned) Input stream
   public ReversedLinesReader(InputStream is)
   {
      //do not seek to end of file, simply start reading from is
   }

   //constructor takes (positioned) Input stream
   public ReversedLinesReader(File file)
   {
      //seek to end of file
   }

   public int readLine(Text text)
   {
      //return bytes read and store line in text
      //alternatively one could return a Pair<String,Integer> to not depend on org.apache.hadoop.io.Text
   }
}
{code}

                
      was (Author: weihnachtsmann):
    * [Web|http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LineReader.java?view=log]
* [SVN|http://hadoop.apache.org/common/version_control.html]

Basically, it would be required to support:
{code}
Text str= new Text();
FSDataInputStream is= FileSystem.get(conf);
is.seek(offset);
ReversedLinesReader reader= new ReversedLinesReader(is);
int bytesConsumed;
while((bytesConsumed=reader.readLine(str))>0)
{
   ...
}

public class ReversedLinesReader
{
   //constructor takes (positioned) Input stream
   public ReversedLinesReader(InputStream is)
   {
      //do not seek to end of file, simply start reading from is
   }

   public int readLine(Text text)
   {
      //return bytes read and store line in text
      //alternatively one could return a Pair<String,Integer> to not depend on org.apache.hadoop.io.Text
   }
}
{code}

                  
> Supply a ReversedLinesFileReader 
> ---------------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>             Fix For: 2.2
>
>         Attachments: ReversedLinesFileReader0.3.zip
>
>
> I needed to analyse a log file today and I was looking for a ReversedLinesFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReversedLinesFileReader reversedLinesFileReader = new ReversedLinesFileReader (myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reversedLinesFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reversedLinesFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Issue Comment Edited] (IO-288) Supply a ReversedLinesFileReader

Posted by "Lars Kolb (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210212#comment-13210212 ] 

Lars Kolb edited comment on IO-288 at 2/17/12 1:06 PM:
-------------------------------------------------------

* [Web|http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LineReader.java?view=log]
* [SVN|http://hadoop.apache.org/common/version_control.html]

Basically, it would be required to support:
{code}
Text str= new Text();
FSDataInputStream is= FileSystem.get(conf);
is.seek(offset);

ReversedLinesReader reader= new ReversedLinesReader(is);
int bytesConsumed;
long bytesConsumedTotal=0L;

while(bytesConsumedTotal<treshold && (bytesConsumed=reader.readLine(str))>0)
{
   //...
   bytesConsumedTotal+= bytesConsumed;
}


public class ReversedLinesReader
{
   public ReversedLinesReader(InputStream is)
   {
      //simply start reading from (positioned) is
   }

   public ReversedLinesReader(File file)
   {
      //current behaviour seek to end of file
   }

   public int readLine(Text text)
   {
      //return bytes read and store line in text
      //alternatively one could return a Pair<String,Integer> to not depend on org.apache.hadoop.io.Text
   }

   public String readLine()
   {
      //current behaviour 
   }
}
{code}

                
      was (Author: weihnachtsmann):
    * [Web|http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LineReader.java?view=log]
* [SVN|http://hadoop.apache.org/common/version_control.html]

Basically, it would be required to support:
{code}
Text str= new Text();
FSDataInputStream is= FileSystem.get(conf);
is.seek(offset);

ReversedLinesReader reader= new ReversedLinesReader(is);
int bytesConsumed;
long bytesConsumedTotal=0L;

while((bytesConsumed=reader.readLine(str))>0 && bytesConsumedTotal<treshold)
{
   //...
   bytesConsumedTotal+= bytesConsumed;
}


public class ReversedLinesReader
{
   public ReversedLinesReader(InputStream is)
   {
      //simply start reading from (positioned) is
   }

   public ReversedLinesReader(File file)
   {
      //current behaviour seek to end of file
   }

   public int readLine(Text text)
   {
      //return bytes read and store line in text
      //alternatively one could return a Pair<String,Integer> to not depend on org.apache.hadoop.io.Text
   }

   public String readLine()
   {
      //current behaviour 
   }
}
{code}

                  
> Supply a ReversedLinesFileReader 
> ---------------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>             Fix For: 2.2
>
>         Attachments: ReversedLinesFileReader0.3.zip
>
>
> I needed to analyse a log file today and I was looking for a ReversedLinesFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReversedLinesFileReader reversedLinesFileReader = new ReversedLinesFileReader (myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reversedLinesFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reversedLinesFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Resolved] (IO-288) Supply a ReversedLinesFileReader

Posted by "Sebb (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sebb resolved IO-288.
---------------------

    Resolution: Fixed

Thanks for your perserverance!

I had problems with committing the .txt files, as SVN tried to add SVN eol-style:native. This would have broken the tests, so they were renamed as .bin and stored as application/octet-stream.

Made some minor code changes:
- more final fields
- added extra BufferedReader comparison tests
                
> Supply a ReversedLinesFileReader 
> ---------------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>             Fix For: 2.2
>
>         Attachments: ReversedLinesFileReader0.3.zip
>
>
> I needed to analyse a log file today and I was looking for a ReversedLinesFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReversedLinesFileReader reversedLinesFileReader = new ReversedLinesFileReader (myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reversedLinesFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reversedLinesFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (IO-288) Supply a ReversedLinesFileReader

Posted by "Georg Henzler (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13150294#comment-13150294 ] 

Georg Henzler commented on IO-288:
----------------------------------

Sorry for the spurious files, i created the zip with the default utility in OS-X.

I think the code addresses most of your questions already:
- There are a few tests already (testXxxxSmallBlockSize()) that test the multi-block behaviour for lines that span a block (you can go down to a block size of 1 and it still works, that shows that the algorithm is solid I think)
- I think it's clever to encode the newline characters - that way we automatically get the correct byte sequence for multi byte encodings (e.g. UTF-16) and if a one byte-per-char-encoding chose to use different bytes it would also work (performance is no issue for this as it happens only once)
- I think about getNewLineMatchByteCount() to make it more efficient - although for the standard ISO case it ends up just being four byte comparisons instead of three. Should make almost no difference but on the pro side it makes the implementation nicely generic.
- It's true, there is an issue with block-spanning newlines to be fixed. If a windows newline (\r\n) happens to span a block a wrong extra empty line will be returned.

I'll provide a fix for the newline problem and will change totalBlockCount to long.



                
> Supply a ReversedLinesFileReader 
> ---------------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>             Fix For: 2.2
>
>         Attachments: ReversedLinesFileReader0.2.zip
>
>
> I needed to analyse a log file today and I was looking for a ReversedLinesFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReversedLinesFileReader reversedLinesFileReader = new ReversedLinesFileReader (myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reversedLinesFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reversedLinesFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (IO-288) Supply a ReversedLinesFileReader

Posted by "Georg Henzler (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Georg Henzler updated IO-288:
-----------------------------

    Attachment:     (was: ReversedLinesFileReader.zip)
    
> Supply a ReversedLinesFileReader 
> ---------------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>
> I needed to analyse a log file today and I was looking for a ReversedLinesFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReversedLinesFileReader reversedLinesFileReader = new ReversedLinesFileReader (myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reversedLinesFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reversedLinesFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (IO-288) Supply a ReversedLinesFileReader

Posted by "Georg Henzler (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Georg Henzler updated IO-288:
-----------------------------

    Attachment:     (was: ReversedLinesFileReader.zip)
    
> Supply a ReversedLinesFileReader 
> ---------------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>             Fix For: 2.2
>
>         Attachments: ReversedLinesFileReader.zip
>
>
> I needed to analyse a log file today and I was looking for a ReversedLinesFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReversedLinesFileReader reversedLinesFileReader = new ReversedLinesFileReader (myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reversedLinesFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reversedLinesFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (IO-288) Supply a ReverseFileReader

Posted by "Georg Henzler (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144146#comment-13144146 ] 

Georg Henzler commented on IO-288:
----------------------------------

@Gary: Let me know if you need anything more
                
> Supply a ReverseFileReader
> --------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>         Attachments: ReversedLinesFileReader.zip
>
>
> I needed to analyse a log file today and I was looking for a ReverseFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReverseFileReader reverseFileReader = new ReverseFileReader(myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reverseFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reverseFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (IO-288) Supply a ReversedLinesFileReader

Posted by "Georg Henzler (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210202#comment-13210202 ] 

Georg Henzler commented on IO-288:
----------------------------------

Great to hear that the class is being used already! Can you provide the exact SVN-Location of org.apache.hadoop.util.LineReader? I will try to create a base class for ReversedLinesFileReader that you could use then. 


                
> Supply a ReversedLinesFileReader 
> ---------------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>             Fix For: 2.2
>
>         Attachments: ReversedLinesFileReader0.3.zip
>
>
> I needed to analyse a log file today and I was looking for a ReversedLinesFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReversedLinesFileReader reversedLinesFileReader = new ReversedLinesFileReader (myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reversedLinesFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reversedLinesFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (IO-288) Supply a ReverseFileReader

Posted by "Georg Henzler (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Georg Henzler updated IO-288:
-----------------------------

    Attachment:     (was: ReverseFileReader.java)
    
> Supply a ReverseFileReader
> --------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>
> I needed to analyse a log file today and I was looking for a ReverseFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReverseFileReader reverseFileReader = new ReverseFileReader(myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reverseFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reverseFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (IO-288) Supply a ReversedLinesFileReader

Posted by "Georg Henzler (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Georg Henzler updated IO-288:
-----------------------------

    Attachment: ReversedLinesFileReader0.3.zip

Eventually... there's a new version 0.3 attached.

- The block-spanning newline issue is fixed
- Comment for the line-end-bytes-array added
- The last-line-is-empty-behaviour has been aligned with BufferedReader (and there is a test to check the "BufferedReader Compliancy") 
- The tests have been split up in two parametrized ones and one standard junit test.

                
> Supply a ReversedLinesFileReader 
> ---------------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>             Fix For: 2.2
>
>         Attachments: ReversedLinesFileReader0.2.zip, ReversedLinesFileReader0.3.zip
>
>
> I needed to analyse a log file today and I was looking for a ReversedLinesFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReversedLinesFileReader reversedLinesFileReader = new ReversedLinesFileReader (myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reversedLinesFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reversedLinesFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (IO-288) Supply a ReverseFileReader

Posted by "Georg Henzler (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Georg Henzler updated IO-288:
-----------------------------

    Attachment: ReverseFileReader.java
    
> Supply a ReverseFileReader
> --------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>         Attachments: ReverseFileReader.java
>
>
> I needed to analyse a log file today and I was looking for a ReverseFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing is huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReverseFileReader reverseFileReader = new ReverseFileReader(myFile, blockSize, "UTF-8");
> String line = null;
> while((line=reverseFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reverseFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (IO-288) Supply a ReverseFileReader

Posted by "Gary D. Gregory (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13130071#comment-13130071 ] 

Gary D. Gregory commented on IO-288:
------------------------------------

Hi Georg,

We cannot consider this for consideration without:
- Grant of license to Apache which is done when an attachment is uploaded.
- Unit tests.

Questions:

- How does this fit in with the Tailer class. Should one reuse the other?
- Should this be a real Reader implementation? Subclassing FileReader?


                
> Supply a ReverseFileReader
> --------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>         Attachments: ReverseFileReader.java
>
>
> I needed to analyse a log file today and I was looking for a ReverseFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReverseFileReader reverseFileReader = new ReverseFileReader(myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reverseFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reverseFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (IO-288) Supply a ReversedLinesFileReader

Posted by "Georg Henzler (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Georg Henzler updated IO-288:
-----------------------------

    Attachment:     (was: ReversedLinesFileReader.zip)
    
> Supply a ReversedLinesFileReader 
> ---------------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>             Fix For: 2.2
>
>
> I needed to analyse a log file today and I was looking for a ReversedLinesFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReversedLinesFileReader reversedLinesFileReader = new ReversedLinesFileReader (myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reversedLinesFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reversedLinesFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Issue Comment Edited] (IO-288) Supply a ReversedLinesFileReader

Posted by "Lars Kolb (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210212#comment-13210212 ] 

Lars Kolb edited comment on IO-288 at 2/17/12 12:52 PM:
--------------------------------------------------------

* [Web|http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LineReader.java?view=log]
* [SVN|http://hadoop.apache.org/common/version_control.html]

Basically, it would be required to support:
{code}
Text str= new Text();
FSDataInputStream is= FileSystem.get(conf);
is.seek(offset);
ReversedLinesReader reader= new ReversedLinesReader(is);
int bytesConsumed;
while((bytesConsumed=reader.readLine(str))>0)
{
   ...
}

public class ReversedLinesReader
{
   public ReversedLinesReader(InputStream is)
   {
      //simply start reading from (positioned) is
   }

   public ReversedLinesReader(File file)
   {
      //current behaviour seek to end of file
   }

   public int readLine(Text text)
   {
      //return bytes read and store line in text
      //alternatively one could return a Pair<String,Integer> to not depend on org.apache.hadoop.io.Text
   }

   public String readLine()
   {
      //current behaviour 
   }
}
{code}

                
      was (Author: weihnachtsmann):
    * [Web|http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LineReader.java?view=log]
* [SVN|http://hadoop.apache.org/common/version_control.html]

Basically, it would be required to support:
{code}
Text str= new Text();
FSDataInputStream is= FileSystem.get(conf);
is.seek(offset);
ReversedLinesReader reader= new ReversedLinesReader(is);
int bytesConsumed;
while((bytesConsumed=reader.readLine(str))>0)
{
   ...
}

public class ReversedLinesReader
{
   //constructor takes (positioned) Input stream
   public ReversedLinesReader(InputStream is)
   {
      //simply start reading from is
   }

   //current behaviour
   public ReversedLinesReader(File file)
   {
      //seek to end of file
   }

   public int readLine(Text text)
   {
      //return bytes read and store line in text
      //alternatively one could return a Pair<String,Integer> to not depend on org.apache.hadoop.io.Text
   }
}
{code}

                  
> Supply a ReversedLinesFileReader 
> ---------------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>             Fix For: 2.2
>
>         Attachments: ReversedLinesFileReader0.3.zip
>
>
> I needed to analyse a log file today and I was looking for a ReversedLinesFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReversedLinesFileReader reversedLinesFileReader = new ReversedLinesFileReader (myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reversedLinesFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reversedLinesFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Closed] (IO-288) Supply a ReversedLinesFileReader

Posted by "Gary D. Gregory (Closed) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gary D. Gregory closed IO-288.
------------------------------


Version 2.2 has been released and addresses this issue.
                
> Supply a ReversedLinesFileReader 
> ---------------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>             Fix For: 2.2
>
>         Attachments: ReversedLinesFileReader0.3.zip
>
>
> I needed to analyse a log file today and I was looking for a ReversedLinesFileReader: A class that behaves exactly like BufferedReader except that it goes from bottom to top when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is a fairly inefficient - the log files I'm analysing are huge and it is not a good idea to load the whole content in the memory. 
> So I ended up writing an implementation myself using little memory and the class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReversedLinesFileReader reversedLinesFileReader = new ReversedLinesFileReader (myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reversedLinesFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>      break;  
>   }
> }
> reversedLinesFileReader.close();
> I believe this could be useful for other people as well!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira