Posted to user@tika.apache.org by Will Jones <sy...@gmail.com> on 2017/01/03 17:14:16 UTC

Memory issues with the Tika Facade

Hi,

Big fan of what you are doing with Apache Tika. I have been using the Tika
facade to fetch metadata on each file in a directory containing a large
number of files.

It returns the data I need, but the running process very quickly consumes a
large amount of memory as it proceeds through the files.

What am I doing wrong? The code below reproduces the problem.



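import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

import org.apache.tika.Tika;
import org.apache.tika.metadata.Metadata;

public class TikaTest {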

    public void tikaProcess(Path filePath) {
        Tika t = new Tika();
        try {
            Metadata metadata = new Metadata();

            String result = t.parse(filePath, metadata).toString();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        TikaTest tt = new TikaTest();
        try {
            Files.list(Paths.get("g:/somedata/")).forEach(
                    path -> tt.tikaProcess(path)
            );
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
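
For comparison, here is a minimal sketch of the same directory loop with every resource closed explicitly. Two things in the code above are never closed: the stream returned by Files.list, and the java.io.Reader returned by Tika.parse(Path, Metadata), on which toString() gives the object's identity rather than the extracted text. Whether the unclosed Reader accounts for the memory growth is an assumption, not something established in this post. To keep the sketch runnable with the JDK alone, the class ClosingWalk and the interface ReaderOpener are hypothetical names that stand in for the Tika call.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class ClosingWalk {

    /** Hypothetical stand-in for a call such as Tika.parse(Path, Metadata). */
    @FunctionalInterface
    public interface ReaderOpener {
        Reader open(Path path) throws IOException;
    }

    /** Reads every regular file in dir to the end and returns the total character count. */
    public static long drainAll(Path dir, ReaderOpener opener) throws IOException {
        // Files.list keeps a directory handle open until the stream is closed.
        try (Stream<Path> paths = Files.list(dir)) {
            return paths.filter(Files::isRegularFile)
                        .mapToLong(path -> drain(path, opener))
                        .sum();
        } catch (UncheckedIOException e) {
            // Unwrap the exception raised inside the lambda.
            throw e.getCause();
        }
    }

    private static long drain(Path path, ReaderOpener opener) {
        char[] buffer = new char[8192];
        long total = 0;
        // try-with-resources closes the Reader even if read() throws.
        try (Reader reader = opener.open(path)) {
            int n;
            while ((n = reader.read(buffer)) != -1) {
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Uses the current directory when no argument is given.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        // Files.newBufferedReader decodes as UTF-8 and fails on other input.
        System.out.println(drainAll(dir, Files::newBufferedReader));
    }
}
```

With Tika on the classpath, the opener passed to drainAll could be written as path -> tika.parse(path, new Metadata()), reusing one Tika instance for the whole run instead of creating a new one per file.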