BufferedReader no longer buffering after a while?

Posted by BobTurbo on Stack Overflow
Published on 2011-01-02T02:26:15Z Indexed on 2011/01/02 7:53 UTC

Sorry, I can't post the code, but I have a BufferedReader with its buffer size set to 50000000 (the constructor argument counts chars, not bytes). It works as you would expect for the first half hour: the HDD light flashes every two minutes or so as it reads in a big chunk of data, then goes quiet again while the CPU processes it. But after about half an hour (this is a very big file), the HDD starts thrashing as if it were reading one byte at a time. It is still in the same loop, and I believe I checked free RAM to rule out swapping (heap size is the default).
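A minimal sketch of one way to test the "one byte at a time" impression, not code from the original program: wrap the underlying reader in a counting wrapper and see how many chars each bulk read actually delivers to the BufferedReader. The names `CountingReader`, `ReadSizeCheck`, and `measure` are hypothetical, and the `StringReader` input and small buffer size are stand-ins for the real `FileReader` and the 50000000 buffer.

```java
import java.io.BufferedReader;
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

/** Hypothetical helper: counts bulk reads made against the wrapped reader. */
class CountingReader extends FilterReader {
    long calls = 0; // number of bulk reads that returned data
    long chars = 0; // total chars returned by those reads

    CountingReader(Reader in) {
        super(in);
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            calls++;
            chars += n;
        }
        return n;
    }
}

public class ReadSizeCheck {
    /** Reads every line and returns {lines, underlying read calls, chars read}. */
    static long[] measure(Reader source, int bufferChars) throws IOException {
        CountingReader counter = new CountingReader(source);
        long lines = 0;
        try (BufferedReader reader = new BufferedReader(counter, bufferChars)) {
            while (reader.readLine() != null) {
                lines++;
            }
        }
        return new long[] { lines, counter.calls, counter.chars };
    }

    public static void main(String[] args) throws IOException {
        // Stand-in input; the real program would pass a FileReader here.
        Reader source = new StringReader("line one\nline two\nline three\n");
        long[] r = measure(source, 1 << 16);
        System.out.println("lines=" + r[0] + " reads=" + r[1] + " chars=" + r[2]);
    }
}
```

If the average chars per read stays large while the disk is thrashing, the small reads are probably not coming from the BufferedReader itself, and the disk activity is worth looking for elsewhere.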

Probably won't get any helpful answers, but worth a try.

OK, I have changed the heap size to 768 MB and still nothing. There is plenty of free memory, and java.exe is only using about 300 MB.

Now I have profiled it, and the heap stays at about 200 MB, well below what is available. CPU stays at 50%. Yet the HDD starts thrashing like crazy. I have no idea why. I am going to rewrite the whole thing in C#; that is my solution.
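For anyone wanting to repeat the heap check without a profiler, here is a rough sketch that logs heap use from inside the processing loop using only `java.lang.Runtime`. The class name `HeapLog`, the helper names, and the logging interval are hypothetical, and the loop is a stand-in for the real `readLine()` loop.

```java
public class HeapLog {
    private static final long MB = 1024L * 1024L;

    /** Heap currently in use by this JVM, in megabytes. */
    static long usedHeapMb() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / MB;
    }

    /** Upper limit the JVM will try to grow the heap to, in megabytes. */
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / MB;
    }

    public static void main(String[] args) {
        long lines = 0;
        // Stand-in for the real line-reading loop.
        for (int i = 0; i < 3_000_000; i++) {
            lines++;
            // Log every million lines (interval chosen arbitrarily).
            if (lines % 1_000_000 == 0) {
                System.out.println(lines + " lines, heap " + usedHeapMb()
                        + " MB of " + maxHeapMb() + " MB max");
            }
        }
    }
}
```

These figures cover only the Java heap, so they would not show memory used outside it, such as the operating system's file cache.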

Here is the code (it is just a throw-away script, not pretty):

    // a through g are precompiled java.util.regex.Pattern objects defined elsewhere (not shown).
    BufferedReader s = null;
    HashMap<String, Integer> allWords = new HashMap<String, Integer>();
    HashSet<String> pageWords = new HashSet<String>();
    long[] pageCount = new long[78592];
    long pages = 0;

    Scanner wordFile = new Scanner(new BufferedReader(new FileReader("allWords.txt")));
    while (wordFile.hasNext()) {
        allWords.put(wordFile.next(), Integer.parseInt(wordFile.next()));
    }
    s = new BufferedReader(new FileReader("wikipedia/enwiki-latest-pages-articles.xml"), 50000000);
    StringBuilder words = new StringBuilder();
    String nextLine = null;
    while ((nextLine = s.readLine()) != null) {
        if (a.matcher(nextLine).matches()) {
            continue;
        }
        else if (b.matcher(nextLine).matches()) {
            continue;
        }
        else if (c.matcher(nextLine).matches()) {
            continue;
        }
        else if (d.matcher(nextLine).matches()) {
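            nextLine = s.readLine();
            if (nextLine == null) {
                break; // readLine() returns null at end of file
            }
            if (e.matcher(nextLine).matches()) {
                String following = s.readLine();
                if (following != null && f.matcher(following).matches()) {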
                    pageWords.addAll(Arrays.asList(words.toString().toLowerCase().split("[^a-zA-Z]")));
                    words.setLength(0);
                    pages++;
                    for (String word : pageWords) {
                        if (allWords.containsKey(word)) {
                            pageCount[allWords.get(word)]++;
                        }
                        else if (!word.isEmpty() && allWords.containsKey(word.substring(0, word.length() - 1))) {
                            pageCount[allWords.get(word.substring(0, word.length() - 1))]++;
                        }
                    }
                    pageWords.clear();
                }
            }
        }
        else if (g.matcher(nextLine).matches()) {
            continue;
        }
        words.append(nextLine);
        words.append(" ");
    }
