Hello guys, today I would like to write about how to avoid JavaScript Heap Out of Memory in NodeJS.
JavaScript heap out of memory normally happens when you are reading a very large file.
But sometimes it can also happen when you are pushing data into an array inside a big loop.
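
For example, the naive way of reading a big file, which is what usually triggers the error, looks like this (a minimal sketch, with a placeholder file path):

const fs = require('fs');

// Naive approach: load the entire file into memory at once.
// On a multi-GB file this is what typically causes "JavaScript heap out of memory".
const data = fs.readFileSync('your/path/file', 'utf8');
const lines = data.split('\n');
console.log(`Read ${lines.length} lines`);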

What is JavaScript Heap Out of Memory?

JavaScript Heap Out of Memory means you have reached the limit of NodeJS memory usage. The default memory limit in V8 is around 1.7 GB, so you have to increase it manually if you hit this limit.
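
If you are curious what the limit actually is on your machine, you can check it with Node's built-in v8 module (a small sketch, not part of the examples below):

const v8 = require('v8');

// heap_size_limit is the maximum heap size V8 will allow, in bytes
const limitGb = v8.getHeapStatistics().heap_size_limit / (1024 * 1024 * 1024);
console.log(`V8 heap limit: ${limitGb.toFixed(2)} GB`);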

Using Stream

When you are going to handle a big file in NodeJS, the best practice is to use a stream. A stream breaks the work into small chunks (for example, line by line), so memory usage stays low while processing big data. But using a native stream alone is still not enough for very large files; you can still get JavaScript Heap Out of Memory. So below I describe the simple way and the best way of using streams to avoid it.

Simple Way

After searching through Google and doing some research, I found that the simplest way to avoid JavaScript Heap Out of Memory in NodeJS is to use createReadStream.

  • Example for reading a file
    const fs = require('fs');

    const reader = fs.createReadStream('your/path/file');
    reader.on('data', (chunk) => {
      // handle each chunk of data here (a chunk is a Buffer, not a single line)
    });
    reader.on('error', (err) => {
      console.log(err);
    });
    reader.on('end', () => {
      console.log('Reading file success!');
    });

If your file is under 400 MB, the above method still works well. But what if your file is very large, like 1 GB or 2 GB in size? Then this way still fails, and you still get JavaScript Heap Out of Memory.

To solve this problem, you can pass an extra argument when executing your JS file:

node --max-old-space-size=4096 yourFile.js

This is a very simple fix because it just raises the default memory limit of NodeJS. It works, but I don't recommend this way.
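
If you cannot change the command line directly (for example when the script is started by an npm script or a process manager), on recent NodeJS versions the same flag can also be passed through the NODE_OPTIONS environment variable:

NODE_OPTIONS="--max-old-space-size=4096" node yourFile.js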

Best Way

So what is the best way to handle a big file? The answer is to turn your native stream into an event stream.
I know this is not as simple, but thanks to Dominictarr, who created the event-stream library, this becomes the most efficient way to work with streams in NodeJS.

Example for reading a file

const fs = require('fs');
const es = require('event-stream');

fs.createReadStream('your/path/file')
  .pipe(es.split())                  // split the raw chunks into lines
  .pipe(
    es.mapSync(function (line) {
      // handle data line by line here
    })
  )
  .on('error', function (err) {
    console.log(err);
  })
  .on('end', function () {
    console.log('Reading file success!');
  });

I successfully read a log file around 2 GB in size this way, without having to set

--max-old-space-size=4096

Big JSON File Problem

I faced another issue when the file contains a JSON string. The method above becomes very slow because the data still has to be parsed into JSON objects. But thanks again to Dominictarr, who created the JSONStream library, which improves the parsing performance when combined with event-stream.

Example for parsing a big JSON file

const fs = require('fs');
const es = require('event-stream');
const JSONStream = require('JSONStream');

const json = [];
fs.createReadStream('your/path/file')
  .pipe(JSONStream.parse('*'))       // '*' emits each element of a top-level JSON array
  .pipe(
    es.mapSync(function (item) {
      json.push(item);
    })
  )
  .on('error', function (err) {
    console.log(err);
  })
  .on('end', function () {
    console.log('Reading file success!');
  });

In my case, event-stream + JSONStream made parsing the string into JSON objects about 3x faster.
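
Note that pushing every parsed object into an array still keeps all of them in memory. If you do not actually need the whole dataset at once, you can process each item inside mapSync instead; this keeps memory flat even for huge files (a sketch under the same assumption that the file is a top-level JSON array):

const fs = require('fs');
const es = require('event-stream');
const JSONStream = require('JSONStream');

let count = 0;
fs.createReadStream('your/path/file')
  .pipe(JSONStream.parse('*'))
  .pipe(
    es.mapSync(function (item) {
      count += 1;                    // process the item here instead of storing it
    })
  )
  .on('end', function () {
    console.log(`Processed ${count} objects`);
  });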

Conclusion

Using streams is the best practice for handling big data, but not for speed. Even if you successfully avoid JavaScript Heap Out of Memory in NodeJS, reading a big file is always slow.

Always remember: stop, or better, don't even think about using plain files to store big data in the future. If you already have a big file, for example a log file, try moving it into an engine like ElasticSearch or Hadoop. This will make it much easier to manage your log data.

Thank You.