-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Description
🐛 Bug Report
stream.xlsx.WorkbookReader.parse()
does not cleanup the temporary xml files it creates if iteration is stopped prematurely.
This happens whenever iteration is prematurely stopped, e.g.:
const workbookReader = new exceljs.stream.xlsx.WorkbookReader(/*...*/);
for await (const worksheetReader of workbookReader) {
// stop iteration
break;
}
which causes all temporary xml files generated by the WorkbookReader to not be cleaned up until process termination.
This is primarily a problem for long-running server applications, since those may be up for very long timespans (i noticed this bug due to 50GB+ of xml files hanging around in the tmp directory of an express application that had been running for ~ 2 months)
Lib version: 4.3.0
Steps To Reproduce
Here's a code example that demonstrates the problem:
-
any-excel-file.xlsx
can be any excel file (contents don't matter) -
For
exampleGood()
"Cleanup called" will be outputted for each worksheet in the excel file (and the associated temporary file will be deleted) -
For
exampleBad()
"Cleanup called" will NEVER be outputted, and all associated temporary files will remain until process termination.
const exceljs = require("exceljs");
const fs = require("fs");
const tmp = require("tmp");
// hooking tmp.file to output "Cleanup called" whenever
// the cleanupCallback gets called
const origTmpFile = tmp.file;
tmp.file = function(cb) {
origTmpFile(function(err, path, fd, cleanupCallback) {
cb(err, path, fd, function() {
console.log("Cleanup called");
cleanupCallback();
});
});
};
async function exampleGood() {
const workbookReader = new exceljs.stream.xlsx.WorkbookReader(fs.createReadStream("./any-excel-file.xlsx"), {});
for await (const worksheetReader of workbookReader) {
// iteration will run until the end
}
}
async function exampleBad() {
const workbookReader = new exceljs.stream.xlsx.WorkbookReader(fs.createReadStream("./any-excel-file.xlsx"), {});
for await (const worksheetReader of workbookReader) {
// aborting iteration prematurely
// throw or return would have the same effect
break;
}
}
async function main() {
console.log("GOOD:");
await exampleGood();
// "Cleanup called" will NOT be outputted in this case:
console.log("BAD:");
await exampleBad();
console.log("DONE");
}
main();
The expected behaviour:
Temporary files should be cleaned up if iteration stops before reaching the end.
Possible solution (optional, but very helpful):
for await
/ for
in javascript will automatically call .return()
on the (async-) iterable if it doesn't complete iteration (to give the iterable a chance to cleanup)
For (async-)generator iterables .return()
will result in finally
blocks around the yield
statement to be executed, e.g.:
async function* foo() {
try {
yield 1;
console.log("AFTER YIELD");
yield 2;
yield 3;
} finally {
console.log("FINALLY");
}
}
for await(const _ of foo()) {
break;
}
will output "FINALLY" (but NOT "AFTER YIELD")
So in this case adding a large try-finally around parse()
should suffice to cleanup the temporary files.