Skip to content

Large excels - optimize performance of writing file by excelJS + optimize generated file (MS excel opens it much faster) #1018

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Nov 12, 2019

Conversation

pzawadzki82
Copy link

No description provided.

@@ -82,6 +82,8 @@ class StylesXform extends BaseXform {
// add default fills
this._addFill({type: 'pattern', pattern: 'none'});
this._addFill({type: 'pattern', pattern: 'gray125'});

this.weakMap = new Map()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch! Quick question - any good reason not to use a WeakMap here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No reason. i'll change it

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed

@@ -28,17 +28,65 @@ function assignBool(definedName, attributes, name, defaultValue) {
}
}

function mergeDataValidations(model) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a comment describing how the model is being optimised here?

Copy link
Author

@pzawadzki82 pzawadzki82 Nov 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

currently excelJS model stores dataValidation per each cell. So if there is many cells data validation is created for every cell. I'll give you example: on my project there 5k rows x 20 columns ~ 100k cells. So in this case MS Excel openes generated file very long (minutes) or if some number is exceeded it reports that ms excel is corrupted.
In many cases validation is the same for bigger range (in our case it is while column but header). So my optimization just replaces multiple dataValidation xml nodes(with the same formula and continous range) with with one where address is defined as range instead one cell. So on my project instead 100k dataValidation nodes there is 20 and this file is opened by MS Excel fast and without errors. This modification does not change excelJS internal model - optimization is made just before save

@guyonroche guyonroche merged commit 2c3acb7 into exceljs:master Nov 12, 2019
@guyonroche
Copy link
Collaborator

@pzawadzki82 thanks for this PR - I had to modify it a little so that it would handle parsing the files too and I've extended it to include rectangular ranges

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants