Skip to content

Heuristically filter out minified files #114

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 11, 2016
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 5 additions & 2 deletions bin/eslint.js
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@ console.log = console.error;

var eslint = require('../lib/eslint-patch')(require('eslint'));

var BatchSanitizer = require("../lib/batch_sanitizer");
var CLIEngine = eslint.CLIEngine;
var docs = eslint.docs;
var fs = require("fs");
Expand Down Expand Up @@ -183,17 +184,19 @@ function analyzeFiles() {
var batchNum = 0
, batchSize = 10
, batchFiles
, batchReport;
, batchReport
, sanitizedBatchFiles;

while(analysisFiles.length > 0) {
batchFiles = analysisFiles.splice(0, batchSize);
sanitizedBatchFiles = (new BatchSanitizer(batchFiles)).sanitizedFiles();

if (debug) {
process.stderr.write("Analyzing: " + batchFiles + "\n");
}

runWithTiming("analyze-batch-" + batchNum, function() {
batchReport = cli.executeOnFiles(batchFiles);
batchReport = cli.executeOnFiles(sanitizedBatchFiles);
});
runWithTiming("report-batch" + batchNum, function() {
batchReport.results.forEach(function(result) {
Expand Down
41 changes: 41 additions & 0 deletions lib/batch_sanitizer.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
var fs = require("fs");

var MINIFIED_AVG_LINE_LENGTH_CUTOFF = 100;

function BatchSanitizer(files, stderr) {
this.files = files;
this.stderr = stderr || process.stderr;
}

BatchSanitizer.prototype.sanitizedFiles = function() {
var sanitizedFiles = [];

for(var i = 0; i < this.files.length; i++) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason you did this with a for loop and not something like forEach?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no reason. Want me to switch to forEach instead?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Meh - don't feel strongly. This is fine!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was because I honestly forgot Node had forEach.

if (this.isMinified(this.files[i])) {
this.stderr.write("WARN: Skipping " + this.files[i] + ": it appears to be minified\n");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if we catch a false positive? How does the user analyze that file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a valid concern. In manual testing I didn't find false positives, but the experience would be not getting a file analyzed:

  • file reported as skipped in stderr output for engine on builds page
  • no other eslint issues reported for that file on .com
  • file still rated

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok - good to be aware of that as a caveat here in case anyone brings it up

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think this will be something to think about if we do start seeing false positives. Either tweaking the heuristic or adding some "no really I mean it" config flag. I figure we'll cross that bridge when we get to it. In the things we tested while developing it, no "real" code really naturally came anywhere close to crossing this threshold.

} else {
sanitizedFiles.push(this.files[i]);
}
}

return sanitizedFiles;
};

BatchSanitizer.prototype.isMinified = function(path) {
var buf = fs.readFileSync(path)
, newline = "\n".charCodeAt(0)
, lineCount = 0
, charsSeen = 0;

for(var i = 0; i < buf.length; i++) {
if (buf[i] === newline) {
lineCount++;
} else {
charsSeen++;
}
}

return (charsSeen / lineCount) >= MINIFIED_AVG_LINE_LENGTH_CUTOFF;
};

module.exports = BatchSanitizer;
17 changes: 17 additions & 0 deletions test/batch_sanitizer_test.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
var BatchSanitizer = require("../lib/batch_sanitizer")
, expect = require("chai").expect;

describe("BatchSanitizer module", function() {
describe(".sanitizedFiles()", function() {
it("filters out files that appear minified", function() {
var stderr = { contents: "", write: function(str) { this.contents += str; } }
, sanitizer = new BatchSanitizer(
["./test/fixtures/minified_batch/minified.js", "./test/fixtures/minified_batch/unminified.js"],
stderr
);

expect(sanitizer.sanitizedFiles()).to.eql(["./test/fixtures/minified_batch/unminified.js"]);
expect(stderr.contents).to.match(/WARN: Skipping/);
});
});
});
Loading