Code hotspots #2412
Comments
Do you know why it is faster? What's the source of slowdown in the current version? Btw, how did you perform the benchmark? /cc @jeffmendoza |
1k seems good to me. |
Might be hard to test that ahead of time, especially as the repos could change while doing the tests. I can play around with some comparisons using release test data before implementing anything.
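One way such a comparison could be done is to run both versions of the check over a fixed set of samples and list where they disagree. This is a minimal sketch: `classifier` and `disagreements` are hypothetical names, and the sample set is assumed to be loaded by the caller (for example from saved release test data).

```go
package main

import "sort"

// classifier is any function that decides whether file contents look like text.
// (Hypothetical type, introduced only for this sketch.)
type classifier func(content []byte) bool

// disagreements returns the sorted names of the samples on which the two
// classifiers give different answers.
func disagreements(samples map[string][]byte, oldFn, newFn classifier) []string {
	var names []string
	for name, content := range samples {
		if oldFn(content) != newFn(content) {
			names = append(names, name)
		}
	}
	// Map iteration order is random, so sort for stable output.
	sort.Strings(names)
	return names
}
```

An empty result on a representative sample set would suggest the behavior change is small in practice; any names returned are the files worth inspecting by hand.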
With Go's built-in benchmarks. I had a harness like this for both functions and then ran the benchmarks:
func BenchmarkIsText(b *testing.B) {
buf, err := os.ReadFile("testdata/script-pkg-managers")
if err != nil {
b.Fatalf("read testdata: %v", err)
}
for i := 0; i < b.N; i++ {
isText(buf)
}
} |
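To see how the cost scales with input size, the same kind of harness could be parameterized with sub-benchmarks. The following is only a sketch: `payload` and `benchmarkSizes` are hypothetical helpers, the data is synthetic rather than taken from testdata, and the sizes are arbitrary.

```go
package main

import (
	"fmt"
	"testing"
)

// payload returns n bytes of synthetic, text-like content.
// (Hypothetical stand-in for real test data.)
func payload(n int) []byte {
	const line = "echo hello world\n"
	buf := make([]byte, n)
	for i := range buf {
		buf[i] = line[i%len(line)]
	}
	return buf
}

// benchmarkSizes runs fn against inputs of several sizes, one sub-benchmark
// per size, so the results can be compared side by side.
func benchmarkSizes(b *testing.B, fn func([]byte) bool) {
	for _, size := range []int{1 << 10, 3 << 10, 1 << 20} {
		buf := payload(size)
		b.Run(fmt.Sprintf("%dB", size), func(b *testing.B) {
			b.SetBytes(int64(size)) // lets the tool report throughput
			for i := 0; i < b.N; i++ {
				fn(buf)
			}
		})
	}
}
```

In a `_test.go` file this could be called as `benchmarkSizes(b, isText)` from a `Benchmark...` function and run with `go test -bench`.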
Thanks for the additional info. So inlining the code is what speeds things up. I'm surprised there's a 2x speedup on 1k files due to that. In fact that seems weird to me because there's a single stack frame created to call the function... |
In a loop, called for each rune in the file contents.
👍 SGTM. |
A significant portion (5.5%) of the cron job is spent in raw.isText:
scorecard/checks/raw/binary_artifact.go
Lines 170 to 180 in 1fa7910
There's some experimental code that also does this, and has better performance. I plan to import/copy the implementation over.
Should be an easy performance win, but would change the behavior slightly as it only checks the first 1024 bytes.
Since you implemented the heuristic, thoughts @laurentsimon ? I know the code was introduced to reduce false positives, so should still be valid for that.
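For illustration, here is a minimal sketch of what a prefix-limited check could look like. It is an assumption about the general shape of such a heuristic, not the code in scorecard or the experimental implementation mentioned above; `isTextPrefix` and `maxSniffBytes` are hypothetical names.

```go
package main

import (
	"unicode"
	"unicode/utf8"
)

// maxSniffBytes is a hypothetical cap on how much of the file is inspected.
const maxSniffBytes = 1024

// isTextPrefix reports whether the first maxSniffBytes of content look like
// text: valid UTF-8 consisting of printable runes or whitespace.
func isTextPrefix(content []byte) bool {
	truncated := len(content) > maxSniffBytes
	if truncated {
		content = content[:maxSniffBytes]
	}
	for len(content) > 0 {
		r, size := utf8.DecodeRune(content)
		if r == utf8.RuneError && size == 1 {
			// A rune split in half by the cap is not evidence of binary
			// data; any other invalid byte is.
			return truncated && !utf8.FullRune(content)
		}
		if !unicode.IsPrint(r) && !unicode.IsSpace(r) {
			return false
		}
		content = content[size:]
	}
	return true
}
```

The cost is bounded by the cap regardless of file size, which is where a large win on big files would come from. The trade-off is that a file whose first 1024 bytes are text and whose remainder is binary would be classified as text.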
Benchmarking shows a 2x speedup on small files (1KiB) and a 6x speedup on medium files (3KiB). On gigantic files (like our 50MB projects.csv file), the difference is 5-6 orders of magnitude.