Description
IMHO the following is correctly flagged ⬇️
`find ... -exec wc -l {} \;` runs one `wc` process per file and scans every file type (`-name '*'`). On larger repos this can be very slow and risks hitting workflow timeouts. Prefer restricting to source extensions and using a single `wc` invocation (e.g., `-exec wc -l {} +` or `-print0 | xargs -0 wc -l`) while keeping existing path exclusions.
- "find . -type f \\( -name '*.go' -o -name '*.py' -o -name '*.ts' -o -name '*.js' -o -name '*.rb' -o -name '*.java' -o -name '*.rs' -o -name '*.cs' -o -name '*.cpp' -o -name '*.c' \\) -not -path '*/.git/*' -not -path '*/node_modules/*' -not -path '*/vendor/*' -not -path '*/dist/*' -not -path '*/build/*' -not -path '*/.next/*' -not -path '*/target/*' -not -path '*/__pycache__/*' -not -path '*/coverage/*' -not -path '*/venv/*' -not -path '*/.tox/*' -not -path '*/.mypy_cache/*' -print0 | xargs -0 wc -l 2>/dev/null"
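For comparison, here is a minimal sketch of the other batching form mentioned in the flagged comment, `-exec wc -l {} +`. The function name `count_source_lines` and the shortened extension and exclusion lists are assumptions for illustration only, not what the workflow actually ships.

```shell
# Hypothetical helper (name and shortened lists are illustrative only).
# `-exec ... {} +` passes many files to each wc call instead of
# spawning one wc process per file as `-exec ... {} \;` does.
count_source_lines() {
  dir="${1:-.}"
  find "$dir" -type f \
    \( -name '*.go' -o -name '*.py' -o -name '*.js' \) \
    -not -path '*/.git/*' -not -path '*/node_modules/*' \
    -exec wc -l {} + 2>/dev/null
}
```

Calling `count_source_lines .` prints one `lines path` row per matching file; when a batch contains more than one file, `wc` also appends a `total` row for that batch.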
What about doing something like `git ls-tree -r -t -l --full-name HEAD | grep \.c\\?jsx\\?$ | sort -rn -k 4 | head -n 10` for a near-instant calculation even on larger repos?
Example output
± git ls-tree -r -t -l --full-name HEAD | grep \.c\\?jsx\\?$ | sort -rn -k 4 | head -n 10
100644 blob 65d31932231ed13af4fc89e6d6a427f1355a5159 61617 apps/api/src/resources/payment/payment.service.js
100644 blob dffccf9f746756dbd0126cb6320a94ea660c87b6 51076 apps/testers-portal-api/src/resources/test/get/validation-helpers.js
100644 blob eb680232f5ba857d69d3edbdf525aaac80ed7719 50382 apps/web/src/client/pages/test-results/survey/survey.jsx
100644 blob b99b9e9e942cf1d3691e57bbcb4b22e4385e79c1 48097 apps/admin/src/client/components/tests-table/tests-table.jsx
100644 blob 95ed5702d9652b6e5b2a0a3b168d8af9c72f6e97 47869 apps/api/src/resources/test/test.service.js
100644 blob 8c750d70131c032c6f54f6eb3a1d8a15c4891374 47060 apps/web/src/client/pages/test/steps/payment/payment.jsx
100644 blob 4a445db81b6c76916039e16945acf9338e97143b 43841 packages/lib/client/components/logo/logo-type.jsx
100644 blob 74fca97dcab4310b3bf9681792d14451a2af35b7 42685 apps/api/src/helpers/create-survey-results-csv.helper.spec.js
100644 blob 09a19ca908cbcaa442e31adc7c191814b44ce4c5 41439 apps/web/src/client/components/common-payment/common-payment.jsx
100644 blob 867a95cf3fb3423e90f4b6535d0a21145c46e43f 41114 apps/web/src/client/pages/test-results/standard-test/standard-test.jsx
The specifics of the grep pattern still need to be figured out, but this also avoids manually ignore-listing a ton of potentially unrelated files in a general-purpose solution such as the one we're shipping here.
Source: https://stackoverflow.com/questions/9456550/how-can-i-find-the-n-largest-files-in-a-git-repository
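As one possible way to pin down the grep part, here is a rough sketch that wraps the pipeline in a function with a quoted extended regex. The function name `largest_tracked_js` is hypothetical, and the pattern only covers `.js`, `.jsx`, `.cjs` and `.cjsx`; other extensions would need their own alternation. Note that the fourth column of `git ls-tree -l` is the blob size in bytes, so this ranks files by size rather than by line count.

```shell
# Hypothetical helper: list the N largest tracked JS-family files at HEAD.
# Only files tracked by git are listed, so untracked or ignored directories
# such as node_modules never show up and need no explicit exclusion.
largest_tracked_js() {
  n="${1:-10}"
  git ls-tree -r -l --full-name HEAD \
    | grep -E '\.c?jsx?$' \
    | sort -rn -k 4 \
    | head -n "$n"
}
```

Run from inside a repository, `largest_tracked_js 10` should produce rows in the same shape as the example output above. Quoting the pattern keeps the shell from touching the backslashes and `?` characters before grep sees them.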