# Adding a new script #175

**Status:** Merged
### Commits (22)
- `4e06283` Add new toolbox script and readme
- `0b30553` Renamed get-busiest-collections-fixed.js to get-busiest-collections.j…
- `d54e0ca` Moved get-busiest-collections.js to getBusiestCollection directory
- `23429f6` Remove busiest collections data and its associated README file from t…
- `6ff7010` Updated README.md in getBusiestCollection directory
- `b76cffb` Remove busiest_collections.json file and add a new line in get-busies…
- `18eacc1` Update README.md to clarify event types in get-busiest-collections.js
- `8ea35c6` Update README.md for better clarity on usage.
- `081b700` Delete the temporary output file
- `69047b4` Include all fields in the output even if there are no writes of that …
- `dd9f019` Merge branch 'mongodb:master' into scripts
- `7e0cfd3` (riteshsaigal) Modified README.md to include information about the getBusiestCollect…
- `fe8d21b` Add link to getBusiestCollection script in README.md
- `38bc770` Modified README.md to include information about the getBusiestCollect…
- `e72ef00` Add Get busiest collection script
- `fbd6089` Update migration/toolbox/getBusiestCollection/get-busiest-collections.js
- `b50a1da` (riteshsaigal) Update migration/toolbox/getBusiestCollection/get-busiest-collections.js
- `68ee09e` (riteshsaigal) Update migration/toolbox/getBusiestCollection/get-busiest-collections.js
- `c0cc279` (riteshsaigal) Update migration/toolbox/getBusiestCollection/README.md
- `b3997d1` (riteshsaigal) Update migration/toolbox/getBusiestCollection/README.md
- `3de46be` (riteshsaigal) Merge branch 'scripts' of https://github.com/riteshsaigal/RS-MF-Migra…
- `d57564f` Updated README.md to reflect the correct usage of the script.
**File:** `migration/toolbox/getBusiestCollection/README.md`

## Get Busiest Collections

**Script:** `get-busiest-collections.js`

Gets the busiest collections in terms of writes (delete/insert/replace/update) as recorded in the mongosync logs in the CEA phase.

### Usage

```bash
node get-busiest-collections.js </path/to/mongosynclog/files-or-directory> [--markdown] [--no-console]
```

The script expects an existing file or directory path. It does not expand wildcard/glob patterns (such as `*.log`) itself, so any wildcards must be expanded by your shell before invoking the script (for example, by relying on unquoted shell globs). On platforms or shells where patterns are not expanded automatically, pass explicit paths.
### Example Output

```
Namespace                        |   Total Write Ops |     delete |     insert |    replace |     update
-------------------------------- | ----------------- | ---------- | ---------- | ---------- | ----------
db0.test2                        |            29,847 |      5,503 |      9,419 |      1,234 |     13,691
db0.test5                        |             7,289 |      2,456 |      2,438 |        150 |      2,245
db0.test1                        |             7,253 |      2,476 |      2,450 |        100 |      2,227
db0.test4                        |             7,176 |      2,414 |      2,386 |        120 |      2,256
db0.test3                        |             7,076 |      2,352 |      2,360 |        130 |      2,234
...
...

Data successfully exported to "busiest_collections.json". You can open it for offline analysis.
```
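The exported `busiest_collections.json` holds the same rows as the table, as a JSON array sorted by `totalEvents` in descending order (the shape follows the script's `summarizedData`; the sample entries below are illustrative, copied from the example table):

```javascript
// Shape mirrors the summarizedData array the script writes; these two sample
// entries are illustrative (taken from the example table above).
const exported = [
  { namespace: 'db0.test2', totalEvents: 29847, totalEventsPerType: { delete: 5503, insert: 9419, replace: 1234, update: 13691 } },
  { namespace: 'db0.test5', totalEvents: 7289, totalEventsPerType: { delete: 2456, insert: 2438, replace: 150, update: 2245 } },
];
// In practice you would load the file instead:
// const exported = JSON.parse(require('fs').readFileSync('busiest_collections.json', 'utf8'));

const [top] = exported; // the array is already sorted by totalEvents, descending
console.log(`${top.namespace}: ${top.totalEvents} total write ops`);
```

Because the array is pre-sorted, the first element is always the busiest namespace.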
### License

[Apache 2.0](http://www.apache.org/licenses/LICENSE-2.0)

DISCLAIMER
----------
Please note: all tools/scripts in this repo are released for use "AS IS" **without any warranties of any kind**, including, but not limited to, their installation, use, or performance. We disclaim any and all warranties, either express or implied, including but not limited to any warranty of noninfringement, merchantability, and/or fitness for a particular purpose. We do not warrant that the technology will meet your requirements, that the operation thereof will be uninterrupted or error-free, or that any errors will be corrected.

Any use of these scripts and tools is **at your own risk**. There is no guarantee that they have been through thorough testing in a comparable environment, and we are not responsible for any damage or data loss incurred with their use.

You are responsible for reviewing and testing any scripts you run *thoroughly* before use in any non-testing environment.

Thanks,
The MongoDB Support Team
**File:** `migration/toolbox/getBusiestCollection/get-busiest-collections.js` (236 additions, 0 deletions)
```javascript
const fs = require('fs');
const path = require('path');
const readline = require('readline');

// Get the command-line arguments
const args = process.argv.slice(2); // Skip the first two arguments: `node` and script name

const outputAsMarkdown = args.includes('--markdown'); // Check for the optional --markdown flag
const suppressConsole = args.includes('--no-console'); // Check for the optional --no-console flag

// Separate flags from input paths
const inputPaths = args.filter((a) => !a.startsWith('--'));

if (inputPaths.length < 1) {
  console.error('\x1b[31m%s\x1b[0m', 'Error: Please provide the path to a mongosync log file, a directory, or a wildcard pattern.');
  console.error(`Usage: node ${path.basename(process.argv[1])} <path/to/log/files-or-directory> [--markdown] [--no-console]`);
  console.error('       Accepts a single file, multiple files (e.g. via wildcard), or a directory.');
  console.error('       If a directory is provided, all files matching mongosync*.log will be processed.');
  process.exit(1); // Exit with error code 1
}

// Resolve the list of log files to process
let filePaths = [];
for (const inputPath of inputPaths) {
  if (!fs.existsSync(inputPath)) {
    console.error('\x1b[31m%s\x1b[0m', `Error: "${inputPath}" does not exist.`);
    process.exit(1);
  }
  const stat = fs.statSync(inputPath);
  if (stat.isDirectory()) {
    const dirFiles = fs.readdirSync(inputPath)
      .filter((f) => f.startsWith('mongosync') && f.endsWith('.log'))
      .sort()
      .map((f) => path.join(inputPath, f));
    if (dirFiles.length === 0) {
      console.error('\x1b[31m%s\x1b[0m', `Error: No mongosync*.log files found in directory "${inputPath}".`);
      process.exit(1);
    }
    filePaths.push(...dirFiles);
  } else {
    filePaths.push(inputPath);
  }
}

// Sort all resolved files and remove duplicates
filePaths = [...new Set(filePaths)].sort();

// Function to process one or more JSON Lines log files
async function processFiles(filePaths) {
  const namespaceSummary = {};
  const eventTypesSet = new Set(['delete', 'insert', 'replace', 'update']);
  const MAX_LINES_PER_FILE = 5000000; // Optional safeguard for very large files
  for (const filePath of filePaths) {
    if (!suppressConsole) {
      console.log(`Processing file: ${filePath}`);
    }

    // Set up a readable stream
    const fileStream = fs.createReadStream(filePath);
    const rl = readline.createInterface({
      input: fileStream,
      crlfDelay: Infinity,
    });

    fileStream.on('error', (err) => {
      const message = `Error reading file "${filePath}": ${err.message}`;
      if (!suppressConsole) {
        console.error('\x1b[31m%s\x1b[0m', message);
      }
      rl.close();
    });
    let lineCounter = 0; // Count lines processed per file

    // Read each line of the file
    for await (const line of rl) {
      lineCounter++;
      if (lineCounter > MAX_LINES_PER_FILE) {
        if (!suppressConsole) {
          console.warn('\x1b[33m%s\x1b[0m', `Warning: Processing of "${filePath}" stopped after ${MAX_LINES_PER_FILE} lines for safety.`);
        }
        rl.close();
        fileStream.destroy();
        break;
      }

      try {
        // Parse the line into a JSON object
        const jsonObj = JSON.parse(line);

        // Check if the message field matches
        if (jsonObj.message === 'Recent CRUD change event statistics.') {
          const busiestCollections = jsonObj.recentCRUDStatistics?.busiestCollections || [];

          // Loop through busiestCollections to summarize namespaces
          busiestCollections.forEach((collection) => {
            const namespace = collection.namespace;
            const totalEvents = collection.totalEvents;
            const totalEventsPerType = collection.totalEventsPerType || {};

            if (namespace) {
              // Initialize namespace entry if not already present
              if (!namespaceSummary[namespace]) {
                namespaceSummary[namespace] = { totalEvents: 0, totalEventsPerType: {} };
              }

              // Accumulate totalEvents
              namespaceSummary[namespace].totalEvents += totalEvents;

              // Accumulate totalEventsPerType for each type
              for (const [type, count] of Object.entries(totalEventsPerType)) {
                if (!namespaceSummary[namespace].totalEventsPerType[type]) {
                  namespaceSummary[namespace].totalEventsPerType[type] = 0;
                }
                namespaceSummary[namespace].totalEventsPerType[type] += count;

                // Add type to the set for later use in header creation
                eventTypesSet.add(type);
              }
            }
          });
        }
      } catch (error) {
        if (!suppressConsole) {
          const previewLength = 80;
          const linePreview =
            typeof line === 'string'
              ? line.slice(0, previewLength) + (line.length > previewLength ? '...' : '')
              : '';
          console.error(
            `Error processing line ${lineCounter} in file "${filePath}".` +
              (linePreview ? ` Line preview: ${linePreview}` : '')
          );
          console.error('Error details:', error.message);
        }
      }
    }
  }

  // Convert the summary object into an array for sorting
  const summarizedData = Object.entries(namespaceSummary)
    .map(([namespace, { totalEvents, totalEventsPerType }]) => ({
      namespace,
      totalEvents,
      totalEventsPerType,
    }))
    .sort((a, b) => b.totalEvents - a.totalEvents); // Sort by totalEvents in descending order

  const eventTypes = Array.from(eventTypesSet).sort(); // Sort event types alphabetically for consistent output

  // Dynamically calculate column widths
  const columnWidthNamespace = Math.max(30, ...summarizedData.map(({ namespace }) => namespace.length)) + 2;
  const columnWidthEvents = Math.max(15, ...summarizedData.map(({ totalEvents }) => totalEvents.toLocaleString().length)) + 2;
  const columnWidthsByType = eventTypes.map(
    (type) =>
      Math.max(type.length + 2, ...summarizedData.map(({ totalEventsPerType }) => (totalEventsPerType[type] || 0).toLocaleString().length)) + 2
  );

  // Output the sorted result in a readable column format
  if (!suppressConsole) {
    console.log('\n# Namespace Statistics\n');
    console.log('# Sorted by descending total number of write operations\n');

    // Format and print the header
    const headerNamespace = 'Namespace'.padEnd(columnWidthNamespace);
    const headerEvents = 'Total Write Ops'.padStart(columnWidthEvents);
    const headerTypes = eventTypes.map((type, i) => type.padStart(columnWidthsByType[i])).join(' | ');

    console.log(`${headerNamespace} | ${headerEvents} | ${headerTypes}`);

    // Print the separator row with matching padding
    const separator = `${'-'.repeat(columnWidthNamespace)} | ${'-'.repeat(columnWidthEvents)} | ${columnWidthsByType.map((width) => '-'.repeat(width)).join(' | ')}`;
    console.log(separator);

    // Format and print each row of data
    summarizedData.forEach(({ namespace, totalEvents, totalEventsPerType }) => {
      const typeCounts = eventTypes
        .map((type, i) => (totalEventsPerType[type] || 0).toLocaleString().padStart(columnWidthsByType[i]))
        .join(' | ');

      console.log(`${namespace.padEnd(columnWidthNamespace)} | ${totalEvents.toLocaleString().padStart(columnWidthEvents)} | ${typeCounts}`);
    });
  }

  // Export the data to JSON
  const outputPath = 'busiest_collections.json';
  try {
    fs.writeFileSync(outputPath, JSON.stringify(summarizedData, null, 2));
    if (!suppressConsole) {
      console.log('\x1b[34m%s\x1b[0m', `\nData successfully exported to "${outputPath}". You can open it for offline analysis.\n`);
    }
  } catch (error) {
    console.error('\x1b[31m%s\x1b[0m', `Error writing to "${outputPath}": ${error.message}`);
  }

  // If the --markdown flag is provided, generate Markdown output
  if (outputAsMarkdown) {
    const markdownOutput = generateMarkdown(summarizedData, eventTypes, columnWidthNamespace, columnWidthEvents, columnWidthsByType);
    const markdownPath = 'busiest_collections.md';
    try {
      fs.writeFileSync(markdownPath, markdownOutput);
      if (!suppressConsole) {
        console.log('\x1b[34m%s\x1b[0m', `\nMarkdown results successfully exported to "${markdownPath}".\n`);
      }
    } catch (error) {
      console.error('\x1b[31m%s\x1b[0m', `Error writing to "${markdownPath}": ${error.message}`);
    }
  }
}

// Function to generate Markdown output
function generateMarkdown(data, eventTypes, columnWidthNamespace, columnWidthEvents, columnWidthsByType) {
  let markdown = `# Namespace Statistics\n\n`;
  markdown += `# Sorted by descending total number of write operations\n\n`;

  // Use dynamic column widths for consistent alignment
  const headerNamespace = 'Namespace'.padEnd(columnWidthNamespace);
  const headerEvents = 'Total Write Ops'.padStart(columnWidthEvents);
  const headerTypes = eventTypes.map((type, i) => type.padStart(columnWidthsByType[i])).join(' | ');

  markdown += `| ${headerNamespace} | ${headerEvents} | ${headerTypes} |\n`;
  markdown += `|${'-'.repeat(columnWidthNamespace + 2)}|${'-'.repeat(columnWidthEvents + 2)}|${columnWidthsByType.map((width) => '-'.repeat(width + 2)).join('|')}|\n`;

  data.forEach(({ namespace, totalEvents, totalEventsPerType }) => {
    const typeCounts = eventTypes.map((type, i) => (totalEventsPerType[type] || 0).toLocaleString().padStart(columnWidthsByType[i])).join(' | ');
    markdown += `| ${namespace.padEnd(columnWidthNamespace)} | ${totalEvents.toLocaleString().padStart(columnWidthEvents)} | ${typeCounts} |\n`;
  });

  return markdown;
}

// Run the function with proper error handling
processFiles(filePaths).catch((error) => {
  console.error('\x1b[31m%s\x1b[0m', `Error: ${error.message}`);
  process.exit(1);
});
```
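The heart of the script is the per-namespace accumulation inside `processFiles`. As a quick sanity check, that logic can be distilled into a small pure helper (a sketch, not part of the script; the sample events are made up):

```javascript
// Minimal sketch of the accumulation performed in processFiles.
// `summary` maps namespace -> { totalEvents, totalEventsPerType }.
function accumulate(summary, event) {
  const { namespace, totalEvents, totalEventsPerType = {} } = event;
  if (!namespace) return summary;
  if (!summary[namespace]) {
    summary[namespace] = { totalEvents: 0, totalEventsPerType: {} };
  }
  summary[namespace].totalEvents += totalEvents;
  for (const [type, count] of Object.entries(totalEventsPerType)) {
    summary[namespace].totalEventsPerType[type] =
      (summary[namespace].totalEventsPerType[type] || 0) + count;
  }
  return summary;
}

// Two statistics entries for the same namespace merge into one row.
const summary = {};
accumulate(summary, { namespace: 'db0.test1', totalEvents: 10, totalEventsPerType: { insert: 6, update: 4 } });
accumulate(summary, { namespace: 'db0.test1', totalEvents: 5, totalEventsPerType: { insert: 2, delete: 3 } });
console.log(summary['db0.test1'].totalEvents); // 15
```

Because counts are merged per event type, the final table's per-type columns always sum to the `Total Write Ops` column for each namespace.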