Skip to content

feat: replace Flask backend with Node.js/Express#269

Open
TusharB-07 wants to merge 1 commit into
openmainframeproject:masterfrom
TusharB-07:feat/replace-flask-with-node
Open

feat: replace Flask backend with Node.js/Express#269
TusharB-07 wants to merge 1 commit into
openmainframeproject:masterfrom
TusharB-07:feat/replace-flask-with-node

Conversation

@TusharB-07
Copy link
Copy Markdown
Collaborator

Summary

Replaces the Flask backend with a lightweight Node.js/Express server.
Since React took over the frontend (#248), Flask was only used as a
thin MySQL interface — this replaces it with a simpler Node.js solution.

Changes

  • Added backend/ directory with Express + mysql2
  • Removed node_modules from tracking, added .gitignore entries
  • Added .env.example for environment variables

How to Run

  1. cd backend
  2. cp .env.example .env and fill in DB credentials
  3. npm install
  4. node index.js

Breaking Changes

None — API response shapes are identical to Flask.

Closes #268

Copilot AI review requested due to automatic review settings April 19, 2026 19:07
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Replaces the previous Flask-based backend with a Node.js/Express API while introducing supporting configuration/docs and automation around submodule syncing.

Changes:

  • Added an Express server (backend/) implementing the existing API endpoints using mysql2/promise.
  • Added containerization and environment variable examples for the new backend.
  • Added repo/docs updates (README, third-party notice, submodule + workflow).

Reviewed changes

Copilot reviewed 9 out of 1349 changed files in this pull request and generated 12 comments.

Show a summary per file
File Description
backend/index.js Implements Express API endpoints and MySQL querying logic replacing Flask routes
backend/config.js Defines supported distro/table mapping and paging constant used by the backend
backend/Dockerfile Adds a container build for running the new Node backend
backend/.env.example Example env vars for backend DB + port configuration
.env.example Example env vars at repo root (overlaps with backend example)
.gitmodules Adds distro_data submodule reference
.github/workflows/updatesubmodules.yml Adds a workflow to sync submodule changes via SSH
README.md Updates project introduction and repository tour
THIRDPARTY.txt Adds third-party license notice text

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread backend/index.js
Comment on lines +22 to +33
const DISTRO_BIT_MAP = {};
let bitFlag = 1;
for (const distroName of Object.keys(SUPPORTED_DISTROS)) {
const versions = Object.keys(SUPPORTED_DISTROS[distroName]).sort();
for (const version of versions) {
if (!DISTRO_BIT_MAP[distroName]) {
DISTRO_BIT_MAP[distroName] = {};
}
DISTRO_BIT_MAP[distroName][version] = bitFlag;
bitFlag *= 2;
}
}
Copy link

Copilot AI Apr 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bitFlag is a Number that will exceed Number.MAX_SAFE_INTEGER once enough versions are added, causing silent precision loss. Since you later compare via BigInt(...), build and store the bitflags as BigInt from the start (e.g., initialize bitFlag as 1n and multiply by 2n) so DISTRO_BIT_MAP values remain exact.

Copilot uses AI. Check for mistakes.
Comment thread backend/index.js Outdated
Comment on lines +82 to +115

let allRows = [];
for (const table of tables) {
let query;
let params;
if (exactMatch) {
query = `SELECT packageName, description, version, osName FROM ${table} WHERE packageName = ?`;
params = [searchTerm];
} else {
query = `SELECT packageName, description, version, osName FROM ${table} WHERE packageName REGEXP ?`;
params = [searchTerm];
}
const [rows] = await pool.query(query, params);
allRows = allRows.concat(rows);
}

const totalLength = allRows.length;
let results = [];
let lastPage = Math.ceil(totalLength / MAX_RECORDS_TO_SEND);

if (totalLength <= MAX_RECORDS_TO_SEND) {
results = allRows;
} else {
const startIdx = pageNumber * MAX_RECORDS_TO_SEND;
let endIdx;
if (pageNumber === 0) {
endIdx = MAX_RECORDS_TO_SEND;
lastPage = 1; // Replicating the weird Python logic where last_page is 1 if page_number is 0?
// Actually the Python code says: last_page = 1 #math.ceil(totalLength/MAX_RECORDS_TO_SEND)
} else {
endIdx = totalLength;
lastPage = 1;
}
results = allRows.slice(startIdx, endIdx);
Copy link

Copilot AI Apr 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pulls all matching rows from all tables into memory and only paginates afterward, which can become a major DB+memory bottleneck. Consider pushing pagination down to SQL (e.g., UNION ALL across tables with LIMIT/OFFSET, or apply LIMIT per table plus a merge strategy) so the server never loads an unbounded result set.

Suggested change
let allRows = [];
for (const table of tables) {
let query;
let params;
if (exactMatch) {
query = `SELECT packageName, description, version, osName FROM ${table} WHERE packageName = ?`;
params = [searchTerm];
} else {
query = `SELECT packageName, description, version, osName FROM ${table} WHERE packageName REGEXP ?`;
params = [searchTerm];
}
const [rows] = await pool.query(query, params);
allRows = allRows.concat(rows);
}
const totalLength = allRows.length;
let results = [];
let lastPage = Math.ceil(totalLength / MAX_RECORDS_TO_SEND);
if (totalLength <= MAX_RECORDS_TO_SEND) {
results = allRows;
} else {
const startIdx = pageNumber * MAX_RECORDS_TO_SEND;
let endIdx;
if (pageNumber === 0) {
endIdx = MAX_RECORDS_TO_SEND;
lastPage = 1; // Replicating the weird Python logic where last_page is 1 if page_number is 0?
// Actually the Python code says: last_page = 1 #math.ceil(totalLength/MAX_RECORDS_TO_SEND)
} else {
endIdx = totalLength;
lastPage = 1;
}
results = allRows.slice(startIdx, endIdx);
const whereClause = exactMatch ? 'packageName = ?' : 'packageName REGEXP ?';
const params = [searchTerm];
const tableCounts = [];
let totalLength = 0;
for (const table of tables) {
const countQuery = `SELECT COUNT(*) AS totalCount FROM ${table} WHERE ${whereClause}`;
const [countRows] = await pool.query(countQuery, params);
const tableCount = countRows[0]?.totalCount || 0;
tableCounts.push({ table, count: tableCount });
totalLength += tableCount;
}
let results = [];
let lastPage = Math.ceil(totalLength / MAX_RECORDS_TO_SEND);
if (totalLength > 0) {
let startIdx = 0;
let rowsToFetch = totalLength;
if (totalLength > MAX_RECORDS_TO_SEND) {
startIdx = pageNumber * MAX_RECORDS_TO_SEND;
if (pageNumber === 0) {
rowsToFetch = MAX_RECORDS_TO_SEND;
lastPage = 1; // Replicating the weird Python logic where last_page is 1 if page_number is 0?
// Actually the Python code says: last_page = 1 #math.ceil(totalLength/MAX_RECORDS_TO_SEND)
} else {
rowsToFetch = Math.max(totalLength - startIdx, 0);
lastPage = 1;
}
}
if (rowsToFetch > 0 && startIdx < totalLength) {
let remainingOffset = startIdx;
let remainingToFetch = rowsToFetch;
for (const { table, count } of tableCounts) {
if (remainingToFetch <= 0) {
break;
}
if (count === 0) {
continue;
}
if (remainingOffset >= count) {
remainingOffset -= count;
continue;
}
const tableOffset = remainingOffset;
const tableLimit = Math.min(count - tableOffset, remainingToFetch);
const dataQuery = `SELECT packageName, description, version, osName FROM ${table} WHERE ${whereClause} LIMIT ? OFFSET ?`;
const [rows] = await pool.query(dataQuery, [...params, tableLimit, tableOffset]);
results = results.concat(rows);
remainingToFetch -= rows.length;
remainingOffset = 0;
}
}

Copilot uses AI. Check for mistakes.
Comment thread backend/index.js Outdated
Comment on lines +98 to +116
const totalLength = allRows.length;
let results = [];
let lastPage = Math.ceil(totalLength / MAX_RECORDS_TO_SEND);

if (totalLength <= MAX_RECORDS_TO_SEND) {
results = allRows;
} else {
const startIdx = pageNumber * MAX_RECORDS_TO_SEND;
let endIdx;
if (pageNumber === 0) {
endIdx = MAX_RECORDS_TO_SEND;
lastPage = 1; // Replicating the weird Python logic where last_page is 1 if page_number is 0?
// Actually the Python code says: last_page = 1 #math.ceil(totalLength/MAX_RECORDS_TO_SEND)
} else {
endIdx = totalLength;
lastPage = 1;
}
results = allRows.slice(startIdx, endIdx);
}
Copy link

Copilot AI Apr 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pagination metadata is internally inconsistent: lastPage is always forced to 1 when totalLength > MAX_RECORDS_TO_SEND, and for pageNumber > 0 you set endIdx = totalLength, which returns all remaining rows rather than one page. This also makes more_available (computed later) misleading. Recommendation: compute endIdx = startIdx + MAX_RECORDS_TO_SEND, calculate last_page consistently from totalLength, and derive more_available from whether endIdx < totalLength.

Copilot uses AI. Check for mistakes.
Comment thread backend/index.js
Comment on lines +121 to +127
res.json({
total_packages: totalLength,
current_page: pageNumber,
last_page: lastPage,
more_available: totalLength !== results.length,
packages: results
});
Copy link

Copilot AI Apr 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more_available: totalLength !== results.length does not reflect whether there are additional pages; it compares the total count to the size of the current response, which will be true on any non-zero page even when you’re already at the end. Compute more_available based on paging indices (e.g., startIdx + results.length < totalLength) so clients can rely on it.

Copilot uses AI. Check for mistakes.
Comment thread backend/index.js Outdated
});
} catch (err) {
console.error(err);
res.status(500).json({ error: err.message });
Copy link

Copilot AI Apr 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Returning err.message can leak internal details (e.g., SQL errors, schema/table names, host info) to clients. Prefer returning a generic error message (and optionally a request correlation id), while keeping the detailed error logged server-side.

Suggested change
res.status(500).json({ error: err.message });
res.status(500).json({ error: 'Internal server error' });

Copilot uses AI. Check for mistakes.
Comment thread .github/workflows/updatesubmodules.yml Outdated
runs-on: ubuntu-latest
steps:
- name: ssh and git module update
uses: appleboy/ssh-action@master
Copy link

Copilot AI Apr 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using an action from a floating ref (@master) is a supply-chain risk because the workflow can change without review. Pin this to a specific release tag or (preferably) a commit SHA to make builds auditable and reduce the risk of unexpected upstream changes.

Suggested change
uses: appleboy/ssh-action@master
uses: appleboy/ssh-action@v1.0.3

Copilot uses AI. Check for mistakes.
Comment thread README.md Outdated
Comment on lines +13 to +17
| Repository | Description |
| -------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| [Software Discovery Tool](https://github.com/openmainframeproject/software-discovery-tool) | The main repository of the Software Discovery Tool contains the core functionalities and codebase of the tool. It enables you to search and discover open source software for `zArchitecture/s390x`. You'll find the source code, documentation, and guidelines for contributing to the project in this repository. |
| [Software Discovery Tool Data](https://github.com/openmainframeproject/software-discovery-tool-data) | The Software Discovery Tool Data repository holds the necessary data for the software available in the tool. It contains information such as the names and versions of the software that you can discover using the Software Discovery Tool. This data is regularly updated to provide you with the most comprehensive and up-to-date software inventory. |
| [Software Discovery Tool Deploy](https://github.com/openmainframeproject/software-discovery-tool-deploy) | The Software Discovery Tool Deploy repository serves as the deployed version of the main Software Discovery Tool repository. It utilizes the main repository as a submodule and integrates the data repository as a sub-submodule. This deployment repository enables you to quickly set up and run the Software Discovery Tool in your environment. Read more about it [here](https://gist.github.com/rachejazz/de39c09612788635d5d0f491dcf8571a). |
Copy link

Copilot AI Apr 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The markdown table rows start with ||, which creates an extra empty column and renders incorrectly in most markdown viewers. Use a single leading | per row (and ensure consistent trailing pipes) so the table displays as intended.

Copilot uses AI. Check for mistakes.
Comment thread .env.example Outdated
Comment on lines +1 to +5
# Add user credentials to .env file
DB_HOST=localhost
DB_USER=sdtreaduser
DB_PASSWORD=UPDATEME
DB_NAME=sdtDB
Copy link

Copilot AI Apr 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are now two .env.example files (repo root and backend/.env.example) with different values/keys (e.g., backend includes PORT). This is likely to confuse setup and deployment. Recommend consolidating to a single source of truth (preferably under backend/) or clearly documenting which file is used in which context and keeping the variable sets aligned.

Copilot uses AI. Check for mistakes.
Comment thread THIRDPARTY.txt
@@ -0,0 +1,11 @@
The Angular, Bootstrap and JQuery dependencies are received and made available under the MIT license, as follows:
Copy link

Copilot AI Apr 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Corrected spelling/capitalization of 'JQuery' to 'jQuery' (the official project capitalization).

Copilot uses AI. Check for mistakes.
Comment thread THIRDPARTY.txt

Angular (Copyright 2010-2017 Google, Inc.)
Bootstrap (Copyright 2012-2017 the AngularUI Team)
JQuery (Copyright JS Foundation and other contributors)
Copy link

Copilot AI Apr 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Corrected spelling/capitalization of 'JQuery' to 'jQuery' (the official project capitalization).

Copilot uses AI. Check for mistakes.
@TusharB-07 TusharB-07 force-pushed the feat/replace-flask-with-node branch 2 times, most recently from c89ba4f to 67b9b11 Compare April 20, 2026 04:24
@TusharB-07
Copy link
Copy Markdown
Collaborator Author

@pleia2 Hi I've submitted this PR to replace Flask with Node.js/Express as discussed in #268. Happy to make any changes needed.

Please let me know if you'd like me to address the Copilot review suggestions as well.
Thanks for your time.

@pleia2
Copy link
Copy Markdown
Contributor

pleia2 commented Apr 20, 2026

@vmuralictr Can you take a look at this PR from @TusharB-07 to see if it's aligned with what you presented at our last meeting?

muraliveesambattu

This comment was marked as duplicate.

Copy link
Copy Markdown
Collaborator

@vmuralictr vmuralictr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @TusharB-07 thanks for the contribution! I tested this locally and the backend starts cleanly, distro bitmaps load correctly, and curl-based searches work. Good foundation.

A few things need to be fixed before merging:

The API response shape is not identical to Flask. Flask returned packages as arrays but this returns objects, so search results render empty in the UI because SearchBar.jsx uses array indexing like pkg[0], pkg[1]. Either match Flask's format or update SearchBar.jsx to use object keys.

The database_build.py script hardcodes /opt/software-discovery-tool so local setup doesn't work out of the box. Please add local dev setup instructions to the README.

bitFlag is a JS Number which will silently overflow as more distros are added. Since BigInt is already used in getTables(), initialize bitFlag as 1n and multiply by 2n throughout.

All matching rows are loaded into memory before paginating. With Debian having 60,000+ rows, a broad search can be very heavy. SQL-level LIMIT/OFFSET would be better.

The two .env.example files at repo root and backend/ have different keys which is confusing. Please consolidate to one.

Items 1 to 3 are the blockers. Happy to discuss — thanks again!

@TusharB-07 TusharB-07 force-pushed the feat/replace-flask-with-node branch from 036d26b to 54f1ac7 Compare April 23, 2026 17:52
@TusharB-07
Copy link
Copy Markdown
Collaborator Author

Hi @vmuralictr, thanks for the detailed review. All three blockers are resolved. Please let me know if you’d like any further changes.

@TusharB-07 TusharB-07 requested a review from vmuralictr April 23, 2026 18:08
Copy link
Copy Markdown
Collaborator

@vmuralictr vmuralictr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please Find the following Points

  1. React front end is Pointing to http://localhost:3000/
  2. There is "undefined" string in the endpoint "http://localhost:3000/undefined/getSupportedDistros"
  3. Can you check all the other End points and Add the swagger documentation for the same ?

@pleia2
Copy link
Copy Markdown
Contributor

pleia2 commented May 6, 2026

@TusharB-07 Do you have some time to take a look at the comments from @vmuralictr above? We're working on a strategy to Dockerize bringing up both the front and backends, so we're hoping to have this component finished up soon 😄

@TusharB-07 TusharB-07 requested a review from vmuralictr May 6, 2026 20:57
@TusharB-07
Copy link
Copy Markdown
Collaborator Author

Hi @pleia2 — apologies for the delay; I’ve been busy preparing for my end-semester exams next week. I've now pushed a new commit addressing all of @vmuralictr's review comments and requested a re-review.

Here’s a quick summary of the fixes:

  • Fixed frontend API URL handling (localhost:3000 / undefined no longer appear in requests)
  • Added Swagger docs for all 4 endpoints at /api-docs
  • Removed err.message leakage from client responses
  • Pinned appleboy/ssh-action to @v1.0.3
  • Consolidated .env.example files and updated README setup instructions
  • Fixed README table formatting
  • Implemented SQL-level LIMIT/OFFSET pagination to avoid loading all rows into memory

Also, I’d be happy to contribute to the Dockerization effort once this is merged — feel free to assign me.

@vmuralictr
Copy link
Copy Markdown
Collaborator

hi @TusharB-07 ,
Can you fix the following dependencies & Warnings ?
Can you able to run the application in your local . If it is , can you please attach few screenshots?

Screenshot 2026-05-13 at 12 57 24 PM

@pleia2
Copy link
Copy Markdown
Contributor

pleia2 commented May 18, 2026

@vmuralictr Are these errors in the front-end, rather than the back-end that is being replaced here? This is a big PR and I just want to make sure we're being very specific about what's being fixed in each change, so we don't make it bigger than it has to be 😄

But confirming that the whole thing works is essential to merging this change (warnings are OK if they're front-end only, as long as it works).

@TusharB-07 TusharB-07 force-pushed the feat/replace-flask-with-node branch 3 times, most recently from 1eec2eb to f993ab5 Compare May 19, 2026 19:42
@TusharB-07
Copy link
Copy Markdown
Collaborator Author

hi @vmuralictr, I've pushed the latest fixes:

  • Resolved all ESLint errors (BigInt env, useEffect deps, unused imports).
  • Frontend compiles cleanly with no errors.
    -Tested end-to-end locally with a full DB.
image

Copy link
Copy Markdown
Collaborator

@vmuralictr vmuralictr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving — a few things worth addressing in a follow-up:

config/supported_distros.py and backend/config.js are duplicates — consolidate to a single source of truth
Wildcard CORS (app.use(cors())) should be restricted before production deployment
database_build.py still requires Python — a JS equivalent would complete the migration
Please resolve the 6 merge conflicts with master before merging
Good work overall. Happy to re-review once conflicts are resolved.

@TusharB-07 TusharB-07 force-pushed the feat/replace-flask-with-node branch 4 times, most recently from 04a6205 to 73b03b8 Compare May 25, 2026 17:22
- replace Flask backend with Express + MySQL API implementation

- move distro mapping to a single JSON source of truth

- restrict CORS via ALLOWED_ORIGINS and add backend tests

- add Node.js database build script and update docs

- remove deprecated Flask source tree

Signed-off-by: TusharB-07 <tusharnbiswas07@gmail.com>
@TusharB-07 TusharB-07 force-pushed the feat/replace-flask-with-node branch from 73b03b8 to 98b6d00 Compare May 25, 2026 17:29
@TusharB-07 TusharB-07 requested a review from vmuralictr May 25, 2026 17:36
@TusharB-07
Copy link
Copy Markdown
Collaborator Author

hi @vmuralictr all the blocking items from your review have been addressed.

Copy link
Copy Markdown
Contributor

@pleia2 pleia2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @TusharB-07, thanks for hanging in there through this MASSIVE change!

I think we're close to the finish line.

Today I was testing this in a production style environment (real virtual machine, localhost can't be assumed) running Ubuntu 24.04. I've noted the issues I've run into inline.

And please restore bin/config_build.py to automate the process of creating config/distros.json (it'll be very broken because distros.json has a new name, but it's been broken for a long time and we'll just follow up in existing issue #260 later).

Comment thread package.json
@@ -0,0 +1,13 @@
{
"name": "software-discovery-tool",
"version": "1.0.0",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the package of node from Ubuntu 24.04, I had to add this line:

"type": "module",

Comment thread docs/Installation.md

```bash
sudo -u www-data cp src/config/supported_distros.py.example src/config/supported_distros.py
python3 ./bin/package_build.py RHEL_8_Package_List.json
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should add that after these .json files are pulled down, you need to update config/distros.json accordingly (or run bin/config_build.py)

Comment thread docs/Installation.md
sudo mysql_secure_installation
sudo apt install mariadb-server
sudo mariadb-secure-installation
```
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Replace Flask with slimmer backend

5 participants