The trusted, curated professional network for researchers. LinkedIn meets arXiv, Mendeley, and a trusted academic commons, built around credibility, discovery, and real research progress, not vanity metrics.
Documentation · API Reference · Roadmap · Contributing · Organization
Warning
PROJECT:RS is currently under active development and has not yet been officially named. The codebase is in pre-alpha / incubation stage. APIs, schema, and architecture are subject to change. Not recommended for production use.
| Field | Details |
|---|---|
| Project Codename | PROJECT:RS (official name TBD) |
| Incubating Organization | Singularity Student Lab |
| Lead Developer | @jayanthoffl |
| Development Stage | Pre-Alpha / Incubation |
| License | MIT |

"A place where a PhD student, professor, industry scientist, policy researcher, or independent scholar can build a verified identity, discover meaningful work, meet the right collaborators, and follow research conversations without the noise of generic social media."
PROJECT:RS addresses the fragmentation problem every researcher faces today. Your identity is on ORCID. Your papers are on ResearchGate. Your discussions are on Twitter. Your jobs are on LinkedIn. Your literature is in Mendeley.
We are building the layer that unifies all of this. One platform. Four core questions answered clearly:
| Question | How we solve it |
|---|---|
| Who is this person? | Verified researcher identity via ORCID, with institution-aware trust scores |
| What are they working on? | Living research portfolios: papers, datasets, current projects, open questions |
| What work deserves my attention? | A curated, AI-powered semantic feed, not an engagement-optimized timeline |
| Who should I collaborate with? | An active collaboration marketplace matched by expertise, methods, and goals |

    research-commons/
    ├── backend/                     # FastAPI: Knowledge Graph Engine
    │   ├── app/
    │   │   ├── main.py              # API routes: auth, feed, semantic search
    │   │   ├── db/
    │   │   │   └── database.py      # SQLAlchemy + PostgreSQL connection
    │   │   ├── models/
    │   │   │   └── models.py        # ORM: Users, Works, Authorships, Opportunities
    │   │   └── services/
    │   │       └── arxiv_bot.py     # ArXiv ingestion + PDF extraction + embedding
    │   └── schema.sql               # PostgreSQL schema with pgvector extension
    │
    ├── frontend/                    # Next.js 16: Research Interface
    │   └── app/
    │       ├── page.tsx             # Landing page & ORCID login
    │       ├── layout.tsx           # Root layout
    │       └── feed/
    │           └── page.tsx         # Curated discovery feed + semantic search UI
    │
    └── docker-compose.yml           # PostgreSQL + pgvector database service


              ┌─────────────────────────┐
              │  ORCID Identity Layer   │
              │ (OAuth2 / Verified ID)  │
              └────────────┬────────────┘
                           │
           ┌───────────────▼───────────────┐
           │       Next.js Frontend        │
           │             :3000             │
           │  • Landing & ORCID Login      │
           │  • Curated Discovery Feed     │
           │  • Semantic Search UI         │
           └───────────────┬───────────────┘
                           │ REST API
           ┌───────────────▼───────────────┐
           │        FastAPI Backend        │
           │             :8000             │
           │ • /auth/orcid → OAuth flow    │
           │ • /api/feed → Latest papers   │
           │ • /api/search → Vector search │
           └───────────────┬───────────────┘
                           │
    ┌──────────────────────▼──────────────────────┐
    │          PostgreSQL 16 + pgvector           │
    │                    :5432                    │
    │                                             │
    │   users         works         opportunities │
    │  ┌──────────┐  ┌────────────┐  ┌──────────┐ │
    │  │ orcid_id │  │ doi        │  │ type     │ │
    │  │ trust_   │  │ abstract   │  │ vector   │ │
    │  │ score    │  │ embedding  │  │ (384d)   │ │
    │  │ career_  │  │ vector(384)│  │          │ │
    │  │ stage    │  └────────────┘  └──────────┘ │
    │  └──────────┘                               │
    │  ▸ 384-dimensional semantic embeddings      │
    │  ▸ Cosine distance search (MiniLM-L6-v2)    │
    └─────────────────────────────────────────────┘
                           ▲
    ┌──────────────────────┴──────────────────────┐
    │             ArXiv Ingestion Bot             │
    │  1. Fetch latest cs.LG + quant-ph papers    │
    │  2. Download & parse full PDFs              │
    │  3. Encode abstracts → 384d vector          │
    │  4. Store in Knowledge Graph                │
    └─────────────────────────────────────────────┘

| Feature | Technology | Endpoint / Component |
|---|---|---|
| ORCID OAuth2 Login | FastAPI + ORCID | GET /auth/orcid |
| Auth Callback & User Creation | FastAPI + SQLAlchemy | GET /auth/callback |
| Verified User Profiles | PostgreSQL `users` table | Trust score, ORCID ID, career stage |
| ArXiv Paper Ingestion | `arxiv` + `pypdf` | `services/arxiv_bot.py` |
| PDF Full-Text Extraction | `pypdf` | Multi-page text extraction |
| Semantic Embeddings | `all-MiniLM-L6-v2` (384d) | Stored as `vector(384)` |
| Curated Feed API | FastAPI | GET /api/feed |
| Semantic Search API | pgvector cosine distance | GET /api/search?q=... |
| Discovery Feed UI | Next.js 16 | /feed page |
| Health Check | FastAPI | GET /health |
- Research Communities: topic groups, method circles, private lab spaces
- Collaboration Marketplace: vector-matched active collaboration requests
- Living Portfolio: narrative view of a researcher's evolving body of work
- Scholarly Discussion Layer: threaded comments, paper annotations, mini-reviews
- Career & Opportunity Hub: postdocs, grants, speaking invitations, fellowships
- Trust & Reputation Engine: earned from peer endorsements, reviewing, replication
- Cross-disciplinary Feed: adjacent-field recommendations via vector similarity
- Institutional Integration: verified university/lab pages with member lists
- JWT Session Auth: replace redirect-based flow with secure cookie sessions
- Real ORCID OAuth: swap mock flow for production ORCID credentials

    git clone https://github.com/Singularity-Student-Lab/research-commons.git
    cd research-commons

    # Spins up PostgreSQL 16 with the pgvector extension
    docker compose up -d

Verify:

    docker ps
    # Expected: commons_db Up 0.0.0.0:5432->5432/tcp

    cd backend

    # Create and activate virtual environment
    python -m venv venv
    .\venv\Scripts\activate        # Windows
    # source venv/bin/activate     # macOS / Linux

    # Install dependencies
    pip install fastapi uvicorn sqlalchemy psycopg2-binary pgvector \
        sentence-transformers arxiv pypdf

    # Start the API server
    uvicorn app.main:app --reload --port 8000

Note: On first run, the `all-MiniLM-L6-v2` model (~80 MB) downloads automatically. This only happens once.

The API is now live at http://localhost:8000
Auto-generated docs at http://localhost:8000/docs

    # In a new terminal (with venv activated)
    cd backend
    python -m app.services.arxiv_bot

This will:

- Fetch the 5 latest `cs.LG` and `quant-ph` papers from ArXiv
- Download and parse their full PDFs
- Generate 384-dimensional semantic embeddings for each abstract
- Store everything in the local PostgreSQL Knowledge Graph
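
The first step above can be sketched with the standard library alone by building a query URL for arXiv's public Atom API. This is a minimal illustration under assumptions, not the project's code: `arxiv_bot.py` uses the `arxiv` Python library, the helper name `build_arxiv_query_url` is hypothetical, and it is assumed here that "5 latest" means five results across both categories.

```python
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"


def build_arxiv_query_url(categories, max_results=5):
    """Build an arXiv API URL for the newest submissions in the given categories.

    Hypothetical helper for illustration; it only builds the URL and does not fetch it.
    """
    # "cat:<name>" restricts results to a subject category; OR joins alternatives.
    search_query = " OR ".join(f"cat:{c}" for c in categories)
    params = {
        "search_query": search_query,
        "sortBy": "submittedDate",    # newest submissions first
        "sortOrder": "descending",
        "max_results": max_results,   # assumption: total across all categories
    }
    return f"{ARXIV_API}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_arxiv_query_url(["cs.LG", "quant-ph"]))
```

Fetching that URL returns an Atom feed whose entries carry the title, abstract, and PDF link that the later steps would consume.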

    cd frontend
    npm install
    npm run dev

The app is now live at http://localhost:3000
Important
Make sure the backend is running at :8000 before starting the frontend. The feed page calls http://127.0.0.1:8000/api/feed at render time via server-side fetch.
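
Because the feed page fetches from the backend at render time, it can help to confirm the API is reachable before running `npm run dev`. A minimal sketch, assuming `GET /health` answers with HTTP 200 when the service is healthy (the status code is not stated in this README); `backend_is_up` is a hypothetical helper, and the `opener` parameter exists only so the check can be exercised without a live server.

```python
import urllib.error
import urllib.request


def backend_is_up(url="http://127.0.0.1:8000/health", timeout=2.0,
                  opener=urllib.request.urlopen):
    """Return True if the URL answers with HTTP 200, False on any connection error.

    Hypothetical helper; assumes a healthy backend responds with status 200.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if __name__ == "__main__":
    print("backend reachable:", backend_is_up())
```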
Base URL: http://localhost:8000
| Method | Endpoint | Description |
|---|---|---|
| GET | `/` | Health ping |
| GET | `/health` | Database connectivity check |
| GET | `/auth/orcid` | Initiate ORCID OAuth2 login |
| GET | `/auth/callback?code=` | Handle OAuth callback, provision user |
| GET | `/api/feed` | Retrieve 10 most recent indexed papers |
| GET | `/api/search?q={query}` | Semantic vector search over Knowledge Graph |

    # Search by concept, not just keywords
    curl "http://localhost:8000/api/search?q=quantum+error+correction+fault+tolerant"
    curl "http://localhost:8000/api/search?q=LLM+hallucination+and+factuality"
    curl "http://localhost:8000/api/search?q=transformer+attention+mechanism+efficiency"

The search uses cosine distance over 384-dimensional embeddings, meaning it finds papers semantically close to your intent, not just keyword matches.
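
To make "cosine distance" concrete, here is a small self-contained sketch using toy 3-dimensional vectors in place of the real 384-dimensional embeddings. The function names and sample titles are illustrative assumptions; in the project itself the distance is computed inside PostgreSQL by pgvector.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity: 0 for same direction, 1 for orthogonal, 2 for opposite.

    Assumes both vectors are non-zero and of equal length.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def rank_by_distance(query_vec, papers):
    """Return paper titles ordered from nearest to farthest from the query vector."""
    return sorted(papers, key=lambda title: cosine_distance(query_vec, papers[title]))


if __name__ == "__main__":
    # Toy embeddings; real ones would come from all-MiniLM-L6-v2.
    papers = {
        "Surface codes": [0.9, 0.1, 0.0],
        "Attention efficiency": [0.0, 1.0, 0.0],
        "Unrelated topic": [-1.0, 0.0, 0.0],
    }
    query = [1.0, 0.0, 0.0]
    for title in rank_by_distance(query, papers):
        print(title, round(cosine_distance(query, papers[title]), 3))
```

Because only the angle between vectors matters, a short abstract and a long one on the same topic can still land close together.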
| Interface | URL |
|---|---|
| Swagger UI | http://localhost:8000/docs |
| ReDoc | http://localhost:8000/redoc |
The schema is organized into four conceptual zones, mirroring the four pillars of the platform. The excerpt below shows Zones A, B, and D:

    -- ── Zone A: Verified Identity & Reputation ──
    CREATE TABLE users (
        id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
        orcid_id VARCHAR(50) UNIQUE NOT NULL,
        full_name VARCHAR(255) NOT NULL,
        affiliation VARCHAR(255),
        career_stage VARCHAR(100),
        trust_score FLOAT DEFAULT 1.0,
        created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    );
    CREATE TABLE expertise_tags (id SERIAL PRIMARY KEY, tag_name VARCHAR(100) UNIQUE);
    CREATE TABLE user_expertise (user_id UUID, tag_id INT, endorsements INT DEFAULT 0);

    -- ── Zone B: The Living Portfolio ──
    CREATE TABLE works (
        id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
        doi VARCHAR(100) UNIQUE,
        title TEXT NOT NULL,
        abstract TEXT,
        abstract_embedding vector(384),    -- all-MiniLM-L6-v2 dimensions
        work_type VARCHAR(50),             -- 'paper', 'dataset', 'preprint'
        published_date DATE
    );
    CREATE TABLE authorships (user_id UUID, work_id UUID, is_corresponding BOOLEAN);

    -- ── Zone D: Collaboration Marketplace ──
    CREATE TABLE opportunities (
        id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
        author_id UUID REFERENCES users(id),
        title VARCHAR(255) NOT NULL,
        description TEXT NOT NULL,
        opportunity_type VARCHAR(50),         -- 'co-author', 'grant', 'mentorship'
        requirement_embedding vector(384),    -- matched against user portfolio vectors
        is_active BOOLEAN DEFAULT TRUE
    );

Why pgvector? Both `works.abstract_embedding` and `opportunities.requirement_embedding` are stored as 384-dimensional vectors. This enables future matching of researchers to opportunities by the mathematical similarity of their research portfolio to what a collaboration needs, not by keyword.
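
As an illustration of how such a similarity lookup might be issued from Python, the sketch below prepares a query against `works.abstract_embedding` using pgvector's `<=>` cosine distance operator. It is an assumption about usage, not the project's actual query: the helper names are hypothetical, and the `%(name)s` placeholders follow psycopg2's parameter style.

```python
EMBEDDING_DIM = 384  # matches vector(384) in the schema


def to_vector_literal(values):
    """Format numbers as pgvector's text input form, e.g. '[0.1,0.2,1.0]'."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


# <=> is pgvector's cosine distance operator; smaller means more similar.
SEARCH_SQL = """
SELECT id, title, abstract_embedding <=> %(query)s::vector AS distance
FROM works
WHERE abstract_embedding IS NOT NULL
ORDER BY abstract_embedding <=> %(query)s::vector
LIMIT %(limit)s
"""


def build_search_params(query_embedding, limit=10):
    """Validate the embedding size and return parameters for SEARCH_SQL."""
    if len(query_embedding) != EMBEDDING_DIM:
        raise ValueError(
            f"expected {EMBEDDING_DIM} dimensions, got {len(query_embedding)}"
        )
    return {"query": to_vector_literal(query_embedding), "limit": limit}


if __name__ == "__main__":
    params = build_search_params([0.0] * EMBEDDING_DIM, limit=5)
    print(SEARCH_SQL.strip())
    print(params["query"][:20], "...")
    # With a psycopg2 cursor this would run as:
    #   cursor.execute(SEARCH_SQL, params)
```

Passing the vector as a parameter and casting it with `::vector` keeps the query text constant and avoids interpolating values into SQL by hand.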
A researcher profile anchored on ORCID-authenticated identity. Methods, open questions, career stage, affiliations, and contributions beyond publications: reviewing, mentoring, replication, open resources. Trust is foundational, not optional.
Not a social timeline. A research-grade feed filtered by topics, methods, trusted collaborators, and quality signals: open data badges, code availability, replication flags, peer commentary. The ArXiv bot seeds this today; Semantic Scholar, PubMed, and more are planned.
Active, intent-based matching, not passive browsing. Researchers post specific needs and the platform matches against verified expertise portfolios using vector similarity. For example, "I need a Bayesian statistician for a clinical trial" is matched to researchers whose published work embeds nearest to that description.
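
The matching idea above can be sketched in a few lines. This is one possible approach under stated assumptions, not the platform's implementation: a researcher's portfolio is summarized as the mean (centroid) of their work embeddings, candidates are ranked by cosine similarity to the opportunity's requirement embedding, and all names and toy 2-dimensional vectors are hypothetical.

```python
import math


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def match_researchers(requirement_vec, portfolios, top_k=3):
    """Rank researchers by similarity of their portfolio centroid to the requirement.

    portfolios maps a researcher name to a list of work embeddings;
    researchers with no works are skipped. Returns (name, score) pairs, best first.
    """
    scored = [
        (name, cosine_similarity(requirement_vec, centroid(works)))
        for name, works in portfolios.items()
        if works
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    portfolios = {
        "statistician": [[1.0, 0.0], [0.8, 0.2]],
        "ml_engineer": [[0.0, 1.0]],
    }
    for name, score in match_researchers([1.0, 0.0], portfolios):
        print(name, round(score, 3))
```

Averaging is the simplest way to collapse a portfolio into one vector; it can blur researchers who work across several distinct areas, so matching against individual works is a plausible alternative.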
Postdocs, faculty openings, grants, fellowships, reviewer invitations, conference calls, and industry consulting, surfaced based on actual expertise, not generic job boards.
| User | Core Need |
|---|---|
| PhD Students & Early-Career Researchers | Visibility, mentorship, collaboration, curated literature |
| Principal Investigators | Lab recruitment, grant collaborators, reputation management |
| Industry R&D Scientists | Expert discovery, applied collaboration, talent pipeline |
| Independent Scholars | Credibility without institutional affiliation |
| Institutions & Publishers | Verified researcher records, community showcasing |
| Platform | Strengths | Gap |
|---|---|---|
| ORCID | Trusted identity infrastructure | Not a discovery or networking tool |
| ResearchGate | Professional network, Q&A, paper sharing | Noisy feed, no semantic search, weak curation |
| Semantic Scholar | AI-powered paper discovery | No professional identity or collaboration layer |
| LinkedIn | Professional network, job discovery | Not research-aware, zero academic trust signals |
| Mendeley | Reference management, groups | Passive; no active collaboration or reputation |
| PROJECT:RS | All four pillars, with curation as the defining principle | N/A |
| Layer | Technology | Rationale |
|---|---|---|
| Frontend | Next.js 16, TypeScript, Tailwind CSS | SSR for SEO, type-safe, fast |
| Backend | FastAPI (Python) | Async, auto-documented, ML-adjacent |
| Database | PostgreSQL 16 + pgvector | Relational integrity + native vector search |
| Embeddings | `all-MiniLM-L6-v2` (SentenceTransformers) | 384d, fast, fully local, zero API cost |
| Auth | ORCID OAuth2 | The gold standard for verified academic identity |
| Ingestion | `arxiv` Python library + `pypdf` | Fetches and parses full PDFs, not just abstracts |
| Infra | Docker Compose | One-command database bootstrapping |
This project is in early incubation under Singularity Student Lab. Contributions, ideas, and feedback are welcome.
- Fork the repository
- Branch off `main`: `git checkout -b feature/your-feature`
- Commit with conventional commits: `git commit -m 'feat: add collaboration matching'`
- Push and open a Pull Request against `main`
Please open an issue before submitting large changes so we can discuss the approach first.
Distributed under the MIT License. See LICENSE for details.