ZahraRahimii/Information-Retrieval-Scoring-Models

Information-Retrieval-Scoring-Models

This project contains implementations of two key document ranking models used in Information Retrieval (IR):

  • BM25/Okapi: A probabilistic ranking function based on term frequency, document length normalization, and inverse document frequency.
  • Binary Independence Model (BIM): A probabilistic model that assumes terms occur independently of one another and scores and ranks documents based on binary term presence.
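
For reference, one common textbook formulation of the two scoring functions is sketched below. It is an assumption that the code in this repository follows exactly these variants; in particular, the IDF smoothing and the BIM weights (shown here for the case with no relevance judgments) differ between implementations. Here f(t, D) is the frequency of term t in document D, |D| is the document length in tokens, avgdl is the average document length, N is the number of documents, n_t is the number of documents containing t, and k and b are the BM25 parameters.

```latex
\mathrm{IDF}(t) = \ln\!\left(\frac{N - n_t + 0.5}{n_t + 0.5} + 1\right)

\mathrm{BM25}(D, Q) = \sum_{t \in Q} \mathrm{IDF}(t)\,
  \frac{f(t, D)\,(k + 1)}{f(t, D) + k\left(1 - b + b\,\frac{|D|}{\mathrm{avgdl}}\right)}

\mathrm{BIM}(D, Q) = \sum_{t \in Q \cap D} \ln\frac{N - n_t + 0.5}{n_t + 0.5}
```

The "+ 1" inside the BM25 logarithm is a widely used variant that keeps the IDF non-negative. The BIM sum depends only on whether a query term is present in the document, not on how often it occurs.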

Project Workflow:

  • Tokenization of documents
  • BM25 scoring and ranking (configurable parameters: k=1.5, b=0.75)
  • BIM scoring with support for a set of query terms
  • Output of a ranked list of documents with their relevance scores
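
The tokenization, BM25 scoring, and ranking steps above could be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function names `tokenize` and `bm25_rank`, the regex-based tokenizer, the non-negative IDF variant, and the toy documents are all assumptions made for the example. Only the parameter values k=1.5 and b=0.75 come from this README.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_rank(query, docs, k=1.5, b=0.75):
    """Return (doc_id, score) pairs sorted by descending BM25 score."""
    tokenized = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    n_docs = len(tokenized)
    avgdl = sum(len(tokens) for tokens in tokenized.values()) / n_docs

    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized.values() for term in set(tokens))

    query_terms = set(tokenize(query))
    scores = {}
    for doc_id, tokens in tokenized.items():
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            # The "+ 1" inside the log keeps IDF non-negative (assumed variant).
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            length_norm = 1 - b + b * len(tokens) / avgdl
            score += idf * tf[term] * (k + 1) / (tf[term] + k * length_norm)
        scores[doc_id] = score
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy documents, not the D1 to D8 set shipped with the project.
toy_docs = {
    "A": "information retrieval models rank documents",
    "B": "probabilistic models for retrieval",
    "C": "cooking recipes for pasta",
}
ranking = bm25_rank("information retrieval models", toy_docs)
for doc_id, score in ranking:
    print(doc_id, round(score, 3))
```

Documents that contain none of the query terms receive a score of 0 and end up at the bottom of the list. Raising b strengthens the length normalization, and raising k lets repeated occurrences of a term contribute more before the score saturates.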

Dataset

A sample set of documents and a query are used to test the models:

  • Query: "information retrieval models"
  • Documents: D1 to D8 (provided in the code)
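
As an illustration of how the query above might be scored with BIM, here is a minimal sketch. The documents in it are hypothetical stand-ins (labeled S1 to S5), not the D1 to D8 set from the code, and the function name `bim_rank`, the tokenizer, and the smoothed term weights (the form used when no relevance judgments are available) are assumptions rather than the project's actual implementation.

```python
import math
import re


def term_set(text):
    """Return the set of lowercase alphanumeric tokens (binary term presence)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def bim_rank(query, docs):
    """Return (doc_id, score) pairs sorted by descending BIM score."""
    doc_terms = {doc_id: term_set(text) for doc_id, text in docs.items()}
    n_docs = len(doc_terms)
    query_terms = term_set(query)

    # Smoothed weight per query term, assuming no relevance feedback is available.
    weights = {}
    for term in query_terms:
        df = sum(term in terms for terms in doc_terms.values())
        weights[term] = math.log((n_docs - df + 0.5) / (df + 0.5))

    # A document's score sums the weights of the query terms it contains.
    scores = {
        doc_id: sum(weights[term] for term in query_terms & terms)
        for doc_id, terms in doc_terms.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample documents for illustration only.
sample_docs = {
    "S1": "information retrieval models score documents",
    "S2": "retrieval of images",
    "S3": "weather report for monday",
    "S4": "football match results",
    "S5": "language models",
}
bim_ranking = bim_rank("information retrieval models", sample_docs)
for doc_id, score in bim_ranking:
    print(doc_id, round(score, 3))
```

With this weighting, a query term that appears in more than half of the documents gets a negative weight, so on very small collections a match on a common term can lower a document's score.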
