AutoCVSS

This repository contains the code for the EMNLP 2025 paper 📄

AutoCVSS: Assessing the Performance of LLMs for Automated Software Vulnerability Scoring

If you use AutoCVSS for your research, please cite our paper:

@inproceedings{sanvito2025autocvss,
  title={AutoCVSS: Assessing the Performance of LLMs for Automated Software Vulnerability Scoring},
  author={Sanvito, Davide and Arriciati, Giovanni and Siracusano, Giuseppe and Bifulco, Roberto and Carminati, Michele},
  booktitle = "Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track",
  month = nov,
  year = "2025",
  publisher = "Association for Computational Linguistics",
}

0. Pre-requisites

AutoCVSS uses Python 3.12: make sure it is installed on your system. On Ubuntu, it can be installed from the deadsnakes PPA:

# sudo apt update; sudo apt install software-properties-common # (if `add-apt-repository` is not found)
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt update
sudo apt install -y python3.12

This project's dependencies are managed with Poetry. We suggest using pipx to install Poetry.

sudo apt update
sudo apt install -y pipx
pipx ensurepath
pipx install poetry
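
Before moving on, it can help to confirm that the interpreter and the tools are actually visible on your system. The snippet below is a minimal sketch and not part of the repository: the helper names `python_matches` and `missing_tools` are hypothetical, and it assumes that "Python 3.12" means any 3.12.x release.

```python
import shutil
import sys

REQUIRED_PYTHON = (3, 12)  # version stated in this README


def python_matches(version_info=sys.version_info, required=REQUIRED_PYTHON):
    """Return True when the major.minor version equals the required one."""
    return tuple(version_info[:2]) == tuple(required)


def missing_tools(tools=("pipx", "poetry")):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    print("Python 3.12 interpreter:", python_matches())
    print("Missing tools:", missing_tools() or "none")
```

Run it with the interpreter you intend to use (e.g. `python3.12`), since the check reports on whichever interpreter executes it. If `poetry` is reported as missing right after `pipx ensurepath`, opening a new shell usually picks up the updated PATH.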

1. Install dependencies

To set up the project environment and install all necessary dependencies, run the following commands:

cd AutoCVSS
poetry install

2. Configure API keys

A valid OpenAI API key must be provided in the .keys.ini configuration file. The .keys.ini.sample file provides a template to start from.

# cd AutoCVSS
cp .keys.ini.sample .keys.ini
nano .keys.ini

  • The OpenAI API key goes in the OpenAI_auth_params section.
  • The Langfuse section can optionally be uncommented and configured to enable the integration with Langfuse, a state-of-the-art tool for LLM observability.
  • For local LLMs served through an OpenAI-compatible API (e.g. via vLLM or Ollama), configure the endpoints and model names in the autocvss/connector/llm.py file according to your local deployment:
    • LLaMA3: get_llama31_api_model_name() and get_llama31_client()
    • DeepSeek-R1: get_ollama_deepseek_r1_70b_api_model_name() and get_ollama_deepseek_r1_70b_client()
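
To catch a missing or misnamed section before launching an experiment, the keys file can be inspected with the standard library. This is a minimal sketch under two assumptions: that .keys.ini is a regular INI file readable by `configparser`, and that listing option names is enough for a sanity check. The helper `load_openai_params` is hypothetical and is not the loader used by AutoCVSS; the option names inside the section are whatever .keys.ini.sample defines.

```python
import configparser

OPENAI_SECTION = "OpenAI_auth_params"  # section name given in this README


def load_openai_params(path=".keys.ini"):
    """Read the OpenAI section of the keys file and return it as a dict."""
    # interpolation=None keeps '%' characters in secrets from being interpreted.
    parser = configparser.ConfigParser(interpolation=None)
    if not parser.read(path):
        raise FileNotFoundError(f"{path} not found: copy .keys.ini.sample first")
    if not parser.has_section(OPENAI_SECTION):
        raise KeyError(f"missing [{OPENAI_SECTION}] section in {path}")
    return dict(parser[OPENAI_SECTION])
```

Calling `sorted(load_openai_params())` from the repository root lists the configured option names without printing any secret value.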

3. Download dataset from NVD

Before running the experiments, run the following commands to download the CVE data from NVD.

# cd AutoCVSS
cd data
for year in {2022..2024}; do
    if [ ! -f nvdcve-2.0-${year}.json.zip ]; then
        wget https://nvd.nist.gov/feeds/json/cve/2.0/nvdcve-2.0-${year}.json.zip
        unzip nvdcve-2.0-${year}.json.zip
    fi
done
poetry run python process_nvd_data.py
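
On systems without `wget` or `unzip`, the download loop above can be approximated in Python. The following is a sketch of the same logic, not a script shipped with the repository: the function names are hypothetical, and only the URL pattern and the year range are taken from the shell commands above.

```python
import urllib.request
import zipfile
from pathlib import Path

# URL pattern copied from the shell loop in this README.
FEED_URL = "https://nvd.nist.gov/feeds/json/cve/2.0/nvdcve-2.0-{year}.json.zip"


def archive_name(year):
    """File name of the yearly NVD archive."""
    return f"nvdcve-2.0-{year}.json.zip"


def feed_url(year):
    """Download URL of the yearly NVD archive."""
    return FEED_URL.format(year=year)


def download_feeds(data_dir, years=range(2022, 2025)):
    """Download and extract the archives not yet present in data_dir.

    Mirrors the shell loop: an archive that already exists is skipped
    entirely (it is neither downloaded nor extracted again).
    Returns the list of years that were downloaded.
    """
    data_dir = Path(data_dir)
    downloaded = []
    for year in years:
        archive = data_dir / archive_name(year)
        if archive.exists():
            continue
        urllib.request.urlretrieve(feed_url(year), archive)
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(data_dir)
        downloaded.append(year)
    return downloaded
```

After `download_feeds("data")` has been called from the repository root, the processing step (`poetry run python process_nvd_data.py`) still has to be run from the data directory, as in the shell version.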

Your data directory should now look as follows:

.
├── v31_full_dataset
│   ├── dataset_test_df.csv
│   ├── dataset_train_df.csv
│   ├── test_set_cve_ids.csv
│   └── train_set_cve_ids.csv
├── v31_low_resource_dataset
│   ├── dataset_test_df.csv
│   ├── dataset_train_df.csv
│   ├── test_set_cve_ids.csv
│   └── train_set_cve_ids.csv
└── v40_38_samples_by_first
    ├── dataset_from_NVD.csv
    └── test_set_cve_ids.csv
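
The listing above can also be checked programmatically before starting a long run. This is a minimal sketch: the folder and file names are copied from the listing, while the helper `missing_files` is hypothetical and not part of the repository.

```python
from pathlib import Path

SPLIT_FILES = [
    "dataset_test_df.csv",
    "dataset_train_df.csv",
    "test_set_cve_ids.csv",
    "train_set_cve_ids.csv",
]

# Folder -> files, as shown in the directory listing of this README.
EXPECTED_LAYOUT = {
    "v31_full_dataset": SPLIT_FILES,
    "v31_low_resource_dataset": SPLIT_FILES,
    "v40_38_samples_by_first": ["dataset_from_NVD.csv", "test_set_cve_ids.csv"],
}


def missing_files(data_dir):
    """Return the expected files (relative paths) that are absent from data_dir."""
    data_dir = Path(data_dir)
    return sorted(
        f"{folder}/{name}"
        for folder, names in EXPECTED_LAYOUT.items()
        for name in names
        if not (data_dir / folder / name).is_file()
    )
```

An empty list from `missing_files("data")` means that every file in the listing is in place. The check only looks at file presence and says nothing about the content of the CSV files.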

4. Run experiments

You are now ready to execute the experiments:

  • The notebooks directory includes the Jupyter notebooks to evaluate LLMs for the prediction of CVSS scores.
  • In each notebook you should select the data scenario and LLM configuration to be tested by providing:
    1. DATASET_NAME and CVSS_METRICS
    2. EXPERIMENT_NAME
  • ⚠️ By default, the notebooks only test the first 3 samples: comment out the following line to run the tests on the entire test set!
    - dataset_test_df = dataset_test_df.head(3)
    + # dataset_test_df = dataset_test_df.head(3)
  • Optionally, the notebooks can be run directly from the shell with the three bash scripts provided in the root of this repository:
    • run__evaluation_zeroshot_STD_DTD.sh
    • run__evaluation_zeroshot_FVP.sh
    • run__evaluation_fewshots_STD_DTD.sh
  • Finally, the tables with the summary of the results can be visualized with the parse_results.ipynb notebook.
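
As an alternative to commenting the `head(3)` line in and out, the sample limit could be made configurable. The sketch below is only an illustration of that idea and is not how the notebooks are written: the environment variable `AUTOCVSS_MAX_SAMPLES` and both helper names are hypothetical, and the function works on any object exposing a pandas-style `head(n)` method.

```python
import os


def max_samples_from_env(var="AUTOCVSS_MAX_SAMPLES"):
    """Return the sample limit stored in `var`, or None when it is unset or empty."""
    raw = os.environ.get(var, "").strip()
    return int(raw) if raw else None


def limit_test_samples(dataset_test_df, max_samples=None):
    """Return the first max_samples rows, or the full data when the limit is None."""
    if max_samples is None:
        return dataset_test_df
    return dataset_test_df.head(max_samples)


# Hypothetical use inside a notebook cell:
# dataset_test_df = limit_test_samples(dataset_test_df, max_samples_from_env())
```

With this pattern a quick smoke test would use `AUTOCVSS_MAX_SAMPLES=3`, while leaving the variable unset would process the entire test set.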