Narcissistic Twitter

Project goals

Create a model that predicts a person's narcissism from their tweets. The project uses data based on a two-factor model of narcissism: Admiration (ADM) and Rivalry (RIV). Each factor score is the average of nine responses to statements about narcissism, each answered on a scale of 1 to 6 (from strongly disagree to strongly agree).
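The scoring rule described above can be sketched in a few lines. This is a minimal illustration, not the project's own code: the function name factor_score and the sample answers are made up, and it only assumes what the paragraph states (nine responses per factor, each between 1 and 6, averaged).

```python
from statistics import mean

SCALE_MIN, SCALE_MAX = 1, 6  # strongly disagree .. strongly agree
ITEMS_PER_FACTOR = 9         # nine statements per factor


def factor_score(responses):
    """Return the mean of nine questionnaire responses on the 1-6 scale."""
    if len(responses) != ITEMS_PER_FACTOR:
        raise ValueError(
            f"expected {ITEMS_PER_FACTOR} responses, got {len(responses)}"
        )
    if any(not SCALE_MIN <= r <= SCALE_MAX for r in responses):
        raise ValueError("each response must be between 1 and 6")
    return mean(responses)


# Made-up answers for a single respondent (illustrative only).
adm_responses = [4, 5, 3, 4, 6, 5, 4, 3, 5]
riv_responses = [2, 1, 3, 2, 2, 1, 2, 3, 2]

adm = factor_score(adm_responses)
riv = factor_score(riv_responses)
print(f"ADM={adm:.2f} RIV={riv:.2f}")
```

Both scores stay within the 1 to 6 range, so they can serve directly as regression targets.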

Train and test models

The repository is inspired by lightning-hydra-template. To train a model, run python train_lightning.py experiment=bert (Lightning models) or python train.py experiment=baseline (other models). Each experiment is defined in configs/experiments, where you specify the necessary overrides.

Examples

Multirun of baselines

For example, a single command can evaluate all baselines for one experiment, or for several experiments at once:

python train.py experiment=base/baseline_tr_adm model=baselines/decision_tree,baselines/gradient_boosting,baselines/mlp,baselines/random_forest,baselines/svr seed=42,43,44,45,46 -m

python train.py experiment=base/baseline_tr_adm,base/baseline_tr_riv,base/baseline_ab_adm,base/baseline_ab_riv,base/baseline_ai_adm,base/baseline_ai_riv model=baselines/decision_tree,baselines/gradient_boosting,baselines/mlp,baselines/random_forest,baselines/svr,baselines/linear_regression seed=42,47,72,43,12 -m
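With -m, Hydra's default sweeper launches one run for every combination of the comma-separated values. The sketch below reproduces that expansion for the first command above using only the standard library; the expand helper is illustrative and is not part of Hydra or of this repository.

```python
from itertools import product

# Values copied from the first multirun command above.
sweep = {
    "experiment": ["base/baseline_tr_adm"],
    "model": [
        "baselines/decision_tree",
        "baselines/gradient_boosting",
        "baselines/mlp",
        "baselines/random_forest",
        "baselines/svr",
    ],
    "seed": [42, 43, 44, 45, 46],
}


def expand(sweep):
    """Return one override dict per element of the Cartesian product."""
    keys = list(sweep)
    combos = product(*(sweep[key] for key in keys))
    return [dict(zip(keys, combo)) for combo in combos]


jobs = expand(sweep)
print(f"{len(jobs)} runs")
for job in jobs[:3]:
    # Each job corresponds to one set of command-line overrides.
    print(" ".join(f"{key}={value}" for key, value in job.items()))
```

The first command therefore produces 1 x 5 x 5 = 25 runs. By the same arithmetic, the second command (6 experiments, 6 models, 5 seeds) produces 180 runs, which is worth keeping in mind before launching it.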

Few shot

python train.py -m experiment=few_shot/casual_conversation

Bert Optuna

Search for the best hyperparameters using Optuna:

python train_lightning.py experiment=bert hparams_search=bert_optuna

The results are saved in the file example.db. The command can fail if several experiments share the same name or similar parameters. Changing the study_name should resolve the problem; alternatively, change the storage file.

Project setup

Python environment

Poetry is the preferred way to set up the environment. Install it first if you have not already.

If the Python version does not match, run:

poetry env use [full_path_to_the_python_interpreter]

This ensures that your project uses the desired Python version.

Once you have your environment set up, you can install the project's dependencies from the lock file by running:

poetry install

This command reads the poetry.lock file and installs the exact versions of the dependencies specified in it. To add a new dependency, run:

poetry add [package_name]

This command installs the specified package and updates both the pyproject.toml file and the poetry.lock file with the new dependency information. To update the lock file or create a new one, run:

poetry lock [--no-update] [--check]

By default, running poetry lock without any options re-resolves the dependencies declared in pyproject.toml and updates the lock file accordingly. The --no-update option refreshes the lock file without upgrading dependencies that are already locked. The --check option verifies that the lock file is consistent with the pyproject.toml file.

Poetry also supports dependency groups. To add, modify, or make a group optional, refer to the Poetry documentation.

Environment file .env

Create your environment file from the template .env.example: copy it, rename the copy to .env, and paste in your credentials.

Twitter credentials

Generate your bearer token and add it to your .env file. You will also need a Twitter project and the Consumer Keys (API Key and Secret) connected to it.

Load credentials

You can use:

import os

os.getenv('KEY_NAME')

In the application, you may also use python-dotenv:

from dotenv import load_dotenv

load_dotenv()

In notebooks:

%load_ext dotenv
%dotenv
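Because os.getenv returns None for a missing key, a forgotten credential can surface much later as a confusing error. The sketch below wraps the lookup so it fails early. The helper name require_env and the variable name BEARER_TOKEN are assumptions made for illustration; use the key names from your own .env file.

```python
import os


def require_env(name):
    """Return an environment variable's value, or fail with a clear message."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value


# BEARER_TOKEN is a placeholder name; the dummy value is set only so this
# demo runs without a real .env file.
os.environ.setdefault("BEARER_TOKEN", "dummy-token-for-demo")

token = require_env("BEARER_TOKEN")
print("token loaded:", bool(token))
```

In the application, call load_dotenv() before the first require_env call so that values from .env are already in the environment.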

Git hooks

We use pre-commit to manage git hooks. Unfortunately, the custom hook does not work on Windows; Windows users can comment out the strip-notebooks hook in .pre-commit-config.yaml. To set up the hooks, run:

chmod +x .hooks
pre-commit install

Running tests

Run tests for a given directory or file:

python3 -m pytest tests -rA

Add -k 'test_name or expression' to run just selected test(s):

python3 -m pytest tests/models -k 'test_metrics' -rA
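To show what -k 'test_metrics' selects, here is a hypothetical test module (call it test_example.py). Everything in it is invented for illustration, including the mean_absolute_error helper; it is not taken from this repository's test suite. pytest matches the -k expression against test names, so the first two tests would run and the third would be deselected.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("inputs must have the same length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def test_metrics_mae_is_zero_for_perfect_predictions():
    # Selected by -k 'test_metrics': the name contains that substring.
    assert mean_absolute_error([1.0, 2.0], [1.0, 2.0]) == 0.0


def test_metrics_mae_averages_absolute_errors():
    # Selected as well: errors are 1.0 and 2.0, so the mean is 1.5.
    assert mean_absolute_error([1.0, 2.0], [2.0, 4.0]) == 1.5


def test_length_mismatch_raises():
    # Not selected by -k 'test_metrics': the name does not match.
    try:
        mean_absolute_error([1.0], [1.0, 2.0])
    except ValueError:
        return
    raise AssertionError("expected ValueError")
```

Expressions can be combined, for example -k 'test_metrics or test_length', to run all three.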

Linting

We use flake8 and black. Both run automatically as git hooks. The flake8 configuration is stored in the .flake8 file, and the black configuration is stored in pyproject.toml.

VS Code

To enable linting, first configure the interpreter: open the Command Palette (Ctrl+Shift+P), run Python: Select Interpreter, and choose your conda environment. Then run Python: Select Linter and pick flake8.

Additional Tools

Hydra

Hydra is a framework for elegantly configuring complex applications. It is very useful for managing experiment configs and for launching multiple jobs, for example running several experiments with different values of a parameter such as the learning rate.

It is recommended to read the article written by Hydra's author.
