Evaluation helpers

Papers to explore

Papers/stuff to keep as reference

Future work

  • Calculation of the new embedding:
    1. Use a weighted average based on the lengths of the initial tokens
      • E.g. new token: "martelo"; old tokens: ["mar", "telo"]. The embedding for "martelo" is the weighted average of the embeddings for "mar" and "telo", with weight 3/7 for "mar" and 4/7 for "telo"
    2. Translate the word from the target language into the original language, and use the embedding of the translated word
      • E.g. new token: "martelo". Its translation is "hammer", so use the embedding for "hammer" (if the original language is English)
  • Final deployment:
    1. Explore https://github.com/vllm-project/vllm to deploy the final LLM or to run some tests
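
The length-weighted average from the first embedding option can be sketched as follows. This is only a minimal illustration, not the project's implementation: the function name `length_weighted_embedding`, the dictionary layout, and the toy 2-dimensional vectors are all assumptions made for the example, and weights are taken from character counts as in the "martelo" case.

```python
from typing import Dict, List, Sequence


def length_weighted_embedding(
    pieces: Sequence[str],
    embeddings: Dict[str, Sequence[float]],
) -> List[float]:
    """Average the embeddings of `pieces`, weighting each by its character length.

    Hypothetical helper: `embeddings` maps each old token to its vector.
    """
    if not pieces:
        raise ValueError("pieces must not be empty")

    total_len = sum(len(piece) for piece in pieces)
    dim = len(embeddings[pieces[0]])
    result = [0.0] * dim

    for piece in pieces:
        # "mar" gets 3/7 and "telo" gets 4/7 when the new token is "martelo".
        weight = len(piece) / total_len
        for i, value in enumerate(embeddings[piece]):
            result[i] += weight * value
    return result


# Toy 2-dimensional embeddings, invented purely for illustration.
old_embeddings = {"mar": [7.0, 0.0], "telo": [0.0, 7.0]}
new_embedding = length_weighted_embedding(["mar", "telo"], old_embeddings)
print(new_embedding)
```

With real models the same weighting would be applied to rows of the embedding matrix rather than to Python lists; the list-based version is used here only to keep the sketch dependency-free.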