f1monkey/elasticsearch-ru-en-translit-plugin

Elasticsearch RU-EN Phonetic Translit Plugin

Phonetic transliteration filter for Russian and English text. Enables sound-based search across languages.

For example, the following words will match each other:

  • "extension" / "экстеншн" / "экстеншен"
  • "daily" / "дейли" / "дэйли"
  • "chlorophyll" / "хлорофилл"

Perfect for:

  • E-commerce product search
  • Brand name matching
  • Multilingual catalogs
  • Fuzzy phonetic search

Installation

  • Select the plugin version that matches your Elasticsearch version here.
  • Copy the URL of that release.
  • Run the install command, replacing %url% with the copied URL:
sudo /usr/share/elasticsearch/bin/elasticsearch-plugin install %url%
  • Restart Elasticsearch.

Building from source

If no prebuilt plugin exists for your Elasticsearch version, build it yourself. For example, for 8.19.3:

git clone https://github.com/f1monkey/elasticsearch-ru-en-translit-plugin.git
cd elasticsearch-ru-en-translit-plugin
./gradlew clean build -PesVersion=8.19.3

Usage

PUT /my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "standard",
          "filter": ["ru_en_translit"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": { "type": "text", "analyzer": "my_analyzer" }
    }
  }
}

POST /my_index/_analyze
{
  "analyzer": "my_analyzer",
  "text": "extension экстеншен экстеншон экстеншн"
}

Result:

{
  "tokens": [
    {
      "token": "ekstenshn",
      "start_offset": 0,
      "end_offset": 9,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "ekstenshn",
      "start_offset": 10,
      "end_offset": 19,
      "type": "<ALPHANUM>",
      "position": 1
    },
    {
      "token": "ekstenshn",
      "start_offset": 20,
      "end_offset": 29,
      "type": "<ALPHANUM>",
      "position": 2
    },
    {
      "token": "ekstenshn",
      "start_offset": 30,
      "end_offset": 38,
      "type": "<ALPHANUM>",
      "position": 3
    }
  ]
}
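
Since all four spellings are reduced to the same token, a search in one language should find text indexed in the other. The requests below are a minimal sketch of that check, assuming the my_index definition above; the document ID 1 and the refresh=true parameter are illustrative choices and not part of the plugin.

```
PUT /my_index/_doc/1?refresh=true
{
  "name": "extension"
}

GET /my_index/_search
{
  "query": {
    "match": { "name": "экстеншн" }
  }
}
```

A match query analyzes the query text with the analyzer of the target field, so "экстеншн" should be reduced to the same ekstenshn token shown in the result above, and the document indexed as "extension" is expected to be returned.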

Run tests

./gradlew test

Licence

MIT
