go-slugify - Make pretty slugs with multiple-language support.
slug - URL-friendly slugify with multiple-language support.
Slugify - Go slugify application that handles strings.
gojieba - Go implementation of jieba, a Chinese word-splitting algorithm.
gotokenizer - A tokenizer based on a dictionary and bigram language models for Go. (Currently supports only Chinese segmentation.)
gse - Efficient text segmentation in Go; supports English, Chinese, Japanese, and other languages.
MMSEGO - Go implementation of MMSEG, a Chinese word-splitting algorithm.
prose - Library for text processing that supports tokenization, part-of-speech tagging, named-entity extraction, and more. English only.
segment - Go library for performing Unicode Text Segmentation as described in Unicode Standard Annex #29.
sentences - Sentence tokenizer: converts text into a list of sentences.
shamoji - A word-filtering package written in Go.
stemmer - Stemmer packages for the Go programming language. Includes English and German stemmers.
textcat - Go package for n-gram-based text categorization, with support for UTF-8 and raw text.
ctxi18n - Context-aware i18n with a short and concise API, pluralization, interpolation, and fs.FS support. YAML locale definitions are based on Rails i18n.
go-i18n - Package and an accompanying tool to work with localized text.
go-mystem - CGo bindings to Yandex.Mystem, a Russian morphology analyzer.
go-pinyin - Chinese Hanzi to Hanyu Pinyin converter.
go-words - A words table and text resource library for Golang projects.
gotext - GNU gettext utilities for Go.
iuliia-go - Transliterate Cyrillic → Latin in every possible way.
spreak - Flexible translation and humanization library for Go, based on the concepts behind gettext.
t - Another i18n package for Go that follows GNU gettext style and supports .po/.mo files: t.T (gettext), t.N (ngettext), etc. It also includes a command-line tool, xtemplate, which extracts messages from text/html templates into a .pot file.
enca - Minimal cgo bindings for libenca, which detects character encodings.
go-unidecode - ASCII transliterations of Unicode text.
gounidecode - Unicode transliterator (also known as unidecode) for Go.
transliterator - Provides one-way string transliteration with support for language-specific transliteration rules.