- gojieba - A Go implementation of jieba, a Chinese word segmentation algorithm.
- gotokenizer - A tokenizer for Go based on a dictionary and bigram language models. (Currently supports only Chinese segmentation.)
- gse - Efficient text segmentation for Go; supports English, Chinese, Japanese, and other languages.
- MMSEGO - A Go implementation of MMSEG, a Chinese word segmentation algorithm.
- prose - Library for text processing that supports tokenization, part-of-speech tagging, named-entity extraction, and more. English only.
- segment - Go library for performing Unicode Text Segmentation as described in Unicode Standard Annex #29.
- sentences - Sentence tokenizer: converts text into a list of sentences.
- shamoji - A word-filtering package written in Go.
- stemmer - Stemmer packages for the Go programming language. Includes English and German stemmers.
- textcat - Go package for n-gram based text categorization, with support for UTF-8 and raw text.
- ctxi18n - Context-aware i18n with a short and concise API, pluralization, interpolation, and fs.FS support. YAML locale definitions are based on Rails i18n.
- go-i18n - Package and an accompanying tool to work with localized text.
- go-mystem - CGo bindings to Yandex.Mystem, a Russian morphological analyzer.
- go-pinyin - Chinese Hanzi to Hanyu Pinyin converter.
- go-words - A words table and text resource library for Go projects.
- gotext - GNU gettext utilities for Go.
- iuliia-go - Transliterate Cyrillic → Latin in every possible way.
- spreak - Flexible translation and humanization library for Go, based on the concepts behind gettext.
- t - Another i18n package for Go that follows GNU gettext style and supports .po/.mo files: t.T (gettext), t.N (ngettext), etc. It also includes a command-line tool, xtemplate, which extracts messages from text/html templates into a .pot file.
- enca - Minimal cgo bindings for libenca, which detects character encodings.
- go-unidecode - ASCII transliterations of Unicode text.
- gounidecode - Unicode transliterator (also known as unidecode) for Go.
- transliterator - Provides one-way string transliteration with support for language-specific transliteration rules.