Uses MeCab, the Japanese morphological analyser by Taku Kudo, compiled to WebAssembly.
References a prior Emscripten example by fasiha.
Uses NAIST-jdic as the dictionary for MeCab.
Uses WanaKana for transliteration, additional tokenization, and classification.
Uses Jisho for external dictionary lookup – created by Kim Ahlström, Miwa Ahlström and Andrew Plummer.
Uses Ve's algorithm for agglutinating tokens into words – created by Kim Ahlström.
Uses EDICT2 and ENAMDICT for embedded dictionary lookup – created by Jim Breen.
This webpage combines language technologies to suggest where to insert spaces,
how to pronounce kanji, and how to find words in a dictionary.
Source code is available on GitHub.
— Alex Birch