A wrapper around MeCab, a part-of-speech and morphological
analyzer for Japanese.
(let ((mecab (make-mecab "/var/lib/mecab/dic/ipadic-utf8/")))
  (mecab-parse mecab "今日学校に行きます"))
=> (("今日" "名詞,時相名詞,*,*,今日,きょう,代表表記:今日/きょう カテゴリ:時間") ...)
(mecab-tokenize "今日学校に行きます")
=> ("今日" "学校" "に" "行き" "ます" "\n")
(mecab-yomi "今日学校に行きます")
=> "きょうがっこうにいきます\n"
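The last element of the token list above is not part of the input sentence. Assuming it is a newline token emitted at the end of the sentence, a caller may want to filter it out before further processing. The following is a minimal sketch in portable Scheme. drop-newline-tokens is a hypothetical helper, not part of this library, and the token list is sample data copied from the example rather than live parser output.

```scheme
;; Hypothetical helper (not part of the mecab wrapper): remove every
;; token that is exactly a newline string from a list of tokens.
(define (drop-newline-tokens tokens)
  (cond ((null? tokens) '())
        ((string=? (car tokens) "\n")
         (drop-newline-tokens (cdr tokens)))
        (else
         (cons (car tokens) (drop-newline-tokens (cdr tokens))))))

;; Sample data, assumed to match the shape of the mecab-tokenize result.
(define sample-tokens '("今日" "学校" "に" "行き" "ます" "\n"))

(drop-newline-tokens sample-tokens)
;; => ("今日" "学校" "に" "行き" "ます")
```

With the library loaded, the same helper could be applied directly to the result of mecab-tokenize.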
Note that in addition to the MeCab library, you will need a dictionary
in UTF-8 format installed, e.g. on Ubuntu from the
mecab-ipadic-utf8 or mecab-jumandic-utf8 packages.
make-mecab
Create a new mecab parser.

mecab-parse
Parses the string str with the mecab parser, and returns a list of parses.

A parameter holding the mecab parser used for mecab-tokenize.

mecab-tokenize
Splits str into a list of tokens.

A parameter holding the mecab parser used for mecab-yomi.

mecab-yomi
Returns the hiragana pronunciation of str.

Returns the last error generated by mecab-parse.
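
In the mecab-parse example, each parse appears as a two-element list: the surface form followed by a single comma-separated feature string. Assuming that shape holds for every entry, the features can be split into separate fields with plain Scheme. The sketch below uses two hypothetical helpers, split-on and parse-entry->fields, which are not part of this library. The entry is sample data copied from the example, and the meaning of each field depends on the installed dictionary.

```scheme
;; Hypothetical helper: split str on the character ch, keeping empty fields.
(define (split-on ch str)
  (let loop ((chars (string->list str)) (field '()) (fields '()))
    (cond ((null? chars)
           (reverse (cons (list->string (reverse field)) fields)))
          ((char=? (car chars) ch)
           (loop (cdr chars) '()
                 (cons (list->string (reverse field)) fields)))
          (else
           (loop (cdr chars) (cons (car chars) field) fields)))))

;; Hypothetical helper: turn one (surface features) entry into
;; (surface field1 field2 ...).
(define (parse-entry->fields entry)
  (cons (car entry) (split-on #\, (cadr entry))))

;; Sample entry, assumed to match the shape of a mecab-parse result.
(define sample-entry
  '("今日" "名詞,時相名詞,*,*,今日,きょう,代表表記:今日/きょう カテゴリ:時間"))

(parse-entry->fields sample-entry)
;; => ("今日" "名詞" "時相名詞" "*" "*" "今日" "きょう"
;;     "代表表記:今日/きょう カテゴリ:時間")
```

Applying parse-entry->fields to each element of a parse list would give one field list per token.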