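Generate funny text automatically → learn from funny text → split text into words
So that means I need to split text into words (morphological analysis).
It turns out a morphological analysis engine called MeCab is publicly available, so I installed it.
Still, wrapping it in something like an API should come in handy. Handy for whom, I have no idea.
So, although one probably already exists somewhere, I built a MeCab API.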
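- URL
  - http://kudaranai.jp/mecab/mj.cgi
  - Demo of the MeCab API using jQuery
- Input
  - The key is "s" and the value is a Japanese string
- Output
  - JSON format
  - An array of word objects
  - Example: 「これは犬ですか?いいえ、母です。」 ("Is this a dog? No, it is my mother.")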
[{
"surface":"これ",
"feature":["名詞","代名詞","一般","*","*","*","これ","コレ","コレ"]
},{
"surface":"は",
"feature":["助詞","係助詞","*","*","*","*","は","ハ","ワ"]
},{
"surface":"犬",
"feature":["名詞","一般","*","*","*","*","犬","イヌ","イヌ"]
},{
"surface":"です",
"feature":["助動詞","*","*","*","特殊・デス","基本形","です","デス","デス"]
},{
"surface":"か",
"feature":["助詞","副助詞/並立助詞/終助詞","*","*","*","*","か","カ","カ"]
},{
"surface":"?",
"feature":["記号","一般","*","*","*","*","?","?","?"]
},{
"surface":"いいえ",
"feature":["感動詞","*","*","*","*","*","いいえ","イイエ","イーエ"]
},{
"surface":"母",
"feature":["名詞","一般","*","*","*","*","母","ハハ","ハハ"]
},{
"surface":"です",
"feature":["助動詞","*","*","*","特殊・デス","基本形","です","デス","デス"]
},{
"surface":"。",
"feature":["記号","句点","*","*","*","*","。","。","。"]
}]
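  - The elements of the feature array are in the following order:
    - part of speech, POS subcategory 1, POS subcategory 2, POS subcategory 3, conjugation type (e.g. 特殊・デス), conjugation form (e.g. 基本形), base form, reading, pronunciation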
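For anyone who wants to poke at this from a script, here is a minimal Python sketch of a client. It rests on a few assumptions that the description above does not confirm: that the "s" key is accepted as a GET query parameter, and that the response is UTF-8. The English field names in `FEATURE_KEYS` and the helper functions are made up for illustration. Index 4 is labeled as the conjugation type and index 5 as the conjugation form because that is where 特殊・デス and 基本形 appear in the sample output.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_URL = "http://kudaranai.jp/mecab/mj.cgi"

# Hypothetical English labels for the nine slots of the "feature" array.
FEATURE_KEYS = (
    "pos", "pos_detail1", "pos_detail2", "pos_detail3",
    "conj_type", "conj_form", "base_form", "reading", "pronunciation",
)


def build_url(text):
    """Build the request URL, assuming "s" is sent as a GET query parameter."""
    return API_URL + "?" + urlencode({"s": text})


def parse_words(body):
    """Turn the JSON array of word objects into dicts with named feature fields."""
    words = []
    for word in json.loads(body):
        named = dict(zip(FEATURE_KEYS, word["feature"]))
        named["surface"] = word["surface"]
        words.append(named)
    return words


def analyze(text):
    """Call the API and parse the result (assumes a UTF-8 response body)."""
    with urlopen(build_url(text)) as res:
        return parse_words(res.read().decode("utf-8"))


if __name__ == "__main__":
    # Offline demo using one word object copied from the sample output.
    sample = '[{"surface":"犬","feature":["名詞","一般","*","*","*","*","犬","イヌ","イヌ"]}]'
    for w in parse_words(sample):
        print(w["surface"], w["pos"], w["reading"])
```

Keeping `parse_words` separate from the network call means the parsing can be tried against the sample output without the service being reachable.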
Summary
As for what kind of pointless service to build with this, I'll think about that next time.