An API that returns mecab morphological analysis results as JSON

time April 6, 2011, 06:29:30   category Silly Development   editor ギヤマシン@所長

Generate funny text automatically → learn from funny text → break text into words

So I needed a way to break text into words (morphological analysis).
It turns out a morphological analysis engine called "mecab" is publicly available, so I installed it.

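Still, wrapping it in something like an API makes it handy for all sorts of things. Handy for whom, I have no idea.
So, although one probably already exists somewhere, I made a mecab API.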
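As a rough idea of what such a wrapper has to do, here is a minimal sketch that converts MeCab's default text output (one `surface<TAB>comma-separated features` line per word, with `EOS` ending each sentence) into a list of surface/feature objects and serializes it as JSON. The function name `mecab_text_to_tokens` is hypothetical, and this is not necessarily how the API in this post is implemented.

```python
import json


def mecab_text_to_tokens(mecab_output):
    """Turn MeCab's default text output into a list of {surface, feature} dicts."""
    tokens = []
    for line in mecab_output.splitlines():
        # MeCab marks the end of each sentence with a line that is just "EOS".
        if not line or line == "EOS":
            continue
        # Default line format: surface form, a tab, then comma-separated features.
        surface, _, feature_csv = line.partition("\t")
        # Naive split: a feature that itself contains a comma would need
        # more careful handling than this sketch provides.
        tokens.append({"surface": surface, "feature": feature_csv.split(",")})
    return tokens


# Sample text in MeCab's default output format (two words taken from this post).
sample = (
    "犬\t名詞,一般,*,*,*,*,犬,イヌ,イヌ\n"
    "です\t助動詞,*,*,*,特殊・デス,基本形,です,デス,デス\n"
    "EOS\n"
)

tokens = mecab_text_to_tokens(sample)
# ensure_ascii=False keeps the Japanese text readable in the JSON string.
print(json.dumps(tokens, ensure_ascii=False))
```

In a real service the `sample` string would come from running MeCab on the request's input string rather than from a literal.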

  • URL
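  • Input
    • The key is "s" and the value is a Japanese string
  • Output
    • JSON format
    • An array of word objects
    • Example: 「これは犬ですか?いいえ、母です。」 ("Is this a dog? No, it is my mother.")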

      [{
      "surface":"これ",
      "feature":["名詞","代名詞","一般","*","*","*","これ","コレ","コレ"]
      },{
      "surface":"は",
      "feature":["助詞","係助詞","*","*","*","*","は","ハ","ワ"]
      },{
      "surface":"犬",
      "feature":["名詞","一般","*","*","*","*","犬","イヌ","イヌ"]
      },{
      "surface":"です",
      "feature":["助動詞","*","*","*","特殊・デス","基本形","です","デス","デス"]
      },{
      "surface":"か",
      "feature":["助詞","副助詞/並立助詞/終助詞","*","*","*","*","か","カ","カ"]
      },{
      "surface":"?",
      "feature":["記号","一般","*","*","*","*","?","?","?"]
      },{
      "surface":"いいえ",
      "feature":["感動詞","*","*","*","*","*","いいえ","イイエ","イーエ"]
      },{
      "surface":"母",
      "feature":["名詞","一般","*","*","*","*","母","ハハ","ハハ"]
      },{
      "surface":"です",
      "feature":["助動詞","*","*","*","特殊・デス","基本形","です","デス","デス"]
      },{
      "surface":"。",
      "feature":["記号","句点","*","*","*","*","。","。","。"]
      }]

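  • The feature array is in the following order.
    • part of speech, POS subcategory 1, POS subcategory 2, POS subcategory 3, conjugation type (活用型), conjugation form (活用形), base form, reading, pronunciation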
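To show how a client might use this, here is a small sketch that builds the `s` query parameter and attaches names to the nine feature slots of a returned word object. The English field names and the helper names `build_query` and `label_features` are my own assumptions, not part of the API, and whether the API expects GET or POST is not stated, so only the encoded parameter is built here.

```python
import json
from urllib.parse import urlencode

# Hypothetical English names for the nine feature slots. Slots 5 and 6 are
# named after the values seen in the example output (特殊・デス, then 基本形).
FEATURE_NAMES = [
    "pos", "pos_detail_1", "pos_detail_2", "pos_detail_3",
    "conjugation_type", "conjugation_form",
    "base_form", "reading", "pronunciation",
]


def build_query(sentence):
    """Encode the input as the "s" parameter (UTF-8, percent-encoded)."""
    return urlencode({"s": sentence})


def label_features(token):
    """Return the surface form plus a name -> value mapping of its features."""
    labeled = dict(zip(FEATURE_NAMES, token["feature"]))
    labeled["surface"] = token["surface"]
    return labeled


# A response body shaped like the example output, shortened to two words.
response_body = """
[{"surface":"犬","feature":["名詞","一般","*","*","*","*","犬","イヌ","イヌ"]},
 {"surface":"です","feature":["助動詞","*","*","*","特殊・デス","基本形","です","デス","デス"]}]
"""

words = [label_features(t) for t in json.loads(response_body)]
# Collect the surface forms of all nouns (品詞 == 名詞).
nouns = [w["surface"] for w in words if w["pos"] == "名詞"]
query = build_query("これは犬ですか")
```

Turning the array into named fields like this makes it easier to filter by part of speech or pull out readings without remembering index positions.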

Summary

As for what kind of silly service to build with this, I'll think about that next time.

 
