{"id":67863,"date":"2018-08-14T01:46:07","date_gmt":"2018-08-13T16:46:07","guid":{"rendered":"https:\/\/www.sejuku.net\/blog\/?p=67863"},"modified":"2024-05-06T11:50:33","modified_gmt":"2024-05-06T02:50:33","slug":"ph3-3","status":"publish","type":"post","link":"https:\/\/www.sejuku.net\/blog\/67863","title":{"rendered":"Difficult Topic Models Made Easy! How to Use the Python Library Gensim"},"content":{"rendered":"<p>In natural language processing and information retrieval, there is a technique called topic modeling. Topic models can be applied to tasks such as document classification and recommender systems. See the following page for details.<\/p>\n<p>https:\/\/www.albert2005.co.jp\/knowledge\/machine_learning\/topic_model\/about_topic_model<\/p>\n<p>Topic modeling is (in my personal opinion) a very difficult technique, but in Python a library called gensim makes it easy to use. This article introduces the basics of using gensim!<\/p>\n<h2>What is gensim?<\/h2>\n<p><img decoding=\"async\" class=\"aligncenter size-full wp-image-68079\" src=\"https:\/\/www.sejuku.net\/blog\/wp-content\/uploads\/2018\/08\/gensim_scsh.png\" alt=\"\" width=\"600\" height=\"294\" srcset=\"https:\/\/www.sejuku.net\/blog\/wp-content\/uploads\/2018\/08\/gensim_scsh.png 600w, 
https:\/\/www.sejuku.net\/blog\/wp-content\/uploads\/2018\/08\/gensim_scsh-150x74.png 150w, https:\/\/www.sejuku.net\/blog\/wp-content\/uploads\/2018\/08\/gensim_scsh-300x147.png 300w\" sizes=\"(max-width: 600px) 100vw, 600px\" \/><\/p>\n<p><a href=\"https:\/\/radimrehurek.com\/gensim\/\">[Link: official gensim page]<\/a><\/p>\n<p>gensim is a Python library that implements a variety of topic models. As its tagline &#8220;topic modeling for humans&#8221; suggests, it makes topic models, which are laborious to implement yourself, easy to use.<\/p>\n<div class=\"box01\">\n<ul>\n<li><b>LSI (Latent Semantic Indexing)<\/b><\/li>\n<li><b>LDA (Latent Dirichlet Allocation)<\/b><\/li>\n<li><b>DTM (Dynamic Topic Modeling)<\/b><\/li>\n<\/ul>\n<\/div>\n<p>These are among the topic models it implements. Word embedding methods such as word2vec are also available through gensim.<\/p>\n<h2>Basic usage of gensim<\/h2>\n<p>Let's walk through the basic usage of gensim.<\/p>\n<p>Roughly, the flow is as follows:<br \/>\n<div class=\"box01\"><\/p>\n<ul>\n<li><b>Preprocess the document data<\/b><\/li>\n<li><b>Build a dictionary (a mapping between words and their indices) from the document data<\/b><\/li>\n<li><b>Convert the document data using a count-based method, TF-IDF, or similar<\/b><\/li>\n<li><b>Pass the objects created in steps 2 and 3 to a class that implements a topic model such as LDA, and train it<\/b><\/li>\n<\/ul>\n<\/div>\n<p>That makes four steps. On top of those, we run a small end-to-end experiment here that goes as far as visualizing the topic model's results. Let's use this experiment to see how gensim is used.<\/p>\n<h3>Step 1: Preprocessing<\/h3>\n<p>Here we load the document data and clean it up so that a dictionary can be built from it. First, import the libraries.<\/p>\n<pre class=\"lang:python decode:true\">import gensim\r\nimport numpy as np\r\nfrom collections import Counter\r\nfrom sklearn import datasets\r\nimport matplotlib.pyplot as plt\r\nfrom wordcloud import WordCloud<\/pre>\n<p>Next, load the dataset.<\/p>\n<pre class=\"lang:python decode:true \">print(\"Loading 
dataset...\")\r\ntwenty_news = datasets.fetch_20newsgroups(shuffle=True, random_state=1,\r\n                             remove=('headers', 'footers', 'quotes'))<\/pre>\n<p>We use scikit-learn to load the data. The dataset loaded here is 20 Newsgroups, a collection of text documents. It is well suited to document classification and contains texts labeled with 20 classes.<\/p>\n<p>The loaded dataset is stored in the twenty_news variable, so print it to check its contents. Before using a topic model, we need to do some simple preprocessing to clean up the document data. The first task is stopword removal.<\/p>\n<p>Stopwords are words such as I, am, do, and have that occur frequently but carry little meaning. The simplest way to handle them is to filter with a stopword list.<\/p>\n<p>Here we use a saved copy of a <a href=\"https:\/\/gist.github.com\/sebleier\/554280#file-nltk-s-list-of-english-stopwords\">stopword list<\/a> found in a gist.<\/p>\n<pre class=\"lang:python decode:true \">with open(\"stopwords.txt\") as f:\r\n    # Whitespace-separated stopwords; a set makes membership tests fast\r\n    stopwords = set(f.read().split())<\/pre>\n<p>Now that we have the stopwords, we use them for filtering. That is, while dropping the words in the stopword list, we <b>reshape the data into gensim's input format<\/b>.<\/p>\n<p>Incidentally, docs in the code below is a list of str objects.<\/p>\n<p>(That is, it has the form [&#8220;document 1&#8221;, &#8220;document 2&#8221;, &#8230;].)<\/p>\n<pre class=\"lang:python decode:true\">docs = twenty_news.data\r\ntexts = [\r\n    [w for w in doc.lower().split() if w not in stopwords]\r\n        for doc in docs\r\n]<\/pre>\n<p>English has uppercase and lowercase letters.<\/p>\n<p>The code converts everything to lowercase and then removes any word found in the stopword list. Note: at this point the texts variable has the form [[&#8220;word&#8221;, &#8220;word&#8221;, &#8230;], [&#8220;word&#8221;, &#8220;word&#8221;, &#8230;], &#8230;].<\/p>\n<p>It is well known that word frequencies in text data are extremely skewed. Let's confirm this with a graph.<\/p>\n<pre class=\"lang:python decode:true \">count = Counter(w for doc in texts for w in doc)\r\ncount.most_common(10)<\/pre>\n<p>[Output]<\/p>\n<pre class=\"lang:default decode:true \">[('-', 6200),\r\n ('would', 6074),\r\n ('one', 5420),\r\n ('x', 4688),\r\n (\"don't\", 3652),\r\n (':', 3649),\r\n ('like', 3624),\r\n ('get', 3447),\r\n ('people', 3387),\r\n (\"max&gt;'ax&gt;'ax&gt;'ax&gt;'ax&gt;'ax&gt;'ax&gt;'ax&gt;'ax&gt;'ax&gt;'ax&gt;'ax&gt;'ax&gt;'ax&gt;'ax&gt;'\", 3289)]\r\n<\/pre>\n<p>The most_common method of the Counter class returns each element together with the number of times it occurs, ordered from most to least frequent as shown above. I used this to draw a graph.<\/p>\n<pre class=\"lang:python decode:true \"># Frequencies only, ordered from most to least common\r\ny = [freq for _, freq in count.most_common()]\r\nplt.plot(y)<\/pre>\n<p><img decoding=\"async\" class=\"aligncenter size-full wp-image-68092\" src=\"https:\/\/www.sejuku.net\/blog\/wp-content\/uploads\/2018\/08\/hist1.png\" alt=\"\" width=\"400\" height=\"252\" srcset=\"https:\/\/www.sejuku.net\/blog\/wp-content\/uploads\/2018\/08\/hist1.png 400w, https:\/\/www.sejuku.net\/blog\/wp-content\/uploads\/2018\/08\/hist1-150x95.png 150w, 
https:\/\/www.sejuku.net\/blog\/wp-content\/uploads\/2018\/08\/hist1-300x189.png 300w\" sizes=\"(max-width: 400px) 100vw, 400px\" \/><\/p>\n<p>The x-axis is the words, ordered from most to least frequent (there are roughly 200,000 distinct words). The y-axis is the number of occurrences of each word.<\/p>\n<p>The words clearly split into a mere handful of high-frequency words and the rest, which hardly appear at all. On a logarithmic scale, the plot looks like this:<\/p>\n<p><img decoding=\"async\" class=\"aligncenter size-full wp-image-68094\" src=\"https:\/\/www.sejuku.net\/blog\/wp-content\/uploads\/2018\/08\/hist2.png\" alt=\"\" width=\"381\" height=\"252\" srcset=\"https:\/\/www.sejuku.net\/blog\/wp-content\/uploads\/2018\/08\/hist2.png 381w, https:\/\/www.sejuku.net\/blog\/wp-content\/uploads\/2018\/08\/hist2-150x99.png 150w, https:\/\/www.sejuku.net\/blog\/wp-content\/uploads\/2018\/08\/hist2-300x198.png 300w, https:\/\/www.sejuku.net\/blog\/wp-content\/uploads\/2018\/08\/hist2-380x252.png 380w\" sizes=\"(max-width: 381px) 100vw, 381px\" 
\/><\/p>\n<p>It appears that most words occur only once. When using topic models, a common preprocessing step is to filter out words with very low frequency, such as those that occur only once, as well as words with extremely high frequency.<\/p>\n<p>We adopt that approach here. Note: to keep things simple, run this code after completing Step 2 and Step 3, because it operates on the corpus variable created there.<\/p>\n<pre class=\"lang:python decode:true\"># Vocabulary size and the rank that marks the top 5% most frequent words\r\nnum_tokens = len(count)\r\nN = int(num_tokens * 0.05)\r\n# Frequency of the word at that rank, used as the upper cutoff\r\nmax_frequency = count.most_common()[N][1]\r\n\r\n# Each corpus entry is (word index, count within that document)\r\ncorpus = [[(i, n) for i, n in doc if max_frequency &gt; n &gt;= 3] for doc in corpus]\r\n<\/pre>\n<p>This keeps only the entries whose count is at least 3 and below the frequency of the word ranked at the top 5%. Note that the count stored in each corpus entry is the word's count within that document, not its total across the whole corpus. To filter on corpus-wide frequency instead, look each word up in count, or use gensim's Dictionary.filter_extremes method.<\/p>\n<h3>Step 2: Creating the dictionary, Step 3: Creating the corpus<\/h3>\n<p>Once preprocessing is done, create the dictionary. The dictionary used by gensim is created with the gensim.corpora.Dictionary class. As mentioned earlier, this dictionary holds pairs of words and indices.<\/p>\n<p>Using it, we then convert the document data into a form that gensim can consume. The doc2bow method performs this conversion.<\/p>\n<pre class=\"lang:python decode:true\">dictionary = gensim.corpora.Dictionary(texts)\r\ncorpus = [dictionary.doc2bow(text) for text in texts]\r\ncorpus[0]<\/pre>\n<p>[Output]<\/p>\n<pre class=\"\">[(18, 3),\r\n  (32, 3),\r\n  (38, 4),\r\n  (78, 3),\r\n  (81, 3),\r\n  (83, 15),\r\n  (105, 4),\r\n  (109, 4),\r\n  (113, 8),\r\n  (148, 10),\r\n  ...(omitted)...\r\n]<\/pre>\n<p>The converted data is a list that contains, for each document, a list of (index, count) tuples.<\/p>\n<p>Look at the output. It has the form [(word index, count), &#8230;]. This list corresponds to a single document, so the corpus variable holds a list containing one such list per document.<\/p>\n<h3>Training the topic model<\/h3>\n<p>We now have an object (corpus) that can be used as input data for gensim. Using it together with the dictionary object (dictionary), let's try training an LDA model.<\/p>\n<p>To use the LdaModel class, you will typically set three parameters: corpus, num_topics, and id2word. With LDA the user has to decide the number of topics, so specify num_topics as an int.<\/p>\n<p>For id2word, simply pass the dictionary object.<\/p>\n<pre class=\"lang:python decode:true\">num_topics = 20\r\n\r\nlda = gensim.models.ldamodel.LdaModel(\r\n    corpus=corpus,\r\n    num_topics=num_topics,\r\n    
id2word=dictionary\r\n)<\/pre>\n<p>That completes training. Be aware that it can take a while depending on the size of the data.<\/p>\n<h3>Checking the results<\/h3>\n<p>Let's visualize the topics. This time we use word clouds. The size of each word reflects how high its probability is within that topic.<\/p>\n<pre class=\"lang:python decode:true\">plt.figure(figsize=(30,30))\r\nfor t in range(lda.num_topics):\r\n    # 5 x 4 grid, one word cloud per topic (fits the 20 topics used here)\r\n    plt.subplot(5,4,t+1)\r\n    # Top 200 words of topic t as {word: probability}\r\n    x = dict(lda.show_topic(t,200))\r\n    im = WordCloud().generate_from_frequencies(x)\r\n    plt.imshow(im)\r\n    plt.axis(\"off\")\r\n    plt.title(\"Topic #\" + str(t))\r\n<\/pre>\n<p><img decoding=\"async\" class=\"aligncenter size-large wp-image-68106\" src=\"https:\/\/www.sejuku.net\/blog\/wp-content\/uploads\/2018\/08\/tpvis-640x586.png\" alt=\"\" width=\"640\" height=\"586\" srcset=\"https:\/\/www.sejuku.net\/blog\/wp-content\/uploads\/2018\/08\/tpvis-640x586.png 640w, https:\/\/www.sejuku.net\/blog\/wp-content\/uploads\/2018\/08\/tpvis-150x137.png 150w, https:\/\/www.sejuku.net\/blog\/wp-content\/uploads\/2018\/08\/tpvis-300x275.png 300w, https:\/\/www.sejuku.net\/blog\/wp-content\/uploads\/2018\/08\/tpvis-810x741.png 810w, https:\/\/www.sejuku.net\/blog\/wp-content\/uploads\/2018\/08\/tpvis-1140x1043.png 1140w, https:\/\/www.sejuku.net\/blog\/wp-content\/uploads\/2018\/08\/tpvis.png 1724w\" sizes=\"(max-width: 640px) 100vw, 640px\" 
\/><\/p>\n<p>The top left is topic 0 and the bottom right is topic 19. Since I did little tuning this time and only minimal preprocessing, the topics did not come out very cleanly.<\/p>\n<p>Still, some topics, such as topic 14 and topic 17, do seem to have a degree of semantic coherence. It would be interesting to study how to tune the model to produce cleaner topics, or how to change the visualization.<\/p>\n<h2>Summary<\/h2>\n<p>This article introduced how to use gensim by running through a simple experiment from start to finish. gensim implements many models besides the LDA covered here.<\/p>\n<p>I hope this gave you a sense that there are interesting machine learning methods beyond deep learning, and that libraries like gensim make them easy to try.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In natural language processing and information retrieval, there is a technique called topic modeling. Topic models can be applied to tasks such as document classification and recommender systems. See the following page for details. https:\/\/www.albert2005.co. [&hellip;]<\/p>\n","protected":false},"author":4,"featured_media":68077,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"inline_featured_image":false,"swell_btn_cv_data":"","footnotes":""},"categories":[2350],"tags":[1281,49],"class_list":["post-67863","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-python-study","tag-ai","tag-python"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>\u96e3\u3057\u3044\u30c8\u30d4\u30c3\u30af\u30e2\u30c7\u30eb\u3092\u7c21\u5358\u306b\uff01Python\u30e9\u30a4\u30d6\u30e9\u30eaGensim\u306e\u4f7f\u3044\u65b9 | \u4f8d\u30a8\u30f3\u30b8\u30cb\u30a2\u30d6\u30ed\u30b0<\/title>\n<meta name=\"description\" content=\"\u3053\u306e\u8a18\u4e8b\u3067\u306f\u300c 
\u96e3\u3057\u3044\u30c8\u30d4\u30c3\u30af\u30e2\u30c7\u30eb\u3092\u7c21\u5358\u306b\uff01Python\u30e9\u30a4\u30d6\u30e9\u30eaGensim\u306e\u4f7f\u3044\u65b9 \u300d\u306b\u3064\u3044\u3066\u3001\u8ab0\u3067\u3082\u7406\u89e3\u3067\u304d\u308b\u3088\u3046\u306b\u89e3\u8aac\u3057\u307e\u3059\u3002\u3053\u306e\u8a18\u4e8b\u3092\u8aad\u3081\u3070\u3001\u3042\u306a\u305f\u306e\u60a9\u307f\u304c\u89e3\u6c7a\u3059\u308b\u3060\u3051\u3058\u3083\u306a\u304f\u3001\u65b0\u305f\u306a\u6c17\u4ed8\u304d\u3082\u767a\u898b\u3067\u304d\u308b\u3053\u3068\u3067\u3057\u3087\u3046\u3002\u304a\u60a9\u307f\u306e\u65b9\u306f\u305c\u3072\u3054\u4e00\u8aad\u304f\u3060\u3055\u3044\u3002\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.sejuku.net\/blog\/67863\" \/>\n<meta property=\"og:locale\" content=\"ja_JP\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"\u96e3\u3057\u3044\u30c8\u30d4\u30c3\u30af\u30e2\u30c7\u30eb\u3092\u7c21\u5358\u306b\uff01Python\u30e9\u30a4\u30d6\u30e9\u30eaGensim\u306e\u4f7f\u3044\u65b9 | \u4f8d\u30a8\u30f3\u30b8\u30cb\u30a2\u30d6\u30ed\u30b0\" \/>\n<meta property=\"og:description\" content=\"\u3053\u306e\u8a18\u4e8b\u3067\u306f\u300c \u96e3\u3057\u3044\u30c8\u30d4\u30c3\u30af\u30e2\u30c7\u30eb\u3092\u7c21\u5358\u306b\uff01Python\u30e9\u30a4\u30d6\u30e9\u30eaGensim\u306e\u4f7f\u3044\u65b9 \u300d\u306b\u3064\u3044\u3066\u3001\u8ab0\u3067\u3082\u7406\u89e3\u3067\u304d\u308b\u3088\u3046\u306b\u89e3\u8aac\u3057\u307e\u3059\u3002\u3053\u306e\u8a18\u4e8b\u3092\u8aad\u3081\u3070\u3001\u3042\u306a\u305f\u306e\u60a9\u307f\u304c\u89e3\u6c7a\u3059\u308b\u3060\u3051\u3058\u3083\u306a\u304f\u3001\u65b0\u305f\u306a\u6c17\u4ed8\u304d\u3082\u767a\u898b\u3067\u304d\u308b\u3053\u3068\u3067\u3057\u3087\u3046\u3002\u304a\u60a9\u307f\u306e\u65b9\u306f\u305c\u3072\u3054\u4e00\u8aad\u304f\u3060\u3055\u3044\u3002\" \/>\n<meta 
property=\"og:url\" content=\"https:\/\/www.sejuku.net\/blog\/67863\" \/>\n<meta property=\"og:site_name\" content=\"\u4f8d\u30a8\u30f3\u30b8\u30cb\u30a2\u30d6\u30ed\u30b0\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/sejuku2013\" \/>\n<meta property=\"article:author\" content=\"https:\/\/www.facebook.com\/sejuku2013\/\" \/>\n<meta property=\"article:published_time\" content=\"2018-08-13T16:46:07+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2024-05-06T02:50:33+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.sejuku.net\/blog\/wp-content\/uploads\/2018\/08\/gensim_eye.png\" \/>\n\t<meta property=\"og:image:width\" content=\"700\" \/>\n\t<meta property=\"og:image:height\" content=\"400\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"\u4f8d\u30a8\u30f3\u30b8\u30cb\u30a2\u7de8\u96c6\u90e8\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@https:\/\/twitter.com\/samuraijuku\" \/>\n<meta name=\"twitter:site\" content=\"@samuraijuku\" \/>\n<script type=\"application\/ld+json\" 
class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/www.sejuku.net\\\/blog\\\/67863#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.sejuku.net\\\/blog\\\/67863\"},\"author\":{\"name\":\"\u4f8d\u30a8\u30f3\u30b8\u30cb\u30a2\u7de8\u96c6\u90e8\",\"@id\":\"https:\\\/\\\/www.sejuku.net\\\/blog\\\/#\\\/schema\\\/person\\\/e8ca7fd09857a736a25e6b4455a3ab61\"},\"headline\":\"\u96e3\u3057\u3044\u30c8\u30d4\u30c3\u30af\u30e2\u30c7\u30eb\u3092\u7c21\u5358\u306b\uff01Python\u30e9\u30a4\u30d6\u30e9\u30eaGensim\u306e\u4f7f\u3044\u65b9\",\"datePublished\":\"2018-08-13T16:46:07+00:00\",\"dateModified\":\"2024-05-06T02:50:33+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/www.sejuku.net\\\/blog\\\/67863\"},\"wordCount\":126,\"publisher\":{\"@id\":\"https:\\\/\\\/www.sejuku.net\\\/blog\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/www.sejuku.net\\\/blog\\\/67863#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/www.sejuku.net\\\/blog\\\/wp-content\\\/uploads\\\/2018\\\/08\\\/gensim_eye.png\",\"keywords\":[\"AI\",\"python\"],\"articleSection\":[\"Python\u5b66\u7fd2\"],\"inLanguage\":\"ja\"},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/www.sejuku.net\\\/blog\\\/67863\",\"url\":\"https:\\\/\\\/www.sejuku.net\\\/blog\\\/67863\",\"name\":\"\u96e3\u3057\u3044\u30c8\u30d4\u30c3\u30af\u30e2\u30c7\u30eb\u3092\u7c21\u5358\u306b\uff01Python\u30e9\u30a4\u30d6\u30e9\u30eaGensim\u306e\u4f7f\u3044\u65b9 | 
\u4f8d\u30a8\u30f3\u30b8\u30cb\u30a2\u30d6\u30ed\u30b0\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.sejuku.net\\\/blog\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/www.sejuku.net\\\/blog\\\/67863#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/www.sejuku.net\\\/blog\\\/67863#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/www.sejuku.net\\\/blog\\\/wp-content\\\/uploads\\\/2018\\\/08\\\/gensim_eye.png\",\"datePublished\":\"2018-08-13T16:46:07+00:00\",\"dateModified\":\"2024-05-06T02:50:33+00:00\",\"description\":\"\u3053\u306e\u8a18\u4e8b\u3067\u306f\u300c \u96e3\u3057\u3044\u30c8\u30d4\u30c3\u30af\u30e2\u30c7\u30eb\u3092\u7c21\u5358\u306b\uff01Python\u30e9\u30a4\u30d6\u30e9\u30eaGensim\u306e\u4f7f\u3044\u65b9 \u300d\u306b\u3064\u3044\u3066\u3001\u8ab0\u3067\u3082\u7406\u89e3\u3067\u304d\u308b\u3088\u3046\u306b\u89e3\u8aac\u3057\u307e\u3059\u3002\u3053\u306e\u8a18\u4e8b\u3092\u8aad\u3081\u3070\u3001\u3042\u306a\u305f\u306e\u60a9\u307f\u304c\u89e3\u6c7a\u3059\u308b\u3060\u3051\u3058\u3083\u306a\u304f\u3001\u65b0\u305f\u306a\u6c17\u4ed8\u304d\u3082\u767a\u898b\u3067\u304d\u308b\u3053\u3068\u3067\u3057\u3087\u3046\u3002\u304a\u60a9\u307f\u306e\u65b9\u306f\u305c\u3072\u3054\u4e00\u8aad\u304f\u3060\u3055\u3044\u3002\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/www.sejuku.net\\\/blog\\\/67863#breadcrumb\"},\"inLanguage\":\"ja\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/www.sejuku.net\\\/blog\\\/67863\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"ja\",\"@id\":\"https:\\\/\\\/www.sejuku.net\\\/blog\\\/67863#primaryimage\",\"url\":\"https:\\\/\\\/www.sejuku.net\\\/blog\\\/wp-content\\\/uploads\\\/2018\\\/08\\\/gensim_eye.png\",\"contentUrl\":\"https:\\\/\\\/www.sejuku.net\\\/blog\\\/wp-content\\\/uploads\\\/2018\\\/08\\\/gensim_eye.png\",\"width\":700,\"height\":400},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/www.sejuku.net\\\/blog\\\/67863#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":
1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/www.sejuku.net\\\/blog\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"\u96e3\u3057\u3044\u30c8\u30d4\u30c3\u30af\u30e2\u30c7\u30eb\u3092\u7c21\u5358\u306b\uff01Python\u30e9\u30a4\u30d6\u30e9\u30eaGensim\u306e\u4f7f\u3044\u65b9\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/www.sejuku.net\\\/blog\\\/#website\",\"url\":\"https:\\\/\\\/www.sejuku.net\\\/blog\\\/\",\"name\":\"\u4f8d\u30a8\u30f3\u30b8\u30cb\u30a2\u30d6\u30ed\u30b0\",\"description\":\"\u30d7\u30ed\u30b0\u30e9\u30df\u30f3\u30b0\u5b66\u7fd2\u306e\u3059\u3079\u3066\u304c\u30b3\u30b3\u306b\u3002\",\"publisher\":{\"@id\":\"https:\\\/\\\/www.sejuku.net\\\/blog\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/www.sejuku.net\\\/blog\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"ja\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/www.sejuku.net\\\/blog\\\/#organization\",\"name\":\"\u682a\u5f0f\u4f1a\u793eSAMURAI\",\"url\":\"https:\\\/\\\/www.sejuku.net\\\/blog\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"ja\",\"@id\":\"https:\\\/\\\/www.sejuku.net\\\/blog\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/www.sejuku.net\\\/blog\\\/wp-content\\\/uploads\\\/2023\\\/07\\\/logo.png\",\"contentUrl\":\"https:\\\/\\\/www.sejuku.net\\\/blog\\\/wp-content\\\/uploads\\\/2023\\\/07\\\/logo.png\",\"width\":600,\"height\":600,\"caption\":\"\u682a\u5f0f\u4f1a\u793eSAMURAI\"},\"image\":{\"@id\":\"https:\\\/\\\/www.sejuku.net\\\/blog\\\/#\\\/schema\\\/logo\\\/image\\\/\"},\"sameAs\":[\"https:\\\/\\\/www.facebook.com\\\/sejuku2013\",\"https:\\\/\\\/x.com\\\/samuraijuku\",\"https:\\\/\\\/www.youtube.com\\\/channel\\\/UCCFOQO5aDK0xXam4cUQXT8g\\\/featured\"]},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/www.sejuku.net\\\/blog\\\/#\\\/s
chema\\\/person\\\/e8ca7fd09857a736a25e6b4455a3ab61\",\"name\":\"\u4f8d\u30a8\u30f3\u30b8\u30cb\u30a2\u7de8\u96c6\u90e8\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"ja\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/507c280c5c67d2c11fec4fdba20e5bf1ec2fe91f9deb42d2ec50382778b311bf?s=96&d=mm&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/507c280c5c67d2c11fec4fdba20e5bf1ec2fe91f9deb42d2ec50382778b311bf?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/507c280c5c67d2c11fec4fdba20e5bf1ec2fe91f9deb42d2ec50382778b311bf?s=96&d=mm&r=g\",\"caption\":\"\u4f8d\u30a8\u30f3\u30b8\u30cb\u30a2\u7de8\u96c6\u90e8\"},\"description\":\"\u3010\u30d7\u30ed\u30d5\u30a3\u30fc\u30eb\u3011 DX\u8a8d\u5b9a\u53d6\u5f97\u4e8b\u696d\u8005\u306b\u9078\u5b9a\u3055\u308c\u3066\u3044\u308b\u682a\u5f0f\u4f1a\u793eSAMURAI\u306e\u30de\u30fc\u30b1\u30c6\u30a3\u30f3\u30b0\u30fb\u30b3\u30df\u30e5\u30cb\u30b1\u30fc\u30b7\u30e7\u30f3\u90e8\u304c\u904b\u55b6\u3002\u300c\u8cea\u306e\u9ad8\u3044IT\u6559\u80b2\u3092\u3001\u3059\u3079\u3066\u306e\u4eba\u306b\u300d\u3092\u30df\u30c3\u30b7\u30e7\u30f3\u306b\u3001IT\u30fb\u30d7\u30ed\u30b0\u30e9\u30df\u30f3\u30b0\u3092\u5b66\u3073\u59cb\u3081\u305f\u521d\u5b66\u8005\u306e\u65b9\u306b\u5411\u3051\u8a18\u4e8b\u3092\u57f7\u7b46\u3002 
\u7d2f\u8a08\u6307\u5c0e\u8005\u65704\u4e075,000\u540d\u4ee5\u4e0a\u306e\u30d7\u30ed\u30b0\u30e9\u30df\u30f3\u30b0\u30b9\u30af\u30fc\u30eb\u300c\u4f8d\u30a8\u30f3\u30b8\u30cb\u30a2\u300d\u3001\u7d2f\u8a08\u767b\u9332\u8005\u65701\u4e078,000\u4eba\u4ee5\u4e0a\u306e\u30aa\u30f3\u30e9\u30a4\u30f3\u5b66\u7fd2\u30b5\u30fc\u30d3\u30b9\u300c\u4f8d\u30c6\u30e9\u30b3\u30e4\u300d\u3067\u6271\u3046\u6559\u6750\u958b\u767a\u306e\u30ce\u30a6\u30cf\u30a6\u30012013\u5e74\u306e\u5275\u696d\u304b\u3089\u904b\u55b6\u3067\u5f97\u305f\u77e5\u898b\u306b\u57fa\u3065\u304d\u3001\u8a18\u4e8b\u306e\u57f7\u7b46\u3060\u3051\u3067\u306a\u304f\u7de8\u96c6\u30fb\u76e3\u4fee\u3082\u62c5\u5f53\u3057\u3066\u3044\u307e\u3059\u3002 \u3010\u5c02\u9580\u5206\u91ce\u3011 IT\\\/Web\u958b\u767a\\\/AI\u30fb\u30ed\u30dc\u30c3\u30c8\u958b\u767a\\\/\u30a4\u30f3\u30d5\u30e9\u958b\u767a\\\/\u30b2\u30fc\u30e0\u958b\u767a\\\/AI\\\/Web\u30c7\u30b6\u30a4\u30f3\",\"sameAs\":[\"https:\\\/\\\/www.sejuku.net\\\/\",\"https:\\\/\\\/www.facebook.com\\\/sejuku2013\\\/\",\"https:\\\/\\\/www.instagram.com\\\/samuraiengineer_official\\\/\",\"https:\\\/\\\/x.com\\\/samuraijuku\",\"https:\\\/\\\/www.youtube.com\\\/channel\\\/UCCFOQO5aDK0xXam4cUQXT8g\"],\"url\":\"https:\\\/\\\/www.sejuku.net\\\/blog\\\/author\\\/samurai-blog\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"\u96e3\u3057\u3044\u30c8\u30d4\u30c3\u30af\u30e2\u30c7\u30eb\u3092\u7c21\u5358\u306b\uff01Python\u30e9\u30a4\u30d6\u30e9\u30eaGensim\u306e\u4f7f\u3044\u65b9 | \u4f8d\u30a8\u30f3\u30b8\u30cb\u30a2\u30d6\u30ed\u30b0","description":"\u3053\u306e\u8a18\u4e8b\u3067\u306f\u300c \u96e3\u3057\u3044\u30c8\u30d4\u30c3\u30af\u30e2\u30c7\u30eb\u3092\u7c21\u5358\u306b\uff01Python\u30e9\u30a4\u30d6\u30e9\u30eaGensim\u306e\u4f7f\u3044\u65b9 \u300d\u306b\u3064\u3044\u3066\u3001\u8ab0\u3067\u3082\u7406\u89e3\u3067\u304d\u308b\u3088\u3046\u306b\u89e3\u8aac\u3057\u307e\u3059\u3002\u3053\u306e\u8a18\u4e8b\u3092\u8aad\u3081\u3070\u3001\u3042\u306a\u305f\u306e\u60a9\u307f\u304c\u89e3\u6c7a\u3059\u308b\u3060\u3051\u3058\u3083\u306a\u304f\u3001\u65b0\u305f\u306a\u6c17\u4ed8\u304d\u3082\u767a\u898b\u3067\u304d\u308b\u3053\u3068\u3067\u3057\u3087\u3046\u3002\u304a\u60a9\u307f\u306e\u65b9\u306f\u305c\u3072\u3054\u4e00\u8aad\u304f\u3060\u3055\u3044\u3002","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.sejuku.net\/blog\/67863","og_locale":"ja_JP","og_type":"article","og_title":"\u96e3\u3057\u3044\u30c8\u30d4\u30c3\u30af\u30e2\u30c7\u30eb\u3092\u7c21\u5358\u306b\uff01Python\u30e9\u30a4\u30d6\u30e9\u30eaGensim\u306e\u4f7f\u3044\u65b9 | \u4f8d\u30a8\u30f3\u30b8\u30cb\u30a2\u30d6\u30ed\u30b0","og_description":"\u3053\u306e\u8a18\u4e8b\u3067\u306f\u300c \u96e3\u3057\u3044\u30c8\u30d4\u30c3\u30af\u30e2\u30c7\u30eb\u3092\u7c21\u5358\u306b\uff01Python\u30e9\u30a4\u30d6\u30e9\u30eaGensim\u306e\u4f7f\u3044\u65b9 
\u300d\u306b\u3064\u3044\u3066\u3001\u8ab0\u3067\u3082\u7406\u89e3\u3067\u304d\u308b\u3088\u3046\u306b\u89e3\u8aac\u3057\u307e\u3059\u3002\u3053\u306e\u8a18\u4e8b\u3092\u8aad\u3081\u3070\u3001\u3042\u306a\u305f\u306e\u60a9\u307f\u304c\u89e3\u6c7a\u3059\u308b\u3060\u3051\u3058\u3083\u306a\u304f\u3001\u65b0\u305f\u306a\u6c17\u4ed8\u304d\u3082\u767a\u898b\u3067\u304d\u308b\u3053\u3068\u3067\u3057\u3087\u3046\u3002\u304a\u60a9\u307f\u306e\u65b9\u306f\u305c\u3072\u3054\u4e00\u8aad\u304f\u3060\u3055\u3044\u3002","og_url":"https:\/\/www.sejuku.net\/blog\/67863","og_site_name":"\u4f8d\u30a8\u30f3\u30b8\u30cb\u30a2\u30d6\u30ed\u30b0","article_publisher":"https:\/\/www.facebook.com\/sejuku2013","article_author":"https:\/\/www.facebook.com\/sejuku2013\/","article_published_time":"2018-08-13T16:46:07+00:00","article_modified_time":"2024-05-06T02:50:33+00:00","og_image":[{"width":700,"height":400,"url":"https:\/\/www.sejuku.net\/blog\/wp-content\/uploads\/2018\/08\/gensim_eye.png","type":"image\/png"}],"author":"\u4f8d\u30a8\u30f3\u30b8\u30cb\u30a2\u7de8\u96c6\u90e8","twitter_card":"summary_large_image","twitter_creator":"@https:\/\/twitter.com\/samuraijuku","twitter_site":"@samuraijuku","schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.sejuku.net\/blog\/67863#article","isPartOf":{"@id":"https:\/\/www.sejuku.net\/blog\/67863"},"author":{"name":"\u4f8d\u30a8\u30f3\u30b8\u30cb\u30a2\u7de8\u96c6\u90e8","@id":"https:\/\/www.sejuku.net\/blog\/#\/schema\/person\/e8ca7fd09857a736a25e6b4455a3ab61"},"headline":"\u96e3\u3057\u3044\u30c8\u30d4\u30c3\u30af\u30e2\u30c7\u30eb\u3092\u7c21\u5358\u306b\uff01Python\u30e9\u30a4\u30d6\u30e9\u30eaGensim\u306e\u4f7f\u3044\u65b9","datePublished":"2018-08-13T16:46:07+00:00","dateModified":"2024-05-06T02:50:33+00:00","mainEntityOfPage":{"@id":"https:\/\/www.sejuku.net\/blog\/67863"},"wordCount":126,"publisher":{"@id":"https:\/\/www.sejuku.net\/blog\/#organization"},"image":{"@id":"https:\/\/www.sejuku.net\/blo
g\/67863#primaryimage"},"thumbnailUrl":"https:\/\/www.sejuku.net\/blog\/wp-content\/uploads\/2018\/08\/gensim_eye.png","keywords":["AI","python"],"articleSection":["Python\u5b66\u7fd2"],"inLanguage":"ja"},{"@type":"WebPage","@id":"https:\/\/www.sejuku.net\/blog\/67863","url":"https:\/\/www.sejuku.net\/blog\/67863","name":"\u96e3\u3057\u3044\u30c8\u30d4\u30c3\u30af\u30e2\u30c7\u30eb\u3092\u7c21\u5358\u306b\uff01Python\u30e9\u30a4\u30d6\u30e9\u30eaGensim\u306e\u4f7f\u3044\u65b9 | \u4f8d\u30a8\u30f3\u30b8\u30cb\u30a2\u30d6\u30ed\u30b0","isPartOf":{"@id":"https:\/\/www.sejuku.net\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.sejuku.net\/blog\/67863#primaryimage"},"image":{"@id":"https:\/\/www.sejuku.net\/blog\/67863#primaryimage"},"thumbnailUrl":"https:\/\/www.sejuku.net\/blog\/wp-content\/uploads\/2018\/08\/gensim_eye.png","datePublished":"2018-08-13T16:46:07+00:00","dateModified":"2024-05-06T02:50:33+00:00","description":"\u3053\u306e\u8a18\u4e8b\u3067\u306f\u300c \u96e3\u3057\u3044\u30c8\u30d4\u30c3\u30af\u30e2\u30c7\u30eb\u3092\u7c21\u5358\u306b\uff01Python\u30e9\u30a4\u30d6\u30e9\u30eaGensim\u306e\u4f7f\u3044\u65b9 
\u300d\u306b\u3064\u3044\u3066\u3001\u8ab0\u3067\u3082\u7406\u89e3\u3067\u304d\u308b\u3088\u3046\u306b\u89e3\u8aac\u3057\u307e\u3059\u3002\u3053\u306e\u8a18\u4e8b\u3092\u8aad\u3081\u3070\u3001\u3042\u306a\u305f\u306e\u60a9\u307f\u304c\u89e3\u6c7a\u3059\u308b\u3060\u3051\u3058\u3083\u306a\u304f\u3001\u65b0\u305f\u306a\u6c17\u4ed8\u304d\u3082\u767a\u898b\u3067\u304d\u308b\u3053\u3068\u3067\u3057\u3087\u3046\u3002\u304a\u60a9\u307f\u306e\u65b9\u306f\u305c\u3072\u3054\u4e00\u8aad\u304f\u3060\u3055\u3044\u3002","breadcrumb":{"@id":"https:\/\/www.sejuku.net\/blog\/67863#breadcrumb"},"inLanguage":"ja","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.sejuku.net\/blog\/67863"]}]},{"@type":"ImageObject","inLanguage":"ja","@id":"https:\/\/www.sejuku.net\/blog\/67863#primaryimage","url":"https:\/\/www.sejuku.net\/blog\/wp-content\/uploads\/2018\/08\/gensim_eye.png","contentUrl":"https:\/\/www.sejuku.net\/blog\/wp-content\/uploads\/2018\/08\/gensim_eye.png","width":700,"height":400},{"@type":"BreadcrumbList","@id":"https:\/\/www.sejuku.net\/blog\/67863#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.sejuku.net\/blog\/"},{"@type":"ListItem","position":2,"name":"\u96e3\u3057\u3044\u30c8\u30d4\u30c3\u30af\u30e2\u30c7\u30eb\u3092\u7c21\u5358\u306b\uff01Python\u30e9\u30a4\u30d6\u30e9\u30eaGensim\u306e\u4f7f\u3044\u65b9"}]},{"@type":"WebSite","@id":"https:\/\/www.sejuku.net\/blog\/#website","url":"https:\/\/www.sejuku.net\/blog\/","name":"\u4f8d\u30a8\u30f3\u30b8\u30cb\u30a2\u30d6\u30ed\u30b0","description":"\u30d7\u30ed\u30b0\u30e9\u30df\u30f3\u30b0\u5b66\u7fd2\u306e\u3059\u3079\u3066\u304c\u30b3\u30b3\u306b\u3002","publisher":{"@id":"https:\/\/www.sejuku.net\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.sejuku.net\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"searc
h_term_string"}}],"inLanguage":"ja"},{"@type":"Organization","@id":"https:\/\/www.sejuku.net\/blog\/#organization","name":"\u682a\u5f0f\u4f1a\u793eSAMURAI","url":"https:\/\/www.sejuku.net\/blog\/","logo":{"@type":"ImageObject","inLanguage":"ja","@id":"https:\/\/www.sejuku.net\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.sejuku.net\/blog\/wp-content\/uploads\/2023\/07\/logo.png","contentUrl":"https:\/\/www.sejuku.net\/blog\/wp-content\/uploads\/2023\/07\/logo.png","width":600,"height":600,"caption":"\u682a\u5f0f\u4f1a\u793eSAMURAI"},"image":{"@id":"https:\/\/www.sejuku.net\/blog\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/sejuku2013","https:\/\/x.com\/samuraijuku","https:\/\/www.youtube.com\/channel\/UCCFOQO5aDK0xXam4cUQXT8g\/featured"]},{"@type":"Person","@id":"https:\/\/www.sejuku.net\/blog\/#\/schema\/person\/e8ca7fd09857a736a25e6b4455a3ab61","name":"\u4f8d\u30a8\u30f3\u30b8\u30cb\u30a2\u7de8\u96c6\u90e8","image":{"@type":"ImageObject","inLanguage":"ja","@id":"https:\/\/secure.gravatar.com\/avatar\/507c280c5c67d2c11fec4fdba20e5bf1ec2fe91f9deb42d2ec50382778b311bf?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/507c280c5c67d2c11fec4fdba20e5bf1ec2fe91f9deb42d2ec50382778b311bf?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/507c280c5c67d2c11fec4fdba20e5bf1ec2fe91f9deb42d2ec50382778b311bf?s=96&d=mm&r=g","caption":"\u4f8d\u30a8\u30f3\u30b8\u30cb\u30a2\u7de8\u96c6\u90e8"},"description":"\u3010\u30d7\u30ed\u30d5\u30a3\u30fc\u30eb\u3011 
DX\u8a8d\u5b9a\u53d6\u5f97\u4e8b\u696d\u8005\u306b\u9078\u5b9a\u3055\u308c\u3066\u3044\u308b\u682a\u5f0f\u4f1a\u793eSAMURAI\u306e\u30de\u30fc\u30b1\u30c6\u30a3\u30f3\u30b0\u30fb\u30b3\u30df\u30e5\u30cb\u30b1\u30fc\u30b7\u30e7\u30f3\u90e8\u304c\u904b\u55b6\u3002\u300c\u8cea\u306e\u9ad8\u3044IT\u6559\u80b2\u3092\u3001\u3059\u3079\u3066\u306e\u4eba\u306b\u300d\u3092\u30df\u30c3\u30b7\u30e7\u30f3\u306b\u3001IT\u30fb\u30d7\u30ed\u30b0\u30e9\u30df\u30f3\u30b0\u3092\u5b66\u3073\u59cb\u3081\u305f\u521d\u5b66\u8005\u306e\u65b9\u306b\u5411\u3051\u8a18\u4e8b\u3092\u57f7\u7b46\u3002 \u7d2f\u8a08\u6307\u5c0e\u8005\u65704\u4e075,000\u540d\u4ee5\u4e0a\u306e\u30d7\u30ed\u30b0\u30e9\u30df\u30f3\u30b0\u30b9\u30af\u30fc\u30eb\u300c\u4f8d\u30a8\u30f3\u30b8\u30cb\u30a2\u300d\u3001\u7d2f\u8a08\u767b\u9332\u8005\u65701\u4e078,000\u4eba\u4ee5\u4e0a\u306e\u30aa\u30f3\u30e9\u30a4\u30f3\u5b66\u7fd2\u30b5\u30fc\u30d3\u30b9\u300c\u4f8d\u30c6\u30e9\u30b3\u30e4\u300d\u3067\u6271\u3046\u6559\u6750\u958b\u767a\u306e\u30ce\u30a6\u30cf\u30a6\u30012013\u5e74\u306e\u5275\u696d\u304b\u3089\u904b\u55b6\u3067\u5f97\u305f\u77e5\u898b\u306b\u57fa\u3065\u304d\u3001\u8a18\u4e8b\u306e\u57f7\u7b46\u3060\u3051\u3067\u306a\u304f\u7de8\u96c6\u30fb\u76e3\u4fee\u3082\u62c5\u5f53\u3057\u3066\u3044\u307e\u3059\u3002 \u3010\u5c02\u9580\u5206\u91ce\u3011 
IT\/Web\u958b\u767a\/AI\u30fb\u30ed\u30dc\u30c3\u30c8\u958b\u767a\/\u30a4\u30f3\u30d5\u30e9\u958b\u767a\/\u30b2\u30fc\u30e0\u958b\u767a\/AI\/Web\u30c7\u30b6\u30a4\u30f3","sameAs":["https:\/\/www.sejuku.net\/","https:\/\/www.facebook.com\/sejuku2013\/","https:\/\/www.instagram.com\/samuraiengineer_official\/","https:\/\/x.com\/samuraijuku","https:\/\/www.youtube.com\/channel\/UCCFOQO5aDK0xXam4cUQXT8g"],"url":"https:\/\/www.sejuku.net\/blog\/author\/samurai-blog"}]}},"_links":{"self":[{"href":"https:\/\/www.sejuku.net\/blog\/wp-json\/wp\/v2\/posts\/67863","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.sejuku.net\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.sejuku.net\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.sejuku.net\/blog\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/www.sejuku.net\/blog\/wp-json\/wp\/v2\/comments?post=67863"}],"version-history":[{"count":0,"href":"https:\/\/www.sejuku.net\/blog\/wp-json\/wp\/v2\/posts\/67863\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.sejuku.net\/blog\/wp-json\/wp\/v2\/media\/68077"}],"wp:attachment":[{"href":"https:\/\/www.sejuku.net\/blog\/wp-json\/wp\/v2\/media?parent=67863"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.sejuku.net\/blog\/wp-json\/wp\/v2\/categories?post=67863"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.sejuku.net\/blog\/wp-json\/wp\/v2\/tags?post=67863"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}