The Japanese Question-Answering Corpus (JQAC) is a dataset of question-answer pairs created manually by university students from a set of Japanese Wikipedia articles and some public documents.
(distributed under the CC BY-SA 4.0 license)
Category | Number of Themes | Author (1) | Number of Questions (1) | Author (2) | Number of Questions (2) | Total |
---|---|---|---|---|---|---|
学問 (Academic Discipline) | 10 | KA | 100 | - | 0 | 100 |
技術 (Technology) | 10 | SI | 99 | KI | 57 | 156 |
自然 (Nature) | 11 | SI | 83 | KI | 63 | 149 |
社会 (Society) | 11 | KI | 66 | SI | 66 | 132 |
地理 (Geography) | 10 | KA | 100 | - | 0 | 100 |
人間 (Humans) | 10 | SA | 74 | - | 0 | 74 |
文化 (Culture) | 10 | HI | 138 | - | 0 | 138 |
歴史 (History) | 10 | YA | 60 | - | 0 | 60 |
徳島大学シラバス (Tokushima University Syllabus) | 10 | KA | 60 | YA | 60 | 120 |
The JQAC data consists of nine CSV files encoded in UTF-8. All sentences are written in Japanese. Each file is organized into the following columns:
Theme (Category) | Topic (Title) | Question (What, Who, Where, Whose, How, Yes/No) | Answer | Difficulty (by Author) | Difficulty (by Answerer) | URL (Original Content) |
---|---|---|---|---|---|---|
学問 | アリストテレス | アリストテレスは誰の弟子ですか? | プラトン | 5 | https://ja.wikipedia.org/wiki/%E3%82%A2%... | |
学問 | アリストテレス | アリストテレスは紀元前何年に出生しましたか? | 紀元前384年 | 5 | https://ja.wikipedia.org/wiki/%E3%82%A2%E3%... | |
学問 | アリストテレス | 紀元前367年,アリストテレスはどこに入門しましたか? | アカデメイア | 5 | https://ja.wikipedia.org/wiki/%E3%82%A2%E3%... | |
学問 | アリストテレス | アリストテレスは師プラトンから何と評されましたか? | 学校の精神 | 5 | https://ja.wikipedia.org/wiki/%E3%82%A2%E3%... |
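The column layout described above can be read with Python's standard `csv` module. The following is a minimal sketch, not an official loader: the field names in `FIELDS`, the function name `read_jqac_rows`, the `has_header` switch, and the placeholder URL in the sample row are all assumptions made for illustration. Whether the distributed files contain a header row, and whether every row fills all seven columns, is not stated here, so short rows are padded with empty strings.

```python
import csv
import io
from typing import Dict, Iterable, List

# Hypothetical field names that mirror the seven columns listed above.
FIELDS = [
    "theme", "topic", "question", "answer",
    "difficulty_author", "difficulty_answerer", "url",
]


def read_jqac_rows(lines: Iterable[str], has_header: bool = False) -> List[Dict[str, str]]:
    """Parse CSV lines into dictionaries keyed by the assumed field names."""
    reader = csv.reader(lines)
    if has_header:
        next(reader, None)  # skip the header row if the file has one (assumption)
    rows = []
    for record in reader:
        if not record:
            continue  # ignore blank lines
        # Pad short rows and truncate long ones so every row has seven fields.
        padded = (record + [""] * len(FIELDS))[:len(FIELDS)]
        rows.append(dict(zip(FIELDS, padded)))
    return rows


# In-memory sample based on the first example row; the URL is a placeholder.
sample = io.StringIO(
    "学問,アリストテレス,アリストテレスは誰の弟子ですか?,プラトン,5,,https://example.org/placeholder\n"
)
rows = read_jqac_rows(sample)
print(rows[0]["question"], "->", rows[0]["answer"])
```

To read an actual file, pass a file object opened with `encoding="utf-8"` and `newline=""` to `read_jqac_rows`. Difficulty values are kept as strings because some cells may be empty.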
The latest JQAC dataset is available here: jqac20180625.tgz
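Once the archive has been downloaded, its contents can be inspected with Python's standard `tarfile` module. This is a minimal sketch that assumes the `.tgz` file is a gzip-compressed tar archive and that the CSV files inside it carry a `.csv` extension; the helper name `list_csv_members` is hypothetical, and the internal file names of the archive are not documented here.

```python
import tarfile
from typing import List


def list_csv_members(archive_path: str) -> List[str]:
    """Return the sorted names of regular .csv files in a gzip-compressed tar archive."""
    with tarfile.open(archive_path, mode="r:gz") as archive:
        return sorted(
            member.name
            for member in archive.getmembers()
            if member.isfile() and member.name.lower().endswith(".csv")
        )
```

For example, `list_csv_members("jqac20180625.tgz")` should list the CSV files in the archive if the assumptions above hold.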
Feel free to send me any questions or comments regarding this project and dataset.
To cite this paper, please use the following reference:
This work was supported by Works Applications Co., Ltd.
This content has been managed by Hiroki Tanioka (taniokah[at]gmail.com) since 2018.