Facts in the military domain tend to involve elements such as time, space, quantity, and status. Existing methods that represent knowledge as triples fail to express these facts adequately and also hinder knowledge storage and updating. Furthermore, question answering over these facts introduces new dimensions of complexity that existing corpora do not support well. We therefore construct MilKB, a Chinese knowledge base for the military domain that covers entity-centric and event-centric knowledge. It consists of 965 entities and 3,017 facts. We also classify natural-language questions into 26 types and construct MilKBQA, a complex question answering dataset derived from MilKB. It consists of 2,829 questions, of which 600 are event-centric.
Question Type | Count |
---|---|
Entity-centric Simple | 1749 |
Entity-centric Logic Reasoning | 82 |
Entity-centric Quantity Reasoning | 302 |
Entity-centric Calculation Reasoning | 96 |
Event-centric Simple | 509 |
Event-centric Quantity Reasoning | 83 |
Event-centric Probability Reasoning | 8 |
Total | 2829 |
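As a quick consistency check, the counts in the table can be summed to confirm the stated total and the number of event-centric questions. This is only an illustrative sketch; the dictionary simply copies the table rows.

```python
# Counts copied from the table above.
counts = {
    "Entity-centric Simple": 1749,
    "Entity-centric Logic Reasoning": 82,
    "Entity-centric Quantity Reasoning": 302,
    "Entity-centric Calculation Reasoning": 96,
    "Event-centric Simple": 509,
    "Event-centric Quantity Reasoning": 83,
    "Event-centric Probability Reasoning": 8,
}

total = sum(counts.values())
event_total = sum(n for name, n in counts.items() if name.startswith("Event-centric"))

print(total)        # 2829
print(event_total)  # 600
```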
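Place the Python import script listed below (the one that defines get_graph, trans_nodes, and trans_relations) in the same directory as node.json and relation.json, and change the username and password to those of your Neo4j account. Running the script imports MilKB into a local Neo4j instance.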
For MilKBQA, users can configure how many questions of each type to use, according to their needs.
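One way to pick a configurable number of questions per type is sketched below. It is a minimal illustration, not the authors' tooling: the record layout (a list of dicts with a `type` field) and the function name `sample_by_type` are assumptions, since the MilKBQA file format is not described here.

```python
import random
from collections import defaultdict


def sample_by_type(questions, quotas, seed=0):
    """Pick up to quotas[type] questions of each type (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    by_type = defaultdict(list)
    for q in questions:
        # "type" is an assumed field name, not taken from the MilKBQA files
        by_type[q["type"]].append(q)
    selected = []
    for qtype, quota in quotas.items():
        pool = by_type.get(qtype, [])
        # never ask for more questions than the pool contains
        selected.extend(rng.sample(pool, min(quota, len(pool))))
    return selected


# Toy data standing in for the real dataset.
questions = (
    [{"type": "Entity-centric Simple", "question": "q%d" % i} for i in range(5)]
    + [{"type": "Event-centric Simple", "question": "e%d" % i} for i in range(3)]
)
subset = sample_by_type(
    questions, {"Entity-centric Simple": 2, "Event-centric Simple": 10}
)
print(len(subset))
```

A quota larger than the available pool simply returns every question of that type.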
Note that Neo4j does not support numerical operations on values that carry units, so questions involving numerical calculation or comparison cannot be answered directly by executing Cypher queries. Some of the query results therefore need further processing. The answers in the corpus were calculated manually.
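The further processing mentioned above could look like the following sketch, which splits a value with a unit into a number and converts it to a base unit so that two results can be compared in Python. The value format (a number followed by a unit, such as "2.5 km") and the conversion table `TO_BASE` are assumptions for illustration; the actual property formats in MilKB may differ.

```python
import re

# Hypothetical conversion table to a base unit (metres); extend as needed.
TO_BASE = {"m": 1.0, "km": 1000.0, "米": 1.0, "公里": 1000.0}

# A number, optional whitespace, then a unit made of non-digit characters.
_QUANTITY = re.compile(r"^\s*([-+]?\d+(?:\.\d+)?)\s*([^\d\s]+)\s*$")


def parse_quantity(text):
    """Convert a string such as '2.5 km' to a float in the base unit."""
    match = _QUANTITY.match(text)
    if match is None:
        raise ValueError("not a quantity with a unit: %r" % text)
    value, unit = match.groups()
    if unit not in TO_BASE:
        raise ValueError("unknown unit: %r" % unit)
    return float(value) * TO_BASE[unit]


# Compare two values as they might come back from a Cypher query.
print(parse_quantity("2.5 km") > parse_quantity("800m"))
```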
import json
from py2neo import Graph, Node


def get_graph():
    """
    Connect to Neo4j.
    :return: py2neo Graph object, or None if the connection fails
    """
    try:
        graph = Graph("http://localhost:7474", username="neo4j", password="neo4j")
        print("success for neo4j connection.")
        return graph
    except Exception as e:
        print(e)
        return None


def deal_json(file_name):
    """
    Load a JSON file.
    :param file_name: json file name
    :return: list of dicts
    """
    with open(file_name, 'r', encoding='utf-8') as f:
        load_dict = json.load(f)
    return load_dict


def trans_nodes(graph, file_name):
    """
    Transfer the nodes to the new db.
    :param graph: neo4j object
    :param file_name: json file name
    :return: None
    """
    load_dict = deal_json(file_name)
    for row in load_dict:
        # select by json structure
        old_id = row['n']['identity']
        node_type = row['n']['labels'][0]
        properties = row['n']['properties']
        # keep the original id so relations can find their endpoints later
        node = Node(node_type, old_id=old_id)
        for key, val in properties.items():
            node[key] = val
        print('creating-', node)
        graph.create(node)


def trans_relations(graph, file_name):
    """
    Transfer the relations to the new db.
    :param graph: neo4j object
    :param file_name: json file name
    :return: None
    """
    load_dict = deal_json(file_name)
    for row in load_dict:
        # each row holds a path; only its first segment is used
        segment = row['p']['segments'][0]
        start_id = segment['start']['identity']
        start_label = segment['start']['labels'][0]
        end_id = segment['end']['identity']
        end_label = segment['end']['labels'][0]
        relation_type = segment['relationship']['type']
        relation_properties = segment['relationship']['properties']
        if relation_properties == {}:
            cypher = "match (m:{0}), (n:{1}) where m.old_id={2} and n.old_id={3} create p=(m)-[:{4}]->(n) return p".format(start_label, end_label, start_id, end_id, relation_type)
        else:
            constrain = ''
            for k, v in relation_properties.items():
                # escape backslashes and single quotes so the value stays a valid Cypher string literal
                v = str(v).replace("\\", "\\\\").replace("'", "\\'")
                constrain = constrain + "{0}:'{1}',".format(k, v)
            constrain = constrain.strip(',')
            cypher = "match (m:{0}), (n:{1}) where m.old_id={2} and n.old_id={3} create p=(m)-[:{4}{{{5}}}]->(n) return p".format(start_label, end_label, start_id, end_id, relation_type, constrain)
        print(cypher)
        graph.run(cypher)


if __name__ == '__main__':
    graph = get_graph()
    if graph is None:
        raise SystemExit("could not connect to neo4j")
    # WARNING: this removes everything already stored in the database
    graph.delete_all()
    trans_nodes(graph, 'node.json')
    trans_relations(graph, 'relation.json')
    # delete old id
    graph.run('MATCH (n) REMOVE n.old_id')
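The import script above builds each relationship statement by formatting values into the Cypher text. A possible alternative, sketched here under assumptions, is to pass the ids and properties as query parameters and only format the labels and the relationship type (which Cypher cannot parameterize). The helper name `build_relation_query` and the example labels are hypothetical. Unlike the script, this variant stores property values with their original JSON types instead of converting them to strings.

```python
def build_relation_query(start_label, end_label, relation_type):
    """Return a Cypher statement expecting $start_id, $end_id and $props."""

    def quote(name):
        # Backtick-quote an identifier; a backtick inside the name is doubled.
        return "`" + name.replace("`", "``") + "`"

    return (
        "MATCH (m:{0}), (n:{1}) "
        "WHERE m.old_id = $start_id AND n.old_id = $end_id "
        "CREATE (m)-[r:{2}]->(n) SET r += $props RETURN r"
    ).format(quote(start_label), quote(end_label), quote(relation_type))


# Hypothetical labels and values, for illustration only.
query = build_relation_query("Weapon", "Country", "PRODUCED_BY")
params = {"start_id": 1, "end_id": 2, "props": {"note": "it's fine"}}
print(query)
# With a live connection this would be executed as: graph.run(query, params)
```

Because the property values never appear in the query text, quotes or other special characters in them need no escaping.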
You can download the resources from the given links: MilKB, MilKBQA.