Microsoft Academic Graph (MAG) is a large heterogeneous graph containing entities such as authors, papers, journals, and conferences, together with the relations between them. Microsoft provides the Academic Knowledge API for this contest. The entity attributes are defined in the API documentation.
Participants must provide a REST service endpoint that finds all 1-hop, 2-hop, and 3-hop graph paths connecting a given pair of entity identifiers in MAG. The pair can be [Id, Id], [Id, AA.AuId], [AA.AuId, Id], or [AA.AuId, AA.AuId]. Each node of a path must be one of the following identifiers: Id, F.Fid, J.JId, C.CId, AA.AuId, AA.AfId. The possible edges (pairs of adjacent nodes) of a path are:
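Id1 → Id2
Paper Id1 cites paper Id2 (Rid, reference ID).
Id1 → F.Fid2
Paper Id1 belongs to field of study F.Fid2 (F.Fid, field of study ID).
F.Fid1 → Id2
Field of study F.Fid1 is also a field of paper Id2, so papers in the same field are two hops apart.
Id1 → C.CId2
Paper Id1 was published in conference series C.CId2 (C.CId, conference series ID).
C.CId1 → Id2
Conference series C.CId1 is also the conference of paper Id2, so papers from the same conference are two hops apart.
Id1 → J.JId2
Paper Id1 was published in journal J.JId2 (J.JId, journal ID).
J.JId1 → Id2
Journal J.JId1 is also the journal of paper Id2, so papers from the same journal are two hops apart.
AA.AuId1 → AA.AfId2
Author AA.AuId1 is affiliated with AA.AfId2 (AA.AuId, author ID; AA.AfId, author affiliation ID).
AA.AfId1 → AA.AuId2
Affiliation AA.AfId1 is also the affiliation of author AA.AuId2, so authors with the same affiliation are two hops apart.
AA.AuId1 → Id2
Author AA.AuId1 wrote paper Id2.
Id1 → AA.AuId2
Paper Id1 was written by author AA.AuId2, so papers by the same author are two hops apart.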
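The edge rules above can be captured in a small lookup table and used to enumerate paths. The following is only a minimal sketch on a toy in-memory graph, not the contest solution: a real service would have to fetch neighbors from the Academic Knowledge API. The names ToyGraph, build_demo_graph, and find_paths are hypothetical, the node IDs are made up to mirror the contest's sample output, and the sketch assumes a path never revisits a node, which the problem statement does not specify.

```python
from collections import defaultdict

# Allowed (source type, target type) pairs, copied from the edge list above.
ALLOWED_EDGES = {
    ("Id", "Id"),
    ("Id", "F.Fid"), ("F.Fid", "Id"),
    ("Id", "C.CId"), ("C.CId", "Id"),
    ("Id", "J.JId"), ("J.JId", "Id"),
    ("AA.AuId", "AA.AfId"), ("AA.AfId", "AA.AuId"),
    ("AA.AuId", "Id"), ("Id", "AA.AuId"),
}


class ToyGraph:
    """Hypothetical in-memory stand-in for MAG (assumption, not the real API)."""

    def __init__(self):
        self.node_type = {}           # node id -> type name such as "Id"
        self.out = defaultdict(set)   # node id -> set of successor ids

    def add_node(self, node_id, kind):
        self.node_type[node_id] = kind

    def add_edge(self, src, dst):
        """Add a directed edge, rejecting type pairs not in ALLOWED_EDGES."""
        pair = (self.node_type[src], self.node_type[dst])
        if pair not in ALLOWED_EDGES:
            raise ValueError(f"edge type {pair} is not allowed")
        self.out[src].add(dst)

    def link(self, a, b):
        """Add both directions, e.g. Id -> J.JId and J.JId -> Id."""
        self.add_edge(a, b)
        self.add_edge(b, a)


def find_paths(graph, start, end, max_hops=3):
    """Depth-limited search for all paths from start to end within max_hops.

    Assumption: intermediate nodes are not revisited.
    """
    paths = []

    def extend(path):
        if len(path) - 1 >= max_hops:      # hop budget used up
            return
        for nxt in sorted(graph.out.get(path[-1], ())):
            if nxt == end:
                paths.append(path + [nxt])
            elif nxt not in path:
                extend(path + [nxt])

    extend([start])
    return sorted(paths, key=lambda p: (len(p), p))


def build_demo_graph():
    """Made-up data: 123, 456, 4 are papers; 2 a field; 3 an author; 5 a journal."""
    g = ToyGraph()
    for node_id, kind in [(123, "Id"), (456, "Id"), (4, "Id"),
                          (2, "F.Fid"), (3, "AA.AuId"), (5, "J.JId")]:
        g.add_node(node_id, kind)
    g.add_edge(123, 456)               # 123 cites 456 (one direction only)
    g.add_edge(123, 4)                 # 123 cites 4
    g.link(123, 2); g.link(456, 2)     # both papers are in field 2
    g.link(123, 3); g.link(456, 3)     # both papers are by author 3
    g.link(4, 5); g.link(456, 5)       # papers 4 and 456 share journal 5
    return g


if __name__ == "__main__":
    print(find_paths(build_demo_graph(), 123, 456))
```

Citation edges are added in one direction only, while the other relations are added in both directions, matching the paired entries in the edge list (for example Id → J.JId and J.JId → Id).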
For each test case, the REST service endpoint receives a JSON array via HTTP containing a pair of entity identifiers, which are 64-bit integers, e.g. [123, 456]. The endpoint must respond with a JSON array within 300 seconds. The response is a list of graph paths of the form [path1, path2, …, pathn], where each path is an array of entity identifiers. For example, if your program finds one 1-hop path, two 2-hop paths, and one 3-hop path, the result may look like this: [[123,456], [123,2,456], [123,3,456], [123,4,5,456]]. In a path such as [123,4,5,456], the integers are the identifiers of the entities on the path. After receiving the response, the evaluator waits for a random period of time before sending the next request.
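The request and response shapes described above can be handled with the standard json module. This is a minimal sketch of the parsing and serialization step only: the function names parse_request, format_response, and handle_request are hypothetical, and the HTTP layer (method, URL, and framework) is left out because the description does not specify it. Python integers have arbitrary precision, so 64-bit identifiers survive the round trip without loss.

```python
import json


def parse_request(body):
    """Parse a JSON array such as "[123, 456]" into a (start, end) pair."""
    pair = json.loads(body)
    is_pair = isinstance(pair, list) and len(pair) == 2
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not is_pair or not all(type(x) is int for x in pair):
        raise ValueError("expected a JSON array of two integers")
    return pair[0], pair[1]


def format_response(paths):
    """Serialize a list of paths as a compact JSON array of arrays."""
    return json.dumps([list(p) for p in paths], separators=(",", ":"))


def handle_request(body, finder):
    """Glue code: finder is any callable (start, end) -> list of paths."""
    start, end = parse_request(body)
    return format_response(finder(start, end))


if __name__ == "__main__":
    # A stand-in finder that only reports a direct 1-hop path.
    print(handle_request("[123, 456]", lambda a, b: [[a, b]]))
```

Keeping the path search behind a plain callable lets the same handler be exercised with a stub, as here, before a real graph lookup is plugged in.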