Treffer: Query processing in large-scale networks.

Title:
Query processing in large-scale networks.
Contributors:
Qiao, Miao., Chinese University of Hong Kong Graduate School. Division of Systems Engineering and Engineering Management.
Publication Year:
2013
Collection:
The Chinese University of Hong Kong: CUHK Digital Repository / 香港中文大學數碼典藏
Document Type:
Fachzeitschrift text
File Description:
electronic resource; remote; 1 online resource (vii, 151 leaves) : ill. (some col.)
Language:
English
Chinese
Rights:
Use of this resource is governed by the terms and conditions of the Creative Commons “Attribution-NonCommercial-NoDerivatives 4.0 International” License (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Accession Number:
edsbas.C7E8CF05
Database:
BASE

Weitere Informationen

由于现今在各个领域涌现的图数据规模都愈加庞大,在这些大规模图数据上进行任何一种简单的查询都成为一件有富有挑战性的工作。在本文中,我们着重在大规模图上研究三个具有广泛应用的查询:最短路查询,权重限制查询和最近k关键字查询。具体来说, 最短路查询是一个计算两点间最短距离的基本查询。而权重限制查询判断两点间是否存在一条沿路边权都满足用指定条件的可行路径。对于一个查询节点,最近k关键字查询返回k个距离最近的带有指定关键字的节点。在面对一个拥有超过一亿节点的图时,我们需要为这些查询开发有效的索引和查询优化算法。 ; 在本文中,对于最短路查询,我们提出了两个基于地标嵌入的算法,一个是有误差控制的地标嵌入算法,另一个则是本地化地标嵌入算法。前者通过对地标的筛选和组织,能对估计的最短距离给予一定的误差保证; 而后者提出的本地化机制能够在不增加预处理复杂度和在线查询复杂度的情况下大幅度提高估计的精准度。对于权重限制查询,我们先提出一个能够保证常数查询时间的内存算法。除此之外,为了提高算法对大规模数据的处理能力,我们使用编码技术设计了一个有效的外存算法。对于最近k关键字查询,我们先在一个特殊的图,即一颗树上,开发一个有效算法来在常数时间内回答最近k关键字查询, 并由此得出一个图上的近似算法;此外我们还通过一个全局存储的技术来进一步减少索引大小和缩短查询时间。我们在真实和模拟的数据上做了大量的实验,实验结果证明我们的算法在大图上对上述三个查询都具有高效性能。 ; Due to the massive size of graphs from various domains nowadays, even simple graph queries become challenging tasks. In this thesis, three queries with a wide range of applications are investigated on large graphs. One is shortest distance query, a fundamental query which computes the shortest distance between two nodes. Another query, weight constraint reachability (WCR), checks if there is a feasible path between two nodes where edge weights along the path satisfy a side constraint. And the third one, a top-k nearest keywords (k-NK) query, reports, for a query node, the k nearest nodes bearing some user-specified keywords. When confronting with a large-scale graph with over tens of millions of nodes, we need to develop efficient indexing and query optimization techniques for these queries. ; In this thesis, for a shortest distance query, we devise two landmark embedding schemes, an error bounded landmark scheme and a local landmark scheme, where the former can guarantee an error bound for estimated distance, and the latter can significantly improve the distance estimation accuracy without increasing the offline embedding or the online query complexity. For a WCR query, we propose a memorybased approach which promises a constant query time. Besides, in order to increase its scalability, we devise an I/O-efficient approach for answering a WCR query on massive graphs. For a k-NK query, we start ...