Much real-world data has structure that is best described as a graph. If one data structure were to underpin deep learning algorithms, the graph would be the foremost candidate. The graph structure can be explicit, as in social networks, knowledge graphs, and protein-interaction networks, or latent and implicit, as in the case of language and images. Leveraging and discovering graph structure has many immediate applications and also serves as fertile ground for a new generation of algorithms.
This talk begins with a general survey of deep graph learning, and then discusses recent research from the AWS Shanghai AI Lab in this direction. We will introduce DGL, an open-source platform designed to accelerate research in this emerging field. Its philosophy is to support the graph as the core abstraction while maintaining both forward compatibility (i.e., supporting new research ideas) and backward compatibility (i.e., integrating with existing components). DGL enables arbitrary message-handling and mutation operators and flexible propagation rules, and is framework-agnostic, so it can leverage high-performance tensor and autograd operations as well as feature-extraction modules already available in existing frameworks. DGL carefully handles sparse and irregular graph structure, deals with graphs big and small that may change dynamically, fuses operations, and performs auto-batching, all to take advantage of modern hardware. DGL has been tested on a variety of models, including but not limited to the popular Graph Neural Networks (GNNs) and their variants, with promising speed, memory footprint, and scalability.
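The message-handling and propagation pattern described above can be illustrated with a minimal, framework-free sketch: each node sends a message along its out-edges, and each destination node reduces the messages it receives. The names below (`send_and_reduce`, `message_fn`, `reduce_fn`) are illustrative assumptions for this sketch, not DGL's actual API.

```python
from collections import defaultdict

def send_and_reduce(edges, features, message_fn, reduce_fn):
    """One round of message passing.

    edges:    list of (src, dst) pairs
    features: dict mapping node id -> feature value
    """
    mailbox = defaultdict(list)
    for src, dst in edges:
        # message phase: each edge carries a message from its source node
        mailbox[dst].append(message_fn(features[src]))
    # reduce phase: nodes with no incoming messages keep their old feature
    return {n: reduce_fn(mailbox[n]) if mailbox[n] else h
            for n, h in features.items()}

# One propagation step on a 3-node path graph 0 -> 1 -> 2,
# with identity messages and sum reduction.
edges = [(0, 1), (1, 2)]
h = {0: 1.0, 1: 2.0, 2: 3.0}
h_new = send_and_reduce(edges, h, message_fn=lambda x: x, reduce_fn=sum)
print(h_new)  # -> {0: 1.0, 1: 1.0, 2: 2.0}
```

In a real framework, the two phases are fused into sparse tensor kernels rather than Python loops, which is where the sparse-structure handling and operation fusion mentioned above come in.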
We are actively applying DGL to a number of domains, working with customers both internal and external to AWS, and pursuing research on deep graph learning in its own right. Since its launch during NeurIPS 2018, DGL has evolved rapidly. The latest release supports heterogeneous graphs and knowledge graphs, a significant step toward real-world applications.
Real-world data does not exist in isolation; the data structure that best describes it is the graph. In other words, if deep learning is the algorithm, the graph is the data structure that best supports it. The graph structure can be explicit, as in social networks, knowledge graphs, drugs, proteins, relational databases, and so on; it can also be implicit, hidden behind language and images. Leveraging and discovering graph structure both has rich real-world applications and lays an important foundation for the next generation of machine learning algorithms.
This talk begins with an overview of deep graph learning and then introduces recent research from the AWS Shanghai AI Lab in this direction. We will present the Deep Graph Library (DGL, http://dgl.ai), an open-source platform whose vision is to accelerate research in this emerging field. DGL's design centers on the graph as the core abstraction and takes care to maintain both forward compatibility (i.e., flexibly supporting the development of new models) and backward compatibility (i.e., integrating with existing deep learning components). DGL supports arbitrary message-handling and mutation operators and flexible message-propagation rules, and is decoupled from the underlying deep learning framework, leveraging high-performance tensor and autograd operations and reusing feature-extraction modules already available in existing frameworks. DGL is optimized for sparse and irregular graph structures, handles graphs large and small that may change dynamically, and fuses operations and performs automatic batching; all of these designs take full advantage of modern hardware. DGL has been tested on a wide range of models, including but not limited to the popular Graph Neural Networks (GNNs) and their variants; the supported models run fast, scale well, and have a small memory footprint.
We are working closely with customers both inside and outside AWS to rapidly deploy DGL in scenarios such as recommender systems, fraud detection, knowledge graphs, and drug discovery. At the same time, we are actively pursuing theoretical research on deep graph learning. DGL was released during NeurIPS 2018 and has kept up a strong development pace, now reaching its fourth release. Version 0.4 adds computation on heterogeneous graphs and knowledge graphs, supports very large graphs with hundreds of millions of edges, and ships a model zoo for drug discovery, a key step toward making DGL practical.
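The heterogeneous graphs mentioned above are graphs whose nodes and edges carry types, as in a user/item recommendation graph. A minimal sketch of one way to represent such a graph is to group edge lists by a (source-type, relation, destination-type) triple; the names below are illustrative for this sketch, not DGL's actual API.

```python
# Toy heterogeneous graph: edges grouped by typed relation.
hetero_edges = {
    ("user", "follows", "user"): [(0, 1), (1, 2)],
    ("user", "clicks",  "item"): [(0, 0), (2, 1)],
}

def relations(graph):
    """List the relation names present in the graph."""
    return sorted(rel for (_, rel, _) in graph)

def out_degree(graph, ntype, node):
    """Out-degree of `node` of type `ntype`, summed over all relations."""
    return sum(1 for (src_type, _, _), edge_list in graph.items()
               if src_type == ntype
               for (src, _) in edge_list if src == node)

print(relations(hetero_edges))           # -> ['clicks', 'follows']
print(out_degree(hetero_edges, "user", 0))  # user 0: one follow + one click -> 2
```

Keeping edges partitioned by relation lets each typed relation use its own message-passing parameters, which is what makes knowledge-graph models expressible on top of this structure.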
What the audience will gain: an understanding of how the AI industry is developing and of its most cutting-edge trends.