Knowledge graph
In knowledge representation and reasoning, a knowledge graph is a knowledge base that uses a graph-structured data model or topology to represent and operate on data. Knowledge graphs are often used to store interlinked descriptions of entities – objects, events, situations or abstract concepts – while also encoding the free-form semantics or relationships underlying these entities.[1][2]
Since the development of the Semantic Web, knowledge graphs have often been associated with linked open data projects, focusing on the connections between concepts and entities.[3][4] They are also historically associated with and used by search engines such as Google, Bing, Yext and Yahoo; knowledge engines and question-answering services such as WolframAlpha, Apple's Siri, and Amazon Alexa; and social networks such as LinkedIn and Facebook.
Recent developments in data science and machine learning, particularly in graph neural networks and representation learning, have broadened the scope of knowledge graphs beyond their traditional use in search engines and recommender systems. They are increasingly used in scientific research, with notable applications in fields such as genomics, proteomics, and systems biology.[5]
History
The term was coined as early as 1972 by the Austrian linguist Edgar W. Schneider, in a discussion of how to build modular instructional systems for courses.[6] In the late 1980s, the University of Groningen and University of Twente jointly began a project called Knowledge Graphs, focusing on the design of semantic networks with edges restricted to a limited set of relations, to facilitate algebras on the graph. In subsequent decades, the distinction between semantic networks and knowledge graphs was blurred.
Some early knowledge graphs were topic-specific. In 1985, WordNet was founded, capturing semantic relationships between words and meanings – an application of this idea to language itself. In 1998, Andrew Edmonds of Science in Finance Ltd in the UK created a system called ThinkBase that offered fuzzy-logic-based reasoning in a graphical context.[7][8] In 2005, Marc Wick founded GeoNames to capture relationships between different geographic names and locales and associated entities.
In 2007, both DBpedia and Freebase were founded as graph-based knowledge repositories for general-purpose knowledge. DBpedia focused exclusively on data extracted from Wikipedia, while Freebase also included a range of public datasets. Neither described themselves as a 'knowledge graph' but developed and described related concepts.
In 2012, Google introduced its Knowledge Graph,[9] building on DBpedia and Freebase among other sources. It later incorporated RDFa, Microdata, and JSON-LD content extracted from indexed web pages, including the CIA World Factbook, Wikidata, and Wikipedia.[9][10] Entity and relationship types associated with this knowledge graph have been further organized using terms from the schema.org[11] vocabulary. The Google Knowledge Graph became a successful complement to string-based search within Google, and its popularity online brought the term into more common use.[11]
Since then, several large multinationals have advertised their use of knowledge graphs, further popularising the term. These include Facebook, LinkedIn, Airbnb, Microsoft, Amazon, Uber and eBay.[12]
In 2019, IEEE combined its annual international conferences on "Big Knowledge" and "Data Mining and Intelligent Computing" into the International Conference on Knowledge Graph.[13]
Definitions
There is no single commonly accepted definition of a knowledge graph. Most definitions view the topic through a Semantic Web lens and include these features:[14]
- Flexible relations among knowledge in topical domains: A knowledge graph (i) defines abstract classes and relations of entities in a schema, (ii) mainly describes real world entities and their interrelations, organized in a graph, (iii) allows for potentially interrelating arbitrary entities with each other, and (iv) covers various topical domains.[15]
- General structure: A network of entities, their semantic types, properties, and relationships.[16][17] To represent properties, categorical or numerical values are often used.
- Supporting reasoning over inferred ontologies: A knowledge graph acquires and integrates information into an ontology and applies a reasoner to derive new knowledge.[3]
There are, however, many knowledge graph representations for which some of these features are not relevant. For those knowledge graphs, this simpler definition may be more useful:
- A digital structure that represents knowledge as concepts and the relationships between them (facts). A knowledge graph can include an ontology that allows both humans and machines to understand and reason about its contents.[18][19]
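The simpler definition can be illustrated with a minimal sketch in which facts are stored as (subject, predicate, object) triples and indexed by subject. The entity names, the predicate names and the helper functions `build_index` and `objects` are hypothetical and chosen only for illustration; production systems typically use a dedicated triple store or graph database rather than in-memory lists.

```python
from collections import defaultdict

# Hypothetical toy facts, each a (subject, predicate, object) triple.
TRIPLES = [
    ("Ada_Lovelace", "type", "Person"),
    ("Ada_Lovelace", "collaborated_with", "Charles_Babbage"),
    ("Charles_Babbage", "type", "Person"),
    ("Charles_Babbage", "designed", "Analytical_Engine"),
    ("Analytical_Engine", "type", "Machine"),
]


def build_index(triples):
    """Index outgoing edges by subject: subject -> list of (predicate, object)."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[subject].append((predicate, obj))
    return index


def objects(index, subject, predicate):
    """Return every object linked from `subject` by `predicate`."""
    return [o for p, o in index.get(subject, []) if p == predicate]


index = build_index(TRIPLES)
print(objects(index, "Charles_Babbage", "designed"))
```

Here the concepts are the nodes, and each triple is one labeled edge, that is, one fact.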
Implementations
In addition to the above examples, the term has been used to describe open knowledge projects such as YAGO and Wikidata; federations like the Linked Open Data cloud;[20] a range of commercial search tools, including Yahoo's semantic search assistant Spark, Google's Knowledge Graph, and Microsoft's Satori; and the LinkedIn and Facebook entity graphs.[3]
The term is also used in the context of note-taking software applications that allow a user to build a personal knowledge graph.[21]
The popularization of knowledge graphs and their accompanying methods has led to the development of graph databases such as Neo4j[22] and GraphDB.[23] These graph databases allow users to easily store data as entities and their interrelationships, and facilitate operations such as data reasoning, node embedding, and ontology development on knowledge bases.
Using a knowledge graph for reasoning over data
A knowledge graph formally represents semantics by describing entities and their relationships.[24] Knowledge graphs may make use of ontologies as a schema layer. By doing this, they allow logical inference for retrieving implicit knowledge rather than only allowing queries requesting explicit knowledge.[25]
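The difference between explicit and implicit knowledge can be sketched with a small forward-chaining reasoner. The example below assumes a schema layer expressed with a hypothetical `subclass_of` predicate and applies two rules loosely modeled on RDFS subclass semantics; the function `infer_types` and the toy facts are illustrative and do not reproduce any particular reasoner.

```python
def infer_types(triples):
    """Forward-chain two subclass rules until no new triple appears.

    Rule 1: (A subclass_of B) and (B subclass_of C) => (A subclass_of C)
    Rule 2: (x type A) and (A subclass_of B)        => (x type B)
    """
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        subclass_pairs = [(s, o) for s, p, o in facts if p == "subclass_of"]
        new = set()
        for a, b in subclass_pairs:
            for s, p, o in facts:
                if p == "subclass_of" and s == b:
                    new.add((a, "subclass_of", o))  # Rule 1
                if p == "type" and o == a:
                    new.add((s, "type", b))  # Rule 2
        if not new <= facts:
            facts |= new
            changed = True
    return facts


# Explicit facts: a two-level class hierarchy and one typed instance.
explicit = [
    ("Dog", "subclass_of", "Mammal"),
    ("Mammal", "subclass_of", "Animal"),
    ("Rex", "type", "Dog"),
]
inferred = infer_types(explicit)
print(("Rex", "type", "Animal") in inferred)
```

The triple stating that Rex is an Animal is never asserted explicitly; it only becomes retrievable once the rules have been applied to the schema layer.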
To allow the use of knowledge graphs in various machine learning tasks, several methods for deriving latent feature representations of entities and relations have been devised. These knowledge graph embeddings allow knowledge graphs to be connected to machine learning methods that require feature vectors, such as word embeddings. This can complement other estimates of conceptual similarity.[26][27]
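As one illustration of how an embedding can score a candidate fact, the sketch below uses a translation-style scoring function in the spirit of the TransE model, which the text above does not name and which is chosen here only as a familiar example. The two-dimensional vectors are hand-picked assumptions rather than learned values, and `transe_score` and `rank_tails` are hypothetical helper names.

```python
import math

# Hand-picked 2-D vectors for illustration only; real embeddings are learned.
ENTITY = {
    "Paris": (1.0, 1.0),
    "France": (2.0, 3.0),
    "Berlin": (4.0, 1.0),
    "Germany": (5.0, 3.0),
}
RELATION = {"capital_of": (1.0, 2.0)}


def transe_score(head, relation, tail):
    """Translation-style plausibility: negative distance between h + r and t."""
    h, r, t = ENTITY[head], RELATION[relation], ENTITY[tail]
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def rank_tails(head, relation):
    """Order all entities from most to least plausible tail for (head, relation)."""
    return sorted(ENTITY, key=lambda e: transe_score(head, relation, e), reverse=True)


print(rank_tails("Paris", "capital_of")[0])
```

Because the relation is modeled as a vector offset, a triple scores highly when the head vector plus the relation vector lands near the tail vector; other embedding models use different scoring functions.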
Models for generating useful knowledge graph embeddings are commonly the domain of graph neural networks (GNNs).[28] GNNs are deep learning architectures that operate on graphs of nodes and edges, which correspond well to the entities and relationships of knowledge graphs. The topology and data structures afforded by GNNs provide a convenient domain for semi-supervised learning, wherein the network is trained to predict the value of a node embedding (provided a group of adjacent nodes and their edges) or edge (provided a pair of nodes). These tasks serve as fundamental abstractions for more complex tasks such as knowledge graph reasoning and alignment.[29]
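The core operation of many GNNs is neighborhood aggregation, often called message passing. The sketch below shows a single, deliberately simplified step in which each node's vector is replaced by the mean of its own vector and those of its neighbors. It omits the learned weight matrices and nonlinearities of a real GNN layer, and the graph, the feature values and the function name `message_passing_step` are assumptions made for illustration.

```python
# Toy undirected graph as an adjacency list; feature vectors are made up.
NEIGHBORS = {
    "a": ["b", "c"],
    "b": ["a"],
    "c": ["a"],
}
FEATURES = {
    "a": [1.0, 0.0],
    "b": [0.0, 2.0],
    "c": [0.0, 4.0],
}


def message_passing_step(neighbors, features):
    """Replace each node vector with the mean of its own and its neighbors' vectors."""
    updated = {}
    for node, adjacent in neighbors.items():
        group = [features[node]] + [features[n] for n in adjacent]
        # zip(*group) iterates over the vectors one dimension at a time.
        updated[node] = [sum(column) / len(group) for column in zip(*group)]
    return updated


print(message_passing_step(NEIGHBORS, FEATURES)["b"])
```

Stacking several such steps lets information from nodes further away reach each node, which is what allows a trained network to predict a node's value from its surrounding subgraph.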
Entity alignment
As new knowledge graphs are produced across a variety of fields and contexts, the same entity will inevitably be represented in multiple graphs. However, because no single standard for the construction or representation of knowledge graphs exists, resolving which entities from disparate graphs correspond to the same real-world subject is a non-trivial task. This task is known as knowledge graph entity alignment, and is an active area of research.[30]
Strategies for entity alignment generally seek to identify similar substructures, semantic relationships, shared attributes, or combinations of all three between two distinct knowledge graphs. Entity alignment methods use these structural similarities between generally non-isomorphic graphs to predict which nodes correspond to the same entity.[31]
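The shared-attribute strategy can be sketched as follows. Each entity is described by a set of attribute pairs, and entities from the two graphs are matched when the Jaccard similarity of their attribute sets reaches a threshold. The identifiers, attribute names, the threshold of 0.4 and the functions `jaccard` and `align` are assumptions for illustration; methods described in the literature typically combine structural and semantic signals rather than relying on attributes alone.

```python
# Two toy graphs describing the same people under different identifiers.
GRAPH_A = {
    "A:1": {("label", "marie curie"), ("born", "1867"), ("field", "physics")},
    "A:2": {("label", "albert einstein"), ("born", "1879"), ("field", "physics")},
}
GRAPH_B = {
    "B:x": {("label", "albert einstein"), ("born", "1879"), ("award", "nobel prize")},
    "B:y": {("label", "marie curie"), ("born", "1867"), ("award", "nobel prize")},
}


def jaccard(left, right):
    """Size of the intersection divided by size of the union of two sets."""
    union = left | right
    return len(left & right) / len(union) if union else 0.0


def align(graph_a, graph_b, threshold=0.4):
    """Map each entity in graph_a to its most similar entity in graph_b.

    Pairs whose best similarity falls below `threshold` are left unmatched.
    """
    matches = {}
    for a_id, a_attrs in graph_a.items():
        best_id, best_score = None, 0.0
        for b_id, b_attrs in graph_b.items():
            score = jaccard(a_attrs, b_attrs)
            if score > best_score:
                best_id, best_score = b_id, score
        if best_score >= threshold:
            matches[a_id] = best_id
    return matches


print(align(GRAPH_A, GRAPH_B))
```

The two graphs here use different attribute vocabularies ("field" versus "award"), so no pair matches perfectly; the threshold controls how much overlap is required before two nodes are treated as the same entity.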
The recent successes of large language models (LLMs), in particular their effectiveness at producing syntactically meaningful embeddings, have spurred the use of LLMs in the task of entity alignment.[32]
As the amount of data stored in knowledge graphs grows, developing dependable methods for knowledge graph entity alignment becomes an increasingly crucial step in the integration and cohesion of knowledge graph data.
See also
- Concept map – Diagram showing relationships among concepts
- Formal semantics (natural language) – Study of meaning in natural languages
- Graph database – Database using graph structures for queries
- Knowledge base – Information repository with multiple applications
- Knowledge graph embedding – Dimensionality reduction of graph-based semantic data objects [machine learning task]
- Logical graph – Type of diagrammatic notation for propositional logic
- Semantic integration – Interrelating info from diverse sources
- Semantic technology – Technology to help machines understand data
- Topic map – Knowledge organization system
- Vadalog – Type of Knowledge Graph Management System
- YAGO (database) – Open-source information repository
References
- ^ "What is a Knowledge Graph?". 2018.
- ^ "What defines a knowledge graph?". 2020.
- ^ a b c Ehrlinger, Lisa; Wöß, Wolfram (2016). Towards a Definition of Knowledge Graphs (PDF). SEMANTiCS2016. Leipzig: Joint Proceedings of the Posters and Demos Track of 12th International Conference on Semantic Systems – SEMANTiCS2016 and 1st International Workshop on Semantic Change & Evolving Semantics (SuCCESS16). pp. 13–16.
- ^ Soylu, Ahmet (2020). "Enhancing Public Procurement in the European Union Through Constructing and Exploiting an Integrated Knowledge Graph". The Semantic Web – ISWC 2020. Lecture Notes in Computer Science. Vol. 12507. pp. 430–446. doi:10.1007/978-3-030-62466-8_27. ISBN 978-3-030-62465-1. S2CID 226229398.
- ^ Mohamed, Sameh K.; Nounu, Aayah; Nováček, Vít (2021). "Biological applications of knowledge graph embedding models". Briefings in Bioinformatics. 22 (2): 1679–1693. doi:10.1093/bib/bbaa012. hdl:1983/919db5c6-6e10-4277-9ff9-f86bbcedcee8. PMID 32065227 – via Oxford Academic.
- ^ Edward W. Schneider. 1973. Course Modularization Applied: The Interface System and Its Implications For Sequence Control and Data Analysis. In Association for the Development of Instructional Systems (ADIS), Chicago, Illinois, April 1972
- ^ "US Trademark no 75589756".
- ^ "ThinkBase". Retrieved 25 December 2024.
- ^ a b Singhal, Amit (May 16, 2012). "Introducing the Knowledge Graph: things, not strings". Official Google Blog. Retrieved 21 March 2017.
- ^ Schwartz, Barry (December 17, 2014). "Google's Freebase To Close After Migrating To Wikidata: Knowledge Graph Impact?". Search Engine Roundtable. Retrieved December 10, 2017.
- ^ a b McCusker, James P.; McGuiness, Deborah L. "What is a Knowledge Graph?". www.authorea.com. Retrieved 21 March 2017.
- ^ "Knowledge Graph Enterprises". 2020.
- ^ "2021 IEEE International Conference on Knowledge Graph (ICKG)*". KMedu Hub. 2017-07-09. Retrieved 2021-03-22.
- ^ Hogan, Aidan; Blomqvist, Eva; Cochez, Michael; d'Amato, Claudia; de Melo, Gerard; Gutierrez, Claudio; Labra Gayo, José Emilio; Kirrane, Sabrina; Neumaier, Sebastian; Polleres, Axel; Navigli, Roberto; Ngonga Ngomo, Axel-Cyrille; Rashid, Sabbir M.; Rula, Anisa; Schmelzeisen, Lukas; Sequeda, Juan; Staab, Steffen; Zimmermann, Antoine (2021-01-24). "Knowledge Graphs". ACM Computing Surveys. 54 (4): 1–37. arXiv:2003.02320. doi:10.1145/3447772. ISSN 0360-0300. S2CID 235716181.
- ^ Paulheim, Heiko (2017). "Knowledge Graph Refinement: A Survey of Approaches and Evaluation Methods" (PDF). Semantic Web: 489–508. Retrieved 21 March 2017.
- ^ Krötsch, Markus; Weikum, Gerhard (March 2016). "Editorial of the Special Issue on Knowledge Graphs". Journal of Web Semantics. 37–38: 53–54. doi:10.1016/j.websem.2016.04.002. Retrieved 10 February 2021.
- ^ "What is a Knowledge Graph?|Ontotext". Ontotext. Retrieved 2020-07-01.
- ^ Peng, Ciyuan; Feng, Xia; Naseriparsa, Mehdi; Osborne, Francesco (2023). "Knowledge Graphs: Opportunities and Challenges". Artificial Intelligence Review. 56 (11): 13071–13102. arXiv:2303.13948. doi:10.1007/s10462-023-10465-9. ISSN 1573-7462. PMC 10068207. PMID 37362886.
- ^ "The Knowledge Graph about Knowledge Graphs". 2020.
- ^ "The Linked Open Data Cloud". lod-cloud.net. Retrieved 2020-06-30.
- ^ Pyne, Yvette; Stewart, Stuart (March 2022). "Meta-work: how we research is as important as what we research". British Journal of General Practice. 72 (716): 130–131. doi:10.3399/bjgp22X718757. PMC 8884432. PMID 35210247.
- ^ "Neo4j Graph Database & Analytics | Graph Database Management System". Neo4j. Retrieved 8 November 2023.
- ^ "Ontotext GraphDB". Ontotext. Retrieved 8 November 2023.
- ^ "How do knowledge graphs work?". Stardog. 2022-04-05. Retrieved 2022-04-05.
- ^ "Unlocking the Power of Google Knowledge Panel: How to Obtain and Claim Yours in 2023 – RH Razu". rhrazu.com. 2023-09-01. Retrieved 2023-09-05.
- ^ Hongwei Wang (October 2018). "RippleNet: Propagating User Preferences on the Knowledge Graph for Recommender Systems". Proceedings of the 27th ACM International Conference on Information and Knowledge Management. pp. 417–426. arXiv:1803.03467. doi:10.1145/3269206.3271739. ISBN 9781450360142. S2CID 3766110.
- ^ Ristoski, Petar; Paulheim, Heiko (2016), "RDF2Vec: RDF Graph Embeddings for Data Mining" (PDF), The Semantic Web – ISWC 2016, Lecture Notes in Computer Science, vol. 9981, pp. 498–514, doi:10.1007/978-3-319-46523-4_30, ISBN 978-3-319-46522-7
- ^ Zhou, Jie; et al. (2020). "Graph neural networks: A review of methods and applications". AI Open. 1 (1): 57–81. arXiv:1812.08434. doi:10.1016/j.aiopen.2021.01.001. S2CID 56517517 – via Elsevier Science Direct.
- ^ Ye, Zi; Kumar, Yogan Jaya; Sing, Goh Ong; Song, Fengyan; Wang, Junsong (2022). "A comprehensive survey of graph neural networks for knowledge graphs". IEEE Access. 10: 75729–7574. Bibcode:2022IEEEA..1075729Y. doi:10.1109/ACCESS.2022.3191784. S2CID 250654689 – via IEEE Xplore.
- ^ Berrendorf, Max; Faerman, Evgeniy; Melnychuk, Valentyn; Tresp, Volker; Seidl, Thomas (April 14–17, 2020). Knowledge graph entity alignment with graph convolutional networks: lessons learned. Advances in Information Retrieval: 42nd European Conference on IR Research, ECIR 2020, Lisbon, Portugal. Lecture Notes in Computer Science. Vol. Proceedings, Part II. pp. 3–11. arXiv:1911.08342. doi:10.1007/978-3-030-45442-5_1. ISBN 978-3-030-45441-8. S2CID 208158314 – via Springer International Publishing.
- ^ Chaurasiya, Deepak; Surisetty, Anil; Kumar, Nitish; Singh, Alok; Dey, Vikrant; Malhotra, Aakarsh; Dhama, Gaurav; Arora, Ankur (2022). "Entity alignment for knowledge graphs: progress, challenges, and empirical studies". arXiv:2205.08777 [cs.AI].
- ^ Hogan, Aidan; Lippolis, Anna Sofia; Klironomos, Antonis; Milon-Flores, Daniela F.; Zheng, Heng; Jouglar, Alexane; Norouzi, Ebrahim (2023). "Enhancing Entity Alignment Between Wikidata and ArtGraph using LLMs" (PDF). Proceedings of the International Workshop on Semantic Web and Ontology Design for Cultural Heritage – via International Workshop on Semantic Web and Ontology Design for Cultural Heritage (SWODCH), Athens, Greece.
External links
- Will Douglas Heaven (4 September 2020). "This know-it-all AI learns by reading the entire web nonstop". MIT Technology Review. Retrieved 5 September 2020.
Diffbot is building the biggest-ever knowledge graph by applying image recognition and natural-language processing to billions of web pages.