
A Survey of Methods, Tools and Applications of Knowledge Base Construction (KBC)


Affiliations
1 Founder and CEO, Cere Labs Pvt. Ltd., Maharashtra, India
     



Knowledge Bases (KBs) have recently become very valuable because of their use in many Artificial Intelligence (AI) applications. For example, KBs are a critical part of the conversational agents that industry is rapidly adopting. A KB is made up of facts about the real world, and the number of such facts is typically very large. Constructing a KB involves identifying facts in unstructured data such as text, images, videos and speech. Because unstructured data is difficult to process, Knowledge Base Construction (KBC) used to be done manually, requiring considerable effort and long timelines. In recent years, intelligent technologies such as Machine Learning and Deep Learning have been employed for KBC. In this paper, we provide an introduction to KBC and describe how various cutting-edge technologies are being employed to automate it. We then mention some KBC systems created by academic as well as commercial groups, and survey some industry solutions that already use KBC. Finally, we attempt to predict future research possibilities in this field.
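
As a rough illustration of what "identifying facts from unstructured text" can mean, the sketch below stores a KB as a set of (subject, predicate, object) facts and fills it with one hand-written pattern. This is only a toy, rule-based stand-in and not a method taken from the paper, which surveys Machine Learning and Deep Learning approaches. The `Fact` type, the `capital_of` predicate, the pattern and the sample sentences are all hypothetical names chosen for the example.

```python
import re
from typing import NamedTuple, Set


class Fact(NamedTuple):
    """One KB entry, stored as a (subject, predicate, object) triple."""
    subject: str
    predicate: str
    obj: str


# Hypothetical extraction rule: "<X> is the capital of <Y>".
# Both X and Y must start with a capital letter and end before '.' or ','.
CAPITAL_PATTERN = re.compile(
    r"([A-Z][\w ]*?) is the capital of ([A-Z][\w ]*?)[.,]"
)


def extract_facts(text: str) -> Set[Fact]:
    """Return every capital_of fact that the single rule finds in text."""
    return {
        Fact(match.group(1), "capital_of", match.group(2))
        for match in CAPITAL_PATTERN.finditer(text)
    }


# Sample unstructured input (made up for this example).
text = (
    "Paris is the capital of France. "
    "Mumbai is a large city. "
    "New Delhi is the capital of India."
)

kb = extract_facts(text)
for fact in sorted(kb):
    print(fact.subject, fact.predicate, fact.obj)
```

A single rule like this misses most phrasings of the same fact, which is one way to see why manual or rule-based construction scales poorly and why learned extractors are attractive.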

Keywords

Knowledge Base Construction, Information Extraction, Text Processing, Knowledge Graph, Machine Learning




Authors

Devesh Rajadhyax
Founder and CEO, Cere Labs Pvt. Ltd., Maharashtra, India

