A knowledge base is a technology used to store complex structured and unstructured information used by a computer system. The initial use of the term was in connection with expert systems, which were the first knowledge-based systems.
The original use of the term knowledge-base was to describe one of the two sub-systems of a knowledge-based system. A knowledge-based system consists of a knowledge-base that represents facts about the world and an inference engine that can reason about those facts and use rules and other forms of logic to deduce new facts or highlight inconsistencies.
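The split between a store of facts and an engine that reasons over them can be sketched in a few lines of Python. This is only an illustrative toy, not how any particular expert system was built: the `Rule` type, the `infer` and `inconsistencies` functions, and the sample facts are all assumptions made for the example, and the reasoning strategy shown (simple forward chaining over one-argument predicates) is just one of many possible inference methods.

```python
from typing import NamedTuple

# A fact is (predicate, subject), e.g. ("human", "socrates").
Fact = tuple[str, str]


class Rule(NamedTuple):
    """If `premise` holds for a subject, `conclusion` holds for it too."""
    premise: str
    conclusion: str


def infer(facts: set[Fact], rules: list[Rule]) -> set[Fact]:
    """Forward chaining: keep applying rules until no new facts appear."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            # Iterate over a copy because `known` grows inside the loop.
            for predicate, subject in list(known):
                derived = (rule.conclusion, subject)
                if predicate == rule.premise and derived not in known:
                    known.add(derived)
                    changed = True
    return known


def inconsistencies(facts: set[Fact], exclusive: list[tuple[str, str]]) -> set[str]:
    """Return subjects for which two mutually exclusive predicates both hold."""
    return {
        subject
        for a, b in exclusive
        for predicate, subject in facts
        if predicate == a and (b, subject) in facts
    }


# The knowledge base: facts about the world (hypothetical sample data).
knowledge_base = {("human", "socrates"), ("human", "zeus"), ("immortal", "zeus")}
# The rules the inference engine applies to those facts.
rules = [Rule("human", "mortal")]

derived = infer(knowledge_base, rules)
print(("mortal", "socrates") in derived)                   # True
print(inconsistencies(derived, [("mortal", "immortal")]))  # {'zeus'}
```

The facts and rules are plain data, while `infer` and `inconsistencies` know nothing about humans or mortality; that separation is the point of the two sub-systems.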
The term "knowledge-base" was coined to distinguish this form of knowledge store from the more common and widely used term database. At the time (the 1970s) virtually all large Management Information Systems stored their data in some type of hierarchical or relational database. At this point in the history of Information Technology the distinction between a database and a knowledge base was clear and unambiguous. A database had these properties:
- Flat data. Data was usually represented in a tabular format with strings or numbers in each field.
- Multiple users. A conventional database needed to support more than one user or system logged into the same data at the same time.
- Transactions. An essential requirement for a database was to maintain integrity and consistency among data accessed by concurrent users. These are the so-called ACID properties: Atomicity, Consistency, Isolation, and Durability.
- Large, long-lived data. A corporate database needed to support not just thousands but hundreds of thousands or more rows of data. Such a database usually needed to persist past the specific uses of any individual program; it needed to store data for years and decades rather than for the life of a program.
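As a rough illustration of the transactional property in the list above, the following sketch uses Python's standard `sqlite3` module. The `accounts` table, the `transfer` and `balances` helpers, and the sample balances are hypothetical; the example shows only atomicity within a single connection, not concurrent access by several users.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Flat, tabular data: one string field and one numeric field per row.
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()


def transfer(conn, source, target, amount):
    """Move `amount` between accounts; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance.
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


# The credit to bob succeeds first, then the debit from alice violates the
# CHECK constraint, so the whole transaction is rolled back.
print(transfer(conn, "alice", "bob", 500))  # False
print(balances(conn))                       # {'alice': 100, 'bob': 50}
```

The failed transfer leaves no partial update behind, which is the integrity guarantee the early expert systems generally did without.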
The first knowledge-based systems had data needs that were the opposite of these database requirements. An expert system requires structured data: not just tables with numbers and strings, but pointers to other objects that in turn have additional pointers. The ideal representation for a knowledge base is an object model (often called an ontology in artificial intelligence literature) with classes, subclasses, and instances.
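A minimal sketch of such an object model, assuming a toy medical domain, might look like the following. The `Concept` and `Instance` types, the `is_a` method, and every name in the sample data are invented for illustration; real ontology languages and tools are considerably richer.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Concept:
    """A class in the ontology, optionally a subclass of another concept."""
    name: str
    parent: Concept | None = None

    def is_a(self, other: Concept) -> bool:
        """True if this concept is `other` or one of its descendants."""
        node = self
        while node is not None:
            if node is other:
                return True
            node = node.parent
        return False


@dataclass
class Instance:
    """An individual that belongs to a concept and points to other instances."""
    name: str
    concept: Concept
    links: dict[str, Instance] = field(default_factory=dict)


# Classes and subclasses (hypothetical sample ontology).
disease = Concept("Disease")
infection = Concept("Infection", parent=disease)
bacterial_infection = Concept("BacterialInfection", parent=infection)
drug = Concept("Drug")
antibiotic = Concept("Antibiotic", parent=drug)

# Instances, linked by pointers rather than by strings or numbers in a table.
penicillin = Instance("penicillin", antibiotic)
strep_throat = Instance("strep throat", bacterial_infection,
                        links={"treated_by": penicillin})

# Follow a pointer, then walk up the class hierarchy.
treatment = strep_throat.links["treated_by"]
print(treatment.name)                      # penicillin
print(treatment.concept.is_a(drug))        # True
print(strep_throat.concept.is_a(disease))  # True
```

Answering "is the treatment for strep throat a drug?" means following a pointer from one object to another and then climbing a subclass chain, which is awkward to express as flat rows of strings and numbers.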
Early expert systems also had little need for multiple users or the complexity that comes with requiring transactional properties on data. The data for the early expert systems was used to arrive at a specific answer, such as a medical diagnosis, the design of a molecule, or a response to an emergency. Once the solution to the problem was known, there was no critical demand to store large amounts of data back to a permanent memory store. More precisely, given the technologies available, researchers compromised and did without these capabilities: they realized that such capabilities were beyond what could be expected at the time, and that they could develop useful solutions to non-trivial problems without them. Even from the beginning, the more astute researchers realized the potential benefits of being able to store, analyze, and reuse knowledge. For example, see the discussion of Corporate Memory in the earliest work of the Knowledge-Based Software Assistant program by Cordell Green et al.
The volume requirements were also different for a knowledge-base compared to a conventional database. The knowledge-base needed to know facts about the world. For example, it might need to represent the statement that "All humans are mortal". A database typically could not represent this general knowledge but instead would need to store thousands of records in tables that represented information about specific humans. Representing that all humans are mortal and being able to reason about any given human that they are mortal is the work of a knowledge-base. Representing that George, Mary, Sam, Jenna, Mike,... and hundreds of thousands of other customers are all humans with specific ages, sex, address, etc. is the work of a database.
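The contrast in volume can be made concrete with a small sketch. The `customers` rows, the `general_knowledge` mapping, and the `implied_properties` helper are all hypothetical; the point is only that the general statement is stored once, however many specific rows exist.

```python
# Specific facts: the kind of rows a database holds (hypothetical sample data).
customers = [
    {"name": "George", "kind": "human", "age": 42},
    {"name": "Mary", "kind": "human", "age": 35},
]

# General knowledge: "All humans are mortal", stated once.
general_knowledge = {"human": {"mortal"}}


def implied_properties(name, rows, knowledge):
    """Look up a row by name and return the properties implied by its kind."""
    for row in rows:
        if row["name"] == name:
            return knowledge.get(row["kind"], set())
    raise KeyError(name)


print(implied_properties("George", customers, general_knowledge))  # {'mortal'}

# Adding another customer adds a row, but no new general knowledge.
customers.append({"name": "Sam", "kind": "human", "age": 29})
print(implied_properties("Sam", customers, general_knowledge))     # {'mortal'}
```

The list of rows grows with the number of customers, while the general knowledge stays the same size.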
As expert systems moved from being prototypes to systems deployed in corporate environments, the requirements for their data storage rapidly started to overlap with the standard database requirements for multiple, distributed users with support for transactions. Initially, the demand could be seen in two different but competing markets. From the AI and object-oriented communities, object-oriented databases such as Versant emerged. These were systems designed from the ground up to support object-oriented capabilities while also providing standard database services. On the other hand, the large database vendors such as Oracle added capabilities to their products that provided support for knowledge-base requirements such as class-subclass relations and rules.
The next evolution for the term knowledge-base came with the Internet. With the rise of the Internet, support for documents, hypertext, and multimedia was now critical for any corporate database. It was no longer enough to support large tables of data or relatively small objects that lived primarily in computer memory. Support for corporate web sites required persistence and transactions for documents. This created a whole new discipline known as Web Content Management. The other driver for document support was the rise of knowledge management products such as Lotus Notes. Knowledge management actually predated the Internet, but with the Internet there was great synergy between the two areas. Knowledge management products adopted the term "knowledge-base" to describe their repositories, but the meaning had a subtle difference. In the case of previous knowledge-based systems the knowledge was primarily for the use of an automated system, to reason about and draw conclusions about the world. With knowledge management products the knowledge was primarily meant for humans, for example to serve as a repository of manuals, procedures, policies, best practices, reusable designs and code, etc. Of course, in both cases the distinctions between the uses and kinds of systems were ill-defined. As the technology scaled up it became rare to find a system that could be cleanly classified as either knowledge-based in the sense of an expert system that performed automated reasoning or knowledge-based in the sense of knowledge management that provided knowledge in the form of documents and media that could be leveraged by humans.