An AI tool for the real world
By Holger Knublauch
Knowledge modeling with Protégé
Knowledge about the application domain is one of the most important cornerstones of successful software projects. You must gather at least a basic understanding of the concepts relevant to your customers before you can begin coding. For example, you need to know how your customer's business processes work before you can develop a warehouse management system; you need to know that users who buy cat food might also be interested in cat litter before you can implement purchase recommendations for an online shop; and you need to know that a Quillflinger is a monster that flings quills before you develop a role-playing game.
We acquire such knowledge from domain experts and capture it in some kind of domain model. In simple cases, we can scribble these models on paper. This approach works fine for small projects and when the experts help us decipher their handwriting. But it's better to have models that directly translate into a Java program. For instance, we can use Unified Modeling Language (UML) to sketch the domain models with class diagrams and use cases. UML is quite good for quickly getting to an implementation, but it is basically a language for object-oriented programming that few domain experts fully understand. And it consists of a fixed set of modeling constructs (such as classes and attributes) that are not very useful when domain experts would rather talk about specific business processes, products, and monsters.
If you want to more closely involve your experts and customers in the development process, you need more than UML. In this article, you will learn how to use Protégé, a simple yet powerful tool optimized for building domain models. Although Protégé was originally developed 15 years ago to support knowledge acquisition for rather specialized medical expert systems, it has also become very popular for many other purposes. Protégé is open source and currently has more than 7,500 registered users.
In a nutshell, you can use Protégé for the following:
Class modeling. Protégé provides a graphical user interface (GUI) that models classes (domain concepts) and their attributes and relationships.
Instance editing. From these classes, Protégé automatically generates interactive forms that enable you or domain experts to enter valid instances.
Model processing. Protégé has a library of plug-ins that help you define semantics, ask queries, and define logical behavior.
Model exchange. The resulting models (classes and instances) can be loaded and saved in various formats, including XML, UML, and RDF (Resource Description Framework). Protégé also provides a very scalable database back end.
From a programmer's perspective, one of Protégé's most attractive features is that it provides an open source API to plug in your own Java components and access the domain models from your application. As a result, you can develop systems very rapidly: just start with the underlying domain model, let Protégé generate the basic user interface, and then gradually write widgets and plug-ins to customize look-and-feel and behavior. You can even give Protégé to your customers and, with little training, let them build their own knowledge and requirement models.
Get started
I walk you through an example project to demonstrate how Protégé works and what else you can do with it. You can download all relevant files for this project from Resources and play with the tool while you read.
Let's assume our task is to develop a system that helps manage the articles and authors for an online magazine like JavaWorld. Articles are categorized by means of a topical index, consisting of keywords like "Swing" or "Design Patterns." Our system uses this index to propose related articles to the magazine's readers. The readers can provide feedback on the articles and rate their quality. The system uses this information to help editors decide whether submitted articles are worth publishing. This decision might depend on the ratings that previous articles by the author have received and whether articles with related topics have been recently published.
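To make the "related articles" idea concrete, here is a minimal Java sketch of a topical index that proposes articles sharing at least one keyword. The class name TopicIndexSketch and its methods are hypothetical illustrations; they are not part of Protégé or of the example project.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: none of these names come from Protégé or the example project.
class TopicIndexSketch {
    // Maps an article title to its keywords from the topical index.
    private final Map<String, Set<String>> topicsByArticle = new LinkedHashMap<>();

    void addArticle(String title, String... topics) {
        topicsByArticle.put(title, new HashSet<>(Arrays.asList(topics)));
    }

    // Returns the other articles that share at least one keyword with the
    // given article, in the order in which they were added.
    List<String> relatedTo(String title) {
        List<String> related = new ArrayList<>();
        Set<String> topics = topicsByArticle.get(title);
        if (topics == null) {
            return related; // unknown article: nothing to propose
        }
        for (Map.Entry<String, Set<String>> entry : topicsByArticle.entrySet()) {
            if (entry.getKey().equals(title)) {
                continue; // never propose the article itself
            }
            for (String topic : entry.getValue()) {
                if (topics.contains(topic)) {
                    related.add(entry.getKey());
                    break;
                }
            }
        }
        return related;
    }
}
```

A real system would likely also weigh reader ratings and publication dates, as described above; this sketch covers only the keyword match.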
Install Protégé
Protégé is the result of various artificial intelligence (AI) and knowledge-modeling projects from the Medical Informatics group at Stanford University. The Protégé Website provides documentation, tutorials, and an extremely active discussion list. You can report problems and find a plug-in library, a collection of domain models, and the Protégé software.
Installers for all major platforms are available on Protégé's download page. To run Protégé (version 1.8), you need a Java 2 Platform, Standard Edition (J2SE) virtual machine (version 1.3 or above). You can choose to automatically install a suitable virtual machine from the Website. For this tutorial, don't forget to download the example project and extract it into a folder such as the examples folder from your Protégé installation.
When you start Protégé, the Welcome screen lets you choose to open an existing project or create a new one. Click on "Open other..." and select the Online Magazines.pprj project.
Protégé's main window consists of tabs that display the knowledge model's various aspects. You will see later that you can add additional tabs from a library or even develop your own tab components and plug them into Protégé.
Classes and slots
The most important tab when you start a project is the Classes tab, shown in Figure 1. In Protégé and many other knowledge-modeling tools, classes are named concepts from the domain that can have attributes and relations. Protégé classes are comparable to Java or UML classes, but without attached methods. Classes can be arranged in an inheritance hierarchy, which is displayed in the tree panel in the left part of the Classes tab. The properties of the class selected in the tree are displayed in the Classes tab's main area. Protégé supports multiple inheritance, and each class is either abstract or concrete. As in Java, only concrete classes can have instances.
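The following Java sketch mimics these rules in plain code: a class without methods, any number of superclasses, and instantiation allowed only for concrete classes. ModelClass and the class names used with it are assumptions made for illustration; this is not how Protégé represents classes internally.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of a knowledge-model class; not the Protégé API.
class ModelClass {
    final String name;
    final boolean isAbstract;
    final List<ModelClass> superclasses; // more than one entry = multiple inheritance

    ModelClass(String name, boolean isAbstract, ModelClass... superclasses) {
        this.name = name;
        this.isAbstract = isAbstract;
        this.superclasses = Arrays.asList(superclasses);
    }

    // Collects the direct and inherited superclasses of this class.
    Set<ModelClass> allSuperclasses() {
        Set<ModelClass> result = new LinkedHashSet<>();
        for (ModelClass parent : superclasses) {
            result.add(parent);
            result.addAll(parent.allSuperclasses());
        }
        return result;
    }

    // Only concrete classes may have instances; the instance is represented
    // here by a simple label.
    String newInstance(String id) {
        if (isAbstract) {
            throw new IllegalStateException(name + " is abstract");
        }
        return name + "#" + id;
    }
}
```

For example, a concrete Article class could inherit from two abstract classes at once, which Java itself allows only through interfaces.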
The example project (see Figure 1) has defined classes for various content types (e.g., Articles and Tips 'N Tricks), authors, readers, feedback, and a topic hierarchy used to categorize content.
Figure 1. Protégé's class editor.
In Protégé, classes' attributes and their relations are called slots. A slot has a name and a value type. Protégé supports the primitive value types boolean, integer, float, and string, which are handled like they are in Java. For example, you can define the class Person and assign a slot called name to it with string as the value type. Additionally, a value type called symbol can represent enumerations of string values (e.g., the 12 different month names). Apart from primitive values, slots can also refer to the model's instances and classes. You can use slots to build relationships and associations between instances, such as between articles and their author(s). Slots store either single or multiple values.
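As a rough illustration of value types, the Java sketch below checks whether a value fits a slot. SlotSketch and its nested ValueType enum are hypothetical names chosen for this example; the actual Protégé API is organized differently, and references to instances and classes are deliberately left out.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of a slot with a value type; not the Protégé API.
class SlotSketch {
    enum ValueType { BOOLEAN, INTEGER, FLOAT, STRING, SYMBOL, INSTANCE, CLASS }

    final String name;
    final ValueType type;
    final Set<String> allowedSymbols; // only consulted for SYMBOL slots

    SlotSketch(String name, ValueType type, String... allowedSymbols) {
        this.name = name;
        this.type = type;
        this.allowedSymbols = new HashSet<>(Arrays.asList(allowedSymbols));
    }

    // Checks whether a single value fits the slot's value type.
    boolean accepts(Object value) {
        switch (type) {
            case BOOLEAN: return value instanceof Boolean;
            case INTEGER: return value instanceof Integer;
            case FLOAT:   return value instanceof Float || value instanceof Double;
            case STRING:  return value instanceof String;
            case SYMBOL:  return value instanceof String && allowedSymbols.contains(value);
            default:      return false; // INSTANCE and CLASS references are not modeled here
        }
    }
}
```

A name slot of type STRING accepts any string, whereas a month slot of type SYMBOL accepts only the enumerated month names.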
To define a slot for your class, click on the C button above the list of template slots in the Classes tab. This action opens a dialog, shown in Figure 2. If you want an overview of all existing slots in your model, switch to the Slots tab.
Figure 2. Slots are attributes or relationships between classes. The authors slot stores the list of authors.
From what we've seen so far, slots are very similar to conventional object-oriented attributes and relations. However, some important details make slot definitions richer than most object-oriented concepts. A main difference is that a slot can attach to multiple classes. In our magazine project, some but not all Contents subclasses can have subtitles, so we can define a slot subTitle and simultaneously assign it to multiple classes.
Another major difference is that you can specify constraints on slot values. Constraints restrict a slot's range of allowed values. One such constraint is cardinality: you can specify the minimum and maximum number of values a slot holds. This is similar to UML, where you can define cardinalities like [0..1] or [0..*]. Protégé also allows you to define inverse slots and default values for slots. Furthermore, you can restrict the range of numeric slots (integer and float) by minimum and maximum values. All these constraints help you build correct domain models, because Protégé can display an instance's invalid values.
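A minimal Java sketch of cardinality and numeric range checks might look as follows. NumericSlotConstraint is a hypothetical class, and the rating slot used in the usage note is an assumed example; Protégé performs such checks itself, so this only illustrates the idea.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of slot constraints; not the Protégé API.
class NumericSlotConstraint {
    final String slotName;
    final int minCardinality;
    final int maxCardinality; // use Integer.MAX_VALUE for an unbounded "*"
    final double minValue;
    final double maxValue;

    NumericSlotConstraint(String slotName, int minCardinality, int maxCardinality,
                          double minValue, double maxValue) {
        this.slotName = slotName;
        this.minCardinality = minCardinality;
        this.maxCardinality = maxCardinality;
        this.minValue = minValue;
        this.maxValue = maxValue;
    }

    // Returns readable violations; an empty list means the values are valid.
    List<String> validate(List<? extends Number> values) {
        List<String> problems = new ArrayList<>();
        if (values.size() < minCardinality) {
            problems.add(slotName + ": too few values");
        }
        if (values.size() > maxCardinality) {
            problems.add(slotName + ": too many values");
        }
        for (Number value : values) {
            double d = value.doubleValue();
            if (d < minValue || d > maxValue) {
                problems.add(slotName + ": " + value + " is out of range");
            }
        }
        return problems;
    }
}
```

With `new NumericSlotConstraint("rating", 1, 1, 1, 5)`, a single rating of 3 passes, while a missing rating or a rating of 7 each produce one violation.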
Protégé slots are global objects (i.e., they can exist even without being assigned to a class). You can define their properties either globally or individually for each class to which the slot is assigned. For that purpose, Protégé allows you to override a slot's properties, so you can separately define value type, cardinality, and more for each class. You see the difference when you double-click on a slot in the Classes tab, where Protégé asks if you want to see the "top-level slot" or the "slot at class."
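The distinction between a top-level slot and a slot at a class can be sketched in Java as a global default plus per-class overrides. OverridableSlot and the class names in the usage note are hypothetical, and only the maximum cardinality is modeled here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a global slot with per-class overrides; not the Protégé API.
class OverridableSlot {
    final String name;
    final int topLevelMaxCardinality; // the globally defined property
    private final Map<String, Integer> maxCardinalityAtClass = new HashMap<>();

    OverridableSlot(String name, int topLevelMaxCardinality) {
        this.name = name;
        this.topLevelMaxCardinality = topLevelMaxCardinality;
    }

    // Overrides the maximum cardinality for one class only.
    void overrideAt(String className, int maxCardinality) {
        maxCardinalityAtClass.put(className, maxCardinality);
    }

    // Returns the override for the class if one exists, else the top-level value.
    int maxCardinalityAt(String className) {
        Integer override = maxCardinalityAtClass.get(className);
        return override != null ? override : topLevelMaxCardinality;
    }
}
```

For instance, an authors slot could allow several values at the top level but be restricted to a single value at one particular class, while every other class keeps the global setting.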
The slot restrictions mentioned so far ensure that the model's instances fulfill simple constraints. For more complex constraints, Protégé has a built-in language called Protégé Axiom Language (PAL). PAL is similar to the Object Constraint Language (OCL) in UML. In the example project, PAL tells Protégé that no online magazine reader can review the same article more than once. Although PAL may look unusual at first glance, it is actually very powerful. Besides PAL, Protégé has extensions such as the JessTab (see below) that also express constraints and other kinds of "meaning."
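To show what the review constraint means, here is the same rule expressed as a plain Java check rather than in PAL. This is not PAL syntax, and ReviewConstraintSketch, Feedback, and their fields are hypothetical names that do not come from the example project.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: the constraint "no reader reviews the same article
// more than once", written in plain Java instead of PAL.
class ReviewConstraintSketch {
    static class Feedback {
        final String reader;
        final String article;

        Feedback(String reader, String article) {
            this.reader = reader;
            this.article = article;
        }
    }

    // Returns each reader/article pair that occurs more than once, reported once.
    // The " -> " key format assumes that names do not contain that sequence.
    static List<String> duplicateReviews(List<Feedback> feedback) {
        Set<String> seen = new HashSet<>();
        List<String> duplicates = new ArrayList<>();
        for (Feedback f : feedback) {
            String key = f.reader + " -> " + f.article;
            if (!seen.add(key) && !duplicates.contains(key)) {
                duplicates.add(key);
            }
        }
        return duplicates;
    }
}
```

An empty result means the constraint holds for all feedback instances; a declarative language like PAL lets you state the rule once in the model instead of coding such a loop in every application.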
Instances and forms
Now that the classes that describe our domain's concepts and their restrictions have been defined, you can use Protégé's Instances tab to define these classes' instances. Like in Java, instances are specific entities of a given class, such as a specific Article . Protégé tremendously he