Full-Text Specialty Data Store User's Guide
Chapter 4 Setting Up Verity Functions
Creating Topics (Enhanced Version Only)
This section provides a condensed overview of Verity topics. Topics are discussed in detail in Chapter 8, "Verity Topics."
A topic is a grouping of information related to a concept or subject area. With topic definitions in place, a user can search on the topic instead of writing queries with complex syntax.
You create topics, which can combine words and phrases, Verity operators and modifiers, and weight values. Any user can then query the topic.
Before you create topics, determine your application requirements, and establish standards for naming conventions and for the location of the following:
Outline files - contain the topic definitions. Each topic has its own outline file.
Topic set directories - contain the compiled topics. Each topic has its own topic set directory.
Knowledge base map file - contains pointers to the topic set directories.
To implement topics, perform the following steps:
Create one or more outline input files to define your topics (see "Creating an Outline File"). Each outline file is used to populate one topic set.
Create and populate a topic set directory, using the mktopics utility (see "Creating a Topic Set Directory"). Each topic set directory is populated based on one topic outline input file.
Create a knowledge base map, specifying the locations of one or more topic set directories (see "Creating a Knowledge Base Map").
Set the knowledge_base configuration parameter to point to the location of the knowledge base map (see "Defining the Location of the Knowledge Base Map").
Execute queries against defined topics.
The following sample files illustrate the topics feature:
sample_text_topics.otl is a sample outline file
sample_text_topics.kbm is a sample knowledge base map
sample_text_topics.sql issues queries using defined topics
These files are in the $SYBASE/$SYBASE_FTS/sample/scripts directory.
Creating an Outline File
A topic outline file specifies all the combinations of words and phrases, Verity operators and modifiers, and weight values that the search engine uses when you issue a query using the topic. The outline file is an ASCII text file in a structured format.
For example, the following outline file defines the topic "saint-bernard":
$control: 1
saint-bernard <accrue>
*0.80 "Saint Bernard"
*0.80 "St. Bernard"
* "working dogs"
* "large dogs"
* "European breeds"
$$
When you issue a query specifying the topic "saint-bernard", the Full-Text Search engine:
Returns documents that contain one or more of the following phrases: "Saint Bernard," "St. Bernard," "working dogs," "large dogs," and "European breeds"
Scores documents that contain the phrase "Saint Bernard" or "St. Bernard" higher than documents that contain the phrase "working dogs," "large dogs," or "European breeds"
This example is a very basic topic definition. An outline can introduce more complex relationships by using:
Multiple levels of subtopics
Combinations of Verity operators (this example uses accrue)
Verity modifiers
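If you maintain many simple topics, you might generate their outline files rather than type them by hand. The following Python sketch renders the basic "saint-bernard" outline shown earlier. The function name render_outline is hypothetical, and the sketch assumes a single-level topic laid out with one entry per line and weights written with two decimal places, as in the example. It does not cover subtopics or modifiers.

```python
def render_outline(topic, operator, phrases):
    """Return outline-file text for one single-level topic.

    phrases is a list of (phrase, weight) pairs; a weight of None
    produces an unweighted entry.
    """
    lines = ["$control: 1", f"{topic} <{operator}>"]
    for phrase, weight in phrases:
        if weight is None:
            lines.append(f'* "{phrase}"')
        else:
            # Two decimal places, matching the *0.80 form in the example.
            lines.append(f'*{weight:.2f} "{phrase}"')
    lines.append("$$")
    return "\n".join(lines) + "\n"


outline = render_outline(
    "saint-bernard",
    "accrue",
    [
        ("Saint Bernard", 0.80),
        ("St. Bernard", 0.80),
        ("working dogs", None),
        ("large dogs", None),
        ("European breeds", None),
    ],
)
print(outline, end="")
```

You could write the returned text to a file such as saint-bernard.otl before running mktopics.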
On Windows NT, you can create topic outlines with the graphical user interface of the Verity Intelligent Classifier product, which is available from Verity. Intelligent Classifier automatically creates a topic set directory, so you can go directly to "Creating a Knowledge Base Map" to continue setting up your topics.
Creating a Topic Set Directory
Use the mktopics utility to create and populate a topic set directory. It is located in:
$SYBASE/$SYBASE_FTS/verity/bin
Run, or define an alias to run, mktopics from this bin directory. You can create a topic set directory or directories in any work directory.
The mktopics syntax is:
mktopics -outline outline_file.otl -topicset topic_set_directory
where:
outline_file - is the name of the outline file you created in "Creating an Outline File"
topic_set_directory - is the name of the topic set directory you are creating
For example, to execute the mktopics utility reading the saint-bernard.otl file defined above, and directing output to a work directory, use the syntax:
mktopics -outline /usr/u/sybase/topic_outlines/saint-bernard.otl -topicset /usr/u/sybase/topic_sets/saint-bernard_topic
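If you script your topic builds, the same invocation can be assembled as an argument list. The sketch below is illustrative only: the helper names mktopics_command and run_mktopics are hypothetical, it uses only the -outline and -topicset options shown above, and it assumes that mktopics reports failure through a non-zero exit status. The example builds and prints the command without running it.

```python
import os
import subprocess


def mktopics_command(bin_dir, outline_file, topic_set_dir):
    """Build the mktopics argument list; nothing is executed here."""
    executable = os.path.join(bin_dir, "mktopics")
    return [executable, "-outline", outline_file, "-topicset", topic_set_dir]


def run_mktopics(command):
    """Run the command; raises CalledProcessError on a non-zero exit status."""
    subprocess.run(command, check=True)


# expandvars leaves $SYBASE and $SYBASE_FTS untouched if they are not set.
bin_dir = os.path.expandvars("$SYBASE/$SYBASE_FTS/verity/bin")
command = mktopics_command(
    bin_dir,
    "/usr/u/sybase/topic_outlines/saint-bernard.otl",
    "/usr/u/sybase/topic_sets/saint-bernard_topic",
)
print(" ".join(command))
```

Call run_mktopics(command) on a host where the utility is installed to create the topic set directory.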
Creating a Knowledge Base Map
A knowledge base map specifies the locations of one or more topic set directories. Create an ASCII knowledge base map file that defines the fully qualified directory paths to your topic sets.
For example, the following knowledge base map file illustrates how you can list multiple knowledge bases in the map. The first entry identifies the topic set directory created with mktopics above.
$control:1
kbases:
{
kb:
/kb-path = /usr/u/sybase/topic_sets/saint-bernard_topic
kb:
/kb-path = /usr/u/sybase/topic_sets/another_topic
}
Defining the Location of the Knowledge Base Map
Set the knowledge_base configuration parameter to point to the location of the knowledge base map. For example:
sp_text_configure KRAZYKAT, 'knowledge_base', '/usr/u/sybase/topic_sets/sample_text_topics.kbm'
The knowledge_base configuration parameter is static, and you must restart the Full-Text Search engine for the definition to take effect.
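The map file and the configuration command can both be produced from a list of topic set directories. The following sketch assumes the map layout shown in the example above. The function names render_kbm and configure_statement are hypothetical, and the server name KRAZYKAT is taken from the example.

```python
def render_kbm(topic_set_dirs):
    """Return knowledge base map text with one kb entry per directory."""
    lines = ["$control:1", "kbases:", "{"]
    for path in topic_set_dirs:
        lines.append("kb:")
        lines.append(f"/kb-path = {path}")
    lines.append("}")
    return "\n".join(lines) + "\n"


def configure_statement(server, kbm_path):
    """Return the sp_text_configure call that points at the map file."""
    return f"sp_text_configure {server}, 'knowledge_base', '{kbm_path}'"


kbm_text = render_kbm(
    [
        "/usr/u/sybase/topic_sets/saint-bernard_topic",
        "/usr/u/sybase/topic_sets/another_topic",
    ]
)
statement = configure_statement(
    "KRAZYKAT", "/usr/u/sybase/topic_sets/sample_text_topics.kbm"
)
print(kbm_text, end="")
print(statement)
```

The sketch only builds text. You still need to save the map file, run the statement, and restart the Full-Text Search engine.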
You can now execute queries using the defined topic instead of a complex query. For example, before you create the "saint-bernard" topic, you would have to use the following syntax:
...where i.index_any = "<accrue> ([80]Saint Bernard, [80]St. Bernard, working dogs, large dogs, European breeds)"
to:
Find documents that contain one or more of the following phrases: "Saint Bernard," "St. Bernard," "working dogs," "large dogs," and "European breeds"
Score documents containing the phrase "Saint Bernard" or "St. Bernard" higher than documents containing the phrase "working dogs," "large dogs," or "European breeds"
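To see why a topic is more convenient, consider what it takes to assemble the long form each time. The sketch below builds the accrue expression from weighted phrases. The function name accrue_expression is hypothetical, and the mapping of a 0.80 weight to the [80] prefix is inferred from comparing this query with the outline file example, not taken from a Verity specification.

```python
def accrue_expression(phrases):
    """Build an <accrue> query expression from (phrase, weight) pairs.

    A weight of None leaves the phrase unweighted; otherwise the weight
    is written as a bracketed percentage, so 0.80 becomes [80].
    """
    terms = []
    for phrase, weight in phrases:
        if weight is None:
            terms.append(phrase)
        else:
            terms.append(f"[{round(weight * 100)}]{phrase}")
    return "<accrue> (" + ", ".join(terms) + ")"


expression = accrue_expression(
    [
        ("Saint Bernard", 0.80),
        ("St. Bernard", 0.80),
        ("working dogs", None),
        ("large dogs", None),
        ("European breeds", None),
    ]
)
print(expression)
```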
After you create the topic "saint-bernard", you can use this syntax:
...where i.index_any = "<topic>saint-bernard"
or:
...where i.index_any = "saint bernard"
If you enter a word in a query expression, the Full-Text Search engine tries to match it with a topic name. If you enter a phrase in a query expression, the Full-Text Search engine replaces spaces with hyphens (-), and then tries to match it with a topic name. For example, the Full-Text Search engine matches "saint bernard" with the topic "saint-bernard".
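The matching rule described above can be illustrated with a few lines of Python. This is only a model of the documented behavior, not the engine's implementation. The function name resolve_topic is hypothetical, and the sketch assumes an exact, case-sensitive comparison, which the text does not specify.

```python
def resolve_topic(expression, topic_names):
    """Return the topic name that a word or phrase maps to, or None.

    Spaces in a phrase are replaced with hyphens before the lookup.
    """
    candidate = expression.strip().replace(" ", "-")
    return candidate if candidate in topic_names else None


topics = {"saint-bernard"}
print(resolve_topic("saint bernard", topics))
print(resolve_topic("saint-bernard", topics))
print(resolve_topic("poodle", topics))
```

The first two calls return "saint-bernard". The third returns None because no topic of that name is defined.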
See the sample_text_topics.sql file for examples of using topics in queries.
If the knowledge_base configuration parameter specifies a knowledge base map file that does not exist, the Full-Text Search engine cannot start a session with Verity, and the server does not start. If the map file exists but contains invalid entries, Verity issues warning messages at start-up. You can correct errors by editing the <textserver>.cfg file in the $SYBASE directory, where you can fix the path information on the line beginning with "knowledge_base=".
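Before restarting, you can check that the configured map file exists. The following sketch reads the knowledge_base line from the configuration text and tests the path. The function names are hypothetical, and the sketch assumes the file holds one name=value setting per line, which is inferred from the "knowledge_base=" line mentioned above.

```python
from pathlib import Path


def knowledge_base_path(cfg_text):
    """Return the value of the knowledge_base= line, or None if absent."""
    for line in cfg_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("knowledge_base="):
            return stripped.split("=", 1)[1].strip()
    return None


def check_knowledge_base(cfg_text):
    """Describe whether the configured map file can be found."""
    path = knowledge_base_path(cfg_text)
    if path is None:
        return "no knowledge_base line found"
    if not Path(path).is_file():
        return f"map file does not exist: {path}"
    return f"map file found: {path}"


# Hypothetical configuration contents; a real file has other settings too.
sample_cfg = "knowledge_base=/usr/u/sybase/topic_sets/sample_text_topics.kbm\n"
print(check_knowledge_base(sample_cfg))
```

In practice you would pass in the contents of your own <textserver>.cfg file, for example with Path(...).read_text().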