Text

From Oracle FAQ
Jump to: navigation, search

Oracle Text uses standard SQL to index, search, and analyze text data stored within an Oracle database, flat files and on the web. Oracle Text can also perform linguistic analysis on documents, as well as search text using a variety of strategies including keyword searching, context queries, boolean operators, pattern matching, mixed thematic queries, HTML/XML section searching, and so on.

Oracle Text is available (no extra licensing required) in Enterprise, Standard and Personal Edition.

Index types[edit]

Different text index types are provided. Do some reading before jumping into Oracle TEXT to ensure you choose the right indexing type for your needs.

CONTEXT indexes[edit]

CONTEXT type indexes are not automatically maintained and requires "synchronizing" - changes to data are not always immediately seen. This type of index is used to index documents like WORD, EXCEL, HTML, and XML.

Example:

CREATE TABLE my_text_table(
 id NUMBER PRIMARY KEY,
 title VARCHAR2(30),
 doc CLOB);

INSERT INTO my_text_table VALUES(1, 'doc1', 'Hi Sara, ...');
INSERT INTO my_text_table VALUES(2, 'doc2', 'Hi Mom');
INSERT INTO my_text_table VALUES(3, 'doc3', 'A highly recommended document');
INSERT INTO my_text_table VALUES(4, 'doc4', 'Another document');
 
CREATE INDEX my_text_index ON my_text_table(doc)
  INDEXTYPE IS ctxsys.context;

SELECT SCORE(1), title
  FROM my_text_table
 WHERE CONTAINS(doc, '%hi%', 1) > 0;

CTXCAT indexes[edit]

CTXCAT indexes is transactional in nature and are thus maintained along with any insert/updates/deletes that you do no the underlying data.

CREATE TABLE my_text_table(
 id    NUMBER PRIMARY KEY,
 title VARCHAR2(30),
 doc   VARCHAR2(30));

CREATE INDEX my_text_index ON my_text_table(doc)
  INDEXTYPE IS ctxsys.ctxcat;

INSERT INTO my_text_table VALUES(1, 'doc1', 'Hi Sara, ...');
INSERT INTO my_text_table VALUES(2, 'doc2', 'Hi Mom');
INSERT INTO my_text_table VALUES(3, 'doc3', 'A highly recommended document');
INSERT INTO my_text_table VALUES(4, 'doc4', 'Another document');

SELECT *
  FROM my_text_table
 WHERE CATSEARCH(doc,'Hi', null) > 0;

CTXRULE indexes[edit]

CTXRULE indexes is used to build document classification applications in which incoming documents are routed according to content.