The Lucene Search Engine

Doug Cutting <cutting@lucene.com>
Inktomi Seminar
16 June, 2000


Lucene is . . .


Disclaimer


Context


Architecture


Inverted Index


Some Inverted Index Strategies

  1. batch-based: use file-sorting algorithms (textbook)
  2. B-tree based: update in place (http://www.lucene.com/papers/sigir90.ps)
  3. segment based: lots of small indexes (Verity)
  4. hash-file based (Ultraseek ISTK?)
  (these strategies are not mutually exclusive; they can be combined)
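
To make strategy 3 concrete, here is a minimal sketch of a segment-based
inverted index: each batch of documents becomes its own small index, a query
consults every segment, and segments can later be merged into one. This is an
illustration only, not Lucene's or Verity's actual implementation; the names
build_segment, merge_segments and search are hypothetical, and the tokenizer
(lowercased \w+ runs) is an assumption.

```python
import re
from collections import defaultdict


def build_segment(docs):
    """Build one small inverted index (a "segment") from {doc_id: text}.

    Returns a dict mapping each term to a sorted list of doc ids.
    """
    postings = defaultdict(list)
    for doc_id in sorted(docs):
        # Assumed tokenizer: lowercase, split on non-word characters.
        terms = set(re.findall(r"\w+", docs[doc_id].lower()))
        for term in terms:
            postings[term].append(doc_id)
    return dict(postings)


def merge_segments(a, b):
    """Merge two segments into one by unioning their posting lists."""
    merged = {}
    for term in set(a) | set(b):
        merged[term] = sorted(set(a.get(term, [])) | set(b.get(term, [])))
    return merged


def search(segments, term):
    """Look a single term up in every segment and combine the hits."""
    hits = set()
    for segment in segments:
        hits.update(segment.get(term.lower(), []))
    return sorted(hits)


# Two batches of documents, indexed independently.
seg1 = build_segment({
    1: "Lucene is a search engine",
    2: "An inverted index maps terms to documents",
})
seg2 = build_segment({3: "Segments are small inverted indexes"})

print(search([seg1, seg2], "inverted"))
print(search([merge_segments(seg1, seg2)], "inverted"))
```

Searching the two segments separately and searching the merged segment return
the same documents. The trade-off in this sketch is that adding documents is
cheap (just write a new segment), while query cost grows with the number of
segments until they are merged.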

Lucene's Inverted Index Strategy


Indexing Diagram


Search Algorithms


Lucene's Disjunctive Search Algorithm


Lucene's Phrase Scoring


Future

http://www.lucene.com