Lucene and the Zend Framework

One of the most talked-about features of the Zend Framework is its port of the Apache Lucene project, a Java-based full-text search-engine framework. The port, Zend_Search_Lucene, lets PHP developers use Lucene without additional PHP extensions, Java, or even a database.

The theory is that Zend_Search_Lucene overcomes the usual limitations of relational databases with features such as:

  • Fast indexing
  • Ranked result sets
  • A powerful but simple query syntax
  • The ability to index multiple fields
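
To make the feature list above concrete, here is a minimal sketch of indexing one document with two fields and running a ranked, field-qualified query. It assumes the Zend_Search_Lucene API as it appears in the Zend Framework 1.x releases (the alpha may differ), that the framework is on the include path, and that the index path, field names, and sample text, all invented for illustration, suit your setup.

```php
<?php
// Sketch only: assumes Zend Framework 1.x is on the include path.
require_once 'Zend/Search/Lucene.php';

// Hypothetical index location; create() builds a new, empty index there.
$index = Zend_Search_Lucene::create('/tmp/example-index');

// One document, two fields (field names are illustrative).
$doc = new Zend_Search_Lucene_Document();
// Text: tokenized, indexed, and stored so it can be shown in results.
$doc->addField(Zend_Search_Lucene_Field::Text('title', 'Moby Dick'));
// UnStored: tokenized and indexed, but not kept in the index for display.
$doc->addField(Zend_Search_Lucene_Field::UnStored('contents', 'Call me Ishmael.'));

$index->addDocument($doc);
$index->commit();

// Field-qualified query; each hit carries a relevance score.
$hits = $index->find('title:moby AND contents:ishmael');
foreach ($hits as $hit) {
    printf("%.3f  %s\n", $hit->score, $hit->title);
}
```

Storing only the fields you need to display keeps the index smaller; the body text here is searchable but has to be fetched from its original source.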

Lucene is well known for its speed. For an example, have a look at DamnFastDotLucene: this demo site tests the performance of a .NET implementation of Lucene on quite a large set of documents:

  • 9150 text files from the Gutenberg Project
  • The total size of indexed documents is 3.5 GB
  • The index size is 880 MB
  • The hardware: Pentium 4 3 GHz 800/1 MB cache, 1 GB DDRII Kingston 533, Western Digital Raptor 80 GB

The result: it takes approximately the same time to search 5 MB of text as it does to search 3.5 GB of text. I was getting search times under 0.125 seconds. That is fast.

That was .NET, though. What about the PHP implementation in the Zend Framework?

The reality for PHP developers using the Zend Framework may be a little different from the hype. Some developers report that Zend_Search_Lucene is significantly slower than equivalent queries run against MySQL or PostgreSQL. See the comments on the Zend Framework mailing list for details.

To be fair, these are very early days for the Zend Framework and its Lucene port: the project is still in early alpha. However, it is already being adopted by the community for live projects.

If you want to learn more about Zend_Search_Lucene I recommend the following links:

If you have any experiences with Zend_Search_Lucene that you would like to share, I would appreciate hearing about them.
