Working with Lucene.Net
In this third part of a continuing series on working with code libraries, you will learn about Lucene.Net. This article is excerpted from chapter four of the book Windows Developer Power Tools, written by James Avery and Jim Holmes (O'Reilly; ISBN: 0596527543). Copyright © 2006 O'Reilly Media, Inc. All rights reserved. Used with permission from the publisher. Available from booksellers or direct from O'Reilly Media.
4.3 Searching Your Data Using Lucene.Net
Data is everywhere, whether it’s on the Internet, your local system, or networked hard drives. The challenge often isn’t in collecting and organizing your data but in finding it. Businesses collect data in a staggering array of formats, including Microsoft Outlook or Excel files, Access or SQL databases, PDFs, HTML files, plain old text files, and perhaps even custom application formats. That data often then gets scattered across a dizzying number of locations on different servers.
Chances are that your customers will need to deal with disparate data formats and with data stored in multiple locations. Furthermore, they will probably want to be able to exert some control over how searches are performed. Customers may want to be able to limit searches to certain keywords or to a particular set of data folders on a particular server, or to filter out information older than a particular date.
Google Desktop has made a splash by bringing this functionality to end users. You can bring the same indexing and searching capabilities into your own applications with Lucene.Net, a high-performance, scalable search engine library written in C# for the .NET Framework.
Lucene.Net at a Glance
Tool | Lucene.Net
Version covered | 1.4.3, 1.9, 1.9.1, and 2.0
Home page | http://incubator.apache.org/lucene.net/
Power Tools page | http://www.windevpowertools.com/tools/144
Summary | .NET-based search engine API for indexing and searching contents
License type | Apache License, version 2.0
Online resources | API documentation, mailing list at ASF
Supported frameworks | .NET 1.1, 2.0
Getting Started
Lucene.Net is an open source project currently under incubation at the Apache Software Foundation (ASF). The source code can be downloaded from the project’s home page as a .zip archive or checked out from the Subversion repository.
Lucene.Net requires a Microsoft C# compiler and version 1.1 or 2.0 of the .NET Framework. It works with either Microsoft Visual Studio 2003 or 2005. The source comes with a solution for Visual Studio 2003.
NUnit is required if you want to run the test code. It can be downloaded from its home page at http://www.nunit.org.
You’ll also need SharpZipLib (discussed later in this chapter) if you want to support compressed indexing in Lucene.Net versions 1.9 and 1.9.1. SharpZipLib can be downloaded from its home page at http://www.icsharpcode.net/OpenSource/SharpZipLib/.