Practical Apache Lucene 8 : uncover the search capabilities of your application /

Bibliographic Details
Author / Creator:	Sharma, Atri, author.
Imprint:	[Berkeley, CA] : Apress, [2020]
Description:	1 online resource
Language:	English
Subject:	Lucene (Electronic resource) Web search engines. Internet searching. Java (Computer program language) Application software. Computer programming. Open source software. Electronic books.
Format:	E-Resource Book
URL for this record:	http://pi.lib.uchicago.edu/1001/cat/bib/12608482

Hidden Bibliographic Details
ISBN:	9781484263457 1484263456 1484263448 9781484263440
Digital file characteristics:	text file PDF
Notes:	Includes index. Description based on online resource; title from digital title page (viewed on January 11, 2021). Print version record.
Summary:	Gain a thorough knowledge of Lucene's capabilities and use it to develop your own search applications. This book explores the Java-based, high-performance text search engine library used to build search capabilities in your applications. Starting with the basics of Lucene and searching, you will learn about the types of queries it supports and take a look at scoring models. Applying this basic knowledge, you will develop a hello world app using basic Lucene queries and explore functions like scoring and document-level boosting. Along the way you will also uncover the concepts of partial searching and matching in Lucene and then learn how to integrate geographical information (geospatial data) in Lucene using spatial queries and n-dimensional indexing. This will prepare you to build a location-aware search engine with a representative data set that allows location constraints to be specified during a search. You'll also develop a text classifier using Lucene and Apache Mahout, a popular machine learning framework. After a detailed review of performance benchmarking and the common issues associated with it, you'll learn some of the best practices for tuning the performance of your application. By the end of the book you'll be able to build your first Lucene patch, where you will not only write your patch, but also test it and ensure it adheres to community coding standards. You will: master the basics of Apache Lucene; utilize different query types in Apache Lucene; explore scoring and document-level boosting; and integrate geospatial data into your application.
Other form:	Print version: Sharma, Atri. Practical Apache Lucene 8. [Berkeley, CA] : Apress, [2020] 9781484263457
Standard no.:	10.1007/978-1-4842-6345-7

Table of Contents:

Intro
Table of Contents
About the Author
About the Technical Reviewer
Acknowledgments
Introduction
Chapter 1: Hola, Lucene!
Key Features of Lucene
Information Retrieval Basics
Linear Scan
Stop List
Stemming
Term
Term-Document Incidence Matrix
Serving Queries Using a Term-Document Incidence Matrix
Basic Terminology
Heart of Lucene's Data Representation
Lucene's Inverted Index Structure
On-Disk Representation of a Lucene Index
Terms Dictionary
Frequencies File
Positions File
Queries on Lucene
Structure of a Lucene Query
Fields
Types of Queries in Lucene
Lucene vs. Relational Databases
Chapter 2: Hello World: The Lucene Way
Indexing Data in Lucene
Document
Analyzers
StandardAnalyzer
StopAnalyzer
SimpleAnalyzer
IndexWriter
Directory
Create Documents
Create Index and Write Documents
Adding Data to the Index
Bringing It All Together
TestClass
Document Search
QueryParser
TopDocs
IndexSearcher
IndexReader
Searching
Boolean Model
What Is Relevance?
Scoring Algorithms
TF/IDF
Vector Space Model
Scoring Example
Lucene's Scoring Model
Fields
Similarity
Boosting
Collectors
Chapter 3: Core Search Fundamentals
Codecs
DocValues
Phrase Queries
Term Vectors
BooleanQuery
MultiTermQuery
QueryCache
Scorer as Part of the Search Process
Chapter 4: Spatial Indexing
Spatial Module
What Are Geohashes?
Quad Trees
K-D Trees
BKD Trees
Using Spatial Indexing
Chapter 5: Location-Aware Search Engines
Why Use a Search Engine for Geographic Searches?
Range Queries
Function Queries
Geospatial Basics
Representing Spatial Data
Tiered Design for Storage
Geohashes
Spatial Data with Text Search
Distance Calculations
Bounding Box Filter
A Point on Distance Calculation
Chapter 6: Introducing Machine Learning with Apache Mahout
Origin of Apache Mahout
Why Apache Mahout?
Introduction to Machine Learning
Learning
Collaborative Filtering
Clustering
Categorization
Converting from Lucene Components to Mahout Components
Integrating Lucene with Mahout
lucene.vector
Lucene2seq
Java Version of Lucene2seq
Putting It All Together
Chapter 7: Improving Lucene's Performance
Increase Indexing Speed
Reuse Field Instances
The Curious Case of Large Commits
Reuse Tokens in Analyzers
Tuning Flush Intervals
Increase mergeFactor
Choosing the Correct Analyzers
Use Multiple Threads with One IndexWriter
Index into Separate Indexes and Then Merge
Improve Search Performance
Use the Latest Version of Lucene
Use IndexReader with the readOnly Attribute Equal to True
Use MMapDirectory/NIOFSDirectory
Decrease mergeFactor
Ignore First Query's Performance
Avoid Reopening IndexSearcher Instances
Share IndexSearcher Instances
