re: notes on distributed searching with lucene

RE: multithreading in SegmentsReader
Doug Cutting
Thu, 11 Oct 2001 11:30:00 -0700
> From: Dmitry Serebrennikov [mailto:[EMAIL PROTECTED]]
>
> But I was looking again at the MultiSearcher after reading
> through the SegmentsReader (and friends) and I was
> thinking if it wouldn't be better to write MultiSearcher
> not in terms of searching over multiple Searchers, but as
> an IndexReader that merges segments from more than one
> directory. A lot of the issues that MultiSearcher has to
> solve are also solved in the SegmentsReader, but slightly
> differently. Also, MultiSearcher has to re-implement the
> methods of Searcher (like the low level search API that
> was added recently).

Yes, there is some duplication between MultiSearcher and SegmentsReader.
The reason for keeping these separate was to support distributed searching.
Thus the Searcher API is designed to have only small bits of data pass
through it. I never actually implemented distributed searching, so this
design is somewhat half baked.

The general idea is that query terms must be passed to the searcher first to
weight the query, then, once the query is weighted, it can be sent to a set
of searchers in parallel.

To implement this, we would need to do something like:

1. Move the abstract Searcher methods to an interface:

     public interface Searchable {
       int docFreq(Term term) throws IOException;
       int maxDoc() throws IOException;
       TopDocs search(Query query, Filter filter, int n) throws IOException;
       Document doc(int i) throws IOException;
     }

2. Implement a RemoteSearcher using RMI.

3. Change MultiSearcher.search() to search each sub-index in a separate
   thread.

The low-level search API doesn't really fit in here too well.

Note that, except for the search() method, the Searchable interface is a
subset of IndexReader, so it still might make sense to somehow combine the
notions of Searcher and IndexReader. But we should keep distributed
searching in mind when this is done. If you are interested in drafting such
a re-design, I'd love to see it.

Doug
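
[Editor's sketch, not part of the original message.] Steps 1 and 3 above could look roughly like the following. This is a minimal illustration under loud assumptions: it does not use Lucene's classes. Hit, FixedSearcher, ParallelMultiSearcher and the cut-down Searchable (plain string queries, no Filter, no Document) are hypothetical stand-ins invented for the example, and the RMI piece from step 2 is left out; a remote searcher would simply be another Searchable implementation. The summed docFreq() and maxDoc() are the collection-wide numbers a caller would need in order to weight the query before fanning it out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelSearchSketch {

    /** Hypothetical stand-in for one scored hit (not a Lucene class). */
    static final class Hit {
        final int doc;
        final float score;
        Hit(int doc, float score) { this.doc = doc; this.score = score; }
    }

    /** Cut-down Searchable: string queries only, no Filter and no Document. */
    interface Searchable {
        int docFreq(String term) throws Exception;
        int maxDoc() throws Exception;
        List<Hit> search(String query, int n) throws Exception;
    }

    /** Toy sub-index returning canned hits (assumed sorted by descending score). */
    static final class FixedSearcher implements Searchable {
        private final int maxDoc;
        private final int docFreq;
        private final List<Hit> hits;
        FixedSearcher(int maxDoc, int docFreq, Hit... hits) {
            this.maxDoc = maxDoc;
            this.docFreq = docFreq;
            this.hits = Arrays.asList(hits);
        }
        public int docFreq(String term) { return docFreq; }
        public int maxDoc() { return maxDoc; }
        public List<Hit> search(String query, int n) {
            return hits.subList(0, Math.min(n, hits.size()));
        }
    }

    /** Searches every sub-index in its own thread, then merges the top n hits. */
    static final class ParallelMultiSearcher implements Searchable {
        private final List<Searchable> subs;
        ParallelMultiSearcher(List<Searchable> subs) { this.subs = subs; }

        // Collection-wide statistics are sums over the sub-indexes.
        public int docFreq(String term) throws Exception {
            int total = 0;
            for (Searchable s : subs) total += s.docFreq(term);
            return total;
        }

        public int maxDoc() throws Exception {
            int total = 0;
            for (Searchable s : subs) total += s.maxDoc();
            return total;
        }

        public List<Hit> search(final String query, final int n) throws Exception {
            ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, subs.size()));
            try {
                // Start one search task per sub-index.
                List<Future<List<Hit>>> pending = new ArrayList<Future<List<Hit>>>();
                for (final Searchable s : subs) {
                    pending.add(pool.submit(new Callable<List<Hit>>() {
                        public List<Hit> call() throws Exception { return s.search(query, n); }
                    }));
                }
                // Collect results, shifting local doc numbers into the merged numbering.
                List<Hit> merged = new ArrayList<Hit>();
                int base = 0;
                for (int i = 0; i < subs.size(); i++) {
                    for (Hit h : pending.get(i).get()) merged.add(new Hit(base + h.doc, h.score));
                    base += subs.get(i).maxDoc();
                }
                // Highest score first, then keep only the top n.
                Collections.sort(merged, new Comparator<Hit>() {
                    public int compare(Hit a, Hit b) { return Float.compare(b.score, a.score); }
                });
                return new ArrayList<Hit>(merged.subList(0, Math.min(n, merged.size())));
            } finally {
                pool.shutdown();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Searchable a = new FixedSearcher(10, 3, new Hit(4, 0.9f), new Hit(7, 0.2f));
        Searchable b = new FixedSearcher(5, 1, new Hit(2, 0.5f));
        Searchable multi = new ParallelMultiSearcher(Arrays.asList(a, b));
        for (Hit h : multi.search("lucene", 2)) System.out.println(h.doc + " " + h.score);
    }
}
```

Only small values cross the Searchable boundary here (ints, a query string, a short hit list), which is the property the message asks for so that a sub-searcher could live in another process. Creating a thread pool per call keeps the sketch short; a real implementation would more likely reuse one.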
