public final class BloomFilteringPostingsFormat
extends PostingsFormat
A PostingsFormat useful for low doc-frequency fields such as primary
keys. Bloom filters are maintained in a ".blm" file which offers "fast-fail"
for reads in segments known to have no record of the key. A choice of
delegate PostingsFormat is used to record all other Postings data.
A choice of BloomFilterFactory can be passed to tailor Bloom Filter
settings on a per-field basis. The default configuration is
DefaultBloomFilterFactory, which allocates a ~8 MB bitset and hashes
values using MurmurHash2. This should be suitable for most purposes.
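
The "fast-fail" behaviour described above can be sketched in plain Java. This is only an illustration of the idea, not the code used by this class: the name `BloomGuardedSegment` is hypothetical, a `HashMap` stands in for the delegate PostingsFormat, and FNV-1a is used in place of MurmurHash2 to keep the example self-contained.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for one segment: a bitset consulted before the term lookup. */
final class BloomGuardedSegment {
    private final BitSet bits;
    private final int numBits;
    // Assumption: a map stands in for the postings data held by the delegate format.
    private final Map<String, Integer> delegate = new HashMap<>();
    // Counts how often the (expensive) delegate lookup was actually reached.
    int delegateLookups = 0;

    BloomGuardedSegment(int numBits) {
        this.numBits = numBits;
        this.bits = new BitSet(numBits);
    }

    // 32-bit FNV-1a, used here only for brevity (the real default is MurmurHash2).
    private static int hash(String key) {
        int h = 0x811C9DC5;
        for (byte b : key.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xFF);
            h *= 0x01000193;
        }
        return h;
    }

    private int slot(String key) {
        return Math.floorMod(hash(key), numBits);
    }

    /** Records the key in both the bitset and the delegate. */
    void add(String key, int docId) {
        bits.set(slot(key));
        delegate.put(key, docId);
    }

    /** Returns the doc id for the key, or -1 if the segment has no record of it. */
    int lookup(String key) {
        if (!bits.get(slot(key))) {
            return -1; // fast-fail: the bit is clear, so the key was never added
        }
        delegateLookups++; // bit is set: may be a real hit or a false positive
        Integer docId = delegate.get(key);
        return docId == null ? -1 : docId;
    }
}
```

A clear bit proves the key is absent, so the delegate is skipped entirely. A set bit only means the key may be present, which is why the delegate is still consulted in that case.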
The format of the blm file is as follows:
- FuzzySet.serialize(DataOutput)
- IndexHeader
- String: The name of a ServiceProvider registered PostingsFormat
- Uint32
- Uint32: The number of the field in this segment
- CodecFooter

| Modifier and Type | Field and Description |
|---|---|
| static java.lang.String | BLOOM_CODEC_NAME |
| static int | VERSION_CURRENT |
| static int | VERSION_START |

| Constructor and Description |
|---|
| BloomFilteringPostingsFormat() |
| BloomFilteringPostingsFormat(PostingsFormat delegatePostingsFormat) Creates Bloom filters for a selection of fields created in the index. |
| BloomFilteringPostingsFormat(PostingsFormat delegatePostingsFormat, BloomFilterFactory bloomFilterFactory) Creates Bloom filters for a selection of fields created in the index. |

| Modifier and Type | Method and Description |
|---|---|
| FieldsConsumer | fieldsConsumer(SegmentWriteState state) |
| FieldsProducer | fieldsProducer(SegmentReadState state) |
| java.lang.String | toString() |

public static final java.lang.String BLOOM_CODEC_NAME
public static final int VERSION_START
public static final int VERSION_CURRENT

public BloomFilteringPostingsFormat(PostingsFormat delegatePostingsFormat,
BloomFilterFactory bloomFilterFactory)
delegatePostingsFormat - The PostingsFormat that records all the non-Bloom-filter data, i.e. postings info.
bloomFilterFactory - The BloomFilterFactory responsible for sizing BloomFilters appropriately.

public BloomFilteringPostingsFormat(PostingsFormat delegatePostingsFormat)
Uses DefaultBloomFilterFactory for configuring per-field BloomFilters.
delegatePostingsFormat - The PostingsFormat that records all the non-Bloom-filter data, i.e. postings info.

public BloomFilteringPostingsFormat()

public FieldsConsumer fieldsConsumer(SegmentWriteState state)
throws java.io.IOException
Throws: java.io.IOException

public FieldsProducer fieldsProducer(SegmentReadState state)
throws java.io.IOException
Throws: java.io.IOException

public java.lang.String toString()

Data In Motion GmbH all rights reserved