org.apache.lucene.index
public abstract class IndexReader extends Object
Concrete subclasses of IndexReader are usually constructed with a call to one of the static open() methods, e.g. open(String).
For efficiency, in this API documents are often referred to via document numbers, non-negative integers which each name a unique document in the index. These document numbers are ephemeral--they may change as documents are added to and deleted from an index. Clients should thus not rely on a given document having the same number between sessions.
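The effect of ephemeral document numbers can be illustrated with a toy model in plain Java. This is a sketch, not Lucene code: the class `DocNumberDemo` and its methods are hypothetical, and it simply treats a document number as a position in a list that gets compacted after a deletion.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model (not Lucene code): a document number is just a position in a list. */
class DocNumberDemo {
    /** Returns the current number of the document with the given stored id, or -1 if absent. */
    static int numberOf(List<String> index, String id) {
        return index.indexOf(id);
    }

    /** Simulates deleting a document and compacting the index, which renumbers later documents. */
    static List<String> deleteAndCompact(List<String> index, String id) {
        List<String> compacted = new ArrayList<>(index);
        compacted.remove(id); // remove(Object): drops the first matching id
        return compacted;
    }

    public static void main(String[] args) {
        List<String> index = new ArrayList<>(Arrays.asList("doc-a", "doc-b", "doc-c"));
        int before = numberOf(index, "doc-c");
        List<String> next = deleteAndCompact(index, "doc-a");
        int after = numberOf(next, "doc-c");
        System.out.println("doc-c: " + before + " -> " + after); // prints "doc-c: 2 -> 1"
    }
}
```

In this model the only stable handle on a document is its stored id, which is why a client that needs to find the same document across sessions would typically keep such an id in a field rather than remember a number.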
An IndexReader can be opened on a directory for which an IndexWriter is already open, but in that case it cannot be used to delete documents from the index.
Version: $Id: IndexReader.java 543620 2007-06-01 21:18:56Z dnaber $
Nested Class Summary | |
---|---|
static class | IndexReader.FieldOption |
Constructor Summary | |
---|---|
protected | IndexReader(Directory directory) Constructor used if IndexReader is not owner of its directory. |
Method Summary | |
---|---|
void | close() Closes files associated with this index. |
protected void | commit() Commit changes resulting from delete, undeleteAll, or setNorm operations. If an exception is hit, then either no changes or all changes will have been committed to the index (transactional semantics). |
void | deleteDocument(int docNum) Deletes the document numbered docNum. |
int | deleteDocuments(Term term) Deletes all documents that have a given term indexed. |
Directory | directory() Returns the directory this index resides in. |
abstract int | docFreq(Term t) Returns the number of documents containing the term t. |
Document | document(int n) Returns the stored fields of the nth Document in this index. |
abstract Document | document(int n, FieldSelector fieldSelector) Get the Document at the nth position. |
protected abstract void | doClose() Implements close. |
protected abstract void | doCommit() Implements commit. |
protected abstract void | doDelete(int docNum) Implements deletion of the document numbered docNum. |
protected abstract void | doSetNorm(int doc, String field, byte value) Implements setNorm in subclass. |
protected abstract void | doUndeleteAll() Implements actual undeleteAll() in subclass. |
protected void | ensureOpen() |
protected void | finalize() Release the write lock, if needed. |
static long | getCurrentVersion(String directory) Reads version number from segments files. |
static long | getCurrentVersion(File directory) Reads version number from segments files. |
static long | getCurrentVersion(Directory directory) Reads version number from segments files. |
abstract Collection | getFieldNames(IndexReader.FieldOption fldOption) Get a list of unique field names that exist in this index and have the specified field option information. |
abstract TermFreqVector | getTermFreqVector(int docNumber, String field) Return a term frequency vector for the specified document and field. |
abstract TermFreqVector[] | getTermFreqVectors(int docNumber) Return an array of term frequency vectors for the specified document. |
long | getVersion() Version number when this IndexReader was opened. |
abstract boolean | hasDeletions() Returns true if any documents have been deleted. |
boolean | hasNorms(String field) Returns true if there are norms stored for this field. |
static boolean | indexExists(String directory) Returns true if an index exists at the specified directory. |
static boolean | indexExists(File directory) Returns true if an index exists at the specified directory. |
static boolean | indexExists(Directory directory) Returns true if an index exists at the specified directory. |
boolean | isCurrent() Check whether this IndexReader is still using the current (i.e., most recently committed) version of the index. |
abstract boolean | isDeleted(int n) Returns true if document n has been deleted. |
static boolean | isLocked(Directory directory) Returns true iff the index in the named directory is currently locked. |
static boolean | isLocked(String directory) Returns true iff the index in the named directory is currently locked. |
boolean | isOptimized() Checks if the index is optimized (if it has a single segment and no deletions). |
static long | lastModified(String directory) Returns the time the index in the named directory was last modified. |
static long | lastModified(File fileDirectory) Returns the time the index in the named directory was last modified. |
static long | lastModified(Directory directory2) Returns the time the index in the named directory was last modified. |
static void | main(String[] args) Prints the filename and size of each file within a given compound file. |
abstract int | maxDoc() Returns one greater than the largest possible document number. |
abstract byte[] | norms(String field) Returns the byte-encoded normalization factor for the named field of every document. |
abstract void | norms(String field, byte[] bytes, int offset) Reads the byte-encoded normalization factor for the named field of every document. |
abstract int | numDocs() Returns the number of documents in this index. |
static IndexReader | open(String path) Returns an IndexReader reading the index in an FSDirectory in the named path. |
static IndexReader | open(File path) Returns an IndexReader reading the index in an FSDirectory in the named path. |
static IndexReader | open(Directory directory) Returns an IndexReader reading the index in the given Directory. |
static IndexReader | open(Directory directory, IndexDeletionPolicy deletionPolicy) Expert: returns an IndexReader reading the index in the given Directory, with a custom IndexDeletionPolicy. |
void | setNorm(int doc, String field, byte value) Expert: Resets the normalization factor for the named field of the named document. |
void | setNorm(int doc, String field, float value) Expert: Resets the normalization factor for the named field of the named document. |
TermDocs | termDocs(Term term) Returns an enumeration of all the documents which contain term. |
abstract TermDocs | termDocs() Returns an unpositioned TermDocs enumerator. |
TermPositions | termPositions(Term term) Returns an enumeration of all the documents which contain term. |
abstract TermPositions | termPositions() Returns an unpositioned TermPositions enumerator. |
abstract TermEnum | terms() Returns an enumeration of all the terms in the index. |
abstract TermEnum | terms(Term t) Returns an enumeration of all terms starting at a given term. |
void | undeleteAll() Undeletes all documents currently marked as deleted in this index. |
static void | unlock(Directory directory) Forcibly unlocks the index in the named directory. |
Parameters: directory Directory where IndexReader files reside.
Throws: IOException if there is a low-level IO error
Throws: IOException if there is a low-level IO error
Deletes the document numbered docNum. Once a document is deleted it will not appear in TermDocs or TermPositions enumerations. Attempts to read its fields with the document(int) method will result in an error. The presence of this document may still be reflected in the docFreq(Term) statistic, though this will be corrected eventually as the index is further modified.
Throws: StaleReaderException if the index has changed since this reader was opened CorruptIndexException if the index is corrupt LockObtainFailedException if another writer has this index open (write.lock could not be obtained) IOException if there is a low-level IO error
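One way to picture deletion as described above is as a mark on a document number rather than an immediate removal. The following is a toy model in plain Java, not Lucene's implementation: the class `DeletionMarksSketch` is hypothetical, its method names only mirror those in the summary table, and the relation between `numDocs()` and `maxDoc()` shown here is an assumption of the sketch.

```java
import java.util.BitSet;

/** Toy model (not Lucene's implementation): deletions are marks over a fixed range of document numbers. */
class DeletionMarksSketch {
    private final int maxDoc;                     // one greater than the largest document number
    private final BitSet deleted = new BitSet();  // bit n is set when document n is marked deleted

    DeletionMarksSketch(int maxDoc) { this.maxDoc = maxDoc; }

    void deleteDocument(int docNum) {
        if (docNum < 0 || docNum >= maxDoc) {
            throw new IllegalArgumentException("no such document: " + docNum);
        }
        deleted.set(docNum);
    }

    void undeleteAll() { deleted.clear(); }
    boolean isDeleted(int n) { return deleted.get(n); }
    boolean hasDeletions() { return !deleted.isEmpty(); }
    int maxDoc() { return maxDoc; }
    int numDocs() { return maxDoc - deleted.cardinality(); }

    public static void main(String[] args) {
        DeletionMarksSketch index = new DeletionMarksSketch(4);
        index.deleteDocument(1);
        // Visit every document number and skip the ones marked as deleted.
        for (int n = 0; n < index.maxDoc(); n++) {
            if (!index.isDeleted(n)) System.out.println("live document " + n);
        }
        System.out.println("maxDoc=" + index.maxDoc() + " numDocs=" + index.numDocs());
    }
}
```

The loop in `main` reflects a common pattern when walking an index by number: iterate up to `maxDoc()` and check `isDeleted(n)` before reading a document, since reading a deleted document is an error.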
Deletes all documents that have a given term indexed. This is useful if one uses a document field to hold a unique ID string for the document. Then to delete such a document, one merely constructs a term with the appropriate field and the unique ID string as its text and passes it to this method. See deleteDocument(int) for information about when this deletion will become effective.
Returns: the number of documents deleted
Throws: StaleReaderException if the index has changed since this reader was opened CorruptIndexException if the index is corrupt LockObtainFailedException if another writer has this index open (write.lock could not be obtained) IOException if there is a low-level IO error
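The unique-ID pattern described above can be sketched without Lucene. In this toy model the class `DeleteByTermSketch` is hypothetical: it stores one value of an assumed "id" field per document number and returns the number of documents it marks as deleted, mirroring the return value documented for deleteDocuments.

```java
import java.util.BitSet;

/** Toy model (not Lucene code): delete every live document whose "id" field equals a given text. */
class DeleteByTermSketch {
    private final String[] ids;                   // value of a hypothetical "id" field, indexed by document number
    private final BitSet deleted = new BitSet();  // deletion marks

    DeleteByTermSketch(String... ids) { this.ids = ids; }

    /** Marks all live documents whose id equals the given text and returns how many were marked. */
    int deleteDocuments(String text) {
        int count = 0;
        for (int docNum = 0; docNum < ids.length; docNum++) {
            if (ids[docNum].equals(text) && !deleted.get(docNum)) {
                deleted.set(docNum);
                count++;
            }
        }
        return count;
    }

    int numDocs() { return ids.length - deleted.cardinality(); }

    public static void main(String[] args) {
        DeleteByTermSketch index = new DeleteByTermSketch("a-1", "b-2", "c-3");
        System.out.println(index.deleteDocuments("b-2")); // prints 1
        System.out.println(index.deleteDocuments("b-2")); // prints 0: already deleted
        System.out.println(index.numDocs());              // prints 2
    }
}
```

When the ID really is unique, the returned count is 0 or 1, which gives the caller a cheap check that the expected document was present.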
Returns the number of documents containing the term t.
Throws: IOException if there is a low-level IO error
Returns the stored fields of the nth Document in this index.
Throws: CorruptIndexException if the index is corrupt IOException if there is a low-level IO error
Get the Document at the nth position. The FieldSelector may be used to determine what Fields to load and how they should be loaded.
NOTE: If this Reader (more specifically, the underlying FieldsReader) is closed before the lazy Field is loaded, an exception may be thrown. If you want the value of a lazy Field to be available after closing, you must explicitly load it or fetch the Document again with a new loader.
Parameters: n Get the document at the nth position fieldSelector The FieldSelector to use to determine what Fields should be loaded on the Document. May be null, in which case all Fields will be loaded.
Returns: The stored fields of the Document at the nth position
Throws: CorruptIndexException if the index is corrupt IOException if there is a low-level IO error
See Also: Fieldable FieldSelector SetBasedFieldSelector LoadFirstFieldSelector
Implements deletion of the document numbered docNum. Applications should call deleteDocument(int) or deleteDocuments(Term).
Throws: AlreadyClosedException if this IndexReader is closed
Parameters: directory where the index resides.
Returns: version number.
Throws: CorruptIndexException if the index is corrupt IOException if there is a low-level IO error
Parameters: directory where the index resides.
Returns: version number.
Throws: CorruptIndexException if the index is corrupt IOException if there is a low-level IO error
Parameters: directory where the index resides.
Returns: version number.
Throws: CorruptIndexException if the index is corrupt IOException if there is a low-level IO error
Parameters: fldOption specifies which field option should be available for the returned fields
Returns: Collection of Strings indicating the names of the fields.
See Also: FieldOption
Parameters: docNumber document for which the term frequency vector is returned field field for which the term frequency vector is returned.
Returns: term frequency vector. May be null if the field does not exist in the specified document or the term vector was not stored.
Throws: IOException if index cannot be accessed
See Also: TermVector
Parameters: docNumber document for which term frequency vectors are returned
Returns: array of term frequency vectors. May be null if no term vectors have been stored for the specified document.
Throws: IOException if index cannot be accessed
See Also: TermVector
Returns true if an index exists at the specified directory. If the directory does not exist or if there is no index in it, false is returned.
Parameters: directory the directory to check for an index
Returns: true if an index exists; false otherwise
Returns true if an index exists at the specified directory. If the directory does not exist or if there is no index in it, false is returned.
Parameters: directory the directory to check for an index
Returns: true if an index exists; false otherwise
Returns true if an index exists at the specified directory. If the directory does not exist or if there is no index in it, false is returned.
Parameters: directory the directory to check for an index
Returns: true if an index exists; false otherwise
Throws: IOException if there is a problem with accessing the index
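Check whether this IndexReader is still using the current (i.e., most recently committed) version of the index. If a writer has committed any changes to the index since this reader was opened, this will return false, in which case you must open a new IndexReader in order to see the changes. See the description of the autoCommit flag, which controls when the IndexWriter actually commits changes to the index.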
Throws: CorruptIndexException if the index is corrupt IOException if there is a low-level IO error
Returns true iff the index in the named directory is currently locked.
Parameters: directory the directory to check for a lock
Throws: IOException if there is a low-level IO error
Returns true iff the index in the named directory is currently locked.
Parameters: directory the directory to check for a lock
Throws: IOException if there is a low-level IO error
Returns: true if the index is optimized; false otherwise
Throws: CorruptIndexException if the index is corrupt IOException if there is a low-level IO error
Throws: CorruptIndexException if the index is corrupt IOException if there is a low-level IO error
Throws: CorruptIndexException if the index is corrupt IOException if there is a low-level IO error
Parameters: args Usage: org.apache.lucene.index.IndexReader [-extract] <cfsfile>
See Also: Field
See Also: Field
Parameters: path the path to the index directory
Throws: CorruptIndexException if the index is corrupt IOException if there is a low-level IO error
Parameters: path the path to the index directory
Throws: CorruptIndexException if the index is corrupt IOException if there is a low-level IO error
Parameters: directory the index directory
Throws: CorruptIndexException if the index is corrupt IOException if there is a low-level IO error
Parameters: directory the index directory deletionPolicy a custom deletion policy (only used if you use this reader to perform deletes or to set norms); see IndexWriter for details.
Throws: CorruptIndexException if the index is corrupt IOException if there is a low-level IO error
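Expert: Resets the normalization factor for the named field of the named document. The norm represents the product of the field's boost and its length normalization. Thus, to preserve the length normalization values when resetting this, one should base the new value upon the old.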
Throws: StaleReaderException if the index has changed since this reader was opened CorruptIndexException if the index is corrupt LockObtainFailedException if another writer has this index open (write.lock could not be obtained) IOException if there is a low-level IO error
See Also: norms Similarity
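The advice to base a new norm on the old one can be shown with plain arithmetic. This is a sketch under stated assumptions, not the library's code: the class `NormSketch` is hypothetical, the length normalization formula 1/sqrt(numTerms) is assumed for illustration only, and the sketch works with plain floats, ignoring the byte encoding in which norms are stored.

```java
/** Toy arithmetic (not Lucene code): a norm as the product of a boost and a length normalization. */
class NormSketch {
    /** Assumed length normalization for this sketch: shorter fields score higher. */
    static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    /** The norm is the product of the field's boost and its length normalization. */
    static float norm(float boost, int numTerms) {
        return boost * lengthNorm(numTerms);
    }

    /** Derives the new norm from the old one, so the length normalization factor is preserved. */
    static float rescale(float oldNorm, float oldBoost, float newBoost) {
        return oldNorm * (newBoost / oldBoost);
    }

    public static void main(String[] args) {
        float oldNorm = norm(1.0f, 16);                // 1.0 * 1/sqrt(16) = 0.25
        float newNorm = rescale(oldNorm, 1.0f, 2.0f);  // 0.25 * 2.0 = 0.5
        System.out.println(oldNorm + " -> " + newNorm);
    }
}
```

Setting the new value to the bare boost (2.0 here) instead of rescaling would discard the field-length factor, which is the mistake the note above warns against.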
Throws: StaleReaderException if the index has changed since this reader was opened CorruptIndexException if the index is corrupt LockObtainFailedException if another writer has this index open (write.lock could not be obtained) IOException if there is a low-level IO error
See Also: norms
Returns an enumeration of all the documents which contain term. For each document, the document number and the frequency of the term in that document are provided, for use in search scoring. Thus, this method implements the mapping:
Term => <docNum, freq>*
The enumeration is ordered by document number. Each document number is greater than all that precede it in the enumeration.
Throws: IOException if there is a low-level IO error
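The mapping from a term to an ordered list of (document number, frequency) pairs can be sketched over a tiny in-memory corpus. This is a toy illustration in plain Java, not how Lucene stores or enumerates postings; the class `TermDocsSketch` and its `termDocs` method are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy sketch (not Lucene code) of a term mapped to a list of {docNum, freq} pairs. */
class TermDocsSketch {
    /** Returns {docNum, freq} pairs for every document containing the term, in increasing docNum order. */
    static List<int[]> termDocs(String[][] docs, String term) {
        List<int[]> postings = new ArrayList<>();
        for (int docNum = 0; docNum < docs.length; docNum++) {
            int freq = 0;
            for (String token : docs[docNum]) {
                if (token.equals(term)) freq++;
            }
            // Documents that do not contain the term are left out of the enumeration.
            if (freq > 0) postings.add(new int[] {docNum, freq});
        }
        return postings;
    }

    public static void main(String[] args) {
        String[][] docs = {
            {"to", "be", "or", "not", "to", "be"},
            {"let", "it", "go"},
            {"to", "go"}
        };
        for (int[] posting : termDocs(docs, "to")) {
            System.out.println("doc=" + posting[0] + " freq=" + posting[1]);
        }
        // prints "doc=0 freq=2" then "doc=2 freq=1"
    }
}
```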
Throws: IOException if there is a low-level IO error
Returns an enumeration of all the documents which contain term. For each document, in addition to the document number and frequency of the term in that document, a list of all of the ordinal positions of the term in the document is available. Thus, this method implements the mapping:
Term => <docNum, freq, <pos_1, pos_2, ... pos_(freq-1)>>*
This positional information facilitates phrase and proximity searching.
The enumeration is ordered by document number. Each document number is greater than all that precede it in the enumeration.
Throws: IOException if there is a low-level IO error
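To see why positions help with phrase searching, consider a toy check for a two-word phrase: the second word must occur at the position immediately after some occurrence of the first. The sketch below is plain Java over a tokenized document, not Lucene code; `PhraseSketch`, `positions`, and `containsPhrase` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy sketch (not Lucene code): using ordinal term positions to match a two-word phrase. */
class PhraseSketch {
    /** Returns the ordinal positions of the term within one tokenized document. */
    static List<Integer> positions(String[] doc, String term) {
        List<Integer> result = new ArrayList<>();
        for (int pos = 0; pos < doc.length; pos++) {
            if (doc[pos].equals(term)) result.add(pos);
        }
        return result;
    }

    /** True if `second` occurs at the position right after some occurrence of `first`. */
    static boolean containsPhrase(String[] doc, String first, String second) {
        List<Integer> secondPositions = positions(doc, second);
        for (int pos : positions(doc, first)) {
            if (secondPositions.contains(pos + 1)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        String[] doc = {"new", "york", "is", "not", "york", "new"};
        System.out.println(containsPhrase(doc, "new", "york")); // prints true
        System.out.println(containsPhrase(doc, "is", "new"));   // prints false
    }
}
```

A proximity search would relax the `pos + 1` test to a window of allowed distances; frequencies alone cannot answer either question, since both words may occur in a document without being adjacent.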
Throws: IOException if there is a low-level IO error
Throws: IOException if there is a low-level IO error
Throws: IOException if there is a low-level IO error
Throws: StaleReaderException if the index has changed since this reader was opened LockObtainFailedException if another writer has this index open (write.lock could not be obtained) CorruptIndexException if the index is corrupt IOException if there is a low-level IO error
Caution: this should only be used by failure recovery code, when it is known that no other process or thread is in fact currently accessing this index.