org.apache.lucene.analysis

Class StopFilter

public final class StopFilter extends TokenFilter

Removes stop words from a token stream.
Constructor Summary
StopFilter(TokenStream input, String[] stopWords)
Construct a token stream filtering the given input.
StopFilter(TokenStream in, String[] stopWords, boolean ignoreCase)
Constructs a filter which removes words from the input TokenStream that are named in the array of words.
StopFilter(TokenStream input, Set stopWords, boolean ignoreCase)
Construct a token stream filtering the given input.
StopFilter(TokenStream in, Set stopWords)
Constructs a filter which removes words from the input TokenStream that are named in the Set.
Method Summary
static Set makeStopSet(String[] stopWords)
Builds a Set from an array of stop words, appropriate for passing into the StopFilter constructor.
static Set makeStopSet(String[] stopWords, boolean ignoreCase)
Token next()
Returns the next input Token whose termText() is not a stop word.

Constructor Detail

StopFilter

public StopFilter(TokenStream input, String[] stopWords)
Construct a token stream filtering the given input.

StopFilter

public StopFilter(TokenStream in, String[] stopWords, boolean ignoreCase)
Constructs a filter which removes words from the input TokenStream that are named in the array of words.

StopFilter

public StopFilter(TokenStream input, Set stopWords, boolean ignoreCase)
Construct a token stream filtering the given input.

Parameters:
input - The TokenStream to filter.
stopWords - The set of stop words, as Strings. If ignoreCase is true, all strings must be lower case.
ignoreCase - Ignore case when stopping. When true, the stopWords set must contain only lower-case words.
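
The ignoreCase contract above can be illustrated with a minimal plain-Java sketch. It does not use any Lucene classes: `IgnoreCaseLookupSketch` and `isStopWord` are hypothetical names, and lower-casing the term before the lookup is an assumption about how such a check could work, not a description of the library's internals.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Lucene.
class IgnoreCaseLookupSketch {
    // Returns true if the term would be treated as a stop word.
    static boolean isStopWord(Set<String> stopWords, String term, boolean ignoreCase) {
        // Assumption: with ignoreCase, only the term is lower-cased before the
        // lookup, so the set itself has to hold lower-case entries already.
        String key = ignoreCase ? term.toLowerCase(Locale.ROOT) : term;
        return stopWords.contains(key);
    }
}
```

Under this sketch, a set containing "the" matches the term "The" only when ignoreCase is true, and a set containing "The" never matches anything when ignoreCase is true, which is why the set must be lower-cased up front.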

StopFilter

public StopFilter(TokenStream in, Set stopWords)
Constructs a filter which removes words from the input TokenStream that are named in the Set. It is crucial that an efficient Set implementation is used for maximum performance.

See Also: makeStopSet(java.lang.String[])

Method Detail

makeStopSet

public static final Set makeStopSet(String[] stopWords)
Builds a Set from an array of stop words, appropriate for passing into the StopFilter constructor. This lets the stop-word set be built once and cached when an Analyzer is constructed, rather than rebuilt for every token stream.

See Also: makeStopSet(java.lang.String[], boolean), passing false to ignoreCase

makeStopSet

public static final Set makeStopSet(String[] stopWords, boolean ignoreCase)

Parameters:
stopWords - The array of stop words.
ignoreCase - If true, all words are lower cased first.

Returns: a Set containing the words
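
As a rough illustration of the behavior documented for the two makeStopSet overloads, the following plain-Java sketch builds a hash-based set from an array and lower-cases each word first when ignoreCase is true. `StopSetSketch` is a hypothetical class and is not Lucene's implementation. The use of `HashSet`, generics, and `Locale.ROOT` are assumptions made for this example.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for the documented static methods, not Lucene code.
class StopSetSketch {
    // Builds a set of stop words, lower-casing each word first if ignoreCase is true.
    static Set<String> makeStopSet(String[] stopWords, boolean ignoreCase) {
        Set<String> set = new HashSet<>(stopWords.length);
        for (String word : stopWords) {
            set.add(ignoreCase ? word.toLowerCase(Locale.ROOT) : word);
        }
        return set;
    }

    // Single-argument form: same as passing false for ignoreCase.
    static Set<String> makeStopSet(String[] stopWords) {
        return makeStopSet(stopWords, false);
    }
}
```

A hash-based set keeps each lookup close to constant time, which is in line with the note above about using an efficient Set implementation.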

next

public final Token next()
Returns the next input Token whose termText() is not a stop word.
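
The skip-until-kept behavior of next() can be sketched in plain Java over an iterator of term strings. `StopFilterSketch` is a hypothetical class that stands in for the filter. It works on `String` terms instead of Token objects, and returning null once the input is exhausted is an assumption of this sketch.

```java
import java.util.Iterator;
import java.util.Set;

// Hypothetical illustration of the filtering loop, not Lucene's StopFilter.
class StopFilterSketch {
    private final Iterator<String> input;
    private final Set<String> stopWords;

    StopFilterSketch(Iterator<String> input, Set<String> stopWords) {
        this.input = input;
        this.stopWords = stopWords;
    }

    // Returns the next term that is not a stop word, or null when the input ends.
    String next() {
        while (input.hasNext()) {
            String term = input.next();
            if (!stopWords.contains(term)) {
                return term; // first term that survives the filter
            }
            // otherwise the term is a stop word: drop it and keep reading
        }
        return null;
    }
}
```

Given the terms "the", "quick", "and", "fox" and a stop set of "the" and "and", successive calls to next() in this sketch yield "quick", then "fox", then null.
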
Copyright © 2000-2007 Apache Software Foundation. All Rights Reserved.