org.apache.lucene.analysis.ngram

Class NGramTokenizer

public class NGramTokenizer extends Tokenizer

Tokenizes the input into n-grams of the given size(s).

Author: Otis Gospodnetic

Field Summary
static int DEFAULT_MAX_NGRAM_SIZE
static int DEFAULT_MIN_NGRAM_SIZE
Constructor Summary
NGramTokenizer(Reader input, int minGram, int maxGram)
Creates an NGramTokenizer that generates n-grams with sizes from minGram to maxGram.
NGramTokenizer(Reader input)
Creates an NGramTokenizer that uses DEFAULT_MIN_NGRAM_SIZE and DEFAULT_MAX_NGRAM_SIZE as the n-gram size bounds.
Method Summary
Token next()
Returns the next token in the stream, or null at the end of the stream.

Field Detail

DEFAULT_MAX_NGRAM_SIZE

public static final int DEFAULT_MAX_NGRAM_SIZE

DEFAULT_MIN_NGRAM_SIZE

public static final int DEFAULT_MIN_NGRAM_SIZE

Constructor Detail

NGramTokenizer

public NGramTokenizer(Reader input, int minGram, int maxGram)
Creates an NGramTokenizer that generates n-grams with sizes from minGram to maxGram.

Parameters:
input - Reader holding the input to be tokenized
minGram - the smallest n-gram to generate
maxGram - the largest n-gram to generate
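To illustrate what the minGram and maxGram bounds mean, here is a minimal sketch in plain Java that lists the character n-grams of a string. It is not the tokenizer's implementation: NGramSketch and its ngrams method are hypothetical names, and the emission order (all grams of one size before the next size) is an assumption that this page does not specify.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: lists the character n-grams of a string.
class NGramSketch {

    // Returns every substring of text whose length is between minGram and maxGram.
    // Assumption: grams are grouped by size, smallest size first.
    static List<String> ngrams(String text, int minGram, int maxGram) {
        if (minGram < 1 || minGram > maxGram) {
            throw new IllegalArgumentException("require 1 <= minGram <= maxGram");
        }
        List<String> grams = new ArrayList<>();
        for (int size = minGram; size <= maxGram; size++) {
            // Slide a window of the current size across the text.
            for (int start = 0; start + size <= text.length(); start++) {
                grams.add(text.substring(start, start + size));
            }
        }
        return grams;
    }

    public static void main(String[] args) {
        // prints [ab, bc, cd, abc, bcd]
        System.out.println(ngrams("abcd", 2, 3));
    }
}
```

With minGram = 2 and maxGram = 3, the input "abcd" yields three 2-grams and two 3-grams. Input shorter than minGram yields no grams in this sketch.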

NGramTokenizer

public NGramTokenizer(Reader input)
Creates an NGramTokenizer that uses DEFAULT_MIN_NGRAM_SIZE and DEFAULT_MAX_NGRAM_SIZE as the n-gram size bounds.

Parameters:
input - Reader holding the input to be tokenized

Method Detail

next

public final Token next()
Returns the next token in the stream, or null at the end of the stream.
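The null-at-end contract implies a pull-style loop on the caller's side: call next() repeatedly and stop at the first null. The sketch below shows that loop using a hypothetical stand-in, GramStream, which returns String values rather than Token objects so that it runs without Lucene on the classpath. None of these names come from the library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a token stream; not a Lucene class.
class GramStream {
    private final String text;
    private final int size;
    private int start = 0;

    GramStream(String text, int size) {
        this.text = text;
        this.size = size;
    }

    // Returns the next gram of the fixed size, or null once the input is exhausted.
    String next() {
        if (start + size > text.length()) {
            return null;
        }
        String gram = text.substring(start, start + size);
        start++;
        return gram;
    }
}

class NextLoopSketch {

    // Drains a stream by calling next() until it returns null.
    static List<String> drain(GramStream stream) {
        List<String> out = new ArrayList<>();
        for (String gram = stream.next(); gram != null; gram = stream.next()) {
            out.add(gram);
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [luc, uce, cen, ene]
        System.out.println(drain(new GramStream("lucene", 3)));
    }
}
```

The same loop shape applies to any stream whose next() signals exhaustion with null; a caller must check the result before using it.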
Copyright © 2000-2007 Apache Software Foundation. All Rights Reserved.