| Packages that use Tokenizer | |
|---|---|
| org.apache.lucene.analysis | API and code to convert text into indexable/searchable tokens. |
| org.apache.lucene.analysis.standard | Standards-based analyzers implemented with JFlex. |
| Uses of Tokenizer in org.apache.lucene.analysis |
|---|
| Subclasses of Tokenizer in org.apache.lucene.analysis | |
|---|---|
| class | CharTokenizer: An abstract base class for simple, character-oriented tokenizers. |
| class | KeywordTokenizer: Emits the entire input as a single token. |
| class | LetterTokenizer: A tokenizer that divides text at non-letters. |
| class | LowerCaseTokenizer: Performs the function of LetterTokenizer and LowerCaseFilter together. |
| class | WhitespaceTokenizer: A tokenizer that divides text at whitespace. |
| Fields in org.apache.lucene.analysis declared as Tokenizer | |
|---|---|
| protected Tokenizer | ReusableAnalyzerBase.TokenStreamComponents.source |
| Constructors in org.apache.lucene.analysis with parameters of type Tokenizer | |
|---|---|
| ReusableAnalyzerBase.TokenStreamComponents(Tokenizer source) | Creates a new ReusableAnalyzerBase.TokenStreamComponents instance. |
| ReusableAnalyzerBase.TokenStreamComponents(Tokenizer source, TokenStream result) | Creates a new ReusableAnalyzerBase.TokenStreamComponents instance. |
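The constructors above wire a Tokenizer into an analyzer as the token "source", optionally wrapped by filters to form the final "result" stream. The sketch below shows one plausible way to use them; it assumes Lucene 3.x (the ReusableAnalyzerBase era) on the classpath, and MyAnalyzer is a hypothetical name.

```java
// Hypothetical analyzer: a Tokenizer as the source of a TokenStreamComponents,
// with a LowerCaseFilter wrapping it as the result. Assumes Lucene 3.x.
import java.io.Reader;
import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.ReusableAnalyzerBase;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.WhitespaceTokenizer;
import org.apache.lucene.util.Version;

public class MyAnalyzer extends ReusableAnalyzerBase {
  @Override
  protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
    // The Tokenizer is the source; filters chained on top form the result.
    Tokenizer source = new WhitespaceTokenizer(Version.LUCENE_36, source_reader(reader));
    return new TokenStreamComponents(source, new LowerCaseFilter(Version.LUCENE_36, source));
  }

  // Trivial pass-through shown only to keep the example explicit about the Reader.
  private static Reader source_reader(Reader reader) {
    return reader;
  }
}
```

Because TokenStreamComponents keeps a reference to the Tokenizer (the `source` field listed above), the analyzer can later reuse the same component chain by resetting the Tokenizer on a new Reader instead of rebuilding the chain per document.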
| Uses of Tokenizer in org.apache.lucene.analysis.standard |
|---|
| Subclasses of Tokenizer in org.apache.lucene.analysis.standard | |
|---|---|
| class | ClassicTokenizer: A grammar-based tokenizer constructed with JFlex. |
| class | StandardTokenizer: A grammar-based tokenizer constructed with JFlex. |
| class | UAX29URLEmailTokenizer: Implements the Word Break rules of the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29; URLs and email addresses are also tokenized according to the relevant RFCs. |
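To make the subclasses above concrete, here is a minimal sketch that drives a StandardTokenizer directly and prints each token term. It assumes Lucene 3.x on the classpath; TokenizeDemo and the sample input string are illustrative, not from the source.

```java
// Hypothetical snippet: consume tokens from a StandardTokenizer.
// Assumes Lucene 3.x; the class name and input text are made up for illustration.
import java.io.StringReader;
import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.util.Version;

public class TokenizeDemo {
  public static void main(String[] args) throws Exception {
    StandardTokenizer tokenizer =
        new StandardTokenizer(Version.LUCENE_36, new StringReader("Hello, Lucene tokenizers!"));
    // Attribute-based API: the term text of the current token lives in this attribute.
    CharTermAttribute term = tokenizer.addAttribute(CharTermAttribute.class);
    tokenizer.reset();
    while (tokenizer.incrementToken()) {
      System.out.println(term.toString());
    }
    tokenizer.end();
    tokenizer.close();
  }
}
```

Swapping in ClassicTokenizer or UAX29URLEmailTokenizer requires only changing the constructor; all three expose the same TokenStream contract (reset, incrementToken, end, close).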