java.lang.Object
org.apache.lucene.util.AttributeSource
org.apache.lucene.analysis.TokenStream
org.apache.lucene.analysis.Tokenizer
org.apache.lucene.analysis.cjk.CJKTokenizer
@Deprecated public final class CJKTokenizer
CJKTokenizer is designed for Chinese, Japanese, and Korean languages.
The tokens returned are overlapping pairs of adjacent characters (bigrams).
Example: "java C1C2C3C4" will be segmented to: "java" "C1C2" "C2C3" "C3C4".
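Additionally, the following is applied to Latin text (such as English): text is converted to lowercase; numeric digits, '+', '#', and '_' are tokenized as letters; and full-width forms are converted to half-width forms.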
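The overlapping segmentation described above can be illustrated with a minimal, self-contained sketch. This is not the CJKTokenizer implementation: the class `BigramSketch` and its methods are hypothetical names, the CJK test based on `Character.UnicodeScript` is an assumption, and non-CJK text is simply split into runs of letters and digits. A CJK run of length one is emitted as a single token here, which is also an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Lucene.
final class BigramSketch {

    // Assumption: Han, Hiragana, Katakana and Hangul count as "CJK" here.
    static boolean isCjk(int cp) {
        Character.UnicodeScript s = Character.UnicodeScript.of(cp);
        return s == Character.UnicodeScript.HAN
            || s == Character.UnicodeScript.HIRAGANA
            || s == Character.UnicodeScript.KATAKANA
            || s == Character.UnicodeScript.HANGUL;
    }

    // Simplification: a non-CJK token is a run of letters and digits.
    static boolean isWordChar(int cp) {
        return !isCjk(cp) && Character.isLetterOrDigit(cp);
    }

    static List<String> segment(String text) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            int cp = text.codePointAt(i);
            if (isCjk(cp)) {
                // Collect the whole CJK run, one code point per entry.
                List<String> run = new ArrayList<>();
                while (i < text.length() && isCjk(text.codePointAt(i))) {
                    int c = text.codePointAt(i);
                    run.add(new String(Character.toChars(c)));
                    i += Character.charCount(c);
                }
                if (run.size() == 1) {
                    tokens.add(run.get(0));
                }
                // Emit every pair of adjacent characters, overlapping by one.
                for (int k = 0; k + 1 < run.size(); k++) {
                    tokens.add(run.get(k) + run.get(k + 1));
                }
            } else if (isWordChar(cp)) {
                StringBuilder sb = new StringBuilder();
                while (i < text.length() && isWordChar(text.codePointAt(i))) {
                    int c = text.codePointAt(i);
                    sb.appendCodePoint(c);
                    i += Character.charCount(c);
                }
                tokens.add(sb.toString());
            } else {
                // Whitespace and punctuation separate tokens.
                i += Character.charCount(cp);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // "java" followed by four Han characters, written as Unicode escapes.
        System.out.println(segment("java \u65e5\u672c\u8a9e\u5b66"));
    }
}
```

With four CJK characters C1 to C4 after the word "java", the sketch yields the same shape as the example above: one Latin token followed by the three overlapping pairs.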
| Nested Class Summary |
|---|
| Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource |
|---|
| org.apache.lucene.util.AttributeSource.AttributeFactory, org.apache.lucene.util.AttributeSource.State |
| Field Summary |
|---|
| Fields inherited from class org.apache.lucene.analysis.Tokenizer |
|---|
| input |
| Constructor Summary | |
|---|---|
| CJKTokenizer(org.apache.lucene.util.AttributeSource.AttributeFactory factory, Reader in) | Deprecated. |
| CJKTokenizer(org.apache.lucene.util.AttributeSource source, Reader in) | Deprecated. |
| CJKTokenizer(Reader in) | Deprecated. Construct a token stream processing the given input. |
| Method Summary | |
|---|---|
| void | end() Deprecated. |
| boolean | incrementToken() Deprecated. Returns true for the next token in the stream, or false at EOS. |
| void | reset() Deprecated. |
| void | reset(Reader reader) Deprecated. |
| Methods inherited from class org.apache.lucene.analysis.Tokenizer |
|---|
| close, correctOffset |
| Methods inherited from class org.apache.lucene.util.AttributeSource |
|---|
| addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState, toString |
| Methods inherited from class java.lang.Object |
|---|
| clone, finalize, getClass, notify, notifyAll, wait, wait, wait |
| Constructor Detail |
|---|
public CJKTokenizer(Reader in)
Construct a token stream processing the given input.
Parameters: in - I/O reader
public CJKTokenizer(org.apache.lucene.util.AttributeSource source, Reader in)
public CJKTokenizer(org.apache.lucene.util.AttributeSource.AttributeFactory factory, Reader in)
| Method Detail |
|---|
public boolean incrementToken()
throws IOException
Returns true for the next token in the stream, or false at EOS.
incrementToken in class org.apache.lucene.analysis.TokenStream
Throws: IOException - when a read error occurs
public final void end()
end in class org.apache.lucene.analysis.TokenStream
public void reset()
throws IOException
reset in class org.apache.lucene.analysis.TokenStream
Throws: IOException
public void reset(Reader reader)
throws IOException
reset in class org.apache.lucene.analysis.Tokenizer
Throws: IOException
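The methods above are typically called in a fixed order by a consumer: reset(), then incrementToken() until it returns false, then end() and close(). The sketch below shows that calling pattern against a hypothetical stand-in class, `SketchTokenStream`, which is not a Lucene class and only mimics the boolean contract of incrementToken(). The `term()` accessor and the `drain` helper are likewise assumptions made for illustration; a real token stream exposes token text through attributes instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a token stream; NOT a Lucene class.
final class SketchTokenStream {
    private final List<String> source;
    private Iterator<String> it;
    private String current;

    SketchTokenStream(List<String> tokens) {
        this.source = tokens;
    }

    // Rewind to the start of the stream.
    void reset() {
        it = source.iterator();
        current = null;
    }

    // Mirrors the documented contract: true while a token is
    // available, false once the end of the stream is reached.
    boolean incrementToken() {
        if (it != null && it.hasNext()) {
            current = it.next();
            return true;
        }
        current = null;
        return false;
    }

    // Hypothetical accessor for the current token text.
    String term() {
        return current;
    }

    // Called once after the last token; nothing to do in this sketch.
    void end() {
    }

    // Release the underlying iterator.
    void close() {
        it = null;
    }

    // The consumer loop: reset, iterate, end, close.
    static List<String> drain(SketchTokenStream ts) {
        List<String> out = new ArrayList<>();
        ts.reset();
        while (ts.incrementToken()) {
            out.add(ts.term());
        }
        ts.end();
        ts.close();
        return out;
    }

    public static void main(String[] args) {
        SketchTokenStream ts =
            new SketchTokenStream(Arrays.asList("java", "C1C2", "C2C3", "C3C4"));
        System.out.println(drain(ts));
    }
}
```

Keeping the loop condition on the return value of incrementToken() means the consumer needs no separate end-of-stream check.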