public class EccoPreTokenizer extends AbstractPreTokenizer implements PreTokenizer
| Modifier and Type | Field and Description |
|---|---|
| protected static PatternReplacer | doubleBackTicksReplacer: Double back-ticks. |
| protected static java.lang.String | EccoAlwaysSeparators |
| protected static PatternReplacer | singleBackTicksReplacer: Single back-tick followed by a capital letter. |
| protected static PatternReplacer | wordOrSpanGapReplacer: Word or span gap. |
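
The replacer fields above are described only by the text they match. As a rough illustration, the sketch below shows how two such rules could be written with `java.util.regex`. The class `ReplacerSketch`, both regular expressions, and both replacement strings are assumptions made for this example; they are not the library's `PatternReplacer` class or the patterns that `EccoPreTokenizer` actually uses.

```java
import java.util.regex.Pattern;

/** Hypothetical stand-in for a pattern replacer; not the library's PatternReplacer. */
final class ReplacerSketch {
    private final Pattern pattern;
    private final String replacement;

    ReplacerSketch(String regex, String replacement) {
        this.pattern = Pattern.compile(regex);
        this.replacement = replacement;
    }

    /** Replaces every match of the pattern in the given text. */
    String replace(String text) {
        return pattern.matcher(text).replaceAll(replacement);
    }

    // Assumed rule: two consecutive back-ticks become a double quotation mark.
    static final ReplacerSketch DOUBLE_BACK_TICKS =
        new ReplacerSketch("``", "\"");

    // Assumed rule: a back-tick directly before a capital letter becomes an apostrophe.
    static final ReplacerSketch SINGLE_BACK_TICK =
        new ReplacerSketch("`(?=[A-Z])", "'");

    public static void main(String[] args) {
        String line = "``Sir, `Tis done.";
        // Apply the double back-tick rule first, so the second tick of a pair
        // is not picked up by the single back-tick rule.
        String cleaned = SINGLE_BACK_TICK.replace(DOUBLE_BACK_TICKS.replace(line));
        System.out.println(cleaned);
    }
}
```

The order of application matters in this sketch: running the single back-tick rule first would consume half of each doubled pair that precedes a capital letter.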
Inherited fields: alwaysSeparators, alwaysSeparatorsReplacer, asterisks, commaSeparator, commaSeparatorReplacer, hyphens, logger, periods

| Constructor and Description |
|---|
| EccoPreTokenizer(): Create an Ecco pretokenizer. |
| Modifier and Type | Method and Description |
|---|---|
| java.lang.String | pretokenize(java.lang.String line): Prepare text for tokenization. |
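
The method summary gives a simple contract: one line of text in, one prepared line out, to be run before tokenization. The sketch below illustrates that calling pattern only. `LinePreTokenizer` is a hypothetical stand-in for the `PreTokenizer` interface, the comma rule is a placeholder rather than anything `EccoPreTokenizer` is documented to do, and the whitespace split stands in for a real tokenizer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for the PreTokenizer contract: one line in, one prepared line out. */
interface LinePreTokenizer {
    String pretokenize(String line);
}

final class PretokenizeUsageSketch {
    /** Pretokenizes each line, then splits it on whitespace (a placeholder tokenizer). */
    static List<String> tokens(LinePreTokenizer pre, List<String> lines) {
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            String prepared = pre.pretokenize(line);
            for (String token : prepared.trim().split("\\s+")) {
                if (!token.isEmpty()) {
                    result.add(token);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Placeholder rule only: pad commas with spaces so they split off as tokens.
        LinePreTokenizer pre = line -> line.replace(",", " , ");
        List<String> lines = Arrays.asList("Sir, be gone", "at once");
        System.out.println(tokens(pre, lines));
    }
}
```

With the library on the classpath, an instance created through the documented no-argument constructor `EccoPreTokenizer()` would take the place of the lambda, since `pretokenize` has the same `String` to `String` shape.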
Inherited methods: getLogger, setLogger, close
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

protected static final java.lang.String EccoAlwaysSeparators
protected static final PatternReplacer wordOrSpanGapReplacer
protected static final PatternReplacer doubleBackTicksReplacer
protected static final PatternReplacer singleBackTicksReplacer
public java.lang.String pretokenize(java.lang.String line)
pretokenize in interface PreTokenizer
pretokenize in class AbstractPreTokenizer
line - The text to prepare for tokenization,