public abstract class AbstractSpellingMapper extends IsCloseableObject implements SpellingMapper, UsesLogger

| Modifier and Type | Field and Description |
|---|---|
| protected java.util.Set<java.lang.String> | alternateSpellingsWordClasses: Word classes of alternate spellings. |
| protected Logger | logger: Logger used for output. |
| protected TaggedStrings | mappedSpellings: The map with alternate spellings as keys and standard spellings as values. |
| protected Map2D<java.lang.String,java.lang.String,java.lang.String> | spellingsByWordClass: Irregular forms. |
| protected static java.lang.String | spellingsByWordClassFileName: Path to list of irregular word forms. |
| protected java.util.Set<java.lang.String> | standardSpellingSet: The set of standard spellings. |

| Constructor and Description |
|---|
| AbstractSpellingMapper(): Create abstract spelling mapper. |

| Modifier and Type | Method and Description |
|---|---|
| void | addCachedSpelling(java.lang.String alternateSpelling, java.lang.String standardSpelling): Cache a generated mapped spelling. |
| void | addMappedSpelling(java.lang.String alternateSpelling, java.lang.String standardSpelling): Add a mapped spelling. |
| void | addStandardSpelling(java.lang.String standardSpelling): Add a standard spelling. |
| void | addStandardSpellings(java.util.Collection<java.lang.String> standardSpellings): Add standard spellings from a collection. |
| java.lang.String | fixCapitalization(java.lang.String spelling, java.lang.String standardSpelling): Fix capitalization of standardized spelling. |
| Logger | getLogger(): Get the logger. |
| TaggedStrings | getMappedSpellings(): Return the mapped spellings. |
| int | getNumberOfAlternateSpellings(): Returns the number of alternate spellings. |
| int | getNumberOfStandardSpellings(): Returns the number of standard spellings. |
| java.util.Set<java.lang.String> | getStandardSpellings(): Return the standard spellings. |
| void | loadAlternativeSpellings(java.io.Reader reader, java.lang.String delimChars): Loads alternative spellings from a reader. |
| void | loadAlternativeSpellings(java.net.URL url, java.lang.String encoding, java.lang.String delimChars): Loads alternate spellings from a URL. |
| protected void | loadSpellingsByWordClass(): Load alternate-to-standard spelling mappings by word class. |
| void | loadStandardSpellings(java.io.Reader reader): Loads standard spellings from a reader. |
| void | loadStandardSpellings(java.net.URL url, java.lang.String encoding): Loads standard spellings from a URL. |
| java.lang.String | preprocessSpelling(java.lang.String spelling): Preprocess spelling. |
| void | setLogger(Logger logger): Set the logger. |
| void | setMappedSpellings(TaggedStrings mappedSpellings): Sets the map from alternate spellings to standard spellings. |
| void | setStandardSpellings(java.util.Set<java.lang.String> standardSpellings): Sets standard spellings. |
| java.lang.String[] | standardizeSpelling(java.lang.String spelling): Returns standard spellings given a spelling. |
| java.lang.String | standardizeSpelling(java.lang.String spelling, java.lang.String wordClass): Returns a standard spelling given a standard or alternate spelling. |
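
Methods inherited from class IsCloseableObject: close
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface SpellingMapper: mapSpelling

protected TaggedStrings mappedSpellings
The map with alternate spellings as keys and standard spellings as values.

protected java.util.Set<java.lang.String> standardSpellingSet
The set of standard spellings.

protected Map2D<java.lang.String,java.lang.String,java.lang.String> spellingsByWordClass
Irregular forms.
Spellings disambiguated by word class are stored in a HashMap2D. The compound key consists of the word class and alternate spelling, and the value is the standardized spelling.

protected java.util.Set<java.lang.String> alternateSpellingsWordClasses
Word classes of alternate spellings.

protected static java.lang.String spellingsByWordClassFileName
Path to list of irregular word forms.

protected Logger logger
Logger used for output.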
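
The compound-key lookup described for spellingsByWordClass can be pictured with a nested map. The sketch below is only an illustration under stated assumptions: WordClassSpellingsSketch is a hypothetical stand-in built on java.util.HashMap, not the library's Map2D or HashMap2D, and the sample spellings are invented.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a two-key map; not the library's Map2D. */
class WordClassSpellingsSketch {
    // Outer key: word class. Inner key: alternate spelling. Value: standard spelling.
    private final Map<String, Map<String, String>> byWordClass = new HashMap<>();

    /** Stores a standard spelling under the (word class, alternate spelling) key. */
    void put(String wordClass, String alternateSpelling, String standardSpelling) {
        byWordClass
            .computeIfAbsent(wordClass, k -> new HashMap<>())
            .put(alternateSpelling, standardSpelling);
    }

    /** Returns the standard spelling, or null when the compound key is absent. */
    String get(String wordClass, String alternateSpelling) {
        Map<String, String> inner = byWordClass.get(wordClass);
        return (inner == null) ? null : inner.get(alternateSpelling);
    }

    public static void main(String[] args) {
        WordClassSpellingsSketch spellings = new WordClassSpellingsSketch();
        // Invented sample data: the same alternate spelling resolves differently per word class.
        spellings.put("verb", "doe", "do");
        spellings.put("noun", "doe", "doe");
        System.out.println(spellings.get("verb", "doe")); // prints "do"
        System.out.println(spellings.get("noun", "doe")); // prints "doe"
    }
}
```

Keying on the word class first keeps each class's irregular forms in their own inner map, so one alternate spelling can map to different standard spellings depending on its word class.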

public AbstractSpellingMapper()
Create abstract spelling mapper.

protected void loadSpellingsByWordClass() throws java.io.IOException
Load alternate-to-standard spelling mappings by word class.
Throws:
java.io.IOException

public void loadAlternativeSpellings(java.net.URL url, java.lang.String encoding, java.lang.String delimChars) throws java.io.IOException
Loads alternate spellings from a URL.
Parameters:
url - URL containing mappings from alternate spellings to standard spellings.
encoding - Text encoding (utf-8, 8859_1, etc.).
delimChars - Delimiter characters separating spelling pairs.
Throws:
java.io.IOException

public void loadAlternativeSpellings(java.io.Reader reader, java.lang.String delimChars) throws java.io.IOException
Loads alternative spellings from a reader.
Parameters:
reader - The reader.
delimChars - Delimiter characters separating spelling pairs.
Throws:
java.io.IOException

public void loadStandardSpellings(java.net.URL url, java.lang.String encoding) throws java.io.IOException
Loads standard spellings from a URL.
Parameters:
url - URL containing standard spellings.
encoding - Character set encoding for spellings.
Throws:
java.io.IOException

public void loadStandardSpellings(java.io.Reader reader) throws java.io.IOException
Loads standard spellings from a reader.
Parameters:
reader - The reader.
Throws:
java.io.IOException

public void addMappedSpelling(java.lang.String alternateSpelling, java.lang.String standardSpelling)
Add a mapped spelling.
Parameters:
alternateSpelling - The alternate spelling.
standardSpelling - The corresponding standard spelling.

public void addStandardSpelling(java.lang.String standardSpelling)
Add a standard spelling.
Parameters:
standardSpelling - A standard spelling.

public void addStandardSpellings(java.util.Collection<java.lang.String> standardSpellings)
Add standard spellings from a collection.
Parameters:
standardSpellings - A collection of standard spellings.

public void addCachedSpelling(java.lang.String alternateSpelling, java.lang.String standardSpelling)
Cache a generated mapped spelling.
Parameters:
alternateSpelling - The alternate spelling.
standardSpelling - The corresponding standard spelling.

public void setMappedSpellings(TaggedStrings mappedSpellings)
Sets the map from alternate spellings to standard spellings.
Parameters:
mappedSpellings - Map with alternate spellings as keys and standard spellings as values.

public void setStandardSpellings(java.util.Set<java.lang.String> standardSpellings)
Sets standard spellings.
Parameters:
standardSpellings - Set of standard spellings.

public java.lang.String[] standardizeSpelling(java.lang.String spelling)
Returns standard spellings given a spelling.
Parameters:
spelling - The spelling.
If no spelling map is defined, the spelling is returned unchanged.
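
The loading and lookup methods above can be pictured with a small stand-alone sketch. It rests on several assumptions not confirmed by this page: each input line holds one alternate spelling followed by its standard spelling, every character in delimChars acts as a separator, and lookup returns a single String rather than the String[] that standardizeSpelling(spelling) returns. SpellingPairsSketch and its methods are hypothetical names, and the sample pairs are invented.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;

/** Hypothetical sketch of delimiter-separated spelling pairs; not the library's loader. */
class SpellingPairsSketch {
    // Alternate spelling -> standard spelling.
    private final Map<String, String> mapped = new HashMap<>();

    /** Reads one "alternate<delim>standard" pair per line (assumed file layout). */
    void load(Reader reader, String delimChars) throws IOException {
        BufferedReader buffered = new BufferedReader(reader);
        String line;
        while ((line = buffered.readLine()) != null) {
            // StringTokenizer treats every character of delimChars as a delimiter.
            StringTokenizer tokens = new StringTokenizer(line, delimChars);
            if (tokens.countTokens() < 2) {
                continue; // skip blank or malformed lines
            }
            String alternate = tokens.nextToken().trim();
            String standard = tokens.nextToken().trim();
            mapped.put(alternate, standard);
        }
    }

    /** Convenience wrapper for in-memory text; rethrows I/O errors unchecked. */
    static SpellingPairsSketch fromString(String text, String delimChars) {
        SpellingPairsSketch sketch = new SpellingPairsSketch();
        try {
            sketch.load(new StringReader(text), delimChars);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sketch;
    }

    /** Returns the mapped standard spelling, or the input unchanged when no mapping exists. */
    String standardize(String spelling) {
        return mapped.getOrDefault(spelling, spelling);
    }

    int size() {
        return mapped.size();
    }

    public static void main(String[] args) {
        SpellingPairsSketch sketch = fromString("vertue\tvirtue\nloue\tlove\n", "\t");
        System.out.println(sketch.standardize("vertue")); // prints "virtue"
        System.out.println(sketch.standardize("plain"));  // prints "plain"
    }
}
```

Falling back to the input when no mapping exists mirrors the documented behaviour that a spelling is returned unchanged when no spelling map is defined.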
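
public java.lang.String standardizeSpelling(java.lang.String spelling, java.lang.String wordClass)
Returns a standard spelling given a standard or alternate spelling.
Parameters:
spelling - The spelling.
wordClass - The major word class.

public int getNumberOfAlternateSpellings()
Returns the number of alternate spellings.

public int getNumberOfStandardSpellings()
Returns the number of standard spellings.

public TaggedStrings getMappedSpellings()
Return the mapped spellings.

public java.util.Set<java.lang.String> getStandardSpellings()
Return the standard spellings.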

public java.lang.String preprocessSpelling(java.lang.String spelling)
Preprocess spelling.
Parameters:
spelling - Spelling to preprocess.
By default, no preprocessing is applied; the original spelling is returned unchanged.
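
Because the default preprocessSpelling returns its input unchanged, it reads as a hook that subclasses may override. The sketch below illustrates that pattern with hypothetical classes (PreprocessSketch and LongSPreprocessSketch are not part of the library); the long-s normalization is an invented example of what an override might do, not documented behaviour.

```java
/** Hypothetical base class mirroring the documented default: no preprocessing. */
class PreprocessSketch {
    String preprocessSpelling(String spelling) {
        return spelling; // default hook returns the original spelling unchanged
    }
}

/** Hypothetical subclass that normalizes the long s before lookup. */
class LongSPreprocessSketch extends PreprocessSketch {
    @Override
    String preprocessSpelling(String spelling) {
        // Replace U+017F (LATIN SMALL LETTER LONG S) with a plain 's'.
        return spelling.replace('\u017F', 's');
    }

    public static void main(String[] args) {
        PreprocessSketch mapper = new LongSPreprocessSketch();
        System.out.println(mapper.preprocessSpelling("\u017Finne")); // prints "sinne"
    }
}
```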

public java.lang.String fixCapitalization(java.lang.String spelling, java.lang.String standardSpelling)
Fix capitalization of standardized spelling.
Parameters:
spelling - The original spelling.
standardSpelling - The candidate standard spelling.

public Logger getLogger()
Get the logger.
Specified by:
getLogger in interface UsesLogger

public void setLogger(Logger logger)
Set the logger.
Specified by:
setLogger in interface UsesLogger
Parameters:
logger - The logger.
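
This page does not say which rules fixCapitalization applies. As a rough illustration only, the sketch below assumes one plausible policy: an all-uppercase original yields an all-uppercase result, an initial capital is carried over to the standard spelling, and anything else is left as is. CapitalizationSketch is a hypothetical class, and the library's actual behaviour may differ.

```java
import java.util.Locale;

/** Hypothetical capitalization transfer; the rules here are assumptions, not the library's. */
class CapitalizationSketch {
    static String fixCapitalization(String spelling, String standardSpelling) {
        if (spelling.isEmpty() || standardSpelling.isEmpty()) {
            return standardSpelling;
        }
        // All-uppercase original of more than one character containing cased letters.
        boolean allUpper = spelling.length() > 1
            && spelling.equals(spelling.toUpperCase(Locale.ROOT))
            && !spelling.equals(spelling.toLowerCase(Locale.ROOT));
        if (allUpper) {
            return standardSpelling.toUpperCase(Locale.ROOT);
        }
        // Original starts with a capital: capitalize the first letter of the result.
        if (Character.isUpperCase(spelling.charAt(0))) {
            return Character.toUpperCase(standardSpelling.charAt(0))
                + standardSpelling.substring(1);
        }
        return standardSpelling;
    }

    public static void main(String[] args) {
        System.out.println(fixCapitalization("Vertue", "virtue")); // prints "Virtue"
        System.out.println(fixCapitalization("VERTUE", "virtue")); // prints "VIRTUE"
        System.out.println(fixCapitalization("vertue", "virtue")); // prints "virtue"
    }
}
```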