One of the original design goals of the PC-Kimmo program was to produce a reusable function library that could be used in different programs or with different user interfaces. The functions and data structures described in this manual are a result of that design goal.
The PC-Kimmo function library can be used for programs that need to handle the morphology and phonology of natural language using the two-level morphology originally invented by Kimmo Koskenniemi. (His use of the term morphology should be understood to encompass both what linguists would consider morphology proper--the decomposition of words into morphemes--and phonology--at least in the sense of morphophonemics.)
The author would appreciate feedback directed to the following address:
Stephen McConnel                 (972)708-7361 (office)
Language Software Development    (972)708-7561 (fax)
SIL International
7500 W. Camp Wisdom Road
Dallas, TX 75236                 steve@acadcomp.sil.org
U.S.A.                        or steve.mcconnel@sil.org
The basic goal behind choosing names in the PC-Kimmo function library is for the name to convey information about what it represents. This is achieved in two ways: striving for a descriptive name rather than a short cryptic abbreviated name, and following a different pattern of capitalization for each type of name.
Preprocessor macro names are written entirely in capital letters. If the name requires more than one word for an adequate description, the words are joined together with intervening underscore (_) characters.
Data structure names consist of one or more capitalized words. If the name requires more than one word for an adequate description, the words are joined together without underscores, depending on the capitalization pattern to make them readable as separate words.
Variable names in the PC-Kimmo function library follow a modified form of the Hungarian naming convention described by Steve McConnell in his book Code Complete on pages 202-206.
Variable names have three parts: a lowercase type prefix, a descriptive name, and a scope suffix.
The type prefix has the following basic possibilities:
b     char, short, or int
c     char but sometimes a short or int
d     double
e     enum or as a char, short, or int
i     int, short, long, or (rarely) char
s     struct statement
sz
pf
In addition, the basic types may be prefixed by these qualifiers:
u
a
p
The descriptive name portion of a variable name consists of one or more capitalized words concatenated together. There are no underscores (_) separating these words from each other, or from the type prefix. For the PC-Kimmo function library, the descriptive name for global variables begins with Kimmo.
The scope suffix has these possibilities:
_g
_m      (static)
_in
_out
_io
_s      (static)
The lack of a scope suffix indicates that a variable is declared within a function and exists on the stack for the duration of the current call.
Global function names in the PC-Kimmo function library have two parts: a verb that is all lowercase followed by a noun phrase containing one or more capitalized words. These pieces are concatenated without any intervening underscores (_). For the PC-Kimmo library functions, the noun phrase section includes Kimmo.
Given the discussion above, it is easy to discern at a glance what type of item each of the following names refers to.
SAMPLE_NAME        a preprocessor macro
SampleName         a data structure
pSampleName        a variable declared within a function (no scope suffix)
writeSampleName    a global function
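As an illustration of the conventions above, the following sketch declares one item of each kind. Every name in it is hypothetical and chosen only to show the patterns; none of them belongs to the PC-Kimmo library. The readings of the sz, p, and u prefixes given in the comments follow common Hungarian-notation usage and are an assumption here.

```c
#include <stddef.h>
#include <string.h>

/* preprocessor macro: all capitals, words joined by underscores */
#define SAMPLE_BUFFER_SIZE 64

/* data structure: capitalized words, no underscores */
typedef struct sample_name {
    char                 szLabel[SAMPLE_BUFFER_SIZE]; /* sz: zero-terminated string (assumed) */
    int                  iCount;                      /* i: integer */
    struct sample_name * pNext;                       /* p: pointer (assumed) */
} SampleName;

/* global variable: type prefix, descriptive name, _g suffix */
int bSampleVerbose_g = 0;

/* module variable: declared static, _m suffix; u: unsigned (assumed) */
static unsigned uiSampleCalls_m = 0;

/* global function: lowercase verb followed by a capitalized noun phrase;
 * the arguments carry the _in and _out scope suffixes */
int countSampleName(const SampleName * pList_in, size_t * puiLabelChars_out)
{
    const SampleName * pItem;   /* no suffix: ordinary local variable */
    int                iTotal = 0;

    ++uiSampleCalls_m;
    *puiLabelChars_out = 0;
    for (pItem = pList_in; pItem != NULL; pItem = pItem->pNext) {
        ++iTotal;
        *puiLabelChars_out += strlen(pItem->szLabel);
    }
    return iTotal;
}
```

Reading only the names, a caller can tell that countSampleName is a function, that pList_in is an input-only pointer argument, and that puiLabelChars_out receives a result.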
The PC-Kimmo functions operate on a number of different data structures. The most important of these are described in the following sections. The PC-Kimmo functions also use a number of other data structures internally, but it should not be necessary for a programmer to manipulate them directly.
#include <stdio.h>
/*
 *  type definition for KimmoData
 *  needed for patr.h
 */
typedef struct kimmo_data KimmoData;
#include "patr.h"               /* needed for PATRData */
/*
 *  forward declarations for internal data types
 */
typedef struct kimmo_alternation KimmoAlternation;
typedef struct kimmo_lexicon     KimmoLexicon;
typedef struct kimmo_pair        KimmoPair;
typedef struct kimmo_rule        KimmoRule;
typedef struct kimmo_subset      KimmoSubset;

struct kimmo_data {
    /*
     *  parameters for controlling the PC-Kimmo processing
     */
    char                bLimit;
    char                iTraceLevel;
    char                bUsePATR;
    char                bSilent;
    char                bShowWarnings;
    char                bAlignment;
    unsigned char       cGlossBegin;
    unsigned char       cGlossEnd;
    unsigned char       cComment;
    FILE *              pLogFP;
    /*
     *  loaded or derived from the rules file
     */
    unsigned char **    ppszAlphabet;
    unsigned short      uiAlphabetSize;
    unsigned char       cNull;
    unsigned char       cAny;
    unsigned char       cBoundary;
    char                bTwoLCFile;
    KimmoSubset *       pSubsets;
    unsigned short      uiSubsetCount;
    KimmoRule *         pAutomata;
    unsigned short      uiAutomataSize;
    KimmoPair *         pFeasiblePairs;
    unsigned short      uiFeasiblePairsCount;
    char *              pszRulesFile;
    /*
     *  loaded or derived from the lexicon file
     */
    KimmoAlternation *  pAlternations;
    unsigned short      uiAlternationCount;
    KimmoLexicon *      pLexiconSections;
    KimmoLexicon *      pInitialLexicon;
    unsigned short      uiLexiconSectionCount;
    unsigned char **    ppszFeatures;
    unsigned short      uiFeatureCount;
    char *              pszLexiconFile;
    /*
     *  loaded or derived from the grammar file
     */
    PATRData            sPATR;
};
The KimmoData data structure collects the information used for data processing within the PC-Kimmo functions. Its general purpose is to reduce the number of parameters needed by the various functions.
bLimit                  TRUE (nonzero).
iTraceLevel
bUsePATR
bSilent                 stderr).
bShowWarnings
bAlignment              TRUE.
cGlossBegin
cGlossEnd
cComment                KIMMO_DEFAULT_COMMENT is a symbol for the default value.)
pLogFP                  FILE pointer for an output log file (NULL means none).
ppszAlphabet            NULL terminated.
uiAlphabetSize
cNull
cAny
cBoundary
bTwoLCFile
pSubsets
uiSubsetCount
pAutomata
uiAutomataSize
pFeasiblePairs
uiFeasiblePairsCount
pszRulesFile            NULL means none).
pAlternations
uiAlternationCount
pLexiconSections
pInitialLexicon
uiLexiconSectionCount
ppszFeatures
uiFeatureCount
pszLexiconFile          NULL means none).
sPATR
`kimmo.h'
/*
 *  type definition for KimmoResult
 *  needed for patr.h
 */
typedef struct kimmo_result KimmoResult;
#include "patr.h"               /* needed for PATREdgeList */
/*
 *  forward declaration for internal data type
 */
typedef struct kimmo_morpheme KimmoMorpheme;

struct kimmo_result {
    KimmoResult *    pNext;
    unsigned char *  pszSynthesis;
    KimmoMorpheme *  pAnalysis;
    PATREdgeList *   pParseChart;
    unsigned char *  pszResult;
    unsigned char *  pszGloss;
    short            bOkay;
};
The KimmoResult data structure contains a single result from one of the PC-Kimmo processing functions (applyKimmoGenerator, applyKimmoRecognizer, or applyKimmoSynthesizer). It can be used to build a linked list for ambiguous results.
pNext
pszSynthesis    applyKimmoGenerator or applyKimmoSynthesizer.
pAnalysis       applyKimmoRecognizer.
pParseChart     applyKimmoRecognizer.
pszResult       applyKimmoGenerator, applyKimmoRecognizer, or applyKimmoSynthesizer. It differs from pszSynthesis in that it has the "null" characters removed.
pszGloss        applyKimmoRecognizer.
bOkay           FALSE by applyKimmoGenerator, applyKimmoRecognizer, and applyKimmoSynthesizer.
`kimmo.h'
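Because ambiguous results are chained through pNext, a caller normally walks the list until it reaches NULL. The following is a minimal sketch of such a walk. It uses a simplified stand-in type, DemoResult, that models only three of the fields shown above, and it assumes that a nonzero bOkay marks a usable result. Real code would include `kimmo.h' and use KimmoResult itself.

```c
#include <stddef.h>

/* Simplified stand-in for KimmoResult (hypothetical; only three of the
 * real fields are modeled). */
typedef struct demo_result {
    struct demo_result *  pNext;      /* next ambiguous result, or NULL */
    const unsigned char * pszResult;  /* result string */
    short                 bOkay;      /* assumed: nonzero if usable */
} DemoResult;

/* Count the results in the list whose bOkay flag is set. */
int countOkayResults(const DemoResult * pResults_in)
{
    const DemoResult * pResult;
    int                iCount = 0;

    for (pResult = pResults_in; pResult != NULL; pResult = pResult->pNext) {
        if (pResult->bOkay)
            ++iCount;
    }
    return iCount;
}
```

An empty list is simply a NULL pointer, so the loop needs no special case for it.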
This chapter gives the proper usage information about each of the global variables found in the PC-Kimmo function library. The `kimmo.h' header file contains the extern declarations for all of these variables.
#include "kimmo.h"
extern int bCancelKimmoOperation_g;

bCancelKimmoOperation_g can be set asynchronously to interrupt a PC-Kimmo parse that seems to be stuck.
4.1.3 Example
#include <signal.h>
#include "kimmo.h"
#include "patr.h"
...
void sigint_handler(int iSignal_in)
{
    bCancelKimmoOperation_g = TRUE;
    bCancelPATROperation_g  = TRUE;     /* remember embedded PATR parser */
    signal(SIGINT, sigint_handler);
}
...
signal(SIGINT, sigint_handler);
...
4.1.4 Source File `kimmdata.c'
#include "kimmo.h"
extern const char cKimmoPatchSep_g;

cKimmoPatchSep_g is used to separate the revision and patch level values when printing the PC-Kimmo version number. 'a' indicates an alpha release, 'b' indicates a beta release, and '.' indicates a production release.
4.2.3 Example See section 4.5 iKimmoVersion_g.
4.2.4 Source File `kimmdata.c'
#include "kimmo.h"
extern const int iKimmoPatchlevel_g;

iKimmoPatchlevel_g is the current patch level of the PC-Kimmo function library and program. This is the third level version number, reflecting bug fixes or internal improvements that should be functionally invisible to users.
4.3.3 Example See section 4.5 iKimmoVersion_g.
4.3.4 Source File `kimmdata.c'
#include "kimmo.h"
extern const int iKimmoRevision_g;

iKimmoRevision_g is the current revision level of the PC-Kimmo function library and program. This is the second level version number, reflecting changes to program behavior that require changes to the PC-Kimmo Reference Manual.
4.4.3 Example See section 4.5 iKimmoVersion_g.
4.4.4 Source File `kimmdata.c'
#include "kimmo.h"
extern const int iKimmoVersion_g;

iKimmoVersion_g is the current version number of the PC-Kimmo function library and program. This is the top level version number, reflecting a major rewrite of the program or major changes that make it incompatible with earlier versions of the program.
4.5.3 Example
#include <stdio.h>
#include "kimmo.h"
...
fprintf(stderr, "PC-Kimmo version %d.%d%c%d (%s), Copyright %s SIL\n",
        iKimmoVersion_g, iKimmoRevision_g, cKimmoPatchSep_g,
        iKimmoPatchlevel_g, pszKimmoDate_g, pszKimmoYear_g);
#ifdef __DATE__
fprintf(stderr, pszKimmoCompileFormat_g,
        pszKimmoCompileDate_g, pszKimmoCompileTime_g);
#else
if (pszKimmoTestVersion_g != NULL)
    fputs(pszKimmoTestVersion_g, stderr);
#endif
...
4.5.4 Source File `kimmdata.c'
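The way the four version values combine into one string can be tried without the library. The sketch below is an assumption-laden stand-in: formatDemoVersion is a hypothetical helper, and its arguments play the roles of iKimmoVersion_g, iKimmoRevision_g, cKimmoPatchSep_g, and iKimmoPatchlevel_g.

```c
#include <stdio.h>

/* Format "version.revision<sep>patchlevel" into pszBuffer_out.
 * Returns what snprintf returns: the length the full string needs,
 * not counting the terminating zero. */
int formatDemoVersion(char * pszBuffer_out, size_t uiSize_in,
                      int iVersion_in, int iRevision_in,
                      char cPatchSep_in, int iPatchlevel_in)
{
    return snprintf(pszBuffer_out, uiSize_in, "%d.%d%c%d",
                    iVersion_in, iRevision_in, cPatchSep_in, iPatchlevel_in);
}
```

With a separator of 'b' the values 2, 1, and 3 yield "2.1b3", which reads as the third beta patch of revision 2.1. With '.' the same call yields an ordinary three-part production version number.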
#include "kimmo.h"
#ifdef __DATE__
extern const char * pszKimmoCompileDate_g;
#endif

pszKimmoCompileDate_g points to a string containing the date on which the PC-Kimmo library was compiled. It exists only if the C compiler preprocessor supports the __DATE__ constant.
4.6.3 Example See section 4.5 iKimmoVersion_g.
4.6.4 Source File `kimmdata.c'
#include "kimmo.h"
#ifdef __DATE__
#ifdef __TIME__
extern const char * pszKimmoCompileFormat_g;
#endif
#endif

pszKimmoCompileFormat_g points to a printf style format string suitable for displaying pszKimmoCompileDate_g and pszKimmoCompileTime_g. It exists only if the C compiler preprocessor supports the __DATE__ and __TIME__ constants.
4.7.3 Example See section 4.5 iKimmoVersion_g.
4.7.4 Source File `kimmdata.c'
#include "kimmo.h"
#ifdef __TIME__
extern const char * pszKimmoCompileTime_g;
#endif

pszKimmoCompileTime_g points to a string containing the time at which the PC-Kimmo library was compiled. It exists only if the C compiler preprocessor supports the __TIME__ constant.
4.8.3 Example See section 4.5 iKimmoVersion_g.
4.8.4 Source File `kimmdata.c'
#include "kimmo.h"
extern const char * pszKimmoDate_g;

pszKimmoDate_g points to a string containing the date on which the PC-Kimmo library was last modified.
4.9.3 Example See section 4.5 iKimmoVersion_g.
4.9.4 Source File `kimmdata.c'
#include "kimmo.h"
#ifndef __DATE__
extern const char * pszKimmoTestVersion_g;
#endif

pszKimmoTestVersion_g points to a string describing the test status of PC-Kimmo (either alpha or beta). If this is a production release version, it is set to NULL. It is defined only if the C compiler preprocessor does not support the __DATE__ constant.
4.10.3 Example See section 4.5 iKimmoVersion_g.
4.10.4 Source File `kimmdata.c'
#include "kimmo.h"
extern const char * pszKimmoYear_g;

pszKimmoYear_g points to a string containing the year in which the PC-Kimmo library was last modified. This is suitable for a copyright notice assigning the copyright to SIL International.
4.11.3 Example See section 4.5 iKimmoVersion_g.
4.11.4 Source File `kimmdata.c'
#include "kimmo.h"
extern size_t uiKimmoCharArraySize_g;

uiKimmoCharArraySize_g determines how big a buffer is allocated for holding strings loaded from the PC-Kimmo lexicon. A larger size reduces the number of calls to malloc, and the amount of memory overhead lost for each allocation, but increases the amount of memory wasted by not being used. The default value is 8000. Setting uiKimmoCharArraySize_g to 0 causes each lexicon string to be individually allocated with malloc.
4.12.3 Example
#include "kimmo.h"
...
unsigned char   szLexiconFile_g[256];
KimmoData       sKimmoData_g;
...
uiKimmoCharArraySize_g    = 16364;
uiKimmoLexItemArraySize_g = 16364;
uiKimmoShortArraySize_g   = 16364;
if (loadKimmoLexicon(szLexiconFile_g, KIMMO_ANALYSIS, &sKimmoData_g) != 0)
    {
    reportError(ERROR_MSG, "Cannot open lexicon file %s\n", szLexiconFile_g);
    }
...
4.12.4 Source File `lexicon.c'
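The trade-off between a large buffer and individual allocations can be seen in a small experiment. The sketch below is not the library's own allocator. DemoPool and storeDemoString are hypothetical names, and the sketch deliberately never frees anything. It only counts how many times malloc is called for a given chunk size, with a chunk size of 0 standing for one allocation per string.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical string pool used only to illustrate the trade-off. */
typedef struct demo_pool {
    size_t   uiChunkSize;  /* bytes per chunk; 0 = one malloc per string */
    char *   pChunk;       /* current chunk, or NULL */
    size_t   uiUsed;       /* bytes already handed out from pChunk */
    unsigned uiMallocs;    /* number of successful malloc calls so far */
} DemoPool;

/* Copy pszString_in into the pool and return the copy, or NULL if
 * memory runs out.  Earlier chunks are not tracked in this sketch. */
char * storeDemoString(DemoPool * pPool_io, const char * pszString_in)
{
    size_t uiNeed = strlen(pszString_in) + 1;
    char * pszCopy;

    if (pPool_io->uiChunkSize == 0 || uiNeed > pPool_io->uiChunkSize) {
        /* individual allocation for this string */
        pszCopy = malloc(uiNeed);
        if (pszCopy == NULL)
            return NULL;
        ++pPool_io->uiMallocs;
    } else {
        if (pPool_io->pChunk == NULL ||
            pPool_io->uiUsed + uiNeed > pPool_io->uiChunkSize) {
            /* start a new chunk; the unused tail of the old one is wasted */
            char * pNew = malloc(pPool_io->uiChunkSize);
            if (pNew == NULL)
                return NULL;
            pPool_io->pChunk = pNew;
            pPool_io->uiUsed = 0;
            ++pPool_io->uiMallocs;
        }
        pszCopy = pPool_io->pChunk + pPool_io->uiUsed;
        pPool_io->uiUsed += uiNeed;
    }
    memcpy(pszCopy, pszString_in, uiNeed);
    return pszCopy;
}
```

Storing four short strings in a pool with a 16-byte chunk costs one malloc call, while a pool with a chunk size of 0 costs four. The price of the chunked pool is the unused space left at the end of each chunk.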
#include "kimmo.h"
extern size_t uiKimmoLexItemArraySize_g;

uiKimmoLexItemArraySize_g determines how many lexical item data structures are allocated at a time while loading the PC-Kimmo lexicon. A larger size reduces the number of calls to malloc, and the amount of memory overhead lost for each allocation, but increases the amount of memory wasted by not being used. The default value is 1000. Setting uiKimmoLexItemArraySize_g to 0 causes each lexical item data structure to be individually allocated with malloc.
4.13.3 Example See section 4.12 uiKimmoCharArraySize_g.
4.13.4 Source File `lexicon.c'
#include "kimmo.h"
extern size_t uiKimmoShortArraySize_g;

uiKimmoShortArraySize_g determines how large an array of short integers is allocated for dispensing to the individual lexical items while loading the PC-Kimmo lexicon. A larger size reduces the number of calls to malloc, and the amount of memory overhead lost for each allocation, but increases the amount of memory wasted by not being used. The default value is 2000. Setting uiKimmoShortArraySize_g to 0 causes each lexical item data structure's array of short integers to be individually allocated with malloc.
4.14.3 Example See section 4.12 uiKimmoCharArraySize_g.
4.14.4 Source File `lexicon.c'
This document gives the proper usage information about each of the functions found in the PC-Kimmo function library. The prototypes and type definitions relevant to the use of these functions are all found in the `kimmo.h' header file.
#include "kimmo.h"
KimmoResult * applyKimmoGenerator(unsigned char * pszLexForm_in,
                                  KimmoData *     pKimmo_in);

applyKimmoGenerator tries to generate the surface form of a word from the provided lexical form. The PC-Kimmo rules must be loaded before this function is called.

The arguments to applyKimmoGenerator are as follows:
pszLexForm_in
pKimmo_in
a pointer to a list of results, or NULL if unsuccessful
5.1.4 Example
#include <stdio.h>
#include <string.h>
#include "kimmo.h"
...
KimmoData sKimmoData_g;
...
void do_generate(unsigned char * pszForm_in)
{
    unsigned char * pszLexForm;
    KimmoResult *   pResults;

    if ((sKimmoData_g.ppszAlphabet == NULL) || (pszForm_in == NULL))
        return;
    /* skip leading whitespace */
    pszLexForm = pszForm_in + strspn((char *)pszForm_in, " \t\r\n\f");
    if (*pszLexForm == '\0')
        return;
    if (sKimmoData_g.pLogFP != NULL)
        fprintf(sKimmoData_g.pLogFP, "%s\n", pszLexForm);
    pResults = applyKimmoGenerator(pszLexForm, &sKimmoData_g);
    writeKimmoResults(pResults, stderr, &sKimmoData_g);
    if (sKimmoData_g.pLogFP != NULL)
        writeKimmoResults(pResults, sKimmoData_g.pLogFP, &sKimmoData_g);
    freeKimmoResult(pResults);
}
5.1.5 Source File `generate.c'
#include "kimmo.h"
KimmoResult * applyKimmoRecognizer(unsigned char * pszSurfaceForm_in,
                                   KimmoData *     pKimmo_in);

applyKimmoRecognizer tries to analyze the provided surface form of a word to create the lexical (underlying) form divided into morphemes. If the word can be divided into morphemes, and a word grammar has been loaded, applyKimmoRecognizer also tries to parse the list of morphemes to create a word parse chart with related feature structures.

The PC-Kimmo rules and lexicon must be loaded before applyKimmoRecognizer is called. If a word parse is desired, the word grammar must also be loaded before calling this function.

The arguments to applyKimmoRecognizer are as follows:
pszSurfaceForm_in
pKimmo_in
a pointer to a list of results, or NULL if unsuccessful
5.2.4 Example
#include <stdio.h>
#include <string.h>
#include "kimmo.h"
...
KimmoData sKimmoData_g;
...
void do_recognize(unsigned char * pszForm_in)
{
    unsigned char * pszSurfForm;
    KimmoResult *   pResults;

    if (    (sKimmoData_g.ppszAlphabet     == NULL) ||
            (sKimmoData_g.pLexiconSections == NULL) ||
            (pszForm_in                    == NULL) )
        return;
    /* skip leading whitespace */
    pszSurfForm = pszForm_in + strspn((char *)pszForm_in, " \t\r\n\f");
    if (*pszSurfForm == '\0')
        return;
    if (sKimmoData_g.pLogFP != NULL)
        fprintf(sKimmoData_g.pLogFP, "%s\n", pszSurfForm);
    pResults = applyKimmoRecognizer(pszSurfForm, &sKimmoData_g);
    writeKimmoResults(pResults, stderr, &sKimmoData_g);
    if (sKimmoData_g.pLogFP != NULL)
        writeKimmoResults(pResults, sKimmoData_g.pLogFP, &sKimmoData_g);
    freeKimmoResult(pResults);
}
5.2.5 Source File `recogniz.c'
#include "kimmo.h"
KimmoResult * applyKimmoSynthesizer(unsigned char * pszMorphemes_in,
                                    KimmoData *     pKimmo_in);

applyKimmoSynthesizer tries to synthesize a word from a string containing an ordered list of morpheme names (glosses) separated by spaces. The PC-Kimmo rules and synthesis lexicon must be loaded before this function is called.

The arguments to applyKimmoSynthesizer are as follows:
pszMorphemes_in
pKimmo_in
a pointer to a list of results, or NULL if unsuccessful
5.3.4 Example
#include <stdio.h>
#include <string.h>
#include "kimmo.h"
...
KimmoData sKimmoData_g;
KimmoData sSynthesisData_g;     /* holds the synthesis lexicon */
...
/* copy the settings and rules into the synthesis data */
static void fix_synthesis_data(void)
{
    sSynthesisData_g.bLimit               = sKimmoData_g.bLimit;
    sSynthesisData_g.iTraceLevel          = sKimmoData_g.iTraceLevel;
    sSynthesisData_g.bUsePATR             = sKimmoData_g.bUsePATR;
    sSynthesisData_g.bSilent              = sKimmoData_g.bSilent;
    sSynthesisData_g.bShowWarnings        = sKimmoData_g.bShowWarnings;
    sSynthesisData_g.bAlignment           = sKimmoData_g.bAlignment;
    sSynthesisData_g.cGlossBegin          = sKimmoData_g.cGlossBegin;
    sSynthesisData_g.cGlossEnd            = sKimmoData_g.cGlossEnd;
    sSynthesisData_g.cComment             = sKimmoData_g.cComment;
    sSynthesisData_g.pLogFP               = sKimmoData_g.pLogFP;
    sSynthesisData_g.ppszAlphabet         = sKimmoData_g.ppszAlphabet;
    sSynthesisData_g.uiAlphabetSize       = sKimmoData_g.uiAlphabetSize;
    sSynthesisData_g.cNull                = sKimmoData_g.cNull;
    sSynthesisData_g.cAny                 = sKimmoData_g.cAny;
    sSynthesisData_g.cBoundary            = sKimmoData_g.cBoundary;
    sSynthesisData_g.bTwoLCFile           = sKimmoData_g.bTwoLCFile;
    sSynthesisData_g.pSubsets             = sKimmoData_g.pSubsets;
    sSynthesisData_g.uiSubsetCount        = sKimmoData_g.uiSubsetCount;
    sSynthesisData_g.pAutomata            = sKimmoData_g.pAutomata;
    sSynthesisData_g.uiAutomataSize       = sKimmoData_g.uiAutomataSize;
    sSynthesisData_g.pFeasiblePairs       = sKimmoData_g.pFeasiblePairs;
    sSynthesisData_g.uiFeasiblePairsCount = sKimmoData_g.uiFeasiblePairsCount;
    sSynthesisData_g.pszRulesFile         = sKimmoData_g.pszRulesFile;
    memset(&sSynthesisData_g.sPATR, 0, sizeof(PATRData));
}

void do_synthesize(unsigned char * pszForm_in)
{
    unsigned char * pszMorphForm;
    KimmoResult *   pResults;

    if (    (sKimmoData_g.ppszAlphabet         == NULL) ||
            (sSynthesisData_g.pLexiconSections == NULL) ||
            (pszForm_in                        == NULL) )
        return;
    /* skip leading whitespace */
    pszMorphForm = pszForm_in + strspn((char *)pszForm_in, " \t\r\n\f");
    if (*pszMorphForm == '\0')
        return;
    fix_synthesis_data();
    if (sKimmoData_g.pLogFP != NULL)
        fprintf(sKimmoData_g.pLogFP, "%s\n", pszMorphForm);
    /* synthesis must use the data that holds the synthesis lexicon */
    pResults = applyKimmoSynthesizer(pszMorphForm, &sSynthesisData_g);
    writeKimmoResults(pResults, stderr, &sKimmoData_g);
    if (sKimmoData_g.pLogFP != NULL)
        writeKimmoResults(pResults, sKimmoData_g.pLogFP, &sKimmoData_g);
    freeKimmoResult(pResults);
}
5.3.5 Source File `synthesi.c'
#include "kimmo.h"
int checkKimmoRuleStatus(int         iRule_in,
                         KimmoData * pKimmo_in);

checkKimmoRuleStatus checks whether or not the given rule is active.

The arguments to checkKimmoRuleStatus are as follows:
iRule_in
pKimmo_in
TRUE if the given rule is active, otherwise FALSE
5.4.4 Example
#include <stdio.h>
#include "kimmo.h"
...
void show_rule_status(KimmoData * pKimmo_in)
{
    int i;
    int iCount;
    int iWidth;
    int bActive;

    if (pKimmo_in->uiAutomataSize == 0)
        {
        fprintf(stderr, " There are no rules.\n");
        return;
        }
    /* count the active rules (rule numbers start at 1) */
    for ( iCount = 0, i = 1 ; i <= pKimmo_in->uiAutomataSize ; ++i )
        {
        if (checkKimmoRuleStatus(i, pKimmo_in))
            ++iCount;
        }
    if (iCount == pKimmo_in->uiAutomataSize)
        {
        fprintf(stderr, " Rules are ALL ON.\n");
        return;
        }
    if (iCount == 0)
        {
        fprintf(stderr, " Rules are ALL OFF.\n");
        return;
        }
    fprintf(stderr, " Rules are");
    iWidth = 13;
    for ( i = 1 ; i <= pKimmo_in->uiAutomataSize ; ++i )
        {
        if (iWidth == 0)
            {
            fputs(" ", stderr);
            iWidth = 13;
            }
        bActive = checkKimmoRuleStatus(i, pKimmo_in);
        if (i < pKimmo_in->uiAutomataSize)
            {
            fprintf(stderr, "%3d %s", i, bActive ? "ON, ":"OFF,");
            iWidth += 8;
            }
        else
            {
            fprintf(stderr, "%3d %s", i, bActive ? "ON ":"OFF");
            iWidth += 7;
            }
        if (iWidth >= 72)
            {
            putc( '\n', stderr);
            iWidth = 0;
            }
        }
    /* finish the last line if it is not already terminated */
    if (iWidth != 0)
        putc( '\n', stderr);
}
5.4.5 Source File `rules.c'
#include "kimmo.h"
unsigned char * concatKimmoMorphFeatures(KimmoMorpheme * pMorphemes_in,
                                         char *          pszSeparate_in,
                                         KimmoData *     pKimmo_in);

concatKimmoMorphFeatures concatenates the feature names from a list of morphemes created by applyKimmoRecognizer as part of a KimmoResult data structure.

The arguments to concatKimmoMorphFeatures are as follows:
pMorphemes_in
pAnalysis
element of a KimmoResult
data structure.
pszSeparate_in
pKimmo_in
a pointer to a dynamically allocated string containing the concatenated feature names from a list of morphemes, or NULL
5.5.4 Example
#include <stdio.h>
#include <string.h>
#include "kimmo.h"
#include "patr.h"
#include "opaclib.h"
...
void write_as_WordTemplate(unsigned char * pszForm_in,
                           KimmoResult *   pResults_in,
                           KimmoData *     pKimmo_in,
                           FILE *          pOutputFP_in)
{
    KimmoResult *  pResult;
    WordAnalysis * pAnal;
    WordTemplate * pWord;

    if ((pszForm_in == NULL) || (pOutputFP_in == NULL))
        return;
    /*
     *  allocate and initialize a WordTemplate structure
     */
    pWord = (WordTemplate *)allocMemory(sizeof(WordTemplate));
    pWord->pszFormat    = NULL;
    pWord->pszOrigWord  = pszForm_in;
    pWord->paWord       = NULL;
    pWord->pszNonAlpha  = NULL;
    pWord->iCapital     = 0;
    pWord->iOutputFlags = WANT_DECOMPOSITION | WANT_FEATURES |
                          WANT_UNDERLYING | WANT_ORIGINAL;
    pWord->pAnalyses    = NULL;
    pWord->pNewWords    = NULL;
    /*
     *  convert the results into a list of WordAnalysis structures
     */
    for ( pResult = pResults_in ; pResult ; pResult = pResult->pNext )
        {
        pAnal = (WordAnalysis *)allocMemory(sizeof(WordAnalysis));
        pAnal->pszAnalysis       = (char *)concatKimmoMorphGlosses(
                                       pResult->pAnalysis, " ", pKimmo_in);
        pAnal->pszDecomposition  = (char *)concatKimmoMorphLexemes(
                                       pResult->pAnalysis, "-", pKimmo_in);
        pAnal->pszCategory       = NULL;
        pAnal->pszProperties     = NULL;
        pAnal->pszFeatures       = (char *)concatKimmoMorphFeatures(
                                       pResult->pAnalysis, " ", pKimmo_in);
        pAnal->pszUnderlyingForm = duplicateString(pResult->pszResult);
        pAnal->pszSurfaceForm    = pszForm_in;
        pAnal->pNext             = pWord->pAnalyses;
        pWord->pAnalyses         = pAnal;
        }
    /*
     *  write the WordTemplate data and free the memory it used
     */
    writeTemplate(pOutputFP_in, NULL, pWord, NULL);
    pWord->pszOrigWord = NULL;
    for ( pAnal = pWord->pAnalyses ; pAnal ; pAnal = pAnal->pNext )
        pAnal->pszSurfaceForm = NULL;
    freeWordTemplate(pWord);
}
5.5.5 Source File `pckfuncs.c'
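The general shape of such a concatenation (walk the list, join the pieces with a separator, return a dynamically allocated string) can be sketched independently of the library. The layout of KimmoMorpheme is internal, so the sketch below uses a hypothetical DemoMorph node and a hypothetical concatDemoGlosses function. It assumes that the caller releases the returned string with free.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical list node standing in for the internal KimmoMorpheme. */
typedef struct demo_morph {
    struct demo_morph * pNext;
    const char *        pszGloss;
} DemoMorph;

/* Join the glosses with pszSeparate_in between them.  Returns a
 * malloc'd string that the caller must free, or NULL if the list is
 * empty or memory runs out. */
char * concatDemoGlosses(const DemoMorph * pMorphs_in,
                         const char *      pszSeparate_in)
{
    const DemoMorph * pMorph;
    size_t            uiSepLen = strlen(pszSeparate_in);
    size_t            uiTotal  = 1;     /* room for the terminating zero */
    char *            pszJoined;
    char *            pszEnd;

    if (pMorphs_in == NULL)
        return NULL;
    /* first pass: measure */
    for (pMorph = pMorphs_in; pMorph != NULL; pMorph = pMorph->pNext) {
        uiTotal += strlen(pMorph->pszGloss);
        if (pMorph->pNext != NULL)
            uiTotal += uiSepLen;
    }
    pszJoined = malloc(uiTotal);
    if (pszJoined == NULL)
        return NULL;
    /* second pass: copy */
    pszEnd = pszJoined;
    for (pMorph = pMorphs_in; pMorph != NULL; pMorph = pMorph->pNext) {
        size_t uiLen = strlen(pMorph->pszGloss);
        memcpy(pszEnd, pMorph->pszGloss, uiLen);
        pszEnd += uiLen;
        if (pMorph->pNext != NULL) {
            memcpy(pszEnd, pszSeparate_in, uiSepLen);
            pszEnd += uiSepLen;
        }
    }
    *pszEnd = '\0';
    return pszJoined;
}
```

Measuring before copying keeps the function to a single allocation, and the separator is written only between items, never after the last one.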
#include "kimmo.h"
unsigned char * concatKimmoMorphGlosses(KimmoMorpheme * pMorphemes_in,
                                        char *          pszSeparate_in,
                                        KimmoData *     pKimmo_in);

concatKimmoMorphGlosses concatenates the glosses from a list of morphemes created by applyKimmoRecognizer as part of a KimmoResult data structure.

The arguments to concatKimmoMorphGlosses are as follows:
pMorphemes_in
pAnalysis
element of a KimmoResult
data structure.
pszSeparate_in
pKimmo_in
a pointer to a dynamically allocated string containing the concatenated glosses from a list of morphemes, or NULL
5.6.4 Example See section 5.5 concatKimmoMorphFeatures.
5.6.5 Source File `pckfuncs.c'
#include "kimmo.h"
unsigned char * concatKimmoMorphLexemes(KimmoMorpheme * pMorphemes_in,
                                        char *          pszSeparate_in,
                                        KimmoData *     pKimmo_in);

concatKimmoMorphLexemes concatenates the lexical (underlying) forms from a list of morphemes created by applyKimmoRecognizer as part of a KimmoResult data structure.

The arguments to concatKimmoMorphLexemes are as follows:
pMorphemes_in
pAnalysis
element of a KimmoResult
data structure.
pszSeparate_in
pKimmo_in
a pointer to a dynamically allocated string containing the concatenated lexical forms from a list of morphemes, or NULL
5.7.4 Example See section 5.5 concatKimmoMorphFeatures.
5.7.5 Source File `pckfuncs.c'
#include "kimmo.h"
void freeKimmoLexicon(KimmoData * pKimmo_io);

freeKimmoLexicon frees the memory used to store the lexicon portion of the KimmoData information.

freeKimmoLexicon has only one argument:
pKimmo_io
none
5.8.4 Example
#include <string.h>
#include "kimmo.h"
#include "patr.h"
...
KimmoData sKimmoData_g;
KimmoData sSynthesisData_g;     /* for synthesis lexicon */
...
/* clear everything that sSynthesisData_g shares with sKimmoData_g */
static void reset_synthesis_data(void)
{
    sSynthesisData_g.bLimit               = FALSE;
    sSynthesisData_g.iTraceLevel          = 0;
    sSynthesisData_g.bUsePATR             = FALSE;
    sSynthesisData_g.bSilent              = FALSE;
    sSynthesisData_g.bShowWarnings        = FALSE;
    sSynthesisData_g.bAlignment           = FALSE;
    sSynthesisData_g.cGlossBegin          = '\0';
    sSynthesisData_g.cGlossEnd            = '\0';
    sSynthesisData_g.cComment             = '\0';
    sSynthesisData_g.pLogFP               = NULL;
    sSynthesisData_g.ppszAlphabet         = NULL;
    sSynthesisData_g.uiAlphabetSize       = 0;
    sSynthesisData_g.cNull                = '\0';
    sSynthesisData_g.cAny                 = '\0';
    sSynthesisData_g.cBoundary            = '\0';
    sSynthesisData_g.bTwoLCFile           = FALSE;
    sSynthesisData_g.pSubsets             = NULL;
    sSynthesisData_g.uiSubsetCount        = 0;
    sSynthesisData_g.pAutomata            = NULL;
    sSynthesisData_g.uiAutomataSize       = 0;
    sSynthesisData_g.pFeasiblePairs       = NULL;
    sSynthesisData_g.uiFeasiblePairsCount = 0;
    sSynthesisData_g.pszRulesFile         = NULL;
    memset(&sSynthesisData_g.sPATR, 0, sizeof(PATRData));
}

void do_clear(void)
{
    freeKimmoRules(&sKimmoData_g);
    freeKimmoLexicon(&sKimmoData_g);
    freePATRGrammar(&sKimmoData_g.sPATR);
    sKimmoData_g.bUsePATR = FALSE;
    freePATRInternalMemory();
    reset_synthesis_data();     /* prevent double freeing */
    freeKimmoLexicon(&sSynthesisData_g);
}
5.8.5 Source File `lexicon.c'
#include "kimmo.h"
void freeKimmoResult(KimmoResult * pResults_io);

freeKimmoResult frees the memory used by a list of KimmoResult data structures.

freeKimmoResult has only one argument:
pResults_io
KimmoResult
data
structures.
none
5.9.4 Example See section 5.1 applyKimmoGenerator, See section 5.2 applyKimmoRecognizer, or See section 5.3 applyKimmoSynthesizer.
5.9.5 Source File `pckfuncs.c'
#include "kimmo.h"
void freeKimmoRules(KimmoData * pKimmo_io);

freeKimmoRules frees the memory used to store the rules portion of the KimmoData information.

freeKimmoRules has only one argument:
pKimmo_io
none
5.10.4 Example See section 5.8 freeKimmoLexicon.
5.10.5 Source File `rules.c'
#include "kimmo.h"
int loadKimmoLexicon(unsigned char * pszLexiconFile_in,
                     int             eLexiconType_in,
                     KimmoData *     pKimmo_io);

loadKimmoLexicon loads a PC-Kimmo lexicon, starting with the primary lexicon file. If a lexicon has already been loaded, then the existing lexicon is erased before this lexicon file is read.

The arguments to loadKimmoLexicon are as follows:
pszLexiconFile_in
eLexiconType_in
KIMMO_ANALYSIS
KIMMO_SYNTHESIS
pKimmo_io
zero if successful, -1 if an error occurs
5.11.4 Example
#include "kimmo.h"
#include "patr.h"
...
KimmoData sKimmoData_g;
...
/*
 *  load the PC-Kimmo data files.
 *  return the number of files successfully loaded (0-3)
 */
int load_kimmo_files(char * pszRules_in,
                     char * pszLexicon_in,
                     char * pszGrammar_in)
{
    if (loadKimmoRules(pszRules_in, &sKimmoData_g) != 0)
        return 0;
    if (loadKimmoLexicon(pszLexicon_in, KIMMO_ANALYSIS, &sKimmoData_g) != 0)
        return 1;
    if (loadPATRGrammar(pszGrammar_in, &sKimmoData_g.sPATR) == 0)
        return 2;
    return 3;
}
5.11.5 Source File `file.c'
#include "kimmo.h"
int loadKimmoRules(unsigned char * pszRuleFile_in,
                   KimmoData *     pKimmo_io);

loadKimmoRules loads a PC-Kimmo rules file. If rules have already been loaded, then the existing rules and lexicon are erased before this rules file is read.

The arguments to loadKimmoRules are as follows:
pszRuleFile_in
pKimmo_io
zero if okay, -1 if an error occurs
5.12.4 Example See section 5.11 loadKimmoLexicon.
5.12.5 Source File `file.c'
#include "kimmo.h"
void setKimmoRuleStatus(int         iRule_in,
                        int         bValue_in,
                        KimmoData * pKimmo_io);

setKimmoRuleStatus sets the status (active or inactive) of a given PC-Kimmo rule. The set of feasible pairs is automatically recomputed as a side effect of calling this function.

The arguments to setKimmoRuleStatus are as follows:
iRule_in
If iRule_in is equal to zero (0), then all of the rules are turned on or off according to bValue_in.
bValue_in
pKimmo_io
none
5.13.4 Example
#include <ctype.h>
#include <stdlib.h>
#include <string.h>
#include "kimmo.h"
#include "cportlib.h"
...
KimmoData sKimmoData_g;
...
void do_set_rule(char * pszArgument_in, int bValue_in)
{
    int    i;
    char * pszNumber;
    char * pszNext;

    if (    (strcasecmp(pszArgument_in, "all") == 0) ||
            (strcasecmp(pszArgument_in, "al")  == 0) ||
            (strcasecmp(pszArgument_in, "a")   == 0) )
        {
        /* rule number 0 addresses every rule */
        setKimmoRuleStatus(0, bValue_in, &sKimmoData_g);
        return;
        }
    for (pszNumber = pszArgument_in ; *pszNumber ; pszNumber = pszNext)
        {
        i = strtol(pszNumber, &pszNext, 10);
        if (pszNext == pszNumber)
            break;
        if ((i > 0) && (i <= sKimmoData_g.uiAutomataSize))
            setKimmoRuleStatus(i, bValue_in, &sKimmoData_g);
        else
            break;
        }
    if (*pszNumber != '\0')
        fprintf(stderr, "Invalid argument to SET RULE: \"%s\"\n", pszNumber);
}
5.13.5 Source File `file.c'
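The numbering convention used here (rule numbers start at 1, and 0 addresses every rule) is easy to get wrong by one. The following self-contained sketch models just that convention with a hypothetical status array. It does not recompute feasible pairs, and setDemoRuleStatus and checkDemoRuleStatus are stand-ins, not library functions.

```c
#define DEMO_RULE_COUNT 4       /* hypothetical number of loaded rules */

/* abRuleActive_m[0] belongs to rule 1, abRuleActive_m[1] to rule 2, ... */
static char abRuleActive_m[DEMO_RULE_COUNT] = { 1, 1, 1, 1 };

/* Rule numbers are 1-based; 0 addresses every rule at once.
 * Out-of-range rule numbers are silently ignored in this sketch. */
void setDemoRuleStatus(int iRule_in, int bValue_in)
{
    int i;

    if (iRule_in == 0) {
        for (i = 0; i < DEMO_RULE_COUNT; ++i)
            abRuleActive_m[i] = (char)(bValue_in != 0);
    } else if (iRule_in > 0 && iRule_in <= DEMO_RULE_COUNT) {
        abRuleActive_m[iRule_in - 1] = (char)(bValue_in != 0);
    }
}

/* Returns 1 if the given rule is active, otherwise 0. */
int checkDemoRuleStatus(int iRule_in)
{
    if (iRule_in < 1 || iRule_in > DEMO_RULE_COUNT)
        return 0;
    return abRuleActive_m[iRule_in - 1];
}
```

Turning everything off with rule number 0 and then enabling a single rule is the usual way to test one rule in isolation.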
#include "kimmo.h"
void writeKimmoFeasiblePairs(FILE *      pOutputFP_in,
                             KimmoData * pKimmo_in);

writeKimmoFeasiblePairs writes a list of the current PC-Kimmo feasible pairs to the output file.

The arguments to writeKimmoFeasiblePairs are as follows:
pOutputFP_in
FILE
pointer.
pKimmo_in
none
5.14.4 Example
#include <stdio.h>
#include <string.h>
#include "kimmo.h"
extern char * strlwr P((char * pszString_io));
...
KimmoData sKimmoData_g;
...
void do_list(char * pszArgument_in)
{
    strlwr(pszArgument_in);
    if (    (strcmp(pszArgument_in, "l") == 0) ||
            (strcmp(pszArgument_in, "lexicon") == 0) )
        writeKimmoLexiconSectionNames(stderr, &sKimmoData_g);
    else if ((strcmp(pszArgument_in, "p") == 0) ||
             (strcmp(pszArgument_in, "pairs") == 0) )
        writeKimmoFeasiblePairs(stderr, &sKimmoData_g);
    else if ((strcmp(pszArgument_in, "r") == 0) ||
             (strcmp(pszArgument_in, "rules") == 0) )
        writeKimmoRulesStatus(stderr, &sKimmoData_g);
    else
        fprintf(stderr, "Invalid argument for list command: %s\n",
                pszArgument_in);
}
5.14.5 Source File `rules.c'
#include "kimmo.h"
int writeKimmoLexiconSection(unsigned char * pszLexSection_in,
                             FILE *          pOutputFP_in,
                             KimmoData *     pKimmo_in);

writeKimmoLexiconSection writes the designated section of the PC-Kimmo lexicon to the output file. This is useful only for debugging purposes.

The arguments to writeKimmoLexiconSection are as follows:
pszLexSection_in
pOutputFP_in
FILE
pointer.
pKimmo_in
TRUE if successful, FALSE if the lexicon section does not exist
5.15.4 Example
#include <stdio.h>
#include "kimmo.h"
#include "cmd.h"
#include "rpterror.h"
...
KimmoData sKimmoData_g;
...
void show_lexicon(char * pszLexName_in)
{
    if ((pszLexName_in == NULL) || (pszLexName_in[0] == '\0'))
        {
        displayNumberedMessage(&sCmdMissingArgument_g,
                               sKimmoData_g.bSilent,
                               sKimmoData_g.bShowWarnings,
                               sKimmoData_g.pLogFP,
                               NULL, 0,
                               "SHOW LEXICON" );
        }
    else if (writeKimmoLexiconSection(pszLexName_in, stderr,
                                      &sKimmoData_g) == FALSE)
        {
        displayNumberedMessage(&sCmdBadArgument_g,
                               sKimmoData_g.bSilent,
                               sKimmoData_g.bShowWarnings,
                               sKimmoData_g.pLogFP,
                               NULL, 0,
                               "SHOW LEXICON", pszLexName_in);
        }
}
5.15.5 Source File `lexicon.c'
#include "kimmo.h"
void writeKimmoLexiconSectionNames(FILE *      pOutputFP_in,
                                   KimmoData * pKimmo_in);

writeKimmoLexiconSectionNames writes a list of the PC-Kimmo lexicon section names to the output file.

The arguments to writeKimmoLexiconSectionNames are as follows:
pOutputFP_in
FILE
pointer.
pKimmo_in
none
5.16.4 Example See section 5.14 writeKimmoFeasiblePairs.
5.16.5 Source File `lexicon.c'
#include "kimmo.h"
void writeKimmoResults(KimmoResult * pResults_in,
                       FILE *        pOutputFP_in,
                       KimmoData *   pKimmo_in);

writeKimmoResults writes a list of PC-Kimmo results to the output file. If pResults_in is NULL, then nothing is written to the output file.

The arguments to writeKimmoResults are as follows:
pResults_in
applyKimmoGenerator
, applyKimmoRecognizer
, or
applyKimmoSynthesizer
.
pOutputFP_in
FILE
pointer.
pKimmo_in
none
5.17.4 Example See section 5.1 applyKimmoGenerator, See section 5.2 applyKimmoRecognizer, or See section 5.3 applyKimmoSynthesizer.
5.17.5 Source File `pckfuncs.c'
#include "kimmo.h"
void writeKimmoRule(unsigned    uiRuleNumber_in,
                    FILE *      pOutputFP_in,
                    KimmoData * pKimmo_in);

writeKimmoRule writes the designated PC-Kimmo rule to the output file. This is useful only for debugging purposes.

The arguments to writeKimmoRule are as follows:
uiRuleNumber_in
pOutputFP_in
FILE
pointer.
pKimmo_in
none
5.18.4 Example
#include <stdlib.h>
#include "kimmo.h"
#include "cmd.h"
...
KimmoData sKimmoData_g;
...
void do_show_rule(char * pszArgument_in)
{
    int k;

    if (pszArgument_in == (char *)NULL)
        {
        displayNumberedMessage(&sCmdMissingKeyword_g,
                               sKimmoData_g.bSilent,
                               sKimmoData_g.bShowWarnings,
                               sKimmoData_g.pLogFP,
                               NULL, 0,
                               "SHOW RULE");
        return;
        }
    k = atoi(pszArgument_in);
    if ( (k <= 0) || (k > sKimmoData_g.uiAutomataSize) )
        displayNumberedMessage(&sCmdBadArgument_g,
                               sKimmoData_g.bSilent,
                               sKimmoData_g.bShowWarnings,
                               sKimmoData_g.pLogFP,
                               NULL, 0,
                               "SHOW RULE", pszArgument_in);
    else
        writeKimmoRule( k, stderr, &sKimmoData_g );
}
5.18.5 Source File `rules.c'
#include "kimmo.h"
void writeKimmoRulesStatus(FILE *      pOutputFP_in,
                           KimmoData * pKimmo_in);

writeKimmoRulesStatus writes the status ("on" or "off") and name for each of the PC-Kimmo rules currently loaded from a rules file.

The arguments to writeKimmoRulesStatus are as follows:
pOutputFP_in
FILE
pointer.
pKimmo_in
none
5.19.4 Example See section 5.14 writeKimmoFeasiblePairs.
5.19.5 Source File `file.c'
This document was generated on 20 March 2003 using texi2html 1.56k.