Since it was released in 1988, the AMPLE program has been used for morphological analysis in many different languages. It has always functioned as a batch processing program, which is useful for production work such as analyzing an entire book, but is less useful during the early stages of developing a morphological description. The AMPLE function library has therefore been developed with the goal of making it easier to cast AMPLE style morphological parsing into different frameworks. This has already borne fruit: the PC-PATR syntactic parser now has an embedded AMPLE morphological parser, and a Microsoft Windows DLL incorporating the AMPLE functions has been written.
The basic goal behind choosing names in the AMPLE function library is for the name to convey information about what it represents. This is achieved in two ways: striving for a descriptive name rather than a short cryptic abbreviated name, and following a different pattern of capitalization for each type of name.
Preprocessor macro names are written entirely in capital letters. If
the name requires more than one word for an adequate description, the
words are joined together with intervening underscore (_)
characters.
Data structure names consist of one or more capitalized words. If the name requires more than one word for an adequate description, the words are joined together without underscores, depending on the capitalization pattern to make them readable as separate words.
Variable names in the AMPLE function library follow a modified form of the Hungarian naming convention described by Steve McConnell in his book Code Complete on pages 202-206.
Variable names have three parts: a lowercase type prefix, a descriptive name, and a scope suffix.
The type prefix has the following basic possibilities:
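b
a Boolean, usually encoded as a char, short, or int
c
a character, usually a char but sometimes a short or int
d
a double precision floating point number, that is, a double
e
an enumeration, encoded as an enum or as a char, short, or int
i
an integer, that is, an int, short, long, or (rarely) char
s
a data structure defined by a struct statement
sz
a NUL-terminated character string
pf
a pointer to a function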
In addition, the basic types may be prefixed by these qualifiers:
u
unsigned
a
array
p
pointer
The descriptive name portion of a variable name consists of one or more
capitalized words concatenated together. There are no underscores
(_) separating these words from each other, or from the type
prefix. For the AMPLE function library, the descriptive
name for global variables
begins with Ample.
The scope suffix has these possibilities:
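_g
a global variable
_m
a variable local to a single source file (declared static)
_in
a function argument used for input
_out
a function argument used for output
_io
a function argument used for both input and output
_s
a variable local to a function that keeps its value between calls (declared static)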
The lack of a scope suffix indicates that a variable is declared within a function and exists on the stack for the duration of the current call.
Global function names in the AMPLE function library have two parts: an all-lowercase verb followed by a noun phrase containing one or more capitalized words. These pieces are concatenated without any intervening underscores (_). For the AMPLE library functions, the noun phrase includes Ample.
Given the discussion above, it is easy to discern at a glance what type of item each of the following names refers to.
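SAMPLE_NAME
a preprocessor macro
SampleName
a data structure
pSampleName
a pointer variable local to a function
writeSampleName
a function (one that writes a SampleName).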
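The conventions above can be seen together in a short sketch. Every name in it (SAMPLE_NAME_SIZE, SampleName, iSampleWriteCount_m, writeSampleName) is invented for illustration and is not part of the AMPLE function library.

```c
#include <assert.h>
#include <stdio.h>

/* Preprocessor macro: all capitals, words joined by underscores. */
#define SAMPLE_NAME_SIZE 32

/* Data structure name: capitalized words, no underscores. */
typedef struct {
    char szText[SAMPLE_NAME_SIZE];  /* character array holding a string */
    int  iLength;                   /* integer */
} SampleName;

/* File-level static variable: note the _m scope suffix. */
static int iSampleWriteCount_m = 0;

/* Function name: lowercase verb followed by a capitalized noun phrase.
 * Copies the name into the caller's buffer and returns its length,
 * or -1 if the buffer is too small. */
int writeSampleName(const SampleName * pName_in,
                    char * pszBuffer_out, unsigned uiBufferSize_in)
{
    int iWritten;   /* no scope suffix: an ordinary local variable */

    assert(pName_in != NULL && pszBuffer_out != NULL);
    iWritten = snprintf(pszBuffer_out, uiBufferSize_in, "%s",
                        pName_in->szText);
    if (iWritten < 0 || (unsigned)iWritten >= uiBufferSize_in)
        return -1;
    ++iSampleWriteCount_m;
    return iWritten;
}
```

The argument suffixes (_in, _out) show the direction of data flow at the call site without consulting the function body.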
The AMPLE functions generally operate on two basic data structures:
AmpleData stores the lexicon and other linguistic information
necessary for morphological parsing, and AmpleWord stores the
information for a single word that is being parsed. Each of these data
structures is a collection of other data structures. Several of these
are described in
section `The OPAC function library data structures' in OPAC Function Library Reference Manual,
and the other data structures are usually not important for using the
AMPLE function library.
#include <stdio.h>
#include "opaclib.h"
typedef struct ample_allo_env AmpleAlloEnv;
typedef struct ample_cat_class AmpleCategoryClass;
typedef struct ample_fnlist AmpleTestList;
typedef struct ample_hlalist AmpleHeadlistList;
typedef struct ample_morph_class AmpleMorphClass;
typedef struct ample_morph_constraint AmpleMorphConstraint;
typedef struct ample_morpheme AmpleMorpheme;
typedef struct ample_pairlist AmplePairList;
typedef struct ample_prop AmpleProperty;
typedef struct {
/*
* information provided directly by the user
*/
unsigned char bDebugAllomorphConds; /* -a */
unsigned char bEnableAllomorphIDs; /* -b */
unsigned char cBeginComment; /* -c */
unsigned char bRootGlosses; /* -g */
int iMaxTrieDepth; /* -d */
int iMaxMorphnameLength; /* -n */
int eTraceAnalysis; /* -t */
int iOutputFlags; /* -w -x, \\cat ... */
int iDebugLevel; /* -/ */
FILE * pLogFP;
/*
* information loaded from the selective analysis file
*/
char * pszSelectiveAnalFile;
StringList * pSelectiveAnalMorphs;
/*
* information loaded from the text input control file
*/
TextControl sTextCtl;
/*
* information loaded from the "analysis data" (control) file
*/
char * pszAnalysisDataFile;
AmpleTestList * pPrefixSuccTests; /* \\pt */
AmpleTestList * pRootSuccTests; /* \\rt */
AmpleTestList * pSuffixSuccTests; /* \\st */
AmpleTestList * pInfixSuccTests; /* \\it */
AmpleTestList * pFinalTests; /* \\ft */
int eWriteCategory; /* \\cat */
int bWriteMorphCats;
StringList * pCategories; /* \\ca */
AmpleCategoryClass * pCategoryClasses; /* \\ccl */
char cBeginRoot; /* \\rd */
char cEndRoot;
StringClass * pStringClasses; /* \\scl (all files) */
AmplePairList * pInfixAdhocPairs; /* \\iah */
AmplePairList * pPrefixAdhocPairs; /* \\pah */
AmplePairList * pRootAdhocPairs; /* \\rah */
AmplePairList * pSuffixAdhocPairs; /* \\sah */
unsigned char * pCompoundRootPairs; /* \\cr */
AmpleMorphClass * pMorphClasses; /* \\mcl */
AmpleProperty * pProperties; /* \\ap, \\mp */
StringList * pPropertySets;
int iMaxPrefixCount; /* \\maxp */
int iMaxInfixCount; /* \\maxi */
int iMaxRootCount; /* \\maxr */
int iMaxSuffixCount; /* \\maxs */
AmpleMorphConstraint * pMorphConstraints; /* \\mcc */
int iMaxNullCount; /* \\maxnull */
char * pszValidChars; /* \\strcheck */
int bDictionaryCapitals; /* \\dicdecap */
/*
* information loaded from the dictionary codes file
*/
char * pszDictionaryCodesFile;
CodeTable * pPrefixTable;
CodeTable * pInfixTable;
CodeTable * pSuffixTable;
CodeTable * pRootTable;
CodeTable * pDictTable;
/*
* information loaded from the AMPLE dictionaries
*/
StringList * pDictionaryFiles;
Trie * pDictionary;
AmpleMorpheme * pAmpleMorphemes;
AmpleAlloEnv * pAllomorphEnvs;
unsigned char iInfixLocations; /* AMPLE_PFX, AMPLE_SFX,
and/or AMPLE_ROOT */
/*
* information loaded from the dictionary orthography change file
*/
char * pszDictOrthoChangeFile;
Change * pDictOrthoChanges;
/*
* parsing variables
*/
short bMorphemeLookahead;
short bLookaheadDone;
short bMultiDependency;
} AmpleData;
AmpleData groups all of the information loaded from AMPLE's
multitudinous control files. This simplifies the parameter lists for
many of the AMPLE library functions, while minimizing the need for
global variables.
The fields of the AmpleData data structure are as follows:
bDebugAllomorphConds
TRUE
(nonzero).
bEnableAllomorphIDs
TRUE
(nonzero). This was added to support LinguaLinks.
cBeginComment
bRootGlosses
G in the dictionary code table.
iMaxTrieDepth
2 or
3 is reasonable.
iMaxMorphnameLength
64. Smaller values save memory.
eTraceAnalysis
AMPLE_TRACE_OFF
AMPLE_TRACE_ON
AMPLE_TRACE_SGML
iOutputFlags
WANT_DECOMPOSITION
WANT_CATEGORY
WANT_PROPERTIES
WANT_FEATURES
WANT_UNDERLYING
WANT_ORIGINAL
iDebugLevel
pLogFP
FILE pointer opened for logging information, or is
NULL.
pszSelectiveAnalFile
pSelectiveAnalMorphs
NULL, only those dictionary entries that
match a member of the list are used in analysis.
sTextCtl
pszAnalysisDataFile
pPrefixSuccTests
NULL.
pRootSuccTests
NULL.
pSuffixSuccTests
NULL.
pInfixSuccTests
NULL.
pFinalTests
NULL.
eWriteCategory
AMPLE_NO_CATEGORY
iOutputFlags & WANT_CATEGORY is FALSE.
AMPLE_SUFFIX_CATEGORY
iOutputFlags & WANT_CATEGORY is TRUE.
AMPLE_PREFIX_CATEGORY
iOutputFlags & WANT_CATEGORY is TRUE.
bWriteMorphCats
TRUE, and if eWriteCategory is
not set to AMPLE_NO_CATEGORY.
pCategories
pCategoryClasses
NULL.
cBeginRoot
cEndRoot
pStringClasses
NULL.
pInfixAdhocPairs
NULL.
pPrefixAdhocPairs
NULL.
pRootAdhocPairs
NULL.
pSuffixAdhocPairs
NULL.
pCompoundRootPairs
NULL.
pMorphClasses
NULL.
pProperties
NULL.
pPropertySets
iMaxPrefixCount
iMaxInfixCount
iMaxRootCount
iMaxSuffixCount
pMorphConstraints
iMaxNullCount
pszValidChars
NULL.
bDictionaryCapitals
TRUE by the analysis data file.
pszDictionaryCodesFile
pPrefixTable
CodeTable data structure for the prefix dictionary
file, or is NULL. For
more details, see
section `CodeTable' in OPAC Function Library Reference Manual.
pInfixTable
CodeTable data structure for the infix dictionary
file, or is NULL. For
more details, see
section `CodeTable' in OPAC Function Library Reference Manual.
pSuffixTable
CodeTable data structure for the suffix dictionary
file, or is NULL. For
more details, see
section `CodeTable' in OPAC Function Library Reference Manual.
pRootTable
CodeTable data structure for root dictionary
files, or is NULL. For
more details, see
section `CodeTable' in OPAC Function Library Reference Manual.
pDictTable
CodeTable data structure for unified dictionary
files, or is NULL. For
more details, see
section `CodeTable' in OPAC Function Library Reference Manual.
pDictionaryFiles
pDictionary
pAmpleMorphemes
pDictionary.)
pAllomorphEnvs
iInfixLocations
pszDictOrthoChangeFile
NULL.
pDictOrthoChanges
NULL.
bMorphemeLookahead
bLookaheadDone
bMultiDependency
`ample.h'
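The field list ties eWriteCategory to the WANT_CATEGORY bit of iOutputFlags: AMPLE_NO_CATEGORY goes with the bit being clear, and the other two values go with it being set. The sketch below shows one way to keep two such fields in step. The EX_ constants and the ExOutputSettings type are hypothetical stand-ins; the real values are defined in ample.h and are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the WANT_* output flags. */
enum {
    EX_WANT_DECOMPOSITION = 0x01,
    EX_WANT_CATEGORY      = 0x02,
    EX_WANT_PROPERTIES    = 0x04
};

/* Hypothetical stand-ins for the eWriteCategory values. */
enum { EX_NO_CATEGORY = 0, EX_SUFFIX_CATEGORY, EX_PREFIX_CATEGORY };

typedef struct {
    int iOutputFlags;
    int eWriteCategory;
} ExOutputSettings;

/* Set the category mode and keep the category bit consistent with it. */
void setExWriteCategory(ExOutputSettings * pSettings_io, int eCategory_in)
{
    assert(pSettings_io != NULL);
    pSettings_io->eWriteCategory = eCategory_in;
    if (eCategory_in == EX_NO_CATEGORY)
        pSettings_io->iOutputFlags &= ~EX_WANT_CATEGORY;
    else
        pSettings_io->iOutputFlags |= EX_WANT_CATEGORY;
}

/* Nonzero if the flag bit and the category mode agree. */
int isExOutputConsistent(const ExOutputSettings * pSettings_in)
{
    int bWantCategory;

    assert(pSettings_in != NULL);
    bWantCategory = (pSettings_in->iOutputFlags & EX_WANT_CATEGORY) != 0;
    return bWantCategory == (pSettings_in->eWriteCategory != EX_NO_CATEGORY);
}
```

Routing every change through one setter is a design choice of this sketch; it leaves the other flag bits untouched.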
#include "template.h"
typedef struct ample_hlalist AmpleHeadlistList;
typedef struct {
WordTemplate * pTemplate;
AmpleHeadlistList * pHeadlists;
char * pszRemaining;
unsigned uiAmbigCount;
int bFoundRoot;
} AmpleWord;
AmpleWord groups the information for a single word processed by
AMPLE. This simplifies the parameter lists for many of the AMPLE
library functions, while minimizing the need for global variables.
The fields of the AmpleWord data structure are as follows:
pTemplate
WordTemplate data structure that stores a word and
its analyses. For more details, see
section `WordTemplate' in OPAC Function Library Reference Manual.
pHeadlists
pszRemaining
uiAmbigCount
bFoundRoot
TRUE if a root has been found, and FALSE if only prefixes
and infixes have been found in the analysis process.
`ample.h'
This chapter gives the proper usage information about each of the global variables found in the AMPLE function library. For each global variable that the library provides, this information includes which header files to include in your source to obtain the extern declaration for that variable.
Note that all of the global variables in the AMPLE function library provide information about the current version.
4.1.1 Syntax
#include "ample.h"
extern const char cAmplePatchSep_g;
cAmplePatchSep_g is the character used to separate the
revision level number and the patch level number. It has one of the
following three values:
a
alpha test version
b
beta test version
.
release version (not a test version)
See section 4.4 iAmpleVersion_g.
4.1.4 Source File `version.c'
4.2.1 Syntax
#include "ample.h"
extern const int iAmplePatchlevel_g;
iAmplePatchlevel_g is the current patch level of the
AMPLE function library and program. This is the third level version
number, reflecting bug fixes or internal improvements that should be
functionally invisible to users.
The patch level can go as high as needed. It is not limited to single (or double) digit numbers.
4.2.3 Example
See section 4.4 iAmpleVersion_g.
4.2.4 Source File `version.c'
4.3.1 Syntax
#include "ample.h"
extern const int iAmpleRevision_g;
iAmpleRevision_g is the current revision level of the
AMPLE program and function library. This is the second level version
number, reflecting changes to program behavior that require changes to
the AMPLE Reference Manual.
The revision level can go as high as needed. It is not limited to single (or double) digit numbers.
4.3.3 Example
See section 4.4 iAmpleVersion_g.
4.3.4 Source File `version.c'
4.4.1 Syntax
#include "ample.h"
extern const int iAmpleVersion_g;
iAmpleVersion_g is the current version number of the
AMPLE program and function library. This is the top level version
number, reflecting a major rewrite of the program or major changes that
make it incompatible with earlier versions of the program.
4.4.3 Example
#include <stdio.h>
#include "ample.h"
...
printf("AMPLE functions version %d.%d%c%d (%s), ",
iAmpleVersion_g, iAmpleRevision_g, cAmplePatchSep_g,
iAmplePatchlevel_g, pszAmpleDate_g);
printf("Copyright %s SIL, Inc.\n", pszAmpleYear_g);
#ifdef __DATE__
printf(pszAmpleCompileFormat_g,
pszAmpleCompileDate_g, pszAmpleCompileTime_g);
#else
if (pszAmpleTestVersion_g != NULL)
fputs(pszAmpleTestVersion_g, stdout);
#endif
`version.c'
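The four version variables combine into a single string of the form version.revision, separator, patch level. The sketch below formats that string into a caller-supplied buffer so that it can be reused in a dialog or a log line. The function name formatExVersion is hypothetical, and the values are passed in as arguments so that the sketch does not depend on ample.h.

```c
#include <assert.h>
#include <stdio.h>

/* Format "<version>.<revision><sep><patchlevel>" into pszBuffer_out.
 * Returns the number of characters that the full string needs
 * (as snprintf does), or a negative value on error. */
int formatExVersion(char * pszBuffer_out, unsigned uiSize_in,
                    int iVersion_in, int iRevision_in,
                    char cPatchSep_in, int iPatchlevel_in)
{
    assert(pszBuffer_out != NULL && uiSize_in > 0);
    return snprintf(pszBuffer_out, uiSize_in, "%d.%d%c%d",
                    iVersion_in, iRevision_in,
                    cPatchSep_in, iPatchlevel_in);
}
```

With the library linked in, the call would presumably pass iAmpleVersion_g, iAmpleRevision_g, cAmplePatchSep_g, and iAmplePatchlevel_g in that order. Because the patch and revision levels are not limited to one or two digits, the buffer should not be sized for a fixed width.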
4.5.1 Syntax
#include "ample.h"
#ifdef __DATE__
extern const char * pszAmpleCompileDate_g;
#endif
If the compiler predefines the __DATE__ constant,
pszAmpleCompileDate_g is a string containing the date that
the AMPLE function library and program was compiled.
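The compile date, compile time, and test version variables are declared conditionally on __DATE__, so calling code needs the same conditional. The sketch below shows the pattern with local stand-in variables (the pszEx... names and formatExBuildInfo are hypothetical); the wording of the output strings is also invented and is not the library's format string.

```c
#include <assert.h>
#include <stdio.h>

#ifdef __DATE__
/* Stand-ins for the compile date and time strings. */
static const char * pszExCompileDate_m = __DATE__;
static const char * pszExCompileTime_m = __TIME__;
#else
/* Stand-in for the test version string; NULL means "not a test version". */
static const char * pszExTestVersion_m = NULL;
#endif

/* Write a build description into pszBuffer_out.
 * Returns the length that the description needs, as snprintf does. */
int formatExBuildInfo(char * pszBuffer_out, unsigned uiSize_in)
{
    assert(pszBuffer_out != NULL && uiSize_in > 0);
#ifdef __DATE__
    return snprintf(pszBuffer_out, uiSize_in, "Compiled %s %s",
                    pszExCompileDate_m, pszExCompileTime_m);
#else
    if (pszExTestVersion_m == NULL)
    {
        pszBuffer_out[0] = '\0';    /* nothing to report */
        return 0;
    }
    return snprintf(pszBuffer_out, uiSize_in, "%s test version",
                    pszExTestVersion_m);
#endif
}
```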
4.5.3 Example
See section 4.4 iAmpleVersion_g.
4.5.4 Source File `version.c'
4.6.1 Syntax
#include "ample.h"
#ifdef __DATE__
extern const char * pszAmpleCompileFormat_g;
#endif
If the compiler predefines the __DATE__ constant,
pszAmpleCompileFormat_g is a printf format string
suitable for displaying the date and time that the AMPLE function
library and program was compiled.
4.6.3 Example
See section 4.4 iAmpleVersion_g.
4.6.4 Source File `version.c'
4.7.1 Syntax
#include "ample.h"
#ifdef __DATE__
extern const char * pszAmpleCompileTime_g;
#endif
If the compiler predefines the __DATE__ constant,
pszAmpleCompileTime_g is a string containing the time that
the AMPLE function library and program was compiled.
4.7.3 Example
See section 4.4 iAmpleVersion_g.
4.7.4 Source File `version.c'
4.8.1 Syntax
#include "ample.h"
extern const char * pszAmpleDate_g;
pszAmpleDate_g is a string containing the date that the
AMPLE function library and program was last modified.
4.8.3 Example
See section 4.4 iAmpleVersion_g.
4.8.4 Source File `version.c'
4.9.1 Syntax
#include "ample.h"
#ifndef __DATE__
extern const char * pszAmpleTestVersion_g;
#endif
If the compiler does not predefine the __DATE__ constant, pszAmpleTestVersion_g is a string describing what kind of test version it is (alpha or beta). If it is not a test version, then the string pointer is NULL.
4.9.3 Example
See section 4.4 iAmpleVersion_g.
4.9.4 Source File `version.c'
4.10.1 Syntax
#include "ample.h"
extern const char * pszAmpleYear_g;
pszAmpleYear_g is a string containing the year that the
AMPLE function library and program was last copyrighted.
4.10.3 Example
See section 4.4 iAmpleVersion_g.
4.10.4 Source File `version.c'
This document gives the proper usage information about each of the functions found in the AMPLE function library. The prototypes and type definitions relevant to the use of these functions are all found in the `ample.h' header file.
5.1.1 Syntax
#include "ample.h"
void addAmpleSelectiveAnalItem(const char * pszMorphs_in,
AmpleData * pAmple_io);
addAmpleSelectiveAnalItem adds the morpheme and allomorph
information to the list of morphemes and allomorphs that are used in
selective analysis.
The arguments to addAmpleSelectiveAnalItem are as follows:
pszMorphs_in
NUL-terminated character string that encodes
morphname or allomorph information. (Currently, this is just a list of
morphnames or allomorphs.)
pAmple_io
none
5.1.4 Example
#include "ample.h"
...
AmpleData sAmpleData_g;
...
addAmpleSelectiveAnalItem("morph allomorph", &sAmpleData_g);
...
addAmpleSelectiveAnalItem("allomorph2 morph2", &sAmpleData_g);
...
`setsd.c'
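The example above passes two items in one string. As an illustration only, the sketch below walks such a string item by item without modifying it. It assumes that items are separated by whitespace, which is inferred from the example rather than stated by the library; forEachExSelectiveItem and ExItemCallback are hypothetical names.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Called once per item; the item is NOT NUL-terminated, so its
 * length is passed separately. */
typedef void (*ExItemCallback)(const char * pItem_in, size_t uiLength_in,
                               void * pContext_io);

/* Visit each whitespace-separated item in pszMorphs_in.
 * pfVisit_in may be NULL, in which case the items are only counted.
 * Returns the number of items found. */
unsigned forEachExSelectiveItem(const char * pszMorphs_in,
                                ExItemCallback pfVisit_in,
                                void * pContext_io)
{
    unsigned     uiCount = 0;
    const char * p = pszMorphs_in;
    const char * pStart;

    assert(pszMorphs_in != NULL);
    while (*p != '\0')
    {
        /* skip leading whitespace */
        while (*p != '\0' && isspace((unsigned char)*p))
            ++p;
        if (*p == '\0')
            break;
        /* scan to the end of the item */
        pStart = p;
        while (*p != '\0' && !isspace((unsigned char)*p))
            ++p;
        if (pfVisit_in != NULL)
            pfVisit_in(pStart, (size_t)(p - pStart), pContext_io);
        ++uiCount;
    }
    return uiCount;
}
```

Casting each character to unsigned char before calling isspace avoids undefined behavior for characters outside the ASCII range, which matters for orthographies that use 8-bit characters.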
5.2.1 Syntax
#include "ample.h"
void checkAmpleMorphs(int bCheckMorphs_in,
AmpleData * pAmple_in);
checkAmpleMorphs checks that all referenced morphnames are
defined in the dictionaries. This requires that
initAmpleMorphChecking be called before loading the analysis
data file or any of the dictionaries.
Morphname references are checked in:
An error message is displayed for each unrecognized morphname. Duplicate morphnames in the dictionaries are also detected.
The arguments to checkAmpleMorphs are as follows:
bCheckMorphs_in
TRUE, or prevents it
if FALSE.
pAmple_in
none
5.2.4 Example
#include "ample.h"
...
AmpleData sAmpleData_g;
...
initAmpleMorphChecking(TRUE);
...
checkAmpleMorphs(TRUE, &sAmpleData_g);
`setsd.c'
5.3.1 Syntax
#include "ample.h"
void eraseAmpleWord(AmpleWord * pWord_in);
eraseAmpleWord frees the memory allocated for an AMPLE word data
structure. This includes the WordTemplate data structure and other
fields used internally. The AMPLE word data structure itself is not
freed, so it can be a static or auto variable.
eraseAmpleWord has only one argument:
pWord_in
AmpleWord data structure that contains information
that is no longer needed.
none
5.3.4 Example
#include "ample.h"
...
static AmpleData sAmpleData_m;
...
AmpleWord sThisWord;
WordTemplate * pWord;
FILE * pInputFP;
FILE * pOutputFP;
char * pszOutFilename;
...
initiateAmpleTrace( &sAmpleData_m );
while ((pWord = readTemplateFromText(pInputFP,
&sAmpleData_m.sTextCtl)) != NULL)
{
sThisWord.pTemplate = pWord;
sThisWord.pHeadlists = NULL;
sThisWord.pszRemaining = NULL;
sThisWord.uiAmbigCount = 0;
sThisWord.bFoundRoot = FALSE;
if (sThisWord.pTemplate->paWord != NULL)
performAmpleAnalysis(&sThisWord, NULL, NULL, &sAmpleData_m);
writeTemplate( pOutputFP, pszOutFilename,
sThisWord.pTemplate, &sAmpleData_m.sTextCtl);
eraseAmpleWord( &sThisWord );
}
terminateAmpleTrace( &sAmpleData_m );
...
`anal.c'
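eraseAmpleWord frees what the structure points to but not the structure itself, which is why the example can keep sThisWord on the stack and reuse it for every word. The sketch below shows the same ownership pattern on a much smaller, hypothetical structure (ExWord, setExWordForm, and eraseExWord are invented names, not library code).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *   pszForm;        /* owned, heap-allocated copy of the word */
    char *   pszRemaining;   /* owned, may be NULL */
    unsigned uiAmbigCount;
} ExWord;

/* Store a private copy of pszForm_in.  Returns 0 on success, 1 on failure. */
int setExWordForm(ExWord * pWord_io, const char * pszForm_in)
{
    char * pszCopy;

    assert(pWord_io != NULL && pszForm_in != NULL);
    pszCopy = malloc(strlen(pszForm_in) + 1);
    if (pszCopy == NULL)
        return 1;
    strcpy(pszCopy, pszForm_in);
    free(pWord_io->pszForm);         /* free(NULL) is a no-op */
    pWord_io->pszForm = pszCopy;
    return 0;
}

/* Free everything the structure owns, but not the structure itself,
 * and leave it in a state that is safe to reuse or erase again. */
void eraseExWord(ExWord * pWord_io)
{
    if (pWord_io == NULL)
        return;
    free(pWord_io->pszForm);
    free(pWord_io->pszRemaining);
    pWord_io->pszForm      = NULL;
    pWord_io->pszRemaining = NULL;
    pWord_io->uiAmbigCount = 0;
}
```

Resetting the pointers to NULL after freeing them is what makes a second erase harmless in this sketch; whether eraseAmpleWord gives the same guarantee is not stated in this manual.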
5.4.1 Syntax
#include "ample.h"
char * findAmplePropertyName(unsigned uiPropNumber_in,
const AmpleData * pAmple_in);
findAmplePropertyName searches for the name of a property given
by number.
The arguments to findAmplePropertyName are as follows:
uiPropNumber_in
pAmple_in
a pointer to the property name, or NULL if not found
5.4.4 Example
#include <stdio.h>
#include "ample.h"
...
static AmpleData sAmpleData_m;
...
char * pszProperty;
unsigned uiProperty;
...
for ( uiProperty = 1 ; uiProperty < 256 ; ++uiProperty )
{
pszProperty = findAmplePropertyName(uiProperty, &sAmpleData_m);
if (pszProperty != NULL)
printf("Property %3u is \"%s\"\n", uiProperty, pszProperty);
}
`proper.c'
5.5.1 Syntax
#include "ample.h"
unsigned char findAmplePropertyNumber(const char * pszName_in,
const AmpleData * pAmple_in);
findAmplePropertyNumber searches for a property given by name.
The arguments to findAmplePropertyNumber are as follows:
pszName_in
pAmple_in
the integer value of the property, or zero if not found
5.5.4 Example
#include <stdio.h>
#include "ample.h"
...
static AmpleData sAmpleData_m;
...
unsigned uiProperty;
char * pszProperty;
...
uiProperty = findAmplePropertyNumber(pszProperty, &sAmpleData_m);
if (uiProperty == 0)
printf("%s is not a valid property name.\n", pszProperty);
else
printf("%s is property number %u.\n", pszProperty, uiProperty);
`proper.c'
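findAmplePropertyName and findAmplePropertyNumber are inverse lookups with different "not found" values: NULL for a name, zero for a number. The sketch below reproduces that contract over a small fixed table. The table contents, the ExProperty type, and the findEx... function names are hypothetical; the real functions search the properties stored in AmpleData.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    unsigned char uiNumber;   /* 1..255; zero is reserved for "not found" */
    const char *  pszName;
} ExProperty;

/* Invented sample data. */
static const ExProperty aExProperties_m[] = {
    { 1, "bound"  },
    { 2, "elided" },
    { 3, "nasal"  }
};
static const size_t uiExPropertyCount_m =
    sizeof aExProperties_m / sizeof aExProperties_m[0];

/* Return the name for a property number, or NULL if not found. */
const char * findExPropertyName(unsigned uiPropNumber_in)
{
    size_t i;

    for (i = 0; i < uiExPropertyCount_m; ++i)
    {
        if (aExProperties_m[i].uiNumber == uiPropNumber_in)
            return aExProperties_m[i].pszName;
    }
    return NULL;
}

/* Return the number for a property name, or zero if not found. */
unsigned char findExPropertyNumber(const char * pszName_in)
{
    size_t i;

    assert(pszName_in != NULL);
    for (i = 0; i < uiExPropertyCount_m; ++i)
    {
        if (strcmp(aExProperties_m[i].pszName, pszName_in) == 0)
            return aExProperties_m[i].uiNumber;
    }
    return 0;
}
```

Because zero means "not found", a caller should test the result before storing it in a NUL-terminated property array.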
5.6.1 Syntax
#include "ample.h"
void freeAmpleDictionary(AmpleData * pAmple_io);
freeAmpleDictionary frees the memory allocated to store an AMPLE
dictionary. This is called by resetAmpleData, which is the
safest way to use it since the data from the dictionary files are
somewhat intermingled with data from other files.
freeAmpleDictionary has only one argument:
pAmple_io
none
5.6.4 Example
#include "ample.h"
AmpleData sAmpleData_g;
char szCodesFilename_g[100];
char szDictFilename_g[100];
...
loadAmpleDictCodeTables(szCodesFilename_g, &sAmpleData_g, TRUE);
...
loadAmpleDictionary(szDictFilename_g, AMPLE_UNIFIED, &sAmpleData_g);
...
freeAmpleDictionary( &sAmpleData_g );
`setsd.c'
5.7.1 Syntax
#include "ample.h"
void freeAmpleSelectiveAnalInfo(AmpleData * pAmple_io);
freeAmpleSelectiveAnalInfo frees the memory allocated to store
the selective analysis information. This also marks all of the current
dictionary entries (in memory) to enable them to be used in future
analysis efforts.
freeAmpleSelectiveAnalInfo has only one argument:
pAmple_io
none
5.7.4 Example
#include "ample.h"
AmpleData sAmpleData_g;
char szSelectiveAnalFile_g[100];
...
loadAmpleSelectiveAnalFile(szSelectiveAnalFile_g, &sAmpleData_g);
...
freeAmpleSelectiveAnalInfo( &sAmpleData_g );
...
`select.c'
5.8.1 Syntax
#include "ample.h"
int hasAmpleProperty(const unsigned char * pProperties_in,
unsigned uiPropNumber_in);
hasAmpleProperty checks whether pProperties_in contains a
specific property value. pProperties_in is normally the
combined set of allomorph and morpheme properties from a dictionary
entry. This is the same as
(strchr(pProperties_in, uiPropNumber_in) != NULL), except
for using unsigned characters.
The arguments to hasAmpleProperty are as follows:
pProperties_in
NUL-terminated array of AMPLE (allomorph or
morpheme) property numbers.
uiPropNumber_in
TRUE if the property set contains the property value,
otherwise FALSE
5.8.4 Example
#include <stdio.h>
#include "ample.h"
#include "ampledef.h" /* example uses internal data structures */
...
static AmpleData sAmpleData_m;
...
AmpleAllomorph * pAllomorph;
unsigned uiProperty;
char * pszPropName;
...
pszPropName = findAmplePropertyName(uiProperty, &sAmpleData_m);
if (hasAmpleProperty(pAllomorph->pProperties, uiProperty))
{
printf("allomorph %s of %s has property %s.\n",
pAllomorph->pszAllomorph,
pAllomorph->pMorpheme->pszMorphName,
pszPropName ? pszPropName : "(invalid property)");
}
else
{
printf("allomorph %s of %s does not have property %s.\n",
pAllomorph->pszAllomorph,
pAllomorph->pMorpheme->pszMorphName,
pszPropName ? pszPropName : "(invalid property)");
}
`proper.c'
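The description compares hasAmpleProperty to a strchr search over unsigned characters. The sketch below spells out such a search. It is an assumption-laden stand-in (hasExProperty is a hypothetical name), not the library's implementation. One deliberate difference from strchr: strchr treats the terminating NUL as part of the string, so searching for zero would succeed, whereas this sketch reports that property zero is never present.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Return 1 if the NUL-terminated array pProperties_in contains
 * uiPropNumber_in, otherwise 0.  A NULL array is an empty set. */
int hasExProperty(const unsigned char * pProperties_in,
                  unsigned uiPropNumber_in)
{
    const unsigned char * p;

    /* property numbers are stored in single unsigned bytes */
    assert(uiPropNumber_in <= UCHAR_MAX);
    if (pProperties_in == NULL || uiPropNumber_in == 0)
        return 0;
    for (p = pProperties_in; *p != 0; ++p)
    {
        if (*p == uiPropNumber_in)
            return 1;
    }
    return 0;
}
```

Comparing unsigned char values directly keeps property numbers above 127 from being misread as negative on platforms where plain char is signed.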
5.9.1 Syntax
#include "ample.h"
void initAmpleMorphChecking(int bCheckMorphs_in);
initAmpleMorphChecking initializes the internal arrays for
morphname checking. If bCheckMorphs_in is FALSE, then no
memory is allocated and no checking can be performed.
initAmpleMorphChecking has only one argument:
bCheckMorphs_in
TRUE, or prevents it
if FALSE.
none
5.9.4 Example
#include "ample.h"
...
AmpleData sAmpleData_g;
...
initAmpleMorphChecking(TRUE);
...
checkAmpleMorphs(TRUE, &sAmpleData_g);
`setsd.c'
5.10.1 Syntax
#include "ample.h"
void initiateAmpleTrace(const AmpleData * pAmple_in);
initiateAmpleTrace writes the AMPLE trace header to the log
file. If pAmple_in->pLogFP is NULL, then nothing happens.
If tracing output is wanted, then this function should be called before
any words are analyzed, and after all control files and dictionaries
have been loaded.
initiateAmpleTrace has only one argument:
pAmple_in
none
5.10.4 Example
#include "ample.h"
...
static AmpleData sAmpleData_m;
...
AmpleWord sThisWord;
WordTemplate * pWord;
FILE * pInputFP;
FILE * pOutputFP;
char * pszOutFilename;
...
initiateAmpleTrace( &sAmpleData_m );
while ((pWord = readTemplateFromText(pInputFP,
&sAmpleData_m.sTextCtl)) != NULL)
{
sThisWord.pTemplate = pWord;
sThisWord.pHeadlists = NULL;
sThisWord.pszRemaining = NULL;
sThisWord.uiAmbigCount = 0;
sThisWord.bFoundRoot = FALSE;
if (sThisWord.pTemplate->paWord != NULL)
performAmpleAnalysis(&sThisWord, NULL, NULL, &sAmpleData_m);
writeTemplate( pOutputFP, pszOutFilename,
sThisWord.pTemplate, &sAmpleData_m.sTextCtl);
eraseAmpleWord( &sThisWord );
}
terminateAmpleTrace( &sAmpleData_m );
...
`anal.c'
5.11.1 Syntax
#include "ample.h"
int isAmpleAllomorphProperty(unsigned uiPropNumber_in,
const AmpleData * pAmple_in);
isAmpleAllomorphProperty tests whether or not the property
(given by number) is an allomorph property.
The arguments to isAmpleAllomorphProperty are as follows:
uiPropNumber_in
pAmple_in
TRUE if it is an allomorph property, otherwise FALSE
5.11.4 Example
#include <stdio.h>
#include "ample.h"
#include "ampledef.h" /* example uses internal data structures */
...
static AmpleData sAmpleData_m;
...
AmpleAllomorph * pAllomorph;
unsigned char * pProp;
...
printf("Allomorph %s of %s has these properties:",
pAllomorph->pszAllomorph,
pAllomorph->pMorpheme->pszMorphName);
for ( pProp = pAllomorph->pProperties ; pProp && *pProp ; ++pProp )
{
if (isAmpleAllomorphProperty(*pProp, &sAmpleData_m))
printf(" %s", findAmplePropertyName(*pProp, &sAmpleData_m));
}
printf("\n");
`proper.c'
5.12.1 Syntax
#include "ample.h"
int isAmpleMorphemeProperty(unsigned uiPropNumber_in,
const AmpleData * pAmple_in);
isAmpleMorphemeProperty tests whether or not the property
(given by number) is a morpheme property.
The arguments to isAmpleMorphemeProperty are as follows:
uiPropNumber_in
pAmple_in
TRUE if it is a morpheme property, otherwise FALSE
5.12.4 Example
#include <stdio.h>
#include "ample.h"
#include "ampledef.h" /* example uses internal data structures */
...
static AmpleData sAmpleData_m;
...
AmpleMorpheme * pMorpheme;
unsigned char * pProp;
...
printf("Morpheme %s has these properties:", pMorpheme->pszMorphName);
for ( pProp = pMorpheme->pAllomorphs->pProperties ;
pProp && *pProp ;
++pProp )
{
if (isAmpleMorphemeProperty(*pProp, &sAmpleData_m))
printf(" %s", findAmplePropertyName(*pProp, &sAmpleData_m));
}
printf("\n");
`proper.c'
5.13.1 Syntax
#include "ample.h"
int loadAmpleControlFile(const char * pszInputFile_in,
AmpleData * pAmple_io);
loadAmpleControlFile reads the main AMPLE control file, commonly
called the "analysis data file".
The arguments to loadAmpleControlFile are as follows:
pszInputFile_in
pAmple_io
zero if the file is successfully read into memory, otherwise nonzero
5.13.4 Example
#include "ample.h"
AmpleData sAmpleData_g;
char szControlFilename_g[100];
...
if (loadAmpleControlFile(szControlFilename_g,
&sAmpleData_g) != 0)
{
/* error message? */
}
`analda.c'
5.14.1 Syntax
#include "ample.h"
int loadAmpleDictCodeTables(const char * pszCodesFile_in,
AmpleData * pAmple_io,
int bUnified_in);
loadAmpleDictCodeTables reads an AMPLE dictionary code change
tables file.
The arguments to loadAmpleDictCodeTables are as follows:
pszCodesFile_in
pAmple_io
bUnified_in
TRUE, or that the dictionary is split into separate
prefix, infix, suffix, and root dictionary files if FALSE.
zero if the file is successfully read into memory, otherwise nonzero
5.14.4 Example
#include "ample.h"
AmpleData sAmpleData_g;
char szCodesFilename_g[100];
...
if (loadAmpleDictCodeTables(szCodesFilename_g,
&sAmpleData_g, FALSE) != 0)
{
/* error message? */
}
`loadtb.c'
5.15.1 Syntax
#include "ample.h"
int loadAmpleDictionary(const char * pszDictFile_in,
int eDictType_in,
AmpleData * pAmple_io);
loadAmpleDictionary reads an AMPLE dictionary file.
The arguments to loadAmpleDictionary are as follows:
pszDictFile_in
eDictType_in
AMPLE_PFX
AMPLE_IFX
AMPLE_SFX
AMPLE_ROOT
AMPLE_UNIFIED
AMPLE_PFX | AMPLE_IFX | AMPLE_SFX | AMPLE_ROOT.)
pAmple_io
zero if the file is successfully read into memory, otherwise nonzero
5.15.4 Example
#include "ample.h"
AmpleData sAmpleData_g;
char szCodesFilename_g[100];
char szDictFilename_g[100];
...
loadAmpleDictCodeTables(szCodesFilename_g, &sAmpleData_g, FALSE);
...
if (loadAmpleDictionary(szDictFilename_g,
AMPLE_PFX, &sAmpleData_g) != 0)
{
/* error message? */
}
`setsd.c'
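The dictionary type values are bit flags, and AMPLE_UNIFIED is described as the combination AMPLE_PFX | AMPLE_IFX | AMPLE_SFX | AMPLE_ROOT. The sketch below shows how such flags can be validated and tested. The EX_ constants and their numeric values are hypothetical stand-ins for the definitions in ample.h, as are the two function names.

```c
#include <assert.h>

/* Hypothetical stand-ins; only the OR relationship is taken from the text. */
enum {
    EX_PFX     = 0x01,
    EX_IFX     = 0x02,
    EX_SFX     = 0x04,
    EX_ROOT    = 0x08,
    EX_UNIFIED = EX_PFX | EX_IFX | EX_SFX | EX_ROOT
};

/* Nonzero if eType_in is a nonempty combination of the four type bits. */
int isExDictType(int eType_in)
{
    return eType_in != 0 && (eType_in & ~EX_UNIFIED) == 0;
}

/* Nonzero if a dictionary of type eDictType_in may hold morphemes
 * of type eMorphType_in. */
int mayExDictContain(int eDictType_in, int eMorphType_in)
{
    assert(isExDictType(eDictType_in));
    return (eDictType_in & eMorphType_in) != 0;
}
```

Under this encoding a unified dictionary accepts every morpheme type, while a prefix dictionary accepts only prefixes.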
5.16.1 Syntax
#include "ample.h"
int loadAmpleDictOrthoChanges(const char * pszDictOrthoFile_in,
AmpleData * pAmple_io);
loadAmpleDictOrthoChanges loads an ordered list of AMPLE
dictionary orthography changes from a file. These changes are applied
to the allomorphs loaded from the file before storing them in memory.
The arguments to loadAmpleDictOrthoChanges are as follows:
pszDictOrthoFile_in
pAmple_io
zero if the file is successfully read into memory, otherwise nonzero
5.16.4 Example
#include "ample.h"
AmpleData sAmpleData_g;
char szDictOrthoFilename_g[100];
...
if (szDictOrthoFilename_g[0])
{
if (loadAmpleDictOrthoChanges(szDictOrthoFilename_g,
&sAmpleData_g) != 0)
{
/* error message? */
}
}
else
sAmpleData_g.pDictOrthoChanges = (Change *)NULL;
`loadcc.c'
5.17.1 Syntax
#include "ample.h"
int loadAmpleSelectiveAnalFile(const char * pszFilename_in,
AmpleData * pAmple_io);
loadAmpleSelectiveAnalFile loads a set of morphnames and
allomorphs from a file. If loadAmpleSelectiveAnalFile is called
before loadAmpleDictionary, then only the selected morphemes and
allomorphs are stored in memory. If loadAmpleSelectiveAnalFile
is called after loadAmpleDictionary, then the current dictionary
entries are marked for selective analysis.
The arguments to loadAmpleSelectiveAnalFile are as follows:
pszFilename_in
pAmple_io
zero if the file is successfully read into memory, otherwise nonzero
5.17.4 Example
#include "ample.h"
AmpleData sAmpleData_g;
char szSelectiveAnalFilename_g[100];
...
if (szSelectiveAnalFilename_g[0])
{
if (loadAmpleSelectiveAnalFile(szSelectiveAnalFilename_g,
&sAmpleData_g) != 0)
{
/* error message? */
}
}
else
sAmpleData_g.pSelectiveAnalMorphs = NULL;
`select.c'
5.18.1 Syntax
#include "ample.h"
unsigned performAmpleAnalysis(AmpleWord * pThisWord_io,
AmpleWord * pPreviousWord_in,
AmpleWord * pNextWord_in,
AmpleData * pAmple_in);
performAmpleAnalysis tries to analyze the wordform(s) pointed to
by pThisWord_io->pTemplate->paWord[0..n]. (Please forgive the
mixture of C and pseudoPascal in the last sentence.) The wordforms are
usually set by readTemplateFromText or an equivalent function.
(see section `readTemplateFromText' in OPAC Function Library Reference Manual.)
There is usually only one wordform, stored as
pThisWord_io->pTemplate->paWord[0]. Since the process of
decapitalization may be ambiguous, a NULL-terminated array of character
strings is used for the wordform to parse instead of simply using a
single character string.
The resulting analyses are stored in
pThisWord_io->pTemplate->pAnalyses. The original morpheme
dictionary information for each analysis is stored in
pThisWord_io->pHeadlists. The number of analyses is stored in
pThisWord_io->uiAmbigCount as well as being returned as the
function value.
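Because the wordforms are held in a NULL-terminated array of strings rather than in a single string, code that inspects them needs a loop that stops at the NULL entry. A minimal sketch follows; countExWordforms and getExWordform are hypothetical helpers, not library functions.

```c
#include <assert.h>
#include <stddef.h>

/* Count the strings in a NULL-terminated array.
 * A NULL array counts as empty. */
unsigned countExWordforms(char * const * paWord_in)
{
    unsigned uiCount = 0;

    if (paWord_in == NULL)
        return 0;
    while (paWord_in[uiCount] != NULL)
        ++uiCount;
    return uiCount;
}

/* Return the wordform at uiIndex_in; the index must be in range. */
const char * getExWordform(char * const * paWord_in, unsigned uiIndex_in)
{
    assert(uiIndex_in < countExWordforms(paWord_in));
    return paWord_in[uiIndex_in];
}
```

For a capitalized input word, the array might hold the original form and a decapitalized form, in which case the count is two.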
The arguments to performAmpleAnalysis are as follows:
pThisWord_io
pPreviousWord_in
pNextWord_in
pAmple_in
the number of analyses produced (zero if analysis failed)
5.18.4 Example
#include "ample.h"
...
static AmpleData sAmpleData_m;
...
AmpleWord sThisWord;
WordTemplate * pWord;
FILE * pInputFP;
FILE * pOutputFP;
char * pszOutFilename;
...
initiateAmpleTrace( &sAmpleData_m );
while ((pWord = readTemplateFromText(pInputFP,
&sAmpleData_m.sTextCtl)) != NULL)
{
sThisWord.pTemplate = pWord;
sThisWord.pHeadlists = NULL;
sThisWord.pszRemaining = NULL;
sThisWord.uiAmbigCount = 0;
sThisWord.bFoundRoot = FALSE;
if (sThisWord.pTemplate->paWord != NULL)
performAmpleAnalysis(&sThisWord, NULL, NULL, &sAmpleData_m);
writeTemplate( pOutputFP, pszOutFilename,
sThisWord.pTemplate, &sAmpleData_m.sTextCtl);
eraseAmpleWord( &sThisWord );
}
terminateAmpleTrace( &sAmpleData_m );
...
`anal.c'
5.19.1 Syntax
#include "ample.h"
int removeFromAmpleDictionary(char * pszMorphName_in,
unsigned eType_in,
AmpleData * pAmple_io);
removeFromAmpleDictionary frees the memory allocated to store a
morpheme in the AMPLE dictionary. This removes both the morpheme data
structure and all associated allomorphs from the dictionary. The
morpheme is identified by a combination of its morphname and its
morpheme type value.
The arguments to removeFromAmpleDictionary are as follows:
pszMorphName_in
eType_in
AMPLE_PFX
AMPLE_IFX
AMPLE_SFX
AMPLE_ROOT
pAmple_io
0 if successful, 1 if an error occurs.
5.19.4 Example
#include "ample.h"
AmpleData sAmpleData_g;
char szCodesFilename_g[100];
char szDictFilename_g[100];
...
loadAmpleDictCodeTables(szCodesFilename_g, &sAmpleData_g, TRUE);
...
loadAmpleDictionary(szDictFilename_g, AMPLE_UNIFIED, &sAmpleData_g);
...
removeFromAmpleDictionary( "NOT", AMPLE_PFX, &sAmpleData_g );
`setsd.c'
5.20.1 Syntax
#include "ample.h"
void reportAmpleDictCodeTable(int eType_in,
AmpleData * pAmple_in);
reportAmpleDictCodeTable displays the size and type of the given
AMPLE dictionary code table.
The arguments to reportAmpleDictCodeTable are as follows:
eType_in
AMPLE_PFX
AMPLE_IFX
AMPLE_SFX
AMPLE_ROOT
AMPLE_UNIFIED
AMPLE_PFX | AMPLE_IFX | AMPLE_SFX | AMPLE_ROOT.)
pAmple_in
none
5.20.4 Example
#include "ample.h"
AmpleData sAmpleData_g;
char szCodesFilename_g[100];
char szDictFilename_g[100];
...
loadAmpleDictCodeTables(szCodesFilename_g, &sAmpleData_g, FALSE);
...
reportAmpleDictCodeTable(AMPLE_PFX, &sAmpleData_g);
loadAmpleDictionary(szDictFilename_g, AMPLE_PFX, &sAmpleData_g);
`loadtb.c'
5.21.1 Syntax
#include "ample.h" void resetAmpleData(AmpleData * pAmple_io);
resetAmpleData frees the memory allocated for the AMPLE control
data and reestablishes the default values. The data structure pointed to
by pAmple_io is not itself freed, but any memory pointed to
by one of its elements is freed and that pointer is set to NULL.
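The reset pattern described above (free what the members point to, set those pointers to NULL, restore defaults, and leave the structure itself allocated) can be sketched as follows. DemoData, resetDemoData, and DEMO_DEFAULT_MAX_COUNT are hypothetical stand-ins invented for illustration; the real AmpleData structure has different members, and this is not the library's implementation.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_DEFAULT_MAX_COUNT 5   /* hypothetical default value */

/* Hypothetical stand-in for a control structure; not the real AmpleData. */
typedef struct {
    char *pszLabel;     /* heap-allocated member */
    int  *paiCounts;    /* heap-allocated member */
    int   iMaxCount;    /* plain setting with a default value */
} DemoData;

/* Free the memory owned by the members, set those pointers to NULL,
 * and restore the default settings.  The DemoData object itself is
 * not freed, so the caller can reuse it. */
void resetDemoData(DemoData *pData_io)
{
    assert(pData_io != NULL);
    free(pData_io->pszLabel);      /* free(NULL) is a harmless no-op */
    pData_io->pszLabel = NULL;
    free(pData_io->paiCounts);
    pData_io->paiCounts = NULL;
    pData_io->iMaxCount = DEMO_DEFAULT_MAX_COUNT;
}
```

Setting each pointer to NULL after freeing it makes a second reset of the same structure safe, which is one reason this pattern is convenient for data that is loaded and reloaded repeatedly.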
resetAmpleData has only one argument:
pAmple_io
none
5.21.4 Example
#include "ample.h"
...
typedef char FileNameBuffer[200];
AmpleData sAmpleData_m;
FileNameBuffer szAmpleControlFile_m;
FileNameBuffer szDictCodeFile_m;
FileNameBuffer szDictOrthoChgFile_m;
FileNameBuffer szPrefixFile_m;
FileNameBuffer szInfixFile_m;
FileNameBuffer szSuffixFile_m;
FileNameBuffer aszRootFiles_m[20];
FileNameBuffer szTextCtlFile_m;
int iRootFilesCount_m;
...
int i;
...
if (loadAmpleControlFile(szAmpleControlFile_m, &sAmpleData_m) != 0)
exit(1);
if (loadAmpleDictCodeTables(szDictCodeFile_m,
&sAmpleData_m, FALSE) != 0)
exit(1);
if ( szDictOrthoChgFile_m[0] &&
(loadAmpleDictOrthoChanges(szDictOrthoChgFile_m,
&sAmpleData_m) != 0))
exit(1);
if ( sAmpleData_m.iMaxPrefixCount &&
(loadAmpleDictionary(szPrefixFile_m,
AMPLE_PFX, &sAmpleData_m) != 0))
exit(1);
if ( sAmpleData_m.iMaxInfixCount &&
(loadAmpleDictionary(szInfixFile_m,
AMPLE_IFX, &sAmpleData_m) != 0))
exit(1);
if ( sAmpleData_m.iMaxSuffixCount &&
(loadAmpleDictionary(szSuffixFile_m,
AMPLE_SFX, &sAmpleData_m) != 0))
exit(1);
for ( i = 0 ; i < iRootFilesCount_m ; ++i )
{
if (loadAmpleDictionary(aszRootFiles_m[i],
AMPLE_ROOT, &sAmpleData_m) != 0)
exit(1);
}
if (loadIntxCtlFile(szTextCtlFile_m, sAmpleData_m.cBeginComment,
&sAmpleData_m.sTextCtl) != 0)
exit(1);
...
/* process the data */
...
resetAmpleData( &sAmpleData_m );
`analda.c'
5.22.1 Syntax
#include "ample.h"
void terminateAmpleTrace(const AmpleData * pAmple_in);
terminateAmpleTrace writes the AMPLE trace end marker to the log
file. If pAmple_in->pLogFP is NULL, then nothing happens.
If tracing output is wanted, then this function should be called after
all of the words have been analyzed.
terminateAmpleTrace has only one argument:
pAmple_in
none
5.22.4 Example
#include "ample.h"
...
static AmpleData sAmpleData_m;
...
AmpleWord sThisWord;
WordTemplate * pWord;
FILE * pInputFP;
FILE * pOutputFP;
char * pszOutFilename;
...
initiateAmpleTrace( &sAmpleData_m );
while ((pWord = readTemplateFromText(pInputFP,
&sAmpleData_m.sTextCtl)) != NULL)
{
sThisWord.pTemplate = pWord;
sThisWord.pHeadlists = NULL;
sThisWord.pszRemaining = NULL;
sThisWord.uiAmbigCount = 0;
sThisWord.bFoundRoot = FALSE;
if (sThisWord.pTemplate->paWord != NULL)
performAmpleAnalysis(&sThisWord, NULL, NULL, &sAmpleData_m);
writeTemplate( pOutputFP, pszOutFilename,
sThisWord.pTemplate, &sAmpleData_m.sTextCtl);
eraseAmpleWord( &sThisWord );
}
terminateAmpleTrace( &sAmpleData_m );
...
`anal.c'
5.23.1 Syntax
#include "ample.h"
int updateAmpleDictEntry(const char * pszEntry_in,
AmpleData * pAmple_io);
updateAmpleDictEntry adds this entry to the internal AMPLE
dictionary, first deleting any existing entry with the same morphname
and type. If the dictionary codes for a unified dictionary do not
exist, the entry is assumed to use AmpleLinks Canonical Format
standard format markers. These markers would look like this in an
AMPLE dictionary codes table file:
\unified \lx
\ch "\\a" "A" | allomorph
\ch "\\c" "C" | category
\ch "\\e" "E" | "elsewhere" allomorph
\ch "\\fd" "F" | feature descriptors
\ch "\\g" "G" | gloss (used in analysis output)
\ch "\\loc" "L" | infix location
\ch "\\mn" "M" | morphname
\ch "\\o" "O" | order class
\ch "\\mp" "P" | morpheme properties
\ch "\\entryType" "T" | dictionary entry type
\ch "\\uf" "U" | underlying form
\ch "\\mcc" "Z" | morpheme co-occurrence constraint
\ch "\\no" "!" | don't load
\ch "\\lx" "#" | lexicon entry number (not stored)
The arguments to updateAmpleDictEntry are as follows:
pszEntry_in
NUL-terminated
standard format record character string.
pAmple_io
0 if an error occurs, 1 if an existing morpheme is replaced, or 2 if this is a new entry.
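Note that, unlike most functions in this chapter, a return value of 0 signals failure here. A small helper such as the following can make the three documented outcomes explicit at the call site. describeUpdateStatus is a hypothetical name used only for this sketch and is not part of the AMPLE function library; the message strings are likewise invented.

```c
/* Map the documented return values of updateAmpleDictEntry to a short
 * description.  Any other value is reported as unexpected. */
const char *describeUpdateStatus(int iStatus_in)
{
    switch (iStatus_in) {
    case 0:  return "error: the entry was not added";
    case 1:  return "an existing morpheme was replaced";
    case 2:  return "a new entry was added";
    default: return "unexpected status value";
    }
}
```

A caller could then write, for example, printf("%s\n", describeUpdateStatus(iStatus)); after calling updateAmpleDictEntry.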
5.23.4 Example
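#include <stdio.h>
#include <string.h>
#include "ample.h"
...
static AmpleData sAmpleData_m;
static char szEntry_m[512];
...
int iStatus;
...
/* the count given to strncat limits the characters appended, not the
total buffer size, so pass the space remaining less one for the NUL */
strncpy(szEntry_m, "\\lx update\n", sizeof(szEntry_m) - 1);
szEntry_m[sizeof(szEntry_m) - 1] = '\0';
strncat(szEntry_m, "\\entryType root\n",
sizeof(szEntry_m) - strlen(szEntry_m) - 1);
strncat(szEntry_m, "\\mn morph\n",
sizeof(szEntry_m) - strlen(szEntry_m) - 1);
strncat(szEntry_m, "\\a morph\n",
sizeof(szEntry_m) - strlen(szEntry_m) - 1);
strncat(szEntry_m, "\\c N\n",
sizeof(szEntry_m) - strlen(szEntry_m) - 1);
strncat(szEntry_m, "\\uf morph\n",
sizeof(szEntry_m) - strlen(szEntry_m) - 1);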
iStatus = updateAmpleDictEntry(szEntry_m, &sAmpleData_m);
if (iStatus == 0)
printf("Error while trying to update this entry:\n%s\n",
szEntry_m);
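When a record is assembled from many fields, a bounded append helper keeps the buffer arithmetic in one place. The following is a minimal sketch using the standard C99 snprintf function. appendDemoField is a hypothetical helper invented for illustration, not part of the AMPLE function library, and it assumes that the buffer already holds a NUL-terminated string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Append one "\marker value" line to a NUL-terminated record buffer.
 * Returns 1 on success, or 0 if the field does not fit, in which case
 * the buffer contents are left as they were before the call. */
int appendDemoField(char *pszRecord_io, size_t uiSize_in,
                    const char *pszMarker_in, const char *pszValue_in)
{
    size_t uiUsed;
    int    iNeeded;

    assert(pszRecord_io != NULL && pszMarker_in != NULL &&
           pszValue_in != NULL);
    uiUsed = strlen(pszRecord_io);
    if (uiUsed >= uiSize_in)
        return 0;
    iNeeded = snprintf(pszRecord_io + uiUsed, uiSize_in - uiUsed,
                       "\\%s %s\n", pszMarker_in, pszValue_in);
    if (iNeeded < 0 || (size_t)iNeeded >= uiSize_in - uiUsed) {
        pszRecord_io[uiUsed] = '\0';    /* undo the truncated write */
        return 0;
    }
    return 1;
}
```

Starting from an empty buffer, successive calls such as appendDemoField(szEntry, sizeof(szEntry), "lx", "update") and appendDemoField(szEntry, sizeof(szEntry), "mn", "morph") build up the same kind of record as the example above, and a failed call reports the overflow instead of silently truncating the entry.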
`setsd.c'
5.24.1 Syntax
#include "ample.h"
void writeAmpleDictionary(const char * pszFilename_in,
AmpleData * pAmple_in);
writeAmpleDictionary writes the dictionary stored in memory to
the given file using the AmpleLinks Canonical Format standard
format markers. These markers would look like this in an AMPLE
dictionary codes table file:
\unified \lx
\ch "\\a" "A" | allomorph
\ch "\\c" "C" | category
\ch "\\e" "E" | "elsewhere" allomorph
\ch "\\fd" "F" | feature descriptors
\ch "\\g" "G" | gloss (used in analysis output)
\ch "\\loc" "L" | infix location
\ch "\\mn" "M" | morphname
\ch "\\o" "O" | order class
\ch "\\mp" "P" | morpheme properties
\ch "\\entryType" "T" | dictionary entry type
\ch "\\uf" "U" | underlying form
\ch "\\mcc" "Z" | morpheme co-occurrence constraint
\ch "\\no" "!" | don't load
\ch "\\lx" "#" | lexicon entry number
The arguments to writeAmpleDictionary are as follows:
pszFilename_in
pAmple_in
none
5.24.4 Example
#include "ample.h"
...
static AmpleData sAmpleData_m;
...
if (sAmpleData_m.iDebugLevel != 0)
writeAmpleDictionary("test.dic", &sAmpleData_m);
`putsd.c'
5.25.1 Syntax
#include "ample.h"
void writeAmpleTests(const char * pszType_in,
AmpleData * pAmple_in);
writeAmpleTests writes a list of tests to the log file in the
order in which they will be applied. User-defined tests are expanded to
show the internal parse trees; built-in tests are given by name.
The arguments to writeAmpleTests are as follows:
pszType_in
"Prefix"
"Infix"
"Root"
"Suffix"
"Final"
pAmple_in
none
5.25.4 Example
#include <stdio.h>
#include "ample.h"
...
static AmpleData sAmpleData_m;
static int bVerify_m;
...
char * pszFilename;
...
if (loadAmpleControlFile(pszFilename, &sAmpleData_m) != 0)
{
...
exit(1);
}
if (bVerify_m)
{
writeAmpleTests("Prefix", &sAmpleData_m);
writeAmpleTests("Infix", &sAmpleData_m);
writeAmpleTests("Root", &sAmpleData_m);
writeAmpleTests("Suffix", &sAmpleData_m);
writeAmpleTests("Final", &sAmpleData_m);
}
...
`writests.c'
This document was generated on 20 March 2003 using texi2html 1.56k.