NOTE: THIS IS SOMEWHAT OUT OF DATE AS OF October 24, 1998. CHANGES SINCE AT LEAST July 1998 HAVE NOT YET BEEN RECORDED.

AMPLE Function Library Reference Manual
functions for morphological parsing
version 3.2.1
October 1998
by Stephen McConnel

Copyright (C) 2000 SIL International

Published by:
Language Software Development
SIL International
7500 W. Camp Wisdom Road
Dallas, TX 75236
U.S.A.

Permission is granted to make and distribute verbatim copies of this file provided the copyright notice and this permission notice are preserved in all copies.

The author may be reached at the address above or via email as `steve@acadcomp.sil.org'.

Introduction to the AMPLE function library
******************************************

Since it was released in 1988, the AMPLE program has been used for morphological analysis in many different languages. It has always functioned as a batch processing program, which is useful for production work such as analyzing an entire book, but is less useful during the early stages of developing a morphological description.

The AMPLE function library has therefore been developed with the goal of making it easier to cast AMPLE style morphological parsing into different frameworks. This has already borne fruit: the PC-PATR syntactic parser now has an embedded AMPLE morphological parser, and a Microsoft Windows DLL incorporating the AMPLE functions has been written.

Variable and function naming conventions
****************************************

The basic goal behind choosing names in the AMPLE function library is for the name to convey information about what it represents. This is achieved in two ways: striving for a descriptive name rather than a short cryptic abbreviated name, and following a different pattern of capitalization for each type of name.

Preprocessor macro names
========================

Preprocessor macro names are written entirely in capital letters.
If the name requires more than one word for an adequate description, the words are joined together with intervening underscore (`_') characters.

Data structure names
====================

Data structure names consist of one or more capitalized words. If the name requires more than one word for an adequate description, the words are joined together without underscores, depending on the capitalization pattern to make them readable as separate words.

Variable names
==============

Variable names in the AMPLE function library follow a modified form of the Hungarian naming convention described by Steve McConnell in his book `Code Complete' on pages 202-206. Variable names have three parts: a lowercase type prefix, a descriptive name, and a scope suffix.

Type prefix
-----------

The type prefix has the following basic possibilities:

`b'
     a Boolean, usually encoded as a `char', `short', or `int'

`c'
     a character, usually a `char' but sometimes a `short' or `int'

`d'
     a double precision floating point number, that is, a `double'

`e'
     an enumeration, encoded as an `enum' or as a `char', `short', or `int'

`i'
     an integer, that is, an `int', `short', `long', or (rarely) `char'

`s'
     a data structure defined by a `struct' statement

`sz'
     a NUL (that is, zero) terminated character string

`pf'
     a pointer to a function

In addition, the basic types may be prefixed by these qualifiers:

`u'
     indicates that an integer or a character is unsigned

`a'
     indicates an array of the basic type

`p'
     indicates a pointer to the type, possibly a pointer to an array or to a pointer

Descriptive name
----------------

The descriptive name portion of a variable name consists of one or more capitalized words concatenated together. There are no underscores (`_') separating these words from each other, or from the type prefix. For the AMPLE function library, the descriptive name for global variables begins with `Ample'.

Scope suffix
------------

The scope suffix has these possibilities:

`_g'
     indicates a global variable accessible throughout the program

`_m'
     indicates a module (semiglobal) variable accessible throughout the file (declared `static')

`_in'
     indicates a function argument used for input

`_out'
     indicates a function argument used for output (must be a pointer)

`_io'
     indicates a function argument used for both input and output (must be a pointer)

`_s'
     indicates a function variable that retains its value between calls (declared `static')

The lack of a scope suffix indicates that a variable is declared within a function and exists on the stack for the duration of the current call.

Function names
==============

Global function names in the AMPLE function library have two parts: a verb that is all lowercase followed by a noun phrase containing one or more capitalized words. These pieces are concatenated without any intervening underscores (`_'). For the AMPLE library functions, the noun phrase section includes `Ample'.

Examples
========

Given the discussion above, it is easy to discern at a glance what type of item each of the following names refers to.

`SAMPLE_NAME'
     is a preprocessor macro.

`SampleName'
     is a data structure.

`pSampleName'
     is a local pointer variable.

`writeSampleName'
     is a function (that may apply to a data structure named `SampleName').

AMPLE data structures
*********************

The AMPLE functions generally operate on two basic data structures: `AmpleData' stores the lexicon and other linguistic information necessary for morphological parsing, and `AmpleWord' stores the information for a single word that is being parsed. Each of these data structures is a collection of other data structures. Several of these are described in `OPAC Function Library Reference Manual', and the other data structures are usually not important for using the AMPLE function library.

AmpleData
=========

Definition
----------

     #include <stdio.h>
     #include "opaclib.h"

     typedef struct ample_allo_env          AmpleAlloEnv;
     typedef struct ample_cat_class         AmpleCategoryClass;
     typedef struct ample_fnlist            AmpleTestList;
     typedef struct ample_hlalist           AmpleHeadlistList;
     typedef struct ample_morph_class       AmpleMorphClass;
     typedef struct ample_morph_constraint  AmpleMorphConstraint;
     typedef struct ample_morpheme          AmpleMorpheme;
     typedef struct ample_pairlist          AmplePairList;
     typedef struct ample_prop              AmpleProperty;

     typedef struct {
         /*
          *  information provided directly by the user
          */
         unsigned char          bDebugAllomorphConds;   /* -a */
         unsigned char          bEnableAllomorphIDs;    /* -b */
         unsigned char          cBeginComment;          /* -c */
         unsigned char          bRootGlosses;           /* -g */
         int                    iMaxTrieDepth;          /* -d */
         int                    iMaxMorphnameLength;    /* -n */
         int                    eTraceAnalysis;         /* -t */
         int                    iOutputFlags;           /* -w -x, \\cat ... */
         int                    iDebugLevel;            /* -/ */
         FILE *                 pLogFP;
         /*
          *  information loaded from the selective analysis file
          */
         char *                 pszSelectiveAnalFile;
         StringList *           pSelectiveAnalMorphs;
         /*
          *  information loaded from the text input control file
          */
         TextControl            sTextCtl;
         /*
          *  information loaded from the "analysis data" (control) file
          */
         char *                 pszAnalysisDataFile;
         AmpleTestList *        pPrefixSuccTests;       /* \\pt */
         AmpleTestList *        pRootSuccTests;         /* \\rt */
         AmpleTestList *        pSuffixSuccTests;       /* \\st */
         AmpleTestList *        pInfixSuccTests;        /* \\it */
         AmpleTestList *        pFinalTests;            /* \\ft */
         int                    eWriteCategory;         /* \\cat */
         int                    bWriteMorphCats;
         StringList *           pCategories;            /* \\ca */
         AmpleCategoryClass *   pCategoryClasses;       /* \\ccl */
         char                   cBeginRoot;             /* \\rd */
         char                   cEndRoot;
         StringClass *          pStringClasses;         /* \\scl (all files) */
         AmplePairList *        pInfixAdhocPairs;       /* \\iah */
         AmplePairList *        pPrefixAdhocPairs;      /* \\pah */
         AmplePairList *        pRootAdhocPairs;        /* \\rah */
         AmplePairList *        pSuffixAdhocPairs;      /* \\sah */
         unsigned char *        pCompoundRootPairs;     /* \\cr */
         AmpleMorphClass *      pMorphClasses;          /* \\mcl */
         AmpleProperty *        pProperties;            /* \\ap, \\mp */
         StringList *
                                pPropertySets;
         int                    iMaxPrefixCount;        /* \\maxp */
         int                    iMaxInfixCount;         /* \\maxi */
         int                    iMaxRootCount;          /* \\maxr */
         int                    iMaxSuffixCount;        /* \\maxs */
         AmpleMorphConstraint * pMorphConstraints;      /* \\mcc */
         int                    iMaxNullCount;          /* \\maxnull */
         char *                 pszValidChars;          /* \\strcheck */
         int                    bDictionaryCapitals;    /* \\dicdecap */
         /*
          *  information loaded from the dictionary codes file
          */
         char *                 pszDictionaryCodesFile;
         CodeTable *            pPrefixTable;
         CodeTable *            pInfixTable;
         CodeTable *            pSuffixTable;
         CodeTable *            pRootTable;
         CodeTable *            pDictTable;
         /*
          *  information loaded from the AMPLE dictionaries
          */
         StringList *           pDictionaryFiles;
         Trie *                 pDictionary;
         AmpleMorpheme *        pAmpleMorphemes;
         AmpleAlloEnv *         pAllomorphEnvs;
         unsigned char          iInfixLocations; /* AMPLE_PFX, AMPLE_SFX,
                                                    and/or AMPLE_ROOT */
         /*
          *  information loaded from the dictionary orthography change file
          */
         char *                 pszDictOrthoChangeFile;
         Change *               pDictOrthoChanges;
         /*
          *  parsing variables
          */
         short                  bMorphemeLookahead;
         short                  bLookaheadDone;
         short                  bMultiDependency;
     } AmpleData;

Description
-----------

`AmpleData' groups all of the information loaded from AMPLE's multitudinous control files. This simplifies the parameter lists for many of the AMPLE library functions, while minimizing the need for global variables.

The fields of the `AmpleData' data structure are as follows:

`bDebugAllomorphConds'
     causes debugging output for allomorph constraints if `TRUE' (nonzero).

`bEnableAllomorphIDs'
     allows the allomorph identifiers to be stored in memory if `TRUE' (nonzero). This was added to support LinguaLinks.

`cBeginComment'
     is the character that begins comments in the input control files (including the dictionaries).

`bRootGlosses'
     causes root glosses to be output in the analysis file, and enables the internal code `G' in the dictionary code table.

`iMaxTrieDepth'
     is the maximum depth of the dictionary trie. A value of `2' or `3' is reasonable.

`iMaxMorphnameLength'
     is the maximum allowable length for morphnames.
     This must be no greater than `64'. Smaller values save memory.

`eTraceAnalysis'
     specifies the type of analysis trace (debugging) output desired. It should be one of these three values:

     `AMPLE_TRACE_OFF'
          means that no analysis trace output is wanted.

     `AMPLE_TRACE_ON'
          means that the traditional style of indented trace output is written to the log file.

     `AMPLE_TRACE_SGML'
          means that SGML output that follows the ampletrc.dtd document type definition is written to the log file. This was added to support LinguaLinks.

`iOutputFlags'
     is a bit vector that encodes several independent Boolean values:

     `WANT_DECOMPOSITION'
     `WANT_CATEGORY'
     `WANT_PROPERTIES'
     `WANT_FEATURES'
     `WANT_UNDERLYING'
     `WANT_ORIGINAL'

     For more details, see section `WordTemplate' in `OPAC Function Library Reference Manual'.

`iDebugLevel'
     is the program debugging level. A larger number implies a larger amount of debugging output.

`pLogFP'
     is an output `FILE' pointer opened for logging information, or is `NULL'.

`pszSelectiveAnalFile'
     points to the name of the file containing selective analysis information.

`pSelectiveAnalMorphs'
     points to a list of morphnames or allomorphs used for selective analysis. If it is not `NULL', only those dictionary entries that match a member of the list are used in analysis.

`sTextCtl'
     stores the information loaded from, and the name of, the text input control file. For more details, see section `TextControl' in `OPAC Function Library Reference Manual'.

`pszAnalysisDataFile'
     points to the name of the primary AMPLE control file (the "analysis data file").

`pPrefixSuccTests'
     points to the ordered list of "prefix successor tests" loaded from the analysis data file, or is `NULL'.

`pRootSuccTests'
     points to the ordered list of "root successor tests" loaded from the analysis data file, or is `NULL'.

`pSuffixSuccTests'
     points to the ordered list of "suffix successor tests" loaded from the analysis data file, or is `NULL'.

`pInfixSuccTests'
     points to the ordered list of "infix successor tests" loaded from the analysis data file, or is `NULL'.

`pFinalTests'
     points to the ordered list of "final tests" loaded from the analysis data file, or is `NULL'.

`eWriteCategory'
     determines what kind of category information is written to the output analysis file.

     `AMPLE_NO_CATEGORY'
          means that no category information is written. This implies that `iOutputFlags & WANT_CATEGORY' is `FALSE'.

     `AMPLE_SUFFIX_CATEGORY'
          means that the last suffix probably carries the word category. This implies that `iOutputFlags & WANT_CATEGORY' is `TRUE'.

     `AMPLE_PREFIX_CATEGORY'
          means that the first prefix probably carries the word category. This implies that `iOutputFlags & WANT_CATEGORY' is `TRUE'.

`bWriteMorphCats'
     causes all of the morpheme category information to be written to the output analysis file if `TRUE', and if `eWriteCategory' is not set to `AMPLE_NO_CATEGORY'.

`pCategories'
     points to the ordered list of category names defined in the analysis data file.

`pCategoryClasses'
     points to the list of category classes defined in the analysis data file, or is `NULL'.

`cBeginRoot'
     is the character used to mark the beginning of the root morpheme field in the analysis string.

`cEndRoot'
     is the character used to mark the end of the root morpheme field in the analysis string.

`pStringClasses'
     points to the list of string classes defined in the analysis data file, the text input control file, and the dictionary orthography changes file, or is `NULL'.

`pInfixAdhocPairs'
     points to the list of "infix ad hoc pairs" defined in the analysis data file, or is `NULL'.

`pPrefixAdhocPairs'
     points to the list of "prefix ad hoc pairs" defined in the analysis data file, or is `NULL'.

`pRootAdhocPairs'
     points to the list of "root ad hoc pairs" defined in the analysis data file, or is `NULL'.

`pSuffixAdhocPairs'
     points to the list of "suffix ad hoc pairs" defined in the analysis data file, or is `NULL'.

`pCompoundRootPairs'
     points to the list of "compound root category pairs" defined in the analysis data file, or is `NULL'.

`pMorphClasses'
     points to the list of "morpheme classes" defined in the analysis data file, or is `NULL'.

`pProperties'
     points to the list of properties (either allomorph or morpheme) defined in the analysis data file, or is `NULL'.

`pPropertySets'
     points to a list of sets of properties used in the loaded dictionaries. This is used to conserve memory, by storing each distinct set of properties only once.

`iMaxPrefixCount'
     is the maximum number of prefixes allowed in a word, as defined in the analysis data file. If zero, then no prefixes are allowed.

`iMaxInfixCount'
     is the maximum number of infixes allowed in a word, as defined in the analysis data file. If zero, then no infixes are allowed.

`iMaxRootCount'
     is the maximum number of roots allowed in a word, as defined in the analysis data file. If one, then compound roots are not allowed.

`iMaxSuffixCount'
     is the maximum number of suffixes allowed in a word, as defined in the analysis data file. If zero, then no suffixes are allowed.

`pMorphConstraints'
     points to the list of morpheme co-occurrence constraints defined in the analysis data file.

`iMaxNullCount'
     is the maximum number of null allomorphs allowed in a word, as defined in the analysis data file.

`pszValidChars'
     points to the set of valid alphabetic characters allowed in string environment constraints, as defined in the analysis data file, or is `NULL'.

`bDictionaryCapitals'
     enables decapitalization of allomorphs in the dictionary files if set `TRUE' by the analysis data file.

`pszDictionaryCodesFile'
     points to the name of the dictionary codes file.

`pPrefixTable'
     points to the `CodeTable' data structure for the prefix dictionary file, or is `NULL'. For more details, see section `CodeTable' in `OPAC Function Library Reference Manual'.

`pInfixTable'
     points to the `CodeTable' data structure for the infix dictionary file, or is `NULL'.
     For more details, see section `CodeTable' in `OPAC Function Library Reference Manual'.

`pSuffixTable'
     points to the `CodeTable' data structure for the suffix dictionary file, or is `NULL'. For more details, see section `CodeTable' in `OPAC Function Library Reference Manual'.

`pRootTable'
     points to the `CodeTable' data structure for root dictionary files, or is `NULL'. For more details, see section `CodeTable' in `OPAC Function Library Reference Manual'.

`pDictTable'
     points to the `CodeTable' data structure for unified dictionary files, or is `NULL'. For more details, see section `CodeTable' in `OPAC Function Library Reference Manual'.

`pDictionaryFiles'
     points to a list of dictionary filenames.

`pDictionary'
     points to the lexicon information loaded from the dictionary files, indexed by allomorph.

`pAmpleMorphemes'
     points to the complete list of morphemes loaded from the dictionary files. This is needed to allow morphemes to be removed from the dictionary, or to erase the entire dictionary in memory. (This logically, but not physically, duplicates the information pointed to by `pDictionary'.)

`pAllomorphEnvs'
     points to the set of allomorph environment constraints used by all of the allomorphs in the dictionary. This is an optimization to save memory, since most allomorph environment constraints are used by allomorphs in different morphemes.

`iInfixLocations'

`pszDictOrthoChangeFile'
     points to the name of the dictionary orthography change file, or is `NULL'.

`pDictOrthoChanges'
     points to the ordered list of orthography changes to apply to the allomorphs loaded from the dictionary files, or is `NULL'.

`bMorphemeLookahead'
`bLookaheadDone'
`bMultiDependency'
     These Boolean variables are used internally while parsing. They are all involved with the need to look at morphemes in adjacent words (a buggy hack that should not be used and should not have been implemented in my opinion).

Source File
-----------

`ample.h'

AmpleWord
=========

Definition
----------

     #include "template.h"

     typedef struct ample_hlalist  AmpleHeadlistList;

     typedef struct {
         WordTemplate *       pTemplate;
         AmpleHeadlistList *  pHeadlists;
         char *               pszRemaining;
         unsigned             uiAmbigCount;
         int                  bFoundRoot;
     } AmpleWord;

Description
-----------

`AmpleWord' groups the information for a single word processed by AMPLE. This simplifies the parameter lists for many of the AMPLE library functions, while minimizing the need for global variables.

The fields of the `AmpleWord' data structure are as follows:

`pTemplate'
     points to a `WordTemplate' data structure that stores a word and its analyses. For more details, see section `WordTemplate' in `OPAC Function Library Reference Manual'.

`pHeadlists'
     is used internally by the AMPLE processing functions. It points to a list of lists of morphemes, each list of morphemes representing one analysis of the word.

`pszRemaining'
     is used internally by the AMPLE processing functions. It points to the remainder of the word that has yet to be analyzed.

`uiAmbigCount'
     is the number of analyses for this word. (This is used internally by the AMPLE processing functions.)

`bFoundRoot'
     is used internally by the AMPLE processing functions. It is `TRUE' if a root has been found, and `FALSE' if only prefixes and infixes have been found in the analysis process.

Source File
-----------

`ample.h'

AMPLE global variables
**********************

This chapter gives the proper usage information about each of the global variables found in the AMPLE function library. For each global variable that the library provides, this information includes which header files to include in your source to obtain the extern declaration for that variable.

Note that all of the global variables in the AMPLE function library provide information about the current version.

cAmplePatchSep_g
================

Syntax
------

     #include "ample.h"

     extern const char cAmplePatchSep_g;

Description
-----------

`cAmplePatchSep_g' is the character used to separate the revision level number and the patch level number. It has one of the following three values:

`a'
     for alpha test versions

`b'
     for beta test versions

`.'
     for release versions

Example
-------

See the example for `iAmpleVersion_g' below.

Source File
-----------

`version.c'

iAmplePatchlevel_g
==================

Syntax
------

     #include "ample.h"

     extern const int iAmplePatchlevel_g;

Description
-----------

`iAmplePatchlevel_g' is the current "patch level" of the AMPLE function library and program. This is the third level version number, reflecting bug fixes or internal improvements that should be functionally invisible to users.

The patch level can go as high as needed. It is not limited to single (or double) digit numbers.

Example
-------

See the example for `iAmpleVersion_g' below.

Source File
-----------

`version.c'

iAmpleRevision_g
================

Syntax
------

     #include "ample.h"

     extern const int iAmpleRevision_g;

Description
-----------

`iAmpleRevision_g' is the current "revision level" of the AMPLE program and function library. This is the second level version number, reflecting changes to program behavior that require changes to the `AMPLE Reference Manual'.

The revision level can go as high as needed. It is not limited to single (or double) digit numbers.

Example
-------

See the example for `iAmpleVersion_g' below.

Source File
-----------

`version.c'

iAmpleVersion_g
===============

Syntax
------

     #include "ample.h"

     extern const int iAmpleVersion_g;

Description
-----------

`iAmpleVersion_g' is the current "version" number of the AMPLE program and function library. This is the top level version number, reflecting a major rewrite of the program or major changes that make it incompatible with earlier versions of the program.

Example
-------

     #include <stdio.h>
     #include "ample.h"
     ...
     printf("AMPLE functions version %d.%d%c%d (%s), ",
            iAmpleVersion_g, iAmpleRevision_g, cAmplePatchSep_g,
            iAmplePatchlevel_g, pszAmpleDate_g);
     printf("Copyright %s SIL, Inc.\n", pszAmpleYear_g);
     #ifdef __DATE__
     printf(pszAmpleCompileFormat_g,
            pszAmpleCompileDate_g, pszAmpleCompileTime_g);
     #else
     if (pszAmpleTestVersion_g != NULL)
         fputs(pszAmpleTestVersion_g, stdout);
     #endif

Source File
-----------

`version.c'

pszAmpleCompileDate_g
=====================

Syntax
------

     #include "ample.h"

     #ifdef __DATE__
     extern const char * pszAmpleCompileDate_g;
     #endif

Description
-----------

If the compiler predefines the `__DATE__' constant, `pszAmpleCompileDate_g' is a string containing the date that the AMPLE function library and program was compiled.

Example
-------

See the example for `iAmpleVersion_g' above.

Source File
-----------

`version.c'

pszAmpleCompileFormat_g
=======================

Syntax
------

     #include "ample.h"

     #ifdef __DATE__
     extern const char * pszAmpleCompileFormat_g;
     #endif

Description
-----------

If the compiler predefines the `__DATE__' constant, `pszAmpleCompileFormat_g' is a `printf' format string suitable for displaying the date and time that the AMPLE function library and program was compiled.

Example
-------

See the example for `iAmpleVersion_g' above.

Source File
-----------

`version.c'

pszAmpleCompileTime_g
=====================

Syntax
------

     #include "ample.h"

     #ifdef __DATE__
     extern const char * pszAmpleCompileTime_g;
     #endif

Description
-----------

If the compiler predefines the `__DATE__' constant, `pszAmpleCompileTime_g' is a string containing the time that the AMPLE function library and program was compiled.

Example
-------

See the example for `iAmpleVersion_g' above.

Source File
-----------

`version.c'

pszAmpleDate_g
==============

Syntax
------

     #include "ample.h"

     extern const char * pszAmpleDate_g;

Description
-----------

`pszAmpleDate_g' is a string containing the date that the AMPLE function library and program was last modified.

Example
-------

See the example for `iAmpleVersion_g' above.

Source File
-----------

`version.c'

pszAmpleTestVersion_g
=====================

Syntax
------

     #include "ample.h"

     #ifndef __DATE__
     extern const char * pszAmpleTestVersion_g;
     #endif

Description
-----------

If the compiler does not predefine the `__DATE__' constant, `pszAmpleTestVersion_g' is a string describing what kind of test version it is (alpha or beta). If it is not a test version, then the string pointer is `NULL'.

Example
-------

See the example for `iAmpleVersion_g' above.

Source File
-----------

`version.c'

pszAmpleYear_g
==============

Syntax
------

     #include "ample.h"

     extern const char * pszAmpleYear_g;

Description
-----------

`pszAmpleYear_g' is a string containing the year that the AMPLE function library and program was last copyrighted.

Example
-------

See the example for `iAmpleVersion_g' above.

Source File
-----------

`version.c'

AMPLE functions
***************

This document gives the proper usage information about each of the functions found in the AMPLE function library. The prototypes and type definitions relevant to the use of these functions are all found in the `ample.h' header file.

addAmpleSelectiveAnalItem
=========================

Syntax
------

     #include "ample.h"

     void addAmpleSelectiveAnalItem(const char * pszMorphs_in,
                                    AmpleData *  pAmple_io);

Description
-----------

`addAmpleSelectiveAnalItem' adds the morpheme and allomorph information to the list of morphemes and allomorphs that are used in selective analysis.

The arguments to `addAmpleSelectiveAnalItem' are as follows:

`pszMorphs_in'
     points to a `NUL'-terminated character string that encodes morphname or allomorph information. (Currently, this is just a list of morphnames or allomorphs.)

`pAmple_io'
     points to the data structure that contains the current AMPLE language information.

Return Value
------------

none

Example
-------

     #include "ample.h"
     ...
     AmpleData sAmpleData_g;
     ...
     addAmpleSelectiveAnalItem("morph allomorph", &sAmpleData_g);
     ...
     addAmpleSelectiveAnalItem("allomorph2 morph2", &sAmpleData_g);
     ...

Source File
-----------

`setsd.c'

checkAmpleMorphs
================

Syntax
------

     #include "ample.h"

     void checkAmpleMorphs(int         bCheckMorphs_in,
                           AmpleData * pAmple_in);

Description
-----------

`checkAmpleMorphs' checks that all referenced morphnames are defined in the dictionaries. This requires that `initAmpleMorphChecking' be called before loading the analysis data file or any of the dictionaries. Morphname references are checked in:

  1. allomorph environment constraints,
  2. morpheme co-occurrence constraints,
  3. user-defined tests,
  4. morpheme classes, and
  5. adhoc-pairs.

An error message is displayed for each unrecognized morphname. Duplicate morphnames in the dictionaries are also detected.

The arguments to `checkAmpleMorphs' are as follows:

`bCheckMorphs_in'
     causes the morphname check to take place if `TRUE', or prevents it if `FALSE'.

`pAmple_in'
     points to the data structure that contains the current AMPLE language information.

Return Value
------------

none

Example
-------

     #include "ample.h"
     ...
     AmpleData sAmpleData_g;
     ...
     initAmpleMorphChecking(TRUE);
     ...
     checkAmpleMorphs(TRUE, &sAmpleData_g);

Source File
-----------

`setsd.c'

eraseAmpleWord
==============

Syntax
------

     #include "ample.h"

     void eraseAmpleWord(AmpleWord * pWord_in);

Description
-----------

`eraseAmpleWord' frees the memory allocated for an AMPLE word data structure. This includes the `WordTemplate' data structure and other fields used internally. The AMPLE word data structure itself is not freed, so it can be a static or auto variable.

`eraseAmpleWord' has only one argument:

`pWord_in'
     points to an `AmpleWord' data structure that contains information that is no longer needed.

Return Value
------------

none

Example
-------

     #include "ample.h"
     ...
     static AmpleData sAmpleData_m;
     ...
     AmpleWord      sThisWord;
     WordTemplate * pWord;
     FILE *         pInputFP;
     FILE *         pOutputFP;
     char *         pszOutFilename;
     ...
     initiateAmpleTrace( &sAmpleData_m );
     while ((pWord = readTemplateFromText(pInputFP,
                                          &sAmpleData_m.sTextCtl)) != NULL)
         {
         sThisWord.pTemplate    = pWord;
         sThisWord.pHeadlists   = NULL;
         sThisWord.pszRemaining = NULL;
         sThisWord.uiAmbigCount = 0;
         sThisWord.bFoundRoot   = FALSE;
         if (sThisWord.pTemplate->paWord != NULL)
             performAmpleAnalysis(&sThisWord, NULL, NULL, &sAmpleData_m);
         writeTemplate( pOutputFP, pszOutFilename, sThisWord.pTemplate,
                        &sAmpleData_m.sTextCtl);
         eraseAmpleWord( &sThisWord );
         }
     terminateAmpleTrace( &sAmpleData_m );
     ...

Source File
-----------

`anal.c'

findAmplePropertyName
=====================

Syntax
------

     #include "ample.h"

     char * findAmplePropertyName(unsigned          uiPropNumber_in,
                                  const AmpleData * pAmple_in);

Description
-----------

`findAmplePropertyName' searches for the name of a property given by number.

The arguments to `findAmplePropertyName' are as follows:

`uiPropNumber_in'
     is a number assigned to an AMPLE (allomorph or morpheme) property.

`pAmple_in'
     points to the data structure that contains the current AMPLE language information.

Return Value
------------

a pointer to the property name, or `NULL' if not found

Example
-------

     #include <stdio.h>
     #include "ample.h"
     ...
     static AmpleData sAmpleData_m;
     ...
     char *   pszProperty;
     unsigned uiProperty;
     ...
     for ( uiProperty = 1 ; uiProperty < 256 ; ++uiProperty )
         {
         pszProperty = findAmplePropertyName(uiProperty, &sAmpleData_m);
         if (pszProperty != NULL)
             printf("Property %3u is \"%s\"\n", uiProperty, pszProperty);
         }

Source File
-----------

`proper.c'

findAmplePropertyNumber
=======================

Syntax
------

     #include "ample.h"

     unsigned char findAmplePropertyNumber(const char *      pszName_in,
                                           const AmpleData * pAmple_in);

Description
-----------

`findAmplePropertyNumber' searches for a property given by name.

The arguments to `findAmplePropertyNumber' are as follows:

`pszName_in'
     points to an AMPLE (allomorph or morpheme) property name.

`pAmple_in'
     points to the data structure that contains the current AMPLE language information.

Return Value
------------

the integer value of the property, or zero if not found

Example
-------

     #include <stdio.h>
     #include "ample.h"
     ...
     static AmpleData sAmpleData_m;
     ...
     unsigned uiProperty;
     char *   pszProperty;
     ...
     uiProperty = findAmplePropertyNumber(pszProperty, &sAmpleData_m);
     if (uiProperty == 0)
         printf("%s is not a valid property name.\n", pszProperty);
     else
         printf("%s is property number %u.\n", pszProperty, uiProperty);

Source File
-----------

`proper.c'

freeAmpleDictionary
===================

Syntax
------

     #include "ample.h"

     void freeAmpleDictionary(AmpleData * pAmple_io);

Description
-----------

`freeAmpleDictionary' frees the memory allocated to store an AMPLE dictionary. This is called by `resetAmpleData', which is the safest way to use it since the data from the dictionary files are somewhat intermingled with data from other files.

`freeAmpleDictionary' has only one argument:

`pAmple_io'
     points to the data structure that contains the current AMPLE language information.

Return Value
------------

none

Example
-------

     #include "ample.h"

     AmpleData sAmpleData_g;
     char      szCodesFilename_g[100];
     char      szDictFilename_g[100];
     ...
     loadAmpleDictCodeTables(szCodesFilename_g, &sAmpleData_g, TRUE);
     ...
     loadAmpleDictionary(szDictFilename_g, AMPLE_UNIFIED, &sAmpleData_g);
     ...
     freeAmpleDictionary( &sAmpleData_g );

Source File
-----------

`setsd.c'

freeAmpleSelectiveAnalInfo
==========================

Syntax
------

     #include "ample.h"

     void freeAmpleSelectiveAnalInfo(AmpleData * pAmple_io);

Description
-----------

`freeAmpleSelectiveAnalInfo' frees the memory allocated to store the selective analysis information. This also marks all of the current dictionary entries (in memory) to enable them to be used in future analysis efforts.

`freeAmpleSelectiveAnalInfo' has only one argument:

`pAmple_io'
     points to the data structure that contains the current AMPLE language data, including the selective analysis information.

Return Value
------------

none

Example
-------

     #include "ample.h"

     AmpleData sAmpleData_g;
     char      szSelectiveAnalFile_g[100];
     ...
     loadAmpleSelectiveAnalFile(szSelectiveAnalFile_g, &sAmpleData_g);
     ...
     freeAmpleSelectiveAnalInfo( &sAmpleData_g );
     ...

Source File
-----------

`select.c'

hasAmpleProperty
================

Syntax
------

     #include "ample.h"

     int hasAmpleProperty(const unsigned char * pProperties_in,
                          unsigned              uiPropNumber_in);

Description
-----------

`hasAmpleProperty' checks whether `pProperties_in' contains a specific property value. `pProperties_in' is normally the combined set of allomorph and morpheme properties from a dictionary entry. This is the same as `(strchr(pProperties_in, uiPropNumber_in) != NULL)', except for using unsigned characters.

The arguments to `hasAmpleProperty' are as follows:

`pProperties_in'
     points to a `NUL'-terminated array of AMPLE (allomorph or morpheme) property numbers.

`uiPropNumber_in'
     is a number assigned to an AMPLE (allomorph or morpheme) property.

Return Value
------------

`TRUE' if the property set contains the property value, otherwise `FALSE'

Example
-------

     #include <stdio.h>
     #include "ample.h"
     #include "ampledef.h"  /* example uses internal data structures */
     ...
     static AmpleData sAmpleData_m;
     ...
     AmpleAllomorph * pAllomorph;
     unsigned         uiProperty;
     char *           pszPropName;
     ...
     pszPropName = findAmplePropertyName(uiProperty, &sAmpleData_m);
     if (hasAmpleProperty(pAllomorph->pProperties, uiProperty))
         {
         printf("allomorph %s of %s has property %s.\n",
                pAllomorph->pszAllomorph,
                pAllomorph->pMorpheme->pszMorphName,
                pszPropName);
         }
     else
         {
         printf("allomorph %s of %s does not have property %s.\n",
                pAllomorph->pszAllomorph,
                pAllomorph->pMorpheme->pszMorphName,
                pszPropName ?
                    pszPropName : "(invalid property)");
     }

Source File
-----------

`proper.c'

initAmpleMorphChecking
======================

Syntax
------

     #include "ample.h"

     void initAmpleMorphChecking(int bCheckMorphs_in);

Description
-----------

`initAmpleMorphChecking' initializes the internal arrays for morphname
checking.  If `bCheckMorphs_in' is `FALSE', then no memory is allocated
and no checking can be performed.

`initAmpleMorphChecking' has only one argument:

`bCheckMorphs_in'
     allows the morphname check to take place if `TRUE', or prevents it
     if `FALSE'.

Return Value
------------

none

Example
-------

     #include "ample.h"
     ...
     initAmpleMorphChecking(TRUE);
     ...
     checkAmpleMorphs(TRUE);

Source File
-----------

`setsd.c'

initiateAmpleTrace
==================

Syntax
------

     #include "ample.h"

     void initiateAmpleTrace(const AmpleData * pAmple_in);

Description
-----------

`initiateAmpleTrace' writes the AMPLE trace header to the log file.  If
`pAmple_in->pLogFP' is `NULL', then nothing happens.  If tracing output
is wanted, then this function should be called before any words are
analyzed, and after all control files and dictionaries have been
loaded.

`initiateAmpleTrace' has only one argument:

`pAmple_in'
     points to the data structure that contains the current AMPLE
     language information.

Return Value
------------

none

Example
-------

     #include "ample.h"
     ...
     static AmpleData sAmpleData_m;
     ...
     AmpleWord      sThisWord;
     WordTemplate * pWord;
     FILE *         pInputFP;
     FILE *         pOutputFP;
     char *         pszOutFilename;
     ...
     initiateAmpleTrace( &sAmpleData_m );

     while ((pWord = readTemplateFromText(pInputFP,
                                          &sAmpleData_m.sTextCtl)) != NULL)
     {
         sThisWord.pTemplate    = pWord;
         sThisWord.pHeadlists   = NULL;
         sThisWord.pszRemaining = NULL;
         sThisWord.uiAmbigCount = 0;
         sThisWord.bFoundRoot   = FALSE;
         if (sThisWord.pTemplate->paWord != NULL)
             performAmpleAnalysis(&sThisWord, NULL, NULL, &sAmpleData_m);
         writeTemplate( pOutputFP, pszOutFilename, sThisWord.pTemplate,
                        &sAmpleData_m.sTextCtl);
         eraseAmpleWord( &sThisWord );
     }

     terminateAmpleTrace( &sAmpleData_m );
     ...

Source File
-----------

`anal.c'

isAmpleAllomorphProperty
========================

Syntax
------

     #include "ample.h"

     int isAmpleAllomorphProperty(unsigned          uiPropNumber_in,
                                  const AmpleData * pAmple_in);

Description
-----------

`isAmpleAllomorphProperty' tests whether or not the property (given by
number) is an allomorph property.

The arguments to `isAmpleAllomorphProperty' are as follows:

`uiPropNumber_in'
     is a number assigned to an AMPLE (allomorph or morpheme) property.

`pAmple_in'
     points to the data structure that contains the current AMPLE
     language information.

Return Value
------------

`TRUE' if it is an allomorph property, otherwise `FALSE'

Example
-------

     #include <stdio.h>
     #include "ample.h"
     #include "ampledef.h"   /* example uses internal data structures */
     ...
     static AmpleData sAmpleData_m;
     ...
     AmpleAllomorph * pAllomorph;
     unsigned char *  pProp;
     ...
     printf("Allomorph %s of %s has these properties:",
            pAllomorph->pszAllomorph,
            pAllomorph->pMorpheme->pszMorphName);
     for ( pProp = pAllomorph->pProperties ; pProp && *pProp ; ++pProp )
     {
         if (isAmpleAllomorphProperty(*pProp, &sAmpleData_m))
             printf(" %s", findAmplePropertyName(*pProp, &sAmpleData_m));
     }
     printf("\n");

Source File
-----------

`proper.c'

isAmpleMorphemeProperty
=======================

Syntax
------

     #include "ample.h"

     int isAmpleMorphemeProperty(unsigned          uiPropNumber_in,
                                 const AmpleData * pAmple_in);

Description
-----------

`isAmpleMorphemeProperty' tests whether or not the property (given by
number) is a morpheme property.

The arguments to `isAmpleMorphemeProperty' are as follows:

`uiPropNumber_in'
     is a number assigned to an AMPLE (allomorph or morpheme) property.

`pAmple_in'
     points to the data structure that contains the current AMPLE
     language information.

Return Value
------------

`TRUE' if it is a morpheme property, otherwise `FALSE'

Example
-------

     #include <stdio.h>
     #include "ample.h"
     #include "ampledef.h"   /* example uses internal data structures */
     ...
     static AmpleData sAmpleData_m;
     ...
     AmpleMorpheme * pMorpheme;
     unsigned char * pProp;
     ...
     printf("Morpheme %s has these properties:", pMorpheme->pszMorphName);
     for ( pProp = pMorpheme->pAllomorphs->pProperties ;
           pProp && *pProp ;
           ++pProp )
     {
         if (isAmpleMorphemeProperty(*pProp, &sAmpleData_m))
             printf(" %s", findAmplePropertyName(*pProp, &sAmpleData_m));
     }
     printf("\n");

Source File
-----------

`proper.c'

loadAmpleControlFile
====================

Syntax
------

     #include "ample.h"

     int loadAmpleControlFile(const char * pszInputFile_in,
                              AmpleData *  pAmple_io);

Description
-----------

`loadAmpleControlFile' reads the main AMPLE control file, commonly
called the "analysis data file".

The arguments to `loadAmpleControlFile' are as follows:

`pszInputFile_in'
     points to the name of an analysis data file.

`pAmple_io'
     points to the data structure that is filled in with the language
     information from the analysis data file.

Return Value
------------

zero if the file is successfully read into memory, otherwise nonzero

Example
-------

     #include "ample.h"

     AmpleData sAmpleData_g;
     char      szControlFilename_g[100];
     ...
     if (loadAmpleControlFile(szControlFilename_g, &sAmpleData_g) != 0)
     {
         /* error message? */
     }

Source File
-----------

`analda.c'

loadAmpleDictCodeTables
=======================

Syntax
------

     #include "ample.h"

     int loadAmpleDictCodeTables(const char * pszCodesFile_in,
                                 AmpleData *  pAmple_io,
                                 int          bUnified_in);

Description
-----------

`loadAmpleDictCodeTables' reads an AMPLE dictionary code change tables
file.

The arguments to `loadAmpleDictCodeTables' are as follows:

`pszCodesFile_in'
     points to the name of an AMPLE dictionary code change tables file.

`pAmple_io'
     points to the data structure that is filled in with the dictionary
     code change tables from the file.

`bUnified_in'
     signals that the dictionary is in "unified" (combined affix and
     root) form if `TRUE', or that the dictionary is split into
     separate prefix, infix, suffix, and root dictionary files if
     `FALSE'.

Return Value
------------

zero if the file is successfully read into memory, otherwise nonzero

Example
-------

     #include "ample.h"

     AmpleData sAmpleData_g;
     char      szCodesFilename_g[100];
     ...
     if (loadAmpleDictCodeTables(szCodesFilename_g,
                                 &sAmpleData_g, FALSE) != 0)
     {
         /* error message? */
     }

Source File
-----------

`loadtb.c'

loadAmpleDictionary
===================

Syntax
------

     #include "ample.h"

     int loadAmpleDictionary(const char * pszDictFile_in,
                             int          eDictType_in,
                             AmpleData *  pAmple_io);

Description
-----------

`loadAmpleDictionary' reads an AMPLE dictionary file.

The arguments to `loadAmpleDictionary' are as follows:

`pszDictFile_in'
     points to the name of an AMPLE dictionary file.

`eDictType_in'
     is one of the following values:

    `AMPLE_PFX'
          signals a prefix dictionary.

    `AMPLE_IFX'
          signals an infix dictionary.

    `AMPLE_SFX'
          signals a suffix dictionary.

    `AMPLE_ROOT'
          signals a root dictionary.

    `AMPLE_UNIFIED'
          signals a unified (combined affix and root) dictionary.
          (This is the same as `AMPLE_PFX | AMPLE_IFX | AMPLE_SFX |
          AMPLE_ROOT'.)

`pAmple_io'
     points to the data structure that is filled in with the lexicon
     information from the dictionary file.

Return Value
------------

zero if the file is successfully read into memory, otherwise nonzero

Example
-------

     #include "ample.h"

     AmpleData sAmpleData_g;
     char      szCodesFilename_g[100];
     char      szDictFilename_g[100];
     ...
     loadAmpleDictCodeTables(szCodesFilename_g, &sAmpleData_g, FALSE);
     ...
     if (loadAmpleDictionary(szDictFilename_g,
                             AMPLE_PFX, &sAmpleData_g) != 0)
     {
         /* error message? */
     }

Source File
-----------

`setsd.c'

loadAmpleDictOrthoChanges
=========================

Syntax
------

     #include "ample.h"

     int loadAmpleDictOrthoChanges(const char * pszDictOrthoFile_in,
                                   AmpleData *  pAmple_io);

Description
-----------

`loadAmpleDictOrthoChanges' loads an ordered list of AMPLE dictionary
orthography changes from a file.  These changes are applied to the
allomorphs loaded from the dictionary files before they are stored in
memory.

The arguments to `loadAmpleDictOrthoChanges' are as follows:

`pszDictOrthoFile_in'
     points to the name of the AMPLE dictionary orthography changes
     file.

`pAmple_io'
     points to the data structure that is filled in with the dictionary
     orthography changes loaded from the file.

Return Value
------------

zero if the file is successfully read into memory, otherwise nonzero

Example
-------

     #include "ample.h"

     AmpleData sAmpleData_g;
     char      szDictOrthoFilename_g[100];
     ...
     if (szDictOrthoFilename_g[0])
     {
         if (loadAmpleDictOrthoChanges(szDictOrthoFilename_g,
                                       &sAmpleData_g) != 0)
         {
             /* error message?
             */
         }
     }
     else
         sAmpleData_g.pDictOrthoChanges = (Change *)NULL;

Source File
-----------

`loadcc.c'

loadAmpleSelectiveAnalFile
==========================

Syntax
------

     #include "ample.h"

     int loadAmpleSelectiveAnalFile(const char * pszFilename_in,
                                    AmpleData *  pAmple_io);

Description
-----------

`loadAmpleSelectiveAnalFile' loads a set of morphnames and allomorphs
from a file.  If `loadAmpleSelectiveAnalFile' is called before
`loadAmpleDictionary', then only the selected morphemes and allomorphs
are stored in memory.  If `loadAmpleSelectiveAnalFile' is called after
`loadAmpleDictionary', then the current dictionary entries are marked
for selective analysis.

The arguments to `loadAmpleSelectiveAnalFile' are as follows:

`pszFilename_in'
     points to the name of the AMPLE selective analysis file.

`pAmple_io'
     points to the data structure that contains the current AMPLE
     language information, including the selective analysis
     information.

Return Value
------------

zero if the file is successfully read into memory, otherwise nonzero

Example
-------

     #include "ample.h"

     AmpleData sAmpleData_g;
     char      szSelectiveAnalFilename_g[100];
     ...
     if (szSelectiveAnalFilename_g[0])
     {
         if (loadAmpleSelectiveAnalFile(szSelectiveAnalFilename_g,
                                        &sAmpleData_g) != 0)
         {
             /* error message? */
         }
     }
     else
         sAmpleData_g.pSelectiveAnalMorphs = NULL;

Source File
-----------

`select.c'

performAmpleAnalysis
====================

Syntax
------

     #include "ample.h"

     unsigned performAmpleAnalysis(AmpleWord * pThisWord_io,
                                   AmpleWord * pPreviousWord_in,
                                   AmpleWord * pNextWord_in,
                                   AmpleData * pAmple_in);

Description
-----------

`performAmpleAnalysis' tries to analyze the wordform(s) pointed to by
`pThisWord_io->pTemplate->paWord[0..n]'.  (Please forgive the mixture
of C and pseudoPascal in the last sentence.)  The wordforms are usually
set by `readTemplateFromText' or an equivalent function.  (See section
`readTemplateFromText' in `OPAC Function Library Reference Manual'.)
There is usually only one wordform, stored as
`pThisWord_io->pTemplate->paWord[0]'.  Since the process of
decapitalization may be ambiguous, a NULL-terminated array of character
strings is used for the wordform to parse instead of simply using a
single character string.

The resulting analyses are stored in
`pThisWord_io->pTemplate->pAnalyses'.  The original morpheme dictionary
information for each analysis is stored in `pThisWord_io->pHeadlists'.
The number of analyses is stored in `pThisWord_io->uiAmbigCount' as
well as being returned as the function value.

The arguments to `performAmpleAnalysis' are as follows:

`pThisWord_io'
     points to a data structure that contains the current wordform(s),
     and that will store the analyses of the current word.

`pPreviousWord_in'
     points to a data structure that contains the wordform(s) and
     analyses of the preceding word.

`pNextWord_in'
     points to a data structure that contains the wordform(s) and
     analyses of the following word.  (Knowing the analyses is
     inherently impossible, but ... .)

`pAmple_in'
     points to the data structure that contains the current AMPLE
     language information.

Return Value
------------

the number of analyses produced (zero if analysis failed)

Example
-------

     #include "ample.h"
     ...
     static AmpleData sAmpleData_m;
     ...
     AmpleWord      sThisWord;
     WordTemplate * pWord;
     FILE *         pInputFP;
     FILE *         pOutputFP;
     char *         pszOutFilename;
     ...
     initiateAmpleTrace( &sAmpleData_m );

     while ((pWord = readTemplateFromText(pInputFP,
                                          &sAmpleData_m.sTextCtl)) != NULL)
     {
         sThisWord.pTemplate    = pWord;
         sThisWord.pHeadlists   = NULL;
         sThisWord.pszRemaining = NULL;
         sThisWord.uiAmbigCount = 0;
         sThisWord.bFoundRoot   = FALSE;
         if (sThisWord.pTemplate->paWord != NULL)
             performAmpleAnalysis(&sThisWord, NULL, NULL, &sAmpleData_m);
         writeTemplate( pOutputFP, pszOutFilename, sThisWord.pTemplate,
                        &sAmpleData_m.sTextCtl);
         eraseAmpleWord( &sThisWord );
     }

     terminateAmpleTrace( &sAmpleData_m );
     ...
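
The description above says that the wordform(s) to parse are held in a
NULL-terminated array of character strings.  The following is a minimal
sketch of walking such an array.  It does not use the AMPLE library:
`countWordforms' is a hypothetical helper, and the plain array passed
to it merely stands in for the `paWord' field.

```c
#include <stddef.h>

/*
 *  countWordforms() is a hypothetical helper for illustration only; it is
 *  not part of the AMPLE function library.  It counts the strings in a
 *  NULL-terminated array laid out the way the description says paWord is.
 */
static size_t countWordforms(char * const * paWord_in)
{
    size_t uiCount = 0;

    if (paWord_in == NULL)      /* no array at all: nothing to analyze */
        return 0;
    while (paWord_in[uiCount] != NULL)
        ++uiCount;              /* stop at the terminating NULL pointer */
    return uiCount;
}
```

Given an array holding "The", "the", and a terminating NULL (the two
candidate forms that ambiguous decapitalization might produce), this
helper returns 2.  A NULL array pointer yields 0, which mirrors the
`paWord != NULL' test in the example above.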

Source File
-----------

`anal.c'

removeFromAmpleDictionary
=========================

Syntax
------

     #include "ample.h"

     int removeFromAmpleDictionary(char *      pszMorphName_in,
                                   unsigned    eType_in,
                                   AmpleData * pAmple_io);

Description
-----------

`removeFromAmpleDictionary' frees the memory allocated to store a
morpheme in the AMPLE dictionary.  This removes both the morpheme data
structure and all associated allomorphs from the dictionary.  The
morpheme is identified by a combination of its morphname and its
morpheme type value.

The arguments to `removeFromAmpleDictionary' are as follows:

`pszMorphName_in'
     points to a morphname string.

`eType_in'
     is one of the following values:

    `AMPLE_PFX'
          signals a prefix.

    `AMPLE_IFX'
          signals an infix.

    `AMPLE_SFX'
          signals a suffix.

    `AMPLE_ROOT'
          signals a root.

`pAmple_io'
     points to the data structure that contains the current AMPLE
     language information, including the dictionary.

Return Value
------------

0 if successful, 1 if an error occurs.

Example
-------

     #include "ample.h"

     AmpleData sAmpleData_g;
     char      szCodesFilename_g[100];
     char      szDictFilename_g[100];
     ...
     loadAmpleDictCodeTables(szCodesFilename_g, &sAmpleData_g, TRUE);
     ...
     loadAmpleDictionary(szDictFilename_g, AMPLE_UNIFIED, &sAmpleData_g);
     ...
     removeFromAmpleDictionary( "NOT", AMPLE_PFX, &sAmpleData_g );

Source File
-----------

`setsd.c'

reportAmpleDictCodeTable
========================

Syntax
------

     #include "ample.h"

     void reportAmpleDictCodeTable(int         eType_in,
                                   AmpleData * pAmple_in);

Description
-----------

`reportAmpleDictCodeTable' displays the size and type of the given
AMPLE dictionary code table.

The arguments to `reportAmpleDictCodeTable' are as follows:

`eType_in'
     is one of the following values:

    `AMPLE_PFX'
          signals a prefix dictionary code table.

    `AMPLE_IFX'
          signals an infix dictionary code table.

    `AMPLE_SFX'
          signals a suffix dictionary code table.

    `AMPLE_ROOT'
          signals a root dictionary code table.

    `AMPLE_UNIFIED'
          signals a unified dictionary code table.
          (This is the same as `AMPLE_PFX | AMPLE_IFX | AMPLE_SFX |
          AMPLE_ROOT'.)

`pAmple_in'
     points to the data structure that contains the current AMPLE
     language information, including the dictionary code tables.

Return Value
------------

none

Example
-------

     #include "ample.h"

     AmpleData sAmpleData_g;
     char      szCodesFilename_g[100];
     char      szDictFilename_g[100];
     ...
     loadAmpleDictCodeTables(szCodesFilename_g, &sAmpleData_g, FALSE);
     ...
     reportAmpleDictCodeTable(AMPLE_PFX, &sAmpleData_g);
     loadAmpleDictionary(szDictFilename_g, AMPLE_PFX, &sAmpleData_g);

Source File
-----------

`loadtb.c'

resetAmpleData
==============

Syntax
------

     #include "ample.h"

     void resetAmpleData(AmpleData * pAmple_io);

Description
-----------

`resetAmpleData' frees the memory allocated for the AMPLE control data
and reestablishes the default values.  The data structure pointed to by
`pAmple_io' is not itself freed, but any memory pointed to by one of
its elements is freed and the pointer set to NULL.

`resetAmpleData' has only one argument:

`pAmple_io'
     points to the data structure that contains the current AMPLE
     language information.

Return Value
------------

none

Example
-------

     #include "ample.h"
     ...
     typedef char FileNameBuffer[200];

     AmpleData      sAmpleData_m;
     FileNameBuffer szAmpleControlFile_m;
     FileNameBuffer szDictCodeFile_m;
     FileNameBuffer szDictOrthoChgFile_m;
     FileNameBuffer szPrefixFile_m;
     FileNameBuffer szInfixFile_m;
     FileNameBuffer szSuffixFile_m;
     FileNameBuffer aszRootFiles_m[20];
     FileNameBuffer szTextCtlFile_m;
     int            iRootFilesCount_m;
     ...
     int i;
     ...
     if (loadAmpleControlFile(szAmpleControlFile_m, &sAmpleData_m) != 0)
         exit(1);
     if (loadAmpleDictCodeTables(szDictCodeFile_m,
                                 &sAmpleData_m, FALSE) != 0)
         exit(1);
     if (    szDictOrthoChgFile_m[0] &&
             (loadAmpleDictOrthoChanges(szDictOrthoChgFile_m,
                                        &sAmpleData_m) != 0))
         exit(1);
     if (    sAmpleData_m.iMaxPrefixCount &&
             (loadAmpleDictionary(szPrefixFile_m,
                                  AMPLE_PFX, &sAmpleData_m) != 0))
         exit(1);
     if (    sAmpleData_m.iMaxInfixCount &&
             (loadAmpleDictionary(szInfixFile_m,
                                  AMPLE_IFX, &sAmpleData_m) != 0))
         exit(1);
     if (    sAmpleData_m.iMaxSuffixCount &&
             (loadAmpleDictionary(szSuffixFile_m,
                                  AMPLE_SFX, &sAmpleData_m) != 0))
         exit(1);
     for ( i = 0 ; i < iRootFilesCount_m ; ++i )
     {
         if (loadAmpleDictionary(aszRootFiles_m[i],
                                 AMPLE_ROOT, &sAmpleData_m) != 0)
             exit(1);
     }
     if (loadIntxCtlFile(szTextCtlFile_m,
                         sAmpleData_m.cBeginComment,
                         &sAmpleData_m.sTextCtl) != 0)
         exit(1);
     ...
     /* process the data */
     ...
     resetAmpleData( &sAmpleData_m );

Source File
-----------

`analda.c'

terminateAmpleTrace
===================

Syntax
------

     #include "ample.h"

     void terminateAmpleTrace(const AmpleData * pAmple_in);

Description
-----------

`terminateAmpleTrace' writes the AMPLE trace end marker to the log
file.  If `pAmple_in->pLogFP' is `NULL', then nothing happens.  If
tracing output is wanted, then this function should be called after all
of the words have been analyzed.

`terminateAmpleTrace' has only one argument:

`pAmple_in'
     points to the data structure that contains the current AMPLE
     language information.

Return Value
------------

none

Example
-------

     #include "ample.h"
     ...
     static AmpleData sAmpleData_m;
     ...
     AmpleWord      sThisWord;
     WordTemplate * pWord;
     FILE *         pInputFP;
     FILE *         pOutputFP;
     char *         pszOutFilename;
     ...
     initiateAmpleTrace( &sAmpleData_m );

     while ((pWord = readTemplateFromText(pInputFP,
                                          &sAmpleData_m.sTextCtl)) != NULL)
     {
         sThisWord.pTemplate    = pWord;
         sThisWord.pHeadlists   = NULL;
         sThisWord.pszRemaining = NULL;
         sThisWord.uiAmbigCount = 0;
         sThisWord.bFoundRoot   = FALSE;
         if (sThisWord.pTemplate->paWord != NULL)
             performAmpleAnalysis(&sThisWord, NULL, NULL, &sAmpleData_m);
         writeTemplate( pOutputFP, pszOutFilename, sThisWord.pTemplate,
                        &sAmpleData_m.sTextCtl);
         eraseAmpleWord( &sThisWord );
     }

     terminateAmpleTrace( &sAmpleData_m );
     ...

Source File
-----------

`anal.c'

updateAmpleDictEntry
====================

Syntax
------

     #include "ample.h"

     int updateAmpleDictEntry(const char * pszEntry_in,
                              AmpleData *  pAmple_io);

Description
-----------

`updateAmpleDictEntry' adds this entry to the internal AMPLE
dictionary, first deleting any existing entry with the same morphname
and type.  If the dictionary codes for a unified dictionary do not
exist, the entry is assumed to use "AmpleLinks Canonical Format"
standard format markers.  These markers would look like this in an
AMPLE dictionary codes table file:

     \unified \lx
     \ch "\\a"         "A" | allomorph
     \ch "\\c"         "C" | category
     \ch "\\e"         "E" | "elsewhere" allomorph
     \ch "\\fd"        "F" | feature descriptors
     \ch "\\g"         "G" | gloss (used in analysis output)
     \ch "\\loc"       "L" | infix location
     \ch "\\mn"        "M" | morphname
     \ch "\\o"         "O" | order class
     \ch "\\mp"        "P" | morpheme properties
     \ch "\\entryType" "T" | dictionary entry type
     \ch "\\uf"        "U" | underlying form
     \ch "\\mcc"       "Z" | morpheme co-occurrence constraint
     \ch "\\no"        "!" | don't load
     \ch "\\lx"        "#" | lexicon entry number (not stored)

The arguments to `updateAmpleDictEntry' are as follows:

`pszEntry_in'
     points to a dictionary entry encoded as a `NUL'-terminated
     standard format record character string.

`pAmple_io'
     points to the data structure that contains the current AMPLE
     language information, including the dictionary.
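
Since `pszEntry_in' must be a single `NUL'-terminated string holding
every field of the record, the caller has to assemble it in a buffer of
fixed or known size.  The following is a minimal sketch of one way to
do that without overrunning the buffer.  `appendField' is a
hypothetical helper, not part of the AMPLE function library, and it
assumes a C99 `snprintf' is available.

```c
#include <stdio.h>
#include <string.h>

/*
 *  appendField() is a hypothetical helper for illustration only; it is
 *  not part of the AMPLE function library.  It appends one standard
 *  format field ("\marker value\n") to the NUL-terminated record in
 *  pszRecord_io, which occupies a buffer of uiSize_in bytes.
 *
 *  Returns 1 if the whole field fit, or 0 if it did not (in which case
 *  the record is left exactly as it was).
 */
static int appendField(char *       pszRecord_io,
                       size_t       uiSize_in,
                       const char * pszMarker_in,
                       const char * pszValue_in)
{
    size_t uiUsed = strlen(pszRecord_io);
    size_t uiLeft;
    int    iWritten;

    if (uiUsed >= uiSize_in)
        return 0;                       /* no room, not even for the NUL */
    uiLeft   = uiSize_in - uiUsed;
    iWritten = snprintf(pszRecord_io + uiUsed, uiLeft,
                        "\\%s %s\n", pszMarker_in, pszValue_in);
    if (iWritten < 0 || (size_t)iWritten >= uiLeft)
    {
        pszRecord_io[uiUsed] = '\0';    /* discard the truncated field */
        return 0;
    }
    return 1;
}
```

Starting from an empty string in a sufficiently large buffer, appending
the marker "lx" with value "update" and then "mn" with value "morph"
leaves the buffer holding the two lines `\lx update' and `\mn morph',
each ended by a newline.  Refusing a field that does not fit, rather
than storing a truncated one, keeps a partial record from being handed
to the dictionary.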

Return Value
------------

0 if an error occurs, 1 if an existing morpheme is replaced, or 2 if
this is a new entry

Example
-------

     #include <stdio.h>
     #include <string.h>
     #include "ample.h"
     ...
     static AmpleData sAmpleData_m;
     static char      szEntry_m[512];
     ...
     int    iStatus;
     ...
     /* strncat's count is the space remaining, not the buffer size */
     #define ENTRY_ROOM (sizeof(szEntry_m) - strlen(szEntry_m) - 1)

     szEntry_m[0] = '\0';
     strncat(szEntry_m, "\\lx update\n",      ENTRY_ROOM);
     strncat(szEntry_m, "\\entryType root\n", ENTRY_ROOM);
     strncat(szEntry_m, "\\mn morph\n",       ENTRY_ROOM);
     strncat(szEntry_m, "\\a morph\n",        ENTRY_ROOM);
     strncat(szEntry_m, "\\c N\n",            ENTRY_ROOM);
     strncat(szEntry_m, "\\uf morph\n",       ENTRY_ROOM);
     iStatus = updateAmpleDictEntry(szEntry_m, &sAmpleData_m);
     if (iStatus == 0)
         printf("Error while trying to update this entry:\n%s\n",
                szEntry_m);

Source File
-----------

`setsd.c'

writeAmpleDictionary
====================

Syntax
------

     #include "ample.h"

     void writeAmpleDictionary(const char * pszFilename_in,
                               AmpleData *  pAmple_in);

Description
-----------

`writeAmpleDictionary' writes the dictionary stored in memory to the
given file using the "AmpleLinks Canonical Format" standard format
markers.  These markers would look like this in an AMPLE dictionary
codes table file:

     \unified \lx
     \ch "\\a"         "A" | allomorph
     \ch "\\c"         "C" | category
     \ch "\\e"         "E" | "elsewhere" allomorph
     \ch "\\fd"        "F" | feature descriptors
     \ch "\\g"         "G" | gloss (used in analysis output)
     \ch "\\loc"       "L" | infix location
     \ch "\\mn"        "M" | morphname
     \ch "\\o"         "O" | order class
     \ch "\\mp"        "P" | morpheme properties
     \ch "\\entryType" "T" | dictionary entry type
     \ch "\\uf"        "U" | underlying form
     \ch "\\mcc"       "Z" | morpheme co-occurrence constraint
     \ch "\\no"        "!" | don't load
     \ch "\\lx"        "#" | lexicon entry number

The arguments to `writeAmpleDictionary' are as follows:

`pszFilename_in'
     points to an output dictionary filename.

`pAmple_in'
     points to the data structure that contains the current AMPLE
     language information, including the dictionary.

Return Value
------------

none

Example
-------

     #include "ample.h"
     ...
     static AmpleData sAmpleData_m;
     ...
     if (sAmpleData_m.iDebugLevel != 0)
         writeAmpleDictionary("test.dic", &sAmpleData_m);

Source File
-----------

`putsd.c'

writeAmpleTests
===============

Syntax
------

     #include "ample.h"

     void writeAmpleTests(const char * pszType_in,
                          AmpleData *  pAmple_in);

Description
-----------

`writeAmpleTests' writes a list of tests to the log file in the order
that they will be applied.  User defined tests are expanded to show the
internal parse trees; built-in tests are given by name.

The arguments to `writeAmpleTests' are as follows:

`pszType_in'
     points to a string describing the type of tests to write to the
     log file.  It must be one of the following:

        * `"Prefix"'

        * `"Infix"'

        * `"Root"'

        * `"Suffix"'

        * `"Final"'

`pAmple_in'
     points to the data structure that contains the current AMPLE
     language information, including the tests.

Return Value
------------

none

Example
-------

     #include <stdlib.h>     /* for exit() */
     #include "ample.h"
     ...
     static AmpleData sAmpleData_m;
     static int       bVerify_m;
     ...
     char * pszFilename;
     ...
     if (loadAmpleControlFile(pszFilename, &sAmpleData_m) != 0)
     {
         ...
         exit(1);
     }
     if (bVerify_m)
     {
         writeAmpleTests("Prefix", &sAmpleData_m);
         writeAmpleTests("Infix",  &sAmpleData_m);
         writeAmpleTests("Root",   &sAmpleData_m);
         writeAmpleTests("Suffix", &sAmpleData_m);
         writeAmpleTests("Final",  &sAmpleData_m);
     }
     ...

Source File
-----------

`writests.c'

Bibliography
************

  1. McConnel, Stephen.  2000.  `AMPLE Reference Manual'.  SIL
     International.

  2. McConnel, Stephen.  2000.  `OPAC Function Library Reference
     Manual'.  SIL International.

  3. Weber, David J., H. Andrew Black, and Stephen R. McConnel.  1988.
     `AMPLE: a tool for exploring morphology'.  Occasional Publications
     in Academic Computing No. 12.  Dallas, TX: Summer Institute of
     Linguistics.

  4. Weber, David J., H. Andrew Black, Stephen R. McConnel, and Alan
     Buseman.  1990.  `STAMP: a tool for dialect adaptation'.
     Occasional Publications in Academic Computing No. 15.  Dallas, TX:
     Summer Institute of Linguistics.

Index
*****

addAmpleSelectiveAnalItem:    See ``addAmpleSelectiveAnalItem''.
cAmplePatchSep_g:             See ``cAmplePatchSep_g''.
checkAmpleMorphs:             See ``checkAmpleMorphs''.
eraseAmpleWord:               See ``eraseAmpleWord''.
findAmplePropertyName:        See ``findAmplePropertyName''.
findAmplePropertyNumber:      See ``findAmplePropertyNumber''.
freeAmpleDictionary:          See ``freeAmpleDictionary''.
freeAmpleSelectiveAnalInfo:   See ``freeAmpleSelectiveAnalInfo''.
hasAmpleProperty:             See ``hasAmpleProperty''.
iAmplePatchlevel_g:           See ``iAmplePatchlevel_g''.
iAmpleRevision_g:             See ``iAmpleRevision_g''.
iAmpleVersion_g:              See ``iAmpleVersion_g''.
initAmpleMorphChecking:       See ``initAmpleMorphChecking''.
initiateAmpleTrace:           See ``initiateAmpleTrace''.
isAmpleAllomorphProperty:     See ``isAmpleAllomorphProperty''.
isAmpleMorphemeProperty:      See ``isAmpleMorphemeProperty''.
loadAmpleControlFile:         See ``loadAmpleControlFile''.
loadAmpleDictCodeTables:      See ``loadAmpleDictCodeTables''.
loadAmpleDictionary:          See ``loadAmpleDictionary''.
loadAmpleDictOrthoChanges:    See ``loadAmpleDictOrthoChanges''.
loadAmpleSelectiveAnalFile:   See ``loadAmpleSelectiveAnalFile''.
performAmpleAnalysis:         See ``performAmpleAnalysis''.
pszAmpleCompileDate_g:        See ``pszAmpleCompileDate_g''.
pszAmpleCompileFormat_g:      See ``pszAmpleCompileFormat_g''.
pszAmpleCompileTime_g:        See ``pszAmpleCompileTime_g''.
pszAmpleDate_g:               See ``pszAmpleDate_g''.
pszAmpleTestVersion_g:        See ``pszAmpleTestVersion_g''.
pszAmpleYear_g:               See ``pszAmpleYear_g''.
removeFromAmpleDictionary:    See ``removeFromAmpleDictionary''.
reportAmpleDictCodeTable:     See ``reportAmpleDictCodeTable''.
resetAmpleData:               See ``resetAmpleData''.
terminateAmpleTrace:          See ``terminateAmpleTrace''.
updateAmpleDictEntry:         See ``updateAmpleDictEntry''.
writeAmpleDictionary:         See ``writeAmpleDictionary''.
writeAmpleTests:              See ``writeAmpleTests''.

Table of Contents
*****************

Introduction to the AMPLE function library
Variable and function naming conventions
  Preprocessor macro names
  Data structure names
  Variable names
  Function names
  Examples
AMPLE data structures
  AmpleData
  AmpleWord
AMPLE global variables
  cAmplePatchSep_g
  iAmplePatchlevel_g
  iAmpleRevision_g
  iAmpleVersion_g
  pszAmpleCompileDate_g
  pszAmpleCompileFormat_g
  pszAmpleCompileTime_g
  pszAmpleDate_g
  pszAmpleTestVersion_g
  pszAmpleYear_g
AMPLE functions
  addAmpleSelectiveAnalItem
  checkAmpleMorphs
  eraseAmpleWord
  findAmplePropertyName
  findAmplePropertyNumber
  freeAmpleDictionary
  freeAmpleSelectiveAnalInfo
  hasAmpleProperty
  initAmpleMorphChecking
  initiateAmpleTrace
  isAmpleAllomorphProperty
  isAmpleMorphemeProperty
  loadAmpleControlFile
  loadAmpleDictCodeTables
  loadAmpleDictionary
  loadAmpleDictOrthoChanges
  loadAmpleSelectiveAnalFile
  performAmpleAnalysis
  removeFromAmpleDictionary
  reportAmpleDictCodeTable
  resetAmpleData
  terminateAmpleTrace
  updateAmpleDictEntry
  writeAmpleDictionary
  writeAmpleTests
Bibliography
Index