OPAC Function Library Reference Manual

functions for linguistic data processing

July 1998

by Stephen McConnel

1. Introduction to the OPAC function library

This document describes a library of data structures and functions developed over the years for programs in the Occasional Publications in Academic Computing series published by SIL International. (For SIL International, "academic" refers to linguistics, literacy, anthropology, translation, and related fields.) It is hoped that this documentation will make future maintenance of these programs easier.

2. Variable and function naming conventions

The basic goal behind choosing names in the OPAC function library is for the name to convey information about what it represents. This is achieved in two ways: striving for a descriptive name rather than a short cryptic abbreviated name, and following a different pattern of capitalization for each type of name.

2.1 Preprocessor macro names

Preprocessor macro names are written entirely in capital letters. If the name requires more than one word for an adequate description, the words are joined together with intervening underscore (_) characters.

2.2 Data structure names

Data structure names consist of one or more capitalized words. If the name requires more than one word for an adequate description, the words are joined together without underscores, depending on the capitalization pattern to make them readable as separate words.

2.3 Variable names

Variable names in the OPAC function library follow a modified form of the Hungarian naming convention described by Steve McConnell in his book Code Complete on pages 202-206.

Variable names have three parts: a lowercase type prefix, a descriptive name, and a scope suffix.

2.3.1 Type prefix

The type prefix has the following basic possibilities:

b: a Boolean, usually encoded as a char, short, or int
c: a character, usually a char but sometimes a short or int
d: a double precision floating point number, that is, a double
e: an enumeration, encoded as an enum or as a char, short, or int
i: an integer, that is, an int, short, long, or (rarely) char
s: a data structure defined by a struct statement
sz: a NUL (that is, zero) terminated character string
pf: a pointer to a function

In addition, the basic types may be prefixed by these qualifiers:

u: indicates that an integer or a character is unsigned
a: indicates an array of the basic type
p: indicates a pointer to the type, possibly a pointer to an array or to a pointer

2.3.2 Descriptive name

The descriptive name portion of a variable name consists of one or more capitalized words concatenated together. There are no underscores (_) separating these words from each other, or from the type prefix. For the OPAC function library, the descriptive name for global variables may begin with the name of the most relevant data structure, if any.

2.3.3 Scope suffix

The scope suffix has these possibilities:

_g: indicates a global variable accessible throughout the program
_m: indicates a module (semiglobal) variable accessible throughout the file (declared static)
_in: indicates a function argument used for input
_out: indicates a function argument used for output (must be a pointer)
_io: indicates a function argument used for both input and output (must be a pointer)
_s: indicates a function variable that retains its value between calls (declared static)

The lack of a scope suffix indicates that a variable is declared within a function and exists on the stack for the duration of the current call.

2.4 Function names

Global function names in the OPAC function library have two parts: a verb that is all lowercase followed by a noun phrase containing one or more capitalized words. These pieces are concatenated without any intervening underscores (_). For the OPAC library functions, the noun phrase section includes the name of the most relevant data structure, if any.

2.5 Examples

Given the discussion above, it is easy to discern at a glance what type of item each of the following names refers to.

SAMPLE_NAME: is a preprocessor macro.
SampleName: is a data structure.
pSampleName: is a local pointer variable.
writeSampleName: is a function (that may apply to a data structure named SampleName).
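
As a minimal sketch of how these conventions combine in practice, the following fragment applies each pattern once. Every name in it (MAX_SAMPLE_COUNT, SampleName, countSampleNames, and the variables) is hypothetical and is not part of the OPAC function library.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SAMPLE_COUNT 100            /* preprocessor macro */

typedef struct sample_name {            /* data structure */
    const char *         pszName;       /* psz: pointer to a NUL-terminated string */
    struct sample_name * pNext;         /* p: pointer */
    } SampleName;

unsigned        uiSampleTotal_g = 0;                    /* _g: global */
static unsigned uiSampleLimit_m = MAX_SAMPLE_COUNT;     /* _m: module (static) */

/* verb (count) + noun phrase (SampleNames): a hypothetical function */
unsigned countSampleNames(const SampleName * pList_in,
                          unsigned *         puiCallCount_out)
{
static unsigned    uiCallCount_s = 0;   /* _s: retains its value between calls */
const SampleName * pSample;             /* no suffix: ordinary local variable */
unsigned           uiCount = 0;

assert(puiCallCount_out != NULL);
for ( pSample = pList_in ; pSample ; pSample = pSample->pNext )
    {
    if (uiCount >= uiSampleLimit_m)
        break;
    ++uiCount;
    }
uiSampleTotal_g  += uiCount;
*puiCallCount_out = ++uiCallCount_s;
return uiCount;
}
```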

3. The OPAC function library data structures

This chapter describes the data structures defined for the OPAC function library. These include both general purpose data collection structures and specialized linguistic processing data structures. For each data structure that the library provides, this information includes which header files to include in your source to obtain its definition.

3.1 CaselessLetter

3.1.1 Definition

#include "textctl.h"    /* or template.h or opaclib.h */

typedef struct caseless_letter {
    unsigned char *          pszLetter;
    struct caseless_letter * pNext;
    } CaselessLetter;

3.1.2 Description

The CaselessLetter data structure is normally used only inside a TextControl data structure. It stores a multibyte character string that represents a single caseless letter.

The fields of the CaselessLetter data structure are as follows:

pszLetter: points to a caseless multigraph character string. This string is one or more characters (bytes) long, and is terminated by a NUL byte.
pNext: is a pointer to facilitate keeping a list of caseless letters.

3.1.3 Source File

`textctl.h'

3.2 Change

3.2.1 Definition

#include "change.h"     /* or textctl.h or template.h or opaclib.h */

typedef struct change_list {
    char *               pszMatch;
    char *               pszReplace;
    ChangeEnvironment *  pEnvironment;
    char *               pszDescription;
    struct change_list * pNext;
    } Change;

3.2.2 Description

A Change data structure stores a single "consistent change" to apply to character strings. Such consistent changes are usually used as ordered lists of changes rather than being applied in isolation here and there.

The fields of the Change data structure are as follows:

pszMatch: points to the substring to match in the original string.
pszReplace: points to the string with which to replace matched substrings in the output.
pEnvironment: points to the list of alternative environments (if any) for this change. See section 3.3 ChangeEnvironment.
pszDescription: points to an optional comment string that describes this change.
pNext: is a pointer to facilitate keeping an ordered list of changes.
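
The following is a minimal sketch of how an ordered list of Change structures might be applied to a string. It is not the library's implementation: applySimpleChangesSketch is a hypothetical helper, it ignores pEnvironment entirely, and it assumes that at each input position the first change in the list whose pszMatch fits is the one applied. The structure is redeclared locally so that the fragment stands alone.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct chg_envir ChangeEnvironment;     /* left opaque in this sketch */

typedef struct change_list {
    char *               pszMatch;
    char *               pszReplace;
    ChangeEnvironment *  pEnvironment;
    char *               pszDescription;
    struct change_list * pNext;
    } Change;

/* hypothetical helper: returns the length of the string written */
size_t applySimpleChangesSketch(const Change * pChanges_in,
                                const char *   pszInput_in,
                                char *         pszOutput_out,
                                size_t         uiOutputSize_in)
{
const Change *  pChg;
size_t          uiOut = 0;
size_t          uiLen = 0;

assert(pszInput_in != NULL && pszOutput_out != NULL && uiOutputSize_in > 0);
while (*pszInput_in != '\0')
    {
    /* find the first change in the ordered list that matches here */
    for ( pChg = pChanges_in ; pChg ; pChg = pChg->pNext )
        {
        uiLen = strlen(pChg->pszMatch);
        if ((uiLen > 0) && (strncmp(pszInput_in, pChg->pszMatch, uiLen) == 0))
            break;
        }
    if (pChg != NULL)
        {
        size_t uiRepLen = strlen(pChg->pszReplace);
        if (uiOut + uiRepLen >= uiOutputSize_in)
            break;                      /* output buffer full: stop early */
        memcpy(pszOutput_out + uiOut, pChg->pszReplace, uiRepLen);
        uiOut       += uiRepLen;
        pszInput_in += uiLen;           /* skip the matched substring */
        }
    else
        {
        if (uiOut + 1 >= uiOutputSize_in)
            break;
        pszOutput_out[uiOut++] = *pszInput_in++;   /* copy unchanged */
        }
    }
pszOutput_out[uiOut] = '\0';
return uiOut;
}
```

With the ordered changes "ch" to "c" followed by "c" to "k", this sketch turns "chic" into "cik": because "ch" is listed first, it wins over "c" at the start of the word.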

3.2.3 Source File

`change.h'

3.3 ChangeEnvironment

3.3.1 Definition

#include "change.h"     /* or textctl.h or template.h or opaclib.h */

typedef struct chg_envir {
    short               bNot;
    ChgEnvItem *        pLeftEnv;
    ChgEnvItem *        pRightEnv;
    struct chg_envir *  pNext;
    } ChangeEnvironment;

3.3.2 Description

The ChangeEnvironment data structure is normally used only inside a Change data structure.

The fields of the ChangeEnvironment data structure are as follows:

bNot: indicates the negation of this environment.
pLeftEnv: points to the environment to the left of the matched substring.
pRightEnv: points to the environment to the right of the matched substring.
pNext: points to the next alternative constraint.

3.3.3 Source File

`change.h'

3.4 ChgEnvItem

3.4.1 Definition

#include "change.h"     /* or textctl.h or template.h or opaclib.h */

typedef struct chg_env_item {
    char                                iFlags;
    union { char *        pszString;
            StringClass *  pClass;   } u;
    struct chg_env_item *               pNext;
    } ChgEnvItem;

3.4.2 Description

The ChgEnvItem data structure is normally used only inside a ChangeEnvironment data structure, which is normally used only inside a Change data structure.

The fields of the ChgEnvItem data structure are as follows:

iFlags & E_NOT: signals that this item is not wanted.
iFlags & E_CLASS: signals that this item refers to a class of strings instead of a literal string.
iFlags & E_ELLIPSIS: signals that this item may possibly not be contiguous.
iFlags & E_OPTIONAL: signals that this item is optional.
u.pszString: points to a literal string if iFlags & E_CLASS is 0.
u.pClass: points to a StringClass data structure if iFlags & E_CLASS is not 0. See section 3.8 StringClass.
pNext: points to the next item in the environment, if any.

3.4.3 Source File

`change.h'

3.5 CodeTable

3.5.1 Definition

#include "record.h"     /* or opaclib.h */

typedef struct {
    char *      pCodeTable;
    unsigned    uiCodeCount;
    char *      pszFirstCode;
    } CodeTable;

3.5.2 Description

The CodeTable data structure is used to map between the field codes used in a standard format file and single characters used in case labels inside switch statements in C code.

The fields of the CodeTable data structure are as follows:

pCodeTable: points to a primitive change string such as "match1\0A\0match2\0B\0". Note that the replacement strings are assumed to be single characters.
uiCodeCount: is the number of entries (match strings with replacement characters) in pCodeTable.
pszFirstCode: points to the record marker string, that is, the field code that marks the beginning of a record in the input file. This would usually be one of the match strings embedded in pCodeTable.
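
The layout of pCodeTable can be illustrated with a short sketch that walks the packed string. findFieldCodeSketch is a hypothetical helper, not a library function; it assumes only what is stated above, namely that each entry is a NUL-terminated match string followed by a NUL-terminated single-character replacement.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    char *      pCodeTable;
    unsigned    uiCodeCount;
    char *      pszFirstCode;
    } CodeTable;

/* hypothetical helper: returns the single-character code for a field
   marker, or 0 if the marker is not in the table */
int findFieldCodeSketch(const CodeTable * pTable_in,
                        const char *      pszMarker_in)
{
const char *    p;
unsigned        i;

assert(pTable_in != NULL && pszMarker_in != NULL);
p = pTable_in->pCodeTable;
for ( i = 0 ; i < pTable_in->uiCodeCount ; ++i )
    {
    const char * pszMatch   = p;
    const char * pszReplace = pszMatch + strlen(pszMatch) + 1;

    if (strcmp(pszMatch, pszMarker_in) == 0)
        return (unsigned char)*pszReplace;
    p = pszReplace + strlen(pszReplace) + 1;    /* step to the next entry */
    }
return 0;
}
```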

3.5.3 Source File

`record.h'

3.6 LowerLetter

3.6.1 Definition

#include "textctl.h"    /* or template.h or opaclib.h */

typedef struct lower_letter {
    unsigned char *       pszLower;
    StringList *          pUpperList;
    struct lower_letter * pNext;
    } LowerLetter;

3.6.2 Description

The LowerLetter data structure is normally used only inside a TextControl data structure. It stores a multibyte character string that represents a single lowercase letter. It also stores a list of the corresponding uppercase multigraph character strings.

The fields of the LowerLetter data structure are as follows:

pszLower: points to a lowercase multigraph character string. This string is one or more characters (bytes) long, and is terminated by a NUL byte.
pUpperList: points to a list of uppercase multigraph character strings. This list has at least one element, but may have any number of elements if the orthography is ambiguous about converting from lowercase to uppercase forms. (This is quite unlikely, but allowed by this software.)
pNext: is a pointer to facilitate keeping a list of lowercase letters.
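
The sketch below shows one way a list of LowerLetter structures might be searched for the multigraph at the start of a string. findLowerLetterSketch is a hypothetical helper. It relies on the list being sorted by decreasing length of pszLower (as section 3.10 TextControl describes for pLowercaseLetters), so that the first entry that matches is also the longest one. The structures are redeclared locally so that the fragment stands alone.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct strlist {
    char *              pszString;
    struct strlist *    pNext;
    } StringList;

typedef struct lower_letter {
    unsigned char *       pszLower;
    StringList *          pUpperList;
    struct lower_letter * pNext;
    } LowerLetter;

/* hypothetical helper: returns the first (longest) lowercase multigraph
   that begins pszText_in, or NULL if none does */
const LowerLetter * findLowerLetterSketch(const LowerLetter * pLetters_in,
                                          const char *        pszText_in)
{
const LowerLetter * pLetter;
size_t              uiLen;

assert(pszText_in != NULL);
for ( pLetter = pLetters_in ; pLetter ; pLetter = pLetter->pNext )
    {
    uiLen = strlen((const char *)pLetter->pszLower);
    if (    (uiLen > 0) &&
            (strncmp(pszText_in, (const char *)pLetter->pszLower, uiLen) == 0) )
        return pLetter;
    }
return NULL;
}
```

Given a list holding "ch" and then "c", the text "chat" matches the "ch" entry and "cat" matches the "c" entry; the uppercase alternatives are then available through pUpperList.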

3.6.3 Source File

`textctl.h'

3.7 NumberedMessage

3.7.1 Definition

#include "rpterror.h"   /* or opaclib.h */

typedef struct {
    int         eType;
    unsigned    uiNumber;
    char *      pszMessage;
    } NumberedMessage;

3.7.2 Description

The NumberedMessage data structure stores the information for a single numbered error or warning message. This is the style of error reporting used by the PC-Kimmo and PC-PATR programs.

The fields of the NumberedMessage data structure are as follows:

eType

is the type of message, one of these symbolic constants:

ERROR_MSG: is a severe error that aborts the procedure.
WARNING_MSG: is a minor error that the user should be made aware of.
DEBUG_MSG: is a debugging message intended for the programmer.

uiNumber

is the (unique) message number.

pszMessage

is a printf style format string for the message.

3.7.3 Source File

`rpterror.h'

3.8 StringClass

3.8.1 Definition

#include "strclass.h"   /* or change.h or textctl.h or template.h
                           or opaclib.h */

typedef struct string_class {
    char *                pszName;
    StringList *          pMembers;
    struct string_class * pNext;
    } StringClass;

3.8.2 Description

The StringClass data structure stores a labeled set of strings. The intention is that any one of the set of strings may be used in a matching operation.

The fields of the StringClass data structure are as follows:

pszName: points to the name of the string class.
pMembers: points to the list of members of the string class. See section 3.9 StringList.
pNext: is a pointer to facilitate keeping a list of string classes.
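
As a sketch of the matching operation that this structure is intended to support, the two hypothetical helpers below look up a class by name and test whether a string is one of its members. Neither function is part of the library; the structures are redeclared locally so that the fragment stands alone.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct strlist {
    char *              pszString;
    struct strlist *    pNext;
    } StringList;

typedef struct string_class {
    char *                pszName;
    StringList *          pMembers;
    struct string_class * pNext;
    } StringClass;

/* hypothetical helper: finds the class with the given name, or NULL */
const StringClass * findStringClassSketch(const StringClass * pClasses_in,
                                          const char *        pszName_in)
{
const StringClass * pClass;

assert(pszName_in != NULL);
for ( pClass = pClasses_in ; pClass ; pClass = pClass->pNext )
    {
    if (strcmp(pClass->pszName, pszName_in) == 0)
        return pClass;
    }
return NULL;
}

/* hypothetical helper: returns 1 if the string is a member of the class */
int isStringClassMemberSketch(const StringClass * pClass_in,
                              const char *        pszString_in)
{
const StringList *  pMember;

assert(pClass_in != NULL && pszString_in != NULL);
for ( pMember = pClass_in->pMembers ; pMember ; pMember = pMember->pNext )
    {
    if (strcmp(pMember->pszString, pszString_in) == 0)
        return 1;
    }
return 0;
}
```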

3.8.3 Source File

`strclass.h'

3.9 StringList

3.9.1 Definition

#include "strlist.h"    /* or strclass.h or change.h or textctl.h
                           or template.h or opaclib.h */

typedef struct strlist
    {
    char *              pszString;
    struct strlist *    pNext;
    } StringList;

3.9.2 Description

The StringList data structure is used to store a collection of character strings. This collection may be a set (no duplicate strings), an ordered list, or an unordered list, depending on how the programmer adds strings to the list.

The fields of the StringList data structure are as follows:

pszString: points to a stored string.
pNext: points to the next string in the list of strings.

This is one of the most commonly used data structures in the OPAC function library.
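
Whether a StringList behaves as a set or as a list depends only on how strings are added. The sketch below shows the set case: a string is added at the head only if it is not already present. All three helpers are hypothetical, and they call malloc directly so that the fragment stands alone; they do not reproduce the library's own list functions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct strlist {
    char *              pszString;
    struct strlist *    pNext;
    } StringList;

/* hypothetical helper: adds a copy of the string unless it is already
   stored; returns the (possibly new) head of the list */
StringList * addUniqueStringSketch(StringList * pList_in,
                                   const char * pszString_in)
{
StringList *    pItem;

assert(pszString_in != NULL);
for ( pItem = pList_in ; pItem ; pItem = pItem->pNext )
    {
    if (strcmp(pItem->pszString, pszString_in) == 0)
        return pList_in;                /* already present: no duplicate */
    }
pItem = malloc(sizeof *pItem);
if (pItem == NULL)
    return pList_in;                    /* out of memory: list unchanged */
pItem->pszString = malloc(strlen(pszString_in) + 1);
if (pItem->pszString == NULL)
    {
    free(pItem);
    return pList_in;
    }
strcpy(pItem->pszString, pszString_in);
pItem->pNext = pList_in;
return pItem;
}

/* hypothetical helper: counts the strings in the list */
unsigned countStringsSketch(const StringList * pList_in)
{
unsigned    uiCount = 0;

for ( ; pList_in ; pList_in = pList_in->pNext )
    ++uiCount;
return uiCount;
}

/* hypothetical helper: frees every node and its string */
void freeStringListSketch(StringList * pList_in)
{
StringList *    pNextItem;

while (pList_in != NULL)
    {
    pNextItem = pList_in->pNext;
    free(pList_in->pszString);
    free(pList_in);
    pList_in = pNextItem;
    }
}
```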

3.9.3 Source File

`strlist.h'

3.10 TextControl

3.10.1 Definition

#include "textctl.h"    /* or template.h or opaclib.h */

typedef struct text_control {
    char *              pszTextControlFile;
    LowerLetter *       pLowercaseLetters;
    UpperLetter *       pUppercaseLetters;
    CaselessLetter *    pCaselessLetters;
    Change *            pOrthoChanges;
    Change *            pOutputChanges;
    StringList *        pIncludeFields;
    StringList *        pExcludeFields;
    unsigned char       cFormatMark;
    unsigned char       cAmbig;
    unsigned char       cDecomp;
    unsigned char       cBarMark;
    unsigned char *     pszBarCodes;
    char                bIndividualCapitalize;
    char                bCapitalize;
    unsigned            uiMaxAmbigDecap;
    } TextControl;

3.10.2 Description

The TextControl data structure is used to control reading a text file into a (sequence of) WordTemplate data structure(s), or writing a (sequence of) WordTemplate data structure(s) to a text file.

The fields of the TextControl data structure are as follows:

pszTextControlFile: points to the name of the file that the data is loaded from.
pLowercaseLetters: points to a list of lowercase word formation character multigraphs, each of which has a list of one or more corresponding uppercase multigraphs. This list is sorted by decreasing length of the lowercase multigraph string. See section 3.6 LowerLetter.
pUppercaseLetters: points to a list of uppercase word formation character multigraphs, each of which has a list of one or more corresponding lowercase multigraphs. This list is sorted by decreasing length of the uppercase multigraph string. See section 3.12 UpperLetter.
pCaselessLetters: points to a list of word formation character multigraphs that do not have distinct lowercase and uppercase forms. This list is sorted by decreasing length of the multigraph string. See section 3.1 CaselessLetter.
pOrthoChanges: points to an ordered list of input orthography changes. See section 3.2 Change.
pOutputChanges: points to an ordered list of output (orthography) changes. See section 3.2 Change.
pIncludeFields: points to a list of format markers (fields) to include. See section 3.9 StringList.
pExcludeFields: points to a list of format markers (fields) to exclude. See section 3.9 StringList.
cFormatMark: is the initial character of format markers.
cAmbig: is the character for marking ambiguities and failures.
cDecomp: is the character for marking decomposition of words into morphemes.
cBarMark: is the initial character of secondary format markers.
pszBarCodes: points to a string of characters for secondary format markers.
bIndividualCapitalize: flags whether or not to capitalize individual letters within words.
bCapitalize: flags whether or not to decapitalize (recapitalize) words.
uiMaxAmbigDecap: is the maximum number of ambiguous decapitalizations allowed.

3.10.3 Source File

`textctl.h'

3.11 Trie

3.11.1 Definition

#include "trie.h"       /* or opaclib.h */

typedef struct s__trienode
    {
    unsigned char        cLetter;
    struct s__trienode * pChildren;
    struct s__trienode * pSiblings;
    void *               pTrieInfo;
    } Trie;

3.11.2 Description

A trie is a data structure designed for relatively fast insertion and relatively fast retrieval of information referenced by a "key" string. See Knuth 1973, pages 481-505, for an extended treatment of tries.

The fields of the Trie data structure are as follows:

cLetter: is the letter (key character) at this node.
pChildren: points to the children Trie nodes, those that have cLetter in their key at this point.
pSiblings: points to the sibling Trie nodes, those that have an alternative to cLetter in their key at this point.
pTrieInfo: points to the stored information, which may be a linked list, an array, or anything the programmer desires.
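
The relationship between pChildren and pSiblings can be illustrated with a lookup sketch. findTrieInfoSketch is a hypothetical helper, not a library function. It assumes that the pSiblings chain holds the alternatives for the current key position and that pChildren leads to the nodes for the next position; it also ignores the depth limit that addDataToTrie accepts, so it only finds keys that are stored one node per character.

```c
#include <assert.h>
#include <stddef.h>

typedef struct s__trienode
    {
    unsigned char        cLetter;
    struct s__trienode * pChildren;
    struct s__trienode * pSiblings;
    void *               pTrieInfo;
    } Trie;

/* hypothetical helper: returns the information stored for the key,
   or NULL if the key is not in the trie */
void * findTrieInfoSketch(const Trie * pTrieHead_in, const char * pszKey_in)
{
const Trie *            pNode = pTrieHead_in;
const unsigned char *   p;

assert(pszKey_in != NULL);
p = (const unsigned char *)pszKey_in;
if (*p == '\0')
    return NULL;
while (pNode != NULL)
    {
    /* scan the alternatives at this key position */
    while ((pNode != NULL) && (pNode->cLetter != *p))
        pNode = pNode->pSiblings;
    if (pNode == NULL)
        return NULL;                    /* no node for this letter */
    ++p;
    if (*p == '\0')
        return pNode->pTrieInfo;        /* end of key: may be NULL */
    pNode = pNode->pChildren;           /* move to the next key position */
    }
return NULL;
}
```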

3.11.3 Source File

`trie.h'

3.12 UpperLetter

3.12.1 Definition

#include "textctl.h"    /* or template.h or opaclib.h */

typedef struct upper_letter {
    unsigned char *       pszUpper;
    StringList *          pLowerList;
    struct upper_letter * pNext;
    } UpperLetter;

3.12.2 Description

The UpperLetter data structure is normally used only inside a TextControl data structure. It stores a multibyte character string that represents a single uppercase letter. It also stores a list of the corresponding lowercase multigraph character strings.

The fields of the UpperLetter data structure are as follows:

pszUpper: points to an uppercase multigraph character string. This string is one or more characters (bytes) long, and is terminated by a NUL byte.
pLowerList: points to a list of lowercase multigraph character strings. This list has at least one element, but may have any number of elements if the orthography is ambiguous about converting from uppercase to lowercase forms.
pNext: is a pointer to facilitate keeping a list of uppercase letters.

Application programmers should not need to use this data structure directly, as its only use is for a list embedded in the TextControl data structure.

3.12.3 Source File

`textctl.h'

3.13 WordAnalysis

3.13.1 Definition

#include "template.h"   /* or opaclib.h */

typedef struct word_analysis {
    char *                 pszAnalysis;
    char *                 pszDecomposition;
    char *                 pszCategory;
    char *                 pszProperties;
    char *                 pszFeatures;
    char *                 pszUnderlyingForm;
    char *                 pszSurfaceForm;
    struct word_analysis * pNext;
    } WordAnalysis;

3.13.2 Description

The WordAnalysis data structure is normally used as part of a WordTemplate data structure to record the result of morphological analysis.

The fields of the WordAnalysis data structure are as follows:

pszAnalysis: points to an analysis (morphname) string.
pszDecomposition: points to the surface form of the word, hyphenated to show morpheme breaks. The "hyphen" character is typically the one given by the cDecomp field of a TextControl data structure.
pszCategory: points to the probable word category, possibly followed by morpheme categories. Categories within a morpheme are separated by spaces, and morphemes are separated by equal signs (=).
pszProperties: points to the morpheme properties, if any. Properties within a morpheme are separated by spaces, and morphemes are separated by equal signs (=).
pszFeatures: points to the morpheme features, if any. Features within a morpheme are separated by spaces, and morphemes are separated by equal signs (=).
pszUnderlyingForm: points to the underlying morpheme forms, separated by the character given by the cDecomp field of a TextControl data structure.
pszSurfaceForm: points to the wordform after decapitalization and orthography changes.
pNext: is a pointer to facilitate keeping a list of alternative analyses.

3.13.3 Source File

`template.h'

3.14 WordTemplate

3.14.1 Definition

#include "template.h"   /* or opaclib.h */

typedef struct {
    char *          pszFormat;
    char *          pszOrigWord;
    char **         paWord;
    char *          pszNonAlpha;
    short           iCapital;
    short           iOutputFlags;
    WordAnalysis *  pAnalyses;
    StringList *    pNewWords;
    } WordTemplate;

3.14.2 Description

The WordTemplate data structure is used to hold a single word for processing, with the original capitalization and punctuation preserved for restoration on output.

The fields of the WordTemplate data structure are as follows:

pszFormat

points to a string that contains any "formatting" (non-word) information prior to the word.

pszOrigWord

points to a string containing the original input word.

paWord

points to a NULL-terminated array of alternative surface forms after decapitalization and orthography changes.

pszNonAlpha

points to a string containing any "formatting" (non-word) information following the word.

iCapital

is a capitalization flag with one of the following values:

NOCAP: indicates that there are no uppercase letters in the word.
INITCAP: indicates that only the first letter of the word (that can be uppercase) is uppercase.
ALLCAP: indicates that there are no lowercase letters in the word, and two or more uppercase letters.
4-65535: indicates that the word is "mixed case", not describable by one of the standard three values. The number can be interpreted as a bit vector, where 4 is the first letter being capitalized, 8 is the second letter being capitalized, and so on. This scheme handles only the first 14 characters of the word.

iOutputFlags & WANT_DECOMPOSITION

causes the decomposition fields (pAnalyses->pszDecomposition) to be written to an output file if set (nonzero).

iOutputFlags & WANT_CATEGORY

causes the category fields (pAnalyses->pszCategory) to be written to an output file if set.

iOutputFlags & WANT_PROPERTIES

causes the property fields (pAnalyses->pszProperties) to be written to an output file if set.

iOutputFlags & WANT_FEATURES

causes the feature descriptor fields (pAnalyses->pszFeatures) to be written to an output file if set.

iOutputFlags & WANT_UNDERLYING

causes the underlying form fields (pAnalyses->pszUnderlyingForm) to be written to an output file if set.

iOutputFlags & WANT_ORIGINAL

causes the original word (pszOrigWord) to be written to an output file if set.

pAnalyses

points to a list of morphological parses produced by analysis functions, and possibly modified by transfer functions. See section 3.13 WordAnalysis.

pNewWords

points to a list of wordforms created by synthesis functions. See section 3.9 StringList.
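
The bit vector described above for mixed case values of iCapital can be sketched as follows. Both helpers are hypothetical. They treat each byte as one letter and use the C library's isupper, whereas the library itself works with the multigraphs defined in a TextControl structure, so this only illustrates the bit layout (4 for the first letter, 8 for the second, and so on for 14 letters).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define MAX_CAPITAL_LETTERS 14          /* letters covered by bits 4..32768 */

/* hypothetical helper: builds the mixed case bit vector for a word */
unsigned encodeMixedCaseSketch(const char * pszWord_in)
{
unsigned    uiFlags = 0;
unsigned    i;

assert(pszWord_in != NULL);
for ( i = 0 ; (i < MAX_CAPITAL_LETTERS) && (pszWord_in[i] != '\0') ; ++i )
    {
    if (isupper((unsigned char)pszWord_in[i]))
        uiFlags |= 4u << i;             /* 4 = first letter, 8 = second, ... */
    }
return uiFlags;
}

/* hypothetical helper: tests whether a given letter (counting from 0)
   is marked as capitalized in a mixed case value */
int isLetterCapitalizedSketch(unsigned uiCapital_in, unsigned uiIndex_in)
{
if (uiIndex_in >= MAX_CAPITAL_LETTERS)
    return 0;                           /* beyond what the scheme records */
return (uiCapital_in & (4u << uiIndex_in)) != 0;
}
```

For the word "McConnel" the first and third letters are uppercase, so the sketch yields 4 + 16 = 20. A word that fits NOCAP, INITCAP, or ALLCAP would presumably be stored with that symbolic value rather than with a bit vector.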

3.14.3 Source File

`template.h'

4. The OPAC function library global variables

This chapter gives the proper usage information about each of the global variables found in the OPAC function library. For each global variable that the library provides, this information includes which header files to include in your source to obtain the extern declaration for that variable.

4.1 pfOutOfMemory_g

4.1.1 Syntax

#include "allocmem.h"   /* or opaclib.h */

extern void (* pfOutOfMemory_g)(size_t uiSize_in);

4.1.2 Description

pfOutOfMemory_g points to a function used by allocMemory and related functions whenever malloc or realloc return a NULL. This function has one argument, the size of the allocation request that failed. It is assumed that this function does not return normally, so that programs that use allocMemory do not need to check for a successful memory allocation. This can be satisfied either by aborting the program or by judicious use of setjmp and longjmp.

The default value for pfOutOfMemory_g is NULL. This causes a function to be used which simply displays an error message (using szOutOfMemoryMarker_g) and aborts the program.

4.1.3 Example

#include <stdio.h>
#include <setjmp.h>
#include "allocmem.h"
...
static jmp_buf jmpNoMemory_m;

static void out_of_memory(size_t uiRequest_in)
{
fprintf(stderr,
        "Out of memory requesting %lu bytes---trying to recover",
        (unsigned long)uiRequest_in);
longjmp( jmpNoMemory_m, 1 );
}

char * processData()
{
char *  p;

if (setjmp( jmpNoMemory_m ))
    {
    /* free any memory left hanging in mid air */
    ...
    return NULL;
    }
pfOutOfMemory_g = out_of_memory;
p = processSafely();
pfOutOfMemory_g = NULL;         /* restore default behavior */
return p;
}

4.1.4 Source File

`allocmem.c'

4.2 pRecordBuffer_g

4.2.1 Syntax

#include "record.h"     /* or opaclib.h */

extern char * pRecordBuffer_g;

4.2.2 Description

pRecordBuffer_g points to the dynamically allocated buffer used by readStdFormatRecord for its return value. Allocating this buffer is handled automatically (but perhaps not optimally) if the programmer does not allocate it explicitly.

4.2.3 Example

#include "record.h"
#include "allocmem.h"
#define BIG_RECSIZE     16000
#define SMALL_RECSIZE     500
...
/*
 *  allocate space for records
 */
pRecordBuffer_g      = (char *)allocMemory( BIG_RECSIZE );
uiRecordBufferSize_g = BIG_RECSIZE;
...
/*
 *  reduce amount of memory allocated for records
 */
freeMemory( pRecordBuffer_g );
pRecordBuffer_g = (char *)allocMemory( SMALL_RECSIZE );
uiRecordBufferSize_g = SMALL_RECSIZE;
...
/*
 *  release memory allocated for records
 */
cleanupAfterStdFormatRecord();

4.2.4 Source File

`record.c'

4.3 szOutOfMemoryMarker_g

4.3.1 Syntax

#include "allocmem.h"   /* or opaclib.h */

extern char szOutOfMemoryMarker_g[/*101*/];

4.3.2 Description

szOutOfMemoryMarker_g is a character array used by allocMemory and friends whenever malloc or realloc return a NULL and pfOutOfMemory_g is NULL. The contents of the character array are used as part of the error message notifying the user that a request for more memory has failed.

The default value for szOutOfMemoryMarker_g is to be empty (all NUL bytes). This means that no context sensitive information is provided in the error message displayed just before the program aborts.

4.3.3 Example

#include <string.h>
#include "allocmem.h"
...
int *   piArray;
...
strncpy(szOutOfMemoryMarker_g, "creating huge array", 100);
piArray = allocMemory( 100000 * sizeof(int) );

4.3.4 Source File

`allocmem.c'

4.4 szRecordKey_g

4.4.1 Syntax

#include "record.h"     /* or opaclib.h */
/*#define MAX_RECKEY_SIZE 64*/

extern char szRecordKey_g[MAX_RECKEY_SIZE];

4.4.2 Description

readStdFormatRecord stores the first MAX_RECKEY_SIZE-1 characters following the record marker in szRecordKey_g. This may or may not be useful information.

4.4.3 Example

#include <stdio.h>
#include <string.h>
#include "record.h"
#include "rpterror.h"
...
void load_dictionary(
    char *      pszInputFile_in,
    CodeTable * pCodeTable_in,
    int         cComment_in)
{
FILE *          pInputFP;
char *          pRecord;
char *          pszField;
char *          pszNextField;
unsigned        uiRecordCount = 0;

pInputFP = fopen(pszInputFile_in, "r");
if (pInputFP == NULL)
    {
    reportError(WARNING_MSG, "Cannot open dictionary file %s\n",
                pszInputFile_in);
    return;
    }
while ((pRecord = readStdFormatRecord(pInputFP,
                                      pCodeTable_in,
                                      cComment_in,
                                      &uiRecordCount)) != NULL)
    {
    pszField = pRecord;
    while (*pszField)
        {
        pszNextField = pszField + strlen(pszField) + 1;
        switch (*pszField)
            {
            case 'A':
                ...
                break;
            case 'B':
                ...
                break;
            ...
            default:
                reportError(WARNING_MSG,
            "Warning: unrecognized field in record %u (%s)\n%s\n",
                    uiRecordCount, szRecordKey_g, pszField);
                break;
            }
        ...
        pszField = pszNextField;
        }
    ...
    }
cleanupAfterStdFormatRecord();
fclose(pInputFP);
...
}

4.4.4 Source File

`record.c'

4.5 uiRecordBufferSize_g

4.5.1 Syntax

#include "record.h"     /* or opaclib.h */

extern unsigned uiRecordBufferSize_g;

4.5.2 Description

uiRecordBufferSize_g stores the number of bytes allocated for pRecordBuffer_g.

4.5.3 Example

See section 4.2 pRecordBuffer_g.

4.5.4 Source File

`record.c'

4.6 uiTrieArrayBlockSize_g

4.6.1 Syntax

#include "trie.h"       /* or opaclib.h */

extern size_t uiTrieArrayBlockSize_g;

4.6.2 Description

Trie nodes are allocated uiTrieArrayBlockSize_g nodes at a time for efficiency. The default value for uiTrieArrayBlockSize_g is 2000, which minimizes the number of calls to allocMemory, but potentially wastes several thousand bytes of memory.

4.6.3 Example

#include "strlist.h"
#include "trie.h"
...
Trie *          pLexicon = NULL;
StringList *    pNewString;
...
VOIDP addStringToList(VOIDP pNew_in, VOIDP pList_in)
{
StringList *    pList = pList_in;
StringList *    pNew  = pNew_in;

pNew->pNext = pList;
return pNew;
}
...
uiTrieArrayBlockSize_g = 63;    /* less time efficient, but
                                   more space efficient */
...
pNewString = mergeIntoStringList(NULL, "Test value");
pLexicon = addDataToTrie(pLexicon, pNewString->pszString, pNewString,
                         addStringToList, 3);

4.6.4 Source File

`trie.c'

5. The OPAC functions

This chapter gives the proper usage information about each of the functions found in the OPAC function library. For each function that the library provides, this information includes which header files to include in your source to obtain prototypes and type definitions relevant to the use of that function.

5.1 addDataToTrie

5.1.1 Syntax

#include "trie.h"       /* or opaclib.h */

Trie * addDataToTrie(Trie *       pTrieHead_io,
                     const char * pszKey_in,
                     void *       pInfo_in,
                     void * (*    pfLinkInfo_in)(void * pNew_in,
                                                 void * pList_io),
                     int          iMaxTrieDepth_in);

5.1.2 Description

addDataToTrie adds information to a trie, using the given insertion key.

The arguments to addDataToTrie are as follows:

pTrieHead_io

points to the head of a trie. This may be NULL the first time addDataToTrie is called. Each subsequent call should use the value returned by the preceding call.

pszKey_in

points to the insertion key (a character string).

pInfo_in

points to a generic data structure. The exact definition depends on the application using the Trie for data storage and retrieval.

pfLinkInfo_in

points to a function for adding information to the pTrieInfo field of the leaf Trie data structure found or created for this key. The function has two arguments:

pNew_in: points to a single data item to store.
pList_io: points to a collection of items stored at a Trie node (pTrieInfo), or is NULL.

The function returns the updated pointer to the data collection for storing as the value of pTrieInfo.

iMaxTrieDepth_in

is the maximum depth to which the trie is built. If this is less than the maximum length of key strings, then the data structure stored in the trie must include the key as one of its elements for future reference.

5.1.3 Return Value

a pointer to the head of the modified trie

5.1.4 Example

#include <stdio.h>
#include <string.h>
#include "trie.h"
#include "rpterror.h"
#include "allocmem.h"
...
typedef struct lex_item {
    struct lex_item *   pLink;          /* link to next item */
    struct lex_item *   pNext;          /* link to next homograph */
    unsigned char *     pszForm;        /* lexical form (word) */
    unsigned char *     pszGloss;       /* lexical gloss */
    unsigned short      uiCategory;     /* lexical category */
    } LexItem;
...
Trie *          pLexicon_g;
unsigned long   uiLexiconCount_g;
static char     szWhitespace_m[7] = " \t\r\n\f\v";
...
static void * add_lex_item(void * pNew_in, void * pList_in)
{
LexItem *       pLex;
/*
 *  be a little paranoid
 */
if (pNew_in == NULL)
    return pList_in;
/*
 *  link the list of items that start out the same
 */
((LexItem *)pNew_in)->pLink = (LexItem *)pList_in;
/*
 *  link the list of homographs
 */
for ( pLex = (LexItem *)pList_in ; pLex ; pLex = pLex->pLink )
    {
    if (strcmp(((LexItem *)pNew_in)->pszForm, pLex->pszForm) == 0)
        {
        ((LexItem *)pNew_in)->pNext = pLex;
        break;
        }
    }
return pNew_in;
}

void load_lexicon(char * pszLexiconFile_in)
{
FILE *          pLexiconFP;
char            szBuffer[512];
char *          pszForm;
char *          pszGloss;
char *          pszCategory;
LexItem *       pLexItem;

if (pszLexiconFile_in == NULL)
    {
    reportError(WARNING_MSG, "Missing input lexicon filename\n");
    return;
    }
pLexiconFP = fopen(pszLexiconFile_in, "r");
if (pLexiconFP == NULL)
    {
    reportError(WARNING_MSG, "Cannot open lexicon file %s for input\n",
                pszLexiconFile_in);
    return;
    }
while (fgets(szBuffer, 512, pLexiconFP) != NULL)
    {
    pszForm     = strtok(szBuffer, szWhitespace_m);
    pszGloss    = strtok(NULL,     szWhitespace_m);
    pszCategory = strtok(NULL,     szWhitespace_m);
    if (    (pszForm     == NULL) ||
            (pszGloss    == NULL) ||
            (pszCategory == NULL) )
        continue;

    pLexItem = (LexItem *)allocMemory(sizeof(LexItem));
    pLexItem->pLink      = NULL;
    pLexItem->pNext      = NULL;
    pLexItem->pszForm    = (unsigned char *)duplicateString(pszForm);
    pLexItem->pszGloss   = (unsigned char *)duplicateString(pszGloss);
    pLexItem->uiCategory = index_lexical_category(pszCategory);

    pLexicon_g = addDataToTrie(pLexicon_g, pszForm, pLexItem,
                               add_lex_item, 3);
    ++uiLexiconCount_g;
    }
fclose(pLexiconFP);
}

5.1.5 Source File

`trie.c'

5.2 addLowerUpperWFChars

5.2.1 Syntax

#include "textctl.h"    /* or template.h or opaclib.h */

void addLowerUpperWFChars(char *        pszLUPairs_in,
                          TextControl * pTextCtl_io);

5.2.2 Description

addLowerUpperWFChars scans the input string for character pairs. The first member of each pair is added to the set of (multibyte) lowercase alphabetic characters, and the second member is added to the set of (multibyte) uppercase alphabetic characters. Note that there may be a many-to-many mapping between lowercase and uppercase characters.

The arguments to addLowerUpperWFChars are as follows:

pszLUPairs_in: points to a string containing lowercase/UPPERCASE character pairs. Whitespace characters in the string are ignored.
pTextCtl_io: points to a data structure that contains orthographic information, including the mappings between lowercase and uppercase letters.
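
The pairing rule described above can be sketched as follows. This is only an illustration, not the code in `myctype.c': the function name and the two output arrays are hypothetical stand-ins for whatever the TextControl structure actually stores, and the handling of a trailing unpaired character (it is dropped) is an assumption.

```c
#include <ctype.h>
#include <stddef.h>

/*
 *  Hypothetical helper: split a string of single-byte
 *  lowercase/UPPERCASE pairs into two parallel arrays, ignoring
 *  whitespace.  Each output array must hold uiMaxPairs_in + 1 bytes.
 *  Returns the number of complete pairs stored.
 */
size_t sketchSplitCasePairs(const char * pszLUPairs_in,
                            char *       pszLower_out,
                            char *       pszUpper_out,
                            size_t       uiMaxPairs_in)
{
const unsigned char * p = (const unsigned char *)pszLUPairs_in;
size_t                uiCount = 0;
int                   cLower  = 0;  /* pending first member, or 0 */

for ( ; *p != '\0' ; ++p )
    {
    if (isspace(*p))
        continue;                   /* whitespace is ignored */
    if (cLower == 0)
        cLower = *p;                /* first member of a pair */
    else
        {
        if (uiCount < uiMaxPairs_in)
            {
            pszLower_out[uiCount] = (char)cLower;
            pszUpper_out[uiCount] = (char)*p;
            ++uiCount;
            }
        cLower = 0;
        }
    }
pszLower_out[uiCount] = '\0';
pszUpper_out[uiCount] = '\0';
return uiCount;                     /* an unpaired last character is dropped */
}
```

Given the string "aA bB cC", this sketch stores "abc" and "ABC". Listing the same lowercase character twice with different partners would record two pairs, which is how a many-to-many mapping arises.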

5.2.3 Return Value

none

5.2.4 Example

#include "textctl.h"
...
TextControl sTextInputCtl_m;
...
void set_alphabetic(pszField_in)
char *  pszField_in;
{
int     code;
char *  psz;

psz  = pszField_in;
code = *psz++;
switch (code)
    {
    case 'A':   /* alphabetic (word formation) characters */
        addWordFormationChars(psz, &sTextInputCtl_m);
        break;
    case 'L':   /* lower-upper word formation characters */
        addLowerUpperWFChars(psz, &sTextInputCtl_m);
        break;
    case 'a':   /* multibyte alphabetic (word formation) characters */
        addWordFormationCharStrings(psz, &sTextInputCtl_m);
        break;
    case 'l':   /* multibyte lower-upper word formation characters */
        addLowerUpperWFCharStrings(psz, &sTextInputCtl_m);
        break;
    default:
        break;
    }
}

void reset_alphabetic()
{
resetWordFormationChars(&sTextInputCtl_m);
}

5.2.5 Source File

`myctype.c'

5.3 addLowerUpperWFCharStrings

5.3.1 Syntax

#include "textctl.h"    /* or template.h or opaclib.h */

void addLowerUpperWFCharStrings(char *        pszLUPairs_in,
                                TextControl * pTextCtl_io);

5.3.2 Description

addLowerUpperWFCharStrings scans the input string for pairs of multibyte characters. The first member of each pair is added to the set of multibyte lowercase alphabetic characters, and the second member is added to the set of multibyte uppercase alphabetic characters. Note that there may be a many-to-many mapping between lowercase and uppercase multibyte characters.

The arguments to addLowerUpperWFCharStrings are as follows:

pszLUPairs_in: points to a string containing multibyte lowercase/UPPERCASE character pairs. Whitespace is used to separate the multibyte characters from each other, and the pairs from each other.
pTextCtl_io: points to a data structure that contains orthographic information, including the mappings between lowercase and uppercase letters.
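
For multibyte characters the whitespace is significant, because it marks where one character ends and the next begins. A minimal sketch of that tokenizing step is shown below. The names, the fixed 8-byte limit, and the decision to skip overlong tokens are all assumptions made for the illustration; the library's own storage is inside TextControl.

```c
#include <string.h>

#define SKETCH_MAX_BYTES 8          /* assumed limit, including the NUL */

typedef struct {
    char szLower[SKETCH_MAX_BYTES];
    char szUpper[SKETCH_MAX_BYTES];
} SketchCaseStringPair;

static const char szSketchWhite[] = " \t\r\n\f\v";

/*
 *  copy the next whitespace-delimited token into pszToken_out and return
 *  a pointer just past it, or return NULL if no token remains
 */
static const char * sketchNextToken(const char * psz_in, char * pszToken_out)
{
size_t uiLength;

psz_in  += strspn(psz_in, szSketchWhite);
uiLength = strcspn(psz_in, szSketchWhite);
if (uiLength == 0)
    return NULL;
if (uiLength < SKETCH_MAX_BYTES)
    {
    memcpy(pszToken_out, psz_in, uiLength);
    pszToken_out[uiLength] = '\0';
    }
else
    pszToken_out[0] = '\0';         /* too long: mark as unusable */
return psz_in + uiLength;
}

/* store up to uiMaxPairs_in lower/upper pairs; return the number stored */
size_t sketchSplitCaseStringPairs(const char *           pszLUPairs_in,
                                  SketchCaseStringPair * pPairs_out,
                                  size_t                 uiMaxPairs_in)
{
const char *         p = pszLUPairs_in;
SketchCaseStringPair sPair;
size_t               uiCount = 0;

while (uiCount < uiMaxPairs_in)
    {
    p = sketchNextToken(p, sPair.szLower);
    if (p == NULL)
        break;
    p = sketchNextToken(p, sPair.szUpper);
    if (p == NULL)
        break;                      /* dangling lowercase member */
    if (sPair.szLower[0] != '\0' && sPair.szUpper[0] != '\0')
        pPairs_out[uiCount++] = sPair;
    }
return uiCount;
}
```

For the input "ch CH ng NG a A" the sketch yields three pairs: ch/CH, ng/NG, and a/A.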

5.3.3 Return Value

none

5.3.4 Example

See section 5.2 addLowerUpperWFChars.

5.3.5 Source File

`myctype.c'

5.4 addStringClass

5.4.1 Syntax

#include "strclass.h"   /* or change.h or textctl.h or template.h
                           or opaclib.h */

StringClass * addStringClass(char *        pszField_in,
                             StringClass * pClasses_io);

5.4.2 Description

addStringClass adds a string class to the list of string classes. String classes are used in string environments such as those in the consistent change notation supported by the OPAC function library.

The arguments to addStringClass are as follows:

pszField_in: points to a string containing a string class definition: the class name followed by the set of members.
pClasses_io: points to the list of string classes. This may be NULL the first time addStringClass is called. Each subsequent call should use the value returned by the preceding call.

5.4.3 Return Value

a pointer to the head of the updated list of string classes

5.4.4 Example

#include "change.h"     /* includes strclass.h */
...
static Change *         pChanges_m = NULL;
static StringClass *    pClasses_m = NULL;
...
void store_change_info(pszField_in)
char *  pszField_in;
{
Change *        pChg;
char *          psz;
int             code;

if (pszField_in == NULL)
    return;
psz  = pszField_in;
code = *psz++;          /* grab the table code */
switch (code)
    {
    case 'C':           /* change */
        pChg = parseChangeString( psz, pClasses_m );
        if (pChg != (Change *)NULL)
            {
            pChg->pNext = pChanges_m;
            pChanges_m = pChg;
            }
        break;
    case 'S':           /* string class */
        pClasses_m = addStringClass( psz, pClasses_m );
        break;
    default:
        break;
    }
}

5.4.5 Source File

`strcla.c'

5.5 addToStringList

5.5.1 Syntax

#include "strlist.h"

StringList * addToStringList(StringList * pList_in,
                             const char * pszString_in);

5.5.2 Description

addToStringList adds a string to the beginning of a list of strings. It does not check whether the string is already in the list.

The arguments to addToStringList are as follows:

pList_in: points to a list of strings. It may be NULL to signal an empty list.
pszString_in: points to a NUL-terminated character string.
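
Adding at the head of a singly linked list is a constant-time operation, which is why no duplicate check is made. The sketch below shows the idea with a hypothetical SketchStringList type. The field names pNext and pszString match those used in examples elsewhere in this manual, but whether the library copies the string, as this sketch does, is an assumption.

```c
#include <stdlib.h>
#include <string.h>

/* hypothetical stand-in for the StringList structure */
typedef struct sketch_strlist {
    struct sketch_strlist * pNext;
    char *                  pszString;
} SketchStringList;

SketchStringList * sketchAddToStringList(SketchStringList * pList_in,
                                         const char *       pszString_in)
{
SketchStringList * pNode;

if (pszString_in == NULL)
    return pList_in;                /* nothing to add */
pNode = malloc(sizeof(SketchStringList));
if (pNode == NULL)
    abort();
pNode->pszString = malloc(strlen(pszString_in) + 1);
if (pNode->pszString == NULL)
    abort();
strcpy(pNode->pszString, pszString_in);
pNode->pNext = pList_in;            /* the old head follows the new node */
return pNode;                       /* the new node is the new head */
}
```

Because the new node always becomes the head, the caller must keep the returned pointer; continuing to use the old pointer would silently lose the added string.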

5.5.3 Return Value

a pointer to the revised list

5.5.4 Example

#include "strlist.h"
...
StringList * pStrings = NULL;
...
                /* pStrings-->NULL */
pStrings = addToStringList(pStrings, "this");
                /* pStrings-->"this"-->NULL */
pStrings = addToStringList(pStrings, "test");
                /* pStrings-->"test"-->"this"-->NULL */
pStrings = addToStringList(pStrings, "is");
                /* pStrings-->"is"-->"test"-->"this"-->NULL */
pStrings = addToStringList(pStrings, "a");
                /* pStrings-->"a"-->"is"-->"test"-->"this"-->NULL */
pStrings = addToStringList(pStrings, "test");
                /* pStrings-->"test"-->"a"-->"is"-->"test"-->"this"-->NULL */

5.5.5 Source File

`add_sl.c'

5.6 addWordFormationChars

5.6.1 Syntax

#include "textctl.h"    /* or template.h or opaclib.h */

void addWordFormationChars(char *        pszLetters_in,
                           TextControl * pTextCtl_io);

5.6.2 Description

addWordFormationChars scans the input string for non-whitespace characters. Each such character is added to the set of alphabetic characters that do not have a lowercase/UPPERCASE distinction. (An English example would be the apostrophe character.)

The arguments to addWordFormationChars are as follows:

pszLetters_in: points to a string containing (caseless) alphabetic characters.
pTextCtl_io: points to a data structure that contains orthographic information.

5.6.3 Return Value

none

5.6.4 Example

See section 5.2 addLowerUpperWFChars.

5.6.5 Source File

`myctype.c'

5.7 addWordFormationCharStrings

5.7.1 Syntax

#include "textctl.h"    /* or template.h or opaclib.h */

void addWordFormationCharStrings(char *        pszLetters_in,
                                 TextControl * pTextCtl_io);

5.7.2 Description

addWordFormationCharStrings scans the input string for multibyte characters. Each such multibyte character sequence is added to the set of multibyte caseless alphabetic characters.

The arguments to addWordFormationCharStrings are as follows:

pszLetters_in: points to a string containing multibyte (caseless) alphabetic characters. Whitespace separates the multibyte characters.
pTextCtl_io: points to a data structure that contains orthographic information.

5.7.3 Return Value

none

5.7.4 Example

See section 5.2 addLowerUpperWFChars.

5.7.5 Source File

`myctype.c'

5.8 allocMemory

5.8.1 Syntax

#include "allocmem.h"   /* or opaclib.h */

void * allocMemory(size_t uiSize_in);

5.8.2 Description

allocMemory provides a "safe" interface to malloc. If the requested memory cannot be allocated, the function pointed to by pfOutOfMemory_g is called. If pfOutOfMemory_g is NULL, then the default behavior is to display an error message incorporating the string stored in szOutOfMemoryMarker_g and abort the program.

Callers may assume that allocMemory never returns NULL. Consequently, any function pointed to by pfOutOfMemory_g must either abort the program or use longjmp to escape to a safe place in the program; it must not return normally.

allocMemory has a single argument:

uiSize_in: is the number of bytes to allocate.
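
The general shape of such a wrapper can be sketched as follows. This is an illustration only, not the contents of `allocmem.c': the handler type, its argument, the global's name, and the wording of the message are assumptions, and the real library consults pfOutOfMemory_g and szOutOfMemoryMarker_g instead.

```c
#include <stdio.h>
#include <stdlib.h>

/* hypothetical handler type; the real signature is declared in allocmem.h */
typedef void (*SketchOutOfMemoryFn)(size_t uiSize_in);

/* hypothetical counterpart of pfOutOfMemory_g */
SketchOutOfMemoryFn pfSketchOutOfMemory = NULL;

void * sketchAllocMemory(size_t uiSize_in)
{
/* ask for at least one byte so that a zero-size request is not
   mistaken for a failure on systems where malloc(0) returns NULL */
void * p = malloc((uiSize_in != 0) ? uiSize_in : 1);

if (p == NULL)
    {
    if (pfSketchOutOfMemory != NULL)
        pfSketchOutOfMemory(uiSize_in);     /* expected not to return */
    /* default behavior, also reached if the handler does return */
    fprintf(stderr, "out of memory allocating %lu bytes\n",
            (unsigned long)uiSize_in);
    abort();
    }
return p;                                   /* never NULL */
}
```

The benefit of this design is that callers need no error check after each allocation; the cost is that recovery, if any, has to be arranged once, in the handler.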

5.8.3 Return Value

a pointer to the beginning of the memory area allocated

5.8.4 Example

#include "allocmem.h"
...
char * p;
...
p = allocMemory(75);

5.8.5 Source File

`allocmem.c'

5.9 applyChanges

5.9.1 Syntax

#include "change.h"     /* or textctl.h or template.h or opaclib.h */

char * applyChanges(const char *   pszString_in,
                    const Change * pChangeList_in);

5.9.2 Description

applyChanges applies a list of consistent changes to a string. The function steps through the list of changes, applying each change as often as necessary before trying the next change in the list. The input string is not changed; rather, a copy is created, modified, and returned.

The arguments to applyChanges are as follows:

pszString_in: points to a string to be changed.
pChangeList_in: points to a list of changes to apply to the string.
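
The order of application matters: each change sees the output of the changes before it. The following sketch makes that concrete with a deliberately simplified, hypothetical SketchChange type that holds only a literal match string and its replacement. Real consistent changes also support environments and string classes, which are omitted here, and the sketch replaces occurrences in a single left-to-right pass.

```c
#include <stdlib.h>
#include <string.h>

/* hypothetical, simplified change: literal match and replacement only */
typedef struct sketch_change {
    struct sketch_change * pNext;
    const char *           pszMatch;
    const char *           pszReplace;
} SketchChange;

/* return a newly allocated copy with every occurrence replaced */
static char * sketchReplaceAll(const char * pszString_in,
                               const char * pszMatch_in,
                               const char * pszReplace_in)
{
size_t       uiMatch   = strlen(pszMatch_in);
size_t       uiReplace = strlen(pszReplace_in);
size_t       uiCount   = 0;
const char * p;
char *       pszOut;
char *       q;

if (uiMatch > 0)
    for ( p = strstr(pszString_in, pszMatch_in) ; p != NULL ;
          p = strstr(p + uiMatch, pszMatch_in) )
        ++uiCount;
/* upper bound: each match adds at most uiReplace bytes */
pszOut = malloc(strlen(pszString_in) + uiCount * uiReplace + 1);
if (pszOut == NULL)
    abort();
for ( p = pszString_in, q = pszOut ; *p != '\0' ; )
    {
    if (uiMatch > 0 && strncmp(p, pszMatch_in, uiMatch) == 0)
        {
        memcpy(q, pszReplace_in, uiReplace);
        q += uiReplace;
        p += uiMatch;
        }
    else
        *q++ = *p++;
    }
*q = '\0';
return pszOut;
}

/* apply each change in list order; the input string is left untouched */
char * sketchApplyChanges(const char *         pszString_in,
                          const SketchChange * pChangeList_in)
{
const SketchChange * pChange;
char *               pszResult = sketchReplaceAll(pszString_in, "", "");
char *               pszNext;

for ( pChange = pChangeList_in ; pChange ; pChange = pChange->pNext )
    {
    pszNext = sketchReplaceAll(pszResult, pChange->pszMatch,
                               pChange->pszReplace);
    free(pszResult);
    pszResult = pszNext;
    }
return pszResult;                   /* caller frees the result */
}
```

With the two changes "th" to "T" followed by "is" to "IS", the input "this is a test" becomes "TIS IS a test", and the caller still owns an unmodified input string.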

5.9.3 Return Value

a pointer to a dynamically allocated and (possibly) changed string

5.9.4 Example

#include "change.h"
...
Change * pChanges_m;
...
char * pszChanged;
...
pszChanged = applyChanges("this is a test", pChanges_m);
...
freeMemory( pszChanged );

5.9.5 Source File

`change.c'

5.10 buildAdjustedFilename

5.10.1 Syntax

#include "opaclib.h"

char * buildAdjustedFilename(const char * pszFilename_in,
                             const char * pszBasePathname_in,
                             const char * pszExtension_in);

5.10.2 Description

buildAdjustedFilename builds a filename from the pieces given. If the base pathname contains directory information, and the input filename is not an absolute pathname, the leading directory information is added to the output filename. If the extension is given, and the input filename does not have an extension, the extension is added to the output filename if the file cannot be opened for input without it.

The arguments to buildAdjustedFilename are as follows:

pszFilename_in: points to a filename string.
pszBasePathname_in: points to a base file pathname string, or is NULL.
pszExtension_in: points to a filename extension string, or is NULL.
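
The two adjustments can be sketched as shown below, under several stated assumptions: the function name is hypothetical, only '/' is treated as a directory separator, and the extension is added whenever the filename lacks one. The library function additionally tests whether the file can be opened without the extension, a step this sketch leaves out so that its result does not depend on the file system.

```c
#include <stdlib.h>
#include <string.h>

char * sketchAdjustFilename(const char * pszFilename_in,
                            const char * pszBasePathname_in,
                            const char * pszExtension_in)
{
const char * pszSlash;
const char * pszName;
char *       pszOut;
size_t       uiDirLength = 0;   /* bytes of directory prefix to copy */
int          bAddExt     = 0;

/* borrow the directory part of the base pathname for relative names */
if (pszBasePathname_in != NULL && pszFilename_in[0] != '/')
    {
    pszSlash = strrchr(pszBasePathname_in, '/');
    if (pszSlash != NULL)
        uiDirLength = (size_t)(pszSlash - pszBasePathname_in) + 1;
    }
/* look for an extension in the last path component only */
pszName = strrchr(pszFilename_in, '/');
pszName = (pszName != NULL) ? pszName + 1 : pszFilename_in;
if (pszExtension_in != NULL && strchr(pszName, '.') == NULL)
    bAddExt = 1;

pszOut = malloc(uiDirLength + strlen(pszFilename_in) +
                (bAddExt ? strlen(pszExtension_in) : 0) + 1);
if (pszOut == NULL)
    abort();
if (uiDirLength > 0)
    memcpy(pszOut, pszBasePathname_in, uiDirLength);
strcpy(pszOut + uiDirLength, pszFilename_in);
if (bAddExt)
    strcat(pszOut, pszExtension_in);
return pszOut;                  /* caller frees the result */
}
```

For example, the filename "lexicon" with base pathname "/data/ctl/main.ctl" and extension ".dic" gives "/data/ctl/lexicon.dic" in this sketch, while an absolute filename that already has an extension is returned as an unchanged copy.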

5.10.3 Return Value

a pointer to a dynamically allocated filename string

5.10.4 Example

#include <stdio.h>
#include <string.h>
#include "opaclib.h"
...
int readControlFile(char * pszControlFile_in)
{
char * pszIncludeFile;
char   szBuffer[512];
FILE * pControlFP;
char * p;

pControlFP = fopen(pszControlFile_in, "r");
if (pControlFP == NULL)
    return 0;
while (fgets(szBuffer, 512, pControlFP) != NULL)
    {
    p = szBuffer + strlen(szBuffer) - 1;
    if (*p == '\n')
        *p = '\0';
    if (strncmp(szBuffer, "\\include", 8) == 0)
        {
        pszIncludeFile = szBuffer + 8;
        pszIncludeFile += strspn(pszIncludeFile, " \t\r\n\f");
        if (*pszIncludeFile == '\0')
            continue;
        pszIncludeFile = buildAdjustedFilename(pszIncludeFile,
                                               pszControlFile_in,
                                               ".ctl");
        readControlFile(pszIncludeFile);
        freeMemory(pszIncludeFile);
        }
    ...
    }
fclose(pControlFP);
return 1;
}

5.10.5 Source File

`adjfname.c'

5.11 buildChangeString

5.11.1 Syntax

#include "change.h"     /* or textctl.h or template.h or opaclib.h */

char * buildChangeString(const Change * pChange_in);

5.11.2 Description

buildChangeString builds a textual representation of the given consistent change data structure.

buildChangeString has one argument:

pChange_in: points to a single consistent change data structure. (The pNext field of the Change data structure is ignored.)

5.11.3 Return Value

a pointer to a dynamically allocated string representing the change, or NULL if an error occurs

5.11.4 Example

#include "change.h"
...
void displayChangeList(Change * pChanges_in)
{
Change *        pChange;
char *          pszChange;

for ( pChange = pChanges_in ; pChange ; pChange = pChange->pNext )
    {
    pszChange = buildChangeString( pChange );
    if (pszChange == NULL)
        continue;               /* buildChangeString signals errors with NULL */
    fprintf(stderr, "%s\n", pszChange);
    freeMemory( pszChange );
    }
}

5.11.5 Source File

`change.c'

5.12 checkFileError

5.12.1 Syntax

#include <stdio.h>
#include "opaclib.h"

void checkFileError(FILE *       pOutputFP_in,
                    const char * pszProcessName_in,
                    const char * pszFilename_in);

5.12.2 Description

checkFileError checks for an error in the output file pOutputFP_in whose name is given by pszFilename_in. If an error occurred, the output file is deleted and the program exits with an error message.

The arguments to checkFileError are as follows:

pOutputFP_in: is an output FILE pointer.
pszProcessName_in: points to a string indicating where the error occurred.
pszFilename_in: points to the name of the output file.

5.12.3 Return Value

none

5.12.4 Example

#include <stdio.h>
#include "opaclib.h"
...
FILE * fp;
char   filename[100];
...
checkFileError(fp, "Program Name", filename);
fclose(fp);

5.12.5 Source File

`fulldisk.c'

5.13 cleanupAfterStdFormatRecord

5.13.1 Syntax

#include "record.h"     /* or opaclib.h */

void cleanupAfterStdFormatRecord(void);

5.13.2 Description

cleanupAfterStdFormatRecord frees any memory allocated for readStdFormatRecord.

cleanupAfterStdFormatRecord does not have any arguments.

5.13.3 Return Value

none

5.13.4 Example

#include <stdio.h>
#include "record.h"
static CodeTable sLexTable_m = { "\\w\0W\0\\c\0C\0\\f\0F\0\\g\0G\0",
                                  4, "\\w" };
...
int load_lexicon(pszLexiconFile_in, cComment_in)
char *  pszLexiconFile_in;
int     cComment_in;
{
FILE *          fp;
unsigned        uiRecordCount = 0;
char *          pRecord;
/*
 *  open the lexicon file
 */
if (pszLexiconFile_in == NULL)
    return( 0 );
fp = fopen(pszLexiconFile_in, "r");
if (fp == (FILE *)NULL)
    return( 0 );
/*
 *  load all the records from the lexicon file
 */
uiRecordCount = 0;
while ((pRecord = readStdFormatRecord(fp,
                                      &sLexTable_m,
                                      cComment_in,
                                      &uiRecordCount)) != NULL)
    {
    ...
    }
/*
 *  close the lexicon file and erase the temporary data structures
 */
fclose(fp);
cleanupAfterStdFormatRecord();
return( 1 );
}

5.13.5 Source File

`record.c'

5.14 convLowerToUpper

5.14.1 Syntax

#include "textctl.h"    /* or template.h or opaclib.h */

const unsigned char * convLowerToUpper(const unsigned char * pszString_in,
                                       const TextControl *   pTextCtl_in);

5.14.2 Description

convLowerToUpper checks whether the input string begins with a multibyte lowercase character. If so, it returns the (first) corresponding multibyte uppercase character.

This function depends on previous calls to addLowerUpperWFChars or addLowerUpperWFCharStrings to establish the mappings between lowercase and uppercase multibyte characters. (addLowerUpperWFChars and addLowerUpperWFCharStrings are implicitly called by loadIntxCtlFile and loadOutxCtlFile.)

The arguments to convLowerToUpper are as follows:

pszString_in: points to a NUL-terminated character string.
pTextCtl_in: points to a data structure that contains orthographic information, including the mappings between lowercase and uppercase letters.
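
The lookup performed here amounts to a prefix match against the stored lowercase characters. A minimal sketch follows, using a hypothetical SketchCaseMap table in place of TextControl. Returning the first matching table entry is an assumption; the library may order or select candidates differently.

```c
#include <stddef.h>
#include <string.h>

/* hypothetical lower/upper table entry */
typedef struct {
    const char * pszLower;
    const char * pszUpper;
} SketchCaseMap;

/*
 *  return the uppercase form of the first table entry whose lowercase
 *  form is a prefix of pszString_in, or NULL if no entry matches
 */
const char * sketchLowerToUpper(const char *          pszString_in,
                                const SketchCaseMap * pMap_in,
                                size_t                uiMapSize_in)
{
size_t i;
size_t uiLength;

for ( i = 0 ; i < uiMapSize_in ; ++i )
    {
    uiLength = strlen(pMap_in[i].pszLower);
    if (    (uiLength > 0) &&
            (strncmp(pszString_in, pMap_in[i].pszLower, uiLength) == 0) )
        return pMap_in[i].pszUpper;
    }
return NULL;
}
```

In this sketch a table that lists "ch" before "c" maps the start of "chico" to "CH" and not to "C", and a lowercase character listed twice yields only the uppercase form of its first entry.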

5.14.3 Return Value

a pointer to a NUL-terminated string containing the (primary) corresponding multibyte uppercase character, or NULL if the input string does not begin with a multibyte lowercase character. This may point to a static buffer that may be overwritten by the next call to convLowerToUpper.

5.14.4 Example

#include "textctl.h"
...
static TextControl      sTextCtl_m;
static StringClass *    pStringClasses_m;
static char             szOutxFilename_m[100];
...
loadOutxCtlFile(szOutxFilename_m, ';', &sTextCtl_m, &pStringClasses_m);
...
unsigned char * upcaseString(unsigned char * pszString_in)
{
size_t                  iCharSize;
size_t                  iUCSize;
size_t                  iUpperLength = 0;   /* summed below; must start at 0 */
unsigned char *         p;
const unsigned char *   pUC;
unsigned char *         pszUpper;
unsigned char *         q;

if (pszString_in == NULL)
    return NULL;
for ( p = pszString_in ; *p ; p += iCharSize )
    {
    if ((iCharSize = matchAlphaChar(p, &sTextCtl_m)) == 0)
        iCharSize = 1;
    pUC = convLowerToUpper(p, &sTextCtl_m);
    if (pUC != NULL)
        iUpperLength += strlen((char *)pUC);
    else
        iUpperLength += iCharSize;
    }
pszUpper = allocMemory(iUpperLength + 1);
for ( p = pszString_in, q = pszUpper ; *p ; p += iCharSize )
    {
    if ((iCharSize = matchAlphaChar(p, &sTextCtl_m)) == 0)
        iCharSize = 1;
    pUC = convLowerToUpper(p, &sTextCtl_m);
    if (pUC != NULL)
        {
        iUCSize = strlen((char *)pUC);
        memcpy(q, pUC, iUCSize);
        q += iUCSize;
        }
    else
        {
        memcpy(q, p, iCharSize);
        q += iCharSize;
        }
    }
pszUpper[iUpperLength] = NUL;
return pszUpper;
}

5.14.5 Source File

`myctype.c'

5.15 convLowerToUpperSet

5.15.1 Syntax

#include "textctl.h"    /* or template.h or opaclib.h */

const StringList * convLowerToUpperSet(const unsigned char * pszString_in,
                                       const TextControl *   pTextCtl_in);

5.15.2 Description

convLowerToUpperSet checks whether the input string begins with a multibyte lowercase character. If so, it returns the complete set of corresponding multibyte uppercase characters.

The arguments to convLowerToUpperSet are as follows:

pszString_in: points to a NUL-terminated character string.
pTextCtl_in: points to a data structure that contains orthographic information, including the mappings between lowercase and uppercase letters.
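
When a word contains several characters that each have more than one uppercase equivalent, the number of possible conversions is the product of the set sizes. One way to enumerate them is to treat the alternative number as a mixed-radix value whose digits select one member from each set. The sketch below shows only that arithmetic; the function names are hypothetical and every count is assumed to be at least 1.

```c
#include <stddef.h>

/* total number of combinations: the product of the per-position counts */
size_t sketchCountAlternatives(const size_t * puiCounts_in,
                               size_t         uiPositions_in)
{
size_t uiTotal = 1;
size_t j;

for ( j = 0 ; j < uiPositions_in ; ++j )
    uiTotal *= puiCounts_in[j];
return uiTotal;
}

/*
 *  decode alternative number uiIndex_in into one choice per position;
 *  position 0 varies fastest
 */
void sketchSelectAlternative(size_t         uiIndex_in,
                             const size_t * puiCounts_in,
                             size_t         uiPositions_in,
                             size_t *       puiChoices_out)
{
size_t uiSpan = 1;      /* product of the counts already consumed */
size_t j;

for ( j = 0 ; j < uiPositions_in ; ++j )
    {
    puiChoices_out[j] = (uiIndex_in / uiSpan) % puiCounts_in[j];
    uiSpan *= puiCounts_in[j];
    }
}
```

For set sizes 2, 1, and 3 there are six alternatives; alternative 5 selects members 1, 0, and 2 respectively. Stepping the index from 0 up to one less than the total visits every combination exactly once.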

5.15.3 Return Value

a pointer to a list of NUL-terminated strings containing the corresponding multibyte uppercase characters, or NULL if the input string does not begin with a multibyte lowercase character. This may point to a static buffer that may be overwritten by the next call to convLowerToUpperSet.

5.15.4 Example

#include "textctl.h"
#include "rpterror.h"
...
StringList * upcaseWord(pszWord_in, pTextCtl_in)
char *              pszWord_in;
const TextControl * pTextCtl_in;
{
size_t              uiCharCount;
size_t              uiLowerCount;
size_t              uiNumberAlternatives;
size_t              uiSpan;
size_t              uiWordLength;
size_t              k;
int                 iLength;
unsigned char *     p;
StringList *        pUpcaseList = NULL;
const StringList *  pUpperSet;
const StringList *  ps;
/*
 *  count the number of multibyte characters in the string
 *  count the lowercase letters
 *  calculate the number of (ambiguous) upcase conversions
 *  calculate the maximum length of the upcased word
 */
uiCharCount = 0;
uiLowerCount = 0;
uiNumberAlternatives = 1;
uiWordLength = 1;           /* count the terminating NUL byte */
for ( p = (unsigned char *)pszWord_in ; *p != NUL ; p += iLength )
    {
    iLength = matchAlphaChar(p, pTextCtl_in);
    if (iLength == 0)
        iLength = 1;
    ++uiCharCount;
    if (matchLowercaseChar(p, pTextCtl_in) != 0)
        {
        ++uiLowerCount;
        pUpperSet = convLowerToUpperSet(p, pTextCtl_in);
        uiNumberAlternatives *= getStringListSize( pUpperSet );
        uiSpan = 0;
        for ( ps = pUpperSet ; ps ; ps = ps->pNext )
            {
            k = strlen( ps->pszString );
            if (k > uiSpan)
                uiSpan = k;
            }
        }
    else
        uiSpan = iLength;
    uiWordLength += uiSpan;
    }
if (uiLowerCount == 0)
    {
    /*
     *  the word is already all uppercase
     */
    return addToStringList(NULL, pszWord_in);
    }
else
    {
    /*
     *  convert word to all uppercase (possibly ambiguously)
     */
    char *      pszCapWord;
    char *      pszUpper;
    size_t      uiNum;
    int         iUpperLength;
    size_t      i;
    size_t      j;

    if (uiNumberAlternatives < 1)
        {
        reportError(ERROR_MSG,
                    "error getting uppercase equivalents for \"%s\"\n",
                    pszWord_in);
        return NULL;
        }
    if (uiNumberAlternatives > 500)
        {
        reportError(WARNING_MSG,
                   "%lu uppercase equivalents is too many: storing only 500\n",
                    uiNumberAlternatives);
        uiNumberAlternatives = 500;
        }
    pszCapWord = allocMemory(uiWordLength);
    for ( i = 0 ; i < uiNumberAlternatives ; ++i )
        {
        strcpy(pszCapWord, pszWord_in);
        uiSpan = 1;
        j = 0;
        for ( p = (unsigned char *)pszCapWord ; *p ; p += iLength )
            {
            iLength = matchLowercaseChar(p, pTextCtl_in);
            if (iLength != 0)
                {
                pUpperSet = convLowerToUpperSet(p, pTextCtl_in);
                uiNum = getStringListSize(pUpperSet);
                pszUpper = pUpperSet->pszString;
                if (uiNum > 1)
                    {
                    k = (i / uiSpan) % uiNum;
                    uiSpan *= uiNum;
                    for ( ps = pUpperSet ; ps ; ps = ps->pNext )
                        {
                        if (k == 0)
                            {
                            pszUpper = ps->pszString;
                            break;
                            }
                        --k;
                        }
                    }
                /*
                 *  replace the lowercase multibyte character with an
                 *  equivalent uppercase multibyte character
                 */
                iUpperLength = strlen(pszUpper);
                if (iUpperLength != iLength)
                    memmove(p + iUpperLength,
                            p + iLength,
                            strlen((char *)p + iLength) + 1);
                memcpy(p, pszUpper, iUpperLength);
                iLength = iUpperLength;
                }
            else
                {
                iLength = matchAlphaChar(p, pTextCtl_in);
                if (iLength == 0)
                    iLength = 1;
                }
            ++j;
            }
        pUpcaseList = addToStringList(pUpcaseList, pszCapWord);
        }
    freeMemory( pszCapWord );
    }
return pUpcaseList;
}

5.15.5 Source File

`myctype.c'

5.16 convUpperToLower

5.16.1 Syntax

#include "textctl.h"    /* or template.h or opaclib.h */

const unsigned char * convUpperToLower(const unsigned char * pszString_in,
                                       const TextControl *   pTextCtl_in);

5.16.2 Description

convUpperToLower checks whether the input string begins with a multibyte uppercase character. If so, it returns the (first) corresponding multibyte lowercase character.

The arguments to convUpperToLower are as follows:

pszString_in: points to a NUL-terminated character string.
pTextCtl_in: points to a data structure that contains orthographic information, including the mappings between lowercase and uppercase letters.

5.16.3 Return Value

a pointer to a NUL-terminated string containing the (primary) corresponding multibyte lowercase character, or NULL if the input string does not begin with a multibyte uppercase character. This may point to a static buffer that may be overwritten by the next call to convUpperToLower.

5.16.4 Example

#include "textctl.h"
...
static TextControl      sTextCtl_m;
static StringClass *    pStringClasses_m;
static char             szIntxFilename_m[100];
...
loadIntxCtlFile(szIntxFilename_m, ';', &sTextCtl_m, &pStringClasses_m);
...
unsigned char * downcaseString(unsigned char * pszString_in)
{
size_t                  iCharSize;
size_t                  iLCSize;
size_t                  iLowerLength = 0;   /* summed below; must start at 0 */
unsigned char *         p;
const unsigned char *   pLC;
unsigned char *         pszLower;
unsigned char *         q;

if (pszString_in == NULL)
    return NULL;
for ( p = pszString_in ; *p ; p += iCharSize )
    {
    if ((iCharSize = matchAlphaChar(p, &sTextCtl_m)) == 0)
        iCharSize = 1;
    pLC = convUpperToLower(p, &sTextCtl_m);
    if (pLC != NULL)
        iLowerLength += strlen((char *)pLC);
    else
        iLowerLength += iCharSize;
    }
pszLower = allocMemory(iLowerLength + 1);
for ( p = pszString_in, q = pszLower ; *p ; p += iCharSize )
    {
    if ((iCharSize = matchAlphaChar(p, &sTextCtl_m)) == 0)
        iCharSize = 1;
    pLC = convUpperToLower(p, &sTextCtl_m);
    if (pLC != NULL)
        {
        iLCSize = strlen((char *)pLC);
        memcpy(q, pLC, iLCSize);
        q += iLCSize;
        }
    else
        {
        memcpy(q, p, iCharSize);
        q += iCharSize;
        }
    }
pszLower[iLowerLength] = NUL;
return pszLower;
}

5.16.5 Source File

`myctype.c'

5.17 convUpperToLowerSet

5.17.1 Syntax

#include "textctl.h"    /* or template.h or opaclib.h */

const StringList * convUpperToLowerSet(const unsigned char * pszString_in,
                                       const TextControl *   pTextCtl_in);

5.17.2 Description

convUpperToLowerSet checks whether the input string begins with a multibyte uppercase character. If so, it returns the complete set of corresponding multibyte lowercase characters.

The arguments to convUpperToLowerSet are as follows:

pszString_in: points to a NUL-terminated character string.
pTextCtl_in: points to a data structure that contains orthographic information, including the mappings between lowercase and uppercase letters.

5.17.3 Return Value

a pointer to a list of NUL-terminated strings containing the corresponding multibyte lowercase characters, or NULL if the input string does not begin with a multibyte uppercase character. This may point to a static buffer that may be overwritten by the next call to convUpperToLowerSet.

5.17.4 Example

#include "textctl.h"
#include "rpterror.h"
...
StringList * downcaseWord(pszWord_in, pTextCtl_in)
char *              pszWord_in;
const TextControl * pTextCtl_in;
{
size_t              uiCharCount;
size_t              uiUpperCount;
size_t              uiNumberAlternatives;
size_t              uiSpan;
size_t              uiWordLength;
size_t              k;
int                 iLength;
unsigned char *     p;
StringList *        pDowncaseList = NULL;
const StringList *  pLowerSet;
const StringList *  ps;
/*
 *  count the number of multibyte characters in the string
 *  count the uppercase letters
 *  calculate the number of (ambiguous) downcase conversions
 *  calculate the maximum length of the downcased word
 */
uiCharCount = 0;
uiUpperCount = 0;
uiNumberAlternatives = 1;
uiWordLength = 1;           /* count the terminating NUL byte */
for ( p = (unsigned char *)pszWord_in ; *p != NUL ; p += iLength )
    {
    iLength = matchAlphaChar(p, pTextCtl_in);
    if (iLength == 0)
        iLength = 1;
    ++uiCharCount;
    if (matchUppercaseChar(p, pTextCtl_in) != 0)
        {
        ++uiUpperCount;
        pLowerSet = convUpperToLowerSet(p, pTextCtl_in);
        uiNumberAlternatives *= getStringListSize( pLowerSet );
        uiSpan = 0;
        for ( ps = pLowerSet ; ps ; ps = ps->pNext )
            {
            k = strlen( ps->pszString );
            if (k > uiSpan)
                uiSpan = k;
            }
        }
    else
        uiSpan = iLength;
    uiWordLength += uiSpan;
    }
if (uiUpperCount == 0)
    {
    /*
     *  the word is already all lowercase
     */
    return addToStringList(NULL, pszWord_in);
    }
else
    {
    /*
     *  convert word to all lowercase (possibly ambiguously)
     */
    char *      pszDecapWord;
    char *      pszLower;
    size_t      uiNum;
    int         iLowerLength;
    size_t      i;
    size_t      j;

    if (uiNumberAlternatives < 1)
        {
        reportError(ERROR_MSG,
                    "error getting lowercase equivalents for \"%s\"\n",
                    pszWord_in);
        return NULL;
        }
    if (uiNumberAlternatives > 500)
        {
        reportError(WARNING_MSG,
                   "%lu lowercase equivalents is too many: storing only 500\n",
                    uiNumberAlternatives);
        uiNumberAlternatives = 500;
        }
    pszDecapWord = allocMemory(uiWordLength);
    for ( i = 0 ; i < uiNumberAlternatives ; ++i )
        {
        strcpy(pszDecapWord, pszWord_in);
        uiSpan = 1;
        j = 0;
        for ( p = (unsigned char *)pszDecapWord ; *p ; p += iLength )
            {
            iLength = matchUppercaseChar(p, pTextCtl_in);
            if (iLength != 0)
                {
                pLowerSet = convUpperToLowerSet(p, pTextCtl_in);
                uiNum = getStringListSize(pLowerSet);
                pszLower = pLowerSet->pszString;
                if (uiNum > 1)
                    {
                    k = (i / uiSpan) % uiNum;
                    uiSpan *= uiNum;
                    for ( ps = pLowerSet ; ps ; ps = ps->pNext )
                        {
                        if (k == 0)
                            {
                            pszLower = ps->pszString;
                            break;
                            }
                        --k;
                        }
                    }
                /*
                 *  replace the uppercase multibyte character with an
                 *  equivalent lowercase multibyte character
                 */
                iLowerLength = strlen(pszLower);
                if (iLowerLength != iLength)
                    memmove(p + iLowerLength,
                            p + iLength,
                            strlen((char *)p + iLength) + 1);
                memcpy(p, pszLower, iLowerLength);
                iLength = iLowerLength;
                }
            else
                {
                iLength = matchAlphaChar(p, pTextCtl_in);
                if (iLength == 0)
                    iLength = 1;
                }
            ++j;
            }
        pDowncaseList = addToStringList(pDowncaseList, pszDecapWord);
        }
    freeMemory( pszDecapWord );
    }
return pDowncaseList;
}

5.17.5 Source File

`myctype.c'

5.18 decapitalizeWord

5.18.1 Syntax

#include "template.h"   /* or opaclib.h */

int decapitalizeWord(WordTemplate *      pWord_io,
                     const TextControl * pTextCtl_in);

5.18.2 Description

decapitalizeWord converts the input word to all lowercase (possibly ambiguously) and returns a capitalization flag:

0 (NOCAP): The input word had no uppercase letters.
1 (INITCAP): The input word had a single capital letter at the beginning.
2 (ALLCAP): The input word had all uppercase letters.
>4: The input word had a mixture of uppercase and lowercase letters.

After the conversion to all lowercase, any orthography changes stored in pTextCtl_in are applied.

The arguments to decapitalizeWord are as follows:

pWord_io: points to a data structure that contains the original word and receives the decapitalized word.
pTextCtl_in: points to a data structure that contains orthographic information.

5.18.3 Return Value

the capitalization flag for the word

5.18.4 Example

#include "template.h"   /* includes textctl.h */
...
WordTemplate * buildTemplate(
    char *        pszWord_in,
    TextControl * pTextCtl_in)
{
WordTemplate *  pTemplate;

if (pszWord_in == NULL)
    return NULL;
pTemplate = (WordTemplate *)allocMemory(sizeof(WordTemplate));
pTemplate->pszOrigWord = duplicateString( pszWord_in );
pTemplate->iCapital = decapitalizeWord( pTemplate, pTextCtl_in);
return pTemplate;
}

5.18.5 Source File

`textin.c'

5.19 displayNumberedMessage

5.19.1 Syntax

#include "rpterror.h"   /* or opaclib.h */

void displayNumberedMessage(const NumberedMessage * pMessage_in,
                            int                     bSilent_in,
                            int                     bShowWarnings_in,
                            FILE *                  pLogFP_in,
                            const char *            pszFilename_in,
                            unsigned                uiLineNumber_in,
                            ...);

5.19.2 Description

displayNumberedMessage writes a numbered error or warning message to the standard error output (screen), optionally writing it to a log file as well. For GUI programs, the programmer must write a different version of displayNumberedMessage to satisfy the link requirements of other functions in the OPAC library. This would typically display a message box or write to a message window.

The arguments to displayNumberedMessage are as follows:

pMessage_in: points to a NumberedMessage data structure that contains the message type, the message number, and the format string for the message.
bSilent_in: specifies that no screen output occurs if TRUE (nonzero).
bShowWarnings_in: specifies that warning messages (not just error messages) are displayed if TRUE (nonzero).
pLogFP_in: is a FILE pointer to an open log file, or is NULL.
pszFilename_in: points to the name of the input file in which the error occurred, or is NULL.
uiLineNumber_in: is the line number in the input file on which the error occurred, or is zero (0).
...: represents any number of additional arguments needed by the printf style format string given by pMessage_in.

5.19.3 Return Value

none

5.19.4 Example

#include <stdio.h>
#include "opaclib.h"    /* includes rpterror.h */
...
int     bSilent_g       = 0;
int     bShowWarnings_g = 1;
FILE *  pLogFP_g        = NULL;
...
static NumberedMessage sCannotOpen_m      = { ERROR_MSG,   100,
    "Cannot open %s file %s" };
static NumberedMessage sIgnoreRedundant_m = { WARNING_MSG, 101,
    "Ignoring all but first \\%s line" };
static char *   aszCodes_m[] = {
    "\\lexicon",
    "\\grammar",
    ...
    NULL
    };
...
FILE *          pControlFP;
char *          pszControlFile;
unsigned        uiLineNumber;
char *          pszLexFile;
char **         ppszField;
char *          p;
unsigned        i;
...
pControlFP = fopen(pszControlFile, "r");
if (pControlFP == (FILE *)NULL)
    {
    displayNumberedMessage(&sCannotOpen_m,
                           bSilent_g, bShowWarnings_g, pLogFP_g,
                           NULL, 0,
                           "control", pszControlFile);
    exit(1);
    }
uiLineNumber = 1;
while ((ppszField = readStdFormatField(pControlFP,
                                       aszCodes_m, NUL)) != NULL)
    {
    switch (**ppszField)
        {
        case 1:                 /* "\\lexicon" */
            if (pszLexFile != (char *)NULL)
                displayNumberedMessage(&sIgnoreRedundant_m,
                                       bSilent_g, bShowWarnings_g,
                                       pLogFP_g,
                                       pszControlFile, uiLineNumber,
                                       "lexicon");
            else
                {
                p = strtok(ppszField[0]+1, " \t\r\n\f\v");
                pszLexFile = buildAdjustedFilename(p,
                                                   pszControlFile,
                                                   ".lex");
                }
            break;
        ...
        }
    ...
    for ( i = 0 ; ppszField[i] ; ++i )
        ++uiLineNumber;
    }
...

5.19.5 Source File

`textin.c'

5.20 duplicateString

5.20.1 Syntax

#include "allocmem.h"   /* or opaclib.h */

char * duplicateString(const char * pszString_in);

5.20.2 Description

duplicateString creates a copy of an existing NUL-terminated character string. It calls allocMemory to get the memory to store the copy of the string. If pszString_in is NULL, then duplicateString returns NULL.

This is the same as the standard function strdup, except that it calls allocMemory instead of malloc.

duplicateString has one argument:

pszString_in: points to a NUL-terminated character string.
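
The behavior described above can be sketched in standalone C. This is only an approximation under stated assumptions: sketchDuplicateString is a hypothetical name, and it calls malloc directly, so its handling of memory exhaustion differs from that of the library allocator.

```c
#include <stdlib.h>
#include <string.h>

/* sketchDuplicateString is a hypothetical stand-in for duplicateString.
   It uses malloc rather than the library allocator, so it returns NULL
   if memory cannot be obtained. */
char * sketchDuplicateString(const char * pszString_in)
{
char *  pszCopy;
size_t  uiSize;

if (pszString_in == NULL)
    return NULL;                        /* NULL in, NULL out */
uiSize = strlen(pszString_in) + 1;      /* include the NUL byte */
pszCopy = (char *)malloc(uiSize);
if (pszCopy != NULL)
    memcpy(pszCopy, pszString_in, uiSize);
return pszCopy;
}
```

The copy returned by this sketch is released with free. A copy made by the library function would be released with freeMemory instead.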

5.20.3 Return Value

a pointer to the newly allocated and copied duplicate string

5.20.4 Example

#include "template.h"   /* includes textctl.h */
...
WordTemplate * buildTemplate(
    char *        pszWord_in,
    TextControl * pTextCtl_in)
{
WordTemplate *  pTemplate;

if (pszWord_in == NULL)
    return NULL;
pTemplate = (WordTemplate *)allocMemory(sizeof(WordTemplate));
pTemplate->pszOrigWord = duplicateString( pszWord_in );
pTemplate->iCapital = decapitalizeWord( pTemplate, pTextCtl_in);
return pTemplate;
}

5.20.5 Source File

`allocmem.c'

5.21 duplicateStringList

5.21.1 Syntax

#include "strlist.h"    /* or strclass.h or change.h or textctl.h
                           or template.h or opaclib.h */

StringList * duplicateStringList(const StringList * pList_in);

5.21.2 Description

duplicateStringList copies a list of strings to create another, identical list of strings. If pList_in is NULL, then duplicateStringList returns NULL.

duplicateStringList has one argument:

pList_in: points to a list of strings.

5.21.3 Return Value

a pointer to the new list of dynamically allocated strings

5.21.4 Example

#include "strlist.h"
...
StringList * pList1;
StringList * pList2;
...
pList2 = duplicateStringList(pList1);
...
freeStringList( pList2 );
pList2 = NULL;

5.21.5 Source File

`copy_sl.c'

5.22 equivalentStringLists

5.22.1 Syntax

#include "strlist.h"    /* or strclass.h or change.h or textctl.h
                           or template.h or opaclib.h */

int equivalentStringLists(const StringList * pFirst_in, 
                          const StringList * pSecond_in);

5.22.2 Description

equivalentStringLists tests whether or not two string lists contain the same strings. The strings do not have to be in the same order in the two lists. Duplicate strings in either list are immaterial.

The arguments to equivalentStringLists are as follows:

pFirst_in: points to a list of strings.
pSecond_in: points to another list of strings.
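
One way to picture this order-insensitive comparison is as two subset tests. The sketch below is an assumption about the approach, not the library source: EquivNode copies only the pNext and pszString fields that the examples in this manual use, and both function names are hypothetical.

```c
#include <string.h>

/* EquivNode mirrors only the two list fields used in this manual's
   examples (pNext and pszString); it is not the real StringList. */
typedef struct equiv_node {
    struct equiv_node * pNext;
    const char *        pszString;
    } EquivNode;

/* Return nonzero if every string in pSub_in also occurs in pAll_in. */
static int sketchIsSubset(const EquivNode * pSub_in,
                          const EquivNode * pAll_in)
{
const EquivNode * p;

for ( ; pSub_in ; pSub_in = pSub_in->pNext )
    {
    for ( p = pAll_in ; p ; p = p->pNext )
        {
        if (strcmp(p->pszString, pSub_in->pszString) == 0)
            break;
        }
    if (p == NULL)
        return 0;               /* string missing from pAll_in */
    }
return 1;
}

/* Equivalent means each list is a subset of the other, so neither
   order nor duplicates matter. */
int sketchEquivalentLists(const EquivNode * pFirst_in,
                          const EquivNode * pSecond_in)
{
return sketchIsSubset(pFirst_in, pSecond_in) &&
       sketchIsSubset(pSecond_in, pFirst_in);
}
```

With this definition the lists (x, x, y) and (y, x) compare as equivalent, while (x, x, y) and (x) do not.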

5.22.3 Return Value

nonzero (TRUE) if the lists are equivalent, otherwise zero (FALSE)

5.22.4 Example

#include "strlist.h"
...
StringList * pList1;
StringList * pList2;
...
if (equivalentStringLists(pList1, pList2))
    {
    ...
    }

5.22.5 Source File

`equiv_sl.c'

5.23 eraseCharsInString

5.23.1 Syntax

#include "opaclib.h"

char * eraseCharsInString(char *       pszString_io,
                          const char * pszEraseChars_in);

5.23.2 Description

eraseCharsInString erases any characters from pszEraseChars_in that are found in pszString_io, possibly shortening pszString_io as a side-effect.

The arguments to eraseCharsInString are as follows:

pszString_io: points to the input (and output) string.
pszEraseChars_in: points to the characters to erase from the input string.
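
The in-place erasure can be sketched with two pointers walking the same buffer. sketchEraseChars is a hypothetical name, and returning the string unchanged when either argument is NULL is an assumption of this sketch, not documented library behavior.

```c
#include <string.h>

/* sketchEraseChars is a hypothetical illustration of in-place erasure. */
char * sketchEraseChars(char *       pszString_io,
                        const char * pszEraseChars_in)
{
char *  pszFrom;        /* next character to examine */
char *  pszTo;          /* next position to keep a character in */

if ((pszString_io == NULL) || (pszEraseChars_in == NULL))
    return pszString_io;
for ( pszFrom = pszTo = pszString_io ; *pszFrom ; ++pszFrom )
    {
    /* copy down only the characters that are not in the erase set */
    if (strchr(pszEraseChars_in, *pszFrom) == NULL)
        *pszTo++ = *pszFrom;
    }
*pszTo = '\0';
return pszString_io;
}
```

Because the string is modified in place, the argument must be a writable buffer and not a string literal.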

5.23.3 Return Value

a pointer to the possibly modified string

5.23.4 Example

#include "opaclib.h"    /* includes allocmem.h */
...
static char szMarkers_m[] = "-=#";
...
static int get_score(pszMarkedWord_in)
const char *    pszMarkedWord_in;
{
char *  pszWord;
int     iScore = 0;

if (pszMarkedWord_in != NULL)
    {
    pszWord = eraseCharsInString(duplicateString(pszMarkedWord_in),
                                 szMarkers_m);
    ...
    freeMemory(pszWord);
    }
return iScore;
}

5.23.5 Source File

`erasecha.c'

5.24 eraseTrie

5.24.1 Syntax

#include "trie.h"       /* or opaclib.h */

void eraseTrie(Trie *  pTrieHead_io,
               void (* pfEraseInfo_in)(void * pList_io));

5.24.2 Description

eraseTrie walks through a trie, freeing all the memory allocated for the trie and for the information it stores.

The arguments to eraseTrie are as follows:

pTrieHead_io

points to the head of a trie.

pfEraseInfo_in

points to a function for erasing the stored information. The function has one argument:

pList_io: points to a data collection to erase, presumably by freeing memory.

The function does not return a value.

5.24.3 Return Value

none

5.24.4 Example

#include "trie.h"
#include "allocmem.h"
...
typedef struct lex_item {
    struct lex_item *   pLink;          /* link to next element */
    struct lex_item *   pNext;          /* link to next homograph */
    unsigned char *     pszForm;        /* lexical form (word) */
    unsigned char *     pszGloss;       /* lexical gloss */
    unsigned short      uiCategory;     /* lexical category */
    } LexItem;
...
Trie *          pLexicon_g;
unsigned long   uiLexiconCount_g;
...
static void erase_lex_item(void * pList)
{
LexItem *       pItem;
LexItem *       pNextItem;

for ( pItem = (LexItem *)pList ; pItem ; pItem = pNextItem )
    {
    pNextItem = pItem->pLink;
    if (pItem->pszForm != NULL)
        freeMemory(pItem->pszForm);
    if (pItem->pszGloss != NULL)
        freeMemory(pItem->pszGloss);
    freeMemory(pItem);
    }
}

void free_lexicon()
{
if (pLexicon_g != NULL)
    {
    eraseTrie(pLexicon_g, erase_lex_item);
    pLexicon_g = NULL;
    }
uiLexiconCount_g = 0L;
}

5.24.5 Source File

`trie.c'

5.25 exitSafely

5.25.1 Syntax

#include "opaclib.h"

int exitSafely(int iCode_in);

5.25.2 Description

exitSafely replaces exit. When compiled for Microsoft Windows, the program should define exitSafely to not call exit because Windows doesn't like that very much!

exitSafely has one argument:

iCode_in: is the program status code to return from the program.

5.25.3 Return Value

none, but it must be defined as returning int to keep everyone happy

5.25.4 Example

#include <stdlib.h>
#include <string.h>     /* declares strdup on most systems */
#include "opaclib.h"
...
char *  pszCopy;
...
pszCopy = strdup("This is a test!");
if (pszCopy == NULL)
    {
    ...
    exitSafely(2);
    }

5.25.5 Source File

`safeexit.c'

5.26 fcloseWithErrorCheck

5.26.1 Syntax

#include "opaclib.h"

void fcloseWithErrorCheck(FILE *       pOutputFP_in,
                          const char * pszFilename_in);

5.26.2 Description

fcloseWithErrorCheck checks the output file for write errors and then closes it. If an error is detected, it is reported using reportError.

The arguments to fcloseWithErrorCheck are as follows:

pOutputFP_in: is an output FILE pointer.
pszFilename_in: points to the name of the output file.
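
A rough sketch of the check-then-close pattern follows. It makes several assumptions: sketchCloseChecked is a hypothetical name, it reports problems with fprintf to stderr instead of calling reportError, and it returns a status value only so that the sketch can be tested (the library function returns nothing).

```c
#include <stdio.h>

/* sketchCloseChecked is a hypothetical stand-in.  It returns nonzero if
   a write error was pending or if closing the file failed. */
int sketchCloseChecked(FILE *       pOutputFP_in,
                       const char * pszFilename_in)
{
int     bError;

if (pOutputFP_in == NULL)
    return 0;                           /* nothing to close */
bError = ferror(pOutputFP_in) != 0;     /* error from an earlier write? */
if (fclose(pOutputFP_in) == EOF)        /* flushing can also fail */
    bError = 1;
if (bError)
    fprintf(stderr, "Error writing output file %s\n",
            pszFilename_in ? pszFilename_in : "(unknown)");
return bError;
}
```

Checking the result of fclose matters because buffered data is written out at that point, so a full disk may only show up when the file is closed.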

5.26.3 Return Value

none

5.26.4 Example

#include <stdio.h>
#include "opaclib.h"
...
FILE * pOutput;
char * pszFilename;
...
pOutput = fopen(pszFilename, "w");
if (pOutput != NULL)
    {
    ...
    fcloseWithErrorCheck(pOutput, pszFilename);
    pOutput = NULL;
    }

5.26.5 Source File

`errcheck.c'

5.27 findDataInTrie

5.27.1 Syntax

#include "trie.h"       /* or opaclib.h */

void * findDataInTrie(const Trie * pTrieHead_in,
                      const char * pszKey_in);

5.27.2 Description

findDataInTrie searches the trie for information stored under the given key. The pointer returned is not guaranteed to point only to the desired information unless the length of the key is less than the maximum depth of the trie. You may need to scan over the list (or array) to get exactly what you want.

The arguments to findDataInTrie are as follows:

pTrieHead_in: points to the head of a trie.
pszKey_in: points to the key string.

5.27.3 Return Value

a pointer to the generic information found in the trie, or NULL if the search fails

5.27.4 Example

#include <string.h>
#include "trie.h"
...
typedef struct lex_item {
    struct lex_item *   pLink;          /* link to next element */
    struct lex_item *   pNext;          /* link to next homograph */
    unsigned char *     pszForm;        /* lexical form (word) */
    unsigned char *     pszGloss;       /* lexical gloss */
    unsigned short      uiCategory;     /* lexical category */
    } LexItem;
...
Trie *          pLexicon_g;
...
LexItem * find_entries(unsigned char * pszWord_in)
{
LexItem *       pLex;

for (   pLex = findDataInTrie(pLexicon_g, pszWord_in) ;
        pLex ;
        pLex = pLex->pLink )
    {
    if (strcmp(pLex->pszForm, pszWord_in) == 0)
        {
        /*
         *  since add_lex_item() links the homographs together,
         *  this points to a list containing only the homographs
         */
        return pLex;
        }
    }
return NULL;
}

5.27.5 Source File

`trie.c'

5.28 findStringClass

5.28.1 Syntax

#include "strclass.h"   /* or change.h or textctl.h or template.h
                           or opaclib.h */

StringClass * findStringClass(const char *        pszName_in,
                              const StringClass * pClasses_in);

5.28.2 Description

findStringClass searches a list of string classes for a specific string class by name.

The arguments to findStringClass are as follows:

pszName_in: points to the name of the desired string class.
pClasses_in: points to a collection of string classes to search.

5.28.3 Return Value

a pointer to the string class found, or NULL if not found

5.28.4 Example

#include "strclass.h"
#include "rpterror.h"
...
static StringClass *    pClasses_m = NULL;
...
StringClass *   pClass;
char *          pszClassName;
...
pClass = findStringClass( pszClassName, pClasses_m);
if (pClass == NULL)
    reportError(WARNING_MSG, "Undefined class %s\n", pszClassName);
...

5.28.5 Source File

`strcla.c'

5.29 fitAllocStringExactly

5.29.1 Syntax

#include "allocmem.h"   /* or opaclib.h */

char * fitAllocStringExactly(char * pszString_in);

5.29.2 Description

fitAllocStringExactly shrinks the allocated buffer to exactly fit the string. The program is aborted with an error message if it somehow runs out of memory. (See section 5.8 allocMemory, for details about this error message.)

fitAllocStringExactly has one argument:

pszString_in: points to a string in a possibly overlarge allocated buffer.
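
The shrinking step amounts to a realloc to the length of the string plus its terminating NUL. The following sketch assumes plain realloc in place of the library allocator, and sketchFitExactly is a hypothetical name. Unlike the library function, it keeps the original block if realloc fails; it does not abort the program.

```c
#include <stdlib.h>
#include <string.h>

/* sketchFitExactly is a hypothetical illustration using realloc. */
char * sketchFitExactly(char * pszString_in)
{
char *  pszFitted;

if (pszString_in == NULL)
    return NULL;
/* ask for exactly the string length plus the NUL byte */
pszFitted = (char *)realloc(pszString_in, strlen(pszString_in) + 1);
/* on failure the original block is still valid, so keep using it */
return (pszFitted != NULL) ? pszFitted : pszString_in;
}
```

The caller must use the returned pointer from then on, since realloc is allowed to move the block.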

5.29.3 Return Value

a pointer to the (possibly) reallocated block

5.29.4 Example

#include <stdio.h>
#include <string.h>
#include "allocmem.h"
...
char * read_line(FILE * pInputFP_in)
{
char *  pszBuffer;
size_t  uiBufferSize = 500;
size_t  uiLineLength;

if ((pInputFP_in == NULL) || feof(pInputFP_in))
    return NULL;
pszBuffer = allocMemory(uiBufferSize);
if (fgets(pszBuffer, uiBufferSize, pInputFP_in) == NULL)
    {
    freeMemory(pszBuffer);
    return NULL;
    }
while (strchr(pszBuffer, '\n') == NULL)
    {
    uiBufferSize += 500;
    pszBuffer = reallocMemory(pszBuffer, uiBufferSize);
    uiLineLength = strlen(pszBuffer);
    if (fgets(pszBuffer + uiLineLength,
              uiBufferSize - uiLineLength, pInputFP_in) == NULL)
        break;
    }
return fitAllocStringExactly( pszBuffer );
}

5.29.5 Source File

`allocmem.c'

5.30 fixSynthesizedWord

5.30.1 Syntax

#include "opaclib.h"

void fixSynthesizedWord(WordTemplate *      pTemplate_io,
                        const TextControl * pTextCtl_in);

5.30.2 Description

fixSynthesizedWord applies the output orthography changes and recapitalization to the list of synthesized wordforms. The list is updated to reflect these changes, and to minimize any ensuing ambiguity.

The arguments to fixSynthesizedWord are as follows:

pTemplate_io: points to a data structure that contains the (possibly ambiguous) word synthesis list and capitalization information.
pTextCtl_in: points to a data structure that contains orthographic information.

5.30.3 Return Value

none

5.30.4 Example

#include "template.h"
...
TextControl     sTextControl_g;
...
FILE *          pInputFP;
FILE *          pOutputFP;
WordTemplate *  pWord;
...
for (;;)
    {
    pWord = readTemplateFromAnalysis(pInputFP, &sTextControl_g);
    if (pWord == NULL)
        break;
    pWord->pNewWords = synthesize_word(pWord->pAnalyses,
                                       &sTextControl_g);
    fixSynthesizedWord(pWord, &sTextControl_g);
    writeTextFromTemplate( pOutputFP, pWord, &sTextControl_g);
    freeWordTemplate( pWord );
    }

5.30.5 Source File

`textout.c'

5.31 fopenAlways

5.31.1 Syntax

#include "opaclib.h"

FILE * fopenAlways(char *       pszFilename_io,
                   const char * pszMode_in);

5.31.2 Description

fopenAlways opens a file, prompting the user if necessary and retrying until successful. If it is not NULL, pszFilename_io is updated to contain the name of the file actually opened. fopenAlways uses fopen to open the file, and repeatedly prompts the user for a filename if fopen fails.

The buffer pointed to by pszFilename_io must be (at least) FILENAME_MAX bytes long. If FILENAME_MAX is not defined by `stdio.h', then it is assumed to be 128.

The arguments to fopenAlways are as follows:

pszFilename_io: points to a buffer for holding the name of the file, or is NULL.
pszMode_in: points to an fopen mode string (usually "r" or "w").

5.31.3 Return Value

a valid FILE pointer

5.31.4 Example

#include <stdio.h>
#include "opaclib.h"
...
FILE * pInputFP;
char   szFilename[FILENAME_MAX];
...
pInputFP = fopenAlways(szFilename, "r");
...
fclose(pInputFP);
pInputFP = NULL;

5.31.5 Source File

`ufopen.c'

5.32 freeChangeList

5.32.1 Syntax

#include "change.h"     /* or textctl.h or template.h or opaclib.h */

void freeChangeList(Change * pList_io);

5.32.2 Description

freeChangeList frees the memory allocated for a list of consistent change structures.

freeChangeList has one argument:

pList_io: points to a list of consistent change structures.

5.32.3 Return Value

none

5.32.4 Example

#include "change.h"
...
Change * pChangeList_g;
...
void add_change(char * pszChange_in)
{
Change * pTail;
if (pChangeList_g == NULL)
    pChangeList_g = parseChangeString( pszChange_in );
else
    {
    for (pTail = pChangeList_g ; pTail->pNext ; pTail = pTail->pNext)
        ;
    pTail->pNext = parseChangeString( pszChange_in );
    }
}
...
freeChangeList( pChangeList_g );
pChangeList_g = NULL;

5.32.5 Source File

`change.c'

5.33 freeCodeTable

5.33.1 Syntax

#include "record.h"

void freeCodeTable(CodeTable * pCodeTable_io);

5.33.2 Description

freeCodeTable frees the memory allocated for a CodeTable data structure.

freeCodeTable has only one argument:

pCodeTable_io: points to a CodeTable data structure that contains information that is no longer needed.

5.33.3 Return Value

none

5.33.4 Example

#include "record.h"
#include "ample.h"

AmpleData sAmpleData_g;
char szCodesFilename_g[100];
char szDictFilename_g[100];
...
loadAmpleDictCodeTables(szCodesFilename_g, &sAmpleData_g, FALSE);
...
loadAmpleDictionary(szDictFilename_g, PFX, &sAmpleData_g);
freeCodeTable( sAmpleData_g.pPrefixTable );
sAmpleData_g.pPrefixTable = NULL;

5.33.5 Source File

`free_ct.c'

5.34 freeMemory

5.34.1 Syntax

#include "allocmem.h"   /* or opaclib.h */

void freeMemory(void * pBlock_io);

5.34.2 Description

freeMemory provides a "safe" interface to free. It ignores NULL as an argument. (But passing NULL is still poor practice!) This is the only protection added to free: passing random memory addresses to freeMemory, or passing the same address twice, will result in memory corruption and program crashes!

freeMemory has one argument:

pBlock_io: points to a dynamically allocated block of memory to deallocate.

5.34.3 Return Value

none

5.34.4 Example

#include <stdio.h>
#include "allocmem.h"
...
char * read_line(FILE * pInputFP_in)
{
char *  pszBuffer;
size_t  uiBufferSize = 500;
size_t  uiLineLength;

if ((pInputFP_in == NULL) || feof(pInputFP_in))
    return NULL;
pszBuffer = allocMemory(uiBufferSize);
if (fgets(pszBuffer, uiBufferSize, pInputFP_in) == NULL)
    {
    freeMemory(pszBuffer);
    return NULL;
    }
return pszBuffer;
}

5.34.5 Source File

`allocmem.c'

5.35 freeStringClasses

5.35.1 Syntax

#include "strclass.h"   /* or change.h or textctl.h or template.h
                           or opaclib.h */

void freeStringClasses(StringClass * pClasses_io);

5.35.2 Description

freeStringClasses frees the memory allocated for the list of string classes.

freeStringClasses has one argument:

pClasses_io: points to a list of string classes.

5.35.3 Return Value

none

5.35.4 Example

#include "change.h"     /* includes strclass.h */
...
static Change *         pChanges_m;
static StringClass *    pClasses_m;
...
void free_change_info()
{
freeChangeList( pChanges_m );
freeStringClasses( pClasses_m );
pChanges_m = NULL;
pClasses_m = NULL;
}

5.35.5 Source File

`strcla.c'

5.36 freeStringList

5.36.1 Syntax

#include "strlist.h"    /* or strclass.h or change.h or textctl.h
                           or template.h or opaclib.h */

void freeStringList(StringList * pList_io);

5.36.2 Description

freeStringList deletes a list of strings, freeing all the memory used by the list of strings.

freeStringList has one argument:

pList_io: points to a list of strings.

5.36.3 Return Value

none

5.36.4 Example

#include "strlist.h"
...
StringList * pNames_g;
...
freeStringList(pNames_g);
pNames_g = NULL;
...

5.36.5 Source File

`free_sl.c'

5.37 freeWordAnalysisList

5.37.1 Syntax

#include "template.h"

void freeWordAnalysisList(WordAnalysis * pAnalyses_io);

5.37.2 Description

freeWordAnalysisList frees the memory allocated for a list of WordAnalysis data structures.

freeWordAnalysisList has one argument:

pAnalyses_io: points to a list of WordAnalysis data structures.

5.37.3 Return Value

none

5.37.4 Example

#include "template.h"
...
WordTemplate *  pWord;
...
if (pWord->pAnalyses != NULL)
    freeWordAnalysisList(pWord->pAnalyses);
...

5.37.5 Source File

`wordanal.c'

5.38 freeWordTemplate

5.38.1 Syntax

#include "template.h"   /* or opaclib.h */

void freeWordTemplate(WordTemplate * pWord_io);

5.38.2 Description

freeWordTemplate frees everything in a WordTemplate data structure, including the structure itself.

freeWordTemplate has one argument:

pWord_io: points to a WordTemplate data structure to free.

5.38.3 Return Value

none

5.38.4 Example

#include "template.h"
...
TextControl sTextCtl_g;
...
WordAnalysis * merge_analyses(
    WordAnalysis *  pList_in,
    WordAnalysis *  pAnal_in)
{
...
}
...
void process(
    FILE * pInputFP_in,
    FILE * pOutputFP_in)
{
WordTemplate *  pWord;
WordAnalysis *  pAnal;
unsigned        i;
unsigned        uiAmbiguityCount;
unsigned long   uiWordCount;

for ( uiWordCount = 0L ;; )
    {
    pWord = readTemplateFromText(pInputFP_in, &sTextCtl_g);
    if (pWord == NULL)
        break;
    uiAmbiguityCount = 0;
    if (pWord->paWord != NULL)
        {
        for ( i = 0 ; pWord->paWord[i] ; ++i )
            {
            pAnal = analyze(pWord->paWord[i]);
            pWord->pAnalyses = merge_analyses(pWord->pAnalyses,
                                              pAnal);
            }
        for (pAnal = pWord->pAnalyses ; pAnal ; pAnal = pAnal->pNext)
            ++uiAmbiguityCount;
        }
    uiWordCount = showAmbiguousProgress(uiAmbiguityCount,
                                        uiWordCount);
    writeTemplate(pOutputFP_in, NULL, pWord, &sTextCtl_g);
    freeWordTemplate(pWord);
    }
}

5.38.5 Source File

`free_wt.c'

5.39 getAndClearAllocMemorySum

5.39.1 Syntax

#include "allocmem.h"   /* or opaclib.h */

unsigned long getAndClearAllocMemorySum(void);

5.39.2 Description

getAndClearAllocMemorySum returns the amount of memory used by allocMemory calls since the last call to getAndClearAllocMemorySum. It does not account for calls to freeMemory, which greatly reduces its accuracy.

getAndClearAllocMemorySum does not have any arguments.
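
The read-and-reset behavior can be pictured with a running total that the allocator updates. This is a sketch under assumptions, not the library source: sketchAlloc, sketchGetAndClearSum, and uiSketchSum_m are hypothetical names, and malloc stands in for the library allocator.

```c
#include <stdlib.h>

static unsigned long uiSketchSum_m = 0L;   /* bytes requested so far */

/* Hypothetical allocator wrapper that records each request. */
void * sketchAlloc(size_t uiSize_in)
{
void *  pBlock = malloc(uiSize_in);

if (pBlock != NULL)
    uiSketchSum_m += (unsigned long)uiSize_in;
return pBlock;
}

/* Return the running total and reset it to zero in one step. */
unsigned long sketchGetAndClearSum(void)
{
unsigned long   uiSum = uiSketchSum_m;

uiSketchSum_m = 0L;
return uiSum;
}
```

As in the description above, nothing is subtracted when a block is freed, so the total measures memory requested, not memory currently in use.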

5.39.3 Return Value

the number of bytes of memory requested by allocMemory calls since the last call to getAndClearAllocMemorySum

5.39.4 Example

#include <stdio.h>
#include "allocmem.h"
...
getAndClearAllocMemorySum();    /* reset the counter */
...
p = allocMemory(500);
...
p = duplicateString("this is a test");
...
printf("%lu bytes allocated recently\n", getAndClearAllocMemorySum());

5.39.5 Source File

`allocmem.c'

5.40 getChangeQuote

5.40.1 Syntax

#include "change.h"     /* or textctl.h or template.h or opaclib.h */

int getChangeQuote(const char * pszMatch_in,
                   const char * pszReplace_in);

5.40.2 Description

getChangeQuote finds a suitable "quote" character that is not used in either input string.

The arguments to getChangeQuote are as follows:

pszMatch_in: points to the string to change from.
pszReplace_in: points to the string to change to.
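
The search for an unused quote character might look like the sketch below. The candidate characters and their order are assumptions made for illustration, as is the return value of zero when every candidate is already in use. sketchChangeQuote is a hypothetical name.

```c
#include <string.h>

/* sketchChangeQuote is a hypothetical illustration.  The candidate
   list is an assumption, not the set used by the library. */
int sketchChangeQuote(const char * pszMatch_in,
                      const char * pszReplace_in)
{
static const char szCandidates[] = "\"'/|!#";
const char *      p;

if (pszMatch_in == NULL)
    pszMatch_in = "";
if (pszReplace_in == NULL)
    pszReplace_in = "";
for ( p = szCandidates ; *p ; ++p )
    {
    /* accept the first candidate that appears in neither string */
    if (    (strchr(pszMatch_in, *p) == NULL) &&
            (strchr(pszReplace_in, *p) == NULL) )
        return *p;
    }
return 0;               /* every candidate is in use */
}
```

For the strings ab and cd this sketch picks the double quote. If the match string contains a double quote and the replace string contains a single quote, it falls through to the slash.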

5.40.3 Return Value

a character suitable for quoting the match and replace strings

5.40.4 Example

#include <stdio.h>
#include <string.h>
#include "change.h"
#include "allocmem.h"

char * composeChangeString(pszMatch_in, pszReplace_in, pszEnvir_in)
const char *    pszMatch_in;
const char *    pszReplace_in;
const char *    pszEnvir_in;
{
char *  pszChange;
size_t  uiLength;
char    cQuote;

if ((pszMatch_in == NULL) && (pszReplace_in == NULL))
    return NULL;
if (pszMatch_in == NULL)
    pszMatch_in = "";
if (pszReplace_in == NULL)
    pszReplace_in = "";

uiLength = strlen( pszMatch_in ) + strlen( pszReplace_in ) + 6;
if ((pszEnvir_in != NULL) && (*pszEnvir_in != '\0'))
    uiLength += strlen( pszEnvir_in ) + 1;
pszChange = allocMemory(uiLength);

cQuote = getChangeQuote(pszMatch_in, pszReplace_in);

sprintf(pszChange, "%c%s%c %c%s%c",
        cQuote, pszMatch_in, cQuote, cQuote, pszReplace_in, cQuote);
if ((pszEnvir_in != NULL) && (*pszEnvir_in != '\0'))
    strcat(strcat(pszChange, " "), pszEnvir_in);

return pszChange;
}

5.40.5 Source File

`change.c'

5.41 getStringListSize

5.41.1 Syntax

#include "strlist.h"    /* or strclass.h or change.h or textctl.h
                           or template.h or opaclib.h */

unsigned getStringListSize(const StringList * pList_in);

5.41.2 Description

getStringListSize counts the number of strings stored in the list. It does not check for duplicate strings or for NULL string pointers, just for the total number of data structures linked together.

getStringListSize has one argument:

pList_in: points to a list of strings.
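
Counting the linked data structures without inspecting their contents can be sketched as follows. SizeNode copies only the pNext and pszString fields that the examples in this manual use, and sketchListSize is a hypothetical name.

```c
#include <stddef.h>

/* SizeNode mirrors only the list fields used in this manual's
   examples; it is not the real StringList. */
typedef struct size_node {
    struct size_node * pNext;
    const char *       pszString;
    } SizeNode;

/* Count the nodes; the strings themselves are never examined. */
unsigned sketchListSize(const SizeNode * pList_in)
{
unsigned        uiCount = 0;

for ( ; pList_in ; pList_in = pList_in->pNext )
    ++uiCount;
return uiCount;
}
```

A list holding the same string twice and one node with a NULL string pointer therefore has a size of three in this sketch.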

5.41.3 Return Value

the number of strings in the list

5.41.4 Example

#include <stdio.h>
#include "strlist.h"
...
void writeAmbigWords(pList_in, cAmbig_in, pOutputFP_in)
const StringList * pList_in;
int                cAmbig_in;
FILE *             pOutputFP_in;
{
char    szAmbig[2];

if (pList_in == NULL)
    fprintf(pOutputFP_in, "%c0%c%c", cAmbig_in, cAmbig_in, cAmbig_in);
else if (pList_in->pNext)
    {
    fprintf(pOutputFP_in, "%c%u%c",
            cAmbig_in, getStringListSize(pList_in), cAmbig_in );
    szAmbig[0] = cAmbig_in;
    szAmbig[1] = '\0';
    writeStringList( pList_in, szAmbig, pOutputFP_in );
    fprintf(pOutputFP_in, "%c", cAmbig_in);
    }
else
    fputs(pList_in->pszString, pOutputFP_in);
}

5.41.5 Source File

`size_sl.c'

5.42 identicalStringLists

5.42.1 Syntax

#include "strlist.h"    /* or strclass.h or change.h or textctl.h
                           or template.h or opaclib.h */

int identicalStringLists(const StringList * pFirstList_in,
                         const StringList * pSecondList_in);

5.42.2 Description

identicalStringLists checks whether or not two lists of strings are identical, that is, whether they have the same strings in the same order.

The arguments to identicalStringLists are as follows:

pFirstList_in: points to a list of strings.
pSecondList_in: points to another list of strings.
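
A same-strings-in-the-same-order test can be sketched by walking both lists in step. As with the other list sketches, IdentNode is a simplified stand-in for StringList and sketchIdenticalLists is a hypothetical name. Treating two empty lists as identical is an assumption of this sketch.

```c
#include <string.h>

/* IdentNode mirrors only the list fields used in this manual's
   examples; it is not the real StringList. */
typedef struct ident_node {
    struct ident_node * pNext;
    const char *        pszString;
    } IdentNode;

int sketchIdenticalLists(const IdentNode * pFirstList_in,
                         const IdentNode * pSecondList_in)
{
while (pFirstList_in && pSecondList_in)
    {
    /* the strings must match position by position */
    if (strcmp(pFirstList_in->pszString,
               pSecondList_in->pszString) != 0)
        return 0;
    pFirstList_in  = pFirstList_in->pNext;
    pSecondList_in = pSecondList_in->pNext;
    }
/* identical only if both lists ended together */
return (pFirstList_in == NULL) && (pSecondList_in == NULL);
}
```

The lists (x, y) and (y, x) fail this test because the strings are compared position by position.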

5.42.3 Return Value

nonzero (TRUE) if the lists are identical, otherwise zero (FALSE)

5.42.4 Example

#include "strlist.h"
...
StringList * pList1;
StringList * pList2;
...
if (identicalStringLists(pList1, pList2))
    {
    ...
    }

5.42.5 Source File

`equal_sl.c'

5.43 isMemberOfStringList

5.43.1 Syntax

#include "strlist.h"    /* or strclass.h or change.h or textctl.h
                           or template.h or opaclib.h */

int isMemberOfStringList(const StringList * pList_in,
                         const char *       pszString_in);

5.43.2 Description

isMemberOfStringList checks whether a string is stored in a list of strings.

The arguments to isMemberOfStringList are as follows:

pList_in: points to a list of strings.
pszString_in: points to the string to be checked.

5.43.3 Return Value

nonzero (TRUE) if the string is found in the list, otherwise zero (FALSE)

5.43.4 Example

#include "strlist.h"
...
static StringList *  pFiles_m = NULL;
...
void processFileOnce(const char * pszFile_in)
{
if ((pszFile_in != NULL) && !isMemberOfStringList(pFiles_m, pszFile_in))
    {
    pFiles_m = mergeIntoStringList(pFiles_m, pszFile_in);
    ...
    }
}

5.43.5 Source File

`membr_sl.c'

5.44 isolateWord

5.44.1 Syntax

#include "opaclib.h"

char * isolateWord(char * pszLine_io);

5.44.2 Description

isolateWord isolates the "word" pointed to by its argument by replacing the first whitespace character following the word with a NUL character. It then steps the pointer to the beginning of the next "word" in the input string.

isolateWord skips over any leading whitespace in the input string before trying to isolate a "word".

isolateWord has one argument:

pszLine_io: points to a NUL-terminated character string.
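
The scanning steps described above can be sketched with the standard isspace test. sketchIsolateWord is a hypothetical name, and the use of isspace to define whitespace, as well as the NULL check, are assumptions of this sketch.

```c
#include <ctype.h>
#include <stddef.h>

/* sketchIsolateWord is a hypothetical illustration of the scan. */
char * sketchIsolateWord(char * pszLine_io)
{
char *  p = pszLine_io;

if (p == NULL)
    return NULL;
while (isspace((unsigned char)*p))          /* skip leading whitespace */
    ++p;
while (*p && !isspace((unsigned char)*p))   /* skip over the word */
    ++p;
if (*p)
    {
    *p++ = '\0';                            /* terminate the word */
    while (isspace((unsigned char)*p))      /* step to the next word */
        ++p;
    }
return p;
}
```

Calling the sketch repeatedly on its own return value walks through a line one word at a time, ending when the returned pointer addresses the NUL at the end of the string.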

5.44.3 Return Value

a pointer to the first character of the next following word, which may be the NUL character at the end of the input string

5.44.4 Example

#include <string.h>
#include "opaclib.h"    /* includes strlist.h */
...
StringList * pTraceMorphs_m = NULL;
...
void addTraceMorphs(char * pszLine_in)
{
char *  pszMorph;
char *  pszEnd;

if (pszLine_in == NULL)
    return;
for (   pszMorph = pszLine_in + strspn(pszLine_in, " \r\n\t\f\v");
        *pszMorph ;
        pszMorph = pszEnd )
    {
    pszEnd = isolateWord( pszMorph );   /* isolate the morpheme */
    if (strcmp(pszMorph, "0") == 0)     /* If 0, put in NUL */
        *pszMorph = NUL;
    pTraceMorphs_m = mergeIntoStringList(pTraceMorphs_m, pszMorph);
    }
}

5.44.5 Source File

`isolatew.c'

5.45 isStringClassMember

5.45.1 Syntax

#include "strclass.h"   /* or change.h or textctl.h or template.h
                           or opaclib.h */

int isStringClassMember(const char *        pszString_in,
                        const StringClass * pClass_in);

5.45.2 Description

isStringClassMember searches a string class for a specific string.

The arguments to isStringClassMember are as follows:

pszString_in: points to the string to look for.
pClass_in: points to a string class.

5.45.3 Return Value

nonzero (TRUE) if the string is found in the class, otherwise zero (FALSE)

5.45.4 Example

#include "strclass.h"
...
static StringClass *    pClasses_m;
...
int isClassMember(const char * pszString_in,
                  const char * pszClassName_in)
{
StringClass *   pClass;

pClass = findStringClass(pszClassName_in, pClasses_m);
if (pClass == NULL)
    return 0;
return isStringClassMember(pszString_in, pClass);
}

5.45.5 Source File

`strcla.c'

5.46 loadIntxCtlFile

5.46.1 Syntax

#include "textctl.h"    /* or template.h or opaclib.h */

int loadIntxCtlFile(const char *   pszFilename_in, 
                    int            cComment_in,
                    TextControl *  pTextCtl_out,
                    StringClass ** ppStringClasses_io);

5.46.2 Description

loadIntxCtlFile loads a text input control file into memory. This is a standard format file containing one data record with the following fields (not necessarily in this order):

\ambig: defines the character used to mark ambiguities in the output after processing. (This does not really belong in a "text input" control file, but exists for historical reasons and is kept for compatibility.) The \ambig field is optional, and may occur only once.
\barchar: defines the character used to start a short formatting command that consists of this character and the immediately following character. Its name comes from the use of the vertical bar character (|) in the S.I.L. Manuscripter program in the early 1980's. The \barchar field is optional, and may occur only once. An empty field disables this feature.
\barcodes: defines the characters allowed to follow the \barchar character to form formatting commands. Whitespace (spaces, tabs, or newlines) in this field is optional. The \barcodes field is optional, and may occur any number of times. Its effect is cumulative.
\ch: defines an input orthography change to apply to words after they have been decapitalized, but before any other processing takes place. A change consists of two or three parts, in this order: a match string, a replace string, and an optional environment. The match string and replace string must be quoted by some character that does not appear in either string. (ASCII single quotes and double quotes are most often used for this purpose.) The syntax of the environment is too complicated to discuss here: see Weber 1988 (pages 68-74, 82-83, and 86-90) for details. The \ch field is optional, and may occur any number of times. An ordered list of consistent changes is built by the function. Each change is applied to each input word as many times as necessary before the next change is applied.
\dsc: defines the character used to segment words in the output after processing. This is typically for dividing words into morphemes. (This does not really belong in a "text input" control file, but exists for historical reasons and is kept for compatibility.) The \dsc field is optional, and may occur only once.
\excl: specifies one or more "fields" to exclude from processing in the input file. Fields in the input file are marked by formatting commands such as those defined by the \format field in the text input control file. The \excl field lists one or more field codes (formatting commands) complete with the leading \format character. Field codes are separated by whitespace (spaces, tabs, or newlines). The \excl field is optional, and may occur any number of times. Its effect is cumulative. If any \excl fields occur, then no \incl fields are allowed, and all fields in the input file that are not explicitly listed in a \excl field will be processed.
\format: defines the character used to start a formatting command in the input text. The formatting command is assumed to consist of this character and all following contiguous nonwhitespace characters. The \format field is optional, and may occur only once.
\incl: specifies one or more "fields" to include in processing in the input file. Fields in the input file are marked by formatting commands such as those defined by the \format field in the text input control file. The \incl field lists one or more field codes (formatting commands) complete with the leading \format character. Field codes are separated by whitespace (spaces, tabs, or newlines). The \incl field is optional, and may occur any number of times. Its effect is cumulative. If any \incl fields occur, then no \excl fields are allowed, and only those fields in the input file that are explicitly listed in a \incl field will be processed.
\luwfc: defines one or more "word formation characters" that have distinct lowercase and uppercase forms. The lowercase form is given first and must be followed by its uppercase form. The functions that use this information allow several lowercase characters to map onto a single uppercase character, and one lowercase character to map onto several uppercase characters. Whitespace (spaces, tabs, or newlines) in this field is optional. The \luwfc field is optional, and may occur any number of times. Its effect is cumulative. For lowercase and uppercase forms that are represented by two or more adjacent characters (bytes), use the \luwfcs field described below.
\luwfcs: defines one or more "word formation character multigraphs" that have distinct lowercase and uppercase forms. The lowercase form is given first and must be followed by its uppercase form. The functions that use this information allow several lowercase character multigraphs to map onto a single uppercase character multigraph, and one lowercase character multigraph to map onto several uppercase character multigraphs. Whitespace (spaces, tabs, or newlines) in this field is significant: each multigraph is separated from its neighbors by one or more whitespace characters. The \luwfcs field is optional, and may occur any number of times. Its effect is cumulative. Note that \luwfcs fields may be used to replace \luwfc fields, or the two types of fields may be mixed together in the control file. The implementation underlying the \luwfcs field does not require that the lowercase and uppercase forms occupy the same number of characters (bytes).
\maxdecap: defines the maximum number of alternative decapitalizations to produce when multiple lowercase characters map onto a single uppercase character. This probably matters only for handling words that are entirely capitalized, as the number of alternatives can grow very rapidly with the length of the word. The \maxdecap field is optional, and may occur only once.
\nocap: dictates that the orthography does not use capitalization at all. If this field is present, then the \luwfc and \luwfcs fields should not be used. The \nocap field is optional, and may occur only once.
\noincap: dictates that capitalization applies to only the first character of a word, or to all characters of a word, but not to individual characters. That is, it tells the program not to attempt to deal with names like `McConnel'. The \noincap field is optional, and may occur only once.
\scl: defines a string class, presumably for use by one or more orthography input changes. The first item in the field is the name of the class. All other items are members of the class. Items are separated by whitespace (spaces, tabs, or newlines). The \scl field is optional, and any number of string classes may be defined. A string class definition must occur before any \ch field that uses that string class.
\wfc: defines one or more "word formation characters" that do not have distinct lowercase and uppercase forms. Whitespace (spaces, tabs, or newlines) in this field is optional. The \wfc field is optional, and may occur any number of times. Its effect is cumulative. For caseless forms that are represented by two or more adjacent characters (bytes), use the \wfcs field described below.
\wfcs: defines one or more multibyte "word formation characters" that do not have distinct lowercase and uppercase forms. Whitespace (spaces, tabs, or newlines) in this field is required to separate the different multibyte characters. The \wfcs field is optional, and may occur any number of times. Its effect is cumulative. Note that \wfcs fields may be used to replace \wfc fields, or the two types of fields may be mixed together in the control file.

For more details about this file, see section `Text Input Control File' in AMPLE Reference Manual.
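
The ordering rule given for the \ch field (each change is applied throughout the word before the next change is tried) can be sketched as follows. This is only an illustration, not the library's implementation: the SketchChange type and both function names are hypothetical, the environment part of a change is ignored, and "as many times as necessary" is assumed to mean a single left-to-right pass that resumes after each replacement.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical change record; the optional environment is omitted. */
typedef struct {
    const char * pszMatch;      /* string to look for */
    const char * pszReplace;    /* string to substitute */
} SketchChange;

/*
 *  Apply one change everywhere it matches, scanning left to right and
 *  resuming after each replacement.  Returns 0 on success, or -1 if the
 *  result would not fit in a buffer of uiSize bytes.
 */
static int applySketchChange(char *               pszWord_io,
                             size_t               uiSize,
                             const SketchChange * pChange_in)
{
size_t  uiMatchLen   = strlen(pChange_in->pszMatch);
size_t  uiReplaceLen = strlen(pChange_in->pszReplace);
char *  p            = pszWord_io;

if (uiMatchLen == 0)
    return 0;                   /* an empty match string changes nothing */
while ((p = strstr(p, pChange_in->pszMatch)) != NULL)
    {
    size_t uiTailLen = strlen(p + uiMatchLen);
    size_t uiNewLen  = (size_t)(p - pszWord_io) + uiReplaceLen + uiTailLen;
    if (uiNewLen + 1 > uiSize)
        return -1;
    /* shift the tail (with its NUL), then drop in the replacement */
    memmove(p + uiReplaceLen, p + uiMatchLen, uiTailLen + 1);
    memcpy(p, pChange_in->pszReplace, uiReplaceLen);
    p += uiReplaceLen;
    }
return 0;
}

/*
 *  Apply an ordered array of changes: the first change is applied to the
 *  whole word before the second one is considered, and so on.
 */
int applySketchChanges(char *               pszWord_io,
                       size_t               uiSize,
                       const SketchChange * pChanges_in,
                       size_t               uiCount)
{
size_t  i;

for ( i = 0 ; i < uiCount ; ++i )
    {
    if (applySketchChange(pszWord_io, uiSize, &pChanges_in[i]) != 0)
        return -1;
    }
return 0;
}
```

Because the changes are ordered, a later change sees the output of the earlier ones. Swapping two entries in the array can therefore produce a different result.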

The arguments to loadIntxCtlFile are as follows:

pszFilename_in: points to the name of the text input control file.
cComment_in: is the character used to initiate comments on lines in the file.
pTextCtl_out: points to a data structure for storing information read from the file.
ppStringClasses_io: is the address of a pointer to a set of string classes possibly used by \ch fields or added to by \scl fields.

5.46.3 Return Value

zero if successful, nonzero if an error occurs

5.46.4 Example

#include <stdio.h>
#include <string.h>     /* memcpy */
#include "textctl.h"    /* includes strclass.h */
#include "rpterror.h"
...
char               szIntxFilename_g[200];
TextControl        sTextControl_g;
StringClass *      pStringClasses_g = NULL;
static TextControl sDefaultTextControl_m = {
    NULL,       /* filename */
    NULL,       /* ordered array of lowercase letters */
    NULL,       /* ordered array of matching uppercase letters */
    NULL,       /* array of caseless letters */
    NULL,       /* list of input orthography changes */
    NULL,       /* list of output (orthography) changes */
    NULL,       /* list of format markers (fields) to include */
    NULL,       /* list of format markers (fields) to exclude */
    '\\',       /* initial character of format markers (field codes) */
    '%',        /* character for marking ambiguities and failures */
    '-',        /* character for marking decomposition */
    '|',        /* initial character of secondary format markers */
    NULL,       /* (Manuscripter) bar codes */
    TRUE,       /* flag whether to capitalize individual letters */
    TRUE,       /* flag whether to decapitalize/recapitalize */
    100         /* maximum number of decapitalization alternatives */
    };
...
memcpy(&sTextControl_g, &sDefaultTextControl_m, sizeof(TextControl));
fprintf(stderr, "Text Control File (xxINTX.CTL) [none]: ");
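fgets( szIntxFilename_g, 200, stdin );
/* fgets keeps the trailing newline, so remove it before testing the name */
szIntxFilename_g[strcspn(szIntxFilename_g, "\n")] = '\0';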
if (szIntxFilename_g[0])
    {
    if (loadIntxCtlFile(szIntxFilename_g, ';',
                        &sTextControl_g, &pStringClasses_g) != 0)
        {
        reportError(ERROR_MSG, "Error reading text control file %s\n",
                    szIntxFilename_g);
        }
    }
if (    (sTextControl_g.cBarMark == NUL) &&
        (sTextControl_g.pszBarCodes != NULL) )
    {
    freeMemory(sTextControl_g.pszBarCodes);
    sTextControl_g.pszBarCodes = NULL;
    }
if (    (sTextControl_g.cBarMark != NUL) &&
        (sTextControl_g.pszBarCodes == NULL) )
    {
    sTextControl_g.pszBarCodes = (unsigned char *)duplicateString(
                                                    "bdefhijmrsuvyz");
    }

5.46.5 Source File

`loadintx.c'

5.47 loadOutxCtlFile

5.47.1 Syntax

#include "textctl.h"    /* or template.h or opaclib.h */

int loadOutxCtlFile(const char *   pszFilename_in,
                    int            cComment_in,
                    TextControl *  pTextCtl_out,
                    StringClass ** ppStringClasses_io);

5.47.2 Description

loadOutxCtlFile loads a text output control file into memory. This is a standard format file containing one data record with the following fields (not necessarily in this order):

\ambig: defines the character used to mark ambiguities in the output after processing. The \ambig field is optional, and may occur only once.
\ch: defines an output orthography change to apply to words after they have been processed, but before they are recapitalized. A change consists of two or three parts, in this order: a match string, a replace string, and an optional environment. The match string and replace string must be quoted by some character that does not appear in either string. (ASCII single quotes and double quotes are most often used for this purpose.) The syntax of the environment is too complicated to discuss here: see Weber 1988 (pages 68-74, 82-83, and 86-90) for details. The \ch field is optional, and may occur any number of times. An ordered list of consistent changes is built by the function. Each change is applied to each output word as many times as necessary before the next change is applied.
\dsc: defines the character used to segment words in the output after processing. This is typically for dividing words into morphemes. (This does not really belong in a "text output" control file, but exists for historical reasons and is kept for compatibility.) The \dsc field is optional, and may occur only once.
\format: defines the character used to start a formatting command in the input text. The formatting command is assumed to consist of this character and all following contiguous nonwhitespace characters. The \format field is optional, and may occur only once.
\luwfc: defines one or more "word formation characters" that have distinct lowercase and uppercase forms. The lowercase form is given first and must be followed by its uppercase form. The functions that use this information allow several lowercase characters to map onto a single uppercase character, and one lowercase character to map onto several uppercase characters. Whitespace (spaces, tabs, or newlines) in this field is optional. The \luwfc field is optional, and may occur any number of times. Its effect is cumulative. For lowercase and uppercase forms that are represented by two or more adjacent characters (bytes), use the \luwfcs field described below.
\luwfcs: defines one or more "word formation character multigraphs" that have distinct lowercase and uppercase forms. The lowercase form is given first and must be followed by its uppercase form. The functions that use this information allow several lowercase character multigraphs to map onto a single uppercase character multigraph, and one lowercase character multigraph to map onto several uppercase character multigraphs. Whitespace (spaces, tabs, or newlines) in this field is significant: each multigraph is separated from its neighbors by one or more whitespace characters. The \luwfcs field is optional, and may occur any number of times. Its effect is cumulative. Note that \luwfcs fields may be used to replace \luwfc fields, or the two types of fields may be mixed together in the control file. The implementation underlying the \luwfcs field does not require that the lowercase and uppercase forms occupy the same number of characters (bytes).
\scl: defines a string class, presumably for use by one or more orthography output changes. The first item in the field is the name of the class. All other items are members of the class. Items are separated by whitespace (spaces, tabs, or newlines). The \scl field is optional, and any number of string classes may be defined. A string class definition must occur before any \ch field that uses that string class.
\wfc: defines one or more "word formation characters" that do not have distinct lowercase and uppercase forms. Whitespace (spaces, tabs, or newlines) in this field is optional. The \wfc field is optional, and may occur any number of times. Its effect is cumulative. For caseless forms that are represented by two or more adjacent characters (bytes), use the \wfcs field described below.
\wfcs: defines one or more multibyte "word formation characters" that do not have distinct lowercase and uppercase forms. Whitespace (spaces, tabs, or newlines) in this field is required to separate the different multibyte characters. The \wfcs field is optional, and may occur any number of times. Its effect is cumulative. Note that \wfcs fields may be used to replace \wfc fields, or the two types of fields may be mixed together in the control file.

Note that these are only a subset of the fields allowed in a text input control file. For more details about this file, see section `The text output control file' in KTEXT Reference Manual.
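
The pairing rule of the \luwfc field (a lowercase character followed by its uppercase form, with optional whitespace, and possibly several uppercase forms for one lowercase character) can be sketched as follows. The function names and the parallel-array storage are assumptions made for illustration only; the library keeps this information inside the TextControl structure in a form not shown here.

```c
#include <ctype.h>
#include <stddef.h>

/*
 *  Split the contents of a \luwfc-style field into lowercase/uppercase
 *  pairs, skipping whitespace.  Returns the number of pairs stored, or -1
 *  if a lowercase character has no partner or the arrays are too small.
 */
int parseSketchCasePairs(const char *    pszField_in,
                         unsigned char * pLower_out,
                         unsigned char * pUpper_out,
                         size_t          uiMaxPairs)
{
const unsigned char * p = (const unsigned char *)pszField_in;
size_t                uiCount = 0;

for (;;)
    {
    unsigned char cLower;

    while (isspace(*p))
        ++p;
    if (*p == '\0')
        break;                          /* clean end of the field */
    cLower = *p++;
    while (isspace(*p))
        ++p;
    if ((*p == '\0') || (uiCount >= uiMaxPairs))
        return -1;
    pLower_out[uiCount] = cLower;
    pUpper_out[uiCount] = *p++;
    ++uiCount;
    }
return (int)uiCount;
}

/*
 *  Collect every uppercase form recorded for cLower.  Returns the number
 *  of forms copied to pFound_out (at most uiMaxFound).
 */
size_t findSketchUpperForms(unsigned char         cLower,
                            const unsigned char * pLower_in,
                            const unsigned char * pUpper_in,
                            size_t                uiPairs,
                            unsigned char *       pFound_out,
                            size_t                uiMaxFound)
{
size_t  uiFound = 0;
size_t  i;

for ( i = 0 ; (i < uiPairs) && (uiFound < uiMaxFound) ; ++i )
    {
    if (pLower_in[i] == cLower)
        pFound_out[uiFound++] = pUpper_in[i];
    }
return uiFound;
}
```

Parsing the field contents "aA bB iI iJ" with this sketch yields four pairs, and looking up `i' returns both `I' and `J'.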

The arguments to loadOutxCtlFile are as follows:

pszFilename_in: points to the name of the text output control file.
cComment_in: is the character used to initiate comments on lines in the file.
pTextCtl_out: points to a data structure for storing information read from the file.
ppStringClasses_io: is the address of a pointer to a set of string classes possibly used by \ch fields or added to by \scl fields.

5.47.3 Return Value

zero if successful, nonzero if an error occurs

5.47.4 Example

#include <stdio.h>
#include <string.h>     /* memset */
#include "textctl.h"    /* includes strclass.h */
#include "rpterror.h"
...
char               szOutxFilename_g[200];
TextControl        sOutputControl_g;
StringClass *      pStringClasses_g = NULL;
...
memset(&sOutputControl_g, 0, sizeof(TextControl));
fprintf(stderr, "Text Output Control File (xxOUTX.CTL) [none]: ");
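fgets(szOutxFilename_g, 200, stdin);
/* fgets keeps the trailing newline, so remove it before testing the name */
szOutxFilename_g[strcspn(szOutxFilename_g, "\n")] = '\0';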
if (szOutxFilename_g[0])
    {
    if (loadOutxCtlFile(szOutxFilename_g, ';',
                        &sOutputControl_g, &pStringClasses_g) != 0)
        {
        reportError(ERROR_MSG,
                    "Error reading text output control file %s\n",
                    szOutxFilename_g);
        }
    }

5.47.5 Source File

`loadoutx.c'

5.48 matchAlphaChar

5.48.1 Syntax

#include "textctl.h"    /* or template.h or opaclib.h */

int matchAlphaChar(const unsigned char * pszString_in,
                   const TextControl *   pTextCtl_in);

5.48.2 Description

matchAlphaChar checks whether the input string begins with a multibyte alphabetic (word formation) character. If so, it returns the number of bytes in the matched multibyte alphabetic character.

This function depends on previous calls to addWordFormationChars, addWordFormationCharStrings, addLowerUpperWFChars, and addLowerUpperWFCharStrings to establish the multibyte alphabetic characters. (These functions are implicitly called by loadIntxCtlFile and loadOutxCtlFile.)

The arguments to matchAlphaChar are as follows:

pszString_in: points to a string to match against.
pTextCtl_in: points to a data structure that contains orthographic information.

5.48.3 Return Value

the number of bytes occupied by the multibyte alphabetic character at the beginning of the input string, or zero if the string does not begin with a multibyte alphabetic character
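
The return-value contract above (a byte count, or zero for no match) can be illustrated with a small stand-alone sketch. The function name and the NULL-terminated array of character strings are hypothetical stand-ins for the information held in TextControl, and preferring the longest matching entry is an assumption rather than documented behavior.

```c
#include <stddef.h>
#include <string.h>

/*
 *  Return the byte length of the longest entry in ppszChars_in (a
 *  NULL-terminated array of strings) that matches the beginning of
 *  pszString_in, or 0 if no entry matches.
 */
int matchSketchMultigraph(const unsigned char * pszString_in,
                          const char * const *  ppszChars_in)
{
size_t  uiBest = 0;
size_t  i;

if ((pszString_in == NULL) || (ppszChars_in == NULL))
    return 0;
for ( i = 0 ; ppszChars_in[i] != NULL ; ++i )
    {
    size_t uiLen = strlen(ppszChars_in[i]);
    if (    (uiLen > uiBest) &&
            (strncmp((const char *)pszString_in,
                     ppszChars_in[i], uiLen) == 0) )
        uiBest = uiLen;
    }
return (int)uiBest;
}
```

A caller can step through a word by adding each nonzero return value to its pointer. With the entries "c", "ch", "n", "ng", and "a", the word "chang" is consumed in three steps of 2, 1, and 2 bytes.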

5.48.4 Example

See section 5.14 convLowerToUpper.

5.48.5 Source File

`myctype.c'

5.49 matchBeginning

5.49.1 Syntax

#include "opaclib.h"

int matchBeginning(const char * pszString_in,
                   const char * pszBegin_in);

5.49.2 Description

matchBeginning compares two strings, using the end of the second string as the cutoff point for the comparison. It is functionally equivalent to

(strncmp(pszString_in, pszBegin_in, strlen(pszBegin_in)) == 0)

The arguments to matchBeginning are as follows:

pszString_in: points to a string to examine.
pszBegin_in: points to a string to compare to the beginning of the other string.

5.49.3 Return Value

nonzero (TRUE) if the two strings are equal up to the end of the second string, otherwise zero (FALSE)

5.49.4 Example

#include "opaclib.h"
...
char string[100], match[50];
...
if (matchBeginning(string, match))
    {
    ...
    }
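
For readers without the library at hand, the strncmp equivalence given in the description can be wrapped in a stand-alone function. The name sketchMatchBeginning is hypothetical, and treating NULL arguments as a failed match is an assumption; the library function may behave differently there.

```c
#include <string.h>

/*
 *  Stand-alone equivalent of the strncmp expression in the description:
 *  nonzero if pszString_in starts with pszBegin_in, otherwise zero.
 */
int sketchMatchBeginning(const char * pszString_in,
                         const char * pszBegin_in)
{
if ((pszString_in == NULL) || (pszBegin_in == NULL))
    return 0;           /* assumption: NULL never matches */
return strncmp(pszString_in, pszBegin_in, strlen(pszBegin_in)) == 0;
}
```

Note that a string shorter than the prefix cannot match, because strncmp stops at the terminating NUL of the shorter string, and that an empty prefix matches every string.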

5.49.5 Source File

`matchbeg.c'

5.50 matchBeginWithStringClass

5.50.1 Syntax

#include "strclass.h"   /* or change.h or textctl.h or template.h
                           or opaclib.h */

size_t matchBeginWithStringClass(const char *        pszString_in,
                                 const StringClass * pClass_in);

5.50.2 Description

matchBeginWithStringClass searches a string class to find a class member that matches the beginning of a string. It stops at the first successful match.

The arguments to matchBeginWithStringClass are as follows:

pszString_in: points to a string to match against.
pClass_in: points to a string class to search for a match.

5.50.3 Return Value

the length of the first successful match if found (effectively TRUE), otherwise zero (FALSE)

5.50.4 Example

#include "strclass.h"
...
static StringClass *    pClasses_m;
...
int matchesClassMemberAtBeginning(const char * pszString_in,
                                  const char * pszClassName_in)
{
StringClass *   pClass;

pClass = findStringClass(pszClassName_in, pClasses_m);
if (pClass == NULL)
    return 0;
return matchBeginWithStringClass(pszString_in, pClass);
}
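
The "first successful match" rule can be seen in a stand-alone sketch. The SketchStringClass type below is a hypothetical simplification (a name plus a NULL-terminated array of members) and does not reflect the layout of the library's StringClass; skipping empty members is also an assumption, made because a zero-length match could not be told apart from failure.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified string class. */
typedef struct {
    const char *         pszName;       /* name of the class */
    const char * const * ppszMembers;   /* NULL-terminated member array */
} SketchStringClass;

/*
 *  Return the length of the first class member that matches the beginning
 *  of pszString_in, or 0 if no member matches.
 */
size_t sketchMatchBeginWithClass(const char *              pszString_in,
                                 const SketchStringClass * pClass_in)
{
size_t  i;

if ((pszString_in == NULL) || (pClass_in == NULL))
    return 0;
for ( i = 0 ; pClass_in->ppszMembers[i] != NULL ; ++i )
    {
    size_t uiLen = strlen(pClass_in->ppszMembers[i]);
    if (    (uiLen != 0) &&
            (strncmp(pszString_in, pClass_in->ppszMembers[i], uiLen) == 0) )
        return uiLen;           /* stop at the first successful match */
    }
return 0;
}
```

With the members "a", "e", and "ai" in that order, the string "aid" yields 1 rather than 2, since "a" is found before "ai" is tried. If the longest match is wanted, the longer members have to come first.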

5.50.5 Source File

`strcla.c'

5.51 matchCaselessChar

5.51.1 Syntax

#include "textctl.h"

int matchCaselessChar(const unsigned char * pszString_in,
                      const TextControl *   pTextCtl_in);

5.51.2 Description

matchCaselessChar checks whether the input string begins with a multibyte caseless character. If so, it returns the number of bytes in the matched multibyte caseless character.

This function depends on previous calls to addWordFormationChars or addWordFormationCharStrings to establish the multibyte caseless characters. (addWordFormationChars and addWordFormationCharStrings are implicitly called by loadIntxCtlFile and loadOutxCtlFile.)

The arguments to matchCaselessChar are as follows:

pszString_in: points to a string to match against.
pTextCtl_in: points to a data structure that contains orthographic information.

5.51.3 Return Value

the number of bytes occupied by the multibyte caseless character at the beginning of the input string, or zero if the string does not begin with a multibyte caseless character

5.51.4 Example

See section 5.54 matchLowercaseChar.

5.51.5 Source File

`myctype.c'

5.52 matchEnd

5.52.1 Syntax

#include "opaclib.h"

int matchEnd(const char * pszString_in,
             const char * pszTail_in);

5.52.2 Description

matchEnd compares the second string against the end of the first string. It is functionally equivalent to

((strlen(pszString_in) < strlen(pszTail_in)) ? 0 :
    (strcmp(pszString_in + strlen(pszString_in) - strlen(pszTail_in),
            pszTail_in) == 0))

The arguments to matchEnd are as follows:

pszString_in: points to a string to examine.
pszTail_in: points to a string to compare to the end of the other string.

5.52.3 Return Value

nonzero (TRUE) if the second string matches the end of the first string, otherwise zero (FALSE)

5.52.4 Example

#include "opaclib.h"
...
char string[100], match[50];
...
if (matchEnd(string, match))
    {
    ...
    }
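
The conditional expression in the description calls strlen several times. A stand-alone sketch that computes each length once is shown below. The name sketchMatchEnd is hypothetical, and rejecting NULL arguments is an assumption about behavior the description does not cover.

```c
#include <stddef.h>
#include <string.h>

/*
 *  Stand-alone equivalent of the expression in the description: nonzero
 *  if pszString_in ends with pszTail_in, otherwise zero.
 */
int sketchMatchEnd(const char * pszString_in,
                   const char * pszTail_in)
{
size_t  uiStringLen;
size_t  uiTailLen;

if ((pszString_in == NULL) || (pszTail_in == NULL))
    return 0;           /* assumption: NULL never matches */
uiStringLen = strlen(pszString_in);
uiTailLen   = strlen(pszTail_in);
if (uiStringLen < uiTailLen)
    return 0;           /* the tail cannot be longer than the string */
return strcmp(pszString_in + uiStringLen - uiTailLen, pszTail_in) == 0;
}
```

The length check must come first. Without it, the pointer arithmetic would step back past the start of the string whenever the tail is the longer of the two.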

5.52.5 Source File

`matchend.c'

5.53 matchEndWithStringClass

5.53.1 Syntax

#include "strclass.h"   /* or change.h or textctl.h or template.h
                           or opaclib.h */

size_t matchEndWithStringClass(const char *        pszString_in,
                               const StringClass * pClass_in);

5.53.2 Description

matchEndWithStringClass searches a string class to find a class member that matches the end of a string. It stops at the first successful match.

The arguments to matchEndWithStringClass are as follows:

pszString_in: points to a string to match against.
pClass_in: points to a string class to search for a match.

5.53.3 Return Value

the length of the first successful match if found (effectively TRUE), otherwise zero (FALSE)

5.53.4 Example

#include "strclass.h"
...
static StringClass *    pClasses_m;
...
int matchesClassMemberAtEnd(const char * pszString_in,
                            const char * pszClassName_in)
{
StringClass *   pClass;

pClass = findStringClass(pszClassName_in, pClasses_m);
if (pClass == NULL)
    return 0;
return matchEndWithStringClass(pszString_in, pClass);
}
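
Because the return value is the length of the matched member, a caller can use it to locate where the matched ending starts. The sketch below shows the idea on a plain NULL-terminated array of strings. The function name and the array representation are hypothetical and stand in for the library's StringClass.

```c
#include <stddef.h>
#include <string.h>

/*
 *  Return the length of the first entry in ppszMembers_in that matches
 *  the end of pszString_in, or 0 if no entry matches.
 */
size_t sketchMatchEndWithMembers(const char *         pszString_in,
                                 const char * const * ppszMembers_in)
{
size_t  uiStringLen;
size_t  i;

if ((pszString_in == NULL) || (ppszMembers_in == NULL))
    return 0;
uiStringLen = strlen(pszString_in);
for ( i = 0 ; ppszMembers_in[i] != NULL ; ++i )
    {
    size_t uiLen = strlen(ppszMembers_in[i]);
    if (    (uiLen != 0) && (uiLen <= uiStringLen) &&
            (strcmp(pszString_in + uiStringLen - uiLen,
                    ppszMembers_in[i]) == 0) )
        return uiLen;           /* stop at the first successful match */
    }
return 0;
}
```

For the word "walking" and the members "s", "es", and "ing", the sketch returns 3, so the part of the word before the matched ending is strlen("walking") - 3 = 4 bytes long.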

5.53.5 Source File

`strcla.c'

5.54 matchLowercaseChar

5.54.1 Syntax

#include "textctl.h"    /* or template.h or opaclib.h */

int matchLowercaseChar(const unsigned char * pszString_in,
                       const TextControl *   pTextCtl_in);

5.54.2 Description

matchLowercaseChar checks whether the input string begins with a multibyte lowercase character. If so, it returns the number of bytes in the matched multibyte lowercase character.

This function depends on previous calls to addLowerUpperWFChars or addLowerUpperWFCharStrings to establish the multibyte lowercase characters. (addLowerUpperWFChars and addLowerUpperWFCharStrings are implicitly called by loadIntxCtlFile and loadOutxCtlFile.)

The arguments to matchLowercaseChar are as follows:

pszString_in: points to a string to match against.
pTextCtl_in: points to a data structure that contains orthographic information.

5.54.3 Return Value

the number of bytes occupied by the multibyte lowercase character at the beginning of the input string, or zero if the string does not begin with a multibyte lowercase character

5.54.4 Example

#include "textctl.h"

#define CASELESS -1
#define NOCAP     0
#define INITCAP   1
#define ALLCAP    2
#define MIXCAP    3

int getWordCase(const unsigned char * pszWord_in,
                const TextControl *   pTextCtl_in)
{
unsigned        uiUpperCount    = 0;
unsigned        uiLowerCount    = 0;
int             bFirstCap       = 0;
int             iLength;
const unsigned char * p;        /* const, to match pszWord_in */

for ( p = pszWord_in ; p && *p ; p += iLength )
    {
    iLength = matchLowercaseChar(p, pTextCtl_in);
    if (iLength != 0)
        ++uiLowerCount;
    else
        {
        iLength = matchUppercaseChar(p, pTextCtl_in);
        if (iLength != 0)
            {
            ++uiUpperCount;
            if (uiLowerCount == 0)
                bFirstCap = 1;
            }
        else
            {
            iLength = matchCaselessChar(p, pTextCtl_in);
            if (iLength == 0)
                iLength = 1;
            }
        }
    }
if ((uiUpperCount == 0) && (uiLowerCount == 0))
    return CASELESS;
else if (uiUpperCount == 0)
    return NOCAP;
else if (bFirstCap && (uiUpperCount == 1))
    return INITCAP;
else if (uiLowerCount == 0)
    return ALLCAP;
else
    return MIXCAP;
}

5.54.5 Source File

`myctype.c'

5.55 matchUppercaseChar

5.55.1 Syntax

#include "textctl.h"    /* or template.h or opaclib.h */

int matchUppercaseChar(const unsigned char * pszString_in,
                       const TextControl *   pTextCtl_in);

5.55.2 Description

matchUppercaseChar checks whether the input string begins with a multibyte uppercase character. If so, it returns the number of bytes in the matched multibyte uppercase character.

This function depends on previous calls to addLowerUpperWFChars or addLowerUpperWFCharStrings to establish the multibyte uppercase characters. (addLowerUpperWFChars and addLowerUpperWFCharStrings are implicitly called by loadIntxCtlFile and loadOutxCtlFile.)

The arguments to matchUppercaseChar are as follows:

pszString_in: points to a string to match against.
pTextCtl_in: points to a data structure that contains orthographic information.

5.55.3 Return Value

the number of bytes occupied by the multibyte uppercase character at the beginning of the input string, or zero if the string does not begin with a multibyte uppercase character

5.55.4 Example

See section 5.54 matchLowercaseChar.

5.55.5 Source File

`myctype.c'

5.56 mergeIntoStringList

5.56.1 Syntax

#include "strlist.h"    /* or strclass.h or change.h or textctl.h
                           or template.h or opaclib.h */

StringList * mergeIntoStringList(StringList * pList_io,
                                 const char * pszString_in);

5.56.2 Description

mergeIntoStringList adds a string to the beginning of a list of strings if it is not already present in the list.

The arguments to mergeIntoStringList are as follows:

pList_io: points to a list of strings.
pszString_in: points to the string to be added. A copy created with duplicateString is stored in the list, not the original string itself.

5.56.3 Return Value

a pointer to the possibly modified list of strings

5.56.4 Example

#include "strlist.h"
...
StringList * pStrings = NULL;
...
pStrings = mergeIntoStringList(pStrings, "this");
                /* pStrings-->"this"-->NULL */
pStrings = mergeIntoStringList(pStrings, "test");
                /* pStrings-->"test"-->"this"-->NULL */
pStrings = mergeIntoStringList(pStrings, "is");
                /* pStrings-->"is"-->"test"-->"this"-->NULL */
pStrings = mergeIntoStringList(pStrings, "a");
                /* pStrings-->"a"-->"is"-->"test"-->"this"-->NULL */
pStrings = mergeIntoStringList(pStrings, "test");
                /* pStrings-->"a"-->"is"-->"test"-->"this"-->NULL */
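
The behavior traced in the comments above (add at the front, unless the string is already present) can be reproduced with a minimal stand-alone list. The SketchStringList type and the function name are hypothetical and are not the library's StringList. Returning the unchanged list when memory runs out is likewise an assumption made to keep the sketch short.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical singly linked list of strings. */
typedef struct sketch_strlist {
    char *                  pszString;
    struct sketch_strlist * pNext;
} SketchStringList;

/*
 *  Add a copy of pszString_in to the front of the list unless an equal
 *  string is already stored.  Returns the (possibly new) head of the list.
 */
SketchStringList * sketchMergeIntoList(SketchStringList * pList_io,
                                       const char *       pszString_in)
{
SketchStringList * p;
SketchStringList * pNew;
size_t             uiSize;

if (pszString_in == NULL)
    return pList_io;
for ( p = pList_io ; p != NULL ; p = p->pNext )
    {
    if (strcmp(p->pszString, pszString_in) == 0)
        return pList_io;                /* already present: no change */
    }
pNew = malloc(sizeof *pNew);
if (pNew == NULL)
    return pList_io;
uiSize = strlen(pszString_in) + 1;
pNew->pszString = malloc(uiSize);
if (pNew->pszString == NULL)
    {
    free(pNew);
    return pList_io;
    }
memcpy(pNew->pszString, pszString_in, uiSize);  /* store a private copy */
pNew->pNext = pList_io;
return pNew;
}
```

As with the library function, the caller must store the return value, because a successful insertion changes the head of the list.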

5.56.5 Source File

`add_sl.c'

5.57 mergeIntoStringListAtEnd

5.57.1 Syntax

#include "strlist.h"    /* or strclass.h or change.h or textctl.h
                           or template.h or opaclib.h */

StringList * mergeIntoStringListAtEnd(StringList * pList_io,
                                      const char * pszString_in);

5.57.2 Description

mergeIntoStringListAtEnd adds a string to the end of a list of strings if it is not already present in the list.

The arguments to mergeIntoStringListAtEnd are as follows:

pList_io: points to a list of strings.
pszString_in: points to the string to be added. A copy created with duplicateString is stored in the list, not the original string itself.

5.57.3 Return Value

a pointer to the possibly modified list of strings

5.57.4 Example

#include "strlist.h"
...
StringList * pStrings = NULL;
...
pStrings = mergeIntoStringListAtEnd(pStrings, "this");
                /* pStrings-->"this"-->NULL */
pStrings = mergeIntoStringListAtEnd(pStrings, "test");
                /* pStrings-->"this"-->"test"-->NULL */
pStrings = mergeIntoStringListAtEnd(pStrings, "is");
                /* pStrings-->"this"-->"test"-->"is"-->NULL */
pStrings = mergeIntoStringListAtEnd(pStrings, "a");
                /* pStrings-->"this"-->"test"-->"is"-->"a"-->NULL */
pStrings = mergeIntoStringListAtEnd(pStrings, "test");
                /* pStrings-->"this"-->"test"-->"is"-->"a"-->NULL */
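
Appending instead of prepending needs only one traversal if the search for a duplicate and the search for the end of the list are combined. The following stand-alone sketch does this with a pointer to the link being examined. The SketchNode type and the function name are hypothetical, not the library's own definitions.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical singly linked list of strings. */
typedef struct sketch_node {
    char *               pszString;
    struct sketch_node * pNext;
} SketchNode;

/*
 *  Add a copy of pszString_in to the end of the list unless an equal
 *  string is already stored.  Returns the head of the list.
 */
SketchNode * sketchMergeAtEnd(SketchNode * pList_io,
                              const char * pszString_in)
{
SketchNode ** ppLink = &pList_io;       /* link that may receive the node */
SketchNode *  pNew;
size_t        uiSize;

if (pszString_in == NULL)
    return pList_io;
for ( ; *ppLink != NULL ; ppLink = &(*ppLink)->pNext )
    {
    if (strcmp((*ppLink)->pszString, pszString_in) == 0)
        return pList_io;                /* already present: no change */
    }
pNew = malloc(sizeof *pNew);
if (pNew == NULL)
    return pList_io;
uiSize = strlen(pszString_in) + 1;
pNew->pszString = malloc(uiSize);
if (pNew->pszString == NULL)
    {
    free(pNew);
    return pList_io;
    }
memcpy(pNew->pszString, pszString_in, uiSize);  /* store a private copy */
pNew->pNext = NULL;
*ppLink = pNew;         /* fills the empty head or the last pNext */
return pList_io;
}
```

The head changes only when the list was empty, but storing the return value in every case keeps the calling code uniform.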

5.57.5 Source File

`appnd_sl.c'

5.58 mergeTwoStringLists

5.58.1 Syntax

#include "strlist.h"    /* or strclass.h or change.h or textctl.h
                           or template.h or opaclib.h */

StringList * mergeTwoStringLists(StringList * pFirstList_io,
                                 StringList * pSecondList_io);

5.58.2 Description

mergeTwoStringLists merges two lists of strings together to form a single list. Any strings in the second list that exist in the first list are freed. Neither of the original lists survives this operation.

The arguments to mergeTwoStringLists are as follows:

pFirstList_io: points to a list of strings.
pSecondList_io: points to another list of strings.

5.58.3 Return Value

a pointer to the merged list

5.58.4 Example

#include "strlist.h"
...
StringList * pStrings = NULL;
StringList * pStrings1 = NULL;
StringList * pStrings2 = NULL;
...
pStrings1 = mergeIntoStringListAtEnd(pStrings1, "this");
pStrings1 = mergeIntoStringListAtEnd(pStrings1, "test");
pStrings1 = mergeIntoStringListAtEnd(pStrings1, "is");
pStrings1 = mergeIntoStringListAtEnd(pStrings1, "a");
pStrings1 = mergeIntoStringListAtEnd(pStrings1, "test");
pStrings2 = mergeIntoStringList(pStrings2, "that");
pStrings2 = mergeIntoStringList(pStrings2, "test");
pStrings2 = mergeIntoStringList(pStrings2, "is");
pStrings2 = mergeIntoStringList(pStrings2, "good");
/* pStrings1-->"this"-->"test"-->"is"-->"a"-->NULL */
/* pStrings2-->"good"-->"is"-->"test"-->"that"-->NULL */
pStrings = mergeTwoStringLists(pStrings1, pStrings2);
/* pStrings-->"good"-->"that"-->"this"-->"test"-->"is"-->"a"-->NULL */
/* pStrings1-->-----------------^ */
/* pStrings2-->??? */

5.58.5 Source File

`cat_sl.c'

5.59 parseChangeString

5.59.1 Syntax

#include "change.h"     /* or textctl.h or template.h or opaclib.h */

Change * parseChangeString(const char *        pszString_in,
                           const StringClass * pClassList_in);

5.59.2 Description

parseChangeString parses a string to build a Change structure.

The arguments to parseChangeString are as follows:

pszString_in: points to a change definition string.
pClassList_in: points to a collection of string classes that may be referenced in the environment portion of the change definition.

5.59.3 Return Value

a pointer to a newly allocated Change structure, or NULL if an error occurred while parsing the change definition

5.59.4 Example

#include "change.h"     /* includes strclass.h */
...
Change * addChange(const char *        pszChange_in,
                   Change *            pChanges_io,
                   const StringClass * pClasses_in)
{
Change *        pChange;
Change *        pTail;

pChange = parseChangeString(pszChange_in, pClasses_in);
if (pChange != NULL)
    {
    if (pChanges_io == NULL)
        return pChange;
    /*
     *  keep the list of changes in the original order
     */
    for (pTail = pChanges_io ; pTail->pNext ; pTail = pTail->pNext)
        ;
    pTail->pNext = pChange;
    }
return pChanges_io;
}
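
The first step of parsing a change definition, extracting the quoted match and replace strings, can be sketched on its own. Everything below is hypothetical and greatly simplified: the function names are invented, the environment is not parsed, only whitespace is accepted between the two items, and each item is delimited by whatever character opens it.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 *  Copy the next quoted item from *ppsz_io into pszItem_out and advance
 *  *ppsz_io past the closing quote.  Returns 0 on success, or -1 if the
 *  item is missing, unterminated, or too long for the buffer.
 */
static int readSketchQuoted(const char ** ppsz_io,
                            char *        pszItem_out,
                            size_t        uiSize)
{
const char * p = *ppsz_io;
const char * pEnd;
size_t       uiLen;
char         cQuote;

while (isspace((unsigned char)*p))
    ++p;
cQuote = *p;
if (cQuote == '\0')
    return -1;                  /* nothing left to read */
++p;
pEnd = strchr(p, cQuote);
if (pEnd == NULL)
    return -1;                  /* no closing quote */
uiLen = (size_t)(pEnd - p);
if (uiLen + 1 > uiSize)
    return -1;
memcpy(pszItem_out, p, uiLen);
pszItem_out[uiLen] = '\0';
*ppsz_io = pEnd + 1;
return 0;
}

/*
 *  Extract the match and replace strings from a change definition.
 *  Returns 0 on success, -1 on a malformed definition.
 */
int parseSketchChange(const char * pszString_in,
                      char *       pszMatch_out,
                      size_t       uiMatchSize,
                      char *       pszReplace_out,
                      size_t       uiReplaceSize)
{
const char * p = pszString_in;

if (p == NULL)
    return -1;
if (readSketchQuoted(&p, pszMatch_out, uiMatchSize) != 0)
    return -1;
return readSketchQuoted(&p, pszReplace_out, uiReplaceSize);
}
```

Taking the quote character from the input is what allows a change to mention an ASCII double quote: such a string is simply enclosed in single quotes instead.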

5.59.5 Source File

`change.c'

5.60 promptUser

5.60.1 Syntax

#include "opaclib.h"

void promptUser(const char * pszPrompt_in,
                char *       pszBuffer_out,
                unsigned     uiBufferSize_in);

5.60.2 Description

promptUser prompts the user, then reads a line of input from the keyboard (normally the standard input). If an EOF occurs, promptUser tries to reopen the keyboard.

The arguments to promptUser are as follows:

pszPrompt_in: points to a prompt message string.
pszBuffer_out: points to an input buffer.
uiBufferSize_in: is the size of the input buffer (not counting space for the terminating NUL).

5.60.3 Return Value

none

5.60.4 Example

#include <stdio.h>
#include "opaclib.h"
...
char    szFilename_g[BUFSIZ+1];
FILE *  pInputFP_g;
char    szBuffer_g[17];
long    iRepeatCount_g;
...
promptUser("Data file: ", szFilename_g, BUFSIZ);
pInputFP_g = fopen(szFilename_g, "r");
...
promptUser("Number of iterations to perform: ", szBuffer_g, 16);
iRepeatCount_g = strtol(szBuffer_g, NULL, 10);
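
The buffer-size convention used above (the size argument does not count the terminating NUL, so the buffer holds one byte more) can be illustrated with a stand-alone sketch built on fgets. The function sketchPrompt is hypothetical. It takes explicit input and output streams so that it can be tried without a keyboard, it removes the trailing newline (which is assumed, not documented, for promptUser), and it does not attempt to reopen the keyboard at EOF.

```c
#include <stdio.h>
#include <string.h>

/*
 *  Write a prompt to pOut, then read one line from pIn into pszBuffer_out.
 *  uiBufferSize_in does not count the terminating NUL, so the buffer must
 *  hold uiBufferSize_in + 1 bytes.  Returns 0 on success, or -1 at EOF
 *  (leaving an empty string in the buffer).
 */
int sketchPrompt(FILE *       pIn,
                 FILE *       pOut,
                 const char * pszPrompt_in,
                 char *       pszBuffer_out,
                 unsigned     uiBufferSize_in)
{
fputs(pszPrompt_in, pOut);
fflush(pOut);                   /* make sure the prompt is visible */
if (fgets(pszBuffer_out, (int)uiBufferSize_in + 1, pIn) == NULL)
    {
    pszBuffer_out[0] = '\0';
    return -1;
    }
/* fgets keeps the newline; cut the string there if one was read */
pszBuffer_out[strcspn(pszBuffer_out, "\n")] = '\0';
return 0;
}
```

A call such as sketchPrompt(stdin, stderr, "Data file: ", szFilename_g, BUFSIZ) mirrors the first promptUser call in the example.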

5.60.5 Source File

`promptus.c'

5.61 readLineFromFile

5.61.1 Syntax

#include "opaclib.h"

char * readLineFromFile(FILE *     pInputFP_in,
                        unsigned * puiLineNumber_io,
                        int        cComment_in);

5.61.2 Description

readLineFromFile reads an arbitrarily long line of input text, erasing the trailing newline character. The string returned is overwritten or freed at the next call to readLineFromFile.

The arguments to readLineFromFile are as follows:

pInputFP_in: is an input FILE pointer.
puiLineNumber_io: points to a line number counter, or is NULL.
cComment_in: is the character that marks the beginning of a comment.

5.61.3 Return Value

the address of the buffer containing the NUL-terminated line, or NULL if already at the end of the file

5.61.4 Example

#include <stdio.h>
#include <string.h>
#include "opaclib.h"

void processFile(const char * pszFilename_in)
{
FILE *          pInputFP;
unsigned        uiLineNumber;
char *          pszLine;

if (pszFilename_in == NULL)
    return;
pInputFP = fopen(pszFilename_in, "r");
if (pInputFP == NULL)
    return;
uiLineNumber = 1;
while ((pszLine = readLineFromFile(pInputFP,
                                   &uiLineNumber, ';')) != NULL)
    {
    ...
    }
printf("%u lines read from %s\n", uiLineNumber, pszFilename_in);
}

5.61.5 Source File

`readline.c'

5.62 readSentenceOfTemplates

5.62.1 Syntax

#include "template.h"   /* or opaclib.h */

WordTemplate ** readSentenceOfTemplates(FILE *        pInputFP_in,
                                        const char *  pszAnaFile_in,
                                        const char *  pszFinalPunct_in,
                                        TextControl * pTextCtl_in,
                                        FILE *        pLogFP_in);

5.62.2 Description

readSentenceOfTemplates reads an arbitrarily long sentence (sequence of words) from an input analysis file, building an array of WordTemplate data structures. The sentence is terminated by a sentence-final punctuation character from pszFinalPunct_in.

The arguments to readSentenceOfTemplates are as follows:

pInputFP_in: is an input FILE pointer.
pszAnaFile_in: points to the name of the input analysis file.
pszFinalPunct_in: points to a NUL-terminated string of punctuation characters that mark the end of a sentence.
pTextCtl_in: points to a data structure that contains the decomposition and ambiguity marker characters.
pLogFP_in: is an output FILE pointer, used to log error messages, or NULL.

5.62.3 Return Value

a pointer to a dynamically allocated NULL-terminated array of pointers to dynamically allocated WordTemplate structures

5.62.4 Example

#include <stdio.h>
#include "template.h"
#include "allocmem.h"
#include "rpterror.h"
...
TextControl             sTextControl_g;
static const char       szSentenceFinalPunc_m[] = ".!?";
static const char       szCannotOpen_m[] =
        "Warning: cannot open analysis input file %s\n";
...
unsigned processSentences(char * pszAnaFile_in, FILE * pLogFP_in)
{
FILE *          pInputFP;
WordTemplate ** pSentence;
unsigned        uiSentenceCount;
unsigned        i;
...
pInputFP = fopen(pszAnaFile_in, "r");
if (pInputFP == NULL)
    {
    reportError(ERROR_MSG, szCannotOpen_m, pszAnaFile_in);
    if (pLogFP_in != NULL)
        fprintf(pLogFP_in, szCannotOpen_m, pszAnaFile_in);
    return 0;
    }
for ( uiSentenceCount = 0 ;; ++uiSentenceCount )
    {
    pSentence = readSentenceOfTemplates(pInputFP,
                                        pszAnaFile_in,
                                        szSentenceFinalPunc_m,
                                        &sTextControl_g,
                                        pLogFP_in);
    if (pSentence == NULL)
        break;
    ...
    for ( i = 0 ; pSentence[i] ; ++i )
        freeWordTemplate( pSentence[i] );
    freeMemory( pSentence );
    }
fclose(pInputFP);
return uiSentenceCount;
}

5.62.5 Source File

`senttemp.c'

5.63 readStdFormatField

5.63.1 Syntax

#include "opaclib.h"

char ** readStdFormatField(FILE *        pInputFP_in,
                           const char ** ppszFieldCodes_in,
                           int           cComment_in);

5.63.2 Description

readStdFormatField reads an arbitrarily large text field that starts with a backslash marker at the beginning of a line. Each line of the input field is stored separately in a NULL-terminated array of strings. If the field code at the beginning matches one of those in the input array of field codes, it is replaced by a single byte containing the 1-based index of the matching field code. Otherwise, the field code is left intact except that the backslash character is replaced by the character code 255 ('\377').

This function is an alternative to readStdFormatRecord, which potentially reads several fields at a time.

The arguments to readStdFormatField are as follows:

pInputFP_in: is an input FILE pointer.
ppszFieldCodes_in: points to a NULL-terminated array of field code strings.
cComment_in: is the character used to initiate comments in a line.
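
The marker encoding described above can be sketched for a single line that begins a field. The helper sketchEncodeFieldCode is hypothetical and only illustrates the two cases (a recognized code becomes one byte holding its 1-based index, an unrecognized code keeps its text but has its backslash replaced by 255). It assumes that a marker ends at the first whitespace character and that the whitespace itself is kept.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical sketch of the marker encoding.  The line must begin with
 * a backslash marker.  ppszFieldCodes_in is a NULL-terminated array of
 * codes such as "\\rules". */
static void sketchEncodeFieldCode(char * pszLine_io,
                                  const char * const * ppszFieldCodes_in)
{
size_t  uiCodeLength;
int     i;

assert((pszLine_io != NULL) && (pszLine_io[0] == '\\'));
/* the marker runs up to the first whitespace character */
uiCodeLength = strcspn(pszLine_io, " \t\r\n\f\v");
for ( i = 0 ; ppszFieldCodes_in[i] != NULL ; ++i )
    {
    if (    (strlen(ppszFieldCodes_in[i]) == uiCodeLength) &&
            (strncmp(pszLine_io, ppszFieldCodes_in[i], uiCodeLength) == 0) )
        {
        /* replace the whole marker by one byte: the 1-based index */
        pszLine_io[0] = (char)(i + 1);
        memmove(pszLine_io + 1, pszLine_io + uiCodeLength,
                strlen(pszLine_io + uiCodeLength) + 1);
        return;
        }
    }
pszLine_io[0] = '\377';         /* unknown marker: backslash becomes 255 */
}
```

This is why the example in this section can dispatch on the first byte of the first line with a switch statement and then skip that byte before tokenizing the rest of the line.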

5.63.3 Return Value

a pointer to a dynamically allocated NULL-terminated array of pointers to dynamically allocated lines of text

5.63.4 Example

#include <stdio.h>
#include <string.h>     /* for strtok */
#include "opaclib.h"
...
static char     szWhitespace_m[7] = " \t\r\n\f\v";
...
int read_control_file(char * pszControlFile_in)
{
int             i;
char *          pszRuleFile    = NULL;
char *          pszLexiconFile = NULL;
char *          pszGrammarFile = NULL;
StringList *    pTraceList     = NULL;
char *          pszMorph;
FILE *          pControlFP;
char **         ppszField;
char *          pszLine;
static char *   aszCodes_s[] = {
    "\\rules", "\\lexicon", "\\grammar", "\\trace", ..., NULL
    };

if (pszControlFile_in == NULL)
    return FALSE;
pControlFP = fopen(pszControlFile_in, "r");
if (pControlFP == (FILE *)NULL)
    {
    reportError(WARNING_MSG, "Cannot open control file %s\n",
                pszControlFile_in);
    return FALSE;
    }
for (;;)
    {
    ppszField = readStdFormatField(pControlFP, aszCodes_s, NUL);
    if (ppszField == NULL)
        break;
    switch (**ppszField)
        {
        case 1:                 /* "\\rules" */
            if (pszRuleFile != NULL)
                reportError(WARNING_MSG,
                            "Rule file already specified: %s\n",
                            pszRuleFile);
            else
                {
                for ( i = 0 ; ppszField[i] ; ++i )
                    {
                    pszLine = ppszField[i];
                    if (i == 0)
                        ++pszLine;
                    pszRuleFile = strtok(pszLine, szWhitespace_m);
                    if (pszRuleFile != NULL)
                        break;
                    }
                }
            break;

        case 2:                 /* "\\lexicon" */
            if (pszLexiconFile != NULL)
                reportError(WARNING_MSG,
                            "Lexicon file already specified: %s\n",
                            pszLexiconFile);
            else
                {
                for ( i = 0 ; ppszField[i] ; ++i )
                    {
                    pszLine = ppszField[i];
                    if (i == 0)
                        ++pszLine;
                    pszLexiconFile = strtok(pszLine, szWhitespace_m);
                    if (pszLexiconFile != NULL)
                        break;
                    }
                }
            break;

        case 3:                 /* "\\grammar" */
            if (pszGrammarFile != NULL)
                reportError(WARNING_MSG,
                            "Grammar file already specified: %s\n",
                            pszGrammarFile);
            else
                {
                for ( i = 0 ; ppszField[i] ; ++i )
                    {
                    pszLine = ppszField[i];
                    if (i == 0)
                        ++pszLine;
                    pszGrammarFile = strtok(pszLine, szWhitespace_m);
                    if (pszGrammarFile != NULL)
                        break;
                    }
                }
            break;

        case 4:                 /* "\\trace" */
            for ( i = 0 ; ppszField[i] ; ++i )
                {
                pszLine = ppszField[i];
                if (i == 0)
                    ++pszLine;
                for (   pszMorph = strtok(pszLine, szWhitespace_m) ;
                        pszMorph ;
                        pszMorph = strtok(NULL, szWhitespace_m) )
                    {
                    pTraceList = mergeIntoStringList(pTraceList,
                                                     pszMorph);
                    }
                }
            break;
...
        default:
            reportError(WARNING_MSG, "Unknown field: \\%s\n",
                        ppszField[0] + 1);
            break;
        }
    for ( i = 0 ; ppszField[i] ; ++i )
        freeMemory(ppszField[i]);
    freeMemory(ppszField);
    }
fclose(pControlFP);
...    
return TRUE;
}

5.63.5 Source File

`readfiel.c'

5.64 readStdFormatRecord

5.64.1 Syntax

#include "record.h"     /* or opaclib.h */

char * readStdFormatRecord(FILE *            pInputFP_in,
                           const CodeTable * pCodeTable_in,
                           int               cComment_in,
                           unsigned *        puiRecordCount_io);

5.64.2 Description

readStdFormatRecord reads the next record from a standard format file. The record is stored in memory as a series of NUL-terminated strings stored consecutively in a single buffer, with the record terminated by two consecutive NUL bytes. The first character of each string is either a character representing the field code (if found in the code table), or a backslash indicating that the field code was not recognized.

This function is an alternative to readStdFormatField, which always reads only one field at a time.

The arguments to readStdFormatRecord are as follows:

pInputFP_in: is an input FILE pointer.
pCodeTable_in: points to the field code table used to decode the standard format file field code markers.
cComment_in: is a character that marks comments in the input file.
puiRecordCount_io: points to a counter for keeping track of the number of records read, or is NULL.
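
The record layout described above (consecutive NUL-terminated strings, ended by a second NUL byte) can be walked with a simple loop. The function sketchCountFields below is a hypothetical illustration that only counts the fields. It assumes nothing about the library beyond that layout.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical sketch: count the fields in a record laid out as
 * consecutive NUL-terminated strings ending with an extra NUL byte. */
static unsigned sketchCountFields(const char * pRecord_in)
{
const char *    pszField;
unsigned        uiCount = 0;

assert(pRecord_in != NULL);
/* an empty string marks the end of the record */
for ( pszField = pRecord_in ; *pszField != '\0' ;
      pszField += strlen(pszField) + 1 )
    {
    /* pszField[0] is the field code; the contents start at pszField + 1 */
    ++uiCount;
    }
return uiCount;
}
```

The same stepping expression, strlen(pszField) + 1, is what moves from one field to the next in any loop that processes the individual fields.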

5.64.3 Return Value

a pointer to the buffer containing the record, or NULL for EOF.

5.64.4 Example

#include <stdio.h>
#include <string.h>
#include "record.h"
...
void loadStdFmtFile(char * pszFilename_in)
{
FILE *          pInputFP;
char *          pRecord;
char *          pszField;
char *          pszNextField;
int             c;
unsigned        uiRecordCount = 0;
static CodeTable sCodeTable_s = { "\
\\a\0A\0\
\\d\0D\0\
\\w\0W\0\
\\f\0F\0\
\\c\0C\0\
\\n\0N\0",
    6, "\\a"
    };

if (pszFilename_in == NULL)
    return;
pInputFP = fopen(pszFilename_in, "r");
if (pInputFP == NULL)
    return;
while ((pRecord = readStdFormatRecord(pInputFP,
                                      &sCodeTable_s,
                                      ';',
                                      &uiRecordCount)) != NULL)
    {
    pszField = pRecord;
    while ((c = *pszField++) != '\0')
        {
        pszNextField = pszField + strlen(pszField) + 1;
        switch (c)
            {
            case 'A':
                ...
                break;
            case 'C':
                ...
                break;
            case 'D':
                ...
                break;
            case 'F':
                ...
                break;
            case 'N':
                ...
                break;
            case 'W':
                ...
                break;
            default:
                ...
                break;
            }
        pszField = pszNextField;
        }    
    ...
    }
cleanupAfterStdFormatRecord();
fclose(pInputFP);
return;
}

5.64.5 Source File

`record.c'

5.65 readTemplateFromAnalysis

5.65.1 Syntax

#include "template.h"   /* or opaclib.h */

WordTemplate * readTemplateFromAnalysis(
                           FILE *              pInputFP_in,
                           const TextControl * pTextCtl_in);

5.65.2 Description

readTemplateFromAnalysis fills in a WordTemplate data structure from an AMPLE style analysis file.

The arguments to readTemplateFromAnalysis are as follows:

pInputFP_in: is an input FILE pointer.
pTextCtl_in: points to a data structure that contains orthographic information.

5.65.3 Return Value

a pointer to a dynamically allocated WordTemplate data structure, or NULL if either EOF or an error occurs

5.65.4 Example

#include <stdio.h>
#include "template.h"
#include "rpterror.h"
...
void synthesizeFile(
    char *              pszInputFile_in,
    char *              pszOutputFile_in,
    TextControl *       pTextCtl_in)
{
FILE *                  pInputFP;
FILE *                  pOutputFP;
WordTemplate *          pWord;
WordAnalysis *          pAnal;
...
/*
 *  open the files
 */
if ((pszInputFile_in == NULL) || (pszOutputFile_in == NULL))
    return;
pInputFP  = fopen(pszInputFile_in, "r");
if (pInputFP == NULL)
    {
    reportError(WARNING_MSG, "Cannot open input file %s\n",
                pszInputFile_in);
    return;
    }
pOutputFP = fopen(pszOutputFile_in, "w");
if (pOutputFP == NULL)
    {
    reportError(WARNING_MSG, "Cannot open output file %s\n",
                pszOutputFile_in);
    fclose(pInputFP);
    return;
    }
/*
 *  process the data
 */
for (;;)
    {
    pWord = readTemplateFromAnalysis(pInputFP, pTextCtl_in);
    if (pWord == NULL)
        break;
    ...
    for ( pAnal = pWord->pAnalyses ; pAnal ; pAnal = pAnal->pNext )
        {
        ...
        }
    ...
    writeTextFromTemplate( pOutputFP, pWord, pTextCtl_in);
    freeWordTemplate( pWord );
    }
...
fclose(pInputFP);
fclose(pOutputFP);
}

5.65.5 Source File

`dtbin.c'

5.66 readTemplateFromText

5.66.1 Syntax

#include "template.h"   /* or opaclib.h */

WordTemplate * readTemplateFromText(FILE *              pInputFP_in,
                                    const TextControl * pTextCtl_in);

5.66.2 Description

readTemplateFromText reads a word from a text file into a WordTemplate structure.

The arguments to readTemplateFromText are as follows:

pInputFP_in: is an input FILE pointer.
pTextCtl_in: points to a data structure that contains orthographic information.

5.66.3 Return Value

a pointer to a dynamically allocated WordTemplate data structure, or NULL if either EOF or an error occurs

5.66.4 Example

See section 5.38 freeWordTemplate.

5.66.5 Source File

`textin.c'

5.67 readTemplateFromTextString

5.67.1 Syntax

#include "template.h"   /* or opaclib.h */

WordTemplate * readTemplateFromTextString(unsigned char **    ppszString_io,
                                          const TextControl * pTextCtl_in);

5.67.2 Description

readTemplateFromTextString reads a word from a text string into a WordTemplate structure.

The arguments to readTemplateFromTextString are as follows:

ppszString_io: points to a pointer which points to the string to be "read". The pointer to the string will be updated by this routine.
pTextCtl_in: points to a data structure that contains orthographic information.
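
The way ppszString_io is updated can be shown with a much simpler routine. The function sketchNextWord is hypothetical: it splits on whitespace only and terminates each word in place, whereas the library function builds a WordTemplate and consults the TextControl structure. Only the pointer-to-pointer convention is meant to carry over.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical sketch of the ppszString_io convention: return the start
 * of the next whitespace-delimited word, terminate it in place, and
 * advance the caller's pointer past it.  Returns NULL when only the
 * terminating NUL (or whitespace) remains. */
static unsigned char * sketchNextWord(unsigned char ** ppszString_io)
{
unsigned char * p;
unsigned char * pszWord;

assert((ppszString_io != NULL) && (*ppszString_io != NULL));
p = *ppszString_io;
while (isspace(*p))
    ++p;
if (*p == '\0')
    {
    *ppszString_io = p;
    return NULL;
    }
pszWord = p;
while ((*p != '\0') && !isspace(*p))
    ++p;
if (*p != '\0')
    *p++ = '\0';
*ppszString_io = p;             /* the next call resumes from here */
return pszWord;
}
```

A caller keeps a working pointer separate from the pointer to the start of the buffer, so that the buffer can still be freed after the working pointer has moved.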

5.67.3 Return Value

a pointer to a dynamically allocated WordTemplate data structure, or NULL if either the string consists merely of NUL or an error occurs

5.67.4 Example

#include <stdio.h>
#include "template.h"
#include "allocmem.h"
...
TextControl sTextCtl_g;
...
WordAnalysis * merge_analyses(
    WordAnalysis *  pList_in,
    WordAnalysis *  pAnal_in)
{
...
}
...
void process(
    unsigned char *pszInputText_in,
    FILE * pOutputFP_in)
{
unsigned char * pszInputText;
unsigned char * pszWord;
WordTemplate *  pWord;
WordAnalysis *  pAnal;
unsigned        uiAmbiguityCount;
unsigned long   uiWordCount;
unsigned        i;

pszInputText = (unsigned char *)duplicateString(
                                        (const char *)pszInputText_in);
pszWord = pszInputText;
for ( uiWordCount = 0L ;; ++uiWordCount )
    {
    pWord = readTemplateFromTextString(&pszWord, &sTextCtl_g);
    if (pWord == NULL)
        break;
    uiAmbiguityCount = 0;
    if (pWord->paWord != NULL)
        {
        for ( i = 0 ; pWord->paWord[i] ; ++i )
            {
            pAnal = analyze(pWord->paWord[i]);
            pWord->pAnalyses = merge_analyses(pWord->pAnalyses,
                                              pAnal);
            }
        for (pAnal = pWord->pAnalyses ; pAnal ; pAnal = pAnal->pNext)
            ++uiAmbiguityCount;
        }
    writeTemplate(pOutputFP_in, NULL, pWord, &sTextCtl_g);
    freeWordTemplate(pWord);
    }
freeMemory(pszInputText);
}

5.67.5 Source File

`textin.c'

5.68 reallocMemory

5.68.1 Syntax

#include "allocmem.h"   /* or opaclib.h */

void * reallocMemory(void * pBuffer_in,
                     size_t uiSize_in);

5.68.2 Description

reallocMemory adjusts an allocated buffer to a new size. It provides a "safe" interface to either realloc or malloc, depending on whether or not pBuffer_in is NULL. Running out of memory is handled the same as for allocMemory; see section 5.8 allocMemory.

The arguments to reallocMemory are as follows:

pBuffer_in: points to a dynamically allocated buffer previously returned by allocMemory, reallocMemory, or duplicateString. It also may be NULL to allocate a new block of memory.
uiSize_in: is the new size, either smaller or larger than the previous allocation size.
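
The choice between malloc and realloc described above can be sketched as follows. The function sketchRealloc is hypothetical, and its out of memory policy (report and exit) is an assumption made for this sketch only; the policy actually used is the one documented for allocMemory.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical sketch of a "safe" reallocation wrapper.  The out of
 * memory policy shown here (report and exit) is an assumption. */
static void * sketchRealloc(void * pBuffer_in, size_t uiSize_in)
{
void *  pNew;

assert(uiSize_in > 0);
if (pBuffer_in == NULL)
    pNew = malloc(uiSize_in);           /* nothing to resize yet */
else
    pNew = realloc(pBuffer_in, uiSize_in);
if (pNew == NULL)
    {
    fprintf(stderr, "Out of memory requesting %lu bytes\n",
            (unsigned long)uiSize_in);
    exit(EXIT_FAILURE);
    }
return pNew;
}
```

Since the block may move, the caller must always store the returned pointer and stop using the old one.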

5.68.3 Return Value

a pointer to a possibly reallocated block

5.68.4 Example

See section 5.29 fitAllocStringExactly.

5.68.5 Source File

`allocmem.c'

5.69 recapitalizeWord

5.69.1 Syntax

#include "template.h"   /* or opaclib.h */

void recapitalizeWord(char *              pszWord_io,
                      int                 iRecap_in,
                      const TextControl * pTextCtl_in);

5.69.2 Description

recapitalizeWord tries to reimpose capitalization as it was in the original input text.

The arguments to recapitalizeWord are as follows:

pszWord_io

points to the word to recapitalize.

iRecap_in

is the capitalization flag:

0 (NOCAP): None of the characters are capitalized.
1 (INITCAP): Only the initial character is capitalized.
2 (ALLCAP): All of the characters are capitalized.
4-65535: These values are bitmaps of individually capitalized characters, with 4 encoding the capitalization of the first character, 8 encoding the second character, and so on.

pTextCtl_in

points to a data structure that contains orthographic information.

5.69.3 Return Value

none

5.69.4 Example

#include "template.h"

void fix_new_words(WordTemplate *      pTemplate_io,
                   const TextControl * pTextCtl_in)
{
StringList *    pWord;
char *          p;

if ((pTemplate_io == NULL) || (pTemplate_io->pNewWords == NULL))
    return;
if (pTextCtl_in == NULL)
    return;
/*
 *  apply orthography changes to the word and recapitalize it
 */
for ( pWord = pTemplate_io->pNewWords ; pWord ; pWord = pWord->pNext )
    {
    /*
     *  apply output orthography changes and recapitalize
     */
    p = applyChanges(pWord->pszString, pTextCtl_in->pOutputChanges );
    recapitalizeWord( p, pTemplate_io->iCapital, pTextCtl_in);
    /*
     *  store the modified wordform
     */
    freeMemory(pWord->pszString);
    pWord->pszString = p;
    }
}

5.69.5 Source File

`textout.c'

5.70 removeDataFromTrie

5.70.1 Syntax

#include "trie.h"       /* or opaclib.h */

int removeDataFromTrie(Trie *    pTrieHead_in,
                       char *    pszKey_in,
                       void *    pInfo_in,
                       void * (* pfRemoveInfo_in)(void * pOld_in,
                                                  void * pList_io));

5.70.2 Description

removeDataFromTrie removes a stored piece of information from a trie.

The arguments to removeDataFromTrie are as follows:

pTrieHead_in

points to the head of a trie.

pszKey_in

points to the key string.

pInfo_in

points to the actual data element to remove.

pfRemoveInfo_in

points to a function for removing the data element from the stored information. The function has two arguments:

pOld_in: points to the item to remove from the collection (pInfo_in).
pList_io: points to a collection of items stored at a Trie node (pTrieInfo).

The function returns the updated pointer to the data collection for storing as the value of pTrieInfo.

5.70.3 Return Value

zero if successful, nonzero if an error occurs

5.70.4 Example

#include <string.h>
#include "trie.h"
#include "rpterror.h"
#include "allocmem.h"
...
typedef struct lex_item {
    struct lex_item *   pLink;          /* link to next item */
    struct lex_item *   pNext;          /* link to next homograph */
    unsigned char *     pszForm;        /* lexical form (word) */
    unsigned char *     pszGloss;       /* lexical gloss */
    unsigned short      uiCategory;     /* lexical category */
    } LexItem;
...
Trie *          pLexicon_g;
unsigned long   uiLexiconCount_g;
static char     szWhitespace_m[7] = " \t\r\n\f\v";
...
static void * remove_lex_item(void * pDefunct_in, void * pList_in)
{
LexItem *       pDefunct = (LexItem *)pDefunct_in;
LexItem *       pLex;
/*
 *  be a little paranoid
 */
if (pDefunct == NULL)
    return pList_in;
/*
 *  handle removing the head of the list
 */
if (pDefunct_in == pList_in)
    return pDefunct->pLink;
/*
 *  unlink from both the general list and the list of homographs
 */
for ( pLex = (LexItem *)pList_in ; pLex ; pLex = pLex->pLink )
    {
    if (pLex->pNext == pDefunct)
        pLex->pNext = pDefunct->pNext;
    if (pLex->pLink == pDefunct)
        {
        pLex->pLink = pDefunct->pLink;
        break;          /* no need to check further */
        }
    }
return pList_in;
}

void remove_from_lexicon(char * pszForm_in,
                         char * pszGloss_in,
                         char * pszCategory_in)
{
LexItem *       pLex;
unsigned short  uiCategory;

if (    (pszForm_in     == NULL) ||
        (pszGloss_in    == NULL) ||
        (pszCategory_in == NULL) )
    return;

uiCategory = index_lexical_category(pszCategory_in);
for (   pLex = findDataInTrie(pLexicon_g, pszForm_in) ;
        pLex ;
        pLex = pLex->pLink )
    {
    if (    (strcmp((char *)pLex->pszForm,  pszForm_in)  == 0) &&
            (strcmp((char *)pLex->pszGloss, pszGloss_in) == 0) &&
            (pLex->uiCategory    == uiCategory)                )
        {
        removeDataFromTrie(pLexicon_g, pszForm_in, pLex, 
                           remove_lex_item);
        break;
        }
    }
}

5.70.5 Source File

`trie.c'

5.71 removeFromStringList

5.71.1 Syntax

#include "strlist.h"    /* or strclass.h or change.h or textctl.h
                           or template.h or opaclib.h */

StringList * removeFromStringList(StringList * pList_io,
                                  const char * pszString_in);

5.71.2 Description

removeFromStringList removes the first occurrence of a string from a list of strings.

The arguments to removeFromStringList are as follows:

pList_io: points to a list of strings.
pszString_in: points to the string to be removed.

5.71.3 Return Value

a pointer to the (possibly shorter) list, or NULL if the only item in the list was removed
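
Removing the first matching string from a singly linked list, including the case where the head is removed, can be sketched as follows. The node type SketchStringList is hypothetical: its field names follow the examples in this manual, but its layout and the assumption that nodes and strings come from malloc are made for this sketch only.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical list node; the field names follow the examples in this
 * manual, but the layout is an assumption. */
typedef struct sketch_strlist {
    char *                  pszString;
    struct sketch_strlist * pNext;
    } SketchStringList;

/* Unlink and free the first node whose string equals pszString_in.
 * Assumes that each node and its string were allocated with malloc. */
static SketchStringList * sketchRemoveString(SketchStringList * pList_io,
                                             const char * pszString_in)
{
SketchStringList **     ppLink;
SketchStringList *      pNode;

assert(pszString_in != NULL);
for ( ppLink = &pList_io ; *ppLink != NULL ; ppLink = &(*ppLink)->pNext )
    {
    pNode = *ppLink;
    if (strcmp(pNode->pszString, pszString_in) == 0)
        {
        *ppLink = pNode->pNext;         /* also handles the list head */
        free(pNode->pszString);
        free(pNode);
        break;
        }
    }
return pList_io;
}
```

Because the head of the list may change or disappear, the caller assigns the return value back to its list pointer, as the example in this section does.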

5.71.4 Example

#include "strlist.h"
...
static StringList *     pNameList_m;
...
char *  pszName;
...
pNameList_m = removeFromStringList(pNameList_m, pszName);
...

5.71.5 Source File

`rmstr_sl.c'

5.72 reportError

5.72.1 Syntax

#include "rpterror.h"   /* or opaclib.h */

void reportError(int          eMessageType_in,
                 const char * pszFormat_in,
                 ...);

5.72.2 Description

reportError reports an error message to the user. For MS-DOS and Unix, reportError writes to the standard error output. The message is also written to the standard output if it has been redirected. For GUI programs, the programmer must write a different version of reportError to satisfy the link requirements of other functions in the OPAC library. This would typically display a message box.

The arguments to reportError are as follows:

eMessageType_in

is the type of error message being reported, one of the following:

ERROR_MSG: is a message about an erroneous situation.
WARNING_MSG: is a message about a situation that is not quite an error, but not normal either.
DEBUG_MSG: is a message that only the programmer is expected to understand.

pszFormat_in

points to a printf style format string for the (error) message.

...

represents zero or more arguments for the format string (pszFormat_in).
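
A replacement for GUI programs might be organized along the following lines. This is only a sketch: sketchReportError, the SKETCH_ message type values, and the buffer szLastMessage_m are all hypothetical, and the buffer merely stands in for whatever message box the program would display.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical message types; the library's actual values may differ. */
enum { SKETCH_ERROR_MSG, SKETCH_WARNING_MSG, SKETCH_DEBUG_MSG };

static char szLastMessage_m[512];       /* stands in for a message box */

/* Hypothetical replacement with the same argument pattern as
 * reportError: format the message, then hand it to the user interface. */
static void sketchReportError(int eMessageType_in,
                              const char * pszFormat_in, ...)
{
static const char * const apszLabel_s[] = { "ERROR", "WARNING", "DEBUG" };
va_list marker;
int     iLength;

assert((eMessageType_in >= SKETCH_ERROR_MSG) &&
       (eMessageType_in <= SKETCH_DEBUG_MSG));
iLength = snprintf(szLastMessage_m, sizeof szLastMessage_m, "%s: ",
                   apszLabel_s[eMessageType_in]);
va_start(marker, pszFormat_in);
vsnprintf(szLastMessage_m + iLength,
          sizeof szLastMessage_m - (size_t)iLength,
          pszFormat_in, marker);
va_end(marker);
/* a GUI program would display szLastMessage_m in a message box here */
}
```

Formatting into a bounded buffer with vsnprintf keeps an unexpectedly long message from overrunning the buffer; the message is truncated instead.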

5.72.3 Return Value

none

5.72.4 Example

See section 5.1 addDataToTrie.

5.72.5 Source File

`rpterror.c'

5.73 reportMessage

5.73.1 Syntax

#include "rpterror.h"   /* or opaclib.h */

void reportMessage(int          bNotSilent_in,
                   const char * pszFormat_in,
                   ...);

5.73.2 Description

reportMessage displays a message with zero or more arguments. For MS-DOS and Unix, reportMessage writes to the standard error output. The message is also written to the standard output if it has been redirected. For GUI programs, the programmer must write a different version of reportMessage to satisfy the link requirements of other functions in the OPAC library. This would typically write to a message window.

The arguments to reportMessage are as follows:

bNotSilent_in: allows writing the message to the standard error output if TRUE (nonzero). If FALSE (zero), the message is written only to the standard output (stdout), and then only if it has been redirected. This allows programs to have a "quiet" mode of operation without requiring a global variable.
pszFormat_in: points to a printf style format string for the message.
...: represents zero or more arguments for the format string (pszFormat_in).

5.73.3 Return Value

none

5.73.4 Example

#include "rpterror.h"
...
static int      iDebugLevel_m;
...
static int read_token(char * pszBuffer_in, unsigned uiBufferSize_in)
{
int     iTokenType;
...
if (iDebugLevel_m >= 8)
    {
    reportMessage(TRUE, "DEBUG read_token(\"%s\",%u) => ",
                  pszBuffer_in, uiBufferSize_in);
    switch (iTokenType)
        {
        case BECOMES:
            reportMessage(TRUE, "BECOMES_TOKEN");
            break;
        case KEYWORD:
            reportMessage(TRUE, "KEYWORD_TOKEN");
            break;
        case SYMBOL:
            reportMessage(TRUE, "SYMBOL_TOKEN");
            break;
        default:
            reportMessage(TRUE, "'%c'\t", iTokenType);
            break;
        }
    reportMessage(TRUE, "\n");
    }
return( iTokenType );
}

5.73.5 Source File

`rptmessg.c'

5.74 reportProgress

5.74.1 Syntax

#include "opaclib.h"

void reportProgress(unsigned long uiCount_in);

5.74.2 Description

reportProgress displays a progress report based on a progress counter.

The standard version of reportProgress actually does nothing. For GUI programs, the programmer may write a version of reportProgress to display some sort of progress message using the progress counter.

reportProgress has one argument:

uiCount_in: is a progress count of some sort.

5.74.3 Return Value

none

5.74.4 Example

#include "opaclib.h"
...
static unsigned long    uiTokenCount_m;
...
static int read_token(char * pszBuffer_in, unsigned uiBufferSize_in)
{
int     iTokenType;
...
++uiTokenCount_m;
reportProgress( uiTokenCount_m );
return( iTokenType );
}

5.74.5 Source File

`rptprgrs.c'

5.75 resetTextControl

5.75.1 Syntax

#include "textctl.h"    /* or template.h or opaclib.h */

void resetTextControl(TextControl * pTextCtl_io);

5.75.2 Description

resetTextControl frees any memory allocated by either loadIntxCtlFile or loadOutxCtlFile. It does not free the TextControl data structure itself.

resetTextControl has one argument:

pTextCtl_io: points to a data structure that contains orthographic information.

5.75.3 Return Value

none

5.75.4 Example

#include <stdio.h>
#include <string.h>     /* for memcpy */
#include "textctl.h"    /* includes strclass.h */
#include "rpterror.h"
...
char               szIntxFilename_g[200];
TextControl        sTextControl_g;
StringClass *      pStringClasses_g = NULL;
static TextControl sDefaultTextControl_m = {
    NULL,       /* filename */
    NULL,       /* ordered array of lowercase letters */
    NULL,       /* ordered array of matching uppercase letters */
    NULL,       /* array of caseless letters */
    NULL,       /* list of input orthography changes */
    NULL,       /* list of output (orthography) changes */
    NULL,       /* list of format markers (fields) to include */
    NULL,       /* list of format markers (fields) to exclude */
    '\\',       /* initial character of format markers (field codes) */
    '%',        /* character for marking ambiguities and failures */
    '-',        /* character for marking decomposition */
    '|',        /* initial character of secondary format markers */
    NULL,       /* (Manuscripter) bar codes */
    TRUE,       /* flag whether to capitalize individual letters */
    TRUE,       /* flag whether to decapitalize/recapitalize */
    100         /* maximum number of decapitalization alternatives */
    };
...
memcpy(&sTextControl_g, &sDefaultTextControl_m, sizeof(TextControl));
fprintf(stderr, "Text Control File (xxINTX.CTL) [none]: ");
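if (fgets( szIntxFilename_g, 200, stdin ) == NULL)
    szIntxFilename_g[0] = NUL;
szIntxFilename_g[strcspn(szIntxFilename_g, "\r\n")] = NUL;  /* trim newline */
if (szIntxFilename_g[0])
    {
    if (loadIntxCtlFile(szIntxFilename_g, ';',
                        &sTextControl_g, &pStringClasses_g) != 0)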
        {
        reportError(ERROR_MSG, "Error reading text control file %s\n",
                    szIntxFilename_g);
        }
    }
if (    (sTextControl_g.cBarMark == NUL) &&
        (sTextControl_g.pszBarCodes != NULL) )
    {
    freeMemory(sTextControl_g.pszBarCodes);
    sTextControl_g.pszBarCodes = NULL;
    }
if (    (sTextControl_g.cBarMark != NUL) &&
        (sTextControl_g.pszBarCodes == NULL) )
    {
    sTextControl_g.pszBarCodes = (unsigned char *)duplicateString(
                                                    "bdefhijmrsuvyz");
    }
...
resetTextControl(&sTextControl_g);

5.75.5 Source File

`resetxtc.c'

5.76 resetWordFormationChars

5.76.1 Syntax

#include "textctl.h"    /* or template.h or opaclib.h */

void resetWordFormationChars(TextControl * pTextCtl_io);

5.76.2 Description

resetWordFormationChars erases the information about word formation characters stored by previous calls to either addWordFormationChars or addLowerUpperWFChars. This frees any allocated memory and sets the relevant pointers to NULL.

resetWordFormationChars has one argument:

pTextCtl_io: points to a data structure that contains orthographic information.

5.76.3 Return Value

none

5.76.4 Example

See section 5.2 addLowerUpperWFChars.

5.76.5 Source File

`myctype.c'

5.77 setAllocMemoryTracing

5.77.1 Syntax

#include "allocmem.h"   /* or opaclib.h */

void setAllocMemoryTracing(const char * pszFilename_in);

5.77.2 Description

setAllocMemoryTracing turns debugging on (if a filename is given) or off (if pszFilename_in is NULL). If debugging is on, every call to allocMemory, reallocMemory, and freeMemory is logged to the given file for postmortem analysis. Calls to duplicateString are logged as calls to allocMemory, which duplicateString calls internally.

setAllocMemoryTracing has one argument:

pszFilename_in: points to the name of the debugging output file, or is NULL.
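
The idea of logging every allocation and release to a file for later analysis can be sketched as follows. Everything here is hypothetical: the sketch prefix on the function names, the use of malloc and free directly, and in particular the format of the log lines, which is invented for this illustration and is not the format written by the library.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

static FILE *   pTraceFP_m = NULL;      /* NULL means tracing is off */

/* Hypothetical counterpart of setAllocMemoryTracing. */
static void sketchSetTracing(const char * pszFilename_in)
{
if (pTraceFP_m != NULL)
    {
    fclose(pTraceFP_m);
    pTraceFP_m = NULL;
    }
if (pszFilename_in != NULL)
    pTraceFP_m = fopen(pszFilename_in, "w");
}

/* Hypothetical traced allocator; the log line format is invented. */
static void * sketchTracedAlloc(size_t uiSize_in)
{
void *  p;

assert(uiSize_in > 0);
p = malloc(uiSize_in);
if (pTraceFP_m != NULL)
    fprintf(pTraceFP_m, "alloc %lu => %p\n", (unsigned long)uiSize_in, p);
return p;
}

/* Hypothetical traced release. */
static void sketchTracedFree(void * pBuffer_in)
{
if (pTraceFP_m != NULL)
    fprintf(pTraceFP_m, "free %p\n", pBuffer_in);
free(pBuffer_in);
}
```

Pairing the alloc and free lines in such a log by address is one way to find blocks that were never released or that were released twice.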

5.77.3 Return Value

none

5.77.4 Example

#include <stdlib.h>
#include "allocmem.h"
...
extern int      getopt(int argc, char * const argv[],
                       const char *opts);
extern char *   optarg;
...
int main(int argc, char ** argv)
{
void *          pTrapAddress = NULL;
unsigned        iTrapCount   = 0;
int             k;
char *          p;
...
while ((k = getopt(argc, argv, "ai:o:x:z:Z:")) != EOF)
    {
    switch (k)
        {
...
        case 'z':       /* memory allocation trace filename */
            setAllocMemoryTracing(optarg);
            break;

        case 'Z':       /* memory allocation trap address,count */
            pTrapAddress = (void *)strtoul(optarg, &p, 10);
            if (*p == ',')
                iTrapCount = (unsigned)strtoul(p+1, NULL, 10);
            if (iTrapCount == 0)
                iTrapCount = 1;
            setAllocMemoryTrap(pTrapAddress, iTrapCount);
            break;
...
        }
    }
...
}

5.77.5 Source File

`allocmem.c'

5.78 setAllocMemoryTrap

5.78.1 Syntax

#include "allocmem.h"   /* or opaclib.h */

void setAllocMemoryTrap(const void * pAddress_in,
                        int          iCount_in);

5.78.2 Description

setAllocMemoryTrap sets a trap for the iCount_in'th reference to the address pAddress_in by either allocMemory or freeMemory. This can be useful for tracking down memory allocation bugs.

The arguments to setAllocMemoryTrap are as follows:

pAddress_in: is the memory address to trap on.
iCount_in: is the occurrence to trap on.
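
Counting references to one address until the requested occurrence is reached can be sketched as follows. The functions sketchSetTrap and sketchCheckTrap are hypothetical, and what the library actually does when the trap fires is not described here; the sketch simply returns nonzero at the point where a debugger breakpoint would be useful.

```c
#include <assert.h>
#include <stddef.h>

static const void *     pTrapAddress_m = NULL;
static int              iTrapCount_m   = 0;

/* Hypothetical counterpart of setAllocMemoryTrap. */
static void sketchSetTrap(const void * pAddress_in, int iCount_in)
{
assert(iCount_in > 0);
pTrapAddress_m = pAddress_in;
iTrapCount_m   = iCount_in;
}

/* Called by the (hypothetical) allocation and release routines with the
 * address involved.  Returns nonzero on the iCount_in'th reference to
 * the trapped address; a debugger breakpoint would be set at that point. */
static int sketchCheckTrap(const void * pAddress_in)
{
if ((pTrapAddress_m == NULL) || (pAddress_in != pTrapAddress_m))
    return 0;
if (--iTrapCount_m > 0)
    return 0;
pTrapAddress_m = NULL;          /* fire once only */
return 1;
}
```

An address and occurrence taken from a trace log of an earlier run can be fed to such a trap, provided that the program allocates memory in the same order on each run.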

5.78.3 Return Value

none

5.78.4 Example

See section 5.77 setAllocMemoryTracing.

5.78.5 Source File

`allocmem.c'

5.79 showAmbiguousProgress

5.79.1 Syntax

#include "opaclib.h"

unsigned long showAmbiguousProgress(unsigned      uiAmbiguityCount_in,
                                    unsigned long uiItemCount_in);

5.79.2 Description

showAmbiguousProgress displays the progress of the program in a rudimentary fashion. If uiAmbiguityCount_in is 0, a star (`*') is written to the screen; if it is 1, a dot (`.') is written. Otherwise, if uiAmbiguityCount_in is less than 10, the count digit is written, and if it is greater than or equal to 10, a greater-than sign (`>') is written. These progress characters are grouped in bunches of 10, with 5 bunches on a line and a space between bunches. Every other line ends with the total count of items thus far (uiItemCount_in).

The arguments to showAmbiguousProgress are as follows:

uiAmbiguityCount_in: is the number of alternative results to report for the current item.
uiItemCount_in: is the number of items that have been processed thus far.
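
The mapping from ambiguity count to progress character can be sketched as below. The names progressChar and sketchShowProgress are hypothetical, and the sketch assumes that the returned count is the input count plus one and that the running total is printed after every 100th item; the code in `ambprog.c' may differ in both respects.

```c
#include <stdio.h>

/* Map an ambiguity count to the progress character described above. */
static char progressChar(unsigned uiAmbiguityCount)
{
if (uiAmbiguityCount == 0)
    return '*';
if (uiAmbiguityCount == 1)
    return '.';
if (uiAmbiguityCount < 10)
    return (char)('0' + uiAmbiguityCount);
return '>';
}

/* Hypothetical stand-in for showAmbiguousProgress: writes one progress
   character to pOut and returns the incremented item count. */
static unsigned long sketchShowProgress(unsigned      uiAmbiguityCount,
                                        unsigned long uiItemCount,
                                        FILE *        pOut)
{
fputc(progressChar(uiAmbiguityCount), pOut);
++uiItemCount;
if (uiItemCount % 50 == 0)              /* 5 bunches of 10 per line */
    {
    if (uiItemCount % 100 == 0)         /* every other line: show total */
        fprintf(pOut, "  %lu", uiItemCount);
    fputc('\n', pOut);
    }
else if (uiItemCount % 10 == 0)
    fputc(' ', pOut);                   /* space between bunches */
return uiItemCount;
}
```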

5.79.3 Return Value

the updated value for uiItemCount_in

5.79.4 Example

See section 5.38 freeWordTemplate.

5.79.5 Source File

`ambprog.c'

5.80 squeezeStringList

5.80.1 Syntax

#include "strlist.h"    /* or strclass.h or change.h or textctl.h
                           or template.h or opaclib.h */

StringList * squeezeStringList(StringList * pList_io);

5.80.2 Description

squeezeStringList removes any redundant strings from a list of strings.

squeezeStringList has one argument:

pList_io: points to a list of strings.
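
As an illustration of what removing redundant strings from a linked list involves, here is a minimal sketch. The SketchStringList layout, the helper sketchPush, and the choice to keep the first occurrence of each string are all assumptions made for this example; they are not taken from `sqz_sl.c'.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical node layout, assumed for illustration only. */
typedef struct sketch_strlist {
    char *                  pszString;
    struct sketch_strlist * pNext;
    } SketchStringList;

/* Helper for building lists: push a copy of psz on the front. */
static SketchStringList * sketchPush(SketchStringList * pList,
                                     const char *       psz)
{
SketchStringList * pNew    = malloc(sizeof *pNew);
char *             pszCopy = malloc(strlen(psz) + 1);

if (pNew == NULL || pszCopy == NULL)
    {
    free(pNew);
    free(pszCopy);
    return pList;               /* out of memory: list unchanged */
    }
strcpy(pszCopy, psz);
pNew->pszString = pszCopy;
pNew->pNext     = pList;
return pNew;
}

/* Remove later duplicates of each string, keeping first occurrences. */
static SketchStringList * sketchSqueeze(SketchStringList * pList)
{
SketchStringList * pKeep;
SketchStringList * pPrev;
SketchStringList * pScan;

for ( pKeep = pList ; pKeep != NULL ; pKeep = pKeep->pNext )
    {
    pPrev = pKeep;
    pScan = pKeep->pNext;
    while (pScan != NULL)
        {
        if (strcmp(pKeep->pszString, pScan->pszString) == 0)
            {
            pPrev->pNext = pScan->pNext;    /* unlink the duplicate */
            free(pScan->pszString);
            free(pScan);
            pScan = pPrev->pNext;
            }
        else
            {
            pPrev = pScan;
            pScan = pScan->pNext;
            }
        }
    }
return pList;
}
```

Because the first node is never removed in this sketch, the returned pointer equals the argument; assigning the return value back, as the library's own example does, stays correct either way.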

5.80.3 Return Value

a pointer to the (possibly smaller) list of strings

5.80.4 Example

#include "template.h"   /* includes strlist.h */
...
static WordTemplate *   pTemplate_m = NULL;
...
/*
 *  eliminate identical results
 */
pTemplate_m->pNewWords = squeezeStringList( pTemplate_m->pNewWords );

5.80.5 Source File

`sqz_sl.c'

5.81 tokenizeString

5.81.1 Syntax

#include "opaclib.h"

unsigned char * tokenizeString(unsigned char *       pszString_in,
                               const unsigned char * pszSeparate_in);

5.81.2 Description

tokenizeString splits the string pszString_in into a sequence of zero or more text tokens separated by spans of one or more characters from pszSeparate_in. Only the initial call provides a value for pszString_in; successive calls must use a NULL pointer for the first argument. The first separator character following the token in pszString_in is replaced by a NUL character. Subsequent calls to tokenizeString work through pszString_in sequentially. Note that pszSeparate_in may change from one call to the next.

tokenizeString is like strtok except that it operates on strings of unsigned char rather than strings of char.

The arguments to tokenizeString are as follows:

pszString_in: points to a NUL-terminated character string, or NULL.
pszSeparate_in: points to a NUL-terminated set of separator characters, or NULL. If it is NULL, then the rest of the string is returned as the token.
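
The behaviour described above can be approximated with the standard strspn and strcspn functions. The following is a sketch under stated assumptions, not the code in `tokenize.c': the name sketchTokenize is hypothetical, and returning NULL when the separator set is NULL and nothing remains is a guess about an edge case the description leaves open.

```c
#include <stddef.h>
#include <string.h>

/* A strtok-style tokenizer over strings of unsigned char. */
static unsigned char * sketchTokenize(unsigned char *       pszString,
                                      const unsigned char * pszSeparate)
{
static unsigned char * pszNext_s = NULL;    /* saved scan position */
unsigned char *        pszToken;

if (pszString == NULL)
    pszString = pszNext_s;                  /* continue previous scan */
if (pszString == NULL)
    return NULL;
if (pszSeparate == NULL)
    {
    /* no separator set: the rest of the string is the token
       (assumption: an empty remainder yields NULL) */
    pszNext_s = NULL;
    return (*pszString != '\0') ? pszString : NULL;
    }
/* skip any leading separator characters */
pszString += strspn((const char *)pszString, (const char *)pszSeparate);
if (*pszString == '\0')
    {
    pszNext_s = NULL;
    return NULL;
    }
pszToken   = pszString;
pszString += strcspn((const char *)pszString, (const char *)pszSeparate);
if (*pszString != '\0')
    {
    *pszString = '\0';                      /* terminate the token */
    pszNext_s  = pszString + 1;
    }
else
    pszNext_s = NULL;                       /* reached end of input */
return pszToken;
}
```

Like strtok, this sketch keeps its position in a static variable, so it is not reentrant and cannot tokenize two strings in interleaved fashion.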

5.81.3 Return Value

a pointer to the next token extracted from the input string, or NULL if no more tokens exist

5.81.4 Example

#include "opaclib.h"
...
unsigned char    szWhitespace_m[7] = " \n\r\t\f\v";
unsigned char    szInputBuffer_m[1024];
unsigned char *  pszToken;
...
for (   pszToken = tokenizeString(szInputBuffer_m, szWhitespace_m) ;
        pszToken != NULL ;
        pszToken = tokenizeString(NULL, szWhitespace_m) )
    {
    ...
    }
...

5.81.5 Source File

`tokenize.c'

5.82 trimTrailingWhitespace

5.82.1 Syntax

#include "opaclib.h"

char * trimTrailingWhitespace(char * pszString_io);

5.82.2 Description

trimTrailingWhitespace removes any trailing white space characters from the input string.

trimTrailingWhitespace has one argument:

pszString_io: points to a character string.
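
A minimal sketch of in-place trimming follows. The name sketchTrimTrailing is hypothetical, and the sketch assumes that "white space" means whatever the standard isspace function accepts; the library may use its own definition.

```c
#include <ctype.h>
#include <string.h>

/* Overwrite trailing white space with a NUL and return the string. */
static char * sketchTrimTrailing(char * pszString)
{
size_t uiLength;

if (pszString == NULL)
    return NULL;
uiLength = strlen(pszString);
/* the cast keeps isspace well defined for characters above 127 */
while (uiLength > 0 && isspace((unsigned char)pszString[uiLength - 1]))
    --uiLength;
pszString[uiLength] = '\0';
return pszString;
}
```

Returning the input pointer allows the call to be nested inside another expression, as with the standard string functions.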

5.82.3 Return Value

a pointer to the beginning of the input string

5.82.4 Example

#include <stdio.h>
#include <string.h>
#include "opaclib.h"
...
static char     szWhitespace_m[7] = " \t\r\n\f\v";
...
FILE *          pRulesFP;
unsigned        uiLineNumber;
char *          pszToken;
...
for ( uiLineNumber = 1 ;;)
    {
    pszToken = readLineFromFile(pRulesFP, &uiLineNumber, ';');
    if (pszToken == NULL)
        break;
    /*
     *  skip leading spaces and remove trailing spaces
     */
    pszToken += strspn(pszToken, szWhitespace_m);
    if (*pszToken == NUL)
        continue;
    trimTrailingWhitespace(pszToken);
    ...
    }

5.82.5 Source File

`trimspac.c'

5.83 unlinkStringList

5.83.1 Syntax

#include "strlist.h"    /* or strclass.h or change.h or textctl.h
                           or template.h or opaclib.h */

void unlinkStringList(StringList ** ppList_io);

5.83.2 Description

unlinkStringList frees the StringList data structures in a list of strings, while leaving intact the strings they point to.

unlinkStringList has one argument:

ppList_io: is the address of a pointer to the head of a list of strings to unlink.
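
The distinction between freeing the list nodes and freeing the strings can be sketched as follows. SketchStringList and sketchUnlink are hypothetical names, and setting the caller's pointer to NULL is an assumption of this sketch rather than documented behaviour.

```c
#include <stdlib.h>

/* Hypothetical node layout, assumed for illustration only. */
typedef struct sketch_strlist {
    char *                  pszString;
    struct sketch_strlist * pNext;
    } SketchStringList;

/* Free every node but leave the strings alone; whoever else holds
   pointers to the strings can go on using them. */
static void sketchUnlink(SketchStringList ** ppList)
{
SketchStringList * pNode;
SketchStringList * pNext;

if (ppList == NULL)
    return;
for ( pNode = *ppList ; pNode != NULL ; pNode = pNext )
    {
    pNext = pNode->pNext;       /* save the link before freeing */
    free(pNode);
    }
*ppList = NULL;                 /* assumption: clear caller's pointer */
}
```

This pattern suits lists whose strings are owned elsewhere, for example strings that are also stored in another data structure.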

5.83.3 Return Value

none

5.83.4 Example

#include "strlist.h"
...
StringList *    pList;
...
unlinkStringList(&pList);
pList = NULL;

5.83.5 Source File

`unlst_sl.c'

5.84 updateStringList

5.84.1 Syntax

#include "strlist.h"    /* or strclass.h or change.h or textctl.h
                           or template.h or opaclib.h */

char * updateStringList(StringList ** ppList_io,
                        const char *  pszString_in);

5.84.2 Description

updateStringList adds the string to the list if it is not already in the list. This function is similar to mergeIntoStringList, except that it has a different argument and returns a different value.

The arguments to updateStringList are as follows:

ppList_io: is the address of a pointer to the list of strings to be updated.
pszString_in: points to the string to be added to the list of strings.
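
The add-if-absent behaviour can be sketched as below. The names are hypothetical, and two details are assumptions of the sketch: new strings are added at the front of the list, and NULL is returned if memory cannot be allocated.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical node layout, assumed for illustration only. */
typedef struct sketch_strlist {
    char *                  pszString;
    struct sketch_strlist * pNext;
    } SketchStringList;

/* Return the list's own copy of psz, adding a copy first if needed. */
static char * sketchUpdate(SketchStringList ** ppList, const char * psz)
{
SketchStringList * pNode;
char *             pszCopy;

if (ppList == NULL || psz == NULL)
    return NULL;
for ( pNode = *ppList ; pNode != NULL ; pNode = pNode->pNext )
    {
    if (strcmp(pNode->pszString, psz) == 0)
        return pNode->pszString;        /* already in the list */
    }
pNode   = malloc(sizeof *pNode);
pszCopy = malloc(strlen(psz) + 1);
if (pNode == NULL || pszCopy == NULL)
    {
    free(pNode);
    free(pszCopy);
    return NULL;
    }
strcpy(pszCopy, psz);
pNode->pszString = pszCopy;
pNode->pNext     = *ppList;             /* link in at the front */
*ppList          = pNode;
return pszCopy;
}
```

Because the same stored pointer is returned for equal strings, callers can compare the returned pointers directly instead of calling strcmp.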

5.84.3 Return Value

a pointer to the copy of pszString_in stored in the list of strings

5.84.4 Example

#include "strlist.h"
...
static StringList *     pCategories_m;
static char             szBuffer_m[100];
...
char *  pszCategory;
...
pszCategory = updateStringList( &pCategories_m, szBuffer_m );
...

5.84.5 Source File

`updat_sl.c'

5.85 walkTrie

5.85.1 Syntax

#include "trie.h"       /* or opaclib.h */

void walkTrie(Trie *  pTrieHead_in,
              void (* pfWalk_in)(void * pList_in));

5.85.2 Description

walkTrie walks through a trie, processing the information stored at each node.

The arguments to walkTrie are as follows:

pTrieHead_in

points to the head of a trie.

pfWalk_in

points to a function for processing the stored information at each node of the trie. The function has one argument:

pList_in: points to a collection of items stored at a Trie node (Trieinfo).

The function does not return a value.
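
The callback-driven traversal can be illustrated with a small stand-in trie. The SketchTrie layout (first child, next sibling, stored information) and the names sketchWalk and countVisit are assumptions made for this example; the real Trie structure and the visiting order used by `trie.c' may differ.

```c
#include <stddef.h>

/* Hypothetical trie node, assumed for illustration only. */
typedef struct sketch_trie {
    struct sketch_trie * pChildren;     /* first child, or NULL */
    struct sketch_trie * pSiblings;     /* next sibling, or NULL */
    void *               pTrieInfo;     /* stored items, or NULL */
    } SketchTrie;

/* Call pfWalk once for each node that has stored information. */
static void sketchWalk(SketchTrie * pTrie, void (* pfWalk)(void * pList))
{
for ( ; pTrie != NULL ; pTrie = pTrie->pSiblings )
    {
    if (pTrie->pTrieInfo != NULL)
        (*pfWalk)(pTrie->pTrieInfo);
    sketchWalk(pTrie->pChildren, pfWalk);   /* descend one level */
    }
}

/* Example callback: count how many collections were visited. */
static int iVisited_m = 0;

static void countVisit(void * pList)
{
(void)pList;                    /* contents unused in this example */
++iVisited_m;
}
```

Since the callback receives only the stored collection, any other state it needs (an output file, a counter) has to live in a variable it can reach, as with iVisited_m here.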

5.85.3 Return Value

none

5.85.4 Example

#include <stdio.h>
#include "trie.h"
#include "rpterror.h"
...
typedef struct lex_item {
    struct lex_item *   pLink;          /* link to next item */
    struct lex_item *   pNext;          /* link to next homograph */
    unsigned char *     pszForm;        /* lexical form (word) */
    unsigned char *     pszGloss;       /* lexical gloss */
    unsigned short      uiCategory;     /* lexical category */
    } LexItem;
...
Trie *          pLexicon_g;
FILE *          pLexiconFP_m;
...
static void write_lex_items(void * pList_in)
{
LexItem *       pLex;

if (pLexiconFP_m == NULL)
    return;
for ( pLex = (LexItem *)pList_in ; pLex ; pLex = pLex->pLink )
    {
    fprintf(pLexiconFP_m, "%-20s %-20s %s\n",
            pLex->pszForm, pLex->pszGloss,
            get_lexical_category_name(pLex->uiCategory));
    }
}

void write_lexicon(const char * pszLexiconFile_in)
{
if (pszLexiconFile_in == NULL)
    {
    reportError(WARNING_MSG, "Missing output lexicon filename\n");
    return;
    }
pLexiconFP_m = fopen(pszLexiconFile_in, "w");
if (pLexiconFP_m == NULL)
    {
    reportError(WARNING_MSG,
                "Cannot open lexicon file %s for output\n",
                pszLexiconFile_in);
    return;
    }
walkTrie(pLexicon_g, write_lex_items);
fclose(pLexiconFP_m);
}

5.85.5 Source File

`trie.c'

5.86 writeAllocMemoryDebugMsg

5.86.1 Syntax

#include "allocmem.h"   /* or opaclib.h */

void writeAllocMemoryDebugMsg(const char * pszFormat_in,
                              ...);

5.86.2 Description

writeAllocMemoryDebugMsg writes a message to the memory allocation tracing file if it is open, and does nothing if that file is not open. The memory allocation tracing file is opened and closed by setAllocMemoryTracing. writeAllocMemoryDebugMsg is similar to printf except that it writes to a specific (optional) file rather than to the standard output.

The arguments to writeAllocMemoryDebugMsg are as follows:

pszFormat_in: points to a printf style format string for the message.
...: represents zero or more arguments for the format string (pszFormat_in).
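
A printf-like function that writes to an optional file is usually built on vfprintf. The sketch below shows that pattern; pTraceFP_m and sketchDebugMsg are hypothetical names, and the call to fflush is a design choice of the sketch, not something the description above states.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical trace file pointer: NULL while tracing is off. */
static FILE * pTraceFP_m = NULL;

/* Write a formatted message to the trace file, if there is one. */
static void sketchDebugMsg(const char * pszFormat, ...)
{
va_list args;

if (pTraceFP_m == NULL || pszFormat == NULL)
    return;                     /* tracing is off: do nothing */
va_start(args, pszFormat);
vfprintf(pTraceFP_m, pszFormat, args);
va_end(args);
fflush(pTraceFP_m);             /* keep the log usable after a crash */
}
```

Making the call a no-op when the file is closed lets callers leave trace messages in place without testing a flag at each call site.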

5.86.3 Return Value

none

5.86.4 Example

#include "allocmem.h"
#include "strlist.h"
...
StringList * pStrings;
...
writeAllocMemoryDebugMsg("deleting %u strings\n",
                         getStringListSize(pStrings));
freeStringList(pStrings);
pStrings = NULL;

5.86.5 Source File

`allocmem.c'

5.87 writeChange

5.87.1 Syntax

#include "change.h"

void writeChange(const Change * pChange_in,
                 FILE *         pOutputFP_in);

5.87.2 Description

writeChange writes the given Change data structure to the output file as a human readable string consisting of a pair of quoted strings followed by the environment constraint (if any).

The arguments to writeChange are as follows:

pChange_in: points to a single consistent change data structure. (The pNext field of the Change data structure is ignored.)
pOutputFP_in: is an output FILE pointer.

5.87.3 Return Value

none

5.87.4 Example

#include <stdio.h>
#include "change.h"
...
void writeChangeList(FILE * pOutputFP_in, Change * pChanges_in)
{
Change *        cp;

if (pOutputFP_in == NULL)
    return;
for ( cp = pChanges_in ; cp ; cp = cp->pNext )
    writeChange(cp, pOutputFP_in);
}

5.87.5 Source File

`change.c'

5.88 writeCodeTable

5.88.1 Syntax

#include "record.h"

void writeCodeTable(FILE *            pOutputFP_in,
                    const CodeTable * pTable_in);

5.88.2 Description

writeCodeTable writes the contents of a CodeTable data structure to a file. The output is useful only for debugging.

The arguments to writeCodeTable are as follows:

pOutputFP_in: is an output FILE pointer.
pTable_in: points to a CodeTable data structure.

5.88.3 Return Value

none

5.88.4 Example

#include "record.h"
#include "ample.h"

AmpleData sAmpleData_g;
char szCodesFilename_g[100];
...
loadAmpleDictCodeTables(szCodesFilename_g, &sAmpleData_g, FALSE);
writeCodeTable( sAmpleData_g.pLogFP,
                sAmpleData_g.pPrefixTable );

5.88.5 Source File

`loadtb.c'

5.89 writeStringClasses

5.89.1 Syntax

#include "strclass.h"   /* or change.h or textctl.h or template.h
                           or opaclib.h */

void writeStringClasses(FILE *              pOutputFP_in,
                        const StringClass * pClasses_in);

5.89.2 Description

writeStringClasses writes the contents of all the string classes in the list to a file.

The arguments to writeStringClasses are as follows:

pOutputFP_in: is an output FILE pointer.
pClasses_in: points to a list of string classes to write to a file.

5.89.3 Return Value

none

5.89.4 Example

#include <stdio.h>
#include "strclass.h"
...
static StringClass *    pClasses_m;
...
writeStringClasses(stdout, pClasses_m);
...
}

5.89.5 Source File

`strcla.c'

5.90 writeStringList

5.90.1 Syntax

#include "strlist.h"    /* or strclass.h or change.h or textctl.h
                           or template.h or opaclib.h */

void writeStringList(const StringList * pList_in,
                     const char *       pszSep_in,
                     FILE *             pOutputFP_in);

5.90.2 Description

writeStringList writes a list of strings to an output file, separating the individual strings in the list by the indicated string.

The arguments to writeStringList are as follows:

pList_in: points to a list of strings.
pszSep_in: points to the string used to separate the members of the list.
pOutputFP_in: is an output FILE pointer.

5.90.3 Return Value

none

5.90.4 Example

#include <stdio.h>
#include "strlist.h"
...
static StringList *     pCategories_m;
...
void showCategories()
{
printf("Categories:  ");
writeStringList(pCategories_m, "  ", stdout);
printf("\n");
}

5.90.5 Source File

`write_sl.c'

5.91 writeTemplate

5.91.1 Syntax

#include "template.h"   /* or opaclib.h */

void writeTemplate(FILE *               pOutputFP_in,
                   const char *         pszFilename_in,
                   const WordTemplate * pTemplate_in,
                   const TextControl *  pTextCtl_in);

5.91.2 Description

writeTemplate writes the results of a morphological analysis as a database. Each word is a record with these fields:

\a: analysis (ambiguities and failures marked)
\d: morpheme decomposition (ambiguities and failures marked)
\cat: final category of word (ambiguities and failures marked)
\p: properties (ambiguities and failures marked)
\fd: feature descriptors (ambiguities and failures marked)
\u: underlying form (ambiguities and failures marked)
\w: original word
\f: preceding format marks
\c: capitalization
\n: trailing nonalphabetics

Ambiguities are marked as %n%Anal1%Anal2%...%Analn%. Failures are marked as %0%OriginalWord% or %0%%. (The separation character can be set to something other than %.)
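
The marking scheme can be made concrete with a small formatting routine. This is a sketch only: the name sketchMarkAmbiguity and its argument list are hypothetical, and writing a single result without any markers is an assumption, since the description covers only ambiguities and failures.

```c
#include <stdio.h>
#include <string.h>

/* Format uiCount results into pszOut using cMarker as the separation
   character.  Returns pszOut. */
static char * sketchMarkAmbiguity(char *               pszOut,
                                  size_t               uiSize,
                                  const char * const * ppszResults,
                                  unsigned             uiCount,
                                  const char *         pszOriginal,
                                  char                 cMarker)
{
size_t   uiUsed;
unsigned i;

if (pszOut == NULL || uiSize == 0)
    return pszOut;
if (uiCount == 0)
    {
    /* failure: %0%OriginalWord% or %0%% */
    snprintf(pszOut, uiSize, "%c0%c%s%c", cMarker, cMarker,
             pszOriginal ? pszOriginal : "", cMarker);
    return pszOut;
    }
if (uiCount == 1)
    {
    /* assumption: a single result is written unmarked */
    snprintf(pszOut, uiSize, "%s", ppszResults[0]);
    return pszOut;
    }
/* ambiguity: %n%Result1%Result2%...%Resultn% */
snprintf(pszOut, uiSize, "%c%u%c", cMarker, uiCount, cMarker);
for ( i = 0 ; i < uiCount ; ++i )
    {
    uiUsed = strlen(pszOut);
    snprintf(pszOut + uiUsed, uiSize - uiUsed, "%s%c",
             ppszResults[i], cMarker);
    }
return pszOut;
}
```

With '%' as the marker, two results "Anal1" and "Anal2" give %2%Anal1%Anal2%, and a failure on "word" gives %0%word%.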

The arguments to writeTemplate are as follows:

pOutputFP_in: is an output FILE pointer.
pszFilename_in: points to the name of the output file.
pTemplate_in: points to a data structure that contains the word analysis information.
pTextCtl_in: points to a data structure that contains orthographic information, and also the ambiguity marker character.

5.91.3 Return Value

none

5.91.4 Example

See section 5.38 freeWordTemplate.

5.91.5 Source File

`dtbout.c'

5.92 writeTextFromTemplate

5.92.1 Syntax

#include "template.h"   /* or opaclib.h */

void writeTextFromTemplate(FILE *               pOutputFP_in,
                           const WordTemplate * pTemplate_in,
                           const TextControl *  pTextCtl_in);

5.92.2 Description

writeTextFromTemplate writes the results of a morphological synthesis to an output file, restoring all the formatting information associated with the word in the original input to analysis.

Ambiguities are marked as %n%Word1%Word2%...%Wordn%. Failures are marked as %0%OriginalWord%. (The separation character can be set to something other than %.)

The arguments to writeTextFromTemplate are as follows:

pOutputFP_in: is an output FILE pointer.
pTemplate_in: points to a data structure containing the word analysis and synthesis information.
pTextCtl_in: points to a data structure that contains orthographic information, and also the ambiguity marker character.

5.92.3 Return Value

none

5.92.4 Example

See section 5.65 readTemplateFromAnalysis.

5.92.5 Source File

`textout.c'

5.93 writeTrieData

5.93.1 Syntax

#include "trie.h"       /* or opaclib.h */

void writeTrieData(Trie * pTrieHead_in,
                  void (* pfWriteInfo_in)(void * pList_in,
                                          int    iIndent_in,
                                          FILE * pOutputFP_in),
                  FILE *  pOutputFP_in);

5.93.2 Description

writeTrieData walks through a trie, writing the information stored at each node to a file. This is intended primarily for debugging, as the trie structure is explicitly written to the output file in indented form, together with the information stored in the trie.

The arguments to writeTrieData are as follows:

pTrieHead_in

points to the head of a trie.

pfWriteInfo_in

points to a function for writing the stored information to a file. The function has three arguments:

pList_in: points to a collection of items stored at a Trie node (Trieinfo).
iIndent_in: is the number of spaces to indent the display of each data item in the collection.
pOutputFP_in: is the output FILE pointer.

The function does not return a value.

pOutputFP_in

is an output FILE pointer.

5.93.3 Return Value

none

5.93.4 Example

#include <stdio.h>
#include "trie.h"
#include "rpterror.h"
...
typedef struct lex_item {
    struct lex_item *   pLink;          /* link to next item */
    struct lex_item *   pNext;          /* link to next homograph */
    unsigned char *     pszForm;        /* lexical form (word) */
    unsigned char *     pszGloss;       /* lexical gloss */
    unsigned short      uiCategory;     /* lexical category */
    } LexItem;
...
Trie *          pLexicon_g;
...
static void debug_lex_items(void * pList_in,
                            int    iIndent_in,
                            FILE * pOutputFP_in)
{
LexItem *       pLex;
int             i;

if (pOutputFP_in == NULL)
    return;
for ( pLex = (LexItem *)pList_in ; pLex ; pLex = pLex->pLink )
    {
    for ( i = 0 ; i < iIndent_in ; ++i )
        fputc(' ', pOutputFP_in);
    fprintf(pOutputFP_in, "%-20s %-20s %u [%lu -> %lu]\n",
            pLex->pszForm, pLex->pszGloss, pLex->uiCategory,
            (unsigned long)pLex, (unsigned long)pLex->pNext);
    }
}

void debug_lexicon()
{
printf("BEGIN LEXICON TRIE DATA\n");
writeTrieData(pLexicon_g, debug_lex_items, stdout);
printf("END LEXICON TRIE DATA\n");
}

5.93.5 Source File

`trie.c'

5.94 writeWordAnalysisList

5.94.1 Syntax

#include "template.h"

void writeWordAnalysisList(const WordAnalysis * pAnalyses_in,
                           FILE *               pOutputFP_in);

5.94.2 Description

writeWordAnalysisList writes a list of WordAnalysis data structures to an output file for debugging purposes.

The arguments to writeWordAnalysisList are as follows:

pAnalyses_in: points to a list of WordAnalysis data structures.
pOutputFP_in: is an output FILE pointer.

5.94.3 Return Value

none

5.94.4 Example

#include <stdio.h>
#include "template.h"
...
void dumpWordTemplate(WordTemplate * pTemplate_in,
                      FILE *         pOutputFP_in)
{
if (pOutputFP_in == NULL)
    return;
if (pTemplate_in == NULL)
    {
    fprintf(pOutputFP_in, "WordTemplate ptr is NULL\n");
    return;
    }
putc('\n', pOutputFP_in);
fprintf(pOutputFP_in, "  orig_word = \"%s\"\n",
       pTemplate_in->pszOrigWord ? pTemplate_in->pszOrigWord : "{NULL}" );
fprintf(pOutputFP_in, "  word      = \"%s\"\n",
       pTemplate_in->paWord && pTemplate_in->paWord[0] ?
                                        pTemplate_in->paWord[0] : "{NULL}" );
fprintf(pOutputFP_in, "  format    = \"%s\"\n",
       pTemplate_in->pszFormat ? pTemplate_in->pszFormat : "{NULL}" );
fprintf(pOutputFP_in, "  non_alpha = \"%s\"\n",
       pTemplate_in->pszNonAlpha ? pTemplate_in->pszNonAlpha : "{NULL}" );
fprintf(pOutputFP_in, "  capital   = %d\n", pTemplate_in->iCapital );

writeWordAnalysisList(pTemplate_in->pAnalyses, pOutputFP_in);

fprintf(pOutputFP_in, "  new_words = ");
if (pTemplate_in->pNewWords)
    {
    fprintf(pOutputFP_in, "\"");
    writeStringList( pTemplate_in->pNewWords, "\" \"", pOutputFP_in);
    fprintf(pOutputFP_in, "\"\n");
    }
else
    fprintf(pOutputFP_in, "{NULL}\n");
}

5.94.5 Source File

`wordanal.c'

5.95 writeWordFormationChars

5.95.1 Syntax

#include "textctl.h"    /* or template.h or opaclib.h */

void writeWordFormationChars(FILE *              pOutputFP_in,
                             const TextControl * pTextCtl_in);

5.95.2 Description

writeWordFormationChars writes the set of word formation characters to an output file. This function depends on previous calls to addWordFormationChars and addLowerUpperWFChars.

The arguments to writeWordFormationChars are as follows:

pOutputFP_in: is an output FILE pointer.
pTextCtl_in: points to a data structure that contains orthographic information.

5.95.3 Return Value

none

5.95.4 Example

#include <stdio.h>
#include "textctl.h"
...
static TextControl      sTextCtl_m;
...
printf("The word formation characters are:\n");
writeWordFormationChars(stdout, &sTextCtl_m);
...

5.95.5 Source File

`myctype.c'

Bibliography

Antworth, Evan L. 1990. PC-KIMMO: a two-level processor for morphological analysis. Occasional Publications in Academic Computing No. 16. Dallas, TX: Summer Institute of Linguistics.
Kew, Jonathan and Stephen R. McConnel. 1991. Formatting interlinear text. Occasional Publications in Academic Computing No. 17. Dallas, TX: Summer Institute of Linguistics.
Knuth, Donald E. 1973. Sorting and Searching. Volume 3 of The Art of Computer Programming. Reading, MA: Addison-Wesley.
Weber, David J., H. Andrew Black, and Stephen R. McConnel. 1988. AMPLE: a tool for exploring morphology. Occasional Publications in Academic Computing No. 12. Dallas, TX: Summer Institute of Linguistics.
Weber, David J., H. Andrew Black, Stephen R. McConnel, and Alan Buseman. 1990. STAMP: a tool for dialect adaptation. Occasional Publications in Academic Computing No. 15. Dallas, TX: Summer Institute of Linguistics.
Weber, David J., Stephen R. McConnel, Diana D. Weber, and Beth J. Bryson. 1994. PRIMER: a tool for developing early reading materials. Occasional Publications in Academic Computing No. 18. Dallas, TX: Summer Institute of Linguistics.

This document was generated on 20 March 2003 using texi2html 1.56k.
