1. Where more than one solution is provided, I've marked the correct one with an asterisk. When no correct solution is provided, I've added comments/corrections after double exclamation marks (!!). Examples:


INPUT STRING: جيمي

LOOK-UP WORD: jymy

    SOLUTION: jiym/NOUN+ayo/NSUFF_MASC_DU_ACCGEN_POSS

    SOLUTION: jiym/NOUN+ayo/NSUFF_MASC_DU_ACCGEN+ya/POSS_PRON_1S

    SOLUTION: jiym/NOUN+iy/POSS_PRON_1S

!! incorrect: need to add proper name "Jimmy" to lexicon...



INPUT STRING: التوقيع

LOOK-UP WORD: AltwqyE

*   SOLUTION: Al/DEF_ART+tawoqiyE/NOUN !! verbal noun, to be exact

    SOLUTION: Al/DEF_ART+tawoqiyE/NOUN



INPUT STRING: اتفاقية

LOOK-UP WORD: AtfAqyp

    SOLUTION: {it~ifAqiy~/NOUN_ADJ+ap/NSUFF_FEM_SG

*   SOLUTION: {it~ifAqiy~/NOUN_ADJ+ap/NSUFF_FEM_SG !! noun, not adj., in this context
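
For readers who want to process this annotated output automatically, the following is a minimal sketch of a parser for the SOLUTION lines shown above. The function name parse_solution and the returned dictionary layout are assumptions made for illustration; they are not part of the analyzer itself.

```python
import re

# Hypothetical pattern for one annotated SOLUTION line:
# optional "*" marker, the analysis, an optional quoted gloss,
# and an optional "!!" comment.
SOLUTION_RE = re.compile(
    r'^(?P<star>\*?)\s*SOLUTION:\s*(?P<analysis>\S+)'
    r'(?:\s+"(?P<gloss>[^"]*)")?'
    r'(?:\s*!!\s*(?P<note>.*))?$'
)

def parse_solution(line):
    """Return a dict describing one SOLUTION line, or None if it is not one."""
    m = SOLUTION_RE.match(line.strip())
    if m is None:
        return None
    # Each "+"-separated segment has the shape form/TAG.
    segments = []
    for seg in m.group("analysis").split("+"):
        form, _, tag = seg.rpartition("/")
        segments.append((form, tag))
    return {
        "correct": m.group("star") == "*",
        "segments": segments,
        "gloss": m.group("gloss"),
        "note": m.group("note"),
    }
```

Lines such as INPUT STRING, LOOK-UP WORD, and the standalone "!!" comments are not SOLUTION lines, so this sketch returns None for them.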

2. Anything that was not an Arabic word was ignored and echoed back. Example:


". : Non-Alphabetic Data

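The filtering step in item 2 can be sketched roughly as follows. This is only an illustration: the function name classify_token and the choice of the Unicode range U+0621 to U+064A as the test for "Arabic word" are assumptions, not a description of the actual implementation.

```python
def classify_token(token):
    """Return "lookup" for tokens containing Arabic letters, else "echo".

    Assumption: a token counts as an Arabic word if it contains at least
    one character in the basic Arabic letter range U+0621..U+064A.
    """
    if any("\u0621" <= ch <= "\u064A" for ch in token):
        return "lookup"
    # Punctuation, digits, Latin text, etc. are passed through unchanged.
    return "echo"
```
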
3. I've implemented a "2nd look-up" for input words that appear to be spelled in non-standard ways:

a. hamza-on-waw spelled with two characters: waw + hamza-on-the-line
b. word-final ya' spelled as alif maqsura

Here are examples of this "2nd look-up" feature in use:


INPUT STRING: مسوءولة

LOOK-UP WORD: msw'wlp

!! NOT FOUND: msw'wlp

 NEW LOOK-UP: ms&wlp

    SOLUTION: maso&uwl/NOUN+ap/NSUFF_FEM_SG



INPUT STRING: التى

LOOK-UP WORD: AltY

!! NOT FOUND: AltY

 NEW LOOK-UP: Alty

*   SOLUTION: Al~atiy/FUNC_WORD
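
The two respelling rules above can be sketched as follows, operating on the transliterated look-up word. The function names, the dictionary-style lexicon, and the order in which the rules are tried are assumptions for illustration only.

```python
def second_lookup_candidates(word):
    """Return respelled forms to try when the original look-up fails."""
    candidates = []
    # (a) waw + hamza-on-the-line ("w'") respelled as hamza-on-waw ("&")
    if "w'" in word:
        candidates.append(word.replace("w'", "&"))
    # (b) word-final alif maqsura ("Y") respelled as ya' ("y")
    if word.endswith("Y"):
        candidates.append(word[:-1] + "y")
    return candidates

def lookup_with_fallback(word, lexicon):
    """Try the word as spelled, then each respelling; lexicon is a toy dict."""
    if word in lexicon:
        return word, lexicon[word]
    for candidate in second_lookup_candidates(word):
        if candidate in lexicon:
            return candidate, lexicon[candidate]
    return None, []
```

With a toy lexicon containing only ms&wlp, looking up msw'wlp fails on the first pass and succeeds on the respelled form, mirroring the NOT FOUND / NEW LOOK-UP sequence in the examples.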

4. In cases where there are multiple identical solutions, I used the English gloss to disambiguate. Example:


INPUT STRING: السلام

LOOK-UP WORD: AlslAm

*   SOLUTION: Al/DEF_ART+salAm/NOUN  "peace"

    SOLUTION: Al/DEF_ART+salAm/NOUN  "greeting;salute"

5. The lexicon of stems is almost entirely lacking in explicit POS tags. (What is currently provided was generated on the fly.) I did enter a few explicit experimental tags for some high-frequency function words:


INPUT STRING: في

LOOK-UP WORD: fy

*   SOLUTION: fiy/PREP

    SOLUTION: fiy/PREP+ya/OBJ_PRON_1S



INPUT STRING: ما

LOOK-UP WORD: mA

*   SOLUTION: mA/REL_PRON

    SOLUTION: mA/NEG_PART

6. Perfect verbs that carry no POS information for person, gender, and number (because they have a null suffix) are assumed to be 3MS (3rd person masculine singular). Example:


INPUT STRING: قال

LOOK-UP WORD: qAl

    SOLUTION: qAl/VERB_PERFECT
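
A consumer of this output could apply the 3MS default along the following lines. This is a rough sketch under one stated assumption: a "null suffix" is recognized by the VERB_PERFECT segment being the last segment of the analysis. The function name is hypothetical.

```python
def perfect_verb_agreement(analysis):
    """Return "3MS" for a perfect verb with a null suffix, else None.

    Assumption: the suffix is null when the VERB_PERFECT segment is the
    final "+"-separated segment of the analysis string.
    """
    last_segment = analysis.split("+")[-1]
    last_tag = last_segment.rpartition("/")[2]
    if last_tag == "VERB_PERFECT":
        return "3MS"
    return None
```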