Toward a Rule-Based System
for English-Amharic Translation

Michael Gasser
School of Informatics and Computing, Indiana University

SALTMIL/AfLaT 2012
Istanbul, 22 May, 2012
http://www.cs.indiana.edu/~gasser/AfLaT12

Overview

Context and long-term goals

Rule-based vs. statistical machine translation

Dependency grammar (1)

Dependency grammar (2)

Dependency grammar (3)

Extensible Dependency Grammar (XDG)

XDG: interface dimensions (1)

XDG: interface dimensions (2)

XDG: grammatical features

XDG: lexicon

XDG: parsing

L3

L3: cross-lingual links

L3: translation

L3: shallow and deep translation (1)

L3: shallow and deep translation (2)

L3: shallow and deep translation (3)

L3: node mismatch (1)

L3: node mismatch (2)

L3: node mismatch (3)

L3: node mismatch (4)

L3: node mismatch (5)

Structural divergences in MT (1)

Structural divergences in MT (2)

  1. (እሷ) ደከማት
    (ɨsswa) dǝkkǝmat
    (she) it-tired-her
    'She is tired.'
  2. ልጆቹ ደከማቸው
    lɨjočču dǝkkǝmaččǝw
    the-children it-tired-them
    'The children are tired.'

Structural divergences in MT (3)

L3: structure mapping (1)

AMHARIC
- lemma: dekeme
  empty: [@TOP]
  ID:
    out: {top: !}
    agree: [[top, obj, png]]
  cross:
    sem:
      lex: TIRED
      IDSem:
        linkend: {arg1: [top]}

L3: structure mapping (2)

ENGLISH
- lemma: be_padj
  ID:
    out: {padj: !}
  cross:
    sem:
      lex: zero
- lemma: tired
  ID:
    in: {padj: !}
  cross:
    sem:
      lex: TIRED
      IDSem:
        linkend: {arg1: [sbj]}
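
The two lexicon fragments can be compared side by side with a small sketch. It is not the L3 implementation: the dictionary layout is a hand transcription of the entries shown on the slides, and the helper names (`entries_for_concept`, `arg_realization`, `compare`) are hypothetical. It reads the `linkend` attribute at face value, as naming the ID arc label that corresponds to a semantic argument, and shows that `arg1` of TIRED pairs with `top` in Amharic but with `sbj` in English.

```python
# Hand-transcribed copies of the Amharic and English entries shown above.
# The dict layout and all helper names are assumptions for illustration,
# not part of the L3 system.
AMHARIC = [
    {"lemma": "dekeme",
     "empty": ["@TOP"],
     "ID": {"out": {"top": "!"}, "agree": [["top", "obj", "png"]]},
     "cross": {"sem": {"lex": "TIRED",
                       "IDSem": {"linkend": {"arg1": ["top"]}}}}},
]
ENGLISH = [
    {"lemma": "be_padj",
     "ID": {"out": {"padj": "!"}},
     "cross": {"sem": {"lex": "zero"}}},
    {"lemma": "tired",
     "ID": {"in": {"padj": "!"}},
     "cross": {"sem": {"lex": "TIRED",
                       "IDSem": {"linkend": {"arg1": ["sbj"]}}}}},
]


def entries_for_concept(lexicon, concept):
    """Return the entries whose cross-lingual semantic link names `concept`."""
    return [e for e in lexicon
            if e.get("cross", {}).get("sem", {}).get("lex") == concept]


def arg_realization(entry, sem_arg):
    """Return the ID arc labels listed under linkend for `sem_arg`, if any."""
    linkend = entry["cross"]["sem"].get("IDSem", {}).get("linkend", {})
    return linkend.get(sem_arg, [])


def compare(concept, sem_arg):
    """Map each language code to (lemma, ID labels) for one semantic argument."""
    result = {}
    for code, lexicon in (("am", AMHARIC), ("en", ENGLISH)):
        for entry in entries_for_concept(lexicon, concept):
            result[code] = (entry["lemma"], arg_realization(entry, sem_arg))
    return result


print(compare("TIRED", "arg1"))
# {'am': ('dekeme', ['top']), 'en': ('tired', ['sbj'])}
```

In this reading, `be_padj` links to the semantic label `zero` and so is never returned for TIRED; only `tired` carries the English side of the mapping.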

L3: structure mapping (3)

ensemam2a ensemam2b ensemam2c ensemam2d

Project status

Ongoing, future work

Conclusions

Thank you!
አመሰግናችኋለሁ!

References