GSoC/GCI Archive
Google Code-in 2014 Apertium

Write a script to explain an Apertium machine translation in terms of its parts

completed by: Sushain Cherivirala

mentors: Mikel Forcada

Write a script (preferably in python3 or bash/equivalent) that takes one text segment S, applies a given Apertium system to it and to all of its possible whole-word subsegments s (perhaps up to a certain maximum length), and outputs a list of correspondences (s,t,i,j,k,l) such that t is the result of applying Apertium to s, t is a whole-word subsegment of T (the Apertium translation of S), i and j are the start and end positions of s in S, and k and l are the start and end positions of t in T. The script should read S, T, two language codes and, optionally, a maximum length, and generate the correspondences (s,t,i,j,k,l), one per line.
For further information and guidance on this task, you are encouraged to come to our IRC channel.
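
The following is a minimal sketch of one way to approach the task, not the submitted solution. It makes several assumptions that the task text does not settle: positions are 0-based, inclusive word indices (not character offsets); words are split on whitespace; matching is exact and case-sensitive; and the language pair is passed to the apertium command-line tool as "src-tgt". The function names (apertium_translate, find_all, correspondences) are hypothetical. The translator is passed in as a parameter so that the enumeration logic can be tried without an Apertium installation.

```python
import subprocess


def apertium_translate(text, src, tgt):
    """Translate text with the apertium command-line tool.

    Assumes apertium and the src-tgt language pair are installed.
    """
    result = subprocess.run(
        ["apertium", f"{src}-{tgt}"],
        input=text,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def find_all(haystack, needle):
    """Yield each start index where the word list needle occurs in haystack."""
    n = len(needle)
    if n == 0:
        return
    for k in range(len(haystack) - n + 1):
        if haystack[k:k + n] == needle:
            yield k


def correspondences(S, T, translate, max_len=None):
    """Return tuples (s, t, i, j, k, l) for whole-word subsegments of S.

    translate is any function mapping a source string to its translation.
    i, j and k, l are 0-based inclusive word indices (an assumption).
    """
    src_words = S.split()
    tgt_words = T.split()
    limit = len(src_words) if max_len is None else min(max_len, len(src_words))
    found = []
    for i in range(len(src_words)):
        # j is the index of the last word of the subsegment starting at i.
        for j in range(i, min(i + limit, len(src_words))):
            s = " ".join(src_words[i:j + 1])
            t = translate(s)
            t_words = t.split()
            # Keep s only if its translation appears as whole words in T.
            for k in find_all(tgt_words, t_words):
                found.append((s, t, i, j, k, k + len(t_words) - 1))
    return found
```

With Apertium installed, the sketch could be driven with something like correspondences(S, T, lambda s: apertium_translate(s, "en", "es")), printing each tuple on its own line. Note that it calls the translator once per subsegment, so a maximum length keeps the number of calls manageable for long segments.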