Íkala, Revista de Lenguaje y Cultura
Medellín, Colombia, Vol. 29 Issue 1 (January-April, 2024), pp. 1-21, ISSN 0123-3432
www.udea.edu.co/ikala

German-to-Basque Translation Analysis of Multiword Expressions in a Learner Translation Corpus
Análisis traductológico (alemán-vasco) de unidades fraseológicas en un corpus de aprendices de traducción
Une analyse de la traduction des unités phraséologiques entre l’allemand et le basque dans un corpus des traductions des étudiants
Análise da tradução de unidades fraseológicas do alemão para o basco num corpus de traduções de alunos

Zuriñe Sanz-Villar
Associate Professor, University of the Basque Country/Euskal Herriko Unibertsitatea (UPV/EHU), Autonomous Community of the Basque Country, Spain.
zurine.sanz@ehu.eus
https://orcid.org/0000-0002-2281-2574

Received: 2023-07-26 / Accepted: 2023-11-03 / Published: 2024-01-31
https://doi.org/10.17533/udea.ikala.354417
Editor: Luanda Sito, Universidad de Antioquia, Medellín, Colombia.
Copyright, Universidad de Antioquia, 2024. This is an open access article, distributed in compliance with the terms of the Creative Commons license BY-NC-SA 4.0 International.

Abstract
This paper presents the results of the annotation of a learner translation corpus consisting of German source texts and student translations to Basque. The analysis was carried out to identify trainee translators’ strengths and weaknesses when translating multiword expressions, such as compounds, collocations, and idioms. The data comprised eight German source texts and sixty-eight Basque translations from undergraduate students enrolled at the University of the Basque Country. Of the total number of annotations (1214), which include not only errors but also cases of interference from the source language and positive outcomes, around 27% are related to multiword expressions. The results of the translation analysis show that there are variables (such as the use of machine translation systems, the level of specialisation of the source text, the type of multiword expression to be translated or the absence of a literal counterpart in the target language) that may affect the translation of such units and lead to erroneous solutions and/or interference in the outputs produced by the trainee translators. From a pedagogical point of view, these findings will have a direct impact on the translation classes and will be very valuable for designing corpus-based in-class activities.
Keywords: trainee translators, German-to-Basque translation, translation analysis, multiword expressions, phraseological units, learner translation corpus

Resumen
Este artículo presenta los resultados de las anotaciones en un corpus de aprendices de traducción compuesto por textos originales en alemán y traducciones de estudiantes al vasco. El objetivo del análisis fue identificar las fortalezas y debilidades de los traductores en formación al traducir unidades fraseológicas, como palabras compuestas, colocaciones y locuciones. El conjunto de datos estuvo formado por ocho textos originales en alemán y 68 traducciones al vasco realizadas por estudiantes de grado de la Universidad del País Vasco (UPV/EHU). Del número total de anotaciones (1 214), entre las que se encuentran no solo errores, sino también casos de interferencia de la lengua de origen y resultados positivos, cerca de un 27 % tiene que ver con unidades fraseológicas. Los resultados del análisis traductológico muestran que hay variables (como el uso de sistemas de traducción automática, el grado de especialidad del texto origen, el tipo de unidad fraseológica o la ausencia de un equivalente literal en el idioma de llegada) que pueden influir en la traducción de dichas unidades y llevar a soluciones erróneas o a casos de interferencia en las traducciones realizadas por los aprendices de traducción. Desde un punto de vista pedagógico, estos hallazgos tendrán un impacto directo en las clases de traducción y serán muy valiosos para el diseño de actividades basadas en corpus.
Palabras clave: aprendices de traducción, traducción del alemán al vasco, análisis de traducciones, unidades fraseológicas, corpus de aprendices de traducción
Résumé
Cet article présente les résultats des annotations sur un corpus de traductions
d’étudiants composé de textes sources allemands et de traductions d’étudiants
en basque. L’analyse visait à identifier les forces et les faiblesses des traducteurs
en formation lorsqu’ils traduisent des unités phraséologiques, tels que des com-
posés, des collocations et des idiomes. L’ensemble de données comprenait huit
textes sources en allemand et 68 traductions en basque réalisées par des étudiants
de l’Université du Pays basque (upv/ehu). Sur le nombre total d’annotations
(1 214), qui comprennent non seulement des erreurs mais aussi des cas d’interfé-
rence avec la langue source et des résultats positifs, environ 27 % concernent des
unités phraséologiques. Les résultats de l’analyse de la traduction font apparaître
des variables —telles que l’utilisation de systèmes de traduction automatique, le
degré de spécialisation du texte source, le type d'unité phraséologique à traduire
ou l’absence d’équivalent littéral dans la langue cible— qui peuvent influencer la
traduction de ces unités et conduire à des solutions erronées ou à des interférences
avec les produits des traducteurs en formation. D’un point de vue pédagogique,
ces résultats auront un impact direct sur les cours de traduction et seront très utiles
pour la conception d’activités en classe basées sur des corpus.
Mots clef : traducteurs stagiaires, traduction de l’allemand vers le basque, analyse
des traductions, unités phraséologiques, corpus des traductions des étudiants
Resumo
Este artigo apresenta os resultados de apontamentos em um corpus de traduções
de alunos que consiste em textos originais em alemão e traduções de alunos para o
basco. A análise buscou identificar os pontos fortes e fracos dos tradutores estagiá-
rios ao traduzir unidades fraseológicas, como compostos, locuções e colocações.
O conjunto de dados consistia em oito textos de origem em alemão e 68 traduções
para o basco feitas por alunos de graduação da Universidade do País Basco (upv/
ehu). Do número total de anotações (1 214), que incluem não apenas erros, mas
também casos de interferência do idioma de origem e resultados positivos, cerca
de 27 % referem-se a unidades fraseológicas. Os resultados da análise da tradução
mostram variáveis - como o uso de sistemas de tradução automática, o grau de
especialização do texto de origem, o tipo de unidades fraseológicas a ser tradu-
zidas ou a ausência de um equivalente literal no idioma de destino - que podem
influenciar a tradução dessas unidades e levar a soluções errôneas ou interferir nos
produtos dos tradutores estagiários. Do ponto de vista pedagógico, essas desco-
bertas terão um impacto direto nos cursos de tradução e serão muito valiosas para
a elaboração de atividades em sala de aula baseadas em corpus.
Palavras chave: tradutores estagiários, tradução do alemão para o basco, análise
de traduções, unidades fraseológicas, corpus de traduções de estudantes
Introduction
The translation of phraseological units¹ by trainee translators has been studied by several authors but remains an underexplored area of research (Sanz-Villar, 2022, p. 268; Serrano Lucas, 2010, pp. 197–198). Serrano Lucas (2010) makes a methodological proposal, based on the task-based approach, for teaching phraseology in the context of translation didactics. Leiva Rojo (2013) focuses on the importance of considering phraseological units when assessing and reviewing the quality of texts translated by translation students. In the context of general translation classes, Valero Cuadra (2015) and Albaladejo Martínez (2015) examine the translation of collocations by trainee translators. Both papers were published in a volume dedicated to phraseology, didactics and translations (Mogorrón Huerta & Navarro Domínguez, 2015). In the same year, Marcelo Wirnitzer and Amigo Extremeña (2015) presented a pilot study analysing both the process by which trainee translators translate phraseological units (PUs), more specifically phrasal verbs, conversational routines, collocations and idioms, and the resulting product.
Castagnoli (2023) analyses variation in translations in a corpus consisting of an English source text (ST) and 35 translations to Italian produced by translation trainees; additional outputs by professional translators are included in the study. To this end, the translation paradigms of pre-selected items (idiomatic and non-idiomatic multiword units (MWUs), among others) were examined. One aim was “to observe the forms variation can take and how it may be related to the linguistic items involved as well as to individual translator experience” (2023, p. 120). In this respect, the author concludes that variation is greater in the translation of idiomatic units, especially when a literal rendering of the ST item does not exist in the target text (TT; 2023, p. 120).
¹ In this paper, the terms “phraseological unit” (PU) and “multiword expression/unit” (MWE and MWU, respectively) are used interchangeably.
The first attempt to analyse the translation of multiword expressions (MWEs) by trainee translators in the language combination German-Basque was made by Sanz-Villar (2022). That study reports a translation analysis conducted on a small corpus containing 24 different texts (6 838 tokens), created with a tool called TAligner (TRALIMA-ITZULIK, 2019). The texts were collected over four academic years, from 2014 to 2018. Multiword verbs were pre-selected and manually extracted from the German STs, and their counterparts in the students’ translations were analysed. The results showed that the insertion of words into the structure of the verbal patterns leads students to misunderstand the meaning of the ST, and that both source language (SL) interference and interference from a third language, Spanish, are observed in students’ translations (Sanz-Villar, 2022, pp. 283–284). The latter is attributed not only to the presence of Spanish during the translation process but also to the use of resources for translators, such as machine translation (MT) systems.
Interference has also been observed in other studies analysing translations to Basque. The corpus created by Sanz-Villar (2018) contained German STs and professional translations to Basque, and it was concluded that “[w]hen translating from a prestigious language A to a minority language B, if language B coexists unequally with a dominant language C, then, according to different variables, different types of interference from language C into language B can occur to different degrees” (2018, p. 90). Aierbe Mendizabal (2008) analysed the translation of specialised phraseological units in administrative texts and noted that Basque is very dependent on Spanish administrative language and that the influence of Spanish on Basque TTs is far-reaching.
The present article analyses translations from German (A) to a minority language, Basque (B), which is in a diglossic situation with a dominant language, Spanish (C). Therefore, interference may be expected not only from the SL but also from Spanish, further intensified in the students’ translations by the increasing use of MT systems. The lack of direct resources for German-to-Basque translation, understood as “sets of previously gathered linguistic data which are made available in some electronic format so that they can be used or looked up by translators” (Alcina, 2008, p. 98), leads trainee translators in this language combination to depend on indirect resources (in the language combinations German-Spanish or German-English).
This paper sets out to analyse how trainee translators render MWEs from German to Basque, based on a learner translation corpus (LTC) compiled within the framework of the MUST (Multilingual Student Translation) project (Granger & Lefer, 2020). In 2019, several members of the TRALIMA-ITZULIK research group from the University of the Basque Country (UPV/EHU) joined this project and have been contributing data for the following language combinations: Basque to Spanish, English to Basque/Spanish and German to Basque. The goal is to present the general results of the annotations made in students’ translations in terms of errors, interference, and positive outcomes, and then to focus on the translation analysis of MWEs. There are two reasons for this focus. First, a considerable number of annotations are related to MWEs, and through the annotation process course trainers can identify students’ difficulties and design tasks to help them overcome their shortcomings (Espunya, 2013, p. 130). Secondly, these units are good candidates for analysing interference (Sanz-Villar, 2018).
Section 2 presents the MUST initiative, the Translation-oriented Annotation System (TAS) created within this project and the notion of MWEs. Section 3 describes the compilation and annotation process of the LTC and the characteristics of the corpus regarding metadata. Section 4 outlines the results of the study, and Section 5 concludes with a discussion of pedagogical implications based on the outcomes and with some final remarks.
Theoretical Framework
This section presents the MUST initiative and mentions the TRALIMA-ITZULIK research group’s contribution to this project. Following that, the annotation system (TAS 1.0) is described as a significant component of MUST, with a focus on MWEs.
Learner Translation Corpus
LTCs containing original texts and translations made either by foreign language learners or trainee translators can be regarded as the synergetic product of two previously separate fields: learner corpus research (LCR) and corpus-based translation studies (CBTS; Granger & Lefer, 2020, p. 1184). These two research strands both emerged in the late 1980s and early 1990s, but the first LTC was only created in the early 2000s. The idea of compiling corpora with students’ translations is not new (inter alia, Castagnoli et al., 2011; Sánchez Nieto, 2012; Espunya, 2013; Wurm, 2013), but as argued by Granger and Lefer (2020, p. 1184), most of the corpora created have been local initiatives, with the exception of MeLLANGE (Multilingual eLearning in Language Engineering) and MUST (Multilingual Student Translation). In line with the MeLLANGE corpus, the aim of the MUST project has been “to collect a large Multilingual Student Translation corpus, to design a rich set of standardised metadata, to create a standardised annotation system for translated language and to make all the data available to the contributing partners via a web-based interface for research and teaching” (Granger & Lefer, 2020, p. 1186).
Castagnoli et al. (2011, p. 237) identified the ten most frequent errors in the MeLLANGE LTC, which includes 232 annotated translations to Catalan, German, English, Spanish, French, and Italian. It is noteworthy, as the results section will show, that the two most frequent errors in the MeLLANGE LTC concern incorrect lexis and terminology (first place) and distortion (second place). Wurm (2013) also concludes that lexical errors are the most common category in her LTC, which is consistent with the results obtained within two other similar projects (2013, pp. 399-400).
As mentioned above, in 2019 members of the TRALIMA-ITZULIK research group joined the MUST initiative. Since this paper presents the results of the annotation of the German-to-Basque subcorpus, it is necessary to describe the annotation system.
TAS 1.0
The MUST project team uses the term “Translation-oriented Annotation System” rather than error annotation system because the taxonomy is not limited to errors but includes additional metatags, as explained in this section, and may in the future incorporate the option of tagging translation procedures (Granger & Lefer, 2018). In a large-scale project such as this, it was important to develop a flexible, language-independent system. Depending on the language pair and scope, annotators can decide how detailed the annotation should be. The taxonomy should also be valid for both research and teaching purposes.
TAS 1.0 is based on two traditions: error taxonomies for ST-TT, such as the CELTRAC taxonomy (based on the abovementioned MeLLANGE translation error typology), and annotation systems “developed for the analysis of learners’ free writing” (Granger & Lefer, 2020, p. 1194). It contains 60 tags and is divided into four main blocks, as can be observed in Figure 1: ST-TT transfer, language, metatags, and translation procedures. However, the last block is not used in this corpus since it was still under preparation when the research was conducted. A second version of the taxonomy, TAS 2.0, has existed since autumn 2021. The results presented here are nevertheless based on TAS 1.0, the version available when the subcorpus was annotated.
ST-TT transfer refers to “discrepancies between the source text (ST) and the target text (TT) and/or between the TT and the translation brief” (Granger & Lefer, 2018). Tagging language errors means identifying segments that are erroneous from the perspective of the target language (TL), independently of the ST. With the Metatag label it is possible to tag positive outcomes and/or suspected SL intrusion.
Subcategories are included for each of the main categories; that is, each category is multi-layered (Granger & Lefer, 2018), as shown in Figure 1.
There are five subcategories for each of the two parts of the TAS that serve to annotate errors (ST-TT transfer and Language): Content transfer, Lexis, Discourse, Register/Culture, and Translation brief in the former, and Grammar, Lexis, Cohesion, Mechanics, and Style and situational context in the latter. Two subcategories are included in the Metatags block to indicate whether an output represents a good translation solution (positive) or whether the student’s translation solution has been negatively influenced by the SL (suspected SL intrusion). As mentioned in the TAS manual (Granger & Lefer, 2021, p. 28), the suspected SL intrusion metatag differs from the previously mentioned error tags in that it does not describe the nature of errors, but the possible source of an infelicitous translation solution. The annotation may be even more detailed, and the depth of annotation within the TAS is at the discretion of the annotator.
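
The two upper layers of the taxonomy described above can be represented as a simple nested structure. The following Python sketch is only an illustration built from the category names given in this section; it is not part of the MUST tooling, it omits the deeper layers of the 60-tag system and the translation procedures block (not used in this corpus), and the names `TAS_1_0`, `ERROR_BLOCKS` and `is_error_block` are hypothetical.

```python
# Top two layers of TAS 1.0 as named in the text.
# Deeper tags and the unused "translation procedures" block are omitted.
TAS_1_0 = {
    "ST-TT transfer": [
        "Content transfer", "Lexis", "Discourse",
        "Register/Culture", "Translation brief",
    ],
    "Language": [
        "Grammar", "Lexis", "Cohesion",
        "Mechanics", "Style and situational context",
    ],
    "Metatags": ["Positive", "Suspected SL intrusion"],
}

# Only these two blocks are used to annotate errors; Metatags mark
# positive outcomes or the suspected source of a poor solution.
ERROR_BLOCKS = ("ST-TT transfer", "Language")


def is_error_block(block: str) -> bool:
    """Return True if the block is one of the two error-annotation blocks."""
    return block in ERROR_BLOCKS


for block, subcategories in TAS_1_0.items():
    kind = "error block" if is_error_block(block) else "metatag block"
    print(f"{block} ({kind}): {', '.join(subcategories)}")
```

Keeping the metatags outside `ERROR_BLOCKS` mirrors the distinction drawn in the TAS manual between tags that describe the nature of an error and tags that point to its possible source.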
MWEs in the TAS
There is still a lack of consensus on the use of metalanguage in the field of phraseology, as noted, for example, in the METRAFAS project, whose aim is to establish a common terminology in the phraseological field and thus contribute to a more homogeneous use of the metalanguage.² As mentioned in Sanz-Villar (2022), “[t]he terminology used by researchers when naming and defining their object of study varies depending on the approach or the way multi-word units (MWU) are analysed”.
² Refer to point 4 of the project website for more details: https://www.cirp.gal/proxectos/proxecto-fraseoloxia-galega.html
Corpas Pastor’s well-known classification proposes three main phraseological spheres, namely collocations, idioms and phraseological utterances (Corpas Pastor, 1996). The first two are included in the TAS as subcategories of “multiword non-term”, as can be observed in Figure 1. Collocations are defined as “usage-determined or preferred syntagmatic relations between two lexemes in a specific syntactic pattern. Both lexemes make an isolable semantic contribution to the word combination but they do not have the same status. Semantically autonomous, the ‘base’ of a collocation is selected first by a language user for its independent meaning. The second element, i.e. the ‘collocate’ or ‘collocator’, is selected by and semantically dependent on the base” (Granger & Paquot, 2008, p. 43). As for idioms, semantic non-compositionality is their most salient feature, but “[l]ack of flexibility and marked syntax are further indications of their idiomatic status” (Granger & Paquot, 2008, p. 43).
The TAS considers compounds as MWEs, although their inclusion within the field of phraseology is not uncontroversial and “pose[s] problems because of their [compounds’] uncertain status as single or multi-word units” (Granger & Paquot, 2008, p. 32). Compounds are understood as “multiword units that constitute one semantic unit although they may be written in one, two or more words, as well as other fully fixed sequences such as complex prepositions, complex adverbs, complex conjunctions and complex verbs, including phrasal verbs” (Granger & Lefer, 2021, p. 21).
Figure 1 Categories of TAS 1.0
During the error tagging for the present study, it was found that the translation of some solid compounds, such as Familienstand (‘marital status’), caused translation errors (as in familia-egoera, ‘family situation’, where the meaning of the original compound is not properly rendered in the translation). The inclusion or exclusion of these units in the framework of a translation analysis of MWEs, while beyond the scope of this paper, remains an interesting area of research.
Method
As part of the MUST project, a web-based interface called Hypal4MUST, based on the Hypal software and developed by Obrusnik (2014), allowed the creation and annotation of parallel corpora. Once corpora were compiled, the uploaded texts were queried within the tool’s interface. This section first describes the process of compiling the corpus. It then explains how the MWEs were queried in the tool and exported for further analysis. Finally, the metadata concerning both translation tasks and participants are described.
Compilation of Corpus
The first step consisted of creating the tasks (i.e. the STs were uploaded and accepted by the MUST coordinators). As can be seen in Figure 2, eight tasks were created in the German-to-Basque subcorpus.
Figure 2 Uploaded and Accepted Task in the German-Basque Subcorpus
Either the teacher or the student can later upload the translations for each task to Hypal4MUST. Student metadata were included together with the translated text; the ST and TT were aligned at paragraph and sentence level, and the TTs were enriched with annotations based on the TAS. Figure 3 shows an uploaded translation together with the student metadata fields below it that need to be filled in.
Figure 3 Bitext and Student Metadata in Hypal4MUST
Figure 4 shows the options (add, cursor split, merge, etc.) for making manual adjustments to the automatic alignment.
Figure 4 Manual Alignment in Hypal4MUST
The annotation categories of the TAS can be seen in Figure 5. In addition to annotating segments on the basis of the TAS, comments and corrections can also be added.
Figure 5 Translation Annotation in Hypal4MUST
Corpus Query
Once the translations had been aligned and annotated, the corpus was compiled and general results regarding the annotation were obtained by way of the built-in bilingual concordancer. This enabled the corpus data to be extracted and subsequently downloaded in Excel format (with error annotations and metadata).
During the process of annotating the TTs, not only the aforementioned “multiword” subcategory but also others were used to annotate MWE-related passages of the TTs:
• ST-TT transfer -> distortion, omission: when the meaning of the ST MWE was not properly rendered in the TT or when it was omitted.
• ST-TT transfer -> cultural mismatch: when a culturally bound MWE in the ST was not properly translated in the TT.
• Language -> Lexis -> Multiword term and non-term: when the use of the terminological or non-terminological MWE in the TL was incorrect.
• Metatags -> Positive: when the use of MWEs in the TTs was regarded as especially good.
• Metatags -> Suspected SL intrusion: when there was interference from an MWE of the ST (or a third language).
Thus, the corresponding annotations were queried in Hypal4MUST using the abbreviated forms of each tag, as can be seen in Figure 6. The figure shows the search for language errors tagged as multiword terms and non-terms (la-lt-mw.*). The wildcard ‘.*’ matches any number of characters. The same process was followed with the other levels liable to contain MWE annotations. However, not all distortion errors, for instance, contained MWEs. Therefore, relevant results for the MWE analysis were selected manually. Quantitative and qualitative information regarding the translation analysis of MWEs is provided below.
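
The wildcard query described above can be illustrated with a regular expression. The following Python sketch is not the Hypal4MUST implementation: only the pattern `la-lt-mw.*` comes from the text, while the tag abbreviations in `example_tags` and the helper `query_tags` are hypothetical and serve only to show which strings such a pattern would retrieve.

```python
import re

# Pattern taken from the text: every tag beginning with "la-lt-mw".
MULTIWORD_PATTERN = re.compile(r"la-lt-mw.*")

# Hypothetical tag abbreviations, invented for illustration only.
example_tags = [
    "la-lt-mw-term",      # would match
    "la-lt-mw-nonterm",   # would match
    "la-gr-agreement",    # different Language subcategory, no match
    "tr-ct-distortion",   # ST-TT transfer tag, no match
]


def query_tags(tags, pattern):
    """Return the tags that the pattern matches in full."""
    return [tag for tag in tags if pattern.fullmatch(tag)]


matches = query_tags(example_tags, MULTIWORD_PATTERN)
print(matches)
```

Because `.*` also matches zero characters, the bare prefix `la-lt-mw` would be retrieved as well. A query of this kind over a broader tag such as distortion returns passages with and without MWEs, which is why the relevant hits still had to be selected manually.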
Metadata: Tasks and Participants
As Table 1 shows, eight tasks were created in all (i.e. eight source texts were uploaded and enriched with the corresponding metadata) and sixty-eight translations (22,184 tokens) were annotated using 1214 tags. Most of the STs (6) were from the general language category (journalistic texts), while two were specialised texts. However, ST5 was a promotional text that did not present a high level of specialisation, and ST4, which contained academic prose, was specialised but dealt with a topic with which students were familiar. Three translation tasks were undertaken by students under exam conditions and later marked by the teacher. The rest were either unmarked home assignments or in-class activities. The shortest texts were the examinations (278 to 327 words), and the longest text contained 493 words.
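
The figures summarised in this paragraph can be cross-checked against Table 1. The following Python sketch is an illustrative check only; the values are transcribed from Table 1, and the variable names are hypothetical.

```python
# (text, type of task, number of words, number of translations),
# transcribed from Table 1.
tasks = [
    ("ST1", "In-class examination", 278, 5),
    ("ST2", "In-class examination", 281, 5),
    ("ST3", "Home assignment", 415, 10),
    ("ST4", "In-class activity", 330, 11),
    ("ST5", "In-class activity", 386, 13),
    ("ST6", "Home assignment", 445, 8),
    ("ST7", "In-class examination", 327, 8),
    ("ST8", "Home assignment", 493, 8),
]

# Total number of translations across the eight tasks.
total_translations = sum(n for _, _, _, n in tasks)

# Source-text lengths, split by whether the task was an examination.
exam_lengths = [w for _, kind, w, _ in tasks if kind == "In-class examination"]
other_lengths = [w for _, kind, w, _ in tasks if kind != "In-class examination"]

# The longest source text.
longest = max(tasks, key=lambda task: task[2])

print(total_translations, sorted(exam_lengths), longest[0], longest[2])
```

The check confirms that the per-task counts add up to the sixty-eight translations reported and that every examination text is shorter than every non-examination text.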
The number of translations uploaded for each text depended on the number of students enrolled in each academic year, the students’ willingness to take part in the project and whether or not they translated the texts uploaded to Hypal4MUST. At the beginning of each academic year, the MUST project was presented to students by the teacher. Students who agreed to contribute their translations and metadata to the project signed the MUST informed consent form, which has been approved by the Ethics Committee of the Institute for Language and Communication of UCLouvain (approval number: CE-ILC/2022/09).
Figure 6 Querying by Error Annotation in Hypal4MUST
During the translation assignments, students were allowed to use any tool or resource they deemed necessary. The aforementioned metadata were stored in the tool together with other metadata (student-related metadata, for instance).
Table 1 Tasks Created in Hypal4MUST
| Text | Title | Language type, supergenre, genre | Type of task | Marked | Number of words | Academic year | Number of translations |
|---|---|---|---|---|---|---|---|
| ST1 | Der Mensch ist eben auch nur eine Ratte im Labor [The human being is just a rat in the laboratory] | General language, journalistic, review | In-class examination | Yes | 278 | 2018-19 | 5 |
| ST2 | Auf Wiedersehen, Kinder [Goodbye, children] | General language, journalistic, review | In-class examination | Yes | 281 | 2018-19 | 5 |
| ST3 | Lass uns kurz reden [Let’s talk briefly] | General language, journalistic, news/reportage | Home assignment | No | 415 | 2019-20 | 10 |
| ST4 | DaF-Didaktik für baskische MuttersprachlerInnen [DaF (German as a foreign language) didactics for Basque native speakers] | Specialised language, academic prose, abstract | In-class activity | No | 330 | 2019-20 | 11 |
| ST5 | Grazer Sehenswürdigkeiten [Sights of Graz] | Specialised language, promotional text, tourist guide/brochure | In-class activity | No | 386 | 2019-20 | 13 |
| ST6 | Digitales Studium: Die beste Zeit ihres Lebens. Eigentlich [Digital studies: The best time of their life. Actually] | General language, journalistic, news/reportage | Home assignment | No | 445 | 2020-21 | 8 |
| ST7 | Corona-Kinderbuch: China ist sauer über Hamburger Verlag [A children’s book on the coronavirus: China is angry with Hamburg-based publisher] | General language, journalistic, news/reportage | In-class examination | Yes | 327 | 2020-21 | 8 |
| ST8 | Corona-Pandemie ohne Partner: Gemeinsam einsam sein [The Covid-19 pandemic without a partner: together, but alone] | General language, journalistic, news/reportage | Home assignment | No | 493 | 2020-21 | 8 |
Translations were compiled from 26 undergraduate students of Translation and Interpreting, distributed over three academic years: 2018–19, 2019–20 and 2020–21. In general, students in this German-to-Basque translation course were in their sixth semester, meaning that they still had three semesters to go (including the sixth semester) before graduation. They had attended three translation-practice courses, which involved their two first languages (Basque and Spanish), and in the sixth semester they were translating from two foreign languages: German to Basque and either English or French to Basque. This was the case for students who did not spend a semester or academic year abroad. Of the 26 students mentioned, only 23% had completed a translation internship.
According to students’ metadata, in 80.8% of the cases, their self-rated proficiency in the SL was intermediate. The remainder (5 students) rated their level of German as advanced. Half (13) of the students had visited German-speaking countries, with the duration of stays ranging from 1 to 11 months. As for the TL, Basque, most (24 students) said they had a native command of the language. The remaining two students, for whom Spanish was their only first language, rated their command of Basque as advanced. The majority of the students (69.2%) said they had two first languages (Basque and Spanish), with the remainder stating that they had one first language (either Spanish, French or Basque).
In summary, the students’ translation experience was still limited when attending the German-Basque translation course, and for the majority, their command of the SL, German, was intermediate. However, the fact that half of them had spent time in German-speaking countries and/or had studied German in school before entering university shows that the groups do not usually have a homogeneous command of the SL. According to the metadata obtained from students, most stated that they had two first languages, Basque and Spanish, and all but two said they had a native-level command of the TL.
Results
In this section the annotation results of the German-
to-Basque ltc are presented. The ‘General results’
subsection will feature results of all categories of
the tas. The second subsection (“The translation
of mwes”) will focus on the tagged units contain-
ing mwes.
General results
Table 2 shows the general results for first-level
annotations. Language errors accounted for 50%
of all annotations. The number of transfer errors,
436, was also considerable. In addition to the
errors, 169 passages were marked as either posi-
tive or “too literal” — that is, traces of the sl (or
another language) were observed.
Table 2 Quantitative Results of First-Level Annotation
Language 609 (50%)
ST-TT transfer 436 (36%)
Metatags 169 (14%)
Total number of annotations 1214 (100%)
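The percentages in Table 2 can be recomputed from the raw counts. The short Python sketch below is only an illustration and not part of the original analysis; the names `counts`, `total` and `shares` are chosen here for convenience, and rounding to whole percentages is assumed.

```python
# First-level annotation counts as reported in Table 2.
counts = {"Language": 609, "ST-TT transfer": 436, "Metatags": 169}

# Total number of annotations across the three first-level categories.
total = sum(counts.values())

# Share of each category, rounded to whole percentages (assumed rounding).
shares = {category: round(100 * n / total) for category, n in counts.items()}

for category, n in counts.items():
    print(f"{category:<15} {n:>5} ({shares[category]}%)")
print(f"{'Total':<15} {total:>5}")
```

Under these assumptions the recomputed shares agree with the rounded values given in the table.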
The distribution of tags in different subcategories
can be seen in Figure 7. In the Language category,
many errors of grammar (156) and mechanics
(179) were found (the latter concerned punctua-
tion marks and spelling, for instance). Within this
category, however, the most prominent subcate-
gory was Lexis (with 239 errors). As mentioned
above, ltcs were analysed in both Castagnoli et al.
(2011) and Wurm (2013). Different tagsets were
used during the annotation and therefore the
results are not comparable, but lexical errors are
among the most frequent in both studies.
The large number of content transfer errors
is consistent with the students’ metadata on their
command of the sl. According to the questionnaires
that the students filled in at the beginning
of each semester, what they usually fear most in
this language combination is not properly under-
standing the meaning of the st and not being
able to render its meaning in the tl.
Noteworthy too were the positive outcomes (78)
and suspected sl intrusions (91), both with a sim-
ilar level of representation in the corpus.
The Translation of mwes
As explained in the section on methodology, the
tagged units containing mwes were selected and
analysed manually. Table 3 shows the categories
of the tas containing mwe-related annotations,
ordered by frequency.
Figure 7 Quantitative Results of Second-Level Annotations

Table 3 mwe Annotations
Category                                     Annotations
Multiword terms and non-terms                128
Content transfer (distortion and omission)   119
Positive                                     40
Suspected sl intrusion                       26
Cultural mismatch                            11
Total                                        324

Examples extracted from the corpus of each of the
categories from Table 3 are presented in Table 4.
In the first example, the Basque counterpart of
the German mwe dünnhäutig reagieren represents
an unusual word combination: gaitzikor
erantzun (‘to respond suspiciously’). There are
no results for this word combination in the reference
corpus of contemporary Basque (Sarasola
et al., n.d.), and the word gaitzikor (‘suspicious’)
has a very low frequency (25). In the second
example, part of the meaning of the German collocation
kerzengerade sitzen (‘to sit bolt upright’)
was omitted in the Basque tt. The German mwe
lassen Sie sich überraschen (literally ‘let yourself
be surprised’) was translated, in one case, with a
somatic mwe in Basque: aho bete hortz utzi (literally,
‘to leave someone with the mouth full of
teeth’, meaning ‘to leave someone speechless’).
This was tagged as positive, because the meaning
was conveyed accurately, and another mwe
was used in the tt, which fits well in the promotional
text from which the example was extracted
(ST5). The following example shows the same
sentence translated to Basque literally. Utzi zaitez
harritzen (‘let yourself be surprised’) is not only a
literal translation of the German mwe but also of
the Spanish expression déjate sorprender. Finally,
Semesterferien (‘semester break’) was adapted
to the target culture as summer and Christmas
break. However, it is noteworthy that the winter
break for German students typically falls around
February and March and not during Christmas.
Table 5 shows the distribution of annotations
by task. It is necessary to take into account the
unequal number of participants submitting a
translation for each task (see Table 1). However,
st4 was notable for the high number of error anno-
tations (66 in total) and sl intrusion cases (16)
compared to the other tasks of the same academic
year, as well as the low number of positive features
(1). Of the three tasks of the academic
year 2020-21, the only marked task completed under
exam conditions, st7, contained the fewest
errors (21, as opposed to 30 in st6 and 27 in
st8). A summary of the main features of each category
will be given in the next section.
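The per-task figures quoted above can be cross-checked against the counts in Table 5. The following Python sketch is offered only as an illustration; the names `table5` and `ERROR_COLUMNS` are hypothetical, and treating the first four columns (non-terms, terms, content transfer, cultural mismatch) as "error annotations" is an assumption inferred from the totals reported in the text.

```python
# Per-task mwe annotation counts as reported in Table 5.
# Column order: non-terms, terms, content transfer, cultural mismatch,
#               positive, suspected sl intrusion.
table5 = {
    "ST1": (3, 0, 4, 4, 6, 0),
    "ST2": (4, 0, 12, 2, 5, 0),
    "ST3": (8, 11, 21, 0, 0, 3),
    "ST4": (7, 32, 27, 0, 1, 16),
    "ST5": (23, 0, 20, 2, 11, 5),
    "ST6": (4, 6, 18, 2, 5, 0),
    "ST7": (4, 9, 8, 0, 6, 2),
    "ST8": (15, 2, 9, 1, 6, 0),
}

# Assumption: the first four columns are the ones counted as errors.
ERROR_COLUMNS = 4

# Error annotations per task (sum over the assumed error columns).
errors_per_task = {task: sum(row[:ERROR_COLUMNS]) for task, row in table5.items()}

# Column totals across all eight tasks.
column_totals = tuple(sum(column) for column in zip(*table5.values()))

for task, errors in errors_per_task.items():
    print(f"{task}: {errors} error annotations")
print("Column totals:", column_totals, "Overall:", sum(column_totals))
```

With this reading, the sums for st4, st6, st7 and st8 match the figures given in the text, and the column totals add up to the 324 mwe annotations of Table 3.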
Table 4 Examples of Each Relevant Category of the tas
Multiword non-term
Source text: Wenn es um die Ursachen der Corona-Pandemie geht, reagiert China sehr dünnhäutig. [When it comes to the origins of the Covid-19 outbreak, China reacts very thin-skinned.]
Student translation: Koronabirusaren pandemiaren jatorriari dagokionez, Txinak gaitzikor erantzuten du. [When it comes to the origins of the Covid-19 outbreak, China responds suspiciously.]
Content transfer
Source text: Elaf, Jeans und schwarzes Top, die Laptop-Kamera abgeklebt, sitzt kerzengerade auf ihrem Stuhl am Schreibtisch. [Elaf, jeans and a black top, the laptop camera taped off, sits bolt upright in her chair at the desk.]
Student translation: Elafek galtza bakeroak eta top beltza ditu jantzita. Idazmahiko [sic] aulkian dago zain, ordenagailuko kamera itzalita. [Elaf wears jeans and a black top. She waits in the desk chair with the computer camera off.]
Positive
Source text: Spazieren Sie einfach los und lassen Sie sich überraschen. [Just take a walk and be surprised.]
Student translation: (…) ibili eta alde zaharrak aho bete hortz utziko zaitu. [(…) walk and the old town will leave you speechless.]
Suspected sl intrusion
Source text: Spazieren Sie einfach los und lassen Sie sich überraschen.
Student translation: (…) joan paseatzera eta utzi zaitez harritzen. [(…) take a walk and let yourself be surprised]
Cultural mismatch
Source text: In den Semesterferien ist die Stadt ausgestorben, (…). [During the semester break, the city is deserted.]
Student translation: Udako eta Gabonetako oporretan hiria ia hutsik gelditzen da, (…). [During the summer and Christmas holidays, the city is almost empty.]

Table 5 Task-Based Distribution of Annotations
Year     Task  Multiword Non-Terms  Multiword Terms  Content Transfer  Cultural Mismatch  Positive  Suspected sl Intrusion
2018-19  ST1   3                    0                4                 4                  6         0
         ST2   4                    0                12                2                  5         0
2019-20  ST3   8                    11               21                0                  0         3
         ST4   7                    32               27                0                  1         16
         ST5   23                   0                20                2                  11        5
2020-21  ST6   4                    6                18                2                  5         0
         ST7   4                    9                8                 0                  6         2
         ST8   15                   2                9                 1                  6         0
Total          68                   60               119               11                 40        26

Multiword Terms and Non-Terms
Lexical mwe errors were marked as such because
they represented inappropriate expressions in
Basque. The number of errors tagged as non-term
multiword (68) was slightly higher than the number
of term multiword errors (60), probably due to the
non-specialised character of most sts.
With regard to multiword terms, around 50% of
the errors (32) were found in student translations
of the abovementioned specialised text (ST4),
which was an abstract on the teaching of German
as a foreign language for native speakers of Basque
at the upv/ehu. The text describes the context
in which German is taught at the Faculty of Arts
of the upv/ehu. During the translation of this
text, students encountered terms that belong to
their own culture but had been adapted in the st
for a German-speaking readership. Hence, they
were familiar with the reality described in the st;
however, errors arose because, instead of using
the usual terminology in Basque, they tended
to translate them literally. Frequently the meaning
of the terms (such as Hauptfach, Nebenfach,
Nebenfachstudierende, Hilfssprache…) was not
properly transferred, even though they were familiar
with this context. On other occasions, the
problem may have been the degree of specialisation.
For instance, the term kontrastive Analyse was fre-
quently translated in Basque as analisi kontrastatu
(7 out of 11), whereas in fact analisi kontrastibo is
the term generally used in this field. A search for
both terms in a reference corpus of contemporary
Basque (Sarasola et al., n.d.) yields 15 occurrences for
analisi kontrastibo and none for analisi kontrastatu.
Despite the low frequency, the books on linguistics
from which they are extracted are very reliable.
In the case of multiword non-terms, they were
usually tagged as errors because students produced
unusual and inappropriate word combinations in
the translations. For instance, in the example in
Table 6, the multiword verb zielen auf (‘aim at’),
referring to ein Forschungsprojekt (‘a research project’)
in this case, was translated with a single verb,
espero (‘hope’, ‘expect’), and a noun that functions as
the object, ikerketa-proiektu (‘research project’). As
a result, the meaning was not properly rendered in
the translation and from the perspective of the tl,
the sentence was incomplete. The same error was
found in five translations (out of a total of eleven).
We also find collocations that were translated too
literally or erroneously, as in Table 7. In the case of
Kontakte herstellen (literally ‘to make contact’, but
meaning to socialise with others), most students
(6 out of 8) tended to translate the noun Kontakte
with the Basque word kontaktu (‘contact’) or har-
reman (‘relation’) and then combine it with verbs
such as izan (‘to be’), hasi (‘to begin’), jarraitu (‘to
keep’). Only two students deviated more from the
original and conveyed the meaning of the sentence
using jendea ezagutu (‘to get to know people’).
Distortion
Informal mwes may have caused problems for
trainee translators in this language combination.
The case of the German mwe (gar) kein Bock
haben (‘not being in the mood for’), where two stu-
dents misinterpreted the meaning of the original,
is noteworthy. In the first translation of Table 8,
the meaning of the mwe was exaggerated and dif-
fered from the st: no desire to live. In the second,
another meaning of the word Bock was used and the
student inadequately translated the mwe as ‘not a
single person’. According to the German monolin-
gual dictionary dwds (Berlin-Brandenburgischen
Akademie der Wissenschaften, n.d.), Bock can also
mean “störrischer Mensch, Dickkopf ” (‘stubborn
person, obstinate’).
Table 6 Translation of a Multiword Verb
Source text (ST4): Mittelfristig zielen wir auf ein Forschungsprojekt, dass (…). [For the medium term, we are aiming at a research project that (…).]
Student translation: Epe ertainera, ikerketa-proiektu bat espero dugu, (…). [For the medium term, we hope a research project (…).]

Table 7 Translation of a Collocation
Source text (ST8): Aber im 21. Jahrhundert gibt es ja noch Möglichkeiten, Kontakte herzustellen (…). [But in the 21st century, there are still ways to socialise (…).]
Student translation: Baina, XXI. mendean badaude jada harremanetan hasteko erak (…). [But in the 21st century there are ways to begin socialising (…).]

Table 8 Translation of a Colloquial mwe
Source text (ST6): „12 Uhr aufgestanden, gar kein Bock“, schreibt einer. [“Got up at 12, not in the mood”, writes one.]
Student translation 1: “12tan esnatuta, bizitzeko gogoirk [sic] gabe”, idatzi du norbaitek. [“Waking up at 12, no desire to live”, writes someone.]
Student translation 2: «12:00ak puntuan, eta tiporik ez» idazten du batek. [“12 o’clock and not a single person”, writes one.]
Compounds may also have caused difficulties for
students when transferring meaning from
one language into the other. In st3 we find the
compound Nullerjahre (‘Noughties’). In this case,
the output of the machine translation (mt) sys-
tem may have influenced the translation. Six out of
ten trainee translators translated it as “in the 90s”
(90eko hamarkadan, 90. hamarkadan) and so did
DeepL in its German-to-Spanish translation out-
put (en los años noventa).
These (and other) examples show that mt systems
struggle not only with collocation- or idiom-like
multiword expressions but also with compounds,
as may be the case of Nebenfachstudierende from
our corpus. This refers to students with German
as their second foreign language. However, DeepL
translated this expression as alumnos menores, and
some students (4 out of 11), probably taking this
intermediary version as the new st, translated it to
Basque as ikasle gazte(ago) (‘young(er) students’)
or ikasle adingabe (‘underage students’), with-
out considering the context. We find this mwe
in the abstract (ST4). Only one student chose to
explain what the mwe means: bigarren hizkuntza
alemana daukatenentzat (‘for those who have
German as a second language’).
Idiomaticity may be another factor influencing the
translation of mwes. In the example in Table 9, it
is possible that additional linguistic and instrumen-
tal factors influenced the translation: there was no
literal mwe in the tl, the context did not help to
elucidate its meaning, and the outputs of mt sys-
tems made no sense. The mwe in Table 9, seine
Runden drehen (‘to do a few lengths (in the lake)’),
was translated in a variety of ways: paseoa egin (‘to
go for a walk’) or korrika egin (‘to run’), to mention
but two. In the previous sentences, we do find ref-
erences to water — Seestück and am Wasser — but
they appear not to have been helpful for most of
the students. The text is about the German film Die
Welle (The Wave), and this part describes the daily
routine of the main character, Wenger.
Table 9 Translation of an Idiomatic mwe
Source text (ST2): Die Geschichte beginnt als Seestück. Wenger (Jürgen Vogel) lebt mit seiner Frau Anke (Christiane Paul) in einem Blockhaus am Wasser. Jeden Morgen dreht er in vollkommener Einsamkeit seine Runden, danach wird gefrühstückt, und seit ein paar Tagen ist Anke schwanger. [The story begins as a lakeside story. Wenger (Jürgen Vogel) lives with his wife Anke (Christiane Paul) in a log cabin by the water. Every morning he does a few lengths in complete solitude, then they have breakfast, and for a few days now Anke has been pregnant.]
Student translation: Istorioa itsas pintura bat bezala hasten da. Wenger (Jürgen Vogel) eta bere emazte Anke (Christiane Paul) ur ertzetan bizi dira egurrezko etxe batean. Wengerrek goizero gosaldu baino lehenago bakardade osoan egiten du korrika, eta duela egun pare batetik Anke haurdun dago. [The story begins like a sea painting. Wenger (Jürgen Vogel) and his wife Anke (Christiane Paul) live by the water in a wooden house. Every morning Wenger goes for a run in total solitude before having breakfast, and for a few days now Anke has been pregnant.]

Table 10 Translation of a Binomial
Source text (ST6): So geht das ein bisschen hin und her. [It goes back and forth for a little while.]
Student translation 1: Kuxkuxeatzen ibili da pixka bat. [She/he has been snooping for a little while.]
Student translation 2: Alde batetik bestera mugitzen da. [She/he moves from one side to the other.]
Student translation 3: Gutxigorabehera badoa. [It works somehow.]

In some cases, we found mwes that may have different
definitions, whose meaning is dependent
on the context and which caused transfer problems
for all students without exception. This is
the case of the binomial hin und her (gehen) in the
example in Table 10. The st refers to the situation
described in the previous paragraph, where it
is explained how first-year students are waiting in
Zoom and writing in the chat before the online
class begins. We found six very different translation
options, but they all contained transfer errors.
The first option included in the table refers to one
of the students who is nosing around. The second
makes reference to an explicit movement: she is
moving from one side to the other. The third says
“it works”, maintaining to some extent the ambi-
guity of the original, but leaving the reader with
the uncertainty of knowing what it is that works.
Positive
Translations marked as positive were regarded as
such because the outputs were particularly good
or because they utilised resources typical of the
Basque language. In the example of Table 11,
the student, instead of sticking to the original,
made use of reduplication, an often-used
resource in Basque (Ibarretxe-Antuñano, 2012,
p. 138), which in this case served an
emphatic function (i.e. bakar-bakarrik, ‘in complete
solitude/loneliness’).

Table 11 Example of Positive Outcomes
Source text (ST2): Jeden Morgen dreht er in vollkommener Einsamkeit seine Runden, danach wird gefrühstückt, und seit ein paar Tagen ist Anke schwanger. [Every morning he does a few lengths in complete solitude, then they have breakfast, and for a few days now Anke has been pregnant.]
Student translation: Wenger goizero joaten da igeri egitera bakar-bakarrik, eta, ondoren, gosaltzen du. Bada egun pare bat Anke haurdun dela. [Every morning Wenger goes swimming in complete solitude and then he has breakfast. For a few days now Anke has been pregnant.]

Table 12 Examples of Positive Outcomes
Source text (ST6): In den Semesterferien ist die Stadt ausgestorben, am Semesteranfang überdreht. [During the break, the city is deserted, at the beginning of the semester, packed.]
Student translation: Udako eta Gabonetako oporretan hiria ia hutsik gelditzen da, eta lauhileko hasieran jendez lepo dago. [During the summer and Christmas holidays, the city is almost empty and at the beginning of the semester, it is crowded.]
Source text (ST1): In ziemlich schlichter Manier will das pädagogisch wertvolle Buch die Augen öffnen für die Verführbarkeit des Menschen durch autoritäre Gemeinschaftsideologie. [In a rather simple way, the pedagogically valuable book aims at opening people’s eyes to the seductiveness of authoritarian community ideology.]
Student translation: Oso liburu esanguratsua da pedagogikoki, eta haren asmoa modu sinplean honakoaz ohartaraztea da: gizakia erraz erakartzen du komunitate-ideologia autoritarioak. [Pedagogically, it is a very significant book, and its goal is to draw attention to the following in a simple way: people are easily attracted to authoritarian community ideology.]
There are translations that are more idiomatic
than the German sts. This is the case of the first
example in Table 12, where the student used a
somatic mwe in Basque: (jende)z lepo (where lepo
means ‘neck’ and the overall figurative meaning is
‘crowded’).
This translation was tagged as positive, but it
should be noted that phraseological gain does
not always result in a better translation. In the last
example in the same table, the German somatic
mwe (die Augen öffnen, literally ‘open the eyes’)
was not translated with an equivalent Basque
somatic mwe, which exists, but with a verb
(ohartarazi, ‘to draw attention to’) that properly
captured the meaning of the original.
Suspected sl Intrusion
As previously discussed, when translating from
German to Basque, students are used to working
with an intermediate language, usually Spanish.
Furthermore, as the students themselves recog-
nised when completing the metadata forms, the
majority use mt systems. In this language combi-
nation, they first consult the German-to-Spanish
mt, and may then look for the output of the
Spanish-to-Basque mt system. The possible inter-
ference from the pivot language may be regarded
as textual interference since traces of the mt text
may be found in the Basque translations. Traces of
such textual interference may be observed in the
two examples in Table 13. DeepL, currently the
most widely used mt system among our trainee
translators, translated the first sentence from the
table as follows: “El alemán breve ha cobrado vida
propia”. Cobrar vida propia is an mwe meaning
‘to take on a life of its own’. In the Basque translations
of this sentence, two students (out of 10)
chose to retain and literally translate the Spanish
mwe to Basque: bizitza propioa hartu (literally,
‘to take the own life’).
The translation of the second example (ST4) also
shows traces of another Spanish mwe, tener con-
ocimientos de algo (literally, ‘to have knowledge of
something’), which was found in the DeepL trans-
lation: “En primer lugar, la gran mayoría de ellos
tiene conocimientos de español y de euskera” (‘first
of all, the vast majority of them have knowledge of
Spanish and Basque’). Another reason for choosing
this direct translation of the Spanish mwe may
have been the register of the st. Because st4 was
a formal, specialised text, the trainee translator
may have used this noun and verb collocation
in an attempt to maintain the same level of formality
(instead of a single verb, in this case jakin,
as others appropriately did).
Cultural Mismatch
The group of mwes belonging to this section com-
prised predominantly solid compounds, which
were often mistranslated when translated literally
due to their attachment to the source culture, as
shown in Table 4.
Discussion and Conclusions
The goal of the present paper, carried out in the
framework of the must project, was to present
the results of the annotation in an ltc consisting
of German sts and trainee translators’ outputs
to Basque. As described in the results section,
the category with the highest number of error
annotations is Language, and within this cate-
gory, lexical errors are quantitatively the most
important. This has been observed in other stud-
ies (Castagnoli et al., 2011; Wurm, 2013). In the
Language category, the second subcategory with
the most errors is Mechanics, which includes both
spelling and punctuation errors. In the paper by
Fictumova, Obrusnik and Stepankova (2017), in
which a Czech-to-English ltc was compiled and
examined, punctuation was regarded as a problem
that needed to be “addressed in the curriculum”
(2017, p. 225). On the basis of our results, the
same applies to the language combination exam-
ined in this article.
More specifically, the present study has con-
tributed to the analysis of mwes in translation
didactics. The total number of annotations related
to these units is 324, spread over 8 different trans-
lation tasks from three different academic years.
The uneven number of translations per task makes
comparison between tasks difficult, but it can be
highlighted that the specialised text (ST4) repre-
senting an abstract was the most demanding. As
has already been explained, there were many mis-
translated terms (32, see Table 5) in this text. As
for non-terms, examples of erroneously translated
idioms (Table 9), collocations (Table 7), and mul-
tiword verbs (Table 6) were given.
Table 13 Interference from Spanish
Source text (ST3): Das Kurzdeutsch habe sich verselbstständigt, werde längst auch von deutschen Muttersprachlern aller Bildungsschichten genutzt. [“Short German” has taken on a life of its own and has long been used by native German speakers of all educational levels.]
Student translation: Alemaniera laburrak bizitza propioa hartu du, eta Alemaneko hiztun natiboek erabili dute bizitzako arlo guztietan. [“Short German” has taken his own life and it has been used by native German speakers in all areas of life.]
Source text (ST4): Zunächst gilt für die große Mehrheit, dass sie Kenntnisse in Spanisch und Baskisch mitbringen; [First of all, the vast majority have knowledge of Spanish and Basque.]
Student translation: Lehenik eta behin, gehienek gaztelaniaren eta euskararen ezagutza dute, (…) [First of all, most of them have knowledge of Spanish and Basque.]
Activities to raise awareness of the different types
of mwes and of the abovementioned problematic
cases can be carried out in class. More ambitiously,
a German-Basque bilingual dictionary of mwes
could be designed, aimed mainly at
translation trainees: “Learner translation data
could be used to design similar notes and incor-
porate them into bilingual dictionaries, especially
learners’ bilingual dictionaries” (Granger & Lefer,
2016 as cited in Granger & Lefer, 2020, p. 1195).
Castagnoli (2023) argued that variation is greater
in the translation of idiomatic expressions and
especially when a literal translation is not possi-
ble in the tl. The analysis of variation was not the
goal of the present study; but based on the dis-
tortion errors described in this paper, it could be
suggested that in some cases not having a literal
translation in the tl may increase variation and
errors in the translations (see Table 10). This is
not a conclusion, but a matter that may be worthy
of further analysis in the future.
Certain specific mwes were generally problematic.
This was the case, for instance, with the binomial
mwe hin und her gehen (see Table 10) and the
multiword verb zielen auf (see Table 6). The lat-
ter example corroborates the results of a previous
study (Sanz-Villar, 2022), where the translation of
multiword verbs was analysed, as indicated in the
introduction of this paper. Marcelo Wirnitzer and
Amigo Extremeña (2015, p. 381) report that
students are generally familiar with
English phrasal verbs, but that less frequent phrasal
verbs (or verbs with particles, as they call them)
were more often translated erroneously.
The examples in the section on distortion errors
may indicate that relying on mt systems’ outputs
when translating mwes can sometimes lead to
mistranslations or interference. A critical use of
resources for translators such as mt systems is nec-
essary, as stated by Rabadán and Gutiérrez Lanza
(2020, pp. 379-380): “Contemporary transla-
tion training relies on technology, from translation
memories and machine translation to the more
modest grammatical and spell-checkers, to reduce
the time and effort invested in the task. However,
as with any use of language and translation tech-
nology, successful performance requires that the
user can evaluate the outcome. A variety of (post)-
editing strategies can be applied to both human
and machine translation outputs, which require
critical human assistance”.
As far as interference is concerned, it is not limited
to the influence that the German st can exert
on the Basque tt; it can also stem from
intermediary mt versions in a third
language, mostly Spanish, as shown in Table 13.
With regard to this type of interference, referred
to as instrumental interference in Sanz-Villar
(2018), hands-on activities in which students become
familiar with and use monolingual and bilingual resources
(albeit indirect ones) related to mwes can be helpful.
The importance of interference awareness among
translation trainees is highlighted by Rabadán
and Gutiérrez Lanza (2020, p. 380): “Whether
translating or (post)-editing, awareness of
language-pair-dependent problems underlies suc-
cessful performance. Human translation, partially
informed by machine-mediated translation, is a
given in student workflows, but errors can easily
go unnoticed if cross-linguistic competence is not
properly developed”. Thus, developing awareness
of source- and third-language interference when
translating mwes may benefit students’ cross-lin-
guistic competence.
It may also be useful to present in class the dif-
ferent translation options we may have when
translating mwes. Students may have the errone-
ous impression that phraseological maintenance
or gain ensures a better translation (as explained
in the example in Table 12). Exercises including
quality evaluation of different translation options
of the same ST may be employed to design cor-
pus-based teaching material.
In a previous study (Sanz-Villar, in press), it
has been found that for some students literally
translating a Spanish mwe (when translating a lit-
erary text from German to Basque) was a conscious
decision. That study analysed not only
students’ translations but also the results of questionnaires
that students filled out after completing the
translation tasks. In the future, if the process of
translation is to be taken into account, using more
than one method to collect data will be crucial.
The use of mt systems, the type of mwe to be
translated, the absence of a literal counterpart in
the tl or the type of st can be important vari-
ables that need to be taken into account when
analysing the translation of mwes. The metadata
collected during the creation of this corpus also
takes into account other variables, such as the sl
and tl command of the translation trainees. It
may be interesting to consider the effect of these
variables on the output, for which it would be nec-
essary to have more specific information on the
level of proficiency of both the foreign language
and native language, beyond the self-assessment.
References
Aierbe Mendizabal, A. (2008). La traducción a la lengua
vasca de las unidades fraseológicas especializadas del
lenguaje administrativo. In M. I. González Rey (Ed.),
A multilingual focus on contrastive phraseology and te-
chniques for translation (pp. 27–44). Dr. Kovaç.
Albaladejo Martínez, J. A. (2015). Fraseología especializada
en traducción general. In P. Mogorrón Huerta & F.
Navarro Domínguez (Eds.), Fraseología, didáctica y
traducción (pp. 273–290). Peter Lang.
Alcina, A. (2008). Translation technologies. Scope, tools
and resources. Target, 20(1), 79–102. https://doi.
org/10.1075/target.20.1.05alc
Berlin-Brandenburgischen Akademie der Wissenschaften.
(n.d.). Bock. In Digitales Wörterbuch der deutschen
Sprache (dwds). https://www.dwds.de/wb/Bock#1
Castagnoli, S. (2023). Exploring variation in student trans-
lation. International Journal of Learner Corpus
Research, 9(1), 97–125. https://doi.org/10.1075/
ijlcr.22010.cas
Castagnoli, S., Ciobanu, D., Kunz, K., Volanschi, A., &
Kübler, N. (2011). Designing a learner translator
corpus for training purposes. In N. Kübler (Ed.),
Corpora, language, teaching, and resources: From
theory to practice (pp. 221–248). Peter Lang.
Corpas Pastor, G. (1996). Manual de fraseología española.
Gredos.
Espunya, A. (2013). Investigating lexical difficulties of learners
in the error-annotated upf learner translation
corpus. In S. Granger, & G. Gaëtanelle (Eds.),
Twenty years of learner corpus research. Looking back,
moving ahead (pp. 129–137). Presses Universitaires
de Louvain.
Fictumova, J., Obrusnik, A., & Stepankova, K. (2017).
Teaching specialized translation. Error-tagged trans-
lation learner corpora. Sendebar, 28, 209–241.
Granger, S., & Lefer, M-A. (2021). Translation-oriented
annotation system manual (Version 2.0). cecl
Papers 3. Centre for English Corpus Linguistics/Uni-
versité catholique de Louvain. https://cdn.uclouvain.
be/groups/cms-editors-cecl/cecl-papers/TAS-2.0_
annotation_manual_2021-10-26.pdf
Granger, S., & Lefer, M-A. (2020). The multilingual stu-
dent translation corpus: a resource for translation
teaching and research. Lang Resources & Evalua-
tion, 54, 1183–1199. https://doi.org/10.1007/
s10579-020-09485-6
Granger, S., & Lefer, M-A. (2018, November 5–7). The
translation-oriented annotation system: A tripartite
annotation system for translation research [Confer-
ence presentation abstract]. Parallel Corpora: Creation
and Applications (PaCor), Madrid. https://eventos.
ucm.es/_files/_event/_18651/_editorFiles/file/
Book%20of%20abstracts_ECETT-PaCor2018b.pdf
Granger, S., & Paquot, M. (2008). Disentangling the
phraseological web. In S. Granger, & F. Meu-
nier (Eds.), Phraseology: An interdisciplinary
perspective (pp. 27–49). John Benjamins. https://
doi.org/10.1075/z.139.07gra
Ibarretxe-Antuñano, I. (2012). Análisis lingüístico de las
onomatopeyas vascas. Oihenart: Cuadernos de Len-
gua y Literatura, 27, 129–177.
Leiva Rojo, J. (2013). La traducción de unidades fraseo-
lógicas (alemán-español / español-alemán) como
parámetro para la evaluación y revisión de traduc-
ciones. In C. Mellado Blanco (Coord.), P. Buján, N.
M. Iglesias, M. C. Losada, & A. Mansilla (Eds.), La
fraseología del alemán y el español: lexicografía y
traducción (pp. 31–42). Peniope.
Marcelo Wirnitzer, G., & Amigo Extremeña, J. J. (2015).
La traducción de fraseologismos en el aula de Tra-
ducción General. In G. Corpas Pastor, M. Seguiri
Domínguez, R. Gutiérrez Florido, & M. Urbano
Mendaña (Eds.), Nuevos horizontes en los estudios de
traducción e interpretación (pp. 373–387). Tradulex.
Mogorrón Huerta, P., & Navarro Domínguez, F. (2015).
Fraseología, didáctica y traducción. Peter Lang.
https://doi.org/10.3726/978-3-653-05309-8
Obrusnik, A. (2014, July 20–23). Hypal: A User-Friendly
Tool for Automatic Parallel Text Alignment and
Error Tagging [Conference presentation abstract].
Eleventh International Conference Teaching and
Language Corpora (TaLC 11), Lancaster. https://
ucrel.lancs.ac.uk/talc2014/doc/TALC2014-ab-
stract-book.pdf
Rabadán, R., & Gutiérrez-Lanza, C. (2020). Developing
awareness of interference errors in translation. An
English-Spanish pilot study in popular science and au-
diovisual transcripts. Lingue e Linguaggi, 40, 379–404.
Sánchez-Nieto, M. (2012). La doble interpretación aspec-
tual de predicados en la traducción alemán-español
de secuencias narrativas: análisis de un corpus de
traducciones estudiantiles. TRANS. Revista de Traductología,
16, 79–99. https://doi.org/10.24310/TRANS.2012.v0i16.3213
Sanz-Villar, Z. (in press). Los fraseologismos en el aula de
prácticas de traducción: estudio de caso sobre el im-
pacto de los traductores automáticos.
Sanz-Villar, Z. (2022). German-into-Basque translation of
verbal patterns. In C. Mellado Blanco (Ed.), Produc-
tive patterns in phraseology and construction grammar
(pp. 265–286). Walter de Gruyter. https://doi.
org/10.1515/9783110520569-011
Sanz-Villar, Z. (2018). Interference and the translation
of phraseological units in a parallel and multilingual
corpus. Meta: Journal des traducteurs / Meta:
Translators’ Journal, 63(1), 72–93.
https://doi.org/10.7202/1050515ar
Sarasola, I., Salaburu, P., & Landa, J. (n.d.). Egungo Testuen
Corpusa (ETC). https://www.ehu.eus/etc/
Serrano Lucas, L. C. (2010). Metodología para la enseñanza
de la fraseología en traducción: la ficha fraseológica
como tarea final. Paremia, 19, 197–206.
tralima-itzulik. (2019). TAligner (Version 3.0)
[Computer software]. https://addi.ehu.es/
handle/10810/42445
Valero Cuadra, P. (2015). Fraseologismos en el aula de
traducción general. In P. Mogorrón Huerta, & F.
Navarro Domínguez (Eds.), Fraseología, didáctica y
traducción (pp. 309–320). Peter Lang.
Wurm, A. (2013). Eigennamen und Realia in einem Korpus
studentischer Übersetzungen (KOPTE). Trans-kom,
6(2), 381–419. https://www.trans-kom.eu/
bd06nr02/trans-kom_06_02_06_Wurm_Eigenna-
men.20131212.pdf
How to cite this article: Sanz-Villar, Z. (2024). German-to-Basque translation analysis of multiword
expressions in a learner translation corpus. Íkala, Revista de Lenguaje y Cultura, 29(1), 1–21. https://
doi.org/10.17533/udea.ikala.354417