Part of
Parallel Corpora for Contrastive and Translation Studies: New resources and applications
Edited by Irene Doval and M. Teresa Sánchez Nieto
[Studies in Corpus Linguistics 90] 2019
Index
A
–ment adverb
50
addition
111, 146, 148
agreement metric
169
agreement study
163, 164, 169
Aleuska
198, 233, 238–239
alignment
6, 67, 84–89, 101–103, 108–113, 144–146, 204–205, 221–222, 239–240
see also
word alignment
text-to-video alignment
anchor
70, 256
annotation
discourse phenomena
160, 179
layer
65, 72, 82, 165, 166
scheme
156, 163, 165–166, 170, 176
see also
corpus annotation
automatic annotation
AntPConc
67
Apertium
258
automatic annotation
80, 162
automatic speech recognition (=ASR)
282, 283
B
base
268–269
bilingual cognates
13, 252, 253, 256–258, 262, 263
bilingual equivalence
275
bilingual extraction
251, 252
bilingual lexicon induction
59, 62
bilingual lexicon
59, 62, 109, 251
bilingual word embeddings
282, 286, 287
bitext
15, 70, 108–112, 121
Bizimena
235
BleuAlign
6, 85
British National Corpus
234
Brown Corpus
1, 39, 233
C
Canadian Hansard Corpus
2
causative construction
100
code-switching
79–81, 89
cognate extraction
251, 256–257, 263
collocate
268, 271
collocation
269, 270
equivalent
267–269
extraction
268, 269
COMPARA
40, 124, 138, 197
comparable corpus
19, 20, 44, 198, 251, 259
comparable parallel corpus
11, 20, 21
continuous bag-of-words (=CBOW)
286
copyright
63, 144
corpus analysis
208, 211, 219
corpus annotation
81–84, 86, 163
corpus compilation
106, 192, 197, 198, 200, 201, 217, 218, 220, 238
Corpus Query Processor (=CQP)
198, 200, 203–205, 206, 207
see also
CQPweb
198, 201, 203
Corpus Workbench (=CWB)
198, 200, 212, 213, 219, 220, 223, 231
cosine distance
271, 272, 274
cosine similarity score
285, 286
CREA
20, 29, 30, 69
culture-specific item
46, 49
Czech National Corpus
93, 94
D
degree of comparability
257
Déjà Vu
201
dependency parsing
83, 267, 269, 270
DepPattern
259
Dice Coefficient
268
discourse phenomenon
160, 162, 179
distributional hypothesis
254
distributional model
270, 272, 273
distributional semantics
269, 271, 285
distributional similarity
253, 254, 256
dynamic corpus
193, 238
E
Edit Distance
252, 257
Egungo Testuen Corpusa
235
EHUskaratuak
235, 237
elliptical compound
84, 88
empirical turn
34
encoding
153–156, 203, 204
engagement marker
176
English progressive
22, 24, 25
English-Norwegian Parallel Corpus (=ENPC)
2, 40, 67, 221, 227
English-Swedish Parallel Corpus (=ESPC)
2, 40, 227
Eroski Consumer corpus
235, 236
Europarl
2, 60, 79
evidentiality
159, 160
explicitation
22, 44
extended tagset
163, 170
extraction method
270
F
finite state machines (=FSM)
283
Foma
243
FreeLing
117
functional-semantic tagset
170
G
Galnet
153, 158
genre
28, 59, 66, 107, 109, 186, 218
gerund
25, 31–33, 226
GIZA++
99, 288
granularity
68, 163
gravitational pull hypothesis
23, 24, 31–33
H
Hizkuntzen arteko corpusa
235
Hunalign
109, 131
I
indexation
7, 197, 200
inter-annotator agreement
179
interference
50, 54, 135
interlanguage
64, 71
InterLingual Index
152
intermodal corpus
124, 125, 138
InterText
94, 95
intra-annotator agreement
164
IULA
198, 212
IXA pipe tools
246
J
JRC Acquis
79
K
Kappa coefficient
164, 169
Key Word in Context (=KWIC)
114, 115
Klasikoen gordailua
235
Kontext
94, 98
L
Lancaster-Oslo/Bergen Corpus
39
language identification
80, 81
language learning
10, 43, 64, 104, 105, 151
language variety
26, 233
lemmatization
84, 114, 132
lexical semantics
141, 142, 285
LF Aligner
109, 201
LinguaKit
274
Linguee
2, 106
loan word
133–135
log-likelihood
51, 52
M
machine learning
72
MaltParser
83, 274
manner adverb
50, 54
META-NORD
30
metadata
108, 118, 129–130, 204, 208
metadiscourse marker
175
metatextual tagging
201
minority language
9, 253
modality
171, 172
mode of translation
238, 241
monitor corpus
27, 65, 238
moses
290
Multilingwis
86, 104
multimedia parallel corpus
147, 155
multivec
286
mutual information
271
multi-word expression (=MWE)
243
N
naïve bilingual distributional model
272
named entity recognition (=NER)
82, 241
Natural Language Processing Toolkit (=NLTK)
288
negative sampling
286
neural network
271, 286
normalization
52, 135–136
see also
shorthand form normalization
Norwegian Spanish Parallel Corpus (=NSPC)
28
O
onomatopoeic expression
243, 246
OpenSubtitles 2016
274, 276
OPUS
60, 109, 114, 143
overrepresentation hypothesis
22, 24
P
parallel concordancing
126, 183
part-of-speech tagging (=POS tagging)
81, 82, 84, 117, 124, 127, 132, 162, 241–244, 246
see also
universal POS tag
PETRA 1.0
69–70
phonetic sequence
284
pivot language
94, 100, 252, 255
post-editing tool
69
prototypicality
23, 31–33
pseudo-parallel text
256
R
re-attachment of German verb prefixes
84
reciprocal corpus
124
register
142, 160–162
regularity of patterns
218
reliability
58, 99
of the annotation scheme and guidelines
163–164
reordering
145
replicable corpus building protocols
65
replicability
11, 21
reusable parallel resources
65
reusability
5, 30, 60, 65
S
sampling frame
21, 25–26, 28–29
seed context
256
segmentation
112, 144, 241, 283, 289, 290, 295
of spoken language
128, 131
semantic annotation
82, 163
semantically annotated (corpus)
152
semantic distance
272
semantic mirror image
49
semantic network
152
semantic tagging
153
SemCor Corpus
152–153
SensoGal Corpus
152–154
sentence alignment
6, 95, 108, 298
sentence division
221
sentiment annotation
83
shining-through
50
shorthand form
281
normalization
283, 290
similarity measure
109, 268, 271
SketchEngine
60–61, 86
skewedness
188
skip-gram architecture
272
sms4science
283, 287
statistical machine translation (=SMT)
282–284, 289
Solr
116–119
specialized translation
142
spelling similarity
256
spelling variant
84, 287–294
standardization
58, 65
State treaty
184, 187, 191
Statistical Corpus of the Twentieth Century
234, 235
stylistic aspect of translation
145, 148
subtitling
132, 147, 150
synonym detection
87
synset
152
T
tagset
81, 132, 176
see also
extended tagset
tagger
61
TAligner
239
Translation Corpus Aligner (=TCA)
67, 221
text message
281–284, 287–289
text-to-video alignment
131
TextHammer
186
textual/audiovisual interface
148
textual mark-up
108
thematization
164
TMX
144–146, 150, 153
tokenization
80, 97, 162, 242
training corpus
113, 117, 162, 176
training materials
60, 290, 294
transcription
128
translation candidate
99, 252, 254, 259, 284
translation direction
161–162
translation equivalent
40, 104, 231, 253, 259, 263, 269
translation error detection
87
translation memory
60, 144, 201
translation norm
43
translation problem
45, 49, 53
translation process
22, 32, 109, 127
translation universal
19, 22, 41
Translational Database of the Gipuzkoan Provincial Government
235
translationese
40, 44
TreeTagger
68, 82, 98, 201, 222
tweet
284
U
UNESCO Index Translationum
217
unique items hypothesis
22, 24, 32
Universal Dependencies
270
universal dependency label
83
universal of translation
21
Unix pipes metaphor
242
universal POS tag
81
usability
60, 65, 95, 227
usefulness
65, 71, 227
user group
80, 114
user interface
198, 224
V
validation
65, 253, 255, 260
variety
218, 233, 276
of Norwegian
27
of Italian
135
of French
9
vector
271, 285
representation
285–286
VEIGA
147–151
verb-object collocation
275
W
Web Corpusen Ataria
237
web interface
216
word alignment
80, 85, 119, 272
word boundary
288, 290
word embedding
271, 274, 285
word2vec
286
WordNet
152
X
XML
144, 153, 221
Z
Zientzia eta teknologia corpusa
235
Zientzia Irakurle Ororentzat
235
Zuzenbide corpusa
235