Classifying heuristic textual practices in academic discourse
A deep learning approach to pragmatics
In this paper, we investigate how deep learning techniques can be applied to discourse pragmatics. As a test case, we analyse heuristic textual practices, defined as linguistic implementations of decision routines in research processes in academic discourse. We develop a complex annotation scheme of pragmalinguistic categories at different levels of granularity and manually annotate a corpus of texts from various scientific disciplines. This annotated corpus is the basis for training recurrent neural networks to classify heuristic textual practices. Our experiments show that the annotation categories are robust enough to be recognised by our models, which learn similarities of the sentence surfaces represented as word embeddings. Our study aims at an iterative human-in-the-loop process in which manual-hermeneutic and algorithmic procedures mutually advance the insight process. It underlines that the interaction between manual and automated methods opens up a promising field for further research, allowing interpretative analyses of complex pragmatic phenomena in large corpora.
Keywords: discourse pragmatics, textual practices, academic discourse, deep learning, annotation
Published online: 11 November 2020
https://doi.org/10.1075/ijcl.19097.bec