An introduction to Visual Constituent Analysis: Chapter 7. Constructing corpora from images and text

Christiansen, Alex; Dance, William; Wild, Alexander

doi:10.1075/scl.98.07chr

Part of

Corpus Approaches to Social Media
Edited by Sofia Rüdiger and Daria Dayter
[Studies in Corpus Linguistics 98] 2020
pp. 149–174

Chapter 7
Constructing corpora from images and text

An introduction to Visual Constituent Analysis

Alex Christiansen | Loughborough University

William Dance | Lancaster University

Alexander Wild | Lancaster University

The absence of visual analysis is a significant gap in the corpus literature, and one that may lead to unintended omissions, particularly in the analysis of social media. In this chapter we introduce Visual Constituent Analysis (VCA), a method of multimodal corpus construction that allows researchers to incorporate and analyse visual aspects of online media in large-scale corpora. The chapter addresses the shortcomings of a purely textual approach to discourse analysis when dealing with social media texts and offers a solution using computer ‘Vision’-based image annotation (in our case Google Cloud Vision). Finally, we demonstrate how our approach can be used to analyse a sample of 150,000 micro-blog posts from Twitter and show the difference in level of user interaction between combined image/text posts and language-only social media texts.
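As a rough illustration of the idea summarised above, the following minimal Python sketch shows one way image annotations could be folded into a text corpus and how interaction counts could then be compared. It is not the authors' implementation: the function names, the `IMG_` token prefix, the 0.5 score threshold and the toy posts are all assumptions made for this example, and the `description` and `score` fields only mimic the general shape of label output from an image annotation service.

```python
from statistics import mean

def to_corpus_line(post, min_score=0.5):
    """Append image labels (hypothetical 'constituents') to the post text."""
    tokens = [
        "IMG_" + label["description"].upper().replace(" ", "_")
        for label in post["labels"]
        if label["score"] >= min_score  # assumed threshold, not from the chapter
    ]
    return " ".join([post["text"], *tokens])

def mean_interaction(posts, key):
    """Return mean interaction for posts with and without image labels."""
    with_image = [p[key] for p in posts if p["labels"]]
    text_only = [p[key] for p in posts if not p["labels"]]
    return (
        mean(with_image) if with_image else None,
        mean(text_only) if text_only else None,
    )

# Toy data invented for illustration only; not taken from the chapter.
posts = [
    {"text": "Vote today!", "retweets": 30,
     "labels": [{"description": "Flag", "score": 0.92},
                {"description": "Crowd", "score": 0.41}]},
    {"text": "Good morning", "retweets": 10, "labels": []},
]

corpus = [to_corpus_line(p) for p in posts]
image_mean, text_mean = mean_interaction(posts, "retweets")
```

Prefixing the image-derived tokens keeps them distinguishable from the written text, so a concordancer could treat the two layers together or separately.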

Keywords: corpus construction, multimodality, images, Twitter, information operations

Article outline

1. Introduction
2. Opportunities and obstacles – Why ‘visual’?
3. Tentative solutions – Constructing ‘constituents’
- 3.1 Vision
- 3.2 Concatenating outputs
4. Analysing T-IRA with VCA
- 4.1 Hostile state information operations
- 4.2 T-IRA – a general overview
5. Quantifying the importance of images
- 5.1 Images, likes and retweets
- 5.2 Text-image overlap
6. T-IRA – A case study
- 6.1 Data
- 6.2 Method
- 6.3 Analysis
  - 6.3.1 Image reference (IR)
  - 6.3.2 Image and text reference (ITR)
7. Concluding remarks
Acknowledgments
Notes
References

Published online: 4 November 2020

https://doi.org/10.1075/scl.98.07chr

Cited by (3)

Collins, Luke C & Paul Baker. 2024. A computer-assisted analysis of image representations of obesity: comparing UK news content with the World Obesity Federation Image Bank. Visual Communication.

Hiippala, Tuomo. 2024. Rethinking multimodal corpora from the perspective of Peircean semiotics. Frontiers in Communication 9.

Christiansen, Alex. 2022. Book review: Empirical Multimodality Research: Methods, Evaluations, Implications. Visual Communication, pp. 147035722210996 ff.

This list is based on CrossRef data as of 19 July 2024. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers. Any errors therein should be reported to them.
