Introduction
Interpersonal video communication as a site of human sociality: A special issue of Pragmatics

Richard Harper,1 Rod Watson2 and Christian Licoppe2
1Social Shaping Research, Cambridge | 2Telecom ParisTech, Paris

1. Background

The development of applications for low cost, web-based video communication, such as Skype, Google Hangouts and Apple Facetime, combined with the (apparently) ever increasing communication needs of geographically dispersed networks of families, friends and lovers, has led to the steady growth of this form of contact. Indeed, being in touch through video, through Skype say, is virtually routine for much of the world. Surveys of Skype use by that company itself, as a case in point, suggest that this video-calling product is known by the bulk of the population in Europe, the Far East and North America. Skype, Hangouts, Facetime and the various other interpersonal video communication applications on the market are, then, part of a life where seeing another via a video connection, doing friendship, family and affection through the apparatus of screens and computers, is part of the taken for granted fabric of contemporary existence for much of the world. To ‘Skype’ one’s friend is as familiar as texting via the mobile, Facebooking on a tablet, or having a drink at a bar with friends. To make a video call is commonplace; to do so with family is routine, with a lover virtually a requirement, whether one is in Helsinki, London or Seattle.

2. Perspectives on mediated communications

Despite this, the literature on interpersonal video communication is limited. This is odd. After all, the literature on Computer-Mediated Communication (CMC) is enormous. However, the bulk of this research focuses on what are essentially textually-mediated forms of communication. There are many of these ‘textualities’ to be found. One can look at instant messaging, for example, at blogging, at Facebook posting, wiki entries and tweeting. All these entail typing, not gazing; reading and not listening; this seems to be the difference between Skyping and Facebooking, between blogging and Facetiming.

All these forms of the written have been investigated from a number of perspectives under the CMC rubric. The topic has proven to be especially fertile for enquiries from the pragmatic view (Herring & Androutsopoulos 2015, 127–151). Here, crudely speaking, the concern is with the real world arrangements that allow words to have their meaning and practical application realized (Levinson 1983). Whilst this is essentially an empirical corrective to overly theoretical and abstracted notions of language, the pragmatics approach naturally leads to theoretical categorization based on rich descriptions of types – types of words, sentences and grammars; types of contents and technological frames; types of purposes and users. A particular development has been identifying the types of textual genre associated with any particular technological form; another has been on how that genre is characterized, distinguished and evolves. The idioms of email are distinct from the patois of instant messaging, as is the rhetoric of blog posts from the hyperbole of Twitter feeds, as are the instructive modalities of communication with robots from the gentle elicitations that parents use with offspring via SMS (Herring et al. 2013, 3–31).

The contrast is not just between forms of computer-mediated communication, of course, but between all technologically enabled written acts of being in touch, from the written letter to the blog post (Baron 2000; Crystal 2001). Indeed, one can only be impressed by the creative output of research in this area. But as we say, it seems somewhat lacking in interest in Skype, Google Hangouts and Facetime, the applications and technologies that support visual connection, ‘video conferencing’ as it gets called. Despite these being nearly always Internet-based, built on a version of the TCP/IP protocol, the technologies (and brands) in question seem to have been categorized as another kind of communications medium, closer to face to face than text to text.

Of course, pragmatics researchers have long had something to say about the relationship between face to face and the textual. According to many, the former is richer than the latter, allowing more forms of exchange, greater movement between genre, more flexibility between purposes. Crudely put, here is the distinction between la parole and la langue, as De Saussure would have it. It might be that many researchers have thought this contrast has been well mined, and therefore see little novelty in studies of video mediated communication. What more can be said but that the video mode is richer than more textual forms?

No attempt is being made to authoritatively comment on the reason for this lack of concern in the literature. It is just being noted. Nor is it being argued that there is nothing on the subject of ‘videoMC’, even if it has not been so central to CMC. For one thing, a great deal is made of it in Media Richness Theory, a derivation of pragmatics that has emerged in the management science literature. But even here there is little new insight on the features of action within and through web-based video-mediated communication beyond what is stated as the opening premise of the research, namely that real time video is richer than more asynchronous communication forms. Little interest seems to be shown in the felt life of video connection – in what it means to be in contact via Skype or Facetime, on what the purposes behind doing so might be.

If pragmatics research is focused quite properly on words, then one might imagine that other disciplines that focus more greatly on these felt life matters would have given more attention to video and everyday life. But these have not shown, to date, great interest in video mediated interaction either. One is thinking here of sociology and anthropology. That this is so is all the more startling given how much research was done – and how many books were written – from the view of these disciplines on a prior communications technology, the mobile phone, where the felt life affected and shaped by those devices was so central to the research in question. Books like Katz & Aakhus’s Perpetual Contact (2003), Brown et al.’s Wireless World (2001) and Harper et al.’s The Inside Text (2005) all reported how mobile phones were altering the fabric of being in touch – how it felt to have friends ‘in the hand’ day in, day out. And, yet, today, when video calling on mobile phones (and tablets, laptops and PCs) is becoming part of the new fabric of everyday communication, few such equivalents are to be seen – as far as we are aware, there are hardly any books on video connectivity and everyday life by sociologists or anthropologists. The recent publication of Miller and Sinanan’s Webcam (2014) comes close to the topic, as does Beck and Beck-Gernsheim’s Distant Love (2014), but these are exceptions that prove the rule.

Perhaps there is a reason for this, and this might have to do with what video calling affords and what this says about interesting topics for the sociology and anthropology of the felt life. Whereas the mobile phone altered the mechanics of availability in ways that some said altered the socio-spatial geometries of the world (see for example Katz’s Magic in the Air, 2006; also Massey’s For Space, 2005), video calling seems to let people communicate as they would do ordinarily and without (more or less) any corruption caused by the intermediation of technology. It lets them make contact without privileging one mode of communication over another, sound over sight say, the heard over the seen. One of the catch phrases of the parent company of Skype (Microsoft), even if it is not meant to claim a scientific basis, might say it all: natural interaction. Perhaps it is in this sense that video calling is uninteresting to sociologists and anthropologists alike – because it’s not strange; being normal, the natural way of communicating, albeit over distance. Its felt life has no obviously novel features.

Whatever the reason for the apparent dearth of research, this does not mean that video calling is not addressed in the literature at all. We have already mentioned Miller and Sinanan’s work from the anthropological perspective. But it is important to note how this example is representative of how such an interest often treats the features of interaction in and with video connection as only an element, and often only a minor one at that, of a larger topic of inquiry where those details become largely inconsequential. A book written somewhat before Webcam provides a clear example of this. Madianou and Miller’s Migration and New Media (2012) explains how contemporary international – or transnational – employment migration trends are resulting in many families finding that ‘Mum’ works and lives far from home: abroad no less. This is particularly so for Filipino families, the book’s chosen community and culture. Madianou and Miller show how video calling is used by Filipino mothers working in London (and elsewhere, though London is the primary site) to keep in touch with their families back home, in the archipelago. The book explains that these connections are highly sought after – desired if you will – because these mothers are remote from family members that are often quite young. It’s these mothers’ kids who are being looked after by grandparents and aunts. Madianou and Miller explain that it is via video that the young children in question come to recognise what their mother looks like, since in the routine of their life they rarely see their mother ‘for real’. Madianou and Miller argue that through video connection ‘Mum’ comes to be more than a mere idea conveyed in the written word, or through the sound of speech on a phone, or via the very occasional visit. Seeing Mum via Skype or Facetime (etc.) lets Mum be recognized, and this is especially important when mothers come home, so that when they, for instance, walk out of the airport gates towards their children, those same children do not need prompting by aunts saying ‘There she is’ – as if the lady in question were a stranger. For the children can come to recognise their mother from the video calls. Mothers find they relish this; indeed they delight in it. It negates the grief of not being recognised at all, which hitherto – before free video telephony such as Skype – had been the price of work abroad.

These are important issues and worthy of inquiry. But in terms of topic, there is little interest shown by Madianou and Miller in how people ‘do’ video connection – the interactional mechanics of it, even the interactional processes of scheduling these calls, getting everyone ready for them. These matters are taken for granted by the authors. Their concern is identity – in how contemporary Filipino mothers do ‘being Mum’ as a new type of economic actor – a migrant, who is separated from family – and through this, sustain their families. One might say that the deployment and widespread use of webcam technologies is in effect a pretext for Madianou and Miller to re-examine a traditional anthropological topic, namely kinship and its constituents and the relationships between these constituents (mother/child, sister/sister, aunt/nephew and so on).

According to Madianou and Miller’s evidence, video calling gives greater importance to the visual in kinship systems. As it happens, the visual is an especial concern for another field within sociology and anthropology, cultural theory. In this view, the valence of video calling (certainly in the context Madianou and Miller report) is not merely that seeing allows recognition; it is rather that it brings an erotic element to family connections. By erotic is meant a concern for the sensual aspects of the body and all that ensues. Through video calling, mothers can feel the adoring gaze of their loved ones; they can delight in knowing that the one they cuddle at the airport has not been told to cuddle but does so since they see ‘It is Mum!’ Certainly this is the upshot of the arguments put forward by Peters’ Speaking into the Air (1999), even if he wrote that book somewhat before video connections became widespread. Peters’ thesis is that vision-delivering tools in contemporary communication technology are making the body more important than the mind when people seek to communicate. It is shifting expectations and the experiences that people delight in, what they desire. Certainly, today, and as we noted at the outset, seeing has become part of the requisite of the contemporary form of life. It would appear that distributed, fragmenting families solidify themselves not through articulating what they think when they are separated, but by letting each other recognise each other’s shape, their form, their body.

One can easily be persuaded that this is altering the connection between place and emotion and the visual. To see Mum has become the sought for value. In contrast, in the past, when one received a letter, say, it was understanding Mum’s subjectivity that was sought for, what Mum thought and felt ‘inside’. Today, according to Peters, seeing mother ‘on the outside’, for her shape and bodily presence, her physical aesthetic, is the desired goal. Mothers become pictures (or at least as they are seen through video connections), not entities with thoughts or inner reflections; in short, their looks become them, not their words. This is the rub of Peters’ analysis: we are stripped of our powers of articulation.

Though one might readily accept that the purposes of calls can be described as fostering a kind of gaze that removes a concern from the inner and replaces it with a concern for the outer, and, though one could also accept that this has consequences for the character of social relations, making bodies more evocative than the mind, one is less persuaded that this formulation adequately accounts for how those in video communications ‘do’ those communications. One might make a simple contrast here to illustrate what might be at issue. It seems to us that Peters is not really interested in how action is organized by those in the activities he mentions; he is interested in how one might describe those activities and thereby trace links between ideas and society more generally. His descriptions seek to highlight particular connections, and this might come at the expense of describing those calls in ways resonating with how the parties themselves experience such calls. Do Mums feel aestheticised? Do they think their outer is displacing their inner? Of course, perhaps some do; our point is that Peters’ business is not inquiring into whether this is so and how this might affect demonstrable conduct in and through video communications technology. If Madianou and Miller are interested in kinship, family, identity, and not really in video calls themselves, likewise Peters is not concerned with how Skype calls are done or organised, or even accounted for by those who use that technology. He is interested in aspects of culture as a system of meanings and how that system might be visible in various ways in particular observable activities. He is not wanting to claim that the activities in question are best described with this concern, or even what proportion of those activities are to be understood through reference to this concern, with culture-as-system. It’s not the empirical features of doings themselves that matter, but the possible ‘imaginary’ that might be see-able when those doings are cast in the cultural theory lens. Through this lens, symbolic landscapes of desire bound up with the intricacies of absence and presence mediated by technology can be accessed. And when this is done, empirical questions don’t matter, only the élan at travelling this landscape by the cultural critic. Peters is not really interested in the doings of those in video calls; he is interested in his own thoughts about those doings.

3. The need for a different approach

We should make it clear we are not being critical of Peters’ approach – one that delves into empirical matters to service theoretical topics, cultural ones in his case. One might note also that anthropology uses empirical matters to service theoretical topics too, as is shown by Madianou and Miller’s concern with kinship. One might also say that the concern to create theoretical types of language practice is the goal of pragmatics research. But we do want to suggest that there might be other ways of examining what happens in and through video calling, where the detailed organisation of the doings in a video call (and all arrangements thereabouts – setting up, scheduling etc.) is the topic itself. Key to this topic is that these matters are also the business of those involved – the ones doing the looking, as it were, and the ones who need to sort out the scheduling and the purposes of video calls. The ones whose business it is.

One might turn to the human computer interaction (HCI) perspective for this detailed concern. Indeed, in terms of numbers of papers and edited collections, this discipline probably provides the largest corpus of research on this area. HCI is primarily concerned with how to design and enhance computer technology; video systems provide canonical examples of this. Consequent on that one might expect that the details of use of Skype-like applications would be important. How people use the technology must surely affect how it ought to be designed. But these practices are strangely treated in HCI research.

Consider one important distinction. A difference that one would have thought essential to these practices and related details has, until quite recently, been unexamined in the HCI perspective: the difference between work and home settings. Most HCI research has focused on work, and has tended to treat insights about design derived from that setting as being relevant for other settings – including the home. Yet one might reasonably assume that the moral order of work activities, to coin a sociological phrase, is quite different from that constitutive of the activities of personal and private life, even though, of course, the private and the professional can and often do blur. For one thing, participation in work activities is bound up with contractual obligations for those involved. One has to turn up and be seen at work for a specified number of hours a week, for example. And while marriage is also based on a contract, this hardly specifies one’s hourly presence: it’s more a matter of legal relations to ownership of properties, to ‘chattels’ as the English common law expresses it. More importantly, and to express it in ways familiar to the pragmatics community, the implicature of work is quite distinct from that of home or private life. This difference has to do with intention. It is in light of this that one would conjecture that the typical purpose of video calling in the work setting is to get some work task done, whereas in private life it is with a view to ‘being in touch’ and ‘seeing each other’; doing the work of friendship if you like. The desired connections between friends and family are simply unlike those in the workplace. But this key distinction has not been a cornerstone of HCI research on video communication.

More recently, HCI studies have started to focus on domestic and emotional uses of video connections where these implicatures are important. While the bulk of these studies assert that in this context video communication is a means of ‘performing intimacy’ and ‘closeness at a distance’ (Kirk et al. 2010; Neustaedter and Greenberg 2011), their main interest is not so much in exploring the distinct moral order of such intentions as in providing solutions to what HCI sees as essentially problems of the user as a kind of ‘vision seeking body apparatus’. This leads them back to the same interests as those prior research studies that have used the workplace as their evidence base. As it does so it also leads them away from implicatures – the things that make work and private life so different.

For example, both the HCI of video connection at work and the more recent research on the HCI of video connection in the home have come to focus on such things as solving visual parallax; that is to say, on designing video systems where differences in what is seen by each party at either end of the connection are minimized and do not create problems of visual perspective (Heath & Luff 1992, 315–346; see also Brubaker et al. 2012). What distinguishes the HCI of home settings from work settings is that one might want more curtailing of some of the solutions to parallax in the home setting than in the workplace. Questions of privacy might superimpose themselves in more consequential ways in the home than they do at work. Designing a system to allow one to see only a narrow field of vision in the bedroom, for example, might be a better solution than one that allows the camera to ‘follow’ the direction of the iris, and hence to see objects in the remote field (the bedroom) that the party in the bedroom would prefer hidden. Often, in the workplace, such matters of privacy are less salient – crudely speaking all workspace is public space, at least to those who have sanction to be there. And if there are private matters to deal with, these are normally handled in private conferencing rooms. As Harper (2011) explains, and these questions of privacy aside, this concern for parallax results in the HCI approach tending to reduce all communication, wherever it is, to matters of physiology, giving greater importance to matters of the eyes (and the body that acts as a cradle for them) than those of the mind. In effect, it places the purposes behind video calls below the mechanics of seeing. Few studies from within this field look at the interaction as it is constituted within the events themselves by those engaged in those activities. HCI is not interested in ‘users’ as ‘reasoning actors’ but as ‘agents in a socio-technical system’ – as pieces in a system of seeing (for further discussion of this see Harper 2009, 73–82; also Rintel et al. 2016).

It must be admitted that this perspective can show dividends when designing systems. Questions of visual parallax point to easily made design choices. But ease of choice in this regard does not equate with understanding the ‘work’ of using video connections in either home or the workplace. The habitus of these domains, home or work, is lost from view. In short, HCI does not concern itself too much with the actual features of video mediated interaction, its detailed social organisation, what one might presume will be the improvised yet coordinated nature of turns at talk and at looking within a call, for example. All this and more turns out to be outside the remit of this perspective – just as it seems to be for anthropology, sociology and, for that matter, pragmatics.

4. Evidence from within

As should be clear, this lack of interest is proper in many ways. As remarked already, pragmatics researchers might feel that visual communications technologies are not as rich a source of novel evidence on the problems of pragmatics theory as are, say, textually-based technologies; anthropologists are not as interested in the felt life as in how aspects of that felt life affect matters of social structure, and in how social structure can be seen to constrain the felt life. But the consequence of these proper concerns would appear to be something of a lack of interest in and evidence about what actually goes on in video calls, when people Skype or Facetime one another.

As it happens, there are approaches that might help in this regard. Ethnomethodology and conversation analysis do, in various ways, focus on interactional details of communication between persons in ways that might be apposite for our concerns. Our thoughts turn to these in particular since they have proved effective at unpacking the organization of various non-visual yet mediated forms of communication, such as audio-telephony, in ways that allow understanding of how these communications are organised by those who do them – from within, so to speak. After all, it was telephone recordings that were the source of evidence in Sacks’ ground-breaking studies on how everyday talk is a participant-constructed social system (Sacks 1992; see also Schegloff 2007).

More recently, ethnomethodological and conversation analytic approaches have gathered evidence that points towards the organization of video calls, though this evidence is not derived from that topic precisely. There have been studies of, for instance, the intricacies of interaction on mobile phones and these intricacies can include questions of the visual field and hence the use of video connections through mobile phones (Hutchby and Barnett 2005, 147–171; Arminen 2005, 649–662). Other forms of communication, such as Instant Messaging (Garcia and Jacobs 1999, 337–367) and postings on Facebook (Page et al. 2013, 192–213; Frobenius & Harper 2015, 121–143), have been examined and these too point towards how the visual can come to have a role in acts of communication. Notwithstanding the seminal work of Michel de Fornel in France in the 1980s (De Fornel 1994, 107–132), it is only recently, however, that ethnomethodological and conversation analytic perspectives have been applied to interpersonal video calls specifically (see for example, Licoppe & Morel 2012, 399–429; 2014, 135–160; Mondada 2010, 277–334; Relieu 2007, 183–223; Sunakawa 2012, 264–276). What these initial forays into this area make clear is how video connection alters the salience of what is of concern in communication. Seeing and being seen, as Peters suggests, is indeed central, though while he connects this to ideas of the body in society more generally, these ethnomethodological studies show how matters of the visual, including the seen body of the interlocutors, come to be a powerful resource for topic management. It’s not culture that is at issue, but what video calls are ‘about’ that is.

All this, the preliminary research argues, affects the ways in which such mediated communication unfolds. These studies also make it clear that when two or more persons show a concern in the world they jointly see through a video connection, their concern must be justified and articulated as such in that communication; their individual interest must turn into something of joint interest in the interaction.

Other matters too, and not just visual, can also be brought into play – made topical, made the stuff of the mutually organised interaction. But again, doing this entails work for those involved, manifest in their articulations in the video calls themselves about what those calls are for, what they are about, even as those purposes evolve and change in vivo. These purposes can include ‘things that cannot be seen’, and thus the reasons for that elision (as in ‘Why can’t I see that necklace?’).

The currency of video calling is then bound up with the mechanics of those calls as organized interactions, as doings that are made anew each time a call is made by those doing the communication. Video calls may be glossed as key acts in the contemporary patterns of friendship, family and romance, but how these calls come to be constitutive of these social relations is integral to how those relations are done.

5.The purpose of this special issue

The aim of this special issue is to bring forward more work that systematically explores these aspects of video connections. It will not matter whether, in this research, the evidence relates to the use of Skype, Google Hangouts, Apple FaceTime (or indeed any other video application); what is important is that this research documents how participants in such communications treat these practices as practical affairs, collaborations amongst the engaged parties, and which makes such events have the experiential form they do. That they are organized ‘from within’ is a feature of these events as well as a resource for those doing them and hence should be a topic of enquiries into those events.

The papers selected for this issue reflect this concern. The first addresses how video calling is identified and treated as an accountable act in ordinary affairs, as a reasonable thing to do in reasonable circumstances. What is of concern for Harper et al. is what those reasonable things are said to be, how shared understanding of them is made and managed in talk, and how, through such talk, a world shared in common is fabricated. This world in common is not a label for observable facts about knowledge and access to Skype that relate to matters external to the call. This paper is not remarking on, say, social inclusion or exclusion; matters that might appeal to the sociologist. It wants to explore how Skyping is undertaken given a premise of shared knowledge about what Skype is and how to use it ‘ordinarily’ by those involved. In other words, this paper enquires into the orientation of those who use video connection.

The authors of The Interrogative Gaze argue that there is an invoked order or orientation to video calling brought into play in talk about such calls that gives those calls, and indeed talk about them, an ‘ordinary feel’. Video calling is understood as having typical forms by those who make such calls; a commonplace form in their manner, in how they are conducted, and in the motivations for them. This ordinariness is intrinsic to why they are made.

All this makes up ‘what everyone knows about a video call’ whether it is Skype, Facetime or any other video technology, the authors argue. ‘What everyone knows’ is not a statement of facts so much as an interpretative schema, a constitutive set of concerns that allows all those engaged in talk about video communication to come to some shared agreement about what any particular call might be ‘about’.

Harper et al. show how this orientation aids not only in sense making about such things as topic management in video-calls, but also in elaborating the salience of the relationship between topic and the patterned governance of social relations in general and outside the call – relations between mother-daughter, say, or friend-friend and so on. This leads people to plan for such video calls and seek explanation when they are not made. Knowing about Skyping or Facetiming is not just a question of knowing what to do while in them, then, but knowing when and where and for whom they are sensible things to do. In this regard matters that are external to the call end up being internal to it, and hence traditional sociological concerns can be seen to be brought into play but through an unusual route: via the ways this particular action is made purposeful. People may video call their mother because she is singularly important to them, but they do so since Mother is a category that all know can explain the reasons for such a call as well as govern conduct within it.

In the following paper, Licoppe builds on this general framing of ‘what video calling is about’ by looking at the opening sequences of such calls. He is particularly interested in seeing if they have any special organisational form. Unsurprisingly he finds they do: just as in ordinary conversation, people don’t just make a connection; they have work to do at the opening of such contacts, justifying them as well as engaging in due propriety from the outset – saying hello at the right moment, waiting on a response and so on. But video calls have especial complexities and features. In Skype Appearances, Licoppe shows that they have a particular sequential adjacency pair organization, and a multi-staged format. They consist not just in an initial greeting pair, when a call starts and the initially involved parties respond in turn, but also in further, subsequent greetings when others, also part of the call, come into the interaction. These moments are ‘arranged’ such that the participants themselves sometimes call the moment in question a ‘proper greeting’ – as in ‘We are all here now, say hello everyone!’

Licoppe goes on to show that part of the work of video calling, if work it is, entails not only getting things ready to see, but also dealing with opportunities for greetings that are serendipitous, or at least sometimes staged so that they seem to be. Licoppe reports in particular on greetings which are massively bound up with the seeing of others, when it is the actual act of seeing that becomes the salient aspect of the greeting. As it happens the French have a word for this: they are called coucou moments. Coucou is a vernacular for saying ‘See you’ when seeing is very much the thing being alluded to – when someone sees a friend on the other side of the Metro station, say, or when someone eventually finds a person in a busy public place even though they have been talking with them on the phone as they seek them out.

Coucou is like a word that one would use in the family game of hide and seek at that moment when someone is found – though of course, there is no English vernacular for it – ‘found you!’ hardly conveys the feel of it. Coucou-ing is not then a mere statement of fact; it’s an outcome of a particular orientation, a desire to report what one sees and to arrange what is seen so as to make the seeing justifiably celebratory. To gaze at another over video is not a simple fact of communication; it’s a particular stance, noted and accounted for, its commencement worthy of comment. To see who is seeing and what is being seen is what one talks about; it gives purpose to a Skype call.

The third paper, Image-based Topical Talk, by Zouinar and Velkovska, looks at another, though clearly related feature of video calls – what’s done when the greetings are over and when the coucou moments have been played out. This is particularly an issue when there are no apparent or stated purposes of such calls other than that they are merely about ‘keeping in touch’. Zouinar and Velkovska show that the video in the communication itself, what it allows people to see and show (as well as to hide or elide), is a resource leveraged to make and direct topics – things to talk about that keep the video call going over and beyond the replaying of introductions, of ‘hello, I see you’ type acts; beyond the ‘coucou’s’. In this regard, this paper details what Harper et al. argued in the first paper is the normative orientation to video calling, namely, one that treats the visual as an interrogative opportunity, something to talk about and comment on, and indeed, the oriented-to frame of reference that is assumed when Skyping and such like are mentioned as possible things to do. For this frame of reference, the visual field, is the font of ‘reasons to Skype’ and ‘reasons to avoid’ such a communication, just as it is a resource used to explain and account for other doings on video calls – when the visual is ignored.

Zouinar and Velkovska evidence the relevance of a particular sequential adjacency pair organization in this regard, what they label showings and noticings (Sacks 1992). These articulate themselves in step-wise fashion, with an appearance-for-the-first-time sequence being a crucial resource for participants in a video call who seek to use such showings to orient subsequent showings and noticings in the communication. In this way, callers establish a ‘joint video interactional frame’ – a field of concerns that both (or all) attend to willingly. The authors go on to explore how a video image is also used as a resource to introduce, to maintain or to change topics. When this occurs it imposes interactional tasks on parties to the call – forcing them to account for why something is ‘being shown’.

By describing the practical actions that enroll the visual in family and domestic communications, Zouinar and Velkovska show how the interaction itself, the relationship between the persons incarnate in that interaction (mother-daughter, brother-brother, etc.), and the ‘technology as a resource’ are interwoven in courses of action. As they explain, video calling is embedded in already rich, detailed and well-rehearsed patterns of joint activity; the features or affordances of video extend and elaborate on these patterns, giving new nuance to what communications can be ‘about’.

The fourth paper elaborates a different set of practices, ones that make some of the peculiar properties of the visual field in video communication, and more especially some of the unique computer generated aspects of the visual, into an opportunity to make even more stuff to talk about. Rosenbaun and Licoppe evidence how, in multi-person video conferencing systems like Google Hangouts on Air (i.e. ones where there is more than a pair of connections, but several, in different places), participants engage in what can best be described as collective performances of computer literacy. If, in ordinary face to face talk, the things spoken about can be a topic of subsequent talk, and, if, in most video calls, the things seen and shown can also be a topic or resource for talk, then Rosenbaun and Licoppe show that so too can the ‘user experience resources’ of the communication applications themselves be a topic.

Parties to a call can exchange screenshares of pictures, for example, and this in turn can allow participants to bring to the communication event digital images that are ‘in’ their machine, so to speak; they can ‘run a video’ in the same manner and they can ‘link-share’ with those they are communicating with. In this way they can bring to bear content from outside the specific context of the video applications – from other applications, other computer tools. This can include YouTube, for example.

In this manner, stuff to talk about can have a peculiar property: it can be made even though this stuff has no real substantive existence outside the frame of the computer systems being used at the time. Matters that are internal to such communications can, if you like, be essentially internal to those calls or at least to the technological frame or context of those calls. In this regard, they are things communicated about that only exist within the praxis of communication via computer.

That this is so does not suggest that this makes for an unfamiliar, artificial world. On the contrary, these properties or resources in computer-mediated communication become an ordinary, ‘usual’ resource for doing the normal thing in such communication: making stuff to communicate about. Hence, this paper, Showing Digital Objects, returns us to the opening of this introduction, where we noted how video calling is not just routine and commonplace, but is often well-understood, leveraged in subtle and common ways to make play not just with the participants in the communication but with and through the technology itself. It’s not just the other that is a topic, then, but the means of communication too. Here is a twenty-first century reflexive aspect of what communicative practice consists in. One Skypes and makes Skype the topic, or at least, part of the topic or purpose of the call.

The last paper, The Skype Paradox, deals with a much more serious matter but likewise turns on the same key insight: that using video calling technologies is commonplace and that knowledge about the purposes and experiential resources afforded by such communication is used in those calls as well as considered when such calls are planned. But whereas Rosenbaun and Licoppe look at play, this paper looks at how Skype (and other communications technologies) provide a resource for the strategic and tactical management of everyday life; for serious matters if you like.

The paper examines, more especially, how people come to choose Skype (or its equivalent) over some other means of communication – Facebooking, emailing or voice messages on a telephone landline. It shows that such decisions, however unique and particular they might appear in any instance, are explained and described in everyday talk and interaction in terms of how different modes of communication constrain or open up different subsequent courses of action in communication. With video connection, next topics in the communication can be invoked by either party, for example, and their rejection or acceptance in the dialogue negotiated there and then – in vivo by the parties involved. By way of contrast, other modes of being in touch are described as allowing the management of turn taking and topic to have different forms and rules. Postings on Facebook are publicly available, as a case in point, and so any responses to such postings need to fit into the general tenor of such responses, a tenor which is tamed as it were by the need to be ‘unremarkable’ – otherwise they unsettle the known in common currency of such responses (Page et al. 2013; Frobenius & Harper 2015). Things that might be said in the intimate space of a Skype call may be less easily or appropriately raised via Facebook.

And here is the rub: the homeless young, the subjects of this paper, do want to avoid certain topics with certain people. And they do so in the way they select communicative modes. Skype is a mode that reduces their capacity to exert control over the substance of communication and this is especially so when it comes to dealing with parents for example, and so might be a mode to be avoided. It is not merely that parents might use the interactional flexibilities that this mode of communication affords to raise topics that the homeless young want to avoid; according to the homeless young, conversations with their parents have always been difficult. Rather, using vision-supporting tools like Skype to communicate with parents once one has become homeless makes these difficulties worse. The mode of communication affords it.

Hence the title of the paper – the Skype Paradox. It is an allusion to how an apparently easy-to-use communications technology can make for the most difficult communications between persons. An important theme for the paper is explaining how knowing this is something that the homeless young act on. Their reasoning about this concern is integral to the ways that they Skype, how they appropriate the technology themselves, how they choose it, when they do so and when and why they sometimes choose other modes of being in touch.

The topic of the paper is then how selecting a communication mode is the output of choice-making procedures, of rational consideration. That this rationality might lead people ‘on the street’ to avoid Skyping with those ‘at home’ may seem odd to the home-occupying majority of the population, but it is how the homeless young characterize their situation. They are homeless but nevertheless choose communicative media for ‘reasonable reasons’ that sometimes seem to reduce the affordances or richness of those communications. They do so not so much because of these affordances in themselves as because these affordances become variously salient given the ‘patterns’ and ‘rules’ of conversation; the frame-worked ways that parents and children deal with each other in their communicative acts irrespective of the medium of those communications. There is something more profound about acts of communication than can be seen just by the technological mode of that communication.

There is of course considerable pathos in The Skype Paradox. One thinks of the joyous moments of ‘coucou-ing’ that Licoppe remarks on in his paper and which presumably the homeless young (if French) might indulge in and one thinks as well of the fear of being ‘caught out’ by topic management that such very coucouing might result in. To see is not always something one wants to allow, especially if it is followed by topics that have nothing to do with the seeable, and focus instead on motive, competence, maturity. The latter are very much a concern for the homeless. The moral implicatures of these topics, the seeable as against motive, are quite different, though the technology used to bring these up might be the same. It is how the technology allows these implicatures to arise that is at issue.

That this is so underlines why this special issue of Pragmatics has been brought together. Though video telephony might seem everyday and, indeed, though the orientations to video calling examined in the papers underline this everyday nature, the effective use of this technology depends upon the adroit management of what is really a quite skilled matter: the routine ways of using video communications, and the routine ways of relating to other persons through this technology. And while these ways may be commonplace, this does not prohibit different and distinct application of them. For some, making a Skype call is a skillful act of love and is understood as such; while for others, it is an act that seeks to make accountable the remote person – and yet pushes that remote person further away even as they are looked at through the Skype call. Skills at social cohesion are used to make social fragmentation.

In sum, the social function of communication technologies is subtle indeed. This subtlety is not to be found in the technology, but in the pragmatics of use. As this Special Issue makes clear, these pragmatics are internal to, and the concern of, those who use Skype, Facetime, or Google Hangouts; it’s their business. We have sketched how they conduct some aspects of this business. But whether that business is serious or playful, ritual or spontaneous, whatever its purposes, how it is done is our business in this Special Issue.

References

Arminen, I.
2005 “Sequential Order and Sequence Structure: The Case of Incommensurable Studies on Mobile Phone Calls.” Discourse Studies 7 (6): 649–662.
Baron, N.
2000 From Alphabet to Email. London: Routledge.
Beck, U., and E. Beck-Gernsheim
2014 Distant Love. Cambridge: Polity Press.
Brown, B., N. Green, and R. Harper
(eds.) 2001 Wireless World: Interdisciplinary Perspectives on the Mobile Age. Heidelberg and Godalming: Springer Verlag.
Brubaker, J., G. Venolia, and J. Tang
2012 “Focusing on Shared Experiences: Moving Beyond the Camera in Video Communication.” Proceedings of Designing Interactive Systems (DIS 2012) June 11–15th. Newcastle, UK.
Crystal, D.
2001 Language and the Internet. Cambridge: Cambridge University Press.
De Fornel, M.
1994 “Le Cadre Interactionnel de l’Echange Visiophonique.” Réseaux 64: 107–132.
Frobenius, M., and R. Harper
2015 “Tying in Comment Sections: The Production of Meaning and Sense on Facebook.” Semiotica 204: 121–143.
Garcia, A., and J. Jacobs
1999 “The Eyes of the Beholder: Understanding the Turn-Taking System in Quasi-Synchronous Computer-Mediated Communication.” Research on Language and Social Interaction 32 (4): 337–367.
Harper, R.
2009 “From TelePresence to Human Absence – The Pragmatic Construction of the Human in Communications Systems Research.” Proceedings, 23rd Annual Conference of the British HCI Group (HCI 2009): 73–82.
2011 Texture: Human Expression in the Age of Communications Overload. Cambridge: MIT Press.
Harper, R., L. Palen, and A. Taylor
(eds.) 2005 The Inside Text: Social Perspectives on SMS. Dordrecht: Kluwer.
Heath, C., and P. Luff
1992 “Media Space and Communicative Asymmetries. Preliminary Observations of Video Mediated Interactions.” Human Computer Interaction 7: 315–346.
Herring, S. C., and J. Androutsopoulos
2015 “Computer-Mediated Discourse 2.0.” In The Handbook of Discourse Analysis, Second edition, ed. by D. Tannen, H. E. Hamilton, and D. Schiffrin, 127–151. Chichester: John Wiley & Sons.
Herring, S. C., D. Stein, and T. Virtanen
2013 “Introduction.” In Pragmatics of Computer-Mediated Communication, 3–34. Berlin: Mouton.
Hutchby, I., and S. Barnett
2005 “Aspects of the Sequential Organization of Mobile Phone Conversation.” Discourse Studies 7 (2): 147–171.
Katz, J.
2006 Magic in the Air: Mobile Communication and the Transformation of Social Life. New Brunswick: Transaction Publishers.
Katz, J., and M. Aakhus
(eds.) 2003 Perpetual Contact: Mobile Communication, Private Talk, Public Performance. New York: Cambridge University Press.
Kirk, D., A. Sellen, and X. Cao
2010 “Home Video Communication: Mediating ‘Closeness’.” Proceedings of the 2010 ACM Conference on Computer Supported Cooperative Work. New York: ACM Press.
Levinson, S.
1983 Pragmatics. Cambridge: Cambridge University Press.
Licoppe, C., and J. Morel
2012 “Video-in-Interaction: “Talking Heads” and the Multimodal Organization of Mobile and Skype Video Calls.” Research on Language and Social Interaction 45 (4): 399–429.
2014 “Mundane Video Directors. Showing one’s Environment in Skype and Mobile Video Calls.” In Video@Work, ed. by M. Broth, E. Laurier, and L. Mondada, 135–160. London: Routledge.
Miller, D., and J. Sinanan
2014 Webcam. Cambridge: Polity Press.
Massey, D.
2005 For Space. London: Sage.
Madianou, M., and D. Miller
2012 Migration and New Media: Transnational Families and Polymedia. London: Routledge.
Mondada, L.
2010 “Eröffnung und Vor-Eröffnung in technisch vermittelter Interaktion: Videokonferenzen.” In Situationseröffnungen: Zur multimodalen Herstellung fokussierter Interaktion, ed. by L. Mondada and R. Schmitt, 217–334. Tübingen: Narr.
Neustaedter, C., and S. Greenberg
2011 “Intimacy in Long Distance Relationships over Video Chat.” Proceedings of CHI 2012, ACM Press.
Relieu, M.
2007 “La Téléprésence, ou l’Autre Visiophonie.” Réseaux 144: 183–223.
Page, R., M. Frobenius, and R. Harper
2013 “From Small Stories to Networked Narrative: The Evolution of Personal Narratives in Facebook Status Updates.” Narrative Inquiry 23 (1): 192–213.
Peters, D. J.
1999 Speaking into the Air: A History of the Idea of Communication. Chicago: Chicago University Press.
Rintel, S., R. Harper, and K. O’Hara
2016 “The Tyranny of the Everyday in Mobile Video Messaging.” Proceedings of CHI’16. San Jose: ACM Press.
Sacks, H.
1992 Lectures on Conversation, Vols. I & II. Ed. by G. Jefferson. Oxford: Blackwell.
Schegloff, E.
2007 “Sequence Organization in Interaction.” A Primer in Conversation Analysis. Cambridge: Cambridge University Press.
Sunakawa, C.
2012 “Japanese Family via Webcam: An Ethnographic Study of Cross-Spatial Interactions.” In Lecture Notes in Computer Science, Volume 7258, ed. by M. Okumura, D. Bekki, and K. Satoh, 264–276. Heidelberg: Springer-Verlag.

Address for correspondence

Richard Harper

Social Shaping Research

Cambridge CB3 9DY

United Kingdom

richard@socialshapingresearch.com

Co-author information

Rod Watson
Telecom ParisTech
r.watson339@bitinternet.com
Christian Licoppe
Telecom ParisTech, Paris
christian.licoppe@telecom-paristech.fr