Skype appearances, multiple greetings and ‘coucou’: The sequential organization of video-mediated conversation openings
Telecom ParisTech, Paris
This paper analyses the organization of ‘openings’ in Skype video-mediated conversation. It uncovers order in their apparent complexity by showing the relevance of a particular sequential adjacency pair organization, the appearing/noticing sequence, and its particular instantiation as an appearance-for-the-first-time greeting. The paper shows how this is a crucial resource in establishing a joint video interactional frame for the parties involved. This accounts for the occurrence of some specific phenomena in Skype openings, such as multiple greetings, and for the use of greetings which reflexively index their being occasioned by an appearance and related greeting, such as the French ‘coucou’, even when these do not occur at the start of Skype calls. When analysed this way, Skype openings, though complex, can be seen as an accomplished ‘dance of appearances and multiple greetings’.
Table of contents
- 2.1 Pre-beginnings and the collaborative assemblage of a social scene for video-mediated encounters
- 2.2 Multiple greetings
- 2.3 Analyzing various actual instances of multiple greetings in the opening of Skype conversations
- 2.4 Multiple greetings and the embedding of video-mediated communication in larger communicative ecologies
- 2.5 Not only appearing, but appearing in a certain way
- 2.6 A subtly orchestrated choreography of multimodal appearances and greetings
- 2.7 ‘This is not to be seen as the beginning’: Designing visual appearances and greetings so as to neutralize some of their sequential implications
- 2.8 A sequentially frustrating succession of finely coordinated appearances and withdrawals
- 2.9 Actually beginning the video conversation: A third greeting sequence
- Address for correspondence
This paper deals with the way people manage Skype video calls. Video-mediated interactions have been studied before, either briefly in the context of early experimental systems (De Fornel 1994) or more recently (and systematically) in the case of professional systems in organisational settings (Relieu 2007; Mondada 2010). However, video-mediated communication (VMC) of the kind exemplified by Skype, a technology primarily intended for interpersonal communication, for contact between family, friends and lovers, has been much less well studied. It is already known that VMC offers a resource for ‘performing intimacy and closeness at a distance’ (Kirk et al. 2010; Neustaedter and Greenberg 2011; Brubaker et al. 2012) and for helping constitute ‘distant love’ in a globalized world (Madianou and Miller 2012; Beck and Beck-Gernsheim 2014), but the actual organization of Skype-like calls, of VMC conversations themselves, for the doing of private lives, is much less well understood.
Consider openings, how Skype calls start, or rather how participants open up the conversation and start the interaction. These are clearly an important aspect of Skype encounters, and they are obviously the first thing about such calls that one might notice – they are the start after all. Yet even these have not been looked at. Of course, neglect alone might not account for this. It might also be that the task of examining them is difficult. But if that is so then this might itself point toward the very nature of Skype interaction. Unlike the openings of phone conversations, which seem to have a self-evident shape, a hello/answer form crudely speaking, Skype openings, as a feature of these interactions, do indeed look complicated; to the lay observer and professional analyst alike, they look dense, improvised in ways that make their order seem evanescent – much more than a hello replied to with another hello. Certainly, their organization cannot be accounted for by a ‘canonical sequence’, a straightforward structure like ‘recipient speaks first’ (Schegloff 1986).
Even so, one can presuppose that careful examination could demonstrate that these openings do have an order, one that those using Skype orient to and one that, by the same token, is analyzable by investigative method – the conversation analytic one, say. They may be dense and complex, but surely they are orderly. Why would they not be? Studies of conversation endlessly show that talk is organised – that orderliness is a feature of talk for those involved as much as for those wanting to characterise conversation for the purposes of social science. Indeed, it is my purpose in this paper to show, using conversation analytic techniques, that Skype openings are orderly.
More particularly, I will show, with transcript evidence, that they are characterized by three distinct, and distinctive, features. First, even though the launching of a Skype conversation involves an interface-based procedure that metaphorically evokes telephony (with the application icons reminding users of picking up the phone, for example), it is not the case that participants orient to typical phone behaviour, such as the rule of conduct which holds, ‘recipient speaks first’ (Schegloff 1972). This is because of the following two points. Second, during Skype openings, the visual appearance of potential participants and their displays of involvement (as when, after initially appearing to look away, they come to orient to the screen, say) come to be treated as particularly noticeable to the participants themselves. Indeed, they are expected to be noticed (and even commented on) by those in the call. While one would expect the visual to be consequential in openings in ways that are understandably unlike what is possible with phones, the way the visible comes to matter in Skype calls has consequences and ramifications that are more diverse and subtle than one might expect, and suffuse whole aspects of Skype communications. Third, and partly related to this, Skype openings often unfold in highly variable, contingent and complex sequences, extending over many turns, and this can raise potential uncertainties regarding the establishment of a proper joint interactional frame – for agreeing on the purposes of a call even as those purposes evolve. And, I will say, the achievement of multiple greeting sequences within single Skype calls is a recurrent feature, and this can make openings protracted. 
Skype calls often entail continuous starts, and this is not a technological problem, but a feature, an outcome of their interactional form. These are the claims I will want to make in this paper.
For the purposes of doing so, I will construct my analysis of Skype openings in the following way. I start from the observation that, during these openings, various configurations of events occur which can be viewed as shifts in the way that participants perceive themselves as sharing a perceptual field. The way they display mutual involvement in the video-mediated talk enables particular types of interactional and recognitional moves – such as gesturing, speaking, gazing, for example. I will describe such occurrences (and some of their variety) as potential appearances – things that show themselves and can thus be brought to account and can, in turn, affect the substance of a call. My claim will be that the ecology of what is seen, the spatial location of see-ables through the video cameras and computer screens, the temporal ordering of mutual availability around perceptual congruence (as in ‘do you see this?’) alternate with restrictions and constraints on what is seen, of who and what is available and when, to create a self-organized system of ‘affordances’ for the parties involved which allows them to manage and jointly agree on ‘appearances’, for making things come to matter as an opening in a Skype call – whenever that opening happens. The sharper the boundaries between these spatial domains or temporal moments, the more sudden the kind of appearances they can enable and sustain. I will argue, furthermore, that such ‘appearances’, because they fall under the auspices of a shared project, that of initiating a Skype encounter, are endowed with huge sequential relevance. 
To the extent that this is so, such appearances enact an expandable reconfiguration of capacities for interaction, and are treated by those involved as a resource to advance what a Skype call is about – a project of a particular kind, with particular topics and relevant persons and/or things; purposes that can, moreover, evolve and shift as more things are brought into play and appear in the interaction. Appearances in all their variety are therefore expected; they are noticed too, for they are often key to what happens in a Skype call – what a Skype call comes to be. In sum, appearances are in large part the substance of Skype calls since they help fabricate what those calls are about.
One can put this in a larger theoretical context. While appearances may constitute what Harvey Sacks has called ‘first positions’ (Sacks 1992), the way they project the relevance of greetings allows, I will show, those greetings-as-related-to-appearances to spread throughout the communication in ways that are quite distinctive, quite unlike what might happen in other communication contexts. This is unlike how greetings typically function in telephone talk, for example. A further theoretical elaboration I will want to make relates to the idea that greetings are a definable unit in interaction. This is a claim put forward by Duranti. He argues that greetings occur at specific interactional junctures in most situations, and involve the establishment of a shared perceptual field. This is a resource for the accomplishment of a larger goal, the recognition of the parties involved. This in turn results, he argues, in greeting exchanges becoming an interactional unit in their own right (Duranti 1997). Duranti’s attempt to offer an interactionally based description of such greeting units can be, I propose in this paper, reformulated in light of the evidence here. I will argue that they can be considered as part of appearance sequences. In this view, greetings are to be seen as occasioned by, and responsive to, configurations of events including ‘appearances’. Appearances-for-the-first-time can come to elaborate a greetings sequence and vice versa; and, as they do so, they can display a retrospective and prospective understanding of such configurations as ‘doing the work’ of making the Skype call have the shape it does. 
Appearances and greetings in this type of VMC are mutually elaborative, in other words, so while a recognizable appearance may project a greeting, the utterance of a greeting may retrospectively signal a potentially recognizable appearance. Appearances-for-the-first-time and their interoperability with greetings sequences can thus be seen, I argue, in terms of an adjacency pair organization, allowing for some recursivity in the understanding of both: A configuration of events can be turned into an appearance by a greeting, while a turn which is not designed as a conventional greeting but which is responsive to a recognizable appearance-for-the-first-time may be understood and treated as a greeting post hoc. Some greetings, such as the French “coucou”, which I will discuss in more detail below, highlight, I will explain, an especial responsiveness to fleeting appearances-for-the-first-time and the connection these might have to the appropriation of greetings. They are particularly well suited to visual communication environments, I will suggest, but tell us a great deal about the complex yet organized nature of VMC interaction. In particular, because of the ‘ahistorical relevance’ of greetings (Sacks 1992) there is no reason to exclude them at any relevant juncture in an interaction. Therefore, if the opening of an encounter unfolds as a succession of recognizable appearances, then the production of multiple greetings may constitute a resource to achieve a ‘proper beginning’ whenever that beginning might occur – such as at a coucou occasion that might happen in later moments of a Skype call.
All of this is significant, I believe. In studies of the initiation of face-to-face encounters, it has been shown how a form of sequentiality emerges from the treatment by the protagonists of their mutual approach to one another. Exchange of gazes, then ‘distant greetings’, then getting close enough for a proximal greeting, as is the case with friends, all constitute a series of coordinated moves, a dance of introduction, if you like (Kendon 1990). This has echoes in the initiation of the ‘encounter’ as happens with a service desk, a dance in the workplace (Mortensen and Hazel 2014). In the perspective that I develop here, a similar organisational purpose and dance is visible but its sequential form can be much more complex. In Skype calls, greetings can be seen as a response to successive ‘appearances’ in the course of an approach. That is to say, greetings are moments that can be retrospectively configured such that they can allow a transition in what the interaction is about at that moment in time – back to a greeting say, or perhaps to a pointing, an adjustment of the perceptual frame of reference, what the talk is about. Hence, coming closer to a camera so as to imply gestural reach and hence a form of ‘remote greeting’, or coming bodily close for the proximal version of a handshake or hug, and similarly pointing with a hand beyond the see-able field, can constitute moments of reconfiguration in the structural order of the interaction. 
My argument is that the sequential properties of the appearance/greeting sequences, which might involve greeting-like behaviours, are, in fact, crucial to understanding the organization of Skype openings in that these can shape up as multiple greeting sequences. Moreover, and pursuing this notion of broader significance, the complexity of Skype openings lies also in the fact that participants may have been connected by instant messaging before the Skype call commences or is even planned (Sunakawa 2012). This blurs when a greeting occurs, and what role it has – after all, people may have greeted one another long beforehand. In addition to this, the audio and video connections may occur at different times (and not at the same time for both participants) and this too will affect the orderliness and pattern of opening and greetings. That participants do indeed move forward and approach the screen in ways that can be visually noticed by others as a kind of greeting process, that is to say in an interaction-relevant way, as Kendon describes happens elsewhere, only compounds this complexity.
This is very important, so let me put it another way: Though participants in a Skype call may be apart, their movement with regard to the technology itself (affecting what is seen and shown) as well as with regard to some of the broader features of the technology (the quality of the audio, say) are mutually accountable resources for all involved. Not only might a person shift towards a camera to assert a move towards intimacy or ‘commencement’, say, but they can display themselves as engaged elsewhere, with their body torqued (Schegloff 1998), their head turned or their gaze averted. This can suggest a move towards closure. Similarly, they may suddenly re-orient so that they make themselves available for interaction by displaying the kind of ‘talking head’ orientation to the camera that is an almost ‘expected’ grammar or feature of video conversationalists (Licoppe and Morel 2012). This offers a renewed opening. Besides all this, if such potential appearances are neither simultaneous nor symmetrical, then Skype openings can involve a cascade of aural and visual appearances; and this in turn can result in involvement shifting, with participants having to manage the achievement of a proper beginning, a proper reconfiguration of topic or sequential orderliness, in the concern of the call. 
Taken as a whole, the amount of work required to establish an effective, jointly manageable, ‘interaction frame’ points to the frailty of VMC to various sorts of perturbations of social and technological character; this intensifies the orientation of the participants so that they notice and treat potential appearances as sequentially implicative resources whatever the source or timing of those appearances – the deliberate acts of the participants, the ‘breakability’ of the technology. Skyping turns out to be hard work given the organization of the interaction it affords.
It is no wonder then that in such a connection procedure, often gradual and asymmetric, participants display so vividly how their appearances should be attended to and sequentially treated with a suitably and equally vivid response. This can be seen in the fact that participants often check for appearances in Skype encounters, through confirmation queries (“do you see me?”, “do you hear me?”) or anticipate the occurrence of their own appearance to the co-participant (“wait it will come”, in Fragment 1 below, for example). Such an orientation towards the frailty of the connection process and the gradualness of the establishment of a joint interaction frame (involving successive shifts in hearability, visibility and involvement; i.e. successive potential appearances) makes the treatment of appearances by greetings a very useful sequential resource, circumventing the need for multiple confirmation checks. By greeting, the greeter signals that some previous event has occurred in her or his environment, making the context of the greeted relevant to a similar scrutiny, and offering a sequential emplacement for confirming that the appearance was mutual (by greeting back) or disconfirming it (by initiating some trouble talk). What might look to an external observer like a complex, highly contingent and protracted Skype video opening is then observable and accountable as something which looks to those inside a call like an unfolding choreography of appearances and multiple greetings, a dance of modern life conducted in the everyday context of a video-mediated communication, a ‘Skype’.
Data was gathered from 15 regular Skype users who allowed us to record some of their naturally occurring Skype conversations over a period of about one month, through a video capture software (Camtasia). These primary participants, and about 30 of their usual correspondents, agreed to make some of these conversations available for the purpose of this research, with us deleting those recordings that they deemed too personal. This provided us with a primary corpus of 73 hours of recorded video conversations. This contains 181 conversations, about one third of which are multi-party calls (involving more than two participants). All these conversations were personal calls between couples, family or friends who were not residing in the same town or country at the time.
2.1 Pre-beginnings and the collaborative assemblage of a social scene for video-mediated encounters
For any kind of interaction to proceed, participants assemble the social scene of the encounter as a collaborative process (Button and Sharrock 1998) in the opening stages of that interaction. The question that motivated our enquiries was: how do participants achieve such an emergent interactional frame in Skype video-mediated communication (VMC)? What kinds of specific concerns do they display in this orientation work?
Our first fragment involves a Skype conversation between two friends. Our informant, Art, initiated a Skype conversation with an old friend, Ben, and this appeared to be the first video call between them.
1.       ((sound connection))
2.       ((Ben’s image appears on Art screen))
3.  Art  #ça va? [note 1]
4.  Im   #Image 1.1
5.  Ben  ouais (.) ça va toi?
         yeah how are you?
6.  Art  ouais (.)#attends j’arrive (.) tu me vois là?
         yeah wait I’m coming do you see me there?
7.  Im   #Image 1.2
8.  Ben  non
         no
9.       (3.0)
10. Art  ouais (.)#attends ça y est
         yeah wait here it is
11. Im   #Image 1.3
12. Ben  ah ça y est c’est bon
         ah here it is it’s okay
13. Art  okay #enlève ces lunettes j’ai l’écran qui reflète dedans
         take off these glasses I have the screen that reflects in them
14. Im   #Image 1.4
15. Ben  *(1.5)
16.      *((takes off glasses and leans back in chair))
17. Art  ^Bonjour #CHAT^
         Hello CAT
18. Im   #Image 1.5
19.      ^--------------*----^ ((raises hands and waves))
20. Ben  *#((waves back *with wider gestures))
21. Im   #Image 1.6
22. Art  *((amplified wave))--------*#-----^
23. Ben  *((laughs))
24. Im   #Image 1.7
[Images 1.5, 1.6 and 1.7]
25. Art  ((kisses hand at the end of waving))
26. Ben  ((leans back in chair))
Note 1. A turn the equivocality of which cannot be conveyed in English. It can mean as well “How are you?” and “Is it OK?” (see discussion in the text).
After line 26, a topic is introduced. Following Lorenza Mondada’s analysis of multi-party video-mediated communication, we could say that the participants orient to the greeting sequence starting with the turn at line 17, combining a verbal greeting, a familiar term of address, and a hand wave, as the proper ‘beginning’ for their video call (Mondada 2010). In doing so they project an ‘anchor’ position (Schegloff 1986) for moving to whatever business is at hand. What will interest us here is the kind of preliminary, ‘pre-beginning’ work which has been done to achieve this sense that a proper interaction frame for a video-mediated communication has been achieved and which makes relevant the production of the greeting sequence at this juncture. This in turn allows the subsequent move into some interactional business at hand. How is this done?
After Art’s initiation of the call, a background noise can be heard, which is usually treated as a cue indicating the audio connection, and which is followed almost immediately by the visual appearance of Ben on Art’s screen (Image 1.1), and Art’s production of a first turn (Line 3). We have here an instance of a Skype video call where the caller speaks first, which is common enough in our corpus, and is not treated as troublesome. This suggests that a different sequential organization might operate in video-mediated environments to determine who speaks first than operates in phone conversation. It suggests, perhaps more aptly, that a sequential phenomenon, which may be treated as irrelevant by participants and therefore lies hidden in standard phone openings, becomes both relevant and apparent in Skype video calls.
Art’s turn is produced just after Ben has appeared on his screen, and can be heard as a greeting, so that it seems to be responsive to this visual occurrence. Video communication reinforces the importance of the establishment of visual availability (Mondada 2010), so that in VMC settings, greetings constitute an important resource to establish a mutual sense of co-presence (Relieu 2007). This greeting does not have the usual first position greeting form of “hello” or “good morning”. Moreover, its design as a “how-do-you-do” type of greeting is ambiguous for, though it will be treated by Ben as a greeting, it can also be heard in French as a check for possible trouble. This equivocal design might be a subtle way to treat the particulars of Ben’s appearance on screen, that is, first, that the background noise cues their mutual hearability (making it possible to talk), and, second, that Ben has just become visible on screen, which seems here to project a greeting response (thus making it relevant for the caller to speak if he is the first to see the other). The exchange of greetings which occurs at this juncture will not however be treated as a proper ‘beginning’ by Art. Art will immediately make explicit some concerns with the interactional frame (as we shall see below). So what we have here is a sequence of greetings which is triggered by the possibility of talking and the visual appearance of one participant, but which is not treated as a proper beginning.
After Ben’s response, Art instructs him to wait. The directive is followed by the announcement “I’m coming” (Line 6). This makes further talk by Ben conditionally relevant to some presence-related phenomenon. The pause and the visibility check that follow make retrospectively clear that the event which Art is looking for is precisely of a visual nature, that is Art’s own visual appearance on Ben’s screen. What makes Art aware that he might not be visible yet to Ben is the fact that his control image is yet to appear on his own screen (Image 1.2). The instruction to wait and the question check show that Art’s visual appearance on Ben’s screen is an event which is not only to be expected, but to be noticed and made interactionally relevant. The visual, moral order of video-mediated communication openings assumes and requires symmetric perceptual access. This also retrospectively ties the kind of presence indexically referred to in the “I’m coming” announcement to visibility, thus emphasizing again the importance of visual appearances as interactional events. After a repetition of the instruction, Art uses the expected visual cue (the appearance of his control image, Image 1.3) to utter an announcement confirming that it is now ok (and therefore was not before), designed with a change-of-state token in initial position (Heritage 1984), marking that the kind of visual event which was expected has been achieved. Art then obtains a confirmation from Ben which makes their mutual visibility into the relevant interactional common ground.
The next sequential slot is not taken up by Art to do a greeting but to instruct Ben to take off his glasses, followed by an account for that instruction: “I have the screen which reflects in them” (line 13, Image 1.4). Coming after an instruction to change something about self, and in this particular sequential environment (pre-beginning), this can be heard as pointing to some trouble affecting the establishment of a proper interactional frame. To see the reflected screen in the glasses as troublesome makes salient another dimension of the moral order of video-mediated communication: Visual access to the gaze of co-participants is expected so that gaze orientations can be monitored. With the glasses, not only are the eyes invisible, but spurious events are created in the regions of the eyes which attract attention while not being gaze-related. It is only after Ben has complied with the instruction and taken off his glasses that Art utters a first part greeting, the production of which displays his understanding that a proper interactional frame for the video call has eventually been achieved.
Throughout the pre-beginning build-up, the sequence makes analytically perceptible several members’ concerns relative to the openings of video conversations: (a) that they can hear and see one another, and that they are mutually aware of this joint perceptual availability; (b) that they display, visibly, their orientation to the start of the video-mediated encounter, through their crafting a ‘talking head’ appearance (Licoppe and Morel 2012), ensuring that they can be perceived as showing their full face whilst being as close to the screen as possible, as facing their screens so that their transactional segments overlap (Kendon 1990; De Fornel 1994), and as gazing at the screen (which requires that their eyes should be visible); (c) that they pay attention to the constructed character and potential frailty of this standard and expected interactional frame by treating potential deviations from such ‘normal’ video appearance as systematically noticeable and mentionable, and making one another accountable for that, as here with the glasses. This is true throughout the calls but especially so during openings where such a video communication frame for joint audiovisual telepresence has to be achieved and checked, and where deviations which might threaten it are remarkable and expected to be noticed. Such a concern also accounts for the frequency of checking and confirmatory sequences in such openings, either before or just after proper beginnings, such as the “do you see me” in line 6.
This accounts for the observation that in Skype video calls it may as well be the caller who speaks first as the call recipient. This is because participants are oriented to attending and noticing occurrences related to the establishment of a proper joint talking heads interaction frame, and to making visible such attending and noticing. Audio and especially visual appearances are also treated as sequentially implicative at this juncture – the opening (just as they are throughout a call). As a consequence, the first turn is often produced after a co-participant has become visible, and such an event may occur on both sides. This orientation then supersedes the summons-answer organization which would favor the call recipient speaking first. That this happened massively in the landline phone conversations studied by Schegloff in the 1960s is evidence for the standardized organization of (audio) landline phone connections: When picking up the phone, participants are usually quasi-immediately and symmetrically put in an aural connection, and they expect it to be so. In a sense, that the call recipient speaks first in such phone conversations shows the trust participants put in the solidity of the mutual audio interaction frame which derives from the recipient picking up the phone. Conversely, that the ‘call recipient speaks first’ rule might not be relevant to Skype video-mediated conversations provides evidence for the kind of work participants have to achieve to establish a proper interactional frame beyond just calling and accepting the call, and for the way they hold such an interactional accomplishment to be a frail one. Whereas people trust in the affordances of audio telephony, they do not with VMC.
Moreover, the same considerations provide for another phenomenon present in this fragment but which is more general to openings in VMC, i.e. the possibility of multiple greetings and multiple greeting sequences. While the greeting at line 17 marks the proper beginning of the video-mediated encounter, as evidenced by the participants orienting towards the introduction of other topics, it is not exactly the first greeting, for this could be said of the exchange of ‘how-do-you-dos’, lines 3–6. This initial exchange of greetings seems to be made relevant by an ‘on screen appearance’, that is an instance in which one participant becomes suddenly visible to the other. Visual appearances are recognizable as meaningful actions in opening environments, worthy of notice and response. Unlike collocated encounters where the visual appearance of an approaching party may be managed gradually (for instance through distant exchanges of glances, see Kendon 1990; Mortensen and Hazel 2014), video-mediated call technologies provide for a sharp temporal boundary between visibility and invisibility, giving the on screen visual appearance at connection its discreteness and its suddenness as well as its recognizability, and shaping its sequential implications. Moreover, visual appearances in openings are mostly appearances-for-the-first-time, and the production of a greeting in return appears geared to treat both the sequential relevance of appearances and their perceivable ‘first-timeness’.
As we begin to see with the fragment above, openings of Skype conversations may be protracted processes in which various connection events are not occurring simultaneously, and are treated separately by the participants. Multiple greetings then show how participants may isolate, recognize and treat as meaningful interactional moves, discrete transformations of the way participants are present to one another; that is, as stepwise transitions in their mutual appearances. I will look here at three different examples.
Greetings are difficult to grasp from a semantic perspective because of their propositional voidness (Searle 1969) and formulaic character (Duranti 1997). Sacks has shown the constitutive relationship between greetings and sequentiality: “One thing which is nice about greetings is that the greeting items, things like “hello”, are not greetings wherever they happen to occur. They are only greetings if they occur in the greeting place. And that’s a pretty good way to see we have a greeting place. And the greeting place is what we would call ‘first position’. If the item occurs there, then it’s a ‘greeting’. We can also look there to see whether it’s absent, because we know where to look” (Sacks 1992, 308). A first implication of this is that different types of interactional moves may be recognized as greetings, even if their morphosyntactic or gestural appearance is unconventional, provided they are produced in a relevant slot for greetings. A second implication is that the proper slot for a greeting and the greeting constitute one another within the sequential organization: there is an ‘enchronic’ relationship which governs their interlocking as a sequence of interactional moves (Enfield 2011). The occurrence of a proper slot for a greeting makes the next interactional move inspectable for its ‘greetingness’. The production of a recognizable greeting makes its emplacement recognizable as a possible slot for greetings.
In particular, the greeting and the first-time-ness of the occasion which made it relevant are mutually elaborative, and such a first-time-ness is a member’s phenomenon.
Working from a linguistic anthropology perspective, Alessandro Duranti has suggested that proper occasions for greetings involve the ‘establishment of a shared perceptual field’ and the ‘implicit establishment of a spatio-temporal unit of interaction’. Therefore, occasions in which some sudden change in their mutual communicative capacities becomes recognizable or suspected by co-participants (something which I gloss as appearances in this paper), and where such changes may be oriented to and treated as ‘first time’ circumstances, may be candidate events for the status of ‘first position’ events which make a greeting a conditionally relevant next action. Conversely, the uttering of a greeting may make prior events retrospectively inspectable for the status of ‘first position’ appearances (that is, as cues that some change relevant to the establishment of a mutual communicative field has occurred). Greetings may therefore constitute powerful resources to test the establishment of a communicative field in complex technology-mediated environments (Relieu 2007), and particularly in the openings of Skype encounters, in which the establishment of a proper interaction frame is often achieved in the course of a stepwise and protracted process. On the one hand, the aural appearance of a co-participant (through the sudden audibility of ambient noise), potentially indexing mutual hearability, or her visual appearance on screen (through adjustment of a camera or the participant’s body position vis-à-vis the camera) makes a greeting relevant. On the other hand, the production of such a greeting alerts the remote co-participant that such an event may have occurred in the perceptual field of the speaker, and offers the other a slot to confirm the mutual character of the appearance or signal some trouble.
The more participants expect difficulties in the establishment of the connection, the more they will orient to such appearances as noticeable and to be noticed, and to greetings as resources to test the quality of their joint interactional focus.
Moreover, according to Sacks, ‘greetings are a-historically relevant’. Compared to, say, an “‘introduction’ which, having been gone through once … are no longer appropriate, between any two people without regard to how long they have been acquainted, there isn’t a rule which says, on the nth conversation, no longer begin ‘conversations’ with ‘greetings’ … To say that ‘greetings’ are a-historically relevant is not to say that every conversation must begin with greetings, but that there’s no exclusion rule for greetings” (Sacks 1992, 551–552). The ahistorical relevance of greetings provides us with a sense of how multiple greetings may occur in openings: Openings are especially the locus of first-time events related to the establishment of an interactional frame, and if the latter involves a temporal succession of discrete and recognizably different appearances, and since there is no rule excluding the production of greetings at such junctures, members may orient to such contingent organizations for openings by treating successive appearances with greetings, so that the proper beginning of the social encounter will occur after multiple greetings.
Multiple greetings have been observed before in collocated interactions, in the form of a ‘distance greeting’ (such as a hand wave) produced before the participants move closer and exchange a proximal greeting which marks the beginning of their encounter (Kendon 1990, 172–173). Such greetings orient to the establishment of mutual perception and joint orientation (usually through gaze) enabling the initiation of a (gestural) unit of interaction. Within the frame of our analysis, however, distance greetings can be glossed as retrospectively framing that moment in their approach as a jointly oriented-to-appearance-at-gesturing-distance. Multiple greetings have also been observed to occur in video communication. In an early French experiment on home video communication in the 1980s, the caller had to establish a phone-like connection first, and then to negotiate the establishment of the video with the recipient through a request/accept technology-mediated sequence. Participants would respond to such a temporal and interactional separation in the establishment of audio and video connection by exchanging greetings twice, first after the audio connection, and then after the video one (De Fornel 1994). This phenomenon is not tied to the particulars of this video-mediated setting (though it clearly facilitated the production of double greetings) but can be more generally observed in Skype video communication openings under different forms, as we see below.
2.3 Analyzing various actual instances of multiple greetings in the opening of Skype conversations
‘Audio appearance’, and then ‘video appearance’: double greetings
1. ((rings))
2. Art Art’s control image appears on screen
3. Art moves Skype window to center of screen#
4. Im # Image 2.1
5. (7.0)
6. Audio noise starts
7. (3.0)
8. Art Bib/
9. Bud quoi?
   what?
10. Art bonjour
   hello
11. Bud’s image becomes visible to Art#
12. Im # Image 2.2
13. Art bonjour Bibi
   hello Bibi
14. ((waves hands emphatically for 2.5 seconds#))
15. Im # Image 2.3a to 2.3c
16. Bud bonjour (said as Art gestures with his hands)
   hello
17. Art ça va ?
   how do you do?
Art treats the advent of background noise as a cue that they might hear one another, and as a slot to speak, though Bud’s image is not yet visible on his screen. He provides an address term with an upward prosody, trying for a response (“Bib/”, Line 8). Being responsive to an event which can be understood as possibly establishing a perceptual field of mutual audibility, Art’s address term works as, and can be recognized as, a greeting, though its form is slightly unconventional. Bud’s response (“what?”, Line 9) demonstrates the establishment of the mutual perceptual field (he has heard something), but could be understood in this context either as a repair initiation (indicating for instance that he might not have heard properly), or as a treatment of the previous address term as a summons (“what” then being taken as a kind of shorthand for “what do you want with me?”). Art responds to this with a conventionally designed greeting term (Line 10), which confirms retrospectively that his opening turn was oriented to the establishment of a mutual auditory field, and could thus be heard as a greeting.
At that moment, Bud’s image appears on Art’s screen. Art orients to this by uttering a new greeting combining the same greeting term with an address term, marking that some sort of recognition has been made. It is interesting that he does so immediately (in the sense of not waiting for Bud’s expected completion of the previous greeting exchange), thus displaying his being geared to pay particular attention to that kind of visual appearance and its timeliness. From Bud’s perspective, this new greeting turn is shaped as being occasioned by some appearance allowing some additional form of recognition. It is thus not just prolonging the previous sequence, but constitutes a new greeting, responsive to a different event from the one treated in the previous greeting sequence. Art moreover makes it even clearer that this is a visually-oriented greeting by gesturing widely with his hands (line 14 and images 2.3a to c), turning the greeting into an embodied visual performance. When Bud smiles and provides a greeting of his own, the participants move on with the sequence, displaying their understanding that a proper beginning has been achieved.
We have here a situation in which the contingent temporal organization of the connection gives rise to a form of opening which is close to the initial observations of Michel de Fornel. The audio connection and the video connection occur successively in time, and the participants orient to them as successive forms of appearance, occasioning the production of two distinct greeting sequences. This is however not the only configuration in which opening sequences might involve multiple greetings, as we shall now see.
2.4 Multiple greetings and the embedding of video-mediated communication in larger communicative ecologies
Current communication systems are layered and embedded in larger digital ecologies affording multiple forms of interaction with remote parties. Such ecologies afford ‘appearances’ of all sorts. For instance, web-based video communication systems are usually associated with chat platforms allowing users to message one another between and during the video calls. It is often the case that video calls are pre-arranged on these platforms, by SMS, instant messaging or Facebook (Sunakawa 2012). In the following fragment, Hal and Léa are a couple living in different cities, who communicate through Skype several times a week.
1. 10:51 Léa bonjour bonjour ;-)
   good morning good morning
2. 10:52 Hal coucou
3. 10:52 Hal dis donc j’avais pas vu ce dernier message
   hey I had not seen this last message
((Several turns of IM conversation not transcribed here))
62. 11:09 Hal je peux te voir cinq minutes ?
   can I see you five minutes
63. 11:09 Léa oui
   yes
64. 11:09 Léa ne suis pas vraiment présentable
   am not really presentable
65. 11:09 Hal moi non plus
   me neither
(10.0)
Léa’s message is sent in the morning. Producing such a greeting message retrospectively frames the fact that they are both connected online as a relevant occasion for interaction. That the turn is a greeting highlights the recognizable ‘first-timeness’ of the occasion (it is their first interaction in the morning). Hal reciprocates with a greeting of his own, the untranslatable French expression “coucou”, which I will discuss in more detail below, and which precisely highlights mutual appearances. The two move on to a fifteen-minute-long topical chat. Hal then proposes a video call (Line 62). The call is framed to be a short one because she has said earlier that she was waiting for her mother to arrive. Léa agrees and then asserts that she is not really presentable. The design of her turn (with the use of “really”) shows that, rather than declining the invitation, she is self-deprecatingly accounting in advance for her visual appearance. Self-deprecating assessments establish a preference for positive assessments (Pomerantz 1984; Schegloff 2007). However, Hal does not see her yet, and he manages alignment by providing a similar self-deprecating assessment regarding his own visual appearance. Even in good humor, this preliminary assessment sequence does some ‘pre-appearance work’ by framing in advance the way their upcoming on screen appearance will be viewed and interpreted. ‘Pre-appearance work’ demonstrates participants’ concern with their visual appearance before the call itself. It can take other forms, such as using the control image to adjust one’s face or hair before the mutual video connection has been established.
After a few seconds, Léa eventually launches the video call.
1. Léa ((calling)) ((rings))
2. Hal ((accepts the call and Skype window opens up))
3. Léa ((appears, looking down #))
   # Image 3.1
4. Léa ((looks up #))
   # Image 3.2
5. Léa ((gets closer to the screen #))
6. # Image 3.3
7. Hal salut
   hello
8. Léa salut *(.) ça va
   hello how are you
9. * ((moves the Skype window to the center))
10. Léa ((looks down))
11. Hal ((puts the camera on))
12. ((Hal’s control image appears on his own screen))
13. (3.0)
14. Léa ((looks up))
15. Léa ((smiles)) *ah *(.) ça va ?
   how are you
16. * # ((smiles))
17. Im # Image 3.4
18. *((looks quickly down and up))
19. Hal mm
20. (2.0)
21. Hal bah t’as t’es du matin t’es toute t’es t’es
   bah you’ve you’re a morning person you’re all
22. t’es d’toute façon t’es jolie tout le temps
   you’re anyway you’re great all the time
When she appears on his screen, she is looking down. It is only when she looks up and gets close to the screen that he produces a greeting (Line 7). The greeting treats her visual appearance, but the timing of the greeting shows that it is sensitive to the production of a talking head configuration (and not just the visual connection) as the kind of appearance which properly projects a greeting as a relevant response. She returns the greeting and a how-do-you-do (Line 8), so that, within the same encounter, flowing from instant messaging to Skype, they have now produced two different pairs of greetings. After this she appears busy with her screen and keyboard. He possibly orients towards this display of parallel involvement as marking some trouble on her side, for he seems to realize that his camera is off, and he puts it on. After a few seconds she looks up at her screen and smiles (Lines 15–16), then produces a change-of-state token (Heritage 1984), which frames what she is about to say as visually occasioned, and repeats her how-do-you-do greeting. Her conduct can be read as orienting to the immediate sequential relevance of his visual appearance on her screen. Responding with a greeting, even if it involves redoubling a previous greeting, marks the fact that it is the first time he becomes visible in the course of their encounter. Her second verbal greeting is embedded in a sequence which already involved multiple greeting exchanges. Moreover, the design of her greeting (a “ça va ?”) is sensitive to the placement of his visual appearance in the opening sequence (they are now at the ‘how-do-you-do’ stage).
Hal produces a token of agreement and then moves on to introduce a topic, commenting on her visual appearance, and providing at that juncture, the kind of positive assessment of her visual presentation of self which her self-deprecating self-assessment message earlier on had projected. This demonstrates his understanding that they have established a proper frame of interaction, and therefore have begun the video encounter.
2.5 Not only appearing, but appearing in a certain way
It would be wrong to deduce from these previous examples that multiple greetings are tied only to transitions from one medium of interaction to another, as with audio to video, or instant messaging to audio and/or video. In the previous extract, Hal produces a greeting not when Léa becomes visible, but once she has looked up and started to gaze at the screen. Becoming visible, and ‘appearing’ in the sense I am discussing here are two different things. The next fragment is even more striking in this respect.
1. #video connection
2. Jan *hello:: #
3. Im # Image 4.1
4. *back turned cleaning things in kitchen
5. (4.0)
6. Jan gets close with face to the screen
7. #hey
8. Im #Image 4.2
9. Amy (hh)
10. Jan blows kiss *blows two more kisses
11. Amy *#coucou toi:
12. Im Image 4.3
13. Jan moves away *moves back to screen to adjust something
14. Amy * t’es (.) t’es encore en train de faire le ménage
   you’re you’re cleaning up again
When the image appears on Amy’s screen (Amy is the caller), Jan has her back turned to the screen (Image 4.1). She is nevertheless the first to speak, uttering a greeting (“hello”, line 2). What we may infer is that she has heard the characteristic background noise which cues the establishment of a situation of mutual hearability. Responding to that with a greeting displays an orientation towards treating this connection event as an aural appearance. For several seconds, Amy does not return the greeting and keeps silent. Amy’s silence is sequentially meaningful in two ways. First, considering the sequential import of visual appearances in video communication which we have stressed above, she does not seem to treat the visual appearance of Jan on her own screen. Second, she does not even respond to Jan’s greeting, as would be expected from the adjacency pair organization of greetings. However, if we consider this particular fragment from Amy’s viewpoint, her conduct suggests that for the appearance of a co-participant, and even for an initial greeting, to be sequentially implicative and project a return greeting, the co-participant is expected not only to appear, but also to appear in a certain way, i.e. in a way which displays her involvement in the concerted production of a beginning. Gazing at the screen is usually treated as a crucial index of involvement in video-mediated interaction, hence the normative importance of talking heads configurations, as mentioned above (cf. Licoppe and Morel 2012). In our example, Jan has her back turned to the screen and displays her involvement in some kitchen task, which makes it relevant for Amy to hold her talk though Jan has just become visible to her, and even greeted her in this fashion.
In other words, with respect to sequential implications, there is a difference between just becoming visible and doing so in a way which displays involvement in the initiation of the communicative event, i.e. ‘appearing’. Coming back to Jan’s initial greeting, she could not see Amy of course, but since Amy had launched the call she could make the plausible guess that Amy would be oriented to the establishment of a unit of interaction. After a few seconds Jan eventually turns back, walks in, brings her head into the screen, and gazes at it. She thus makes herself accessible and displays her potential involvement in the video conversation by achieving a situation of mutual gazing (Image 4.2). Here, this achievement occasions Jan’s uttering of a second greeting (“hey”, line 7). Amy produces a responsive laugh, and Jan expands her verbal greeting gesturally by blowing Amy a series of mute kisses (Line 10, Image 4.3), in the middle of which Amy utters a return greeting (“coucou toi”, Line 11). The silent kisses can only be appreciated visually and they highlight retrospectively the visual character of the interactional ecology in which the “hey” was produced. “Hey” is thus marked as a different kind of greeting from the initial “hello”, done while looking away. By uttering a second greeting, Jan frames the occasion as an appearance of Amy on her screen, as she turns towards it and suddenly sees her: though Amy might have been visible before, she had not been seen yet by Jan. Uttering a second greeting, Jan also enacts new sets of categorical relevancies. While the first greeting was uttered in an ecology which only made relevant turn-generated categories such as speaker/hearer, the second greeting enacts the viewer/viewed relational pair as locally relevant. Amy’s previous ‘silence’ is readable as a kind of holding back, sensitive to the perceivable involvement of Jan in other activities besides the establishment of a joint video interaction frame.
The fact that her return greeting in line 11 can be read as occasioned by Jan’s embodied reshaping of their interaction frame and second greeting confirms retrospectively this interpretation. Amy’s turn combines the greeting term (“coucou”) which we already have encountered several times and the address term (“you”) which marks social recognition, a design which displays an orientation towards a proper beginning being potentially achieved at that juncture. Indeed, in her next turn Amy introduces a first topic, in which she comments on Jan’s initial visual appearance while providing a candidate account for it (“you’re cleaning up again”, Line 14). In this way she retrospectively marks Jan’s initial appearance and side involvement as noticeable and she overtly makes her accountable for them.
The production of double greetings is therefore an interactional device the use of which is not tied only to the temporal distribution of ‘technical’ connection events marking discrete steps in the establishment of an audio-visual mutual perceptual field. This example also enlarges our sense of what might make a particular event recognizable as an appearance-projecting-a-greeting. Any occasion which can be seen as enacting a reshaping of the ongoing participation frame and interaction-generated participative statuses is potentially available as an appearance, projecting a greeting as a relevant next action, whether the agency involved in such a reshaping lies more on the technological side (as when one or both participants hear one another a few seconds before they see one another) or on the participants’ conduct (as when someone who was initially off screen, or was initially facing (and gazing) away, thus displaying only limited or partial involvement in the video-mediated interaction, moves in to produce a recognizable talking heads configuration and display of focused attention). Multiple greetings constitute a powerful interactional resource to treat all kinds of occurrences in which the emergent joint interactional frame is recognizably transformed during openings, these successive transformations being retrospectively confirmed as different sorts of appearances by the successive greetings.
While multiple greetings may occasionally be observed in other environments, even face to face (though space precludes a further discussion of this, we have recorded instances of multiple greetings in the opening of face-to-face interaction and in video-mediated institutional settings), Skype video communication offers significant opportunities for the production of multiple greeting sequences for three reasons. First, the use of Skype is embedded in the use of other communicative media and other forms of telepresence, instant messaging in particular. Second, as we have seen, the establishment of the video connection often involves delays and asymmetries in the establishment of the audio and video connection. Third, as it becomes routinized, Skype communication may involve participants at home experiencing an increased time pressure in the management of various domestic tasks (Hochschild 1999), using Skype while being involved in all kinds of other activities. The development of multiple greeting sequences in Skype openings shows how a proper interaction frame for video communication can be collaboratively achieved in a stepwise fashion, involving successive appearance/greeting pairs, each enacting a recognizably distinct form of mutual presence.
2.6 A subtly orchestrated choreography of multimodal appearances and greetings
Multiple greetings provide empirical evidence for the importance of the appearance/greeting organization in the emergence of a sequential patterning of interaction. This organization provides a resource to understand how participants may manage particularly complex and extended Skype openings, in what I will describe as an unfolding dance of appearances and greetings. The next fragment provides a characteristic example of such a situation. It involves the same participants as in Fragment 3, as well as Léa’s young son Tim.
Hal launches the video call, and when the video image appears, simultaneously enough for any asymmetry in that respect to go unnoticed, the situation is made peculiar by the fact that caller and recipient do not present themselves as attentive talking heads. Only the caller is visible, possibly looking towards the screen, but he is standing up and speaking on his mobile phone. On the call recipient side, only her son Tim is visible, and apparently attentive (Image 5.1). Tim initiates a greeting plus recognition markers when he sees Hal (Line 6) and thus displays his own ability to recognize and manage the sequential implications of the visual appearance/greeting pair organization, even at his young age.
1. Hal ((launches connection))
2. ((rings))
3. Hal ouais oui (.) *#moi j’suis pour ((on the phone))
4. *((video image appears))
5. Im # Image 5.1
6. Tim bonjour coco (.) *bonjour mon coco ((two steps back))
   hello coco hello my coco
7. Hal *je suis au telephone ((IM))
   I’m on the phone
8. Léa coucou ((IM))
9. Tim ah y telephone
   ah he’s on the phone
10. Hal ((stops typing and stands back))
Hal bends forward, and types “I’m on the phone” to Léa in the messaging window. Though this account for not appearing properly may be triggered by Tim’s greeting, and is designed in such a way that it could have been uttered as a verbal response to it, this change in communicative channel reshapes the participation framework and recipiency. It effectively ignores the boy and excludes him as a recipient, for Tim can’t read yet. Using the messaging channel rather than talk or gesture can also be understood as crafting a response which is compatible with his involvement on the phone (and reflexively points to his verbal unavailability), while being sensitive also to the fact that Léa has not appeared on screen. She replies immediately by providing a greeting turn of her own, which signals that she is close enough to the computer to respond, and may yet anticipate some availability issues on her side as well.
‘Coucou’ is a highly indexical French greeting expression, which, to the best of my knowledge, cannot be translated into English, lying somewhere between ‘hey’ and ‘peekaboo’. It indexes both the suddenness of appearances, in which persons perceive themselves as forthrightly put in the presence of one another with a capacity to interact in some way, and the embodied work involved in producing the resulting field of mutual perception. The use of coucou may therefore also index the briefness of an encounter, and it has made its way into an often used meta-pragmatic utterance, “je te fais juste un petit coucou” (“I’m just doing a little coucou”), which is often used to frame a visit or a phone conversation as a short, unmotivated one, as if it were ‘just to say hello’. As a greeting like ‘hello’ or ‘good morning’, it is responsive to an appearance which it constitutes as such, but, perhaps more than these conventional greetings, it also, and especially, attracts attention to the occasioned character of the mutual presence: its suddenness, say, its unexpectedness, and the charm it derives from these sorts of properties. Consequently, while these conventional forms of greetings involve a social recognitional dimension, i.e. identifying and making the interlocutor into ‘a distinct being worth recognizing’ (Duranti 1997), and are therefore resources for doing politeness, this is less the case with coucou, focused as it is almost exclusively on the instantiation of a shared moment. This is the reason why coucou is a greeting that is mostly used between participants who are already familiar with one another, or on playful occasions for which one may dispense with the politeness and social recognitional dimension of more formal social greetings.
These two reasons combine to make the production of coucou a common enough occurrence in Skype video interpersonal conversation openings (besides this fragment, we already had examples of that in Fragment 3), for the latter usually involve participants who are oriented to appearance events and who enjoy rather close and intimate relationships. Finally, though coucou may be used in situations like phone calls or instant messaging as here, it often has a strong visual connotation that makes it particularly relevant to the treatment of seeable matters. It is commonly used, for instance, by children when playing hide and seek. Piaget even made what he called the game of The Coucou (that is, making objects appear and disappear in the visual field of young infants) into a crucial method in his psychological investigation of how children may understand and recognize presence and identity (Piaget 1954Piaget, J. 1954 The Construction of Reality in the Child. New York: Basic Books. ).
Coucou can be used both as a first position and a second position greeting. A second position coucou like the one observed here is nicely designed to reciprocate Hal’s initial IM greeting, while pointing at the occasioned character of the situation of textually-mediated mutual presence that this instant messaging exchange of greetings enacts. By responding in this way rather than through other available, and possibly more expected, modalities of interaction (such as talk or mutual gaze), she also displays that she might be somewhat unavailable in that respect. Coucou might also index here the expectable briefness of their messaging conversation, clearly done under the auspices of the project of launching a video-mediated conversation and therefore framed as ancillary to that project.
2.7‘This is not to be seen as the beginning’: Designing visual appearances and greetings so as to neutralize some of their sequential implications
11. Léa mm
12. Tim *à quelqu’un d’autre (.)
        to someone else
13.     *((gazes twice left towards mother ))
14. Hal ((speaks on the phone))
15. Tim #ça va?
        how do you do?
16. Im  #Image 5.2
17. Hal #((waves))
18. Im  #Image 5.3
Image 5.2 Image 5.3
19. Léa #((waves back))
20.     # Image 5.4
21. Léa ((bends into the screen)) #((whispers “coucou”))
22.     # ((image 5.5))
Image 5.4 Image 5.5
23. Hal ((walks backwards))
24. Tim *et pis arête de (téléphoner)
        and now stop phoning
25.     *((pointing to the mother’s phone))
In this segment, Tim provides a noticing to his mother, realizing that Hal’s on the phone. Unlike his mother he does not treat that as a reason not to talk, for he utters a “how do you do?” (Line 15). Hal responds with a wave, the choice of a gesture to respond displaying his oral involvement in another activity (Line 17). The gesture is, however, visible to the invisible Léa, who treats it as addressed to her, and shows this by first bringing her hand into the center of the screen and then waving back (Image 5.4). She then leans into the screen, so that she becomes visible as such, and repeats her hand waving while whispering an inaudible but visible “coucou” (Image 5.5), before leaning back and disappearing off-screen again. By responding to the hand wave, which was initially addressed to her son, she effectively turns the sequence into a second greeting sequence between her and Hal. She then exploits her own gesture, the embodied displays of involvement and her mute talk, as a mutually elaborative process that affords the kind of meaning she desires – a greeting (Goodwin 2000Goodwin, C. 2000 “Action and Embodiment within Situated Human Interaction.” Journal of Pragmatics 32: 1489–1522. ) even though the conclusion of this entails a disappearing visual act, her body disappearing off screen whilst she waves in view. She becomes for a fleeting moment only an arm and hand. Performing communicative gestures such as this, out of view, would appear to run counter to the organizing maxim of video communication which requires the relevant co-participant to be visible (Licoppe and Morel 2012Licoppe, C., and J. Morel 2012 “Video-in-Interaction: “Talking Heads” and the Multimodal Organization of Mobile and Skype Video Calls.” Research in Language and Social Interaction 45 (4): 399–429. ), a normative order that implies as well that participants seek to avoid – or at least manage – restrictions of view. 
Léa therefore seems to orient to that requirement by self-repairing her previous greeting into one in which she both waves and is visible, with the “coucou” highlighting her sudden visual appearance and mutual visibility.
Usually a visual appearance and a return greeting like this would signal the possible beginning of a video conversation and project some further development. However, I would argue that her appearance and greeting are subtly crafted precisely to neutralize their sequential implicativeness: (a) the coucou is mute, highlighting the fact that she and Hal are on the phone (and partly unavailable) and that talk (and of course further talk) might be a perturbation; (b) the coucou may be heard as pointing to the briefness of this moment of mutual visibility; (c) that reading of coucou is retrospectively emphasized by her visual disappearance afterwards; (d) since ratified participants are expected to be on screen in video communication, moving off screen can be read as withdrawing oneself from an active participative status as speaker or recipient. The design of her return greeting and her subsequent disappearance are thus understandable as making any uptake irrelevant by ostensively displaying a fleeting involvement in the video communication at this juncture. In sequential terms, it says something like ‘though it might look like this, this is not to be seen as the proper beginning to our video conversation’, and Hal orients to this by not providing any recognizable response to her visual appearance, though she has actually become visible for the first time. Quite interestingly for our argument, she makes visible here the kind of work which is needed to neutralize some of the sequential implications of visual appearances and greetings in video communication.
2.8A sequentially frustrating succession of finely coordinated appearances and withdrawals
31. Tim (inc) ((to his mother))
32. Hal mais moi j’risquais *d’être à Dijon# *à
33. Im  # Image 5.6
34.     *((reappears into screen))
35. Tim *((moves closer))
36. Hal #[c’moment-là
37. Tim |COMME [T’ES BEAU ((said getting close to screen))
        how beautiful you are
38. Im  #Image 5.7
Image 5.6 Image 5.7
39. Hal *#passqu’y a::(.)*# Amandine qui passe le 5 juin *euh (.) à la
40. Tim *((moves back and pauses, looking at the screen for effect))
41. Hal *((turns back and starts to move away))
42. Tim *((looks left))
43. Hal télé et j’aimerais bien le voir (1.0) hh d’ailleurs faut que
44.     j’te file euh:: *#faut que j’te file *#euh::[son : son disque (.)
45.     *((turns and starts pacing back towards screen))
46. Tim *((looks back at screen))
47. Im  #Image 5.8
48. Hal *((faces screen))
49. Tim *((moves to screen))
50. Tim [tu téléphones à qui?
        you’re phoning to whom
51. Im  #image 5.9
Image 5.8 Image 5.9
A few seconds later, Hal walks back into the screen, and stands in front of it (Image 5.6). The child orients to this talking head-like appearance (though Hal is talking on the phone): he gets close to the screen and utters an assessment of Hal’s visual appearance (Image 5.7). He then stands back and pauses looking at the screen, apparently expecting a response, which does not come: Hal goes on talking, turns his back and walks to the other end of the room. As he does that, the child turns away from the screen, looking back at his mother, who is still on the phone. Then, as Hal starts to turn back towards the screen, the child looks back to the screen and starts to monitor Hal’s actions (Image 5.8). When Hal walks back and stands once more in front of the screen (in effect appearing again in a way that can be understood as making him visually available as a potential co-participant to a video communication, Image 5.9), Tim seizes the opportunity to ask a question about his being on the phone, one which points to his unavailability. Once again, he is denied an answer and Hal moves away.
This little choreography will be repeated a few more times, with the child producing gradually less sequentially-implicative utterances, such as mere vocalizations, as Hal appears again on the screen. Thus, even though he seems to habituate to the lack of response, the child still orients repeatedly towards treating Hal’s appearances as sequential opportunities for him to talk, and to expect answers. He thus seems to grasp fairly well, even at his young age, the morally implicative sequential order of visual appearances, and the interactional relevance of talking head configurations for video-mediated communication, while still having difficulties with handling the consequences of Hal’s displays of unavailability and multi-activity with respect to the interactional management of those same visual appearances.
2.9Actually beginning the video conversation: A third greeting sequence
65. Hal bisous ciao ((on the phone))
        kisses ciao
66.     ((hangs up, turns, comes *#back and bends towards screen))#
67.     * ((changes position))
68. Im  #Image 5.10a
69. Im  # Image 5.10b
70. Hal #salut les genoux
71.     hello les genoux
72. Im  # Image 5.10c
Image 5.10a Image 5.10b Image 5.10c
73. Léa coucou
74. Léa # ((sits so as to get into the frame)) # Image 5.11
75. Tim comme t’es *#BÔ::::Ô
        you’re so beautiful
76. Léa *((sits progressively to left)) # Image 5.12
77. Hal ah c’est gentil toi aussi t’es beau
        ah that’s nice you’re beautiful as well
78. Léa ouais il a son nouveau (inc) t-shirt
        yeah he’s got his new t-shirt
Hal eventually finishes his phone call, moves back and leans close to the screen, thus making himself ‘appear’ once more (Images 5.10a to 5.10c). He produces a new greeting “hello the knees” (Line 70). This unusual greeting is framed and hearable as ironical on the basis of the moral order of video-communication, that the face of speakers and recipients should be visible on screen. This makes it possible to pick up what appears on screen as a candidate ‘greetable’, whether knees or something else. But of course knees are not a part of the body usually involved in addresses and greetings, and knees cannot return a greeting. Moreover, since Hal has just displayed for the first time a full involvement with the video conversation (hanging up the phone, coming close to the screen), this new greeting, again done under the auspices of the project to start a Skype video call, can be seen as orienting more specifically towards the possible beginning of the call. The multimodal context and the unusual design of the greeting conspire to project strongly a response. Since the greeting makes Léa’s partial invisibility a salient feature of the situation, it may be heard as an oblique and indirect request (by Hal) for her to show herself if she can. She would appear to hear it in that way, and thus to orient to the same visual moral order: she moves back into the screen (Images 5.10a to 5.10c), which incidentally shows that she is not on the phone anymore. Just as she starts to appear visually (Image 5.10c), she utters a return greeting (Line 73). Her choice of greeting is a second position coucou, which nicely highlights their joint visibility and availability, the kind of display of a joint involvement achieved for the first time in this sequence. The child seems to pick up that something is going on sequentially. 
He seizes the mutual visibility and the accomplishment of the greeting sequence as an opportunity to repeat his previously ignored assessment, showing some persistence in his pursuit of an answer and refusal to be denied one. He will get one now from Hal, which shows that all participants are orienting to this as the proper beginning of the video conversation.
This whole sequence therefore provides a nice instance of a complex opening in which, for contingent reasons, participants appear, from the start, unavailable for video communication though they have, in fact, initiated that communication. It shows how the interactional dynamics of this pre-opening sequence can be driven by the way participants craft various forms of appearances in its course, appearances which make responsive actions relevant and occasion multiple greetings. Moreover, the young child’s behavior shows he is already competent to detect and manage the sequential implications of appearances, though less so to treat the moral and sequential consequences of visible displays of multiple involvements and multi-activity. Taken as a whole, the Skype call proceeds like a kind of dance of appearances and multiple greetings that involves all, child and parent, remote party and those together, trying to peer into the screen and be noticed by the one far away.
I started with the observation that the openings of Skype conversations are often complex and protracted sequences, the unfolding of which is subject to contingencies of various origins. As a consequence, openings in Skype video communications exhibit a relatively high degree of variability and it is somewhat difficult to identify a standard opening sequence as has been done with phone conversations. Unlike in phone conversations, either the caller or the call recipient in Skype communications may happen to speak first. In addition to this, and because of the spatio-temporal discontinuities that can be introduced by video-mediating technologies (in the sense that the screen and camera can constrain what is seen and shown, and that temporal patternings in hearing and seeing can also be unsettled by failures in signal), video conversation openings can unfold not only as complex but as a succession of appearances. These are events to which participants orient as cuing how they have (or believe they have) been put into and made available for the ‘presence’ of one another in a very particular way – in the very instance of the seeing, in the very second of the noticing; appearances can repeat themselves as if actors keep walking back on stage and demand that the audience respond.
In my analysis, I have shown how participants treat such appearances as noticeable and as sequentially implicative first pair parts, projecting some next action, and do so in the form of a responsive move, oriented to the particulars of the appearance. Whoever experiences some occurrence as the potential appearance of a co-participant therefore behaves as entitled to respond to it and, particularly in openings, where such appearances occur under the auspices of a project to initiate a video conversation, does so as if expected to. As the example of the child in the last fragment shows, managing the sequential implications of appearances seems to be acquired early in life, a basic interactional skill that may have only been enabled by recent technologies but has evidently suffused the skills of the vast majority of those who engage in remote communicative practice.
These expectations and skills are, needless to say, normative, and may be reinforced when the establishment of a joint interactional space is experienced as a delicate, complex and fragile accomplishment. My evidence suggests that this is often the case with Skype video-mediated communication. Recognizing and noticing whatever may count as an appearance and thus making it interactionally relevant in Skype calls becomes a powerful resource in the process of achieving, collaboratively, a proper joint interactional frame of interaction – the purposes of a call. This accounts for why a caller or a callee in a Skype communication may speak first: either can orient to such expectations and can provide a ‘noticing’ of the appearance of the other on the screen at the first opportunity, whenever that might be. It can even be achieved by simply self-selecting to speak first. This speaking first can also be repeated, as a sequence of actions later in a Skype call, depending on how the callers manage the frailties of their purposes, the unreliability of their Skype connection, and the contingencies of their domestic spaces. Moreover, when such appearances can be understood as ‘appearances-for-the-first-time’ (which is the case in openings: openings are precisely a lapse of time in which appearances are likely to be ‘appearances-for-the-first-time’ and oriented to as such), the projected next action becomes a greeting as well. Yet sometimes all types of appearances seem to be conflated and occur at once, at the start of a video call. Then participants engage in greetings and re-open (so to speak) the call. But quite often, different kinds of appearances may occur in succession (participants may first message one another, launch the call, get the audio feedback before they become visible, get visible before they are visually displaying a proper involvement by orienting to the screen and gazing at it, etc.) 
and asymmetrically (one participant may get perceptual access to the other before the other way round). Consequently, and because of the sequential implications of appearances (providing for a kind of noticing which frames retrospectively the occasion as a first time appearance and offering a slot for the co-participant to (dis)confirm the mutuality of the occasion), participants will attend to them and are expected to attend to them as they contingently happen. The more protracted the connection and establishment of a joint audio-visual focus of attention, the more Skype video call openings might look and be done as an apparent choreography of appearances and multiple greetings, a mutually elaborating set of movements and articulations that bind the participants together as much in that very moment in time as they do for the next steps in the call, in the future moments as it were, once the greetings have been done.
The choice and design of greetings in multiple greeting sequences is then, and unsurprisingly, finely tuned to the particulars of the situation (the kind of appearance-for-the-first-time they could be seen as responsive to) and to what a particular choice of greeting might do. As a case in point, I have discussed how the French greeting coucou can be a common feature not only in the openings of French Skype conversations, but later on too, dependent upon the contingencies that affect that call and indeed the playfulness of the parties involved who might want, in their playfulness, to make such contingencies happen. Coucou is the kind of greeting which not only builds on the occurrence of a recognizable appearance as a relevant sequential context for its utterance, but also seems to point to appearance as process, as something to do: it highlights a potential appearance as an occasion in which a mutual perceptual field and focus may have been achieved and is sought. Coucou is therefore particularly relevant in an interactional setting where the establishment of a joint interactional frame is a sensitive issue, whether because it is a delicate accomplishment given that the technological mediation makes it so (as in the case of VMC), or because it is an opportunity for ludic behavior in such mediated communications (and this accounts for its original use in children’s games of hide and seek, of course). So while it is obvious that video communication is about being able to speak to, and see one another from, a distance through a screen, the establishment of a proper interactional frame for such communication involves participants hearing and seeing one another and demonstrably displaying focused and concerted involvement by orienting towards the screen. What I have shown is that they have various artful ways of doing so, some more commonplace than others – performing in the recognizable ‘talking heads’ mode is perhaps the most obvious. 
Indeed, that this is so is also a resource for the visual moral order of video communication: departures from that talking heads mode are noticeable and accountable – a resource for complaint, compliment, mirth and seriousness. With respect to openings, such a normative visual organization frames what might constitute a proper beginning for participants, i.e. a shared mutual appearance as talking heads, occasioning (possibly, but not necessarily offering the final instance of) greetings. Other kinds of events which can be treated as appearances, such as hearing one another, can also be recognized as intermediate steps towards the establishment of a ‘proper’ video interactional frame, and the greetings they occasion as not signaling the beginning of the video conversation, but its recommencement, a focus that is renewing itself. Such an organization also makes some visual occurrences recognizable as appearances of sorts, and in some cases even worthy of a greeting, as when one participant is visible, but is not gazing at or orienting to the screen, and she suddenly turns to it for the first time (Fragment 4). Conversely, treating such an occurrence as an appearance and an occasion for a greeting displays an embedded orientation towards joint talking heads as the expected configuration for video conversations. This visual dimension also accounts for the relevance of the use of coucou. Though it can be used for all kinds of appearances (we saw it used as a greeting at the start of instant messaging sequences), coucou used as a greeting particularly highlights sudden mutual visibility and attention.
Beyond this, analyzing the initiation of video conversations in terms of appearances/noticings, and appearance-for-the-first-time/greeting pairs, not only allows us to bring some order to the apparent variability and complexity of Skype video calls. It also brings up new insights with respect to what might constitute a greeting (and by extension, an appearance). Since appearances and greetings are sequentially tied and mutually elaborative, and, as a pair, highlight the reflexivity between talk and its setting, whatever follows a recognizable appearance-for-the-first-time may constitute a greeting, even if not conventionally designed as such. Conversely, uttering a greeting displays a participant’s understanding of its immediate sequential environment as constituting a sort of appearance, and makes it investigable as such for the analyst just as much as for participants. The kind of analysis I have developed here may be extended elsewhere, for expectations regarding the recognition and treatment of events as sequentially-implicative appearances are constitutive features of the moral organization of openings in many, if not all, social settings. How participants ascribe and are accountable for ascribing the status of appearances to various occurrences allows a finer understanding of how the sequential organization and enactment of participative roles constitutive of any social encounter may emerge from a continuous stream of embodied conduct and lived experience. It points towards what I have called the dance of communication – a dance which is quite literally a choreographed movement of pairs of bodies (or more), and where the dancers work together despite their physical separation. 
It also opens up the possibility of cross-cultural comparisons regarding how the phenomenal presence of agents of all kinds, whether friends or strangers, humans or animals, ghosts or gods, may be made manifest in communication settings and how, in these manifestations, the participants claim some form of existence and social presence in their sites of communication. It points also to how, within these, they demand to be recognized, noticed, greeted, or ignored; this may be a dance, but it is a consequential one. This dance matters. And its consequences are shown in the multiple ways users of Skype technology greet each other; not once but again and again, doing so when the mood seems right, when the dance is in ‘sync’; when a coucou is not a spontaneous reaction, but a coordinated play that comes across as spontaneous; as a move, in other words, in the choreography of twenty-first-century communicative practice.
Address for correspondence
I3-SES, CNRS, Télécom Paris Tech
Telecom ParisTech, Paris