Streaming Chat and Connective Effervescence

or, the end of the Feed and the beginning of the Stream

With AOC-on-Twitch becoming a regular thing, it seems like a good time to investigate the power of “streaming chat” as a media technology.

In earlier media regimes, with theater, opera, and political speeches and debates taking place live, audiences were active and involved. The audience member was constantly bombarded with boos and cheers, shouted slogans or laughter. The mass media era of the 20th and early 21st centuries entailed the “pacification of the audience.” People became accustomed to consuming broadcast media in their own homes, and their perception of the role of the audience changed.

Streaming chat represents a return to a more active audience, to a re-orientation of the experience of being an audience member.

The format of the streaming chat also seriously constrains how people choose what to say. The primary difference between participating in a streaming chat and leaving comments in other online spaces is the speed of the chat. Protracted conversations are impossible; instead, commenters primarily react directly to what is happening in the online broadcast. The shared knowledge that the communal attention is focused on the video media makes it possible to speak without specifying a referent. Commenters can say “that was awesome” and rely on temporal proximity (contiguity) for mutual context.

This means that the already ferocious competition for attention in online platforms is only heightened. There are many stimuli happening simultaneously, and the main broadcast covers the majority of the viewer's screen.

Connective effervescence—the digital crowd

The primary purpose for many commenters is thus better understood as collective expression than anything approaching persuasion, let alone deliberation. A useful metaphor is that of a fan at a sporting event: there is an expectation of yelling or otherwise vocally responding to the focal action. The goal is partially to antagonize your opponents and hearten your allies, but more directly, to become part of a living mass of people, to experience the digital analogue of Durkheim's “collective effervescence.”

Although previous scholars have applied the concept of collective effervescence to describe other forms of digital communication, we argue that the technological affordances of streaming chat make this a qualitatively different experience. Text-based digital media are constrained by the rhythms of writing; the latency is too high to approach the experience of being a live crowd of people. One-to-one (or small group) videos are low latency, but cannot numerically create the crowd experience; bottlenecks in information transmission like internet bandwidth, screen definition and audio quality mean that even our advanced communication technology cannot replicate the live crowd.

I want to drive this home. The hardware, software and userbase of the internet are all changing rapidly, in hard-to-observe ways. As an old person and an academic, my experience of the internet has for years been calcified at the level of technology I’m most comfortable with: I mostly just look at Twitter on my desktop. But the technological frontier is shifting, and forms of digital communication that were once impossible are now real. I frequently catch myself thinking with the implicit assumption that these parameters (mobile broadband, video upload speeds, camera quality, distribution of video editing skills in society) are fixed. But they aren’t fixed; instead, they are the digital bedrock on which the platforms that dominate our imagination of online politics are built.

Streaming chat around a focal broadcast accomplishes both low latency (frequent updates) and distinguishable actors (hundreds of distinct text posts): short bursts of emotion that allow each commenter to perform their role as part of the crowd, thereby experiencing what I call “connective effervescence.”

Where earlier social media is divided into posts, a Twitch or YouTube stream is just one long flow, happening in real time.

One important implication is that sophisticated content moderation becomes completely impossible ex ante, which is why streaming companies have adopted punitive measures for ex post content moderation. See below for one of the most important moments in the career of one of the world’s most famous humans.

Ex ante moderation of chat is possible only through blunt bans on certain words. Just like in the Chinese case (where blunt bans on words or topics gave rise to image- or homophone-based puns), streaming chat has rapidly evolved a colorful language of specialized emoji and text memes that limit the effectiveness of this style of content moderation.

Even more than Twitter, the context for the speech that comprises connective effervescence is temporally local; most of the comments mean nothing without the context of the stream, and they certainly aren’t intended to be read later. This is, of course, how almost all human conversation works outside of formal settings.

We should expect to see communication technology continue to develop in this direction. This chart, by Tasha Kim, is a great visualization of the recent history of media technology, and correctly identifies TikTok as an entirely new beast due to its scope and virality potential.

However, I think that Twitch and other streaming chats conducive to connective effervescence belong in a separate category, Social Media 4.0. That’s a silly distinction, but despite the power of TikTok’s affordances, it remains stuck in the world of the individual post, uploaded and then static forever. In my schema, Twitch and other streaming chats belong in their own bubble that also includes the upstart Clubhouse, which uses real human voices to create real-time conversations that flow. The tectonic shifts in the bedrock of digital politics (the hardware, software and user skillbase stacks) mean that the feed will be replaced by the stream.

From the perspective of the researcher, Social Media 4.0 poses huge challenges. Our intuitions about how it works can only be developed by spending time with it; there are no old media analogues, and intuitions developed by time on Twitter or Facebook will be badly misguided. Data collection is also dramatically more difficult, as these “platforms” do not naturally store data in a format that can be easily accessed after the fact. Theoretically, the effects of using streaming social media will likely be distinct from those of previous forms of social media, but we should still expect those effects to be large: television/radio + a social component.

Impermanence joins encryption as a tool for rendering more and more digital communication difficult to research at scale. Rather than continuing to treat the “clearnet” of Twitter, Reddit, Facebook, YouTube, and honestly mostly Twitter as at all representative of digital communication, quantitative researchers will need to come to terms with the limitations of the new age. It’s weird thinking that the age of “peak legibility” of digital media is already past, but given how things have been going, I’m not opposed to moving towards a more human social media.