05 2022

Transversal Sounds – Part I

Some ideas for thinking about audio publishing (and the auditory more broadly)

Kelly Mulvaney

Seemingly without prior consultation, it became clear around a year ago that several people involved in transversal texts had become curious about expanding the work of the little nonpublisher-machine of books and blog and multilingual webjournal into sound. Surely in the bigger picture, the timing was no coincidence—the pandemic, not least, comes immediately to mind. Nor were the personal interests in sound necessarily new. And yet there was something delightful to the apparent coincidence.

By now, first attempts have been made. A general idea has taken shape that the practice will be to produce various things, from podcasts, on one end, to text-related sonic-artistic experiments, on the other. And perhaps some things in between.

As production sets out, an exciting and yet expectedly slowgoing undertaking for a disparate bunch otherwise held together, among other things, of course, by various numbers of decades of engagement in text production, the following text (yes, text!) is written in the spirit of initiating some common thinking about sound. Common thinking in the sense that it may well be questioned, and most hopefully will be supplemented and extended. Above all its intention is to spur practical thoughts: by presenting some ideas about the current moment of audio formats, the specificities of sound in relation to text, and the common ways the two are culturally distinguished, the text is an invitation to think in more precise terms about what transversal sounds might be (and perhaps what might be avoided).

The limits of what follows are my own—the heavy drawing on Anglo-American anthropological literature and absence of much else, my newcomer’s status in relation to the topics. At a certain point, however, I also had to choose omissions, and therefore there is one glaring absence and one striking narrowness: the absence is a consideration of the digital, beyond the initial consideration of contemporary social conditions; the narrowness regards the restriction of what is considered to the sociocultural formation of Western modernity/contemporaneity, which is moreover presented somewhat schematically.

If what follows is not comprehensive, it is also not conclusive. In the place of a closing discussion, the text opens into a short list of sound-based art, research and activist works that transversal might learn from, briefly commented and with links.

The currency of sound – why now?

Why sound now? Like any question, this one arises in a certain situation; we ask it, for one, in the wake of a rapid expansion of possibility and popularity conditions for audio formats in publishing. As an acquaintance recently remarked: today anyone who is important has a podcast. Of course, it’s not only podcasts. It’s audiobooks, Spotify accounts, YouTube listening, Soundcloud and Bandcamp publishing. The current moment of audio-based “content” sharing, to redeploy a concept that is part of the phenomenon (and one that transversal can critically engage), takes shape as new aesthetic and discursive forms develop out of and promote novel digital and platform-capitalist commodification conditions—in all their subjectivational, geopolitical and technecological dimensions and contradictions. A key term to think about such new publishing formats might be “portability”, a concept anthropologists Anya Bernstein and Paul Kockelman deploy “to characterize the degree to which the meaningfulness and (means-ends-fullness) of a semiotic technology is (or at least seems to be) widely applicable and/or contextually independent.”[1] Categories such as ‘content’ (or ‘information’) rely on a high degree of culturally presumed and technologically enabled portability in this sense, implying processes of cultural/psychosocial and infrastructural commensuration reminiscent of language standardization and commodification. And yet, as will be expanded below, the dominant occidental ideology (and for some, the proposed ontology) of sound and especially of voice as singular and non-commensurable posits the latter as resistant to such processes.

Considering the portability of audio formats and their ‘others’ in the simpler phenomenological sense of “carry-ability,” an audiobook, given the availability and possession of the required digital-electronic means, seems to be more portable than a visual-text-based book (paper or electronic), a podcast more portable than an audiovisual recording of a conversation. Like an e-book or video file, an audio format can be carried on a smartphone. But it can also be accessed, engaged, consumed while human carriers port themselves, are auto-mobile, or oppositely, in a state of what appears to be almost total inactivity. Compared with text that must be read, or an audiovisual recording that must be watched, the audio-only format leaves the eyes free to navigate their broader environment—or to rest. The audio-only format would seem to be porting less: less data/demand/intensity. And maybe this experienced “lessness” of sound is part of its current popularity—amidst subjective, geopolitical and technecological conditions of exhaustion.

Given ambient stress, sound can be distraction. Given productivity pressure, auditory information consumption can be part of an effort to multitask, to listen to one thing while the eyes engage another, or to remain productive longer, after the eyes can do no more. In a setting of visual information overload and simultaneous information addiction, there is a turn to sound. Even as sound might seem more sensorially demanding than a text (for it is literally louder), if the eyes are particularly exhausted and reading requires a certain level of concentration, then the experience is qualified differently. My drowsy eyes staring at a text might be frustrated, might fall asleep; half-listening to a podcast in the same state, given a certain tone I might be pleasantly displaced, lulled to rest, given another I may feel enlivened. What of this lulling quality? And what of the capacity of sound to arouse? The point is not that visual/text formats cannot evoke the same feelings, but rather to think about the distinction. If loneliness and alienation are endemic in modern and neoliberal social life, they are even more prevalent in the pandemic society. Could the great rise in popularity of audio formats be related also to something about sound, and particularly about the sound of human voice, that can feel present, intimate and personal, soothing or rousing, in a way not evoked by words on the page or screen? Sound studies scholar Michael Bull, drawing on Adorno’s writing on music, would offer a qualified affirmation: the auditory is unique in its capacity to evoke feelings of ‘we-ness’; the relationship between technologically mediated music and the desire for social attachment is ambiguously utopian.[2]

Social/cultural theory of sound: a modern other

In social and cultural studies of sound, as both medium and sensorially as hearing, sound is most frequently distinguished from the visual and from text. For purposes of initial analytical clarity, we might think of ‘the visual’ and ‘the auditory’ as connoting broader fields of sensory experience, while ‘text’ and ‘voice’ conventionally pertain more specifically to human communication. Text in this analogical pairing is considered a visual technology. These categories may be useful, but they are also questionable constructs. What follows is not so concerned with maintaining strict relationships: sound is considered here in relation to text, there in relation to the visual.

Sound distinguishes itself from text sensorily, in its relation to the embodied psyche. The point of departure for considering this distinction in much social and cultural theory of the senses comes by way of reinstating the dominant categorical dualism of modern thought: text as abstract, sound as concrete; text as virtual, sound as embodied; text as rational, sound as sensory/intuitive; text as language, as standard; sound as speech, as vernacular. More implicitly but no less powerfully: text is male, sound is female; text is European, white, sound is non-European, black. Text is conceived in terms of ‘print capitalism’ as quintessentially modern,[3] and sound—not least as corollary to the argument that modernity is essentially visual—as primordial or archaic.[4] The schema, of course, has political implications: text becomes potentially public, autonomous, and generally interpretable; sound, including spoken discourse, is inevitably private, situation-dependent, and restrictedly interlocutionary.[5]

Even if ‘schizophonic’ technologies, i.e., technologies that allow voice to be separated from its source,[6] have made spoken discourse more publicly accessible and situationally detachable today, the authority granted to and conjured by text as public, autonomous and interpretable (and culturally masculine/white) lives on. This is the case especially in academic/scientific standards of what counts as knowledge and professional output, which themselves are one, if not the primary, site of that authority’s institutional emanation. And yet the relegation of sound to the ‘non-modern’ is inaccurate: sound technologies have been integral to modern subjectivation, and the boundaries between spoken and written word have never been as strict as some nineteenth- and twentieth-century social theory imagined.[7] The assumption behind the idea that the senses can be conceived as separate from and privileged in relation to one another is misleading, writes anthropologist and ethnomusicologist Veit Erlmann, insofar as “it makes scientific sense to conceive of the senses as an integrated and flexible network but also, and more importantly, because arguments over the hierarchy of the senses are always also arguments over cultural and political agendas.”[8] And yet cultural and political agendas, of course, can be very powerful. It may be inaccurate to consider the senses as separate, to consider the visual to be more ‘objective’ than the auditory, and yet this is common practice.

The discipline of anthropology is illustrative of how sound technology has been used as the subordinate other to and in the service of authorizing the textual. The tape recorder as a sound technology played a key role in the development of the modern, text-based discipline and its claims to scientific authority. Considered in simple terms of information theory, the task of the historical field ethnographer was to record oral knowledge, then to distinguish signal from noise in those recordings and to render the former legible in text. In this way ethnographic texts contributed to the state-led process of producing ‘legible subjects’ traceable by text-based bureaucratic technologies, textualization implying controllability. This techno-institutionally endowed, hierarchical sound-to-text translation relationship remains a cornerstone of ethnography and plays a key role in other social domains (think of bureaucratized/mediatized fields that rely on this kind of translation, from social work to medical documentation, and perhaps somewhat differently, journalism). In the meantime, a variety of approaches have been developed and probed with the aim of managing this relationship with critical reflexivity and care. But it is also telling that a common tool for calling attention to such relationality is the question “who speaks for whom?” (rather than, more literally, who inscribes on behalf of what speakers?). The sound-to-text aspect of the relationship is ignored, along with the differences between the two modes of perceiving/expressing/knowing.

This erasure occurs easily insofar as the same general qualitative distinction drawn between text and sound is projected onto other relationships. The phrase “who speaks for whom” is illustrative insofar as it points out hierarchically distinct social positions but also ways of speaking. Ways of speaking that are closer to standardized written language are generally thought to be more understandable and thus more transparent, and more authoritative and valuable, than less standard-proximate speech. This kind of ideological-sociolinguistic ordering (attaching sociocultural qualities to sound) tends to buttress the habituated ways people link ways of speaking to social identities (attaching social identities to sound).[9] Both processes are inseparable from writing technologies and political authority and histories of language standardization, but they operate as sound-based phenomena, wherein the public/private, general/particular distinctions deployed to distinguish text from speech, in the implicit and often unintended service of certain cultural and political agendas, are (re)deployed to hierarchically distinguish vocal sounds.[10] This occurs in everyday social life. Somewhat differently, a similar mechanism can be found in theories about what speech should count as public discourse or communicative action, although here public/private is a starting point from which kinds of speech are distinguished from one another in terms of semantic content. The point here is not so much to precisely trace these recursions as to suggest that at this level of social authority it is perhaps the “fractal recursion”[11] of the distinction and the devalorization of the auditory(-like) in relation to the text(-proximate) that is decisive.

Sound studies and voice studies: re- and transvaluations

Against this backdrop, it is not surprising to find strategies of those engaged in (critical) sound studies and sonic artistic practice aimed at the revaluation and transvaluation of sound(s). On one end of a spectrum, there are attempts to focus on and draw out the special uniqueness of sound, including in its imbrication with other culturally devalued senses, especially touch, and to demonstrate its richness and forgotten histories. Such approaches complicate easy separations but tend to maintain the dominant categories. Erlmann’s edited volume Hearing Cultures, for example, as an attempt to think about sound as “lively” (as opposed, presumably, to dead), might be considered in terms of the former strategy. The book contains chapters that conceptualize the “bodily force” of speech, the historical deployment of music as medicine, and the tactility of sounds that emanates from “umbilical continuity” with their source.[12] It is exemplary of this kind of strategic approach of revaluation to claim that sound—and not only vision—is also significant to modern history. There is a sense of setting right a wrong, of recognition, and claims are made in tandem with attempts to similarly recuperate the body/embodiment.

Also attuned to sound as embodiment, architect/poet Robert Kocik and choreographer Daria Faïn inaugurated a research field they call “The Prosodic Body” and describe as “an experiential artscience that collectively explores prosody: the tone, tempo, intensity and total body language of speech, the ways in which words say what they say and say more than they can say.”[13] This probing of sounded language engages with language in its “expressive, emotional, musical, motivational, kinesic, interactional and most-intensely connotative and communicative aspects”[14] and in relation to psychoneuroimmunoendocrinology—the study of the interactions between the psyche, neural and endocrine functions and immune responses[15]—, exploring this relationship in artistic-research practice. If Kocik/Faïn’s work participates in part on the ‘revaluing sound’ side of the strategic spectrum, it also reaches such a spectrum’s other, transvaluational end, which involves reimagining/reconfiguring the evaluative categories and their distinctions. The probing of sound as embodied is not opposed to but performed together with the probing of language as symbolic and as computational.[16] It is evident in the engagement with psychoneuroimmunoendocrinology, a field that, even if marginal, is endowed with modern scientific authority. It can also be seen, for example, in their work “Re-English,” a choreographed choir performance by the Commons Choir, a multi-lingual/racial/generational performing group, which takes up English as a “duplicitous, mercenary, and commercial tongue,” and as “the hegemonic language of globalization,” and by way of staging confrontations of its dominant and minoritarian aspects aims to expand and thereby transform it.[17]

Inseparable yet distinct from the focus on embodiment in sound studies, another approach to re/transvaluing sound focuses on its dialogic nature. Sound here is considered primarily as spoken word; more than sound studies, voice studies is the decisive domain of research and practice. The strategic attempts at revaluing sound are here analogous to the above: there are attempts to valorize speech as text’s embodied and subordinate other, or oppositely, by demonstrating speech’s text-proximate properties; there is also the approach to valorizing certain dialogic forms as public in relation to other, presumably private kinds of speech. From a Bakhtinian perspective, however, a focus on dialogue can also become a matter of transvaluation. Bakhtin’s notion of dialogue illuminates the sociolinguistic and ideological aspects of both text and speech, and of speech of all kinds. All text, like all speech, is dialogic in nature: situationally perspectival and relational, formulated as address.[18] This understanding, along with other Bakhtinian concepts such as polyphony, wherein one presumed voice consists of multiple voices, and heteroglossia, which pertains to the many ways of talking and the social perspectives and power differentials these enregister within a national language, is in line with the common notions, commitments and practices of transversal texts, with the emphases on the multiple and manifold, on translation and on vernacular and nondominant languages/ways of speaking.

To think of speech as situated address raises implicit ethical and political dimensions as well as issues of translation. Questions of belonging and exclusion, of language and authority, of ‘homolingual’ versus ‘heterolingual’ address, of instituent practices and polyphonic aesthetics have been elaborated in numerous transversal journal issues and books and are all pertinent to working with sound, including speech. I won’t try to review them here. To indicate one arguably voice-specific point, however: The writer Andrew Robinson contrasts his understanding of Bakhtin’s dialogistic ethics with “liberal notions of coexistence and tolerance.” In the latter, he writes, perspectives are seen as partial and complementary, whereas in the former, “the dynamic interplay and interruption of perspectives is taken to produce new realities and ways of seeing,” and the formation of one’s perspective is a process requiring self-activity.[19] If we consider the feminist perspective of voice studies scholar Adriana Cavarero, however, it could be argued that at the core of both approaches is a conception of voice as perspective, and ultimately as semantic content or thought.[20] Both approaches thereby miss what is most important about voice—its inherent uniqueness and relationality. This is the reality of embodiment, not in a depersonalized sense of voice pertaining to ‘the body’, but in the sense that “every human being is a unique being, and is capable of manifesting this uniqueness with the voice.”[21] Cavarero frames this understanding, against the conceptions of voice as thought and of uniqueness as semantic difference, as a kind of waking up: the discovery of having a unique body, a unique voice in this sense is the discovery that one has one life to live.[22] From this comes responsibility, or what we might call care.[23]

Similarly concerned with distinguishing the sounding of voice from linguistic meaning, and similarly interested in sound as encounter, Fred Moten emphasizes not uniqueness but ensemble.[24] Cavarero introduces voice by way of a story by Italo Calvino: a king hears singing, wakes up to his humanity and surrenders his authority as king. Moten begins his discussion of sound by double reference to Frederick Douglass. Douglass hears the shrieking of his aunt being whipped by the slavemaster and knows the slavemaster to whip even harder in response to her screams; Douglass writes that it was by hearing slave songs, their anguished tones, that he first began to understand the dehumanizing nature of slavery. Working with, thinking with, being affected by sound as “inspirited materiality”[25] is for Moten a matter of cuts and breaks and improvisations that he traces through a black radical aesthetic tradition and which embody a critique and a sociality: “there occurs in such performances a revaluation or reconstruction of value, one disruptive of the oppositions of speech and writing, and spirit and matter.”[26] In this mode sound can be experienced as an encounter of the ensemble of the senses—as ensemble, each sense being neither reduced to itself nor valorized in relation to an other—the ensemble of the social, a way to something radically different that Moten, with Marx, still wants to call communism.

Responsibility/care; ensemble/communism. On the sound of voice beyond meaning, one last aspect can be considered. This regards sound and secularism. Talal Asad has written that secularism “is not only an abstract principle of equality and freedom that liberal democratic states are supposed to be committed to but also a range of sensibilities—ways of feeling, thinking, talking,” adding: “Perhaps the single most important sensibility is the conviction that one has a direct access to the ‘truth.’”[27] This notion of direct access is related to ideas of transparency, to strictly semantic and symbolic understandings of meaning, and to expectations of immediacy, sameness, and the availability of one-to-one translation—characteristic notions of the Enlightenment-scientific ideology of language.[28] With respect to sound, it would be bound up with a practical and notional sense that all sounds are, ontologically, equal, and intelligible in terms of something like truth-content. Secular sound in this sense would have no way of understanding, so argues Asad, the nontranslatability of the Qur’an in Islamic ritual context, where the worshipping recitation of Qur’anic text is performed in Arabic as sacred language, distinguishing it from, and making it untranslatable into, whatever dialect(s) worshippers use in everyday, non-ritual speech.[29] A sound is not a sound. ‘Transparent’ speech (and the kind of writing performed in this text) and the listening (and reading) styles that accompany it, despite their neutralized self-image and projection, also require a certain cultivation and comportment and particular ways of feeling, thinking, talking. And while such a mode may serve worthy ends, sound may also lead us to probe those limits.

Transversal sounds in publishing practice: some resonating works

In lieu of conclusion, and in light of the above, the expansion of print and electronic text-based publishing into sound-based formats invites us to think about the ways text and sound are and might be experienced: by writers, readers, listeners, speakers, translators, technicians, and publication caretakers (in transversal texts-speak, the “Sorgetragende” of publishing projects). Considering sound brings up aspects of embodiment, of feeling-tones, synaesthesia, intimacy and distance, and questions of how text and sound can be worked with, separately, in tandem, at once—interwoven, (in)distinguished—to message, to feel, to relate, to offer. How can adding sound to the repertoire of transversal’s work take up, and play with, expectations of sound and text, including our own? How, to propose a kind of self-investigation, does working with sound, with voice, affect us? What different moments and thresholds of control and spontaneity does it demand and allow? What feelings and relationships? What repetitions and interventions? What listeners?

How to have a conversation, with all its uncertainty and open-endedness, slippage and laughter, thinking/feeling out loud and together, process rather than product oriented—and yet publishable? How to reimagine a multilingual webjournal issue in sound, beyond a simple audio recording of translated texts? How to artistically document translation processes? How to post songtracks accompanying books? How to engage the possibilities of not only sound but of silence?

These are early but not easy questions. In probing them, we might well learn from the sound-based practices/works/collectives presented below,[30] gathered here for the likelihood that they have something to teach us, recording new questions along the way.

List of sound works

Sound as medium of political activism, relationship and transformation

Ultra-red

The work of Ultra-red centers on collective listening as an activist-research and political-organizing tool. From the group’s own mission statement: “In the worlds of sound art and modern electronic music, Ultra-red pursue a fragile but dynamic exchange between art and political organizing. Founded in 1994 by two AIDS activists, Ultra-red have over the years expanded to include artists, researchers and organisers from different social movements including the struggles of migration, anti-racism, participatory community development, and the politics of HIV/AIDS. Collectively, the group have produced radio broadcasts, performances, recordings, installations, texts and public space actions (ps/o). Exploring acoustic space as enunciative of social relations, Ultra-red take up the acoustic mapping of contested spaces and histories utilising sound-based research (termed Militant Sound Investigations) that directly engage the organizing and analyses of political struggles.”[31]

Deep Listening

The practice that came to be called deep listening was developed by composer/musician Pauline Oliveros as a transformative tool for embodied understanding, relationship and collective healing. Beginning in the late 1960s, at a moment of political despair, Oliveros developed the method in the context of the feminist movement in 1970s California, also taking up Eastern-influenced notions of embodiment. The idea is to develop a practice of hearing everything, of allowing oneself to hear everything, and thereby allowing oneself to be fully present in a meditative sense. This practice and its listening-centered cultivation of openness have been adopted in political and artistic contexts around the world. “Deep Listening is intended to facilitate creativity in art and life through this form of meditation. Creativity means the formation of new patterns, exceeding the limitations and boundaries of old patterns, or using old patterns in new ways.”[32]

Robert Kocik and Daria Faïn: C-O-M-M-O-N-S Choir

Linking health, art and politics, the choreographed-sound-poetic performances and teaching activities of the Commons Choir are based on Robert Kocik and Daria Faïn’s “Prosodic Body” research described above, with the choir simultaneously serving as a site/tool/subject of that research. The Commons Choir performs and teaches with language as sound, embodiment and expression in a deep sense: under the skin, shaping space, in virtuality, considering manifold bio-chemical-neuro-endocrino-etc. processes and the subtle body of Eastern traditions, but also movement, color, numbers, inheritance, algorithm, ideology, space, power. These aspects are engaged/performed/studied not as abstract concepts but as forces animated in and by language, and not by language in the abstract but specifically in English, with its specific burdens, violences, resources, and possibilities.

Sound in multi/transmedia formats

Don Mee Choi’s “DMZ Colony”

Winner of the 2020 U.S. National Book Award in Poetry, DMZ Colony by Don Mee Choi is a multimodal, genre-fluid work about torture and loss in the context of the Korean War. It is concerned with finding means to communicate about experiences rendered unheard, invisible and unauthorized in the war’s neoimperial aftermath. The book powerfully demonstrates how careful attention to the auditory and its physical and affective dimensions can be rendered in image and text. This is particularly the case in the way Choi depicts translations of nonverbal sonic elements in computer and handwritten text form on the page, weaving these together with other textual and image elements as the book simultaneously orbits around the unspeakable and builds arguments about poetic language and command language, about translation politics, and about neocolonialism and imperialism.

Young-Hae Chang Heavy Industries

Young-Hae Chang Heavy Industries’ dozens of digital audiovisual works consist of video footage overlaid with animated text and set to music. The three elements are often combined in unconventional, provocative ways, such that the combination itself presents an additional layer of meaning—sometimes, as seeming meaninglessness—in works that engage with issues of police violence, racism, war, Samsung, and cultural identity, to name a few. The auditory experience here is not restricted to the music track but also consists of a sonic aspect of the visual layer, for example as words in animated text are visually repeated in flashes, evoking loudness. The animated text is often explicitly (and reflexively) dialogic, reading as one pair-part of a dialogue, frequently as emphatic address. At the same time, the music expands, augments, amplifies the visual/textual experience, especially through rhythm. Most works appear in at least two languages, such that this is also a project of translation.

Sound in language-political performance

Works by Caroline Bergvall

Sound/language artist Caroline Bergvall’s installation “Say Parsley” shows how the space between word and sound is not only creative, generative and potentially future-oriented, but also historical, ghostly and violent. The work shows how the pronunciation of the letter “r” is socially effective as a shibboleth, its sounding replete with ideas about sociocultural types. Visitors to the installation hear the repetition of one phrase by different speakers, as the repetition grows eerie, evocative of the haunted nature of language as inheritance, until finally revealing the murderous end to which the shibboleth “r” was harnessed in the Dominican Republic under Rafael Trujillo, who ordered those persons supposedly distinguishable as Haitian by virtue of their pronunciation of “r” in the word parsley to be massacred. In an installation that is visually and spatially disorienting, the evident non-neutrality of sound and language leaves the auditory dimension experientially enlarged.

In the installation/performance “Crop” Bergvall reads a poem whose lines rotate between English, Norwegian and French, reflecting on how people exist and are denied existence as bodies in language and as they move, and are forced to move, between languages. The rotation of languages works as a performative, epistemic method, not least as the expectation that the three languages ‘say the same thing’ is made to break down. For most listeners, moreover, at least one of the languages will be unintelligible, so the listening experience is one of partial understanding cut through with the unknown, while the sounds’ understood referential meanings increasingly involve violence and erasure.

Sound in translation

“Total translation”

Jerome Rothenberg’s sonically attuned translations of Seneca and Navajo sound-poems are the result of years of collaboration mostly with Seneca songspeople beginning in the late 1960s, but also with an ethnomusicologist who was working with the music of a deceased Navajo singer. Rothenberg came up with the concept of “total translation” to describe the method he developed for these projects. “Vocables”—sounds without verbal/referential value but with sonic/musical and iconic/indexical “coherence”—were key elements of the original songs. But existing translations of the songs into English had either ignored the vocables or rendered them in Anglophonic transcriptions, denying their “coherence” in the original. With respect to the Navajo songs, Rothenberg’s translation solution was to create a new “field of sounds” in English: sonically-English “non-words” that could carry the “character” of the Navajo sounding “non-words” while being defined by the (associationally replete) acoustics of English. The result of this particular solution is arguably a kind of inhabiting of English by Navajo, on the one hand, and an interrogation of the distinction between ‘meaning’ and ‘sound’ on the other.


Jonathan Stalling’s multi-modal work Yíngēlìshī similarly presents an inhabiting of one language by another, here English by Chinese. Read by an English-language reader, Yíngēlìshī reads “English” (as the word sounds ‘with a Chinese accent’); read by a Chinese-language reader, the characters for “chanted songs, beautiful poetry” are seen. Stalling wrote a book of poetry and created an opera in this Sinophonic English, which becomes multi-dimensional, blurring the categories “English,” “Chinese,” “sound,” and “text.” The “origin” and “target” languages, from a translation perspective, also lose their distinction. As such the work expands and elaborates a space in-between, a both/and of English and Chinese, and of sound and writing. Experientially, it demonstrates that there is no such thing as language, just different ways of verbally and nonverbally ‘signing’ that vary across social space and time.[33]



[1] Kockelman, P., & Bernstein, A. (2012). Semiotic technologies, temporal reckoning, and the portability of meaning. Or: Modern modes of temporality–just how abstract are they?. Anthropological Theory, 12(3), 320-348.

[2] More specifically, Bull draws on Adorno to think about his own research on personal stereo users in urban space. The longer passage: “Adorno describes, importantly for my purposes, issues of cognition, aesthetics and the interpersonal. The subjective desire to transcend the everyday through music becomes a focal point of his analysis, as is the desire to remain ‘connected’ to specific cultural products. The nature of this ‘connection’ constitutes a state of ‘we-ness’ which also provides the ‘subjective’ moment in Adorno’s analysis of music reception. The ‘social’ undergoes a transformation through the colonization of representational space by forms of communication technology, and the ‘site’ of experience is subsequently transformed thus changing the subject’s ‘interiority’. This ‘transformation’ is replicated phenomenologically in the behaviour of personal-stereo users in their everyday experience. States of ‘we-ness’ thus might be seen dialectically as colonizing the user’s desire for social attachment, thereby creating new forms of experiential dependency within the emancipatory desires of the user.” Bull, M. Sounding Out the City: An Auditory Epistemology of Urban Experience, in: M. Bull and L. Back, The Auditory Culture Reader, 2nd edition, Routledge, 79.

[3] This is a quintessentially modern understanding of text, as mass-reproducible text (see, for example, Anderson, B. (1983). Imagined Communities: Reflections on the Origin and Spread of Nationalism). Of course, this is not the only kind of text, but it remains dominant today and is what we are concerned with here.

[4] See Erlmann, V. (Ed.). (2020). Hearing cultures: Essays on sound, listening and modernity. Routledge; see also Adorno in: Bull, M. (2016). Sounding out the City: An auditory epistemology of urban experience. In: Bull, M. & Back, L. (Eds.), Auditory Culture Reader, 79.

[5] James Clifford elaborates this based on Ricoeur’s concept of dialogue in: Clifford, J. (1983). On ethnographic authority. Representations, 2, 118-146.

[6] -- Schizophonic, I said.

-- Schizo-what? The group asked.

-- Schizophonic. It’s a word I invented. You know that phono pertains to sound. The Greek prefix schizo means split or separated. I was thinking of Barbara’s wonder at how a voice or music could originate in one place and be heard in a completely different place miles away. See “Schizophonia” in: Schafer, R.M. (1969). The New Soundscape: A Handbook for the Modern Music Teacher. Ontario/New York.

[7] See Erlmann, as well as publications by the Working Group “Sound Objects in Transition. Epistemes of Modern Acoustics” (Max Planck Institute, 2015-20), whose research explored this history more broadly.

[8] Cf. Erlmann, p. 4.

[9] In linguistic anthropology ways of speaking and their sociocultural evaluation are studied in terms of processes of sociolinguistic “enregisterment” and “indexical orders.” See Silverstein, M. (2003). Indexical order and the dialectics of sociolinguistic life. Language & communication, 23(3-4), 193-229.

[10] For a sociolinguistically informed critique of the public/private distinction, see Gal, S. (2002). A semiotics of the public/private distinction. Differences: a journal of feminist cultural studies, 13(1), 77-95.

[11] Gal, S. (2006). Contradictions of standard language in Europe: Implications for the study of practices and publics. Social Anthropology, 14(2), 163-181.

[12] See the contributions by Smith, Gouk, and Connor, respectively, in Erlmann (Ed.), Hearing Cultures. My reference is to Erlmann’s introduction to the volume: “But What of the Ethnographic Ear? Anthropology, Sound, and the Senses,” 1-21.

[13] Kocik, R., “Home,” last retrieved 30 May 2022:

[14] Kocik, R., “Prosodic Body,” last retrieved 30 May 2022:

[15] González-Díaz, S. N., Arias-Cruz, A., Elizondo-Villarreal, B., & Monge-Ortega, O. P. (2017). Psychoneuroimmunoendocrinology: clinical implications. The World Allergy Organization journal, 10(1), 19. Retrieved 22 May 2022 from:

[16] Kocik, R. (2013). Supple Science. A Robert Kocik Primer. ON Contemporary Practice, last retrieved 30 May 2022:

[17] Kocik, “Dearest Choir,” last retrieved 30 May 2022:

[18] Cf. Bakhtin, M. (1981). Discourse in the Novel, in: Ibid. The Dialogic Imagination. (Ed. M. Holquist) (Trans. C. Emerson & M. Holquist), 259-422.

[19] See A. Robinson, “In Theory Bakhtin: Dialogism, Polyphony and Heteroglossia,” last retrieved 30 May 2022:

[20] This is not the most generous reading of Robinson’s rendering of Bakhtin, insofar as neither the assumption of a new ‘perspective’ nor ‘self-activity’ need imply a reduction to something like ‘thought’; nor must ‘thought’ be reduced to a kind of disembodied semantic-content phenomenon. This would be the more conventional, even offhand way of reading the qualification, which I am admittedly performing here in order to draw out a distinction.

[21] Cavarero, A. (2005). For more than one voice: Toward a philosophy of vocal expression. Stanford University Press, p. 7.

[22] Cf. Ibid. Cavarero goes on to develop a politics of voice, drawing heavily on Hannah Arendt. Here, the truth of voice becomes the basis for a pluralist democratic politics of speaking, but one distinctly based on Cavarero’s understanding of voice as singularly embodied and relational.

[23] “Care” because the way Cavarero emphasizes waking up to having one life to live, a life in relation with others, recalls, I think, Carol Gilligan’s framing of the development of responsibility as care in relationships—even if the “voice” Gilligan writes of, at least initially, refers to thinking. See C. Gilligan. (1982). In a Different Voice. Psychological Theory and Women’s Development. Harvard University Press.

[24] Moten, F. (2003). In the Break: The aesthetics of the black radical tradition. University of Minnesota Press.

[25] Moten, 11.

[26] Moten, 14.

[27] Asad, T. (2020). Secular Translations: Nation-State, Modern Self, and Calculative Reason. Duke University Press.

[28] Bauman, R., & Briggs, C. L. (2003). Voices of modernity: Language ideologies and the politics of inequality (No. 21). Cambridge University Press. See especially Chapter 2, “Making language and making it safe for science and society: From Francis Bacon to John Locke.”

[29] See Chapter 2, “Translation and the Sensible Body,” in Asad, Secular Translations.

[30] For introducing me to most of the works listed below, I thank Jen Scappettone and the wonderful ‘Poetry On/Off the Page’ workshop she guided during the winter months of 2021.

[32] Oliveros, P. (2005). Deep Listening. iUniverse.

[33] See Silverstein, M. (2016). The “push” of Lautgesetze, the “pull” of enregisterment. Sociolinguistics: Theoretical debates, 37-67.