sodestream

Paper: Power and Vulnerability: Managing Sensitive Language in Organisational Communication

by sodestream Project • Monday 27 November 2023 • Permalink

Patrick Healey, Prashant Khare, Gareth Tyson, Mladen Karan, Ignacio Castro, Ravi Shekhar, Stephen McQuistin, Colin Perkins, Matthew Purver
Frontiers in Psychology, section Psychology of Language, 2024

Abstract

Organisational responsibilities can bring power but also a degree of vulnerability and exposure. This tension leads to divergent predictions about the use of potentially sensitive language: power might license it, exposure might inhibit it. Data from a large corpus of organisational emails shows that people in positions of relative power are approximately three times less likely to use sensitive words than people more junior to them. This tendency appears to be independent of whether other people are using potentially sensitive words and of whether a particular word occurs in a sensitive context. These results suggest that, in at least some circumstances, vulnerability is a more significant influence on language use than social power.



 Download

Paper: Temporal Network Analysis of Email Communication Patterns in a Long Standing Hierarchy

by sodestream Project • Tuesday 28 November 2023 • Permalink

Matthew Russell Barnes, Mladen Karan, Stephen McQuistin, Colin Perkins, Gareth Tyson, Matthew Purver, Ignacio Castro, Richard G. Clegg
Proceedings of the 18th International AAAI Conference on Web and Social Media, 2024

Abstract

An important concept in organisational behaviour is how hierarchy affects the voice of individuals, whereby members of a given organisation exhibit differing power relations based on their hierarchical position. Although there have been prior studies of the relationship between hierarchy and voice, they tend to focus on more qualitative small-scale methods and do not account for structural aspects of the organisation. This paper develops large-scale computational techniques utilising temporal network analysis to measure the effect that organisational hierarchy has on communication patterns throughout an organisation, focusing on the structure of pairwise interactions between individuals. To this end, we focus on one major organisation as a case study – the Internet Engineering Task Force (IETF) – a major technical standards development organisation for the Internet. A particularly useful feature of the IETF is a transparent hierarchy, where participants take on explicit roles (e.g., Area Directors, Working Group Chairs), and because its processes are open we have visibility into the communication of people at different hierarchy levels over a long time period. Exploiting this, we utilise a temporal network dataset of 989,911 email interactions among 23,741 participants to study how hierarchy impacts communication patterns. We show that the middle levels of the IETF are growing in terms of their dominance in communications. Higher levels consistently experience a higher proportion of incoming communication than lower levels, with higher levels initiating more communications too. We find that, overall, communication tends to flow “up” the hierarchy more than “down”. Finally, we find that communication with higher levels is associated with future communication more than for lower levels, which we interpret as “facilitation”. We conclude by discussing the implications this has on patterns within the wider IETF and the impact our analysis can have for other organisations.
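
As a toy illustration of the "up" versus "down" flow described in the abstract, the sketch below counts pairwise interactions by direction. It is not the paper's method: the level assignments, the names, the events, and the function `flow_direction_counts` are all hypothetical, and the timestamps are carried along only to show the shape of a temporal edge list.

```python
from collections import Counter

# Hypothetical hierarchy levels (assumption): larger number = higher level.
LEVEL = {"alice": 3, "bob": 2, "carol": 1, "dave": 1}

# Hypothetical timestamped interactions: (time, sender, recipient).
EVENTS = [
    (1, "carol", "bob"),
    (2, "dave", "alice"),
    (3, "alice", "bob"),
    (4, "bob", "alice"),
    (5, "carol", "dave"),
]


def flow_direction_counts(events, level):
    """Count interactions that go up, down, or stay level in the hierarchy."""
    counts = Counter()
    for _time, sender, recipient in events:
        diff = level[recipient] - level[sender]
        if diff > 0:
            counts["up"] += 1      # recipient sits above the sender
        elif diff < 0:
            counts["down"] += 1    # recipient sits below the sender
        else:
            counts["same"] += 1    # both at the same level
    return counts


counts = flow_direction_counts(EVENTS, LEVEL)
up_fraction = counts["up"] / len(EVENTS)
```

A fuller analysis would presumably bucket the events into time windows and compare the fractions across windows, which is where the temporal ordering starts to matter.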



 Download

Paper: Tracing Linguistic Markers of Influence in a Large Online Organisation

by sodestream Project • Friday 18 August 2023 • Permalink

Prashant Khare, Ravi Shekhar, Mladen Karan, Stephen McQuistin, Colin Perkins, Ignacio Castro, Gareth Tyson, Patrick G.T. Healey, Matthew Purver
In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics, 2023.

Abstract

Social science and psycholinguistic research have shown that power and status affect how people use language in a range of domains. Here, we investigate a similar question in a large, distributed, consensus-driven community with little traditional power hierarchy – the Internet Engineering Task Force (IETF), a collaborative organisation that designs internet standards. Our analysis, based on lexical categories (LIWC) and BERT, shows that participants’ levels of influence can be predicted from their email text, and identifies key linguistic differences (e.g., certain LIWC categories, such as “WE”, are positively correlated with high influence). We also identify the differences in language use for the same person before and after becoming influential.



 Download  Code and Data

Paper: LEDA: a Large-Organization Email-Based Decision-Dialogue-Act Analysis Dataset

by sodestream Project • Wednesday 16 August 2023 • Permalink

Mladen Karan, Prashant Khare, Ravi Shekhar, Stephen McQuistin, Colin Perkins, Ignacio Castro, Gareth Tyson, Patrick G.T. Healey, Matthew Purver
In Findings of the Association for Computational Linguistics: ACL 2023

Abstract

Collaboration increasingly happens online. This is especially true for large groups working on global tasks, with collaborators all around the globe. The size and distributed nature of such groups makes decision-making challenging. This paper proposes a set of dialog acts for the study of decision-making mechanisms in such groups, and provides a new annotated dataset based on real-world data from the public mail-archives of one such organisation – the Internet Engineering Task Force (IETF). We provide an initial data analysis showing that this dataset can be used to better understand decision-making in such organisations. Finally, we experiment with a preliminary transformer-based dialog act tagging model.



 Download  Code and Data

Paper: Errare humanum est: What do RFC Errata say about Internet Standards?

by sodestream Project • Monday 17 July 2023 • Permalink

Stephen McQuistin, Mladen Karan, Prashant Khare, Colin Perkins, Matthew Purver, Patrick Healey, Ignacio Castro, and Gareth Tyson
In Proceedings of the 7th Network Traffic Measurement and Analysis Conference (TMA) (pp. 1-9). IEEE, 2023

Abstract

Protocol standards, such as RFCs developed by the IETF, are crucial for the correct operation of the Internet, but many are published containing errors. The RFC Editor allows people to report errata, allowing anybody to flag such errors for subsequent correction. This represents an important part of the RFC publication process, and may reveal ways in which standards can be improved. This paper performs the first study of RFC errata reports. We characterize and perform a statistical analysis of the scale and nature of these errata and explore who submits them. Finally, we evaluate the impact, in terms of the number of errata filings, of three different strategies that are designed to improve the standards process. We find that specialist review teams and formal language checkers can reduce the volume of errata filed against standards documents.



 Download  Code and Data

Paper: The Web We Weave: Untangling the Social Graph of the IETF

by sodestream Project • Monday 21 March 2022 • Permalink

Prashant Khare, Mladen Karan, Stephen McQuistin, Colin Perkins, Gareth Tyson, Matthew Purver, Patrick Healey, and Ignacio Castro
In Proceedings of the International AAAI Conference on Web and Social Media, 2022

Abstract

The Internet Engineering Task Force (IETF) is responsible for producing the standards that underpin the web (e.g., HTTP, WebSockets, and WebRTC). While the IETF follows an open, consensus-driven process, protocol standardisation is inherently social and political, and latent influential structures might exist in the community. Exploring and understanding these is essential to ensuring the IETF's resilience and openness. We use network analysis to explore the social graphs of IETF participants and the influence that key contributors have. We show that a small core dominates: the top 10% of participants contribute 43.75% of emails and come from a relatively small set of organisations. On the other hand, we also find that influence has become relatively more decentralised with time. IETF participants also propose and work on protocol drafts that are either adopted by a working group for further refinement or get rejected at the early stage. Using the social graph features combined with email text features, we perform regression analysis to understand the effect of user influence on the success of proposed protocol drafts being adopted. Our findings offer useful insights into the behaviour of participants across time, the correlation between influence and success in draft adoption, and the significance of affiliated organisations in the development of protocol drafts.
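
A concentration figure such as "the top 10% of participants contribute 43.75% of emails" can be computed from per-participant email counts. The sketch below shows one plausible way to do so; the function `top_share` and the sample counts are hypothetical and are not taken from the paper's code or data.

```python
import math


def top_share(email_counts, fraction=0.10):
    """Share of all emails sent by the most active `fraction` of participants."""
    if not email_counts:
        raise ValueError("email_counts must not be empty")
    ranked = sorted(email_counts.values(), reverse=True)
    # Always include at least one participant in the "top" group.
    k = max(1, math.ceil(len(ranked) * fraction))
    return sum(ranked[:k]) / sum(ranked)


# Hypothetical data: one very active participant and nine occasional ones.
sample_counts = {"p0": 55}
sample_counts.update({f"p{i}": 5 for i in range(1, 10)})

share = top_share(sample_counts)  # top 1 of 10 participants: 55 of 100 emails
```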


 Download  Code and Data

Paper: Characterising the IETF Through the Lens of RFC Deployment

by sodestream Project • Tuesday 12 October 2021 • Permalink

Stephen McQuistin, Mladen Karan, Prashant Khare, Colin Perkins, Gareth Tyson, Matthew Purver, Patrick Healey, Waleed Iqbal, Junaid Qadir, and Ignacio Castro
In Proceedings of the 21st ACM Internet Measurement Conference (pp. 137-149), 2021

Abstract

Protocol standards, defined by the Internet Engineering Task Force (IETF), are crucial to the successful operation of the Internet. This paper presents a large-scale empirical study of IETF activities, with a focus on understanding collaborative activities, and how these underpin the publication of standards documents (RFCs). Using a unique dataset of 2.4 million emails, 8,711 RFCs and 4,512 authors, we examine the shifts and trends within the standards development process, showing how protocol complexity and time to produce standards have increased. With these observations in mind, we develop statistical models to understand the factors that lead to successful uptake and deployment of protocols, deriving insights to improve the standardisation process.


 Download  Code and Data

Paper: Mitigating Topic Bias when Detecting Decisions in Dialogue

by sodestream Project • Thursday 29 July 2021 • Permalink

Mladen Karan, Prashant Khare, Patrick Healey, and Matthew Purver
In Proceedings of the 22nd Annual Meeting of the Special Interest Group on Discourse and Dialogue (pp. 542-547), 2021

Abstract

This work revisits the task of detecting decision-related utterances in multi-party dialogue. We explore the performance of a traditional approach and a deep learning-based approach based on transformer language models, with the latter providing modest improvements. We then analyze topic bias in the models using topic information obtained by manual annotation. Our finding is that when detecting some types of decisions in our data, models rely more on topic-specific words that decisions are about rather than on words that more generally indicate decision making. We further explore this by removing topic information from the training data. We show that this resolves the bias issues to an extent and, surprisingly, sometimes even boosts performance.
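
The abstract does not say how topic information is removed from the training data. One simple possibility, shown here purely as an assumption, is to replace known topic words with a placeholder token before training. The word list, the `[TOPIC]` token, and the function `mask_topic_words` are all hypothetical.

```python
import re

# Hypothetical topic vocabulary (assumption); a real list might come from
# manual topic annotation.
TOPIC_WORDS = {"budget", "printer"}


def mask_topic_words(utterance, topic_words, mask="[TOPIC]"):
    """Replace every word found in `topic_words` with a placeholder token."""

    def replace(match):
        word = match.group(0)
        return mask if word.lower() in topic_words else word

    # Match runs of letters and apostrophes so punctuation is left untouched.
    return re.sub(r"[A-Za-z']+", replace, utterance)


masked = mask_topic_words("So we agree to cut the Budget, right?", TOPIC_WORDS)
# masked == "So we agree to cut the [TOPIC], right?"
```

With the topic words hidden, a classifier trained on such text has to lean on cues like "we agree" rather than on what the decision is about.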


 Download  Code and Data