Stats corner: is the Standard Cross-Cultural Sample really standard?

Post by Péter Rácz.

We use large cross-cultural datasets to test theories of cultural evolution. These tests face what is commonly referred to as “Galton’s problem” (see here for an elegant overview). Since cultural traits co-evolve (think historical linguistics) and are traded freely in close proximity (think Sprachbund effects), their co-variance will be partly explained by shared ancestry and geographic proximity.

This co-variance is interesting in itself, but many theories of cultural evolution seek to form generalisations about human nature. In such cases, Galton’s problem has to be accounted for. One way to do this is to use statistical methods that take co-variance into account. Another way is to use a dataset that samples societies across phylogenies and geographic regions in a representative way.

As an inconsequential exercise, I compare one such dataset, the Standard Cross-Cultural Sample (SCCS), with another, larger, non-representative dataset, the Ethnographic Atlas (EA). I access these through the D-Place database. The SCCS contains 195 societies, the EA 1290 societies. All the societies in the former are also part of the latter. This allows me to compare them directly, using the 95 cross-cultural variables in the EA.

My question is: How much variation is explained in the EA by shared ancestry and geographic proximity? How much, if any, variation do these explain in the SCCS?

In order to make a comparison, I choose the 85 categorical variables in the EA. Using an arbitrary cutoff in category size, I filter out those variables which have a large number of small categories or where the largest category is “absent” (i.e. most societies do not really have this specific cultural practice). This leaves 55 variables, covering 70,100 of 120,000 observations across the 1290 societies in the EA.
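The filtering step can be sketched roughly as follows. This is a minimal illustration, not the code used for the post: the cutoff values (`min_category_size`, `max_small_categories`), the `"absent"` label, the function names, and the toy variables are all assumptions made up for the example.

```python
from collections import Counter


def keep_variable(codes, min_category_size=20, max_small_categories=3,
                  absent_label="absent"):
    """Return True if a categorical variable passes the (hypothetical) filter.

    codes: one category label per society, None where the value is missing.
    """
    counts = Counter(c for c in codes if c is not None)
    if not counts:
        return False
    largest_label, _ = counts.most_common(1)[0]
    # Drop variables whose most common value is "absent".
    if largest_label == absent_label:
        return False
    # Drop variables fragmented into many small categories.
    n_small = sum(1 for n in counts.values() if n < min_category_size)
    return n_small <= max_small_categories


def in_largest_category(codes):
    """Recode a variable as 1 (largest category), 0 (any other), None (missing)."""
    counts = Counter(c for c in codes if c is not None)
    largest_label, _ = counts.most_common(1)[0]
    return [None if c is None else int(c == largest_label) for c in codes]


# Toy data, invented for illustration only (not real EA variables).
toy = {
    "descent": ["patrilineal"] * 30 + ["matrilineal"] * 25 + ["bilateral"] * 22,
    "games": ["absent"] * 50 + ["present"] * 27,
    "fragmented": [f"type{i}" for i in range(40)] + ["common"] * 37,
}
kept = [name for name, codes in toy.items() if keep_variable(codes)]
print(kept)
```

The second helper produces the binary response (largest category or not) that a per-variable model would then predict.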

I fit a binomial mixed-effects regression model (using Douglas Bates’ lme4 package in R) on each of these variables, predicting whether a society is in the largest category, and estimating an intercept, as well as a random intercept for language family and one for geographic region in D-Place. If the distribution of the largest category for the variable does not co-vary with ancestry and proximity, such a model would have very little explanatory power. If it does, the model should explain some variation in the dataset. This variation can be expressed using r², the fraction of the variation in the response variable that is explained by the model. By proxy, the r² will indicate how much the entire categorical variable co-varies with language family and region — a simple estimate of cultural co-variation.
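Written out, the model described above has roughly the following form (in lme4 formula notation, something like `in_largest ~ 1 + (1 | family) + (1 | region)`, where the column names are hypothetical). The post does not say which definition of r² was used. The last line shows one common choice for logit models, the latent-scale version, which takes the residual variance to be π²/3. Treat it as an assumption rather than the exact quantity reported here.

```latex
\begin{align}
y_i &\sim \mathrm{Bernoulli}(p_i), \\
\operatorname{logit}(p_i) &= \beta_0 + u_{f[i]} + v_{r[i]}, \\
u_f &\sim \mathcal{N}(0, \sigma_f^2), \qquad v_r \sim \mathcal{N}(0, \sigma_r^2), \\
R^2 &= \frac{\sigma_f^2 + \sigma_r^2}{\sigma_f^2 + \sigma_r^2 + \pi^2/3}.
\end{align}
```

Here y_i records whether society i is in the largest category, f[i] is its language family, and r[i] its region. Because the model has no fixed predictors beyond the intercept, all of the explained variance comes from the two random intercepts.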

Since the societies in the SCCS are a proper subset of the societies in the EA, I can re-fit these models on the SCCS sample only. If the SCCS sample is more representative than the EA (which has no aspirations of the sort), I expect the r² values to go down: less variation should be explained by shared ancestry and geographic proximity.

The r² values for the 55 relevant models across the two datasets can be seen below. Bearing in mind a number of caveats (variable coding is simplistic, language family is a poor approximation of phylogeny, the SCCS sample is smaller, etc.), this can give us a sense of how much co-variation is present in the two samples.

Family and region explain less variance in the SCCS than in the EA, as expected. But their effect is not negligible.

The point here is not at all to give an accurate estimation of co-variation in the SCCS or the EA. Rather, it is to encourage the use of more sophisticated statistical methods (unlike the ones used in this post) and to urge discretion in the use of the SCCS, because human culture is more complicated than it seems.

(For data, code, and methods in graphic detail, go here.)

Overview of the CAKTAM Workshop January 2018

Notions of family and kin terms vary in complexity and structure, so to what extent does linguistic and cultural variation affect the acquisition of kinship knowledge? While kinship provides the major framework for social organisation in many societies, we still know very little about how children learn to categorise different kinds of kin. The ‘Children’s Acquisition of Kinship Knowledge: Theory and Method Workshop’, led by the EXCD lab at the University of Bristol, provided a unique opportunity to explore and refine ideas in this largely overlooked area of research. Early-career researchers and distinguished academics alike, from anthropology, linguistics and psychology, gathered at The Engine Shed, Bristol in late January 2018, to propose theories and share in discussion. The result was a truly stimulating event.

Kicking off the two-day workshop, Professor Fiona Jordan’s introduction emphasised the EXCD lab’s interdisciplinary approach, highlighting the restricted variation of kinship systems, the question of ‘unthinkable families’ and the notable diversity of cousin systems around the world. Eve Danziger, Professor of Linguistic Anthropology at the University of Virginia, followed with a consideration of the syntactic and pragmatic parallels between kinship and spatial relationship terms, and their origins in “gesture-calls”. Using kinship acquisition data from her fieldwork with Mopan (Mayan) speakers, Eve showed how cultural elaboration of respect for elders complements the semantic feature “sex-of-senior”, producing cultural and cognitive consequences for sense of self.

Eve Clark, Professor of Linguistics at Stanford University, our second speaker of the day, offered interesting reflections on her pioneering 1974 study of the semantic complexity of kinship term acquisition using elicited definitions. This fresh perspective suggested further consideration should be given to children’s experience with kin terms in their communities, looking at both address and third-person reference.

Next, we heard from Bob Parkin, Emeritus Fellow of Oxford University’s School of Anthropology, who considered the lack of current research on children’s learning of kinship within social anthropology. Bob’s presentation pointed towards the widespread anthropological objections to Malinowski’s extensionism, its unsuitability to all terminologies and its shortcomings as a universal theory of learning. We then heard about infants’ observational learning skills from Tanya Broesch, Assistant Professor of Psychology at Simon Fraser University. Tanya told us about learning from behavioural cues such as infant-directed speech, gestures, and facial expressions, and how these cues aid interpretation of complex group member information such as defining friend or foe. The talk included an overview of Tanya’s multi-methods, cross-cultural approaches and her current data, collected via natural observation in multiple societies.

After lunch, a close analysis of the acquisition of kinship concepts in Australian Murrinhpatha-speaking communities followed, with interactional linguist Joe Blythe, of Macquarie University. Joe’s personalised experiments involved photos of individuals from each child’s genealogy, along with pre-recorded audio clips and stick figure animations, in order to determine children’s comprehension of kin terms. Leading on from this, EXCD team member, linguistic anthropologist Alice Mitchell of the University of Bristol, presented preliminary findings on kinship learning among Datooga children of Tanzania, as studied over nine months of fieldwork. Initial observations focused on child-anchored kin terms as a source of information for children. She then considered children’s understanding of the kin term for ‘mother’ and the apparent resistance to the use of the word when referring to classificatory mothers.

As the afternoon progressed, we heard from Francis Mollica of The Computational & Language Laboratory, University of Rochester. Using a probabilistic Language of Thought model, Frank discussed simulations scrutinizing how simplicity, data distributions and assumptions about relatedness interface, giving rise to behavioural effects observed in children. These included a trajectory from under- to over-extension of kinship terms, and, in the case of over-extensions, the characteristic-to-defining shift. The next presentation, by Annie Spokes from Harvard University’s Department of Psychology, explored conceptual understanding of kinship as a social category and expectations for social interactions in 3- to 5-year-old children in the US. She also examined how infants track relationships in care-giving networks within the first two years of life, forming expectations and early inferences about kin.

Julia Nee of the Department of Linguistics at Berkeley addressed us for the final session of the day via video-link. Julia’s field research with Teotitlan del Valle Zapotec speakers allowed her to examine whether languages show an optimization of complexity and communicative cost in dividing up the semantic domain of kinship, compared with English-speaking participants. Having covered a great deal of ground on the first day, workshop attendees met for dinner in central Bristol during the evening and talked over research ideas and experiences.

Friday provided an opportunity to focus on research methods. Joe and Alice introduced the first hour with a talk on elicitation and experiments. Camilla Morelli, Lecturer in Anthropology at the University of Bristol, then provided an overview of the use of visual and sensory methods in child-centred anthropology. Drawing on her ethnographic fieldwork with indigenous children in the Peruvian Amazon, Camilla suggested ways in which such techniques can be applied when investigating kinship and the acquisition of kinship knowledge.

After a morning break, we heard again from Joe and Alice who led a wide-ranging discussion about linguistic and corpus-based methods. This useful, interactive session provided an opportunity for a closer exploration of the various approaches. Their two methods talks covered questionnaires and surveys for eliciting definitions and factual information, stimuli-based tasks using photos and/or dolls, and collecting behavioural data, both linguistic and non-linguistic. The discussion provided an opportunity to appraise successes and difficulties encountered in each of the approaches and the group exchanged experiences in the field.

We were then delighted to hear presentations from three early-career PhD researchers. Sheina Lew-Levey from Cambridge University’s Dept of Psychology outlined her recent findings on the transmission of foraging knowledge as well as social and gender norms through play, word-play and teaching among Mbendjele forager children in the Congo Basin. Noa Lavi of Cambridge’s Anthropology Dept followed, with an overview of kinship concepts and flexible patterns of relationality among the Nayaka, hunter-gatherers in Nilgiri, South India. Noa described how Nayaka children’s knowledge and knowledge acquisition are based on gradual learning of the ability to alternate between different kinship concepts. Lastly, Gabriella Piña, a social anthropologist from the London School of Economics, talked about her work with the Pehuenche people of Southern Chile. In this society, independence and freedom are highly valued and offset by the practice of visiting and hosting, to support collaboration and avoid tension. She examined children’s participation in these activities and how these practices develop their understanding of kin.

Friday afternoon was dedicated to a round-up discussion. The group gathered in an open session to exchange views on the creation of a ‘field-kit’ intended to aid the study of the acquisition of kinship terms, for use by the group and other researchers.

In addition, as an ongoing interest, the group intend to make a joint interdisciplinary contribution towards a forthcoming article which will address a universal set of concerns relating to kinship acquisition. Most notably, the event was the first of its kind in its interdisciplinary draw, and related events are likely to follow. One of the most significant outcomes of the workshop has been the momentum created for future ventures and collaborations around developing the questions of kinship, forming new ideas and attracting newer researchers from an even greater diversity of approaches.

Conversation across languages and cultures: Dr Joe Blythe

For the past few weeks the lab has hosted Dr Joe Blythe as Benjamin Meaker Visiting Fellow from the University of Bristol’s Institute for Advanced Studies (thanks IAS!).

Joe’s final event is this evening, and we’re delighted to be hosting his public lecture:

Conversation across languages and cultures: Cross-linguistic perspectives on taking turns to talk.


Thursday 8 February 2018

17:00 – 18:00 & drinks reception

Lecture Theatre 3, Woodland Road Arts Complex

Today we are delighted to welcome colleagues from around the globe as we meet for this evening’s opening of the CAKTAM Workshop. Over the next few days we’ll be sharing ideas and methods, and learning together, about children’s acquisition of kinship knowledge.

We’ll be keeping you updated on Twitter and providing an overview here of all the great moments after the event. In the meantime, here’s the CAKTAM Handbook and Programme.

Children’s Acquisition of Kinship Knowledge: Theory and Method

25th-26th January 2018, Bristol, UK

How do children learn kinship concepts? Given that both kin terms and kinship systems vary in complexity, to what extent does linguistic and cultural variation affect the acquisition of kinship knowledge?

For many societies around the world, kinship provides the major framework for social organisation, yet we know very little about how children learn to categorise different kinds of kin. This two-day workshop at the University of Bristol will bring together researchers working both directly and indirectly on children’s acquisition of kinship concepts to stimulate and refine research in an important area for the cognitive and social sciences.


We are keen to engage a broad range of theoretical and methodological perspectives on kinship acquisition. We aim to address the following questions:

  • What do children of different ages know about kinship?
  • In what contexts, and through what media, do children learn about kinship? (e.g., everyday conversation, ritual, narrative)
  • What cognitive abilities does the acquisition of kinship terminology depend on? Is there anything “special” about kinship as a cognitive domain?
  • What light can acquisition shed on semantic models of kinship terms?
  • Do children differentiate close vs distant kin? How do they learn to classify the latter?
  • How does socio-cultural context affect the acquisition of kinship terms?
  • How, when, and why do children talk about kinship?
  • To what extent does complexity affect learning of kinship concepts?
  • To what extent do children differentiate kin from non-kin? How does this change over the course of development?
  • How is kinship represented in play?
  • How should we go about studying children’s acquisition of kinship concepts?


Our key speakers for the workshop include:

Joe Blythe (Linguistics, Macquarie)
Tanya Broesch (Psychology, Simon Fraser)
Eve Clark (Linguistics, Stanford)
Eve Danziger (Anthropology, Virginia)
Alice Mitchell (Anthropology, Bristol)
Bob Parkin (Anthropology, Oxford)
Annie Spokes (Psychology, Harvard)


You are warmly invited to CAKTAM, whether to contribute as a participant or to join us as an attendee.

We are very happy to invite additional contributions for 20-minute talks that respond to one or more of our guiding questions. Scholars from any relevant discipline are welcome, including but not limited to anthropology, linguistics, sociology, psychology, education, and social work.

We would also like to encourage postgraduate students and early career researchers who may be interested in conducting research on children’s acquisition of kinship terms to attend the workshop. We will ask these participants to provide a short description of their research background for the workshop handbook, and, optionally, for those with an active or potential field site, to give a short, informal talk (5-10 minutes) discussing what this kind of research might look like in their particular research setting.

Workshop Organisers
Fiona Jordan
Alice Mitchell
Joe Blythe
Jo Hickey-Hall

CAKTAM is a research activity of the ERC-funded VariKin project, hosted at the University of Bristol and led by Professor Fiona Jordan




Handbook for CAKTAM Workshop

Journal Club roundup

Post by Catherine

We at the excd.lab are a highly interdisciplinary bunch, with backgrounds spanning anthropology, linguistics, psychology, philosophy, music, biology, and statistics. Nowhere is this more evident than in our weekly journal club, where we come together (in an archaeology laboratory!) to discuss cultural evolution and learn more about each other’s areas of research.

The first rotation of papers was intended to be an introduction to each other’s fields. If you could inflict on (er, ‘present to’) your colleagues one paper from your specialty, what would it be?



The second set of journal club readings fell under the theme ‘classic papers in your field.’ What early paper in your field’s history best showcases why your specialty is so exciting? Note: we interpreted ‘early’ in a metaphorical sense.



The third sequence of papers explicitly focused on the present day. What’s an exciting paper in your field from the past few years?



We’re now partway into our fourth cycle of journal club papers. We don’t have a theme so far, aside from the entirely independent selection of two papers by Richard McElreath, but we’re beginning to learn what sort of papers make for interesting journal club discussions. Ideally, we’re looking for papers that bring together big ideas from multiple disciplines, that clearly explain their hypotheses and methodologies to a generalist audience, and that have implications that we can tie into our own specialities. (Easier said than done, right?!)



What papers have you been reading recently? Do you have any suggestions for our lab group? Let us know in the comments here or on Twitter, @excd_lab!

excd.lab summer by the numbers


How big is your N?

Over summer, lab members have been super-busy on their various projects, taking advantage of the quiet(er) environment out of the teaching term. In the autumn, we have PhD upgrades, submissions, and vivas; papers to submit; some lab members to farewell (boo), and excitingly, a number of folk will be presenting at the Inaugural Cultural Evolution Society conference in Jena, Germany.

As a round-up, here’s “EXCD by the numbers”:

Sean: The Great Language Game is a large-scale online game where players listen to an audio speech sample and guess which language they think they’re hearing. We analysed 15 million judgements from 964,000 participants from 80 countries. We found that people are more likely to confuse languages that are closely related in time and space.

Simon: As my research focuses on the micro rather than the macro, the most impressive number I can give in relation to this work is one – to represent each of the international student sojourners who make up my research participants, and the unique quality of their experiences that furnishes my data.

Catherine: I’ve collected Australian kinship terms from the Pama-Nyungan language family. This section of Kinbank, our database of kinship terminologies for the VariKin Evolution project, contains 13,338 words across 77 languages, while the Atlantic-Congo section that we’ve just started currently stands at 802 words across 23 languages. We are analysing 29 Pama-Nyungan languages to investigate the potential link between community marriage norms and the words one uses to talk about one’s grandparents. In my ornithological life, I was part of a recently-published study that analysed images of 49,175 eggs from 1,400 species of birds, demonstrating that egg shape is linked to avian flight ability.

Sam: There are a theoretical 10,480,142,147 different ways to classify 16 different family members. In KinBank right now we have data on 407 languages and have collected 52,408 kin-terms.
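The figure Sam quotes matches the 16th Bell number, which counts the ways to split a set of 16 items into groups (here, relatives grouped under a shared term). That reading of “classify” is an assumption on our part, but the arithmetic is easy to check with a short sketch using the Bell triangle:

```python
def bell_number(n):
    """Number of ways to partition a set of n items into non-empty groups."""
    row = [1]  # row 0 of the Bell triangle
    for _ in range(n):
        # Each new row starts with the last entry of the previous row;
        # every further entry adds the entry above-left to the one before it.
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[0]


print(f"{bell_number(16):,}")  # 10,480,142,147
```

For comparison, the same function gives 5 for three relatives: all under one term, three ways of pairing two under one term and leaving one apart, or a separate term for each.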

Alice: During ongoing fieldwork for the VariKin Acquisition subproject, I have collected around 38 hours of recordings of Datooga children’s interactions with adults and other children. We have so far transcribed 18,300 words of these recordings. The youngest speaker currently has 1 kinship term in his active vocabulary: ‘mother’, which he only uses in the expression “mother’s stomach!”, meaning “I swear!”

Peter: I compiled a frequency database for the VariKin Usage project. It contains information on the frequency of use of 45 distinct kin term types (such as “mother” or “mother’s father”) from 21 Indo-European languages, covering 498 distinct forms in three separate textual genres, sampling spoken, written, and on-line use. Sam and I are using these data to estimate the rate of change of a set of kin terms in Indo-European and compare it to the rate of change of basic vocabulary items. I am working on a similar database in Arabic and other non-Indo-European languages.

Rebecca: As part of my MSc project I have analysed data on marriage practices and parental investment strategies for 262 societies in four different language families. We are determining whether these cultural traits are evolving differently in each language family and whether they have a co-evolutionary relationship.

Alarna: 5’33” is the length of each of the recordings of two creation stories we are inviting participants in the UK and US to listen to. These stories combined contain 538 propositions that participants are asked to recall, and contain at least 6 types of content bias. The Transmission project has currently collected 1,439 minutes (23 hours and 59 minutes) of audio recordings from participants.

Cecilia: There are over 10,000 possible permutations of pitch and tone elaboration (lengthening, decoration etc) offered by the music systems of 15 worldwide music cultures (thanks to Sean for doing the math). And yet in 182 musical endings randomly selected from a project-wide sample of over 1,500 pieces, most final tones fall into one or other of only two combinations.

Fiona: I’m lucky to have eleven fantastic graduate students and postdocs in the excd.lab. Since last September we’ve had around eight visitors to the lab, two summer interns, and acquired 1 honorary lab member, Rob Ross. In a few weeks we’ll be welcoming two data collection assistants (Lucy Harries (back) and Luis Henrique), and hosting two visitors, Andreea Calude from New Zealand, and Joshua Birchall from Brazil. I have learned and used functions from about 14 R packages in the pursuit of analysing data on Pacific agricultural systems, and am writing 1 new undergraduate course for the autumn.

Big Bang Science Fair


Lucy, Shakti, and Sam at the excd.lab stall.

Last week, the excd.lab sent a team to “Big Bang Bristol”, a two-day science, technology, engineering, and maths extravaganza.

Guest post by Shakti Puri & Lucy Harries.

The fair had the purpose of introducing children to research through hands-on experiments, activities, and live demonstrations. Our stall, entitled the ‘Science of Culture’, consisted of a range of activities based on ongoing lab work, such as kinship, cultural transmission, linguistic relativity, and ecology, with the aim of engaging and educating students on the science behind cultural diversity. We (Alarna, Lucy, Shakti, and Sam) originally targeted activities at students aged 11-15, but we had unexpected interest from younger children and families, so our activities were adapted to suit the wide range of enthusiastic participants.


Day 1 consisted of a school session in the morning, followed by an afternoon open to the general public. The most engaging activities appeared to be those based on linguistic relativity, specifically with regard to colour and body parts. Giving the children a blank cartoon person, we first asked them to colour in and label different parts of the body, such as ‘the right arm’ or ‘the left foot’, before moving on to more general terms, such as ‘limb’. These instructions brought about a wide range of responses, particularly with the term ‘limb’, which varied even across peers (e.g., some would colour in the arms or legs, some would colour in the head!).

The colour activity gained the most interest across age groups, and proved to be quite controversial at times! We first asked the children to name as many colours as they could, before showing them charts illustrating variation in the number of basic colour terms cross-culturally. They were then asked to draw the boundaries between colours such as ‘green’ and ‘blue’ on a chart, and their answers varied. Despite arguments, the children seemed fascinated to learn that “different languages had different colours”, a concept they had never considered before.


The second day consisted of predominantly secondary schools, with one or two primary schools attending the fair later in the afternoon. Due to a smaller number of participants, we were able to engage more with the students, who were able to complete all of the activities on offer. Of particular interest were the D-PLACE related activities, which were being trialled for the first time. Our D-PLACE dominoes, which consisted of using D-PLACE’s search option to pair societies and languages, introduced the children to D-PLACE’s wide database of information, as well as encouraging them to consider the links between linguistic, cultural and environmental practices. We also showed children D-PLACE maps based on climate and number of languages to see if they could infer any patterns. This activity engaged the children with linguistic spread, introducing them, in a simplified manner, to the study of diversity and phylogeny.

Shakti and Alarna distract students with a memory task before getting them to re-tell a story.
Lucy and Shakti explain latitudinal gradients.

Overall, the fair was a fantastic opportunity to obtain useful feedback about our resources, as well as to engage the wider public with university research. The children gave positive feedback and were enthusiastic about our work, which introduced them to the use of scientific methods in the study of diversity and engaged them with the ‘Science of Culture’.



Welcome to Summer Research Interns

Today we welcomed two new members of the lab as Summer Research Interns. Shakti Puri and Lucy Harries have both just finished their final year as Modern Languages students and will be with us for four weeks over the summer working on bringing D-PLACE to the wider public.

Lucy Harries
Shakti Puri










Lucy: I am a French and Italian graduate interested in the link between linguistic, cultural and environmental elements and practices. I am currently undertaking a research assistantship with the aim of making D-PLACE accessible to the wider public, in particular for school teaching. This involves developing resources, such as lesson plans and tutorials, in order to encourage use of the database in communities outside of academia.

Shakti: I am a graduate in Spanish and Portuguese interested in linguistic diversity and how this is reflected in various cultures. As a Summer Research Intern I will be hoping to expand the accessibility of the online database D-PLACE. This development aims to increase the use of these resources amongst the general public with a focus on students and teachers, aiding a deeper understanding of diverse communities.

Thanks to the Faculty of Arts Research Committee for funding Lucy and Shakti this summer!

February 2017 events in the excd.lab

  1. As part of the British Academy International Partnership Mobility award that enabled Josh Birchall from the Museu Goeldi to visit us back in October, Fiona is currently in Belém, Brazil to meet with collaborators on an incipient comparative database of South American language and kinship.
  2. As part of her trip, Fiona gave a talk on “As dinâmicas da diversidade cultural e linguística” (The Dynamics of Cultural and Linguistic Diversity).
  3. Friend-of-the-lab, Bristol Anthropology PhD student Janet Howard published a paper titled Frequency-dependent female genital cutting behaviour confers evolutionary fitness benefits in Nature Ecology & Evolution. Read The Economist’s summary here.
  4. Peter Racz’s paper Social Salience Discriminates Learnability of Contextual Cues in an Artificial Language has been published in Frontiers in Psychology.