Communicating Science: Statistical Thinking Can Organize Qualitative Analysis

With my co-presenter, I gave our lab a whirlwind tour of complex regression models (serial mediation, parallel mediation, multi-level models, and multi-level mediation). If you can imagine it, and you have meaningful quantitative data, there’s a model for you! (Even if you only use SPSS, there are macros – PROCESS and MLMED – for you.)

When I was driving back from visiting my parents in Raleigh, one of the things I really looked forward to in Columbus was my multi-level modeling class (finding patterns in data when your observations are clustered within an individual, country, media market, school, etc.). I was excited to be acquiring new tools – new ways of tackling meaningful questions, systematically. Knowledge is power (limited power, sometimes, but power nonetheless).

Statistical tools are not only powerful for measuring complex social situations; they can be powerful for thinking about them as well. I like to joke that stereotypes reflect really simple statistical thinking (mean differences). Intersectionality starts to take different levels of variables into account (regression). Privilege demands thinking in terms of clustering – different people with different traits in different situations (i.e., multi-level regression).

How many articles have you read on topics like privilege that had no guiding framework for thinking about the influence of group membership, individual traits, overlapping groups, and categories of situations? They tend to fumble. They try to simplify with analogies, but often that simplicity feels artificial. Multi-level regression provides a heuristic framework – a way of organizing how we tackle that complexity. Even if we lack the data for a conclusive analysis, multi-level modeling helps us to articulate our questions, our guesses, and our insights.
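To make that concrete, here is a minimal sketch of a random-intercept multi-level model in Python’s statsmodels. The data file and column names (outcomes.csv, outcome, group, trait, neighborhood) are hypothetical placeholders, not a real dataset:

    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical long-format data: one row per person, with people
    # nested in neighborhoods. All names here are placeholders.
    df = pd.read_csv("outcomes.csv")

    # Random-intercept model: estimate the average effects of group
    # membership and an individual trait on the outcome, while letting
    # each neighborhood (cluster) keep its own baseline level.
    model = smf.mixedlm("outcome ~ group + trait", df, groups=df["neighborhood"])
    print(model.fit().summary())

Even without the data, the formula line is the heuristic: an outcome, individual-level predictors, and an explicit acknowledgment that people live inside clusters.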

It is also something, as the presentation this morning indicated, that can be made accessible in qualitative terms.

Communicating Science: Things to keep in mind when thinking about police shootings . . .

Things to keep in mind when thinking about police shootings:

Police show considerable race-weapon stereotyping and race-aggression stereotyping on the social-psychologist-designed “shooter task” – other people show even more (Correll and colleagues’ work).

Everyone has to resolve ambiguity – that’s when stereotypes can creep in. Is that a gun or a wallet? Is that person aggressive or scared? Ambiguity can even lead to sensory distortions under high-stress conditions. Anyone who has ever been really nervous knows to watch out for this (as opposed to blindly acting upon it). For a more everyday example of a sensory distortion: ever misread something while copy editing? Your brain “filled in the gap” with a coherent story, and the typo remained, unseen.

Police departments are not known for cultivating good mental health – but someone with a gun acting out of poor mental health is a problem across the board, whether they’re killing themselves or killing someone else.

Citizens have less experience coping with spikes of fear – of adrenaline and cortisol – than police, who have been through training, do.

White citizens are more likely to call cops on black citizens doing “ambiguously criminal” behaviors. Cops, then, are more likely to be monitoring for “suspicious black people.”

Studies of police-driver interactions at traffic stops often find that the citizen’s reactions – even politely defensive, as opposed to perfectly at ease – lead to more controlling attitudes from police. It would be hard for citizens who have been targeted to ever be perfectly at ease. Heck, even I’ve been harassed by customs agents and police for “looking nervous”.

Mentally ill people are particularly likely to become targets – because any behavior that isn’t perfectly “safe and predictable” is interpreted as a threat. This is why some departments call in specialists who are better able to assess the situation when mental illness is suspected.

Police, like the rest of us, like their stories – even ones that are more a matter of faith than fact. You can also imagine that departments would vary in how often they actually deal with threat. On the low end, they may, on average, be looking for an opportunity to “suit up”; at the high end, they may live, on average, in a state of chronic stress and fear.

Statistically, we can control for a lot of things – including actual race-based differences in weapons charges and other signs of real versus imagined racial differences in dangerousness. Stereotypes are likely still relevant, even after those things are controlled for. This makes sense, empirically. We apply schemas – ideas about the world – to resolve ambiguity all the time (the typo example). The solution would seem to be better schemas and better methods for gathering information in the moment (both better schemas and better attempts at evaluating the situation are part of the training that specialized mental health responders have).

However, statisticians, particularly ones relying on observational data, can cherry-pick which measures they include and which they exclude. They are also often trying to “start a conversation” with others in their field, such that national attention may be secondary. Sometimes “starting a conversation” means generating controversy. Even academics engage in PR, albeit for a limited audience.

So always ask yourself: “Would I expect to see a race-based difference if the neighborhoods, suspects, or officers were matched on a different set of characteristics?” Statistics don’t provide the final answer, only pieces of the puzzle. To know whether they fit together, you have to look at them closely. In the end, however, most statistics are asking:

“Is there an average difference in y as we go up one unit on x, constant across (controlling for) levels of these other variables?” Y, for example, could be the likelihood of getting shot, and a one-unit change in x could be going from white, coded as 0, to black, coded as 1. That would be a categorical variable. It could be that when people are matched on income, education, etc., a racial difference disappears, remains, or even increases. This is called “controlling for” or “adjusting for” those variables.
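In code, that question is a one-line model formula. A minimal sketch using logistic regression in statsmodels, where the file and column names (incidents.csv, shot, black, income, education) are hypothetical placeholders:

    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical incident-level data: 'shot' is 0/1, 'black' codes race
    # as white = 0 / black = 1, with income and education as controls.
    df = pd.read_csv("incidents.csv")

    # Is there an average difference in the (log-)odds of being shot as
    # 'black' goes from 0 to 1, holding the control variables constant?
    fit = smf.logit("shot ~ black + income + education", df).fit()
    print(fit.params)  # log-odds coefficients; exponentiate for odds ratios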

Often, the statistician (and you) could ask whether the odds of getting shot when black versus white depend upon another of those variables – so that if that variable, W, let’s say, is high, the difference in Y (odds of getting shot) as X goes from 0, white, to 1, black, is bigger (or smaller). So if W is median income, it could be that the gap in the odds of getting shot while black versus white is lower in areas with high median income, controlling for (holding constant) the percentage of black people living in that area. So many questions can be asked (and partially answered) with statistical models!
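That moderation question corresponds to an interaction term. Continuing the same hypothetical data, with made-up median_income and pct_black columns:

    import pandas as pd
    import statsmodels.formula.api as smf

    df = pd.read_csv("incidents.csv")  # same hypothetical data as above

    # 'black * median_income' expands to both main effects plus their
    # interaction; the interaction coefficient tests whether the race gap
    # in the odds of being shot changes with neighborhood median income,
    # holding the area's percent-black composition constant.
    fit = smf.logit("shot ~ black * median_income + pct_black", df).fit()
    print(fit.params)  # 'black:median_income' is the moderation term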
 
The mechanics may seem complex, even scary, but understanding what a model is trying to evaluate, and how it works, is not rocket science. Like a jigsaw puzzle, it requires patience, but you can get it done! If you want to try your hand, look at the following study and try to break it down into its primary relationships (outcome variables versus focal predictor variables, control variables, and moderators, if any):
http://journals.plos.org/plosone/article?id=10.1371%2Fjournal.pone.0141854

Police shootings involve a rich if terrible tapestry of factors.
 
However, if you’re talking to any everyday person, and they are feeling a sense of concrete personal danger, why not tend and befriend first, discuss and debate second?

Communicating Science: Reading Science News and Empirical Articles – What Not To Do

I recently came across this on my Facebook feed:
https://psmag.com/the-death-of-the-white-working-class-has-been-greatly-exaggerated-1c568d3e6b8c

It is a great example of what happens when you allow a headline to make you curious, about methodology as well as topic. Dig deep, and you may uncover misleading, questionable, or just confusing decisions made by researchers and people reporting on the research. Sometimes this is deliberate, an attempt to influence the policy decisions of people who don’t have time to understand research.

However, in order to get appropriate attention, portions of almost every empirical article you see will be hyped. The introduction, sometimes the abstract, and sometimes the discussion too will inevitably exaggerate what the researchers actually found. Those sections are designed to say, “Hey, if our interpretation is correct, these results will be really interesting.”

A social scientist reads the abstract and skips right to the manipulations (what the experimenters presented differently to different people) and the measures (what they measured). That’s the concrete, meaty detail. That’s the difference between being told a movie fits in a certain genre and watching the actual movie.

Then, we look at the statistics, acknowledging that these estimates (models designed to uncover an average trend amidst all that variation) are the best the researchers could do with the tools they had. (Let’s assume they were researching in good faith.) We ask questions about what they didn’t report (word limits) and what they might have found but not been able to tell a clear story about (there’s a lot of messiness; even scientists like to read about clarity).

If we’re really good at statistics, we may even look for mistakes. Peer reviewers who act as gatekeepers for academic journals are generally unpaid and overwhelmed. Mistakes happen.

Then we check out the discussion (the “let’s get real” section of the paper). Then we skim the intro for any novel interpretations of the existing literature, or citations we’re unfamiliar with.

This is a good approach, not just with academic papers, but with anything. What does the real evidence look like? Are people interpreting it in good faith? Are they making errors you can help correct? What other interpretations could you offer?

Applying Psychological Research at Sooth – Anger and Information Processing

Sooth is [now, was] a social-psychologist-founded company that develops community around the art of giving and receiving good advice. Their iOS app brings users together for anonymous advice-asking and advice-providing. Despite anonymity, and due in part to the educational materials provided to users, advice tends to be very high quality. I encourage you all to check it out!

For my own contribution, see:
http://www.soothspace.com/blog/anger (now defunct, unfortunately! See link at end of this post)

I briefly describe literature studying the effects of anger on information processing, and propose a response to anger that facilitates perspective-taking.

Advice on Anger — Sooth

Applying Psychological Research at Sooth: Advice-Giving

Sooth is a social-psychologist-founded company that develops community around the art of giving and receiving good advice. Their iOS app brings users together for anonymous advice-asking and advice-providing. Despite anonymity, and due in part to the educational materials provided to users, advice tends to be very high quality. I encourage you all to check it out!

For my own contribution, see:
http://www.soothspace.com/blog/giving-good-advice

I describe obstacles to giving good advice – including confirmation bias, the illusion of explanatory depth, and passive dehumanization. I then recommend some research and experience-supported antidotes.

 

Applying Psychological Research at Sooth, Advice-Asking

Sooth is a social-psychologist-founded company that develops community around the art of giving and receiving good advice. Their iOS app brings users together for anonymous advice-asking and advice-providing. Despite anonymity, and due in part to the educational materials provided to users, advice tends to be very high quality. I encourage you all to check it out!

For my own contribution, see:
http://www.soothspace.com/blog/seeking-good-advice

It explores the tension between automatically arising preferences and both descriptive and prescriptive norms in seeking good advice.

 

Key Concepts: Dual-Process Theories and the Tripartite Theory of Mind

There is an interesting chapter (which I link to here) by Stanovich on the “tripartite” mind that got me thinking about research, as well as about being human.

Stanovich distinguishes, first, The Autonomous Set of Systems (TASS), which includes all automated parallel/associative processing. This is the set of systems that creates a “primary” representation of reality. They create the world as we initially perceive it – drawing on schemas as well as perceptual inputs. This is what dual-systems theorists call System 1.

Second, he distinguishes the algorithmic mind – which creates secondary – “decoupled” representations. This is most clearly related to working memory capacity – our capacity to sustain representations in memory (and screen out distractions). He distinguishes two general abilities of the algorithmic mind, the second more sophisticated than the first. The first starts with the simplest model of the world that the person can come up with quickly, then adjusts that model, serially (one adjustment at a time). In social terms, this could be someone who is angry at their boss and who we advise to “correct” for different biases associated with anger. Basically, the angry person has one model, then we advise them to adjust it so that – if our advice is helpful – it better resembles the world.

There is a second ability of the algorithmic mind – to simulate different models of the world, and to select between them. This is a more cognitively loading process. Intuitively, people are more likely to use this ability when they are thinking about the future. Even then, they’ll tend to default to the simpler process: start with a single model and serially adjust it until they’re more confident in it. This approach will ultimately be more biased, assimilating or contrasting to the first impression. I think we’ve all found ourselves unable to fully break away from that initial model when forecasting future events.

Simulating and comparing multiple models could (should!) also be the process when people are perspective-taking. Rather than starting with a single model or schema (often a cultural schema) and then comparing their target’s behavior to that model (adjusting their impression of the target accordingly), people could start with multiple possible interpretations from different sources and rely on unfolding personal experience to distinguish the superior from the inferior models. Ethnographers and clinical psychologists become very good at this sort of thinking.

Last, in Stanovich’s theoretical model, is the reflective mind. This includes intentional, guiding goals that a) can trigger the need to go beyond the TASS (the autonomous set of systems) and b) guide the algorithmic mind. People, of course, can differ in their tendency to override the TASS or to engage in single-model vs. multi-model thinking. Measures like the Need for Cognition, the Need for Cognitive Closure, Personal Need for Structure, Actively Open-minded Thinking, etc., provide the researcher with this information.

I should, in closing, note that TASS-processing isn’t bad! Ideally, people would grow better at a) recognizing when TASS-processing will fail them and b) being mindful enough of their goals that they can judge when a model is 1) sufficiently detailed and 2) gives appropriate weights to different variables. The TASS, at least in perhaps too tightly controlled thin-slicing studies, can be better at both “1” and “2.”

Mini Lectures: Illusion of Explanatory Depth

In research, we must consider our own and others’ biases.

The illusion of explanatory depth, described in the video below, can negatively impact the precision and plausibility of our hypotheses.

It can also help explain participant behaviors.

In qualitative research, we can unintentionally disrupt this illusion in our informants – prompting them to give a less automatic, less “natural” answer to our questions.

Research Methods Intro: Accuracy and Ethnography

In personality psychology, researchers empirically investigate sources of accuracy by using information about both the perceiver – the person whose accuracy we are evaluating – and the target – the person the perceiver is accurate about. Using a round-robin design – where every participant rates and is rated by every other participant – as well as peer and self-report ratings of each participant, researchers can examine and quantify the relative predictive power of different factors.

Participants rate themselves and others on different traits. Ratings are more accurate if the perceiver’s ratings of the target match an average of the target’s peer and self-report ratings. Researchers can, for example, simultaneously compare (a toy computation follows the list):

  • normativity – the actual prevalence of the trait in a group
  • perceived similarity to the perceiver – the influence of the perceiver’s distinctive traits (calculated by adjusting the average of the perceiver’s peer and self-reports for the average self-report for the entire group)
  • distinctiveness – the extent to which the target is higher or lower than average on the trait (calculated by adjusting the average of the target’s peer and self-reports for the average self-report for the entire group) (Human & Biesanz, 2011)
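Here is a toy sketch of that decomposition in Python. Everything is simulated and simplified – the profile sizes, the generating weights, and the single-target regression are illustrative assumptions, not the full Human and Biesanz procedure:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n_targets, n_items = 30, 20

    # Simulated criterion profiles: each row stands in for the average of
    # one target's self-report and peer reports across 20 trait items.
    criteria = rng.normal(4, 1, size=(n_targets, n_items))

    normative = criteria.mean(axis=0)   # sample-average (normative) profile
    target = criteria[0]                # one target's criterion profile
    distinctive = target - normative    # that target's deviation from the norm

    # Toy perceiver: ratings that track the norm strongly and the target's
    # distinctive profile weakly, plus noise.
    perception = 0.8 * normative + 0.3 * distinctive + rng.normal(0, 0.5, n_items)

    # Regressing the perceiver's profile on the two components recovers
    # normative accuracy and distinctive accuracy as separate slopes.
    X = sm.add_constant(np.column_stack([normative, distinctive]))
    print(sm.OLS(perception, X).fit().params)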

 

Theoretically, the influence of normative accuracy – the extent to which an individual references others against a norm – should be higher when perceivers and targets share a cultural background. On the one hand, normative accuracy is the product of experience. The more members of a group you meet, the better you estimate average behaviors. On the other hand, cultural norms also shape who we seek to become and how we express ourselves.

Perceived similarity, on the other hand, can bias the perceiver towards seeing her own distinctive traits in others, at least when she likes or in some way identifies with those others. For example, an ethnographer may tend to see informants that he likes as being more similar to him than they actually are, and informants that he dislikes as either being contrasted against his perceptions of himself or more similar to his perception of the “average” informant.

When perceptions of normativity are less established, however, the target’s distinctiveness should be less biasing, given that the ethnographer may not know which traits are distinctive and which are common. In other words, as the ethnographer’s perception of the actual averages for the group of informants changes, the roles of similarity and distinctiveness may change as well.

One takeaway for the ethnographer, then, is to exercise greater caution and give attention to the influence of presumed normative behaviors, perceived similarity (or lack thereof), and target distinctiveness. However, where a round-robin design is practical, the ethnographer could also apply this observational research to the field. Given a culturally-validated scale, the ethnographer could compare the respective roles of these different influences on person perception across cultures. Other analyses could compare normative accuracy, as determined by the actual average ratings for the group, to stereotypic accuracy, as determined by participant ratings of an imaginary “average person.”

This quantified data could be used to contextualize participant observation and in-depth interviews.

Other Considerations – What is Accuracy?

Accuracy is multi-dimensional. For example, if asked to judge the prevalence of a certain trait in different social groups, a person could have poor absolute accuracy. In that case, they might consistently underestimate or overestimate the prevalence in each group. However, they might still have good relative accuracy – judging the differences between groups well. As in the discussion above, accuracy is a continuous variable, and it can increase or decrease over time. Our stereotyping intervention, for example, targeted absolute accuracy for a target social group. It could be expanded to target absolute accuracy for both the target’s social group and the perceiver’s. Relative accuracy would then take care of itself.
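A tiny numerical illustration of the distinction, using made-up prevalences:

    import numpy as np

    truth  = np.array([0.30, 0.45, 0.60])  # hypothetical true prevalences in 3 groups
    judged = np.array([0.10, 0.25, 0.40])  # one judge's estimates of the same groups

    # Absolute accuracy is poor: every estimate is off by -0.20.
    print(judged - truth)
    # Relative accuracy is perfect: the between-group differences match exactly.
    print(np.diff(judged), np.diff(truth))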

Further, statistical measures of accuracy are blind to process. Other research examines how an observer learns about a group’s average rating on any trait. More research could disentangle the roles of shared social-desirability concerns, self-stereotyping, and other culturally-accessible influences on the self-concept. These shared concerns could, for example, lead participants to report being more similar without actually being more similar.

Considering this relative complexity, stereotyping and prejudice interventions have to choose their target:

  • Improving the validity and reliability of the process by which we judge individual targets and target groups?
  • Improving the absolute accuracy of these judgments?
  • Improving the relative accuracy of these judgments?
  • Improving accuracy for certain traits, but not for others? (Accuracy may differ by trait).