**Four Steps for Conducting Bivariate Analysis**

By Daniel Palazzolo, Ph.D.

The statistics we use for bivariate analysis are determined by the levels
of measurement of the two variables. We will normally take
four steps in conducting a bivariate analysis. Keep in mind that we use statistics
to test a bivariate hypothesis. In commonsense terms, we are using statistics
to explain the relationship between the two variables and to determine the
strength and significance of that relationship. I will use the relationship
between gender and party identification to illustrate a bivariate analysis.

Here are the four steps:

**Step 1:** Define the nature of the relationship in terms
of how the values of the independent variable relate to the values of
the dependent variable.

For example, if I am testing the relationship between gender
and party identification, then I will ultimately say something to the
effect of:

“The data show a relationship between gender and party
identification; women are more likely than men to call themselves Democrats;
and men are more likely than women to call themselves Republicans. About
39% of women call themselves Democrats compared with just 30% of men;
and about 29% of men call themselves Republicans compared with about 23%
of women. The crosstab shows that men are slightly more likely to be independents,
but the difference is so small (about 2 percentage points) that a difference in
the population is unlikely.”
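The percentages in a statement like this come from a crosstab in which each cell is expressed as a share of its gender group. Here is a minimal sketch of that calculation. The counts are hypothetical (they are not the survey data behind the figures above) and were chosen only so that the results roughly mirror the quoted percentages; the function name `percent_within_group` is likewise just an illustration.

```python
# Hypothetical counts, NOT the actual survey data; chosen only to
# roughly mirror the percentages quoted in the example statement.
counts = {
    "Women": {"Democrat": 390, "Independent": 380, "Republican": 230},
    "Men": {"Democrat": 300, "Independent": 410, "Republican": 290},
}


def percent_within_group(table):
    """Convert each group's counts to percentages of that group's total."""
    result = {}
    for group, row in table.items():
        total = sum(row.values())
        result[group] = {party: 100 * n / total for party, n in row.items()}
    return result


percentages = percent_within_group(counts)

# Print one line per gender so the groups can be compared category by category.
for group, row in percentages.items():
    cells = ", ".join(f"{party}: {pct:.0f}%" for party, pct in row.items())
    print(f"{group}: {cells}")
```

We percentage within the categories of the independent variable (gender) and then compare across them, which is what lets us say that women are more likely than men to call themselves Democrats.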

**Step 2:** Identify the type and direction (if applicable)
of the relationship.

Ex: In the example above, gender is nominal and party
identification is ordinal. Because one of the variables is nominal, the
relationship has no direction, so it is a correlative relationship.

**Step 3:** Determine if the relationship is statistically
significant, i.e., whether we can reject the null hypothesis (that there is
no relationship between the variables) and generalize the relationship to the population.

Ex: We use chi-square to determine the statistical significance of a relationship
with at least one nominal variable. Looking at the chi-square table,
we see that the relationship is significant beyond the .05 level.
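To make the chi-square test concrete, here is a minimal sketch that computes the statistic by hand and compares it with the critical value for the .05 level. The counts are hypothetical and are not the data behind this example; the function name `chi_square` is my own. The critical value 5.991 is the standard table value for 2 degrees of freedom at the .05 level.

```python
# Hypothetical 2 x 3 crosstab, NOT the actual survey data.
# Rows: women, men. Columns: Democrat, Independent, Republican.
observed = [
    [390, 380, 230],
    [300, 410, 290],
]


def chi_square(table):
    """Return the Pearson chi-square statistic and its degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Count we would expect in this cell if the null hypothesis held.
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Critical value from a standard chi-square table: .05 level, 2 df.
CRITICAL_05_DF2 = 5.991

stat, df = chi_square(observed)
significant = stat > CRITICAL_05_DF2
print(f"chi-square = {stat:.2f}, df = {df}, significant at .05: {significant}")
```

If the statistic exceeds the critical value, we reject the null hypothesis of no relationship. Statistical software reports the same decision as a p-value below .05.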

**Step 4:** Identify the strength of the relationship, i.e.
the degree to which the values of the independent variable explain the
variation in the dependent variable.

Ex: We can use Lambda or Cramér’s V to measure strength. On a scale
from 0 to 1, a symmetric value of .01 for Lambda and a value of .098 for
Cramér’s V place the strength on the low end of the scale. Thus, this is
not a strong relationship, although it is statistically significant.
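The two strength measures can be sketched as follows. Both are computed from a hypothetical crosstab rather than the data behind this example, so the output will not match the .01 and .098 reported above. The function names are my own. Cramér's V rescales chi-square by the sample size and table dimensions, and the symmetric form of Goodman and Kruskal's lambda measures how much knowing one variable reduces errors in guessing the other.

```python
import math

# Hypothetical 2 x 3 crosstab, NOT the actual survey data.
# Rows: women, men. Columns: Democrat, Independent, Republican.
observed = [
    [390, 380, 230],
    [300, 410, 290],
]


def cramers_v(table):
    """Cramér's V: sqrt(chi-square / (N * (min(rows, cols) - 1)))."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (obs - expected) ** 2 / expected
    smaller_dim = min(len(table), len(table[0])) - 1
    return math.sqrt(chi2 / (n * smaller_dim))


def symmetric_lambda(table):
    """Symmetric Goodman-Kruskal lambda for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Correct guesses when we know the other variable's category.
    row_maxes = sum(max(row) for row in table)
    col_maxes = sum(max(col) for col in zip(*table))
    # Correct guesses when we only know each variable's modal category.
    baseline = max(row_totals) + max(col_totals)
    return (row_maxes + col_maxes - baseline) / (2 * n - baseline)


v = cramers_v(observed)
lam = symmetric_lambda(observed)
print(f"Cramér's V = {v:.3f}, symmetric lambda = {lam:.3f}")
```

Both measures run from 0 (no relationship) to 1 (perfect relationship), so values this close to 0 indicate a weak relationship even when the chi-square test is significant.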