Statistics with Crosstab Tables


A crosstab table is probably the most common way to visualize nominal (categorical) data. It is a table representing the distribution of the responses to two variables. A crosstab table can be 2 x 2 or n x m, as follows.

                Variable 2
                B1   B2
Variable 1  A1  16   29
            A2   …    …

                Variable 2
                B1   B2   B3   B4
Variable 1  A1   5    8   10    7
            A2   …    …    …    …

As I explain in the types of data page, there are not many things you can do with categorical data. However, a crosstab table helps you explore your categorical data a lot, and there are several statistics you can derive from a crosstab table. A chi-square test and other similar tests are among them. In this page, I explain other statistics you can do with a crosstab table.

Coefficients of Association (Phi-Coefficients, Contingency Coefficients and Cramer's V)

A coefficient of association is something like a correlation for categorical data. In other words, it represents how much the distribution of one variable changes depending on the other variable. Let's take a look at an example with a crosstab table.

              Device ownership
              Device A   Device B
Age   Young         20         10
      Older          3         27

This crosstab table shows the distribution of the ownership of the two devices separated by the users' ages. From this table, it looks like age affects the ownership of the device (i.e., younger users tend to own Device A and older users tend to own Device B). Thus, it seems that age and ownership are correlated. Now what we want to know is how much they are correlated. Unfortunately, we cannot calculate a correlation because the data are not ordinal, interval, or ratio. But there are three metrics we can use instead of a correlation.

  • Phi-Coefficient: If your crosstab table is 2 x 2, it becomes equal to the absolute value of Pearson's product-moment correlation. However, this coefficient depends on the size of the crosstab table (defined as min(n, m) of an n x m table), and ranges between 0 and the square root of (min(n, m) - 1).
  • Contingency Coefficient: This is also dependent on the size of the crosstab table, and ranges between 0 and the square root of (1 - 1 / min(n, m)).
  • Cramer's V: This is independent of the size of the crosstab table, and ranges between 0 and 1.

Although phi-coefficients and contingency coefficients are valid metrics, the problem with them is that you cannot use them to compare the strength of association across crosstab tables of different sizes. Thus, Cramer's V is generally the first choice for examining the association. You can calculate these values very easily in R, but you need to include the vcd package.

data <- matrix(c(20, 10, 3, 27), ncol=2, byrow=T)
library(vcd)
assocstats(data)

                    X^2 df   P(> X^2)
Likelihood Ratio 22.185  1 2.4762e-06
Pearson          20.376  1 6.3622e-06

Phi-Coefficient   : 0.583
Contingency Coeff.: 0.503
Cramer's V        : 0.583

Thus, Cramer's V is 0.58 in this example. You can calculate Cramer's V for n x m crosstab tables in the same way.
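If you want to see where these numbers come from, all three coefficients can also be computed by hand from the Pearson chi-square statistic using only base R. This is a sketch of the standard formulas, not something required in practice (the vcd package does the same work for you):

```r
# Compute the three coefficients of association directly from the
# Pearson chi-square statistic (correct=FALSE disables Yates' correction
# so the statistic matches the "Pearson" row of the assocstats output).
data <- matrix(c(20, 10, 3, 27), ncol=2, byrow=TRUE)
n    <- sum(data)
chi2 <- unname(chisq.test(data, correct=FALSE)$statistic)  # 20.376

phi <- sqrt(chi2 / n)                            # Phi-Coefficient
C   <- sqrt(chi2 / (chi2 + n))                   # Contingency Coefficient
V   <- sqrt(chi2 / (n * (min(dim(data)) - 1)))   # Cramer's V

round(c(phi = phi, C = C, V = V), 3)  # 0.583 0.503 0.583
```

For a 2 x 2 table, min(n, m) - 1 = 1, which is why the phi-coefficient and Cramer's V coincide here.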

Agreement and inter-rater reliability (Cohen's Kappa)

Cohen's Kappa for Nominal Data

Agreement is another metric you can derive from a crosstab table. This metric is used when two people look at the same data and categorize them. For instance, you have a bunch of quotes gained from interviews with your participants, and you and another researcher categorized them. You have several themes (categories or groups), and for one category, your categorization is as follows (“yes” means that the rater thinks the quote belongs to that category, and “no” means the rater doesn't think so).

              Rater 2
              Yes    No
Rater 1  Yes   35     5
         No     4   110

What you want to show is how well the two of you agreed on the categorization for this category. If you don't agree much, this categorization doesn't really have much power or is ambiguous. One metric for this is the agreement percentage, which is the ratio of the number of instances both raters agreed on (i.e., both said “yes” or both said “no”) over the total number of instances. You can easily calculate this manually.

(35 + 110) / (35 + 5 + 4 + 110)
[1] 0.9415584

Thus, you have 94% agreement. This seems fine, but the problem is that you have not removed the effects caused by randomness: you may have gotten a good result just by chance. To claim reproducibility, we want a metric which gets rid of the effects caused by randomness, and Cohen's Kappa is such a metric. It ranges from -1 to 1. You can easily calculate Cohen's Kappa in R.

data <- matrix(c(35, 5, 4, 110), ncol=2, byrow=T)
library(vcd)
Kappa(data)

               value        ASE
Unweighted 0.8467831 0.04955746
Weighted   0.8467831 0.08027348

Thus, Cohen's Kappa for this categorization is 0.85 (look at the value for Unweighted). Cohen's Kappa is usually smaller than the agreement percentage. ASE means Approximate Standard Error, and you can calculate an approximate 95% confidence interval as the value ± 1.96 * ASE. So in this case, the 95% confidence interval is [0.8467831 - 1.96 * 0.04955746, 0.8467831 + 1.96 * 0.04955746] = [0.75, 0.94].
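To see what Kappa actually removes, here is the same unweighted Kappa computed by hand in base R. It is a sketch of the standard formula: the observed agreement minus the agreement expected by chance, normalized by (1 - chance agreement):

```r
# Cohen's Kappa by hand for the 2 x 2 rater table above.
tab <- matrix(c(35, 5, 4, 110), ncol=2, byrow=TRUE)
n   <- sum(tab)
po  <- sum(diag(tab)) / n                      # observed agreement (= 0.9416)
pe  <- sum(rowSums(tab) * colSums(tab)) / n^2  # agreement expected by chance
kappa <- (po - pe) / (1 - pe)
round(kappa, 4)  # 0.8468
```

The chance agreement here is about 0.62, which is why Kappa (0.85) is noticeably smaller than the raw 94% agreement.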

You can see the magnitude of the agreement by the Kappa coefficient.

Kappa value   Magnitude of agreement
< 0           no
0 - 0.2       small
0.2 - 0.4     fair
0.4 - 0.6     moderate
0.6 - 0.8     substantial
0.8 - 1       almost perfect

In a practical situation, a Kappa coefficient over 0.6 suggests that your categorization is robust. If not, it suggests that there are categories which are ambiguous or on which the raters do not agree well. So, you may have to rethink your categorization.
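If you check many categories, a small helper of my own (not part of any package) that maps a Kappa value to the labels in the table above can be convenient:

```r
# Map a Kappa value to the magnitude-of-agreement label from the table above.
kappa.magnitude <- function(k) {
  if (k < 0) return("no")
  as.character(cut(k,
    breaks = c(0, 0.2, 0.4, 0.6, 0.8, 1),
    labels = c("small", "fair", "moderate", "substantial", "almost perfect"),
    include.lowest = TRUE))
}

kappa.magnitude(0.85)  # "almost perfect"
kappa.magnitude(0.40)  # "fair"
```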

If you do not have a crosstab table for the data, and instead have raw data (like 0 or 1 in one column for one rater, and another column for the other rater), it is probably easier to use the psy package. Here is a quick example of how to use it.

library(psy)
testdata <- rbind(c(1,1,1,1,1,0,0,0,0,0,1,1,1,1,1,0,0,0,0,0),
                  c(1,1,1,1,0,1,0,0,0,0,1,1,1,0,0,1,1,0,0,0))
testdata <- t(testdata)
ckappa(testdata)

$table
  0 1
0 7 3
1 3 7

$kappa
[1] 0.4
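If you prefer to avoid the psy package, the same result can be obtained in base R by cross-tabulating the two raters' raw codes with table() and applying the Kappa formula (a sketch):

```r
# Cohen's Kappa from two columns of raw codes, base R only.
r1 <- c(1,1,1,1,1,0,0,0,0,0,1,1,1,1,1,0,0,0,0,0)
r2 <- c(1,1,1,1,0,1,0,0,0,0,1,1,1,0,0,1,1,0,0,0)
tab <- table(r1, r2)                           # the 2 x 2 crosstab
n   <- sum(tab)
po  <- sum(diag(tab)) / n                      # observed agreement
pe  <- sum(rowSums(tab) * colSums(tab)) / n^2  # chance agreement
k   <- (po - pe) / (1 - pe)
k  # 0.4
```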

You can then calculate the confidence interval as well, but it is a bit more complicated because we use a bootstrap method. We will use the boot package.

library(boot)
ckappa.boot <- function(data, x) {ckappa(data[x,])[[2]]}
res <- boot(testdata, ckappa.boot, 1000)
boot.ci(res, type="bca")

BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 1000 bootstrap replicates

CALL :
boot.ci(boot.out = res, type = "bca")

Intervals :
Level       BCa
95%   (-0.0602,  0.7802 )
Calculations and Intervals on Original Scale

So the Kappa is 0.40 and its CI is [-0.06, 0.78]. I won't go into the details of this code, but it should work with a simple copy-and-paste. Please note that the CI estimated by the bootstrap method is usually different from the CI calculated with ASE. The bootstrap method is known to be more accurate, so I would recommend using it if possible.

Finally, you can report your Cohen's Kappa with its confidence interval as follows: The measured Cohen's Kappa for our results was 0.85 (95% CI: [0.75, 0.94]), indicating a strong agreement.

Weighted Cohen's Kappa for Ordinal Data

The above Cohen's Kappa is for nominal (categorical) data, which means that your dependent values do not have any specific order. But you may often have ordered values (such as subjective ratings) and want to know how much two raters agree on their ratings. Let's think about a hypothetical case in which two raters rate the quality of pictures taken by someone with three ratings: Good, OK, and Bad. Then you get the following results.

               Rater 2
               Good   OK   Bad
Rater 1  Good    40    3     2
         OK       …    …     …
         Bad      …    …     …

This is almost the same as the 2 x 2 crosstab tables above, so we can just use Cohen's Kappa to test the agreement (which is legitimate). The limitation of this naive Cohen's Kappa is that it treats all disagreements equally. But in this example, it is more natural to think that the disagreement between “Good” and “Bad” should carry more weight than the disagreement between “Good” and “OK”. To account for this, we can calculate the weighted Cohen's Kappa.

There are a number of ways to calculate such weights, but the most common one is the squared weight: the weights are proportional to the squared differences between the row and column indices in the crosstab table. We are going to use the psy package again. In this example code, I use 0, 1, and 2 to represent the ratings (so 0 for “Bad”, etc.).

library(psy)
testdata <- rbind(c(0,0,0,0,0,1,1,1,1,1,2,2,2,2,2,1,1,1,0,0),
                  c(0,0,0,0,2,1,1,1,1,0,2,2,2,2,1,2,2,1,1,0))
testdata <- t(testdata)
wkappa(testdata)

$table
  0 1 2
0 5 1 1
1 1 5 2
2 0 1 4

$weights
[1] "squared"

$kappa
[1] 0.6428571
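As a sanity check, the squared-weight Kappa can also be computed by hand in base R. This is a sketch of the standard formula wk = 1 - sum(w*O) / sum(w*E), where O is the observed crosstab, E is the chance-expected one, and w_ij = (i - j)^2:

```r
# Weighted Cohen's Kappa (squared weights) from the 3 x 3 crosstab above.
O <- matrix(c(5,1,1, 1,5,2, 0,1,4), ncol=3, byrow=TRUE)
n <- sum(O)
E <- outer(rowSums(O), colSums(O)) / n            # chance-expected counts
idx <- seq_len(nrow(O)) - 1                       # rating codes 0, 1, 2
w <- outer(idx, idx, function(i, j) (i - j)^2)    # squared disagreement weights
wk <- 1 - sum(w * O) / sum(w * E)
round(wk, 4)  # 0.6429
```

Note that perfectly agreeing cells get weight 0, and the "Good" vs "Bad" cells get weight 4, four times the weight of adjacent disagreements.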

And we are going to estimate the confidence interval with the boot package.

library(boot)
wkappa.boot <- function(data, x) {wkappa(data[x,])[[3]]}
res <- boot(testdata, wkappa.boot, 1000)
boot.ci(res, type="bca")

BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 1000 bootstrap replicates

CALL :
boot.ci(boot.out = res, type = "bca")

Intervals :
Level       BCa
95%   ( 0.1495,  0.8775 )
Calculations and Intervals on Original Scale
Some BCa intervals may be unstable

Thus, the weighted Cohen's Kappa is 0.64, and its 95% confidence interval is [0.15, 0.88]. You can report your weighted Cohen's Kappa with its confidence interval as follows: The measured weighted Cohen's Kappa (squared weights) for the ratings by the two raters was 0.64 (95% CI: [0.15, 0.88]), indicating a moderate agreement.


For Cohen's Kappa:

  • Cohen, Jacob (1960). “A coefficient of agreement for nominal scales”. Educational and Psychological Measurement 20 (1): 37–46.
  • Cohen, Jacob (1968). “Weighted kappa: nominal scale agreement with provision for scaled disagreement or partial credit”. Psychological Bulletin 70 (4): 213–220.


hcistats/crosstab.txt · Last modified: 2014/08/14 05:24 by Koji Yatani
