Compare Models

Now we’re in the home stretch. We’ve gone through all of these steps to get here: the point where we can now draw conclusions about human decision-making and how it is influenced by their social preferences.

Lesson

Elijah Galvan

September 1, 2023

8 min read

Goal During this Stage

We’re going to compare models against each other now. While we’ve already determined what the best model is, we haven’t yet tested the hypotheses that it embodies.

How to Achieve this Goal

Hypothesis Testing via Model Comparison

We’re going to compare MFIs for various models within subjects using uncorrected paired t-tests. Remember that we have to exclude any subjects for whom the model perfectly explains behavior. Also, if we hypothesized that our favored model would outperform other models, we should use directional tests.

Note

You should never correct for multiple comparisons when testing computational models against each other. Your model fits the data best: you’re testing the best model, and the best model only, against all of the other models in your set. One of those models in your model set will have the lowest lower boundary of their confidence interval - this is the model we want to test against (this is usually the second-best performing model but not necessarily). There is no reason to compromise our power when we are making multiple comparisons only to facilitate testing your model against the best challenger.

excluded1 = which(subjectData$modelSS == 0 | altSubjectData$modelAlt1SS == 0)
excluded2 = which(subjectData$modelSS == 0 | altSubjectData$mode2Alt1SS == 0)

t.test(subjectData$modelAIC[-excluded1], altSubjectData$modelAlt1AIC[-excluded1], paired = T, alternative = 'less') #favored model should be less than the other model (i.e. better model fit)
t.test(subjectData$modelAIC[-excluded2], altSubjectData$modelAlt2AIC[-excluded2], paired = T, alternative = 'less')
excluded1 = find(subjectData.modelSS == 0 | altSubjectData.modelAlt1SS == 0);
excluded2 = find(subjectData.modelSS == 0 | altSubjectData.modelAlt2SS == 0);

[~, p1] = ttest(subjectData.modelAIC(setdiff(1:end, excluded1)), altSubjectData.modelAlt1AIC(setdiff(1:end, excluded1)), 'Tail', 'left');
[~, p2] = ttest(subjectData.modelAIC(setdiff(1:end, excluded2)), altSubjectData.modelAlt2AIC(setdiff(1:end, excluded2)), 'Tail', 'left');
from scipy.stats import ttest_rel

excluded1 = np.where((subjectData['modelSS'] == 0) | (altSubjectData['modelAlt1SS'] == 0))[0]
excluded2 = np.where((subjectData['modelSS'] == 0) | (altSubjectData['modelAlt2SS'] == 0))[0]

_, p1 = ttest_rel(subjectData['modelAIC'][np.setdiff1d(np.arange(len(subjectData)), excluded1)],
                altSubjectData['modelAlt1AIC'][np.setdiff1d(np.arange(len(altSubjectData)), excluded1)],
                alternative='less')

_, p2 = ttest_rel(subjectData['modelAIC'][np.setdiff1d(np.arange(len(subjectData)), excluded2)],
                altSubjectData['modelAlt2AIC'][np.setdiff1d(np.arange(len(altSubjectData)), excluded2)],
                alternative='less')

Tutorials

Tutorial 1 - van Baar, Chang, & Sanfey, 2019

Hypothesis Testing via Model Comparison
t.test(subjectData$modelAIC, altSubjectData$modelAICGreed, paired = T) #negative mean difference/t value means that first term is lower than second term and lower MFI is better
t.test(subjectData$modelAIC, altSubjectData$modelAICGuilt, paired = T)
t.test(subjectData$modelAIC, altSubjectData$modelAICInequity, paired = T)

library(ggsignif)
aic = c(mean(subjectData$modelAIC), mean(altSubjectData$modelAICGreed), mean(altSubjectData$modelAICGuilt), mean(altSubjectData$modelAICInequity))
qplot(y = aic,
    x = as.factor(c('Moral Strategies Model', 'Greed Model', 'Guilt Model', 'Inequity Model')),
    fill = as.factor(c('Moral Strategies Model', 'Greed Model', 'Guilt Model', 'Inequity Model')),
    color = '',
    geom = 'col') +
    labs(x = 'Model', y = 'AIC', fill = NULL, color = NULL) +
    theme_minimal() + scale_color_manual(values = c(rgb(0, 0, 0, maxColorValue = 255))) +
    scale_fill_manual(values = c(rgb(132.5, 132.5, 132.5, maxColorValue = 255),
                                rgb(132.5, 132.5, 132.5, maxColorValue = 255),
                                rgb(132.5, 132.5, 132.5, maxColorValue = 255),
                                rgb(218, 165, 32, maxColorValue = 255))) +
    geom_signif(comparisons = list(c('Moral Strategies Model', 'Greed Model')), y = 341, textsize = 0)+
    geom_signif(comparisons = list(c('Moral Strategies Model', 'Guilt Model')), y = 328, textsize = 0)+
    geom_signif(comparisons = list(c('Moral Strategies Model', 'Inequity Model')), y = 315, textsize = 0) +
    annotate("text", x = 2.5, label = "***", y = 358, size = 4)+ # 3 for p < 0.001, 2 for p < 0.01, 1 for p < 0.05 usually
    annotate("text", x = 3, label = "***", y = 345, size = 4)+
    annotate("text", x = 3.5, label = "***", y = 332, size = 4)
ttest(subjectData.modelAIC, altSubjectData.modelAICGreed, 'Paired', true);
ttest(subjectData.modelAIC, altSubjectData.modelAICGuilt, 'Paired', true);
ttest(subjectData.modelAIC, altSubjectData.modelAICInequity, 'Paired', true);

aic = [mean(subjectData.modelAIC), mean(altSubjectData.modelAICGreed), mean(altSubjectData.modelAICGuilt), mean(altSubjectData.modelAICInequity)];
figure;
bar(categorical({'Moral Strategies Model', 'Greed Model', 'Guilt Model', 'Inequity Model'}), aic, 'FaceColor', [0.85 0.85 0.85], 'EdgeColor', 'none');
ylabel('AIC');
xlabel('Model');
ylim([300 360]);
hold on;
text(2.5, 358, '***', 'HorizontalAlignment', 'center', 'VerticalAlignment', 'bottom', 'FontSize', 14);
text(3, 345, '***', 'HorizontalAlignment', 'center', 'VerticalAlignment', 'bottom', 'FontSize', 14);
text(3.5, 332, '***', 'HorizontalAlignment', 'center', 'VerticalAlignment', 'bottom', 'FontSize', 14);
hold off;
ttest_rel(subjectData['modelAIC'], altSubjectData['modelAICGreed'])
ttest_rel(subjectData['modelAIC'], altSubjectData['modelAICGuilt'])
ttest_rel(subjectData['modelAIC'], altSubjectData['modelAICInequity'])

aic = [np.mean(subjectData['modelAIC']), np.mean(altSubjectData['modelAICGreed']), np.mean(altSubjectData['modelAICGuilt']), np.mean(altSubjectData['modelAICInequity'])]
plt.figure();
plt.bar(['Moral Strategies Model', 'Greed Model', 'Guilt Model', 'Inequity Model'], aic, color=[0.85, 0.85, 0.85], edgecolor='none');
plt.ylabel('AIC');
plt.xlabel('Model');
plt.ylim([300, 360]);
plt.text(2.5, 358, '***', ha='center', va='bottom', fontsize=14);
plt.text(3, 345, '***', ha='center', va='bottom', fontsize=14);
plt.text(3.5, 332, '***', ha='center', va='bottom', fontsize=14);
plt.show();

Tutorial 2 - Galvan & Sanfey, 2024

Hypothesis Testing via Model Comparison
alpha = 0.05/length(conditions)

# Merit Condition
t.test(subjectData$AICMerit[-excludedM], altSubjectData$AICPMerit[-excludedM], paired = T, conf.level = (1 - alpha), alternative = 'less')
t.test(subjectData$AICMerit[-excludedM], altSubjectData$AICLMerit[-excludedM], paired = T, conf.level = (1 - alpha), alternative = 'less')
t.test(subjectData$AICMerit[-excludedM], altSubjectData$AICTMerit[-excludedM], paired = T, conf.level = (1 - alpha), alternative = 'less')
t.test(subjectData$AICMerit[-excludedM], altSubjectData$AICPLMerit[-excludedM], paired = T, conf.level = (1 - alpha), alternative = 'less')
t.test(subjectData$AICMerit[-excludedM], altSubjectData$AICPTMerit[-excludedM], paired = T, conf.level = (1 - alpha), alternative = 'less')
t.test(subjectData$AICMerit[-excludedM], altSubjectData$AICLTMerit[-excludedM], paired = T, conf.level = (1 - alpha), alternative = 'less')


# Corruption Condition
t.test(subjectData$AICEntitlement[-excludedE], altSubjectData$AICPEntitlement[-excludedE], paired = T, conf.level = (1 - alpha), alternative = 'less')
t.test(subjectData$AICEntitlement[-excludedE], altSubjectData$AICLEntitlement[-excludedE], paired = T, conf.level = (1 - alpha), alternative = 'less')
t.test(subjectData$AICEntitlement[-excludedE], altSubjectData$AICTEntitlement[-excludedE], paired = T, conf.level = (1 - alpha), alternative = 'less')
t.test(subjectData$AICEntitlement[-excludedE], altSubjectData$AICPLEntitlement[-excludedE], paired = T, conf.level = (1 - alpha), alternative = 'less')
t.test(subjectData$AICEntitlement[-excludedE], altSubjectData$AICPTEntitlement[-excludedE], paired = T, conf.level = (1 - alpha), alternative = 'less')
t.test(subjectData$AICEntitlement[-excludedE], altSubjectData$AICLTEntitlement[-excludedE], paired = T, conf.level = (1 - alpha), alternative = 'less')

# Corruption Condition
t.test(subjectData$AICCorruption[-excludedC], altSubjectData$AICPCorruption[-excludedC], paired = T, conf.level = (1 - alpha), alternative = 'less')
t.test(subjectData$AICCorruption[-excludedC], altSubjectData$AICLCorruption[-excludedC], paired = T, conf.level = (1 - alpha), alternative = 'less')
t.test(subjectData$AICCorruption[-excludedC], altSubjectData$AICTCorruption[-excludedC], paired = T, conf.level = (1 - alpha), alternative = 'less')
t.test(subjectData$AICCorruption[-excludedC], altSubjectData$AICPLCorruption[-excludedC], paired = T, conf.level = (1 - alpha), alternative = 'less')
t.test(subjectData$AICCorruption[-excludedC], altSubjectData$AICPTCorruption[-excludedC], paired = T, conf.level = (1 - alpha), alternative = 'less')
t.test(subjectData$AICCorruption[-excludedC], altSubjectData$AICLTCorruption[-excludedC], paired = T, conf.level = (1 - alpha), alternative = 'less')

# Luck Condition
t.test(subjectData$AICLuck[-excludedL], altSubjectData$AICPLuck[-excludedL], paired = T, conf.level = (1 - alpha), alternative = 'less')
t.test(subjectData$AICLuck[-excludedL], altSubjectData$AICLLuck[-excludedL], paired = T, conf.level = (1 - alpha), alternative = 'less')
t.test(subjectData$AICLuck[-excludedL], altSubjectData$AICTLuck[-excludedL], paired = T, conf.level = (1 - alpha), alternative = 'less')
t.test(subjectData$AICLuck[-excludedL], altSubjectData$AICPLLuck[-excludedL], paired = T, conf.level = (1 - alpha), alternative = 'less')
t.test(subjectData$AICLuck[-excludedL], altSubjectData$AICPTLuck[-excludedL], paired = T, conf.level = (1 - alpha), alternative = 'less')
t.test(subjectData$AICLuck[-excludedL], altSubjectData$AICLTLuck[-excludedL], paired = T, conf.level = (1 - alpha), alternative = 'less')
alpha = 0.05 / length(conditions);

% Merit Condition
for i = 1:length(conditions)
    ttest(subjectData.AICMerit(~excludedM), altSubjectData.AICPMerit(~excludedM), 'Tail', 'left', 'Alpha', alpha);
    ttest(subjectData.AICMerit(~excludedM), altSubjectData.AICLMerit(~excludedM), 'Tail', 'left', 'Alpha', alpha);
    ttest(subjectData.AICMerit(~excludedM), altSubjectData.AICTMerit(~excludedM), 'Tail', 'left', 'Alpha', alpha);
    ttest(subjectData.AICMerit(~excludedM), altSubjectData.AICPLMerit(~excludedM), 'Tail', 'left', 'Alpha', alpha);
    ttest(subjectData.AICMerit(~excludedM), altSubjectData.AICPTMerit(~excludedM), 'Tail', 'left', 'Alpha', alpha);
    ttest(subjectData.AICMerit(~excludedM), altSubjectData.AICLTMerit(~excludedM), 'Tail', 'left', 'Alpha', alpha);
end

% Entitlement Condition
for i = 1:length(conditions)
    ttest(subjectData.AICEntitlement(~excludedE), altSubjectData.AICPEntitlement(~excludedE), 'Tail', 'left', 'Alpha', alpha);
    ttest(subjectData.AICEntitlement(~excludedE), altSubjectData.AICLEntitlement(~excludedE), 'Tail', 'left', 'Alpha', alpha);
    ttest(subjectData.AICEntitlement(~excludedE), altSubjectData.AICTEntitlement(~excludedE), 'Tail', 'left', 'Alpha', alpha);
    ttest(subjectData.AICEntitlement(~excludedE), altSubjectData.AICPLEntitlement(~excludedE), 'Tail', 'left', 'Alpha', alpha);
    ttest(subjectData.AICEntitlement(~excludedE), altSubjectData.AICPTEntitlement(~excludedE), 'Tail', 'left', 'Alpha', alpha);
    ttest(subjectData.AICEntitlement(~excludedE), altSubjectData.AICLTEntitlement(~excludedE), 'Tail', 'left', 'Alpha', alpha);
end

% Corruption Condition
for i = 1:length(conditions)
    ttest(subjectData.AICCorruption(~excludedC), altSubjectData.AICPCorruption(~excludedC), 'Tail', 'left', 'Alpha', alpha);
    ttest(subjectData.AICCorruption(~excludedC), altSubjectData.AICLCorruption(~excludedC), 'Tail', 'left', 'Alpha', alpha);
    ttest(subjectData.AICCorruption(~excludedC), altSubjectData.AICTCorruption(~excludedC), 'Tail', 'left', 'Alpha', alpha);
    ttest(subjectData.AICCorruption(~excludedC), altSubjectData.AICPLCorruption(~excludedC), 'Tail', 'left', 'Alpha', alpha);
    ttest(subjectData.AICCorruption(~excludedC), altSubjectData.AICPTCorruption(~excludedC), 'Tail', 'left', 'Alpha', alpha);
    ttest(subjectData.AICCorruption(~excludedC), altSubjectData.AICLTCorruption(~excludedC), 'Tail', 'left', 'Alpha', alpha);
end

% Luck Condition
for i = 1:length(conditions)
    ttest(subjectData.AICLuck(~excludedL), altSubjectData.AICPLuck(~excludedL), 'Tail', 'left', 'Alpha', alpha);
    ttest(subjectData.AICLuck(~excludedL), altSubjectData.AICLLuck(~excludedL), 'Tail', 'left', 'Alpha', alpha);
    ttest(subjectData.AICLuck(~excludedL), altSubjectData.AICTLuck(~excludedL), 'Tail', 'left', 'Alpha', alpha);
    ttest(subjectData.AICLuck(~excludedL), altSubjectData.AICPLLuck(~excludedL), 'Tail', 'left', 'Alpha', alpha);
    ttest(subjectData.AICLuck(~excludedL), altSubjectData.AICPTLuck(~excludedL), 'Tail', 'left', 'Alpha', alpha);
    ttest(subjectData.AICLuck(~excludedL), altSubjectData.AICLTLuck(~excludedL), 'Tail', 'left', 'Alpha', alpha);
end
alpha <- 0.05 / length(conditions)

# Merit Condition
t.test(subjectData$AICMerit[-excludedM], altSubjectData$AICPMerit[-excludedM], paired = TRUE, conf.level = 1 - alpha, alternative = 'less')
t.test(subjectData$AICMerit[-excludedM], altSubjectData$AICLMerit[-excludedM], paired = TRUE, conf.level = 1 - alpha, alternative = 'less')
t.test(subjectData$AICMerit[-excludedM], altSubjectData$AICTMerit[-excludedM], paired = TRUE, conf.level = 1 - alpha, alternative = 'less')
t.test(subjectData$AICMerit[-excludedM], altSubjectData$AICPLMerit[-excludedM], paired = TRUE, conf.level = 1 - alpha, alternative = 'less')
t.test(subjectData$AICMerit[-excludedM], altSubjectData$AICPTMerit[-excludedM], paired = TRUE, conf.level = 1 - alpha, alternative = 'less')
t.test(subjectData$AICMerit[-excludedM], altSubjectData$AICLTMerit[-excludedM], paired = TRUE, conf.level = 1 - alpha, alternative = 'less')

# Entitlement Condition
t.test(subjectData$AICEntitlement[-excludedE], altSubjectData$AICPEntitlement[-excludedE], paired = TRUE, conf.level = 1 - alpha, alternative = 'less')
t.test(subjectData$AICEntitlement[-excludedE], altSubjectData$AICLEntitlement[-excludedE], paired = TRUE, conf.level = 1 - alpha, alternative = 'less')
t.test(subjectData$AICEntitlement[-excludedE], altSubjectData$AICTEntitlement[-excludedE], paired = TRUE, conf.level = 1 - alpha, alternative = 'less')
t.test(subjectData$AICEntitlement[-excludedE], altSubjectData$AICPLEntitlement[-excludedE], paired = TRUE, conf.level = 1 - alpha, alternative = 'less')
t.test(subjectData$AICEntitlement[-excludedE], altSubjectData$AICPTEntitlement[-excludedE], paired = TRUE, conf.level = 1 - alpha, alternative = 'less')
t.test(subjectData$AICEntitlement[-excludedE], altSubjectData$AICLTEntitlement[-excludedE], paired = TRUE, conf.level = 1 - alpha, alternative = 'less')

# Corruption Condition
t.test(subjectData$AICCorruption[-excludedC], altSubjectData$AICPCorruption[-excludedC], paired = TRUE, conf.level = 1 - alpha, alternative = 'less')
t.test(subjectData$AICCorruption[-excludedC], altSubjectData$AICLCorruption[-excludedC], paired = TRUE, conf.level = 1 - alpha, alternative = 'less')
t.test(subjectData$AICCorruption[-excludedC], altSubjectData$AICTCorruption[-excludedC], paired = TRUE, conf.level = 1 - alpha, alternative = 'less')
t.test(subjectData$AICCorruption[-excludedC], altSubjectData$AICPLCorruption[-excludedC], paired = TRUE, conf.level = 1 - alpha, alternative = 'less')
t.test(subjectData$AICCorruption[-excludedC], altSubjectData$AICPTCorruption[-excludedC], paired = TRUE, conf.level = 1 - alpha, alternative = 'less')
t.test(subjectData$AICCorruption[-excludedC], altSubjectData$AICLTCorruption[-excludedC], paired = TRUE, conf.level = 1 - alpha, alternative = 'less')

# Luck Condition
t.test(subjectData$AICLuck[-excludedL], altSubjectData$AICPLuck[-excludedL], paired = TRUE, conf.level = 1 - alpha, alternative = 'less')
t.test(subjectData$AICLuck[-excludedL], altSubjectData$AICLLuck[-excludedL], paired = TRUE, conf.level = 1 - alpha, alternative = 'less')
t.test(subjectData$AICLuck[-excludedL], altSubjectData$AICTLuck[-excludedL], paired = TRUE, conf.level = 1 - alpha, alternative = 'less')
t.test(subjectData$AICLuck[-excludedL], altSubjectData$AICPLLuck[-excludedL], paired = TRUE, conf.level = 1 - alpha, alternative = 'less')
t.test(subjectData$AICLuck[-excludedL], altSubjectData$AICPTLuck[-excludedL], paired = TRUE, conf.level = 1 - alpha, alternative = 'less')
t.test(subjectData$AICLuck[-excludedL], altSubjectData$AICLTLuck[-excludedL], paired = TRUE, conf.level = 1 - alpha, alternative = 'less')

Tutorial 3 - Crockett et al., 2014

Hypothesis Testing via Model Comparison
t.test(subjectData$BIC[-20], altSubjectData$BICM2[-20], paired = T)
t.test(subjectData$BIC[-20], altSubjectData$BICM3[-20], paired = T)
t.test(subjectData$BIC[-20], altSubjectData$BICM4[-20], paired = T)
t.test(subjectData$BIC[-20], altSubjectData$BICM5[-20], paired = T)

Tutorial 4 - Li et al., 2022

Hypothesis Testing via Model Comparison
t.test(altSubjectData$BIC_M5, subjectData$BIC, paired = T)
t.test(altSubjectData$BIC_M5, altSubjectData$BIC_M1, paired = T)
t.test(altSubjectData$BIC_M5, altSubjectData$BIC_M2, paired = T)
t.test(altSubjectData$BIC_M5, altSubjectData$BIC_M3, paired = T)
t.test(altSubjectData$BIC_M5, altSubjectData$BIC_M4, paired = T)
t.test(altSubjectData$BIC_M5, altSubjectData$BIC_M6, paired = T)