October 21, 2024
library(tidyverse) # for filter(), select(), mutate(), glimpse()
library(vdemdata)  # assumed source of the vdem data frame

model_data <- vdem |>
  filter(year == 2006) |>
  select(country_name,
         libdem = v2x_libdem,
         wealth = e_gdppc,
         oil_rents = e_total_oil_income_pc,
         polarization = v2cacamps,
         corruption = v2x_corr,
         judicial_review = v2jureview_ord,
         region = e_regionpol_6C,
         regime = v2x_regime) |>
  mutate(
    # attach readable labels to the six region codes
    region = factor(
      region,
      labels = c("Eastern Europe",
                 "Latin America",
                 "MENA",
                 "SSAfrica",
                 "Western Europe and North America",
                 "Asia and Pacific"))
  )
glimpse(model_data)
library(broom) # for tidy() function
lm(libdem ~ polarization, data = model_data) |>
  tidy() # for nicer regression output
# A tibble: 2 × 5
  term         estimate std.error statistic  p.value
  <chr>           <dbl>     <dbl>     <dbl>    <dbl>
1 (Intercept)    0.376     0.0187     20.1  2.66e-47
2 polarization  -0.0914    0.0139     -6.58 5.30e-10
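Reading the estimate column off this table gives the fitted bivariate line. This is only a restatement of the output above, with coefficients rounded to two decimals:
\[ \hat{Y_i} = 0.38 - 0.09*Polarization \]
On this reading, a one-unit increase in polarization is associated with a liberal democracy score that is about 0.09 lower, and 0.38 is the predicted score when polarization equals 0.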
\[ \hat{Y_i} = a + b_1*Polarization + b_2*GDPpc \]
\[ \hat{Y_i} = 0.18 - 0.05*Polarization + 0.10*GDPpc \]
\(a\) is the predicted level of Y when both GDP per capita and polarization are equal to 0.
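As a quick check on that reading, setting both predictors to 0 in the fitted equation leaves only the intercept. The same algebra suggests how to read a slope: raising polarization by one unit while keeping GDP per capita fixed changes the prediction by exactly \(b_1\).
\[ \hat{Y_i} = 0.18 - 0.05*0 + 0.10*0 = 0.18 \]
\[ [a + b_1*(Polarization + 1) + b_2*GDPpc] - [a + b_1*Polarization + b_2*GDPpc] = b_1 = -0.05 \]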
\[ \hat{Y_i} = a + b_1*Polarization + b_2*GDPpc + b_3*OilRents \]
# A tibble: 4 × 5
  term           estimate  std.error statistic  p.value
  <chr>             <dbl>      <dbl>     <dbl>    <dbl>
1 (Intercept)   0.153      0.0317         4.82 3.30e- 6
2 polarization -0.0577     0.0128        -4.50 1.32e- 5
3 log(wealth)   0.131      0.0150         8.72 3.65e-15
4 oil_rents    -0.0000413  0.00000607    -6.80 1.98e-10
\[ \hat{Y_i} = 0.15 - 0.06*Polarization + 0.13*GDPpc - 0.00004*OilRents \]
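The oil rents coefficient looks negligible mainly because the variable is measured in small units. A rough rescaling, using the estimate from the table and a hypothetical 1,000-unit increase in oil rents per capita (the size of the increase is chosen only for illustration):
\[ -0.0000413 * 1000 = -0.0413 \]
So a 1,000-unit increase in oil rents per capita is associated with a predicted democracy score about 0.04 lower, holding polarization and GDP per capita constant.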
# A tibble: 4 × 5
  term         estimate std.error statistic  p.value
  <chr>           <dbl>     <dbl>     <dbl>    <dbl>
1 (Intercept)    0.962     0.0278     34.6  1.25e-78
2 libdem        -0.618     0.0574    -10.8  6.40e-21
3 polarization   0.0298    0.0108      2.75 6.66e- 3
4 log(wealth)   -0.0864    0.0125     -6.89 1.07e-10
factor() to convert a variable to a factor
levels() to see the categories
relevel() to change the reference category
Judicial Review:
\[ \widehat{Democracy_{i}} = 0.17 + 0.28*JudicialReview(yes) \]
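The three functions listed above, and the way a dummy coefficient is read, can be sketched on a tiny example. The data frame `toy` and its values are hypothetical and are not taken from V-Dem; the point is only that with a single two-category predictor, the intercept is the mean of the reference group and the coefficient is the difference between the two group means.

```r
# Toy data (hypothetical values, not from V-Dem)
toy <- data.frame(
  democracy = c(0.10, 0.20, 0.30, 0.40, 0.50, 0.60),
  judicial_review = c("No", "No", "No", "Yes", "Yes", "Yes")
)

# factor(): convert the character column to a factor
toy$judicial_review <- factor(toy$judicial_review)

# levels(): the first level listed is the reference category
levels(toy$judicial_review) # "No" "Yes"

# Intercept = mean of the reference group ("No");
# coefficient = mean of "Yes" minus mean of "No"
fit <- lm(democracy ~ judicial_review, data = toy)
coef(fit)

# Group means for comparison with the coefficients
group_means <- tapply(toy$democracy, toy$judicial_review, mean)
group_means

# relevel(): make "Yes" the reference category instead
toy$jr_yes_ref <- relevel(toy$judicial_review, ref = "Yes")
fit_yes <- lm(democracy ~ jr_yes_ref, data = toy)
coef(fit_yes) # intercept is now the "Yes" mean; the sign of the gap flips
```

Read the same way, the judicial review equation above says countries without judicial review have a predicted score of 0.17 and countries with it have 0.17 + 0.28 = 0.45.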
How should we interpret the intercept? How about the coefficient on Latin America?
# A tibble: 6 × 5
  term                                   estimate std.error statistic  p.value
  <chr>                                     <dbl>     <dbl>     <dbl>    <dbl>
1 (Intercept)                              0.434     0.0361     12.0  1.41e-24
2 regionLatin America                      0.0664    0.0535      1.24 2.16e- 1
3 regionMENA                              -0.236     0.0571     -4.13 5.63e- 5
4 regionSSAfrica                          -0.139     0.0456     -3.06 2.61e- 3
5 regionWestern Europe and North America   0.376     0.0541      6.94 7.84e-11
6 regionAsia and Pacific                  -0.134     0.0519     -2.57 1.09e- 2
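One way to read this table, assuming the omitted category is Eastern Europe (the first label in the region factor): the intercept is the predicted score for that baseline region, and each coefficient is the gap between a region and the baseline. A worked version for a few regions, using the estimates above:
\[ \widehat{Democracy}_{\text{Eastern Europe}} = 0.434 \]
\[ \widehat{Democracy}_{\text{Latin America}} = 0.434 + 0.0664 \approx 0.500 \]
\[ \widehat{Democracy}_{\text{MENA}} = 0.434 - 0.236 = 0.198 \]
\[ \widehat{Democracy}_{\text{SSAfrica}} = 0.434 - 0.139 = 0.295 \]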
What if you want a different baseline category? How do we interpret the coefficients now?
# make SS Africa the reference category
model_data <- model_data |>
  mutate(newReg = relevel(region, ref = "SSAfrica")) # "SSAfrica" is the 4th level
lm(libdem ~ newReg, data = model_data) |>
  tidy()
# A tibble: 6 × 5
  term                                   estimate std.error statistic  p.value
  <chr>                                     <dbl>     <dbl>     <dbl>    <dbl>
1 (Intercept)                             0.295      0.0279    10.6   2.24e-20
2 newRegEastern Europe                    0.139      0.0456     3.06  2.61e- 3
3 newRegLatin America                     0.206      0.0484     4.25  3.47e- 5
4 newRegMENA                             -0.0962     0.0523    -1.84  6.74e- 2
5 newRegWestern Europe and North America  0.515      0.0491    10.5   3.36e-20
6 newRegAsia and Pacific                  0.00582    0.0466     0.125 9.01e- 1
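Changing the reference category changes what each coefficient is compared against, not the predicted regional means. A quick check with the estimates in this table: the intercept is now the predicted score for SSAfrica, and adding the Eastern Europe coefficient recovers the intercept of the earlier region model (0.434).
\[ \widehat{Democracy}_{\text{SSAfrica}} = 0.295 \]
\[ \widehat{Democracy}_{\text{Eastern Europe}} = 0.295 + 0.139 = 0.434 \]
\[ \widehat{Democracy}_{\text{Latin America}} = 0.295 + 0.206 = 0.501 \]
The Latin America value matches the earlier model (0.434 + 0.0664) up to rounding.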
Which types of regime have more corruption?
V-Dem also includes a categorical regime variable: Closed Autocracy (0), Electoral Autocracy (1), Electoral Democracy (2), Liberal Democracy (3).
First, let’s make this an easier factor variable to work with.
# Make nicer regime factor variable
model_data <- model_data |>
  mutate(regime = factor(regime,
                         labels = c("Closed Autocracy",
                                    "Electoral Autocracy",
                                    "Electoral Democracy",
                                    "Liberal Democracy")))
levels(model_data$regime)
levels(model_data$regime)
[1] "Closed Autocracy" "Electoral Autocracy" "Electoral Democracy"
[4] "Liberal Democracy"
# A tibble: 4 × 5
  term                      estimate std.error statistic  p.value
  <chr>                        <dbl>     <dbl>     <dbl>    <dbl>
1 (Intercept)                 0.598     0.0397     15.1  2.81e-33
2 regimeElectoral Autocracy   0.151     0.0475      3.18 1.74e- 3
3 regimeElectoral Democracy  -0.0606    0.0484     -1.25 2.11e- 1
4 regimeLiberal Democracy    -0.469     0.0502     -9.35 4.46e-17
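A possible reading of this table, assuming the outcome is the corruption index (higher values meaning more corruption) and the reference category is Closed Autocracy, the first level of the regime factor. Predicted corruption by regime type is then the intercept plus the relevant coefficient:
\[ \widehat{Corruption}_{\text{Closed Autocracy}} = 0.598 \]
\[ \widehat{Corruption}_{\text{Electoral Autocracy}} = 0.598 + 0.151 = 0.749 \]
\[ \widehat{Corruption}_{\text{Electoral Democracy}} = 0.598 - 0.0606 \approx 0.537 \]
\[ \widehat{Corruption}_{\text{Liberal Democracy}} = 0.598 - 0.469 = 0.129 \]
Under these assumptions, electoral autocracies have the highest predicted corruption and liberal democracies the lowest. The gap between electoral democracies and closed autocracies has a p-value of about 0.21, so it is not distinguishable from zero at conventional levels.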