Final report: Quantitative and Qualitative Analysis
Name: Vu Hoang Dung
Student code: 17110077
Program: Public Policy – 2nd intake – Vietnam Japan University
Dependent variable: Internet users (per 100 people) - Y
GDP_PPP - GDP per capita, PPP (constant 2000 international $) - X1
Ser_import - Computer, communications and other services (% of commercial service imports) - X2
Ur_pop - Urban population (% of total) - X3
Sample size: 19 countries
Model 2: Deleting Japan
Model 3: Log GDP_PPP
* Method: There are 3 explanatory variables and a sample of 19 countries, so the model is unlikely
to be overfitted or underfitted. Stepwise selection is used to determine the appropriate model,
which will then determine the statistically
> vif = diag(solve(cor(x))) ; vif
Service import GDP_PPP Ur_pop
Model 1 is fitted with the ordinary stepwise method. The VIF check shows no multicollinearity
among these variables, so no variable needs to be removed. From the regression results table, we
can see that R2 is 0.7899, which is quite high, the significance levels are reasonably good, and the
AIC is 142.7617. However, in the graph, country number 9 (Japan) lies close to the line at value 1,
so it may influence the regression results.
Model 2 is obtained by deleting Japan.
> vif = diag(solve(cor(x))) ; vif
GDP_PPP Ser.import Ur_pop
4.202452 1.000479 4.202732
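As a rough illustration of what `diag(solve(cor(x)))` computes, the sketch below inverts a correlation matrix and reads the VIFs off its diagonal. It is written in Python with the standard library only. The correlation matrix is hypothetical and is not the report's data; it was chosen so that GDP_PPP and Ur_pop are strongly correlated while Ser_import is uncorrelated with both, which gives VIFs of a similar size to those shown above. The helper names `invert` and `vif_from_correlation` are also assumptions made for this example.

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    # The right-hand half of the augmented matrix is now the inverse.
    return [row[n:] for row in aug]


def vif_from_correlation(corr):
    """Return the VIFs: the diagonal of the inverse correlation matrix."""
    inverse = invert(corr)
    return [inverse[i][i] for i in range(len(corr))]


# Hypothetical correlation matrix (NOT the report's data).
names = ["GDP_PPP", "Ser_import", "Ur_pop"]
corr = [
    [1.000, 0.000, 0.873],
    [0.000, 1.000, 0.000],
    [0.873, 0.000, 1.000],
]

vif = vif_from_correlation(corr)
for name, value in zip(names, vif):
    print(f"{name}: {value:.3f}")
```

In this simplified case the VIF of each correlated variable reduces to 1 / (1 - r^2), where r is the correlation between the two variables, and the VIF of the uncorrelated variable is exactly 1.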
For Model 2, after deleting Japan, checking for multicollinearity and applying the stepwise
method, we obtain R2 = 0.795 and AIC = 135.4048. Comparing Model 1 and Model 2, we can see that
Model 2 appears to be better than Model 1. However, the p-values of its variables are higher than
0.05, so we take the logarithm of GDP_PPP and apply the stepwise method again to find Model 3.
Model 3 is created by the stepwise method, using the logarithm of GDP per capita.
             Estimate Std. Error t value Pr(>|t|)
(Intercept)   -61.786      9.169  -6.739 4.76e-06 ***
log(GDP_PPP)    9.915      1.193   8.309 3.38e-07 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 8.009 on 16 degrees of freedom
Multiple R-squared: 0.8119, Adjusted R-squared: 0.8001
F-statistic: 69.04 on 1 and 16 DF, p-value: 3.38e-07
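To make the Model 3 output easier to read, the sketch below turns the two estimates into a prediction function. It assumes that the slope of 9.915 belongs to the natural logarithm of GDP_PPP (the report does not state the base of the logarithm). The function name and the GDP values of 10,000 and 20,000 are hypothetical and serve only as an illustration.

```python
import math

# Estimates copied from the Model 3 output above.
INTERCEPT = -61.786
SLOPE = 9.915  # assumed to be the coefficient on the natural log of GDP_PPP


def predict_internet_users(gdp_ppp):
    """Predicted internet users per 100 people for a given GDP per capita (PPP)."""
    return INTERCEPT + SLOPE * math.log(gdp_ppp)


# Under a log specification, doubling GDP per capita shifts the prediction
# by the same amount regardless of the starting level.
doubling_effect = SLOPE * math.log(2)

low = predict_internet_users(10_000)   # hypothetical GDP per capita
high = predict_internet_users(20_000)  # twice the hypothetical value

print(f"Predicted users per 100 people at 10,000: {low:.2f}")
print(f"Predicted users per 100 people at 20,000: {high:.2f}")
print(f"Change when GDP per capita doubles: {doubling_effect:.2f}")
```

Under this assumption, doubling GDP per capita is associated with about 6.9 more internet users per 100 people, whatever the starting level of GDP.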
After comparing the 3 models, we can see that Model 3 is the best one: its t-values are
statistically significant, R-squared = 0.8119 and its AIC is the smallest. Although Model 3 has the
fewest variables, it can explain the relationship between GDP and internet penetration (per 100
people).
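The report does not print the AIC of Model 3, but it can be approximated from the printed output. The sketch below rebuilds the residual sum of squares from the residual standard error (8.009 on 16 degrees of freedom) and applies the Gaussian log-likelihood form of AIC. It assumes that the AIC values reported for Models 1 and 2 follow the same convention, which counts the error variance as one extra parameter. The function name is hypothetical, and the result is only as precise as the rounded figures in the output.

```python
import math


def gaussian_aic(rss, n_obs, n_coef):
    """AIC of a least-squares fit under a Gaussian likelihood.

    Counts n_coef regression coefficients plus one parameter
    for the error variance.
    """
    log_lik = -0.5 * n_obs * (math.log(2 * math.pi) + math.log(rss / n_obs) + 1)
    return -2 * log_lik + 2 * (n_coef + 1)


# Figures taken from the Model 3 output.
RESIDUAL_SE = 8.009
RESIDUAL_DF = 16
N_COEF = 2  # intercept and one slope

n_obs = RESIDUAL_DF + N_COEF        # 18 countries once Japan is removed
rss = RESIDUAL_SE ** 2 * RESIDUAL_DF  # residual sum of squares

aic_model3 = gaussian_aic(rss, n_obs, N_COEF)

# AIC values reported in the text for the other two models.
reported = {"Model 1": 142.7617, "Model 2": 135.4048}

print(f"Approximate AIC of Model 3: {aic_model3:.2f}")
for name, value in reported.items():
    print(f"{name}: {value:.4f}")
```

Under these assumptions the value comes out at roughly 129.9, below both reported values. Model 1 was fitted on 19 countries rather than 18, so its AIC is not strictly comparable with the other two.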
Internet users are defined as individuals who have access to the Internet at home, through
computers or mobile devices. As a result, the number of computers and mobile devices will
affect the number of internet users in each country.
Moreover, in the 2000s, internet access was mostly via computers. This led to an increase in
the import of computers and other communication devices, which was linked to the
proportion of Internet users in this period (Model 1).
In addition to imports, the percentage of computer users also depends on the share of the
population living in urban areas. In countries with a large urban population, the proportion of
internet users tends to be higher because of greater demand arising from work and living conditions.
Today, everyday use of the internet and its applications is indispensable. A country that
successfully applies technology to daily life will become more prosperous.
People will have higher living standards and incomes. According to statistics, countries with high
numbers of internet users, such as Australia, Hong Kong, Japan and Singapore, are all
high-income countries. So GDP and the number of internet users are closely related (Model 3).