UNIVERSITY OF SCIENCE

------

FINAL EXAM

Subject: Statistics for Ecology

Topic: Analysis of presence or absence of species

Requirement:

The data to be analyzed is the data on the abundance of Faramea occidentalis (in attached text file). Please explain the influence of precipitation, altitude, age and geology parameters on the presence-absence of Faramea occidentalis species. The calculation and the numerical results are required.

Full name: Bui Thi Hao

Class: K55 of Advanced Program of Environmental Science

Student’s code: 1000739

Content:

A. Load the data from external file

B. Doing analysis

I. Explanation with single explanatory variable

a. Explain the influence of Age categories

b. Explain the influence of elevation (i.e. altitude)

c. Explain the influence of Precipitation

d. Explain the influence of Geology

II. Explanation with several explanatory variables

SOLUTION

A. Load the data from external file

B. Doing analysis

Because it gives a more comprehensive analysis of frequencies, I use a generalized linear model (GLM) with a binomial or quasi-binomial family, as appropriate.

I. Explanation with single explanatory variable

a. Explain the influence of Age on the presence-absence of species

Using BiodiversityR, I obtained the result above. It shows the coefficients of Age.categories (on the logit scale). More important, however, are the deviance and the Pr-value. According to these results, age.categories explains only 3.0793 of the 59.401 null deviance in the presence/absence of Faramea occidentalis (5.18%), which is very small. Moreover, the Pr-value of about 0.2508 in the ANOVA table is high, so there is no evidence that the coefficients of the age categories differ from zero. In other words, the data show no detectable effect of age category on the presence/absence of the species.

In conclusion, the age categories on their own do not contribute to explaining the presence/absence of Faramea occidentalis.

b. Explain the influence of elevation (i.e. altitude) on the presence-absence of species

Using BiodiversityR, I obtained the result above. It shows the coefficients of Elevation (on the logit scale), as well as the deviance and the Pr-value. According to these results, elevation explains only 9.9317 of the 59.401 null deviance in the presence/absence of Faramea occidentalis (16.72%), which is small. However, the Pr-value of about 0.0357 is below the usual 0.05 threshold, so there is evidence that the coefficient of elevation differs from zero. In other words, elevation does have a statistically detectable effect on the presence/absence of the species.

In conclusion, elevation on its own contributes to explaining the presence/absence of Faramea occidentalis (although the effect is neither clear nor strong, given the small explained deviance), according to the following fitted model (logit link function):

Logit(µ)= 1.0595-0.00784x = y

Where µ: the mean of presence/absence value

x: the elevation value (should be the mean value of certain interval)

µ= exp(y)/(1+exp(y))
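The fitted equation above can be evaluated directly to turn an elevation into a predicted probability of presence. The following is a minimal Python sketch, assuming the two coefficients quoted above; the function name `presence_probability` and the elevation values in the loop are hypothetical and chosen only for illustration.

```python
import math

# Coefficients reported for the single-variable elevation model.
INTERCEPT = 1.0595
SLOPE = -0.00784


def presence_probability(elevation: float) -> float:
    """Inverse logit of the fitted linear predictor y = INTERCEPT + SLOPE * x."""
    y = INTERCEPT + SLOPE * elevation
    # 1 / (1 + exp(-y)) is algebraically the same as exp(y) / (1 + exp(y)).
    return 1.0 / (1.0 + math.exp(-y))


# Hypothetical elevations, used only to show the downward trend.
for elevation in (0, 100, 200, 400):
    p = presence_probability(elevation)
    print(f"elevation {elevation:>3}: P(presence) = {p:.3f}")
```

Because the slope is negative, the predicted probability of presence falls as elevation rises.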

c. Explain the influence of Precipitation on the presence-absence of species

The result above shows the coefficients of precipitation (on the logit scale), as well as the deviance and the Pr-value. Accordingly, precipitation explains only 8.8406 of the 59.401 null deviance in the presence/absence of Faramea occidentalis (14.88%), which is small. However, the Pr-value of about 0.0172 is below the usual 0.05 threshold, so there is evidence that the coefficient of precipitation differs from zero. In other words, precipitation does have a statistically detectable effect on the presence/absence of the species.

In conclusion, precipitation on its own contributes to explaining the presence/absence of Faramea occidentalis (although the effect is neither clear nor strong, given the small explained deviance), according to the following fitted model (logit link function):

Logit(µ)= 6.9483-0.00272x = y

Where µ: the mean of presence/absence value

x: the precipitation value (should be the mean value of certain interval)

µ= exp(y)/(1+exp(y))
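Two quantities follow from the precipitation equation above by simple arithmetic: the precipitation at which the predicted probability of presence is 50% (where y = 0), and the factor by which the odds of presence change per 100 units of precipitation. The sketch below assumes only the two coefficients quoted above; the variable names and the 100-unit step are illustrative choices, not part of the original analysis.

```python
import math

# Coefficients reported for the single-variable precipitation model.
INTERCEPT = 6.9483
SLOPE = -0.00272

# Precipitation at which the linear predictor is zero, i.e. mu = 0.5.
half_point = -INTERCEPT / SLOPE

# Multiplicative change in the odds of presence for a 100-unit increase
# in precipitation (the step size is an arbitrary illustrative choice).
odds_ratio_per_100 = math.exp(SLOPE * 100)

print(f"mu = 0.5 at precipitation of about {half_point:.1f}")
print(f"odds multiplier per 100 units: {odds_ratio_per_100:.3f}")
```

An odds multiplier below 1 reflects the negative coefficient: wetter sites have lower predicted odds of presence under this model.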

d. Explain the influence of Geology on the presence-absence of species

The result above shows the coefficients of geology (on the logit scale), as well as the deviance and the Pr-value. Accordingly, geology explains 25.548 of the 59.401 null deviance in the presence/absence of Faramea occidentalis (43%), which is a noticeable share. Moreover, the Pr-value of about 0.002027 in the ANOVA table is very low, so there is evidence that the coefficients of geology differ from zero. In other words, geology does have an effect on the presence/absence of the species.

In conclusion, geology on its own contributes to explaining the presence/absence of Faramea occidentalis, according to the following fitted model (logit link function):

Logit(µ)= intercept + coefficient for geology category = y

Where µ: the mean of presence/absence value

µ= exp(y)/(1+exp(y))

Example: For GeologyTc:

Logit(µ)= -2.0794+2.367 = 0.2876 => µ= 57.14 %
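The GeologyTc example above can be checked numerically. The sketch below assumes only the intercept and the GeologyTc coefficient quoted above; `inverse_logit` is a hypothetical helper name. The intercept alone gives the predicted probability for the reference geology category.

```python
import math

# Coefficients quoted in the example above.
INTERCEPT = -2.0794
GEOLOGY_TC = 2.367


def inverse_logit(y: float) -> float:
    """Convert a logit value into a probability: exp(y) / (1 + exp(y))."""
    return 1.0 / (1.0 + math.exp(-y))


# The reference category uses the intercept alone.
baseline = inverse_logit(INTERCEPT)
# GeologyTc adds its own coefficient to the intercept.
geology_tc = inverse_logit(INTERCEPT + GEOLOGY_TC)

print(f"reference category: {100 * baseline:.2f}%")
print(f"GeologyTc:          {100 * geology_tc:.2f}%")
```

The second value reproduces the 57.14% stated in the example.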

II. Explanation with several explanatory variables

Is there a more complex pattern in how the explanatory variables jointly explain the response variable? To find out, fit a binomial GLM on several explanatory variables.

AIC (Akaike Information Criterion) summarizes the trade-off between simplicity and explained deviance. A model with a lower AIC offers a better trade-off and is therefore preferred over one with a higher AIC.

The model with (Precipitation + Precipitation^2 + Age.cat + Geology + Elevation^2) is preferable to (Precipitation + Precipitation^2 + Age.cat + Geology + Elevation + Elevation^2), which in turn is preferable to (Precipitation + Age.cat + Geology + Elevation), since their AIC values are 42.020 < 43.376 < 43.969, respectively.
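The AIC comparison above amounts to picking the candidate with the smallest AIC. A minimal Python sketch of that selection, using the three AIC values quoted above; the dictionary layout and the names `aic_by_model`, `best_model` and `delta_aic` are illustrative assumptions.

```python
# AIC values quoted above, keyed by the model formula as written in the text.
aic_by_model = {
    "Precipitation + Precipitation^2 + Age.cat + Geology + Elevation^2": 42.020,
    "Precipitation + Precipitation^2 + Age.cat + Geology + Elevation + Elevation^2": 43.376,
    "Precipitation + Age.cat + Geology + Elevation": 43.969,
}

# The preferred model is the one with the lowest AIC.
best_model = min(aic_by_model, key=aic_by_model.get)

# Difference in AIC between each candidate and the best one.
delta_aic = {
    model: aic - aic_by_model[best_model] for model, aic in aic_by_model.items()
}

for model, delta in sorted(delta_aic.items(), key=lambda item: item[1]):
    print(f"delta AIC = {delta:.3f}  {model}")
```

The differences here are small (under 2 AIC units), so the ranking separates the candidates only modestly.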

In conclusion, the best model is the binomial GLM on Precipitation + Precipitation^2 + Age.cat + Geology + Elevation^2.

According to the result above, this model explains (59.401-18.020)/59.401 = 69.66% of the null deviance of the dataset, much more than any of the models with a single explanatory variable. Moreover, in the single-term deletions, deleting any term increases the AIC, i.e. worsens the trade-off between simplicity and explained deviance. All of the terms listed should therefore be kept in the model, and the fitted model (logit link function) is:

Y = Logit(µ) = -8.830e + 8.031e-02·x - 1.765e-05·x^2 + y + z - 6.407e-05·k^2

Where:

µ: the mean of presence/absence value => µ= exp(Y)/(1+exp(Y))

x: the precipitation value

y: coefficient for geology category

z: coefficient for age category

k: the elevation value (it enters the model as Elevation^2)
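The claim that the combined model explains far more deviance than any single-variable model can be tabulated from the deviance values reported in the sections above. The sketch below assumes only those reported values; the dictionary and variable names are hypothetical.

```python
NULL_DEVIANCE = 59.401
FULL_MODEL_RESIDUAL_DEVIANCE = 18.020

# Explained deviance per model, as reported in the text.
explained = {
    "Age.categories": 3.0793,
    "Elevation": 9.9317,
    "Precipitation": 8.8406,
    "Geology": 25.548,
    # For the combined model, explained = null deviance - residual deviance.
    "Full model": NULL_DEVIANCE - FULL_MODEL_RESIDUAL_DEVIANCE,
}

# Share of the null deviance explained by each model, in percent.
percent = {name: 100.0 * value / NULL_DEVIANCE for name, value in explained.items()}

# Models ordered from most to least explained deviance.
ranking = sorted(percent, key=percent.get, reverse=True)

for name in ranking:
    print(f"{name:<15} {percent[name]:6.2f}%")
```

Note that explained deviance always grows as terms are added, which is why the text relies on AIC, not this percentage, to choose between the multi-variable candidates.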

Conclusion:

Each explanatory variable (precipitation, altitude, age and geology) has its own level of influence on the response variable (the presence/absence of Faramea occidentalis), which may be as low as zero (no influence). However, the more complex model that includes all explanatory variables explains the presence/absence of the species much better, so such a model should be preferred.
