PS15 W24 Study Guide
PS 15. Introduction to Research in Political Science
Department of Political Science
University of California, Santa Barbara
Winter 2024
Final Exam Study Guide
Part 1: Causality
1.1. Key concepts: in an empirical relationship, what is (a) a dependent variable and (b) an
independent variable? What are treated and control groups in an experimental setting?
1.2. When you observe a relationship between two variables, a dependent and an
independent variable, what are some of the issues you should take into account before
making any claims about causality? Hint: this includes a discussion of confounders,
comparability between groups, and selection problems.
1.3. What is the key difference between an observational study and an experimental study?
Which one is more conducive to making causal inferences about empirical relationships?
1.4. What do internal and external validity (usually for an experimental study) refer to?
1.5. What are some of the limitations and advantages of experimental studies?
1.6. What are the formal definitions of endogeneity and exogeneity?
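The ideas in 1.2 and 1.3 can be illustrated with a small simulation. The sketch below is not from the course materials (the course itself uses R); it is a Python illustration under made-up assumptions: a hypothetical job training program with a true effect of 2 income units, and an unobserved confounder called `motivation` that raises income and, in the observational setting, also makes people more likely to enroll. All names and numbers are invented for illustration.

```python
import random
import statistics

random.seed(15)  # fixed seed so the sketch is reproducible

TRUE_EFFECT = 2.0  # hypothetical effect of the program on income (made-up units)


def simulate(randomized, n=20000):
    """Return the naive treated-minus-control difference in mean income."""
    treated, control = [], []
    for _ in range(n):
        motivation = random.gauss(0, 1)  # unobserved confounder (assumption)
        if randomized:
            # Experimental study: a coin flip assigns treatment,
            # so assignment is unrelated to motivation.
            in_program = random.random() < 0.5
        else:
            # Observational study: more motivated people select into the program.
            in_program = motivation + random.gauss(0, 1) > 0
        income = 10 + TRUE_EFFECT * in_program + 3 * motivation + random.gauss(0, 1)
        (treated if in_program else control).append(income)
    return statistics.mean(treated) - statistics.mean(control)


observational_gap = simulate(randomized=False)
experimental_gap = simulate(randomized=True)

print(f"true effect:       {TRUE_EFFECT:.2f}")
print(f"observational gap: {observational_gap:.2f}")
print(f"experimental gap:  {experimental_gap:.2f}")
```

Under these assumptions the observational comparison overstates the effect, because the treated and control groups differ in motivation and are therefore not comparable. Random assignment breaks the link between motivation and treatment, so the simple difference in means lands close to the true effect.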
Sample Questions
A researcher looks at some observational data and finds that when individuals participate in
a program to improve their job skills, their income increases. They are excited to publish a
paper arguing that the program was effective at helping individuals find a job, which, in
turn, increased their income, and they invite you to join as a coauthor. Having taken PS 15,
what would you choose?
A. You’d politely decline the offer, saying that the job training program was not
randomly assigned
B. You’d politely decline the offer, saying that individual income is exogenous
C. You’d politely decline the offer, saying that correlation doesn't equal causation
D. You would accept the offer and co-author the paper
For any empirical relationship we analyze, what are two of the dimensions we have to
consider before making any causal claims about it?
A. Whether there are any confounding variables and the extent to which the treated
and control units are comparable
B. Heteroskedasticity and confounding
C. Whether the sample is representative of the population and how big the sample is
D. None of the above
What is the main difference between observational and experimental studies:
A. Observational studies are qualitative and experimental studies are quantitative
B. In observational studies, researchers do not randomize the main variable of interest;
in experimental studies, researchers do randomize the main variable of interest.
C. In observational studies, researchers do randomize the main variable of interest; in
experimental studies, researchers do not randomize the main variable of interest.
D. You cannot infer causality from observational studies ever; you can always infer
causality from experiments.
In a linear regression model, a variable is endogenous if:
A. It is not correlated with the error term
B. It is correlated with other explanatory variables in the model
C. It is correlated with the error term
D. None of the above
Part 2: Fundamentals of Probability
2.1. Key concepts: population, sample, and individual
2.2. What is a probability distribution? What are some types of distributions we studied in
class? Particularly important are the normal distribution, the uniform distribution, the
binomial distribution, and the t-distribution.
2.3. One of the most important concepts we discussed in class is the process of statistical
inference. What are the four steps we defined for this process? What is the role of
estimands, estimators, and estimates in this process?
2.4. Use the example of the population and sample mean to illustrate the process of
statistical inference
2.5. What is a random variable and what are its key elements? Related to random
variables, what are their expectation and variance? What are the sample mean and sample
variance?
2.6. What is a sampling distribution (for example the sampling distribution of the mean)?
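Items 2.3 through 2.6 can be tied together with one small simulation. The sketch below is a Python illustration, not course code (the course uses R). It assumes a fair six-sided die as the random variable, and the sample size and number of repetitions are arbitrary choices. It shows the estimand (the population mean), the estimator (the sample mean), one estimate, and an approximation of the sampling distribution of the mean.

```python
import random
import statistics

random.seed(15)  # fixed seed so the sketch is reproducible

# Random variable X: the outcome of one roll of a fair six-sided die.
outcomes = [1, 2, 3, 4, 5, 6]
prob = 1 / 6  # each outcome is equally likely

# Population quantities (estimands), computed from the distribution itself.
expectation = sum(x * prob for x in outcomes)                    # E[X]
variance = sum((x - expectation) ** 2 * prob for x in outcomes)  # Var(X)

# One sample of n draws. Applying the estimator (the sample mean) to this
# sample produces a single estimate of E[X].
n = 100
sample = [random.choice(outcomes) for _ in range(n)]
sample_mean = statistics.mean(sample)
sample_variance = statistics.variance(sample)  # uses the n - 1 denominator

# Sampling distribution of the mean: repeat the sampling many times and
# look at how the resulting sample means are distributed.
reps = 2000
means = [
    statistics.mean([random.choice(outcomes) for _ in range(n)])
    for _ in range(reps)
]
mean_of_means = statistics.mean(means)
variance_of_means = statistics.variance(means)

print(f"E[X] = {expectation:.3f}, Var(X) = {variance:.3f}")
print(f"one sample: mean = {sample_mean:.3f}, variance = {sample_variance:.3f}")
print(f"across {reps} samples: mean of means = {mean_of_means:.3f}")
print(f"variance of means = {variance_of_means:.4f} vs Var(X)/n = {variance / n:.4f}")
```

The sample means center on the expectation of X, and their variance is close to Var(X)/n, which is why larger samples give more precise estimates.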
Sample Questions
What do we call the overall collection of units we are interested in analyzing? For example,
all the inhabitants of the United States.
A. Unit
B. Individual
C. Sample
D. Population
What is the correct order of the statistical inference process?
A. We are interested in inferring something about a population parameter or estimand;
however, we have access to only a sample of the population; therefore, we employ
an estimator to obtain a sample estimate that gives us information about the
population parameter.
B. We are interested in inferring something about a population estimate; usually we
have access to the entire population; therefore, we employ an estimand to obtain a
sample estimate that gives us information about the population parameter.
C. We are interested in inferring something about a population parameter or estimand;
however, we have access to only a sample of the population; therefore, we employ
an estimate to obtain a sample estimator that gives us information about the
population parameter.
D. We are interested in inferring something about a sample estimator; however, we
have access to only a sample of the population; therefore, we employ an estimator
to obtain a sample estimate that gives us information about the population
parameter.
A probability density function with multiple outliers towards the higher values of X is called
A. A normal distribution
B. A chi-squared distribution
C. A left-skewed distribution
D. A right-skewed distribution
You roll a die 100 times and record the outcome each time. If you were to graph a
histogram or density plot of all the outcomes, what type of distribution would you expect
to see?
A. Normal
B. Chi-squared
C. Unimodal
D. Uniform
You roll a die 10000 times and record the outcome each time. If you were to graph a
histogram or density plot of all the outcomes, what type of distribution would you expect
to see?
A. Normal
B. Chi-squared
C. Unimodal
D. Uniform
A random variable is defined as:
A. A variable that can take a set of possible values with different probabilities.
B. A variable whose outcome is unknown until after a draw is made.
C. A variable that is randomly assigned by a researcher.
D. Both A & B
The best guess of the value we would get from a draw of a random variable is known as:
A. The standard deviation
B. The expected value
C. The variance
D. The median
The mathematical theorem that states that for a random variable X, the sample mean of
draws from that distribution will tend towards the expectation of X as the sample size
increases is called:
A. Central Limit Theorem
B. Heteroskedasticity
C. Homoskedasticity
D. Law of Large Numbers
Part 3: Bivariate Regression
3.1. In lecture we described three different ways to assess the relationship between two
variables: covariance, correlation, and bivariate regression. What are the main differences
between these?
3.2. What are the key elements in the basic model of (bivariate) regression? What are
β₀, β₁, and ϵᵢ? Make sure to identify the equation for a basic linear regression model.
3.3. Explain in words the logic behind linear regression. In other words, what do we mean
by “the line of best fit”? A key element of this question is the Sum of Squared Errors, make
sure to review what we mean by this.
3.4. The interpretation of the linear regression output in R is crucial. What is the
interpretation of the coefficient β̂₀? What is the interpretation of the coefficient β̂₁? For
this question, remember two key things: (1) use the expression “a one unit change in X”,
reflecting the actual variables of each case and (2) avoid the use of causal language such
as “causes”, “leads”
3.5. What is the interpretation of the R-squared in a linear regression?
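The quantities in 3.1 through 3.5 can be computed by hand on simulated data to see how they relate. The sketch below is a Python illustration, not the course's R workflow. The data are hypothetical (made-up "years of education" and "income" values generated from a line with intercept 5 and slope 1.5 plus noise), and the variable names are assumptions chosen for readability. It uses the standard bivariate OLS formulas: the slope estimate is the sample covariance of X and Y divided by the sample variance of X, and the intercept estimate makes the line pass through the point of means.

```python
import random
import statistics

random.seed(15)  # fixed seed so the sketch is reproducible

# Hypothetical data: x = years of education, y = income (made-up units).
n = 200
x = [random.uniform(8, 20) for _ in range(n)]
y = [5 + 1.5 * xi + random.gauss(0, 3) for xi in x]

x_bar, y_bar = statistics.mean(x), statistics.mean(y)

# Sample covariance and variances (n - 1 denominator).
# The covariance is expressed in units of x times units of y.
cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
var_x = statistics.variance(x)
var_y = statistics.variance(y)

# Correlation rescales the covariance so it lies between -1 and 1.
corr_xy = cov_xy / (var_x ** 0.5 * var_y ** 0.5)

# OLS estimates: the intercept and slope that minimize the sum of squared errors.
beta1_hat = cov_xy / var_x
beta0_hat = y_bar - beta1_hat * x_bar

# Residuals are actual values minus fitted values.
fitted = [beta0_hat + beta1_hat * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]
sse = sum(e ** 2 for e in residuals)          # sum of squared errors
sst = sum((yi - y_bar) ** 2 for yi in y)      # total sum of squares
r_squared = 1 - sse / sst                     # share of variation in y explained

print(f"covariance = {cov_xy:.3f}, correlation = {corr_xy:.3f}")
print(f"intercept estimate = {beta0_hat:.3f}, slope estimate = {beta1_hat:.3f}")
print(f"SSE = {sse:.1f}, R-squared = {r_squared:.3f}")
```

Following the guidance in 3.4, the slope estimate here would be read as: a one unit change in years of education is associated with a change of about `beta1_hat` units of income, on average. In the bivariate case the R-squared equals the squared correlation between X and Y.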
Sample Questions
In class, we discussed different ways to analyze a relationship between two variables; what
is the main disadvantage of employing the covariance to do so?
A. It only applies to large samples
B. It only works when one of the variables is normally distributed
C. It goes from -1 to 1
D. It reflects the original units of the variables, making interpretation difficult
The difference between the fitted value of Y (Ŷ) and the actual value of Y is:
A. The residual
B. The coefficient
C. The slope
D. The intercept
Subtracting the predicted values of Y from the actual values of Y, squaring these
differences, and adding them together is known as:
A. The Sum of Squared Errors
B. The Sum of Squared Residuals
C. A or B
D. The Sum of Squared Expectations
Why do we call OLS the line of best fit?
A. Because it minimizes the sum of squared residuals
B. Because it matches all the points in the data
C. OLS is not called the line of best fit
D. Because it reduces multicollinearity
What is the main difference between β̂₁ and β₁ in a linear regression model?
A. There is no difference; we can write them both to mean the same thing
B. β₁ represents the true value of the population parameter and β̂₁ is a sample
estimate
C. β₁ is the estimator and β̂₁ is the estimate
D. β₁ refers to the slope of the line and β̂₁ to the intercept of the line
What do we call β₀ in our basic regression model?
A. The error