PS15 W24 Study Guide

PS 15. Introduction to Research in Political Science
Department of Political Science
University of California, Santa Barbara
Winter 2024

Final Exam Study Guide

Part 1: Causality

1.1. Key concepts: in an empirical relationship, what is (a) a dependent variable and (b) an independent variable? What are treated and control groups in an experimental setting?
1.2. When you observe a relationship between two variables, a dependent and an independent variable, what are some of the issues you should take into account before making any claims about causality? Hint: this includes a discussion of confounders, comparability between groups, and selection problems.
1.3. What is the key difference between an observational study and an experimental study? Which one is more conducive to making causal inferences about empirical relationships?
1.4. What do internal and external validity (usually for an experimental study) refer to?
1.5. What are some of the limitations and advantages of experimental studies?
1.6. What are the formal definitions of endogeneity and exogeneity?

Sample Questions

A researcher looks at some observational data and finds that when individuals participate in a program to improve their job skills, their income increases. They are excited to publish a paper arguing that the program was effective at helping individuals find a job, which, in turn, increased their income, and they invite you to join as a coauthor. Having taken PS 15, what would you choose?
A. You’d politely decline the offer, saying that the job training program was not randomly assigned
B. You’d politely decline the offer, saying that individual income is exogenous
C. You’d politely decline the offer, saying that correlation doesn't equal causation
D. You would accept the offer and co-author the paper

For any empirical relationship we analyze, what are two of the dimensions we have to consider before making any causal claims about it?
A. Whether there are any confounding variables and the extent to which the treated and control units are comparable
B. Heteroskedasticity and confounding
C. Whether the sample is representative of the population and how big the sample is
D. None of the above

What is the main difference between observational and experimental studies?
A. Observational studies are qualitative and experimental studies are quantitative
B. In observational studies, researchers do not randomize the main variable of interest; in experimental studies, researchers do randomize the main variable of interest.
C. In observational studies, researchers do randomize the main variable of interest; in experimental studies, researchers do not randomize the main variable of interest.
D. You cannot infer causality from observational studies ever; you can always infer causality from experiments.

In a linear regression model, a variable is endogenous if:
A. It is not correlated with the error term
B. It is correlated with other explanatory variables in the model
C. It is correlated with the error term
D. None of the above

Part 2: Fundamentals of Probability

2.1. Key concepts: population, sample, and individual.
2.2. What is a probability distribution? What are some types of distributions we studied in class? Particularly important are the normal distribution, the uniform distribution, the binomial distribution, and the t-distribution.
2.3. One of the most important concepts we discussed in class is the process of statistical inference. What are the four steps we defined for this process? What is the role of estimands, estimators, and estimates in this process?
2.4. Use the example of the population and sample mean to illustrate the process of statistical inference.
2.5. What is a random variable and what are its key elements? Related to random variables, what are their expectation and variance? What are the sample mean and sample variance?
2.6. What is a sampling distribution (for example, the sampling distribution of the mean)?

Sample Questions

What do we call the overall collection of units we are interested in analyzing? For example, all the inhabitants of the United States.
A. Unit
B. Individual
C. Sample
D. Population

What is the correct order of the statistical inference process?
A. We are interested in inferring something about a population parameter or estimand; however, we have access to only a sample of the population; therefore, we employ an estimator to obtain a sample estimate that gives us information about the population parameter.
B. We are interested in inferring something about a population estimate; usually we have access to the entire population; therefore, we employ an estimand to obtain a sample estimate that gives us information about the population parameter.
C. We are interested in inferring something about a population parameter or estimand; however, we have access to only a sample of the population; therefore, we employ an estimate to obtain a sample estimator that gives us information about the population parameter.
D. We are interested in inferring something about a sample estimator; however, we have access to only a sample of the population; therefore, we employ an estimator to obtain a sample estimate that gives us information about the population parameter.
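To review the vocabulary of statistical inference (items 2.3, 2.4, and 2.6), it can help to see the pieces side by side in a small simulation. The sketch below is only an illustration written in Python; the population of incomes, the sample size of 500, and every name in it are made-up assumptions, not material from the course.

```python
import random
import statistics

random.seed(15)  # fixed seed so the sketch is reproducible

# Hypothetical population: 100,000 simulated incomes (made-up numbers).
population = [random.gauss(50_000, 12_000) for _ in range(100_000)]

# Estimand: the population mean, which in practice we cannot observe.
population_mean = statistics.fmean(population)


def sample_mean(sample):
    """Estimator: a rule that turns any sample into a single number."""
    return sum(sample) / len(sample)


# Estimate: the number the estimator returns for one particular sample.
one_sample = random.sample(population, 500)
estimate = sample_mean(one_sample)

# Sampling distribution of the mean: the estimates we would get if we
# could repeat the sampling many times (here, 1,000 times).
estimates = [sample_mean(random.sample(population, 500)) for _ in range(1_000)]

print(f"estimand (population mean):    {population_mean:,.0f}")
print(f"estimate from one sample:      {estimate:,.0f}")
print(f"mean of the 1,000 estimates:   {statistics.fmean(estimates):,.0f}")
print(f"spread of the 1,000 estimates: {statistics.stdev(estimates):,.0f}")
```

A single estimate will usually miss the estimand by some amount; the spread of the repeated estimates gives a sense of how large that miss typically is for this sample size.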

A probability density function with multiple outliers towards the higher values of X is called:
A. A normal distribution
B. A chi-squared distribution
C. A left-skewed distribution
D. A right-skewed distribution

You roll a die 100 times and record the outcome each time. If you were to graph a histogram or density plot of all the outcomes, what type of distribution would you expect to see?
A. Normal
B. Chi-squared
C. Unimodal
D. Uniform

You roll a die 10,000 times and record the outcome each time. If you were to graph a histogram or density plot of all the outcomes, what type of distribution would you expect to see?
A. Normal
B. Chi-squared
C. Unimodal
D. Uniform

A random variable is defined as:
A. A variable that can take a set of possible values with different probabilities.
B. A variable whose outcome is unknown until after a draw is made.
C. A variable that is randomly assigned by a researcher.
D. Both A and B

The best guess of the value we would get from a draw of a random variable is known as:
A. The standard deviation
B. The expected value
C. The variance
D. The median
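One way to build intuition for the die-rolling and expected-value questions is to simulate the rolls. The following Python sketch assumes a fair six-sided die; the seed and variable names are arbitrary choices made for illustration.

```python
import random
from collections import Counter
from fractions import Fraction

random.seed(15)  # fixed seed so the sketch is reproducible

faces = [1, 2, 3, 4, 5, 6]

# Expected value of a fair die: each face weighted by its probability, 1/6.
expected_value = sum(Fraction(face, 6) for face in faces)

results = {}
for n in (100, 10_000):
    rolls = [random.choice(faces) for _ in range(n)]
    counts = Counter(rolls)
    # Share of rolls landing on each face; a fair die puts 1/6 on each.
    shares = {face: counts[face] / n for face in faces}
    results[n] = (shares, sum(rolls) / n)

print(f"expected value: {expected_value} = {float(expected_value)}")
for n, (shares, mean) in results.items():
    formatted = ", ".join(f"{face}: {share:.3f}" for face, share in shares.items())
    print(f"n = {n:>6}  sample mean = {mean:.3f}  shares = {formatted}")
```

With more rolls, the share for each face should sit closer to 1/6 and the sample mean closer to the expected value. Note that the shares describe the individual outcomes, which is a different object from the sampling distribution of the mean.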

The mathematical theorem stating that, for a random variable X, the sample mean of draws from its distribution will tend towards the expectation of X as the sample size increases is the:
A. Central Limit Theorem
B. Heteroskedasticity
C. Homoskedasticity
D. Law of Large Numbers

Part 3: Bivariate Regression

3.1. In lecture we described three different ways to assess the relationship between two variables: covariance, correlation, and bivariate regression. What are the main differences between these?
3.2. What are the key elements in the basic model of (bivariate) regression? What are β₀, β₁, and εᵢ? Make sure to identify the equation for a basic linear regression model.
3.3. Explain in words the logic behind linear regression. In other words, what do we mean by “the line of best fit”? A key element of this question is the Sum of Squared Errors; make sure to review what we mean by this.
3.4. The interpretation of the linear regression output in R is crucial. What is the interpretation of the coefficient β̂₀? What is the interpretation of the coefficient β̂₁? For this question, remember two key things: (1) use the expression “a one unit change in X”, reflecting the actual variables of each case, and (2) avoid causal language such as “causes” or “leads to”.
3.5. What is the interpretation of the R-squared in a linear regression?

Sample Questions

In class, we discussed different ways to analyze a relationship between two variables; what is the main disadvantage of employing the covariance to do so?
A. It only applies to large samples
B. It only works when one of the variables is normally distributed
C. It goes from -1 to 1
D. It reflects the original units of the variables, making interpretation difficult
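To see how covariance and correlation respond to a change of units (item 3.1), you can compute both for the same data measured two ways. This Python sketch uses eight made-up observations of education and income; the data and function names are hypothetical, and the formulas are the usual sample versions with n - 1 in the denominator.

```python
def mean(values):
    return sum(values) / len(values)


def covariance(x, y):
    """Sample covariance: average product of deviations (n - 1 denominator)."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def correlation(x, y):
    """Covariance rescaled by both standard deviations."""
    sd_x = covariance(x, x) ** 0.5
    sd_y = covariance(y, y) ** 0.5
    return covariance(x, y) / (sd_x * sd_y)


# Hypothetical data: years of education and yearly income.
education_years = [10, 12, 12, 14, 16, 16, 18, 20]
income_dollars = [28_000, 35_000, 31_000, 42_000, 50_000, 47_000, 61_000, 66_000]
# The same incomes, expressed in thousands of dollars.
income_thousands = [value / 1_000 for value in income_dollars]

cov_dollars = covariance(education_years, income_dollars)
cov_thousands = covariance(education_years, income_thousands)
corr_dollars = correlation(education_years, income_dollars)
corr_thousands = correlation(education_years, income_thousands)

print(f"covariance, income in dollars:    {cov_dollars:,.2f}")
print(f"covariance, income in thousands:  {cov_thousands:,.2f}")
print(f"correlation, income in dollars:   {corr_dollars:.4f}")
print(f"correlation, income in thousands: {corr_thousands:.4f}")
```

Re-expressing income in thousands changes the covariance by a factor of 1,000 while leaving the correlation where it was, which is why the size of a covariance is hard to read on its own.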

The difference between the fitted value of Y (Ŷ) and the actual value of Y is:
A. The residual
B. The coefficient
C. The slope
D. The intercept

Subtracting the predicted values of Y from the actual values of Y, squaring these differences, and adding them together is known as:
A. The Sum of Squared Errors
B. The Sum of Squared Residuals
C. A or B
D. The Sum of Squared Expectations

Why do we call OLS the line of best fit?
A. Because it minimizes the sum of squared residuals
B. Because it matches all the points in the data
C. OLS is not called the line of best fit
D. Because it reduces multicollinearity

What is the main difference between β̂₁ and β₁ in a linear regression model?
A. There is no difference; we can write them both to mean the same thing
B. β₁ represents the true value of the population parameter and β̂₁ is a sample estimate
C. β₁ is the estimator and β̂₁ is the estimate
D. β₁ refers to the slope of the line and β̂₁ to the intercept of the line

What do we call β₀ in our basic regression model?
A. The error
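When reviewing fitted values, residuals, the Sum of Squared Errors, and R-squared (items 3.3 to 3.5), a tiny regression computed by hand may be useful. The Python sketch below uses the standard closed-form OLS formulas for one explanatory variable on a made-up dataset; the data, the function names, and the "nudged" comparison line are all illustrative assumptions, and the course itself works with R output.

```python
def mean(values):
    return sum(values) / len(values)


def fit_ols(x, y):
    """Closed-form OLS estimates (intercept, slope) for one regressor."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    intercept = my - slope * mx
    return intercept, slope


def sum_squared_errors(x, y, intercept, slope):
    """Sum over observations of (actual Y - fitted Y) squared."""
    return sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))


# Hypothetical data: years of education (X) and income in thousands (Y).
education_years = [10, 12, 12, 14, 16, 16, 18, 20]
income_thousands = [28, 35, 31, 42, 50, 47, 61, 66]

b0_hat, b1_hat = fit_ols(education_years, income_thousands)
fitted = [b0_hat + b1_hat * x for x in education_years]
residuals = [y - y_hat for y, y_hat in zip(income_thousands, fitted)]

sse_ols = sum_squared_errors(education_years, income_thousands, b0_hat, b1_hat)
# Any other line, such as one with a slightly different slope, fits worse.
sse_nudged = sum_squared_errors(
    education_years, income_thousands, b0_hat, b1_hat + 0.5
)

# R-squared: share of the total variation in Y accounted for by the line.
y_bar = mean(income_thousands)
total_sum_squares = sum((y - y_bar) ** 2 for y in income_thousands)
r_squared = 1 - sse_ols / total_sum_squares

print(f"intercept estimate: {b0_hat:.3f}")
print(f"slope estimate:     {b1_hat:.3f}")
print(f"SSE, OLS line:      {sse_ols:.3f}")
print(f"SSE, nudged line:   {sse_nudged:.3f}")
print(f"R-squared:          {r_squared:.3f}")
```

In the wording the guide asks for, the slope estimate would be read as: a one unit change in X (one more year of education) is associated with a change of that many thousand dollars in income, on average, in this made-up sample.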
