Resource overview (32 PPT slides)

3.1 The Important Distinction Between Correlation and Causality
3.2 Measuring Causation with Data We’d Like to Have: Randomized Trials
3.3 Estimating Causation with Data We Actually Get: Observational Data
3.4 Conclusion

3 Empirical Tools of Public Finance
This chapter focuses on empirical public finance.
Empirical public finance: The use of data and statistical methods to measure the impact of government policy on individuals and markets.
Distinguishing between correlations and causal relationships is the key task in empirical public finance.
Correlated: Two economic variables are correlated if they move together.
Causal: Two economic variables are causally related if the movement of one causes movement of the other.

3.1 The Important Distinction Between Correlation and Causality
There are many examples where causation and correlation can get confused:
- Russian peasant fallacy
- Breast feeding fallacy
- SAT training fallacy
In statistics, this is called the identification problem: Given that two series are correlated, how do you identify whether one series is causing another?

3.1 The Problem
Whenever we see a correlation between A and B, there are three possible explanations:
- A is causing B.
- B is causing A.
- Some third factor is causing both.
The general problem that empirical economists face is trying to distinguish among these three explanations.
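The third explanation can be illustrated with a small simulation. This is a minimal sketch, not part of the original slides: the variable names, the sample size, and the choice of normally distributed noise are all assumptions made for illustration. A hidden factor Z drives both A and B, and neither A nor B affects the other, yet the two end up clearly correlated (about 0.5 in expectation under these assumptions).

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_confounding(n=5000, seed=0):
    """Generate A and B that are both driven by a hidden third factor Z.

    Neither A nor B appears in the other's equation, so any correlation
    between them comes entirely from Z (hypothetical data, for illustration).
    """
    rng = random.Random(seed)
    a, b = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)            # hidden third factor
        a.append(z + rng.gauss(0, 1))  # A = Z + independent noise
        b.append(z + rng.gauss(0, 1))  # B = Z + independent noise
    return a, b


a, b = simulate_confounding()
corr_ab = pearson(a, b)
print(f"corr(A, B) = {corr_ab:.2f}")
```

An analyst who saw only A and B could not tell this case apart from "A causes B" or "B causes A" using the correlation alone.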
Correlation alone does not imply causation.

3.1 Example Identification Problem: SAT Prep Courses
Among Harvard students who took an SAT prep course, SAT scores were 63 points lower than among those who hadn’t.
- Do prep courses reduce scores (i.e., A causes B)?
- Do low scores cause people to enroll in prep courses (i.e., B causes A)?
- Does some third factor cause both low scores and enrollment?

3.2 Randomized Trials as a Solution
Randomized trials solve the identification problem.
Randomized trial: The ideal type of experiment designed to test causality, whereby a group of individuals is randomly divided into a treatment group, which receives the treatment of interest, and a control group, which does not.
Treatment group: The set of individuals who are subject to an intervention being studied.
Control group: The set of individuals comparable to the treatment group who are not subject to the intervention being studied.
Why do randomized trials solve the problem?
- Random assignment rules out reverse causation.
- Random assignment means the treatment and control groups differ (on average) only by treatment. This rules out any third factors causing both treatment and effects.
- Any difference between the treatment and control groups must be due to treatment.
Randomized trials are therefore considered the “gold standard” for determining causality.

3.2 The Problem of Bias
The identification problem is a problem of bias.
Bias: Any source of difference between treatment and control groups that is correlated with the treatment but is not due to the treatment.
Randomization eliminates bias, which is why it is the gold standard.

3.2 Measuring Causation with Data We’d Like to Have: Randomized Trials
Randomized trials are useful in medicine and public policy.
ERT: Randomized trials showed that estrogen replacement therapy raised the risk of heart disease.
These trials led to reduced use of ERT.
TANF: Randomized trials showed that changing welfare programs can encourage employment among recipients.

3.2 Why We Need to Go Beyond Randomized Trials
Even the gold standard of randomized trials has some potential problems:
- The results are only valid for the sample of individuals studied, not necessarily for the population as a whole.
- They can suffer from attrition.
Attrition: Reduction in the size of samples over time, which, if not random, can lead to biased estimates.

3.3 Estimating Causation with Data We Actually Get: Observational Data
Typically, randomized data are not available; researchers rely on observational data.
Observational data: Data generated by individual behavior observed in the real world, not in the context of deliberately designed experiments.
Bias is a pervasive, difficult problem in observational data.
There are, however, methods available that can allow us to approach the gold standard of randomized trials.

3.3 Time Series Analysis
Time series analysis: Analysis of the comovement of two series over time.
It can be used to identify and measure correlations; a correlation that is consistent with existing theory is suggestive of causation.
Time series analysis often produces striking patterns.

3.3 Time Series Analysis: Cash Welfare Guarantee and Hours Worked Among Single Mothers
A strong negative correlation exists between the average benefit guarantee and the level of labor supply.
This correlation is not necessarily causal, however.

3.3 Time Series Analysis Problems
Time series analysis does not separate causation from correlation.
Different subperiods (1968–1976, 1978–1983, 1993–1998) give different impressions.
Excluded variables may be driving the results, especially the macroeconomy and wage-subsidy programs.

3.3 When Is Time Series Analysis Useful? Cigarette Prices and Youth Smoking
Sharp, simultaneous changes in prices and smoking rates occurred in 1993 and from 1998 onward.
Known causes: price war, tobacco settlements.

3.3 Cross-Sectional Regression Analysis
An alternative to time series analysis is cross-sectional regression analysis.
Cross-sectional regression analysis: Statistical analysis of the relationship between two or more variables exhibited by many individuals at one point in time.
Regression analysis finds the best-fitting linear relationship between two variables.
Regression line: The line that measures the best linear approximation to the relationship between any two variables.

3.3 Cross-Sectional Regression Analysis: Illustration Using Labor Supply and TANF Benefit
We graph the two data points when the benefit guarantee is $5,000 (see Figure 3-3).
One data point, point A, represents a labor supply of 0 hours and an income guarantee of $5,000.
The other data point, point B, represents a labor supply of 90 hours per year and TANF benefits of $4,550.
The downward-sloping line makes clear the negative correlation between TANF benefits and labor supply; the mother with lower TANF benefits has a higher labor supply.
Regression analysis takes this correlation one step further by quantifying the relationship between TANF benefits and labor supply.
It does so by finding the line that best fits this relationship and then measuring the slope of that line.
The line that connects these two points has a slope of –0.2 (see Figure 3-3).
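As a check on this number, the slope can be worked out directly from points A and B. The symbols are notation introduced here, not taken from the slides: H denotes annual hours of labor supply and T the TANF benefit in dollars.

```latex
\[
\text{slope}
  = \frac{H_B - H_A}{T_B - T_A}
  = \frac{90 - 0}{4{,}550 - 5{,}000}
  = \frac{90}{-450}
  = -0.2
\]
```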
That is, this bivariate regression indicates that each $1 reduction in TANF benefits per month leads to a 0.2-hour-per-year increase in labor supply.

Figure 3-3. Cross-Sectional Regression Analysis: Labor Supply and TANF Benefit. Negative correlation between TANF benefits and labor supply.

3.3 Example with Real-World Data: Labor Supply and TANF Benefits
Using real data from the Current Population Survey (CPS), we take a sample of single mothers and ask the following: What is the relationship between TANF benefits and hours of labor supply in this cross-sectional sample?
The linear regression line (see Figure 3-4) shows the best linear approximation to the relationship between the points that represent TANF benefits and labor supply.
The regression line has a slope of –127, which indicates that each doubling of TANF benefits reduces work by 127 hours per year.
It is convenient to represent the relationship between economic variables in elasticity form.
Based on the CPS data, the mean (average number of) hours of work in the sample is 1,274 hours.
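The elasticity implied by these two numbers can be worked out as follows. The symbols are notation introduced here for illustration: \bar{H} is mean annual hours, \Delta H is the change in hours, and a doubling of benefits is written as \Delta T / T = 1.

```latex
\[
\varepsilon
  = \frac{\Delta H / \bar{H}}{\Delta T / T}
  = \frac{-127 / 1{,}274}{1.00}
  \approx -0.0997
  \approx -0.1
\]
```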
So we know that each 100% rise in TANF benefits reduces hours of work by 10% (127 is 10% of 1,274), for an elasticity of –0.1.
This is a fairly inelastic response; there is a relatively modest reduction in hours (10%) when TANF benefits rise (by 100%).

Figure 3-4. Example with Real-World Data: Labor Supply and TANF Benefits. Each doubling of TANF benefits reduces work by 127 hours per year.

3.3 Problems with Cross-Sectional Regression Analysis
Mothers who receive the largest TANF benefits work the fewest hours.
There are several possible interpretations of this correlation:
- Perhaps higher TANF benefits are causing an increase in leisure.
- Or perhaps some mothers have a high taste for leisure and wouldn’t work much even if TANF benefits weren’t available, and their benefits are high because they aren’t working much.

3.3 Control Variables
Sometimes, control variables can correct bias.
Control variables: Variables that are included in cross-sectional regression models to account for differences between treatment and control groups that can lead to bias.
If we could measure “taste for leisure,” then we could compare two single mothers with identical taste for leisure but different TANF benefits.
In reality, it is probably impossible to measure “taste for leisure” and all other relevant variables.

3.3 Quasi-Experiments
An alternative approach is to use quasi-experiments.
Quasi-experiments: Changes in the economic environment that create nearly identical treatment and control groups for studying the effect of that environmental change, allowing public finance economists to take advantage of randomization created by external forces.
Policy differences across states and over time often create quasi-experiments.

3.3 Difference-in-Difference Estimators
Difference-in-difference estimators are popular quasi-experimental designs.
Difference-in-difference estimator: The difference between the changes in outcomes for the treatment group that experiences an intervention and the control group that does not.
Example: In 1998, Arkansas cut its benefit guarantee from $5,000 to $4,000, but Louisiana did not change policy.

Benefits and Labor Supply in Arkansas and Louisiana

Arkansas                 1996     1998     Difference
Benefit guarantee ($)    5,000    4,000    –1,000
Hours worked             1,000    1,200    200

Louisiana                1996     1998     Difference
Benefit guarantee ($)    5,000    5,000    0
Hours worked             1,050    1,100    50

3.3 Problems with Quasi-Experiments: Bias
With quasi-experimental studies, we can never be completely certain that we have purged all bias from the treatment–control comparison.
Quasi-experimental studies use two approaches to try to make the argument that they have obtained a causal estimate:
- Intuitive: Argue that, given the experiment, most of the bias has been removed.
- Statistical: Use alternative or additional control groups to confirm that bias has been removed.

3.3 Problems with Quasi-Experiments: Interpretation
Experiments give the reduced form impact of some policy, but they do not explain why the policy works.
Structural estimates: Estimates of the features that drive individual decisions, such as income and substitution effects or utility parameters.
Reduced form estimates: Measures of the total impact of an independent variable on a dependent variable, without decomposing the source of that behavioral response in terms of underlying utility functions.

3.3 Limitations of Both Randomized Trials and Quasi-Experimental Approaches
They only provide an estimate of the causal impact of a particular treatment, so we can’t necessarily extrapolate from a particular change in the environment to model all possible changes in the environment.
They can tell us how outcomes change when there is an intervention but often can’t tell us why.

3.4 Conclusion
The central issue for any policy question is establishing a causal relationship between the policy in question and the outcome of interest.
How do we distinguish causality from correlation, or eliminate bias?
Gold standard: randomized trials, which are often infeasible.
Alternative methods: time series analysis, cross-sectional regression analysis, and quasi-experimental analysis.
Each alternative method has weaknesses, but it is possible to overcome the identification problem through careful consideration of the problem at hand.
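
Returning to the Arkansas and Louisiana figures from Section 3.3, the difference-in-difference calculation can be sketched as below. This is an illustrative sketch rather than code from the slides. The function and variable names are hypothetical, and the final per-dollar ratio is only meaningful under the assumption that Louisiana's change in hours captures what would have happened in Arkansas without the benefit cut.

```python
# Figures from the Arkansas/Louisiana example in Section 3.3.
hours_worked = {
    "Arkansas": {1996: 1000, 1998: 1200},   # treatment state: guarantee was cut
    "Louisiana": {1996: 1050, 1998: 1100},  # control state: no policy change
}
benefit_guarantee = {
    "Arkansas": {1996: 5000, 1998: 4000},
    "Louisiana": {1996: 5000, 1998: 5000},
}


def change(series, before=1996, after=1998):
    """Change in a single state's value between the two years."""
    return series[after] - series[before]


def diff_in_diff(outcome, treated, control):
    """Change for the treated state minus change for the control state."""
    return change(outcome[treated]) - change(outcome[control])


# Hours: (1,200 - 1,000) - (1,100 - 1,050) = 200 - 50
did_hours = diff_in_diff(hours_worked, "Arkansas", "Louisiana")

# Benefits: (4,000 - 5,000) - (5,000 - 5,000) = -1,000 - 0
did_benefit = diff_in_diff(benefit_guarantee, "Arkansas", "Louisiana")

# Implied change in annual hours per extra dollar of benefit guarantee,
# under the assumption stated above (Louisiana is a valid comparison).
hours_per_dollar = did_hours / did_benefit

print(f"Difference-in-difference in hours worked: {did_hours}")
print(f"Difference-in-difference in benefit guarantee: {did_benefit}")
print(f"Implied hours per dollar of benefit: {hours_per_dollar}")
```

Subtracting Louisiana's 50-hour rise removes whatever would have raised hours in both states over the period, such as the macroeconomy, so only 150 of Arkansas's 200 additional hours are attributed to the benefit cut.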