EAE-6060: Public Finance I

class: center, middle, inverse, title-slide

.title[
# EAE-6060: Public Finance I
]
.author[
### Pedro Forquesato <a href="http://www.pedroforquesato.com" class="uri">http://www.pedroforquesato.com</a> Sala 217/FEA2 - <a href="mailto:pforquesato@usp.br" class="email">pforquesato@usp.br</a>
]
.institute[
### Departamento de Economia Universidade de São Paulo
]
.date[
### 2022/3 - Topic 4: Optimal labor taxation
]

---

class: inverse, middle, center

# Optimal labor income taxation

---
class: middle
## Efficiency vs equity

Edgeworth in the 19th century had already realized that without **behavioral effects**, optimal utilitarian labor taxation implies complete equalization of marginal utility: same income for everyone

Unfortunately, behavioral effects are unavoidable, and they generate a **trade-off between efficiency and equity** &mdash; as Mirrlees (1971) states:

> *As a result of using men’s economic performance as evidence of their economic potentialities*, complete equality of social marginal utilities of income ceases to be desirable, for the tax system that would bring about this result would completely discourage all unpleasant work.

---
class: middle
## Mirrlees problem

If `$H(z)$` is the cdf of pre-tax income `$z$`, and `$c = z - T(z)$`, then the government maximizes socially weighted utility `$G\circ u$` subject to a minimum revenue requirement `$E$` (with multiplier `$p$` on the constraint) `$$\text{SWF} = \int_0^{\infty} G\left( u \left( z - T(z) \right)\right) \text{d}H(z) \text{ s.t. } \int_0^{\infty} T(z)\text{d}H(z) \geq E \quad (p)$$`

If income `$z$` is fixed, then point-wise maximization leads to `$G^{\prime}(u) u^{\prime} \left( z - T(z) \right) = p$` for every `$z$` &mdash; namely, the same after-tax income for everyone (Edgeworth); unfortunately, in the real world `$z$` is a choice variable: if we tax income at a 100% rate, then no one will work
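
A minimal sketch of the fixed-income benchmark, assuming everyone shares the same concave utility, so that equal marginal utility means equal consumption. The function name `edgeworth_schedule` and the numbers are hypothetical illustrations, not part of the original slides:

```python
def edgeworth_schedule(incomes, revenue_per_capita):
    """Taxes that equalize consumption when pre-tax incomes are fixed.

    Assumes identical concave utility for everyone, so equal marginal
    utility means equal consumption c = mean(z) - E.
    """
    c = sum(incomes) / len(incomes) - revenue_per_capita
    return c, [z - c for z in incomes]


c, taxes = edgeworth_schedule([10.0, 30.0, 80.0], revenue_per_capita=5.0)
print(c)      # common consumption level
print(taxes)  # T(z) = z - c, so the implied marginal tax rate is 100%
```

Since `$T(z) = z - c$`, every extra unit of income is taxed away, which is why this schedule stops being feasible once `$z$` becomes a choice variable.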

---
class: middle
## Maximum revenue with linear taxation

Let `$Z(1 - \tau)$` be the taxable income function when subject to a linear tax `$\tau$`

Then government revenue is given by `$R(\tau) = \tau Z(1 - \tau)$`, with maximum (**top of the Laffer curve**) satisfying the following *first order conditions*: `$$\text{[FOC]: }Z(1 - \tau) - \tau Z^{\prime}(1 - \tau) = 0$$`

`$$\Rightarrow \frac{Z(1 - \tau)}{1 - \tau} = \frac{\tau}{1 - \tau} Z^{\prime}(1 - \tau)$$`
`$$\therefore \frac{\tau^*}{1 - \tau^*} = \frac{1}{e} \Rightarrow  \tau^* = \frac{1}{1 + e}$$`
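
Here `$e \equiv Z^{\prime}(1-\tau)(1-\tau)/Z$` is the elasticity of taxable income with respect to the net-of-tax rate. A small numerical sketch of this result, assuming (hypothetically) a constant-elasticity form `$Z = Z_0 (1-\tau)^e$`; the function names and the grid search are illustrative only:

```python
def laffer_rate(e):
    """Closed-form revenue-maximizing linear tax rate, tau* = 1 / (1 + e)."""
    return 1.0 / (1.0 + e)


def revenue(tau, e, z0=1.0):
    """R(tau) = tau * Z(1 - tau), with the assumed form Z = z0 * (1 - tau)**e."""
    return tau * z0 * (1.0 - tau) ** e


def grid_argmax(e, steps=100_000):
    """Brute-force check: the tax rate on a fine grid of [0, 1] with the highest revenue."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda tau: revenue(tau, e))


for e in (0.1, 0.5, 1.0):
    print(e, round(laffer_rate(e), 3), round(grid_argmax(e), 3))
```

The closed form and the brute-force maximum should agree up to the grid step, and the revenue-maximizing rate falls as `$e$` rises.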

---
class: middle
## Optimal linear taxation

Consider now the (more important) problem of designing a linear income tax to maximize social welfare, where we again apply the **envelope theorem**: `$$\int_i \omega^i G\left( u^i \left( (1-\tau)z^i + \tau Z(1-\tau) - E, z^i \right)\right) \text{d} \nu (i)$$`

`$$\text{[FOC]: } \int_i \omega^i G^{\prime}(u^i) u^i_c \left[ Z - z^i - \tau \frac{\text{d}Z}{\text{d}(1-\tau)} \right] \text{d} \nu(i) = 0$$`

If `$g^i \equiv \omega^i G^{\prime}(u^i) u^i_c / \int_j \omega^j G^{\prime}(u^j) u^j_c \text{d} \nu (j)$` is the **normalized marginal social welfare weight** of individual `$i$`,
`$$Z \cdot \left[1 - \frac{\tau}{1 - \tau} \left( \frac{\text{d}Z}{\text{d}(1-\tau)} \frac{1-\tau}{Z}\right) \right] = \int_i g^i z^i  \text{d} \nu(i)$$`

---
class: middle
## Optimal linear taxation

`$$\therefore \tau^* = \frac{1 - \bar{g}}{1 - \bar{g} + e} \text{, where } \bar{g} = \int_i \frac{g^i z^i}{Z}\text{d} \nu(i)$$`

where `$1 - \bar{g} = - \text{cov}(g^i, z^i / Z)$` is minus the covariance between social welfare weights and (relative) taxable income, and it represents social redistributive preferences &mdash; if `$\bar{g} = 1$`, so there are no redistributive motives, then `$\tau^* = 0$` is optimal; if `$\bar{g} = 0$` (Rawls), then the solution is the revenue-maximizing rate `$\tau^* = 1/(1+e)$`

The optimal tax is also decreasing in the (Marshallian) **taxable income elasticity** `$e$`, which combines the compensated elasticity and income effects
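
As a sketch of how the formula behaves, the snippet below computes `$\bar{g}$` and `$\tau^*$` for a small discrete population. The incomes, the welfare weights and the value `e = 0.25` are hypothetical; the function name is illustrative:

```python
def optimal_linear_tax(weights, incomes, e):
    """tau* = (1 - g_bar) / (1 - g_bar + e) for a discrete population.

    weights: unnormalized marginal social welfare weights, one per individual
    incomes: taxable incomes z^i, in the same order as weights
    e:       elasticity of aggregate taxable income w.r.t. 1 - tau
    """
    n = len(incomes)
    mean_weight = sum(weights) / n
    mean_income = sum(incomes) / n                 # Z (per capita)
    g = [w / mean_weight for w in weights]         # normalized so the average g^i is 1
    g_bar = sum(gi * zi for gi, zi in zip(g, incomes)) / (n * mean_income)
    return (1.0 - g_bar) / (1.0 - g_bar + e)


incomes = [0.0, 20.0, 40.0, 140.0]
print(optimal_linear_tax([1, 1, 1, 1], incomes, e=0.25))  # no redistributive motive
print(optimal_linear_tax([1, 0, 0, 0], incomes, e=0.25))  # all weight on the zero-income person
print(optimal_linear_tax([4, 3, 2, 1], incomes, e=0.25))  # weights decreasing in income
```

With equal weights `$\bar{g} = 1$` and the rate is zero; putting all weight on someone with zero income gives `$\bar{g} = 0$` and the revenue-maximizing rate; intermediate weights land in between.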

---
class: middle
## Optimal taxation at the top

Consider a top marginal tax rate `$\tau$` charged on incomes above `$z^*$`, earned by a fraction `$q$` of individuals

Imagine now an infinitesimal change in the tax rate `$d\tau$`, which has two effects. First, each individual `$i$` above `$z^*$` pays an additional `$(z^i - z^*)d\tau$` (**mechanical effect**), which raises revenue by this amount at a social welfare cost of `$g^i (z^i - z^*)d\tau$`

But there is also a **behavioral effect** `$\tau dz^i$` &mdash; since: `$$e^i = \frac{dz^i}{d(1 - \tau)} \frac{1 - \tau}{z^i} \Rightarrow dz^i = - e^i \frac{z^i}{1-\tau} d\tau$$`

---
class: middle

A rise in marginal tax after `$z^*$` has two effects on revenue: individuals are *mechanically* paying more taxes for a fixed taxable income (positive), but for a given tax level they *behaviorally* reduce their income (negative) &mdash; at the optimum, these effects must be equal [PS13]
 
---
class: middle
## Optimal taxation at the top

Note that while a mechanical increase in revenue has a mixed effect on welfare (revenue rises, but taxpayers lose), the behavioral effect only changes social welfare through the **fiscal externality** (because of the *envelope theorem*) &mdash; therefore, if `$g$` is the average social weight of people earning more than `$z^*$` and `$z$` their average income:

`$$\text{[FOC]: } d\text{SWF} = \left[ (1 - g) (z - z^*) - e z \frac{\tau}{1 - \tau} \right] q d\tau =  0$$`
Calling `$a \equiv z / (z - z^*)$` (which is approximately the parameter of a Pareto distribution), the optimal tax rate at the top is given by `$$\tau^* = \frac{1 - g}{1 - g + a\cdot e}$$`

---
class: middle
## Optimal taxation at the top

If `$g = 0$`, then the optimal top rate is `$\tau^{*} = 1 / (1 + a \cdot e)$`, the top of the Laffer curve &mdash; and note that if agents have *decreasing marginal utility* of consumption, then `$g \rightarrow 0$` as `$z^*$` grows large

Furthermore, if `$z^* = 0$`, then `$a = 1$` and we get the linear revenue-maximizing rate, as before &mdash; for nonlinear taxation, `$a \geq 1$` measures how thin the top tail of the distribution is: the lower `$a$`, the thicker the tail (for the US, it is estimated at about `$1.5$`)

At the very top, if the income distribution is bounded, then as `$z^*$` approaches the highest income we have `$z \rightarrow z^*$` and `$a \rightarrow \infty$`, so `$\tau^* = 0$` &mdash; this is the famous result of *zero marginal taxation at the top*
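
A short sketch of the top-rate formula. The sample incomes and the elasticity `e = 0.25` are hypothetical, chosen so that the Pareto parameter comes out at the US value of about 1.5 quoted above; function names are illustrative:

```python
def pareto_a(incomes, z_star):
    """a = z_m / (z_m - z*), where z_m is the mean income among those above z*."""
    top = [z for z in incomes if z > z_star]
    z_m = sum(top) / len(top)
    return z_m / (z_m - z_star)


def top_rate(g, a, e):
    """Optimal top marginal rate, tau* = (1 - g) / (1 - g + a * e)."""
    return (1.0 - g) / (1.0 - g + a * e)


a = pareto_a([50.0, 150.0, 300.0, 450.0], z_star=100.0)
print(a)                            # mean above z* is 300, so a = 300 / 200
print(top_rate(0.0, a, 0.25))       # revenue-maximizing top rate (g = 0)
print(top_rate(0.5, a, 0.25))       # a positive weight on top earners lowers the rate
print(top_rate(0.0, 1e9, 0.25))     # a -> infinity: the rate goes to zero
```

Setting `a = 1` recovers the linear revenue-maximizing rate `$1/(1+e)$`.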

---
class: inverse, middle, center

# Chetty, R. (2009). “Is the taxable income elasticity sufficient to calculate deadweight loss? The implications of evasion and avoidance”

---
class: middle
## Taxable income elasticity

We just saw that the *sufficient elasticity* for optimal labor taxation is the elasticity of income &mdash; in the past, this was thought of as the elasticity of labor supply

But empirical elasticities of labor supply are very small, around 0.1 [Kea11], which implies very high optimal tax rates &mdash; in that case, the top of the Laffer curve is at `$1/(1 + 0.1) \approx 91\%$`

This "counterintuitive" result generated a lot of discussion, and the main retort came through a pair of extremely influential papers from Martin Feldstein in 1995 and 1999

---
class: middle
## Taxable income elasticity

In these papers, Feldstein argued that labor elasticity captured only one form of behavioral response to taxation (work hours), while ignoring other potentially even more important ones: effort, education, career choice, etc.

To account for these other mechanisms he studied the **taxable income elasticity**: any other behavioral response that matters must lead to a decrease in taxable income, so we can consider only this one sufficient statistic

With this new methodology, Feldstein found gigantic efficiency costs of taxation: he estimated a TI elasticity higher than 1 (later estimates are considerably lower), which would put most countries on the *wrong side* of the Laffer curve

---
class: middle
## Taxable income elasticity

This approach became dominant because it has two big advantages: (i) it is practical, as researchers need "only" tax data; (ii) it is elegant, since it need not model the many possible channels of deadweight loss

The problem is that one way of reducing taxable income is **tax evasion** &mdash; and intuitively, we should not consider evasion a cost when considering the optimal tax rate

Feldstein's contribution was to argue the opposite: an optimizing agent equates the marginal benefit of evasion (the tax rate) with the marginal cost of evasion, so evasion generates real economic costs that must be taken into account

---
class: middle
## Feldstein (1999) model without evasion

Consider a canonical labor supply model, with a linear labor tax generating revenue `$R(t) = twl$`

`$$\max_{l} u(c, l) = c - \psi (l) \text{, s.t. } c = y + (1 - t) wl$$`
We want to calculate the welfare impact of taxation when the government revenue is returned to the individual, namely `$$W(t) = \left\{ y + (1 - t)wl - \psi (l) \right\} + twl$$`

The **envelope theorem** implies that  `$du / dt = \partial u / \partial t = -wl$`

---
class: middle
## Feldstein (1999) model without evasion

Therefore: `$$\frac{d W}{dt} = \frac{du}{dt} + \frac{dR}{dt} = \frac{\partial u}{\partial t} + \frac{dR}{dt}$$`

`$$= -wl + wl + t \frac{d(wl)}{dt} = t \frac{d \text{TI}}{dt},$$`
where the **taxable income** TI is a sufficient statistic! (Note that we are only considering welfare losses here, so optimal tax is zero)

---
class: middle
## Feldstein (1999) model with evasion

Now imagine that the agent can evade an amount `$e$` paying a cost `$g(e)$` &mdash; now the agent solves in `$l$` and `$e$`: `$$\max_{l,e} u(c, l) = c - \psi (l) - g(e) \text{ s.t. } c = y + (1 - t) (wl - e) + e$$`

Social welfare is now given by: `$$W(t) = \left\{ y + (1 - t)(wl - e) + e - \psi (l) - g(e) \right\} + t(wl - e)$$`

`$$\text{But again }\frac{d W}{dt} = - \left( wl - e \right) + \left( wl - e \right) + t \frac{d (wl - e)}{dt} = t \frac{d\text{TI}}{dt}$$`

---
class: middle
## Feldstein (1999) model with evasion

This was Feldstein's insight: it does not matter for deadweight loss if reductions in taxable income come from reduced labor supply or higher evasion

This is an application of the **envelope theorem**: at the optimum, agents equate the *marginal cost of evading* one more dollar, `$g^{\prime}(e)$`, with the marginal benefit `$t$`

Therefore the welfare cost of evasion, at the margin, is the same as the *marginal social value of labor income*, which at the optimum is also `$t$`, since agents work until the disutility of earning an extra dollar is `$1-t$`

---
class: middle
## Evasion as transfers

Chetty's contribution is noting that the above logic is only valid if *evasion is an economic cost*: a large part of it, though, is just transfers to third parties (notably government fines)

Note here that payments to lawyers and accountants do not count as transfers: in a competitive market, a lawyer's wage will equal their marginal cost of effort, which is a social cost

Imagine then that the government can audit tax payers, imposing an expected evasion cost of `$z(e, t) = p(e) \left[ te + F(e,t) \right]$`, where `$p(e)$` is the audit probability and `$F(e,t)$` the fine &mdash; the agent's problem becomes:

---
class: middle
## Evasion as transfers

`$$\max_{l,e} u(c, l) = c - \psi (l), \text{ s.t. } c = y + (1 - t) (wl - e) + e - z(e,t)$$`
`$$\Rightarrow W(t) = \left\{ y + (1 - t)(wl - e) + e - \psi (l) - z(e,t) \right\} + t(wl - e) + z(e,t)$$`

But now the sufficient statistic is not taxable income (elasticity) anymore, but **labor income**: `$$\frac{d W}{dt} = - \left( wl - e \right) - \frac{\partial z}{\partial t} + \left( wl - e \right) + \frac{\partial z}{\partial t} + t \frac{d (wl - e)}{dt} + \frac{\partial z}{\partial e} \frac{d e}{d t}$$`

`$$= t \frac{d\text{TI}}{dt} + \frac{\partial z}{\partial e} \frac{d e}{d t} =  t \frac{d\text{LI}}{dt} + \frac{d e}{d t} \left(\frac{\partial z}{\partial e} - t \right) = t \frac{d\text{LI}}{dt} \text{!}$$`

---
class: middle
## General case

The general model is simply a combination of case 2 (with `$g^{\prime}(e)$`) and 3 (with `$z(e,t)$`) &mdash; the derivation is exactly as in the previous slide, reaching: `$$\frac{d W}{dt} = t \frac{d\text{LI}}{dt} + \frac{d e}{d t} \left(\frac{\partial z}{\partial e} - t \right)$$`

This time however, the FOC for evasion is different: `$t = z^{\prime}(e) + g^{\prime}(e)$`, which leads to a different (general) formula for deadweight-loss: `$$\frac{d W}{dt} = t \frac{d\text{LI}}{dt} - g^{\prime}(e)\frac{de}{dt}$$`

---
class: middle
## General case

If we call `$\mu \equiv g^{\prime}(e)/(g^{\prime}(e) + z^{\prime}(e)) = g^{\prime}(e)/t$` the share of evasion cost that is an **economic cost**, we get:

`$$\frac{d W}{dt} = t \left\{ (1 - \mu + \mu) \frac{d(wl)}{dt} - \mu\frac{de}{dt} \right\} = t \left\{ \mu \frac{d\text{TI}}{dt} + (1 - \mu) \frac{d\text{LI}}{dt} \right\}$$`

`$$= - \frac{t}{1 - t} \left\{ \mu \text{TI} \epsilon_{\text{TI}} + (1 - \mu) wl \epsilon_{\text{LI}} \right\}$$`

Where `$\epsilon_{\text{TI}}$` (or `$\epsilon_{\text{LI}}$`) is the (absolute value) elasticity of taxable income (resp. labor income) to the *net-of-tax rate* `$1 -t$`
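
The decomposition can be checked numerically. The sketch below assumes specific functional forms that are not in the original slides: `$\psi(l) = l^{1+1/\varepsilon}/(1+1/\varepsilon)$`, a resource cost `$g(e) = \mu e^2 / 2k$` and a transfer cost `$z(e) = (1-\mu) e^2 / 2k$`, so that the evasion FOC gives `$e = kt$` and the resource share is exactly `$\mu$`. All parameter values are hypothetical:

```python
def labor(t, w=1.0, eps=0.3):
    """Labor supply from psi'(l) = (1 - t) w, i.e. l = ((1 - t) w)**eps."""
    return ((1.0 - t) * w) ** eps


def evasion(t, k=0.2):
    """Evasion from t = g'(e) + z'(e) = e / k under the assumed quadratic costs."""
    return k * t


def welfare(t, mu, y=0.0, w=1.0, eps=0.3, k=0.2):
    """W(t) = y + w l - psi(l) - g(e): taxes and fines are transfers and cancel out."""
    l, e = labor(t, w, eps), evasion(t, k)
    psi = l ** (1.0 + 1.0 / eps) / (1.0 + 1.0 / eps)
    resource_cost = mu * e ** 2 / (2.0 * k)
    return y + w * l - psi - resource_cost


def derivative(f, t, h=1e-6):
    """Central finite difference."""
    return (f(t + h) - f(t - h)) / (2.0 * h)


def formula(t, mu, w=1.0, eps=0.3, k=0.2):
    """t * (mu * dTI/dt + (1 - mu) * dLI/dt), with numerical derivatives."""
    d_li = derivative(lambda s: w * labor(s, w, eps), t)
    d_ti = derivative(lambda s: w * labor(s, w, eps) - evasion(s, k), t)
    return t * (mu * d_ti + (1.0 - mu) * d_li)


for mu in (0.0, 0.5, 1.0):
    print(mu, derivative(lambda s: welfare(s, mu), 0.3), formula(0.3, mu))
```

With `mu = 1` this is Feldstein's case (only taxable income matters), with `mu = 0` it is the pure-transfer case (only labor income matters), and the two columns should coincide for any `mu` in between.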

---
class: middle
## Optimal taxation

Assume that government revenue generates a benefit `$(1 + \lambda)$` per dollar (otherwise `$t^* = 0$`) &mdash; then at the optimum the following FOC must hold: `$$\frac{d \tilde{W}}{dt} = - \frac{t}{1 - t} \left\{ \mu \text{TI} \epsilon_{TI} + (1 - \mu) wl \epsilon_{LI} \right\} + \lambda \text{TI} \left( 1 - \frac{t}{1-t} \epsilon_{TI} \right) = 0$$`

`$$\Rightarrow \frac{t^*}{1 - t^*} = \frac{\lambda}{ (\mu + \lambda) \epsilon_{TI} + (1 - \mu) \left(wl/(wl-e) \epsilon_{LI}\right)}$$`

Equivalently, optimal taxation equates the **marginal benefit of public funds** `$\lambda$` with the marginal cost of public funds, which is the welfare cost per unit of revenue `$R$`: `$\text{MCPF} \equiv -(dW/dt)/(dR/dt)$`
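
To see how the resource share `$\mu$` moves the optimal rate, here is a minimal sketch that solves the formula for `$t^*$`. All parameter values are hypothetical, and `li_over_ti` stands for the ratio `$wl/(wl-e)$`:

```python
def optimal_rate(lam, mu, eps_ti, eps_li, li_over_ti=1.0):
    """Solve t / (1 - t) = lam / ((mu + lam) eps_TI + (1 - mu) (LI / TI) eps_LI) for t."""
    ratio = lam / ((mu + lam) * eps_ti + (1.0 - mu) * li_over_ti * eps_li)
    return ratio / (1.0 + ratio)


# Hypothetical calibration: large taxable income elasticity, small labor income elasticity
params = dict(lam=0.3, eps_ti=0.5, eps_li=0.1, li_over_ti=1.1)
for mu in (1.0, 0.5, 0.0):
    print(mu, round(optimal_rate(mu=mu, **params), 3))
```

When the labor income response is smaller than the taxable income response, a lower `$\mu$` (more of the evasion cost being a transfer) raises the optimal rate.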

---
class: middle
## Taxable income elasticity

Feldstein's argument of taxable income elasticity being a **sufficient statistic** requires that the *social* marginal cost of evasion equals the marginal benefit, but rational choice only ensures this for *private* marginal cost

Besides optimization errors, these two will differ when part of the costs are transfers to other agents: notably, audit fines &mdash; then, we also need to estimate the **labor** income elasticity for optimal taxation

In fact, if evasion costs are purely transfers, then evasion itself causes no deadweight loss, and we should account only for the labor income elasticity &mdash; in practice, it will be something in between, namely, `$\mu \in (0,1)$`

---
class: inverse, middle, center

# Piketty, T., Saez, E., and S. Stantcheva (2014). “Optimal taxation of top labor incomes: a tale of three elasticities”

---
class: middle
## Three elasticities

The authors propose three possible relations between top tax rate and income share of the top 1%: (i) a real effect of lower economic activity, (ii) higher tax evasion; and (iii) less **rent seeking** by the rich

Motives (i) and (ii) were already discussed in the previous paper; here they try to argue empirically that (iii) is relevant, and they calibrate *sufficient statistics* formulas with it

To do this they analyze how the income share of the top 1% and GDP correlate with top marginal tax rates over time and across countries, as well as how top rates affect CEO rent-seeking behavior

---
class: middle
## Optimal top rate

By a process very similar to what we have seen, they arrive at this "3 elasticities formula" for the optimal top tax rate, where `$e = (y/z)e_1 + e_2 + e_3$` is the **elasticity of taxable income** and `$t$` (as opposed to `$\tau$`) is the tax rate on evaded income:

`$$\tau^{*} = \frac{1 + t \cdot a \cdot e_2 + a \cdot e_3}{1 + a \cdot e}$$`

Here:

1. `$e_1$` measures real economic effects of taxation (decreased labor supply, effort, etc); 
2. `$e_2$` is the elasticity of evasion, which can be lessened by a broad tax base; 
3. `$e_3$` is the elasticity of rent-seeking, which actually *increases* optimal tax

---
class: middle
## "Macro" evidence

Looking at the time series for the USA, there is a large negative correlation between tax rates at the top and top 1% income, which indicates a large *overall* elasticity of taxable income: `$e \approx 0.5$`

Also, using broader measures of income does not give very different estimates, so the tax evasion elasticity cannot be too large

Finally, there is no negative correlation (if anything, a positive one) between tax rates at the top and *average* income, so real effects of taxation must be small &mdash; if `$e$` is big, but `$e_1$` and `$e_2$` are small, then `$e_3$` must be sizable

---
class: middle

After the war, top MTRs were high and the top 1% share small, while after the 80s top MTRs declined significantly and *exactly at the same time* the top 1% share started to rise rapidly &mdash; these are only correlations, but the timing coincides closely

---
class: middle

The same holds if we look at top 1% income instead of income share &mdash; but if we look at "bottom" 99% income, we see the opposite: it grew when top MTRs were high and stopped increasing after *Reaganomics*

---
class: middle

They also look at the cross-country panel of growth of top 1% income share and changes in top MTR, finding strikingly similar results

---
class: middle
## "Micro" evidence

The above analysis is suggestive, but entirely correlational: *correlation is not causation* &mdash; the authors try to strengthen their argument by giving some "microfoundational" evidence on the importance of **rent-seeking**

[BM01] show that CEOs frequently earn higher pay for reasons entirely outside of their control ("luck") &mdash; this would imply that their remuneration does not only reflect high productivity, but also *bargaining for rents*

Here they contribute by finding that this *luck remuneration* is stronger when top marginal tax rates are lower &mdash; when CEOs pay lower taxes, it is worth more for them to pay costs to extract rents from the firm

---
class: middle
## Remunerating for luck

In an efficient contract, observable components that are independent from CEO effort ("luck") should never affect their remuneration &mdash; we can check this hypothesis in two stages

First, we estimate the effect of firm performance on CEO remuneration: `$$\log (\text{pay}_{it}) = \beta p_{it} + \gamma_i + \chi_t + \mathbf{X}_{it} \alpha + \epsilon_{it}$$`

This OLS regression estimates the overall (total) sensitivity of CEO pay to firm performance `$p_{it}$` &mdash; we then run an IV regression using as instrument the average performance of *other* firms in the same industry, `$p_{\text{luck},it}$`, which is (by assumption) orthogonal to CEO effort

---
class: middle
## Remunerating for luck

If `$\beta_{IV} > 0$`, then CEOs are getting paid for luck &mdash; further, if `$\beta_{IV} \geq \beta_{OLS}$`, then luck is rewarded *at least as much* as effort

Here the exercise is to test whether `$\beta_{IV}$` *relative to* `$\beta_{OLS}$` is higher when there is more surplus to be bargained: when top marginal tax rates are lower

So they divide the USA time series into a period with high (1970-1986) and a period with low (1987-2010) marginal tax rates at the top, and run this exercise in each &mdash; they also examine internationally whether lower tax rates are related to higher CEO salaries
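
The logic of the two-stage test can be illustrated with simulated data. This is a deliberately stripped-down sketch, not the authors' specification: it drops the fixed effects and controls, and it assumes performance is the sum of an industry-wide "luck" shock and a CEO-driven component, each with unit variance. The coefficients `b_luck` and `b_skill` are hypothetical:

```python
import random


def cov(xs, ys):
    """Sample covariance (population normalization)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def simulate(b_luck, b_skill, n=50_000, seed=0):
    """Return (beta_OLS, beta_IV) for log pay regressed on firm performance."""
    rng = random.Random(seed)
    luck = [rng.gauss(0, 1) for _ in range(n)]    # industry shock, used as the instrument
    skill = [rng.gauss(0, 1) for _ in range(n)]   # performance driven by the CEO
    perf = [u + s for u, s in zip(luck, skill)]
    log_pay = [b_luck * u + b_skill * s + rng.gauss(0, 0.5)
               for u, s in zip(luck, skill)]
    beta_ols = cov(perf, log_pay) / cov(perf, perf)   # pay sensitivity to any performance
    beta_iv = cov(luck, log_pay) / cov(luck, perf)    # pay sensitivity to lucky performance
    return beta_ols, beta_iv


print(simulate(b_luck=0.0, b_skill=0.6))  # efficient contract: beta_IV close to zero
print(simulate(b_luck=0.9, b_skill=0.3))  # pay for luck: beta_IV above beta_OLS
```

In this setup `$\beta_{OLS}$` converges to the average of the two coefficients and `$\beta_{IV}$` to `b_luck`, so the comparison of the two isolates how strongly luck is rewarded.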

---
class: middle

In the high-tax period, `$\beta_{OLS}$` is relatively low and larger than `$\beta_{IV}$`, while in the low-tax period a 1% increase in stock-market value increases remuneration by 0.37% in general and by a much larger 0.95% if it is due to luck!

---
class: middle

The elasticity of (pre-tax) CEO pay with respect to the *net-of-tax rate* at the top is `$1.9$` (very large!) &mdash; interestingly, this relation is stronger in firms with a worse governance score, where arguably CEOs can engage more successfully in **rent-seeking behavior**

---
class: middle

We can use this to try to *calibrate* an optimal labor tax formula for marginal taxation at the top: if all the reduction in taxable income comes from real effects, the optimal top MTR is an (already high) 57%, but if we believe that real effects are small and rent-seeking is important, it can be as high as 83%
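
A sketch of this calibration with the three-elasticities formula. It uses `a = 1.5` and an overall elasticity `e = 0.5`, as quoted earlier in these slides; the split `e_1 = 0.2`, `e_3 = 0.3` in the second scenario is an assumption made here for illustration, as is setting `$y/z = 1$`:

```python
def top_rate_three_elasticities(a, e1, e2, e3, t=0.0, y_over_z=1.0):
    """tau* = (1 + t a e2 + a e3) / (1 + a e), with e = (y/z) e1 + e2 + e3."""
    e = y_over_z * e1 + e2 + e3
    return (1.0 + t * a * e2 + a * e3) / (1.0 + a * e)


# Scenario 1: the whole response is real (supply-side)
print(round(top_rate_three_elasticities(a=1.5, e1=0.5, e2=0.0, e3=0.0), 3))
# Scenario 2 (assumed split): small real response, the rest is rent-seeking
print(round(top_rate_three_elasticities(a=1.5, e1=0.2, e2=0.0, e3=0.3), 3))
```

Holding the overall elasticity fixed, shifting weight from `$e_1$` to `$e_3$` raises the numerator and leaves the denominator unchanged, which is why the optimal rate increases.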

---
class: inverse, middle, center

# Akcigit, U., Baslandze, S., and S. Stantcheva (2016). “Taxation and the international mobility of inventors”

---
class: middle
## Taxation and mobility

One potential impact of taxation that is usually not well covered by taxable income elasticity estimates is tax-payer mobility: many workers respond to higher taxes not by working less, but moving to another location altogether

This can be very harmful to economic activity and long-term growth, especially since those most affected by top marginal tax rates are (presumably) the most productive workers

This paper examines an exemplary case: superstar inventors, those in the top 1% of patent citations

---
class: middle
## Taxation and mobility

Researching international mobility poses some big difficulties: (i) it requires an international panel of people's locations; (ii) mobility decisions depend on *counterfactual* incomes that people expect to receive elsewhere; and (iii) country taxes are endogenous

Here they address these issues by using international data on patent filings, constructing counterfactual incomes based on inventor characteristics, and using quasi-experimental evidence on international mobility

---
class: middle

This paper investigates the behavior of the top 1% in number of citation-weighted patents: extremely prolific and highly paid inventors

---
class: middle

The plan is to compare the top 1% of inventors (**treatment group**), high-income inventors whose average tax rate is very close to the top MTR, with top 5%/10%/25% inventors (**control group**), who should be similar in most characteristics (at least in trends), but are less affected by changes in the top MTR

---
class: middle

Consistent with that, top inventors respond positively to the top retention (or net-of-tax) rate, while low-quality inventors show no reaction &mdash; however, for domestic inventors, the elasticity is small

---
class: middle

However, for foreigners, the mobility elasticity w.r.t. retention rate is very considerable

---
class: middle

They also use as a *quasi-experiment* the collapse of the Soviet Union, which made it easier for Russian scientists to migrate: before the collapse, the elasticity of mobility to the retention rate at the top was similar (and small) for all scientists; after the collapse, it increased sharply for high-quality scientists, but not for low-quality ones

---
class: middle

Another exercise is a **synthetic control** method investigating the decline in the top MTR with the 1986 tax reform: for the foreign top 1% of inventors the elasticity is enormous, but for the top 10-25% it is not distinguishable from zero

---
class: middle

Finally, they look at the effects of the 1992 Danish tax reform, and find a 0.72 elasticity for foreign inventors

---
class: middle
## Location choice model

`$$U^{i}_{ct} = u \left( (1 - \tau_{ch^{i}t}) w^i_{ct} \right) + \mu_{ct}^i$$`
`$$\Rightarrow U_{cti} = \alpha_{r_{it}} \log \left( 1 - \text{top MTR}^i_{ct} \right) + \alpha \log ( w_{cti} ) + \beta_c \mathbf{x_{ti}} + \zeta \mathbf{x_{ct}} + \eta \mathbf{x_{cti}} + \nu_{cti}$$`

If the error term `$\nu_{cti}$` follows a type I extreme value distribution, this structural model can *be estimated as* a multinomial logit
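
Under that distributional assumption, location probabilities take the familiar logit form, and the elasticity of choosing country `$c$` with respect to its retention rate is `$\alpha (1 - P_c)$`. A minimal sketch with three hypothetical countries; the coefficient `alpha_tax = 0.8`, the tax rates and the `other` utility terms are made up for illustration:

```python
import math


def choice_probabilities(utilities):
    """Multinomial logit: P_c = exp(V_c) / sum_k exp(V_k)."""
    m = max(utilities)                        # subtract the max for numerical stability
    expv = [math.exp(v - m) for v in utilities]
    total = sum(expv)
    return [x / total for x in expv]


def location_utilities(alpha_tax, top_mtr, other):
    """V_c = alpha * log(1 - top MTR_c) + all other utility terms for country c."""
    return [alpha_tax * math.log(1.0 - m) + o for m, o in zip(top_mtr, other)]


def own_elasticity(alpha_tax, prob):
    """Elasticity of P_c with respect to the retention rate 1 - top MTR_c."""
    return alpha_tax * (1.0 - prob)


probs = choice_probabilities(location_utilities(0.8, [0.4, 0.5, 0.6], [0.0, 0.0, 0.0]))
print([round(p, 3) for p in probs])
print([round(own_elasticity(0.8, p), 3) for p in probs])
```

The elasticity is larger for destinations with a small baseline probability, which is one reason estimated elasticities can differ between natives (who mostly stay) and foreigners.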

---
class: middle

This multinomial logit model gives estimated elasticities very close to the quasi-experimental techniques: almost zero for natives (0.03), but considerable for foreigners (~0.8) &mdash; luckily very similar across specifications

---
class: middle

They also estimate a *differences-in-differences* style model comparing treated and control groups when top MTR changes, reaching similar results

---
class: middle

A (hypothetical) 10 p.p. decrease in the top MTR would have a small effect on the number of domestic inventors (0.1% in the USA), but a sizable effect on foreign ones (an 18% increase in the USA, even larger in other countries)

---
class: middle

Making "heroic assumptions" about patent value and productivity, they calculate yearly economic gains from a 5p.p. and 10p.p. decrease in top MTR

---
class: middle

Finally, they estimate a **Heckman selection model** using as instrumental variable a 1994 reform in USA and comparing location decisions between USA and Canada, and compare to a Probit model, finding almost identical results

---
class: inverse, middle, center

# Chetty, R., Friedman, J. N., Olsen, T., and L. Pistaferri (2011). “Adjustment costs, firm responses, and micro vs. macro labor supply elasticities: Evidence from Danish tax records”

---
class: middle
## Bunching

A technique recently developed in the public economics literature is **bunching**: estimating the taxable income elasticity by looking at how taxpayers bunch at *kink* points in the tax schedule (or *notches*, see next paper)

The insight is that in a rational choice model, the number of taxpayers at a kink is informative about the marginal rate of substitution between consumption and leisure, and therefore about the elasticity of earnings

Heuristically, if the marginal tax rate rises from 60% to 80% at the kink, the net return to an extra dollar of earnings falls from 0.4 to 0.2: those (and only those) for whom the cost of earning an extra dollar is higher than 0.2 but lower than 0.4 will stop at the kink

---
class: middle

In a linear budget set (dashed line), workers with different abilities `$n$` choose different levels of taxable income &mdash; but a piecewise linear tax system creates **kinks**, where the jump in the marginal tax rate causes workers with different abilities to choose the same amount of earnings (**bunching**) [Kle16]

---
class: middle

Empirically, they look at changes in marginal tax rates in Denmark in 2000

---
class: middle
## Bunching model

Consider a quasilinear utility in consumption with constant elasticity of labor supply:

`$$u_i (c, h) = c - \alpha_i^{-1/\epsilon} \frac{h^{1 + 1/\epsilon}}{1 + 1/\epsilon}$$`

Given a linear tax `$\tau$`, so that the budget constraint is `$c = (1 - \tau)hw$`, it generates a Marshallian labor supply that varies smoothly with the ability (or disutility of work) parameter `$\alpha_i$` (*there is no bunching*):

`$$h_i^* = \alpha_i \left( (1 - \tau)w \right)^{\epsilon}$$`

---
class: middle
## Bunching model

However, when the tax-schedule is *piecewise-linear*, at the *kink* where MTR rises from `$\tau_1$` to `$\tau_2 > \tau_1$`, then there is `$\underline{\alpha}$` and `$\bar{\alpha}$` such that:

`$$h_i^* = \alpha_i \left( (1 - \tau_1)w \right)^{\epsilon} \text{ if } \alpha_i < \underline{\alpha},$$`

`$$h_i^* = \alpha_i \left( (1 - \tau_2)w \right)^{\epsilon} \text{ if } \alpha_i > \bar{\alpha},$$`

and `$h_i^* = h_K = K/w$` otherwise, that is, if `$\alpha_i \in \left[ \underline{\alpha}, \bar{\alpha} \right]$`, where `$\underline{\alpha} = h_K /\left( (1 - \tau_1)w \right)^{\epsilon}$` and `$\bar{\alpha} = h_K / \left( (1 - \tau_2)w \right)^{\epsilon}$`
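
The three cases can be written as a single function. This is a direct transcription of the formulas above; the parameter values in the example are hypothetical and chosen to give round numbers:

```python
def hours(alpha, w, eps, tau1, tau2, K):
    """Optimal hours under a two-bracket schedule with a kink at earnings K."""
    h_K = K / w
    alpha_low = h_K / ((1.0 - tau1) * w) ** eps    # lowest type that bunches
    alpha_high = h_K / ((1.0 - tau2) * w) ** eps   # highest type that bunches
    if alpha < alpha_low:
        return alpha * ((1.0 - tau1) * w) ** eps   # interior choice in the lower bracket
    if alpha > alpha_high:
        return alpha * ((1.0 - tau2) * w) ** eps   # interior choice in the upper bracket
    return h_K                                     # bunching at the kink


params = dict(w=1.0, eps=0.5, tau1=0.36, tau2=0.64, K=8.0)
for alpha in (5.0, 10.0, 12.0, 13.0, 20.0):
    print(alpha, hours(alpha, **params))
```

In this example every type with `alpha` between roughly 10 and 13.3 chooses exactly `h_K = 8`, while types outside that range respond smoothly.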

---
class: middle
## Bunching model

If `$\alpha_i \sim F(\alpha_i)$`, then the model predicts a *bunching density* of `$$B_{NL} = F(\bar{\alpha}) - F(\underline{\alpha})$$`

`$$= F(h_K / \left( (1 - \tau_2)w \right)^{\epsilon}) - F(h_K /\left( (1 - \tau_1)w \right)^{\epsilon})$$`

which identifies (given these *parametric assumptions*) the **taxable income elasticity** `$\epsilon$` &mdash; by a Taylor approximation, if `$g$` is the counterfactual density without kinks:

`$$F(\bar{\alpha}) \approx F(\underline{\alpha}) + \frac{g(\alpha_i)  \left( (1 - \tau_1)w \right)^{\epsilon}}{2} \left[ h_K / \left( (1 - \tau_2)w \right)^{\epsilon} - h_K /\left( (1 - \tau_1)w \right)^{\epsilon}  \right]$$`

---
class: middle
## Empirical estimation

`$$\therefore \frac{B_{NL}}{h_K} = \frac{g(\alpha_i)}{2} \left[ \left( \frac{(1 - \tau_1)}{(1 - \tau_2)} \right)^{\epsilon} - 1 \right]$$`

To estimate the counterfactual density, they fit a polynomial of order `$q$` close to (but outside of) the kink threshold, dividing earnings into `$j\in J$` bins and estimating: `$$C_j = \sum_{i=0}^{q} \beta_i^0 (Z_j)^i + \sum_{i=-R}^{R} \gamma_i^0 \mathbb{1} \left[Z_j = i \right] + \varepsilon_j^0$$`

---
class: middle
## Empirical estimation

If `$\hat{C}_j = \sum_{i=0}^{q} \hat{\beta}_i^0 (Z_j)^i$` is the polynomial estimated excluding the kink area, then we can estimate the bunching by `$\hat{B}_N^0 = \sum_{j=-R}^{R} \left( C_j - \hat{C}_j \right) = \sum_{i=-R}^{R} \hat{\gamma}_i^0$`

And then we estimate: `$$\hat{b} = \frac{\hat{B}_N}{\sum_{j=-R}^{R} \hat{C}_j / (2R + 1)} \text{ and } \epsilon = \frac{\hat{b}}{K\ln \left( \frac{1-\tau_1}{1-\tau_2}\right)}$$`

Standard errors can be estimated by bootstrap
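
A sketch of the last step, taking the fitted counterfactual counts as given (the polynomial regression itself is not reproduced here). The bin counts, the window `R = 1`, the tax rates and the kink location are hypothetical, and `K` is assumed to be measured in units of the bin width:

```python
import math


def bunching_elasticity(observed, counterfactual, window, K, tau1, tau2):
    """Excess mass, normalized excess mass b_hat, and implied elasticity.

    observed, counterfactual: dicts mapping bin index (relative to the kink) to counts
    window: R, the number of bins on each side of the kink treated as the bunching area
    """
    bins = range(-window, window + 1)
    excess = sum(observed[j] - counterfactual[j] for j in bins)       # B_hat
    avg_cf = sum(counterfactual[j] for j in bins) / (2 * window + 1)  # mean counterfactual height
    b_hat = excess / avg_cf
    eps = b_hat / (K * math.log((1.0 - tau1) / (1.0 - tau2)))
    return excess, b_hat, eps


observed = {-1: 110, 0: 160, 1: 105}
counterfactual = {-1: 100, 0: 100, 1: 100}
print(bunching_elasticity(observed, counterfactual, window=1, K=50, tau1=0.45, tau2=0.63))
```

Even a visible spike at the kink translates into a small elasticity once it is scaled by `K` and by the size of the tax change, which is the pattern discussed in the slides that follow.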

---
class: middle

Tax changes might have a smaller effect on behavior because adjusting earnings is costly (**adjustment costs**): it often involves changing jobs, choosing different amounts of education, etc. &mdash; this would lead us to underestimate the *elasticity of taxable income* when looking at the empirical data

---
class: middle

Tax brackets change yearly, both because of policy changes and indirectly through inflation &mdash; this allows us to see if taxpayers "follow" the kinks

---
class: middle

**Bunching** is the difference between the observed histogram of earnings and the counterfactual fitted line in red &mdash; and it follows predictable patterns: it is higher for married women than for single men, and in careers with more control over wages, like teachers (vs. the military)

---
class: middle

Even though tax brackets change every year, the histogram of earnings exhibits bunching every year

---
class: middle

Bunching analysis yields implausibly small elasticities of labor supply to taxation, even in subsamples where the elasticity is known to be large (such as married women) &mdash; the authors propose this is because of *adjustment costs*

---
class: middle

In 2001 there was a change in the top MTR tax bracket &mdash; we observe bunching at the new tax bracket, not at the 1997 one, even if adjusted by inflation or average wage growth: this points towards an intentional movement towards the kink

---
class: middle

But teachers bunch at the kink, *even those who do not face a tax change at that point*: this is evidence of **aggregate bunching**, namely that unions and professional associations, rather than individual behavior, lead wages to bunch at the kink

---
class: middle

Tax payers with large deductions seem to bunch at their *individual kink*, while tax payers with small deductions can afford to depend only on aggregate bunching and be close enough to their kink

---
class: inverse, middle, center

# Kleven, H. and M. Waseem (2013). “Using notches to uncover optimization frictions and structural elasticities: Theory and evidence from Pakistan”

---
class: middle
# References

[BM01] M. Bertrand and S. Mullainathan. "Are CEOs rewarded for luck?
The ones without principals are". In: _The Quarterly Journal of
Economics_ 116.3 (2001), pp. 901-932.

[Kea11] M. P. Keane. "Labor supply and taxes: A survey". In: _Journal
of Economic Literature_ 49.4 (2011), pp. 961-1075.

[Kle16] H. J. Kleven. "Bunching". In: _Annual Review of Economics_ 8
(2016), pp. 435-464.

[PS13] T. Piketty and E. Saez. "Optimal labor income taxation". In:
_Handbook of public economics_. Vol. 5. Elsevier, 2013, pp. 391-474.