The Impact of a “Hard Brexit” on EU Labour Mobility – Methodology

Data sources:

Data on Indeed job searches come from a dataset of aggregated and anonymous search data from Indeed sites in 49 countries and economies. We define a cross-border searcher as someone who, based on their IP address location, is currently located in one country but searches for work in another country using the Indeed site in that country. If the same searcher looks in more than one country, we count them once for each country in which they search, but we count only one search per locality per IP address per month. We conducted our main analysis on the average monthly count of searchers within each locality from April 2016 through August 2016.
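The counting rule above can be illustrated with a minimal sketch. The record layout (month, IP address, IP country, site country, locality) and the sample values are hypothetical, and we assume here that "locality" is a field attached to each search; the actual Indeed data pipeline is not described in this document.

```python
from collections import defaultdict

# Hypothetical raw search records:
# (month, ip_address, ip_country, site_country, locality)
records = [
    ("2016-04", "ip1", "FR", "GB", "London"),
    ("2016-04", "ip1", "FR", "GB", "London"),  # repeat search, counted once
    ("2016-04", "ip1", "FR", "DE", "Berlin"),  # same searcher, second country
    ("2016-04", "ip2", "GB", "GB", "Leeds"),   # domestic search, ignored
    ("2016-05", "ip1", "FR", "GB", "London"),
    ("2016-05", "ip3", "FR", "GB", "London"),
]

def average_monthly_searchers(records):
    """Average monthly cross-border searchers per (origin, destination, locality)."""
    seen = set()
    months = set()
    totals = defaultdict(int)
    for month, ip, ip_country, site_country, locality in records:
        months.add(month)
        if ip_country == site_country:
            continue  # not a cross-border search
        key = (month, ip, site_country, locality)
        if key in seen:
            continue  # one search per locality per IP address per month
        seen.add(key)
        totals[(ip_country, site_country, locality)] += 1
    # Average over all months observed in the data
    return {k: v / len(months) for k, v in totals.items()}

averages = average_monthly_searchers(records)
```

In this toy sample the France-to-London flow has one searcher in April and two in May, so its average monthly count is 1.5.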

Data on distance, language, culture, colonial linkages and religion were taken from the CEPII distance dataset and from the gravity dataset used in Head, K., T. Mayer and J. Ries (2010), “The erosion of colonial trade linkages after independence”, Journal of International Economics, 81(1): 1-14. All the data we used are publicly accessible here:

Data on migration flows refer to 2010 and were taken from the database on Immigrants in OECD countries (DIOC):

The model

Following Beine et al. (2009), we fit a gravity model to a cross-sectional sample of job search and migration flows. The model included fixed effects for both sending and destination countries and was estimated by Ordinary Least Squares (OLS), with standard errors adjusted for heteroscedasticity using the Huber-White procedure. The final specification was:

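$$\ln(\mathrm{flow}_{ij}) = \alpha_0 + \alpha_1\,\mathrm{EU}_{ij} + \alpha_2\,\mathrm{language}_{ij} + \alpha_3\,\mathrm{colony}_{ij} + \alpha_4\,\mathrm{religion}_{ij} + \alpha_5 \ln(\mathrm{distance}_{ij}) + \gamma_j + \gamma_i + \varepsilon_{ij}$$

where i is the IP address (sending) country, j is the job search site (receiving) country, and γi and γj are the sending- and destination-country fixed effects.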
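As an illustration of this kind of estimation, the sketch below fits a log-linear gravity equation with sending- and destination-country dummies by OLS and computes Huber-White (HC0) standard errors. It assumes NumPy is available, uses synthetic data with made-up coefficients (0.4 on the EU dummy, -1.0 on log distance), and includes only two of the regressors for brevity; it is not the code used for the results reported here.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8                                            # hypothetical number of countries
member = np.array([1, 1, 1, 1, 0, 0, 0, 0])      # hypothetical EU membership
pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
origin = np.array([i for i, _ in pairs])
dest = np.array([j for _, j in pairs])

# Synthetic regressors and outcome (all values are made up)
eu = (member[origin] * member[dest]).astype(float)   # 1 if both countries are members
ln_dist = rng.uniform(5.0, 9.0, size=len(pairs))
fe_origin = rng.normal(size=n)
fe_dest = rng.normal(size=n)
ln_flow = (0.4 * eu - 1.0 * ln_dist + fe_origin[origin] + fe_dest[dest]
           + rng.normal(scale=0.01, size=len(pairs)))

def dummies(codes, n_levels):
    """One dummy column per level, dropping level 0 to avoid collinearity."""
    return (codes[:, None] == np.arange(1, n_levels)).astype(float)

# Columns: intercept, EU, ln(distance), origin dummies, destination dummies
X = np.column_stack([np.ones(len(pairs)), eu, ln_dist,
                     dummies(origin, n), dummies(dest, n)])

def ols_robust(y, X):
    """OLS coefficients with Huber-White (HC0) standard errors."""
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    meat = X.T @ (X * resid[:, None] ** 2)       # X' diag(e^2) X
    cov = xtx_inv @ meat @ xtx_inv               # sandwich estimator
    return beta, np.sqrt(np.diag(cov))

beta, se = ols_robust(ln_flow, X)
```

Here `beta[1]` and `beta[2]` are the estimated EU and distance coefficients; the language, colony and religion dummies would enter as further columns in the same way.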

As robustness checks, we restricted the sample to progressively smaller groups of countries in which Indeed is more widely used and established; the results did not change in any significant way, even in the most conservative scenario. Because the job search and migration models have different sample sizes, we also estimated both models on the same sample, again with no significant change in the results.

Detailed results

(Fixed effects coefficients are omitted from the tables but are included in all regressions)

Variable                             FE coefficient   SE      Marginal effect        Marginal effect
                                                              (dummy from 0 to 1)*   (dummy from 1 to 0)*

Dependent variable: job search flow as a share of sending country traffic on Indeed
EU membership                        0.38             0.07    0.46                   -0.31
Common language                      0.99             0.08    1.67                   -0.63
Belongs to former colonial empire    0.55             0.13    0.71                   -0.42
Shared religion                      0.8              0.08    1.23                   -0.55
N=2347; R²=0.88

Dependent variable: job search flow as a share of receiving country traffic on Indeed
EU membership                        0.37             0.07    0.45                   -0.31
Common language                      0.99             0.08    1.68                   -0.63
Belongs to former colonial empire    0.54             0.13    0.71                   -0.42
Shared religion                      0.8              0.08    1.23                   -0.55
N=2347; R²=0.86

Dependent variable: Migration flows in 2010 (source: OECD)
EU membership                        0.56             0.15    0.73                   -0.42
Common language                      1.01             0.14    1.72                   -0.63
Belongs to former colonial empire    0.63             0.17    0.85                   -0.46
Shared religion                      0.69             0.16    0.98                   -0.49
N=1150; R²=0.79

*Marginal effects are estimated following:
Giles, D. E. (2011a). Interpreting dummy variables in semi-logarithmic regression models: exact distributional results. Econometrics Working Paper EWP1101, Department of Economics, University of Victoria.
Kennedy, P. E. (1981). Estimation with correctly interpreted dummy variables in semilogarithmic equations. American Economic Review, 71, 801.
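
For readers who want to reproduce the marginal effects from a coefficient and its standard error, here is a minimal sketch. The 0-to-1 effect uses Kennedy's (1981) estimator, exp(c - V/2) - 1, where V is the squared standard error. The 1-to-0 formula is an assumption on our part: taking the inverse of the 0-to-1 effect reproduces the rounded values in the tables, but the exact computation is not stated in this document. The function names are illustrative.

```python
import math

def effect_0_to_1(coef, se):
    """Kennedy (1981): proportional change in the outcome when the dummy goes from 0 to 1."""
    return math.exp(coef - 0.5 * se ** 2) - 1.0

def effect_1_to_0(coef, se):
    """Assumed inverse of the 0-to-1 effect (consistent with the reported tables)."""
    return 1.0 / (1.0 + effect_0_to_1(coef, se)) - 1.0

# EU membership, first table: coefficient 0.38, standard error 0.07
eu_up = effect_0_to_1(0.38, 0.07)
eu_down = effect_1_to_0(0.38, 0.07)
```

With these inputs, `eu_up` rounds to 0.46 and `eu_down` to -0.31, matching the first row of the first table.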

6. There are Indeed sites in 63 countries, but we excluded countries from this analysis if they did not meet specific criteria. Because Indeed sites that have been active for less than 1 year are still immature, we excluded 10 countries. We then ran a series of sensitivity tests and determined that 4 additional countries with small Indeed market share were severely skewing the results; excluding them left a final set of 49 countries.

7. We were concerned that economic events could strongly affect search activity in specific countries, so we pulled the search data month by month to allow us to trim outliers if necessary. To test the sensitivity of the analysis to outliers, we computed both the mean and the median of the searches over the 5 months. We found negligible differences between the two and decided to use the mean.
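
A check of this kind can be sketched as follows. The monthly counts are invented for illustration (the first series contains a deliberate one-month spike), and the relative-gap measure is our own choice rather than the statistic used in the analysis.

```python
from statistics import mean, median

# Hypothetical monthly searcher counts (April to August) per country pair
monthly_counts = {
    ("FR", "GB"): [120, 130, 125, 400, 128],  # one-month spike
    ("DE", "GB"): [100, 102, 98, 101, 99],    # stable series
}

def mean_median_gap(counts):
    """Relative gap between mean and median; large values flag outlier months."""
    return (mean(counts) - median(counts)) / median(counts)

gaps = {pair: mean_median_gap(counts) for pair, counts in monthly_counts.items()}
```

A gap near zero, as in the stable series, indicates that the mean and the median tell the same story; a large gap points to a month worth inspecting before averaging.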