Last July I published a blog post, ‘Nuts about NUTS!’, in which I discussed the introduction of the new regional classification for statistics in Ireland. Briefly, a new regional classification for collecting statistics in Ireland was approved by the EU in 2016, following the Local Government Act 2014 and the establishment of three new Regional Assemblies. The CSO first used this new regional classification in the Labour Force Survey for Quarter 1 2018. It has since used the new regional classification in any data release which includes regional data.
The new structure involves three NUTS2 level regions, replacing the two previous NUTS2 regions (Southern & Eastern and Border, Midland & Western), and eight (revised) NUTS3 level regions as follows:
- NUTS2 Northern & Western: Composed of NUTS3 West (Galway, Mayo, Roscommon) and NUTS3 Border (Donegal, Sligo, Leitrim, Cavan, Monaghan).
- NUTS2 Southern: Composed of NUTS3 Mid-West (Clare, Limerick, Tipperary), NUTS3 South East (Wexford, Waterford, Carlow, Kilkenny) and NUTS3 South West (Cork, Kerry).
- NUTS2 Eastern & Midland: Composed of NUTS3 Dublin, NUTS3 Mid-East (Wicklow, Kildare, Meath, Louth) and NUTS3 Midland (Offaly, Laois, Westmeath, Longford).
The changes at NUTS3 level are the transfer of South Tipperary from the South-East NUTS3 region into the Mid-West NUTS3 region (following the amalgamation of North and South Tipperary Councils) and the movement of Louth from the Border NUTS3 region to the Mid-East NUTS3 region. Therefore four of the eight NUTS3 regions changed (Border, Mid-East, Mid-West and South-East) and four (West, South-West, Dublin and Midland) remained the same. The CSO published an Information Note on the revisions.
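For anyone handling this in a script, the two classifications can be sketched as a simple lookup. This is only an illustration: the variable and function names are my own, the ‘old’ membership is derived purely from the two changes described above, and the labels "North Tipperary" and "South Tipperary" are an assumed way of representing the formerly split county.

```python
# 'New' NUTS3 regions and their counties, as listed in this post.
NEW_NUTS3 = {
    "Border": {"Donegal", "Sligo", "Leitrim", "Cavan", "Monaghan"},
    "West": {"Galway", "Mayo", "Roscommon"},
    "Mid-West": {"Clare", "Limerick", "Tipperary"},
    "South-East": {"Wexford", "Waterford", "Carlow", "Kilkenny"},
    "South-West": {"Cork", "Kerry"},
    "Dublin": {"Dublin"},
    "Mid-East": {"Wicklow", "Kildare", "Meath", "Louth"},
    "Midland": {"Offaly", "Laois", "Westmeath", "Longford"},
}

# 'New' NUTS2 regions and the NUTS3 regions they contain.
NEW_NUTS2 = {
    "Northern & Western": ["West", "Border"],
    "Southern": ["Mid-West", "South-East", "South-West"],
    "Eastern & Midland": ["Dublin", "Mid-East", "Midland"],
}


def old_nuts3():
    """Rebuild the 'old' NUTS3 membership by undoing the two changes.

    Assumption: Tipperary is represented as two units in the old scheme.
    """
    old = {name: set(counties) for name, counties in NEW_NUTS3.items()}
    # Louth was in Border before moving to Mid-East.
    old["Mid-East"].discard("Louth")
    old["Border"].add("Louth")
    # South Tipperary was in South-East; North Tipperary in Mid-West.
    old["Mid-West"].discard("Tipperary")
    old["Mid-West"].add("North Tipperary")
    old["South-East"].add("South Tipperary")
    return old


def changed_regions(old, new):
    """Names of NUTS3 regions whose membership differs between schemes."""
    return sorted(name for name in new if old[name] != new[name])


OLD_NUTS3 = old_nuts3()
print(changed_regions(OLD_NUTS3, NEW_NUTS3))
```

Comparing the two lookups flags Border, Mid-East, Mid-West and South-East, which are the four regions where old and new figures should not be mixed.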
Like the UK leaving the EU, we are currently in something of a ‘transition period’ when it comes to regional statistics in Ireland, and while nowhere near as traumatic, it is causing a few problems and some confusion, namely:
Which NUTS3 regions? It is easy to tell the ‘new’ classification when it includes the NUTS2 regions. If you see Northern & Western, Southern and Eastern & Midland you know it is the ‘new’ one.
However, as the names of the NUTS3 regions remained unchanged, when you just see a list of the NUTS3 regions it is not clear if the ‘old’ or ‘new’ classification is being used. A good example of how this can be confusing was the recent Census of Industrial Production 2016. In this release data was provided for five NUTS3 regions (Border, Dublin, Mid-East, Midland and West), as the data for the South-West, South-East and Mid-West were suppressed for confidentiality reasons. No data was therefore given in the release publication for the NUTS2 regions. Reading the release, it was not clear which regional classification was used, and it was not stated in the Background Notes to the release. Only by going into Statbank to download the data, and seeing the NUTS2 regions of Northern & Western etc. listed, was it clear that the ‘new’ classification was used.
When using the CSO’s Statbank system, a ‘Note’ will usually appear in a dialog box at the top of the page (see below), and also at the bottom of the spreadsheet you download indicating if the new regional classification is used. If you are looking at a published CSO ‘Release’ such as the Adult Education Survey 2017, then the regional classification is usually included in the ‘Background Notes’.
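The ‘look for the NUTS2 names’ rule of thumb can be sketched in a few lines. This is a rough illustration only: the function names are hypothetical, the label matching is deliberately simple, and it returns "unknown" when only NUTS3 names are present, since those are the same in both schemes.

```python
# NUTS2 names under each scheme, lower-cased with "&" for "and".
NEW_NUTS2_NAMES = {"northern & western", "southern", "eastern & midland"}
OLD_NUTS2_NAMES = {"southern & eastern", "border, midland & western"}


def normalise(label):
    """Lower-case, use '&' for 'and', and collapse repeated spaces."""
    return " ".join(label.lower().replace(" and ", " & ").split())


def guess_classification(labels):
    """Guess 'new', 'old' or 'unknown' from the region labels in a table."""
    seen = {normalise(label) for label in labels}
    if seen & NEW_NUTS2_NAMES:
        return "new"
    if seen & OLD_NUTS2_NAMES:
        return "old"
    # Only NUTS3 names (or nothing recognisable): cannot tell from labels.
    return "unknown"


# The five NUTS3 regions shown in the release discussed above.
print(guess_classification(["Border", "Dublin", "Mid-East", "Midland", "West"]))
# The same list with a 'new' NUTS2 region alongside it.
print(guess_classification(["Northern and Western", "Border", "West"]))
```

An "unknown" result is the cue to read the notes attached to the data or to contact the provider.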
Different CSO data sets: As mentioned above, any new data (which includes a regional breakdown) issued by the CSO since mid-2018 is issued using the new regional classification. However, data sets published prior to that use the older classification. For example, the most recent Labour Force Survey data, which measures employment and unemployment, uses the new regional structure, but the County Incomes and Regional GDP 2015 data issued last February uses the older classification. Therefore a report or analysis drawing on a number of different CSO data sets may find that their regional figures are not directly comparable.
Different data sources: While the CSO has adopted the ‘new’ regional classifications for all releases, this may not be true of all data providers. Earlier this year the IDA issued its end of year results for 2018. The results included data for the number of jobs in IDA-backed companies by region and the annual change. The release did not specify which regional classification was used, so it was unclear if their ‘Border’ was the ‘old’ region including Louth or the ‘new’ region without Louth.
I got in touch with IDA and they confirmed they were continuing to use the ‘old’ regional structure as it aligned with their regional office structure, the regional targets set in their Strategy and current EU State Aid regulations. Their intention is to switch to the new structures from 2020.
While this is very understandable, it does raise the possibility of confusion and misunderstanding. For example, someone may compare total employment in the Border region in 2018 (derived from the LFS and using the ‘new’ regions) with employment in IDA companies (derived from IDA results and using the ‘old’ regions) without realising the two ‘Borders’ are not the same region. This clearly illustrates the need for all data providers and users to state which regional classification is being used.
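One way to guard against this in an analysis script is to carry the classification along with every figure and check it before comparing. The sketch below is an assumption about how one might structure this, not anything the CSO or IDA provides; the class and function names are hypothetical and the values are placeholders, not real data.

```python
from dataclasses import dataclass

# The four NUTS3 regions whose boundaries differ between the schemes.
CHANGED_REGIONS = {"Border", "Mid-East", "Mid-West", "South-East"}


@dataclass(frozen=True)
class RegionalFigure:
    source: str          # e.g. "LFS" or "IDA"
    region: str          # NUTS3 region name
    classification: str  # "old" or "new"
    value: float         # placeholder only in this example


def directly_comparable(a, b):
    """True if two figures refer to the same geographic area."""
    if a.region != b.region:
        return False
    if a.region not in CHANGED_REGIONS:
        # West, South-West, Dublin and Midland are identical in both schemes.
        return True
    return a.classification == b.classification


# Placeholder values (0.0): only the labels matter for the check.
lfs_border = RegionalFigure("LFS", "Border", "new", 0.0)
ida_border = RegionalFigure("IDA", "Border", "old", 0.0)
print(directly_comparable(lfs_border, ida_border))  # False: Louth differs
```

The same check passes for Dublin, whose boundary is the same under both classifications.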
Time series: When releasing new regional data with the ‘new’ classifications, the CSO is (where applicable) issuing ‘backdated’ results for the ‘new’ regions to 2012. For example, in the CSO’s Statbank, if you go to the Labour Force Survey you can download data for the ‘new’ regions for each quarter from Q1 2012 to Q3 2018 (latest). Clearly recoding and backdating massive data sets with new classifications is a time-consuming task, and providing six years of time series data is very welcome.
However, it does mean that conducting time series analysis at regional level further back than 2012 involves a break in the data at 2012 (except for the four ‘unchanged’ NUTS3 regions). This break in the time series can cause some confusion. An example was the Survey on Income and Living Conditions (SILC) published late last year. The CSO used the new classification for the 2017 release and backdated to 2012, with the data prior to that (2004-2011) using the old classification. It noted this in the dialog box (see below) and download in Statbank and in the ‘Information Note’ for the release. However, there were examples of commentary comparing the regional poverty data from 2008 and 2017 without making reference to the fact that, for four of the eight regions, the data for the two periods was not directly comparable.
Also, while the CSO has committed to providing backdated time series for the new regions to 2012, it is not clear if all data providers will do the same. Therefore it is important to check.
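The break in the series can be expressed as a simple rule. The sketch below assumes a series backdated to 2012, as described above for CSO data; the function names are my own and the cut-off year is a parameter precisely because other providers may backdate differently, or not at all.

```python
# Year to which the 'new' regions are backdated (2012 for CSO data, per
# this post). Other providers may differ, so it is passed as a parameter.
BACKDATED_TO = 2012

# NUTS3 regions whose boundaries did not change.
UNCHANGED_REGIONS = {"West", "South-West", "Dublin", "Midland"}


def classification_for_year(year, backdated_to=BACKDATED_TO):
    """Which scheme a given year's regional figure is published under."""
    return "new" if year >= backdated_to else "old"


def comparable_years(region, year_a, year_b, backdated_to=BACKDATED_TO):
    """True if a region's figures for two years cover the same area."""
    if region in UNCHANGED_REGIONS:
        return True
    return (classification_for_year(year_a, backdated_to)
            == classification_for_year(year_b, backdated_to))


print(comparable_years("Border", 2008, 2017))  # False: spans the break
print(comparable_years("Dublin", 2008, 2017))  # True: boundary unchanged
print(comparable_years("Border", 2012, 2017))  # True: both backdated
```

For the SILC example above, a 2008 versus 2017 comparison fails this check for Border, Mid-East, Mid-West and South-East, and passes for the other four regions.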
Coping with the transition!
Obviously, over time this issue will largely resolve itself as the revised classification becomes the norm for all data and we move further away from the ‘break’ in the time series. In the meantime, however, here are a few suggestions for coping with the transition:
- Check: When looking at or downloading any data at regional level, check which regional classification is used. For CSO data, it is usually stated in a ‘dialog’ box in Statbank and in the Background/Information Note for the release, and if the data includes NUTS2 regions it is easy to tell. In general, any CSO data issued since June 2018 uses the ‘new’ classification and anything issued before that uses the ‘old’. Also be sure to check if (and how far back) any backdating of the data has been done; for CSO data it will generally be to 2012. Other data sources need to be checked on a case by case basis.
- Ask: If it is not stated or clear from the release or data, contact the data provider to check. The more people who make a query, the more conscious all data providers will be to clarify which classification is used.
- Say: If you are a data provider publishing data at regional level, state explicitly which regional classification is used. If you are a data user publishing analysis or commentary using regional data, clarify which classification you are using. This is particularly critical if multiple data sets or sources with different regional classifications are used.
While this may all seem a little pedantic, given the current interest in the impact of Brexit on the border economy, knowing whether someone discussing data for the Border means a Border including Louth (with Dundalk and Drogheda) or a Border excluding Louth could make quite a lot of difference.