NUTS 2 based data - example using Germany

Here you can post your opinions, ask questions and share experiences on the PATSTAT product line. Please always indicate the PATSTAT edition (e.g. 2015 Autumn Edition) and the database (e.g. PATSTAT Online, MySQL, MS SQL Server, ...) you are using.
Post Reply

EPO / PATSTAT Support
Posts: 440
Joined: Thu Feb 22, 2007 5:33 pm
Contact:

NUTS 2 based data - example using Germany

Post by EPO / PATSTAT Support » Mon Jun 22, 2020 10:12 am

Many users are asking how to create NUTS 2 based tables -similar to what Eurostat used to do-. NUTS based data at the lower levels (2-3) can only be created for persons (applicants and inventors) where there is good address data available. In the person table you will find the most granular NUTS code that could be created based on the country code (NUTS 0) and the address (allowing NUTS 1-3) depending on completeness.
To create NUTS 2 based data, users need to aggregate the data from the lower level 3 to level 2.
Concrete: if you want to have data for NUTS 2 = DE11 Stuttgart, you will need to sum up all the data from the lower NUTS 3 codes for Stuttgart. Those will be: DE111: Stuttgart-Stadtkreis, DE112 Böblingen, DE113 Esslingen, DE114 Göppingen, DE115 Ludwigsburg, DE116 Rems-Murr-Kreis, DE117 Heilbronn-Stadtkreis, DE118 Heilbronn-Landkreis, DE119 Hohenlohekreis, DE11A Schwäbisch Hall, DE11B Main-Tauber-Kreis, DE11C Heidenheim and DE11D Ostalbkreis. (The complete lists for all NUTS codes is available in tls904_nuts).
The "adding up together" in SQL is done by "grouping" at the right level.

Here is the SQL code that counts the number of granted patents filed at the EPO, having (at least 1) applicant from Germany.

Code: Select all

SELECT nuts2.nuts ,   nuts2.nuts_label,
COUNT(distinct(case when year(publn_date)  = 2005 then tls211_pat_publn.appln_id end)) as "2005",
COUNT(distinct(case when year(publn_date)  = 2006 then tls211_pat_publn.appln_id end)) as "2006",
COUNT(distinct(case when year(publn_date)  = 2007 then tls211_pat_publn.appln_id end)) as "2007",
COUNT(distinct(case when year(publn_date)  = 2008 then tls211_pat_publn.appln_id end)) as "2008",
COUNT(distinct(case when year(publn_date)  = 2009 then tls211_pat_publn.appln_id end)) as "2009",
COUNT(distinct(case when year(publn_date)  = 2010 then tls211_pat_publn.appln_id end)) as "2010",
COUNT(distinct(case when year(publn_date)  = 2011 then tls211_pat_publn.appln_id end)) as "2011",
COUNT(distinct(case when year(publn_date)  = 2012 then tls211_pat_publn.appln_id end)) as "2012",
COUNT(distinct(case when year(publn_date)  = 2013 then tls211_pat_publn.appln_id end)) as "2013",
COUNT(distinct(case when year(publn_date)  = 2014 then tls211_pat_publn.appln_id end)) as "2014",
COUNT(distinct(case when year(publn_date)  = 2015 then tls211_pat_publn.appln_id end)) as "2015",
COUNT(distinct(case when year(publn_date)  = 2016 then tls211_pat_publn.appln_id end)) as "2016",
COUNT(distinct(case when year(publn_date)  = 2017 then tls211_pat_publn.appln_id end)) as "2017",
COUNT(distinct(case when year(publn_date)  = 2018 then tls211_pat_publn.appln_id end)) as "2018",
COUNT(distinct(case when year(publn_date)  = 2019 then tls211_pat_publn.appln_id end)) as "2019",
COUNT(distinct(case when year(publn_date)  = 2020 then tls211_pat_publn.appln_id end)) as "2020",
COUNT(distinct(case when year(publn_date)  = 2021 then tls211_pat_publn.appln_id end)) as "2021",
COUNT(distinct(case when year(publn_date)  = 2022 then tls211_pat_publn.appln_id end)) as "2022",
COUNT(distinct(case when year(publn_date)  = 2023 then tls211_pat_publn.appln_id end)) as "2023"
  FROM tls211_pat_publn join tls227_pers_publn on tls211_pat_publn.pat_publn_id = tls227_pers_publn.pat_publn_id
  join tls206_person  on tls206_person.person_id = tls227_pers_publn.person_id
  left join (select nuts, nuts_label from tls904_nuts where nuts_level = 2 ) nuts2 on nuts2.nuts = left(tls206_person.nuts,4)
  --NUL value will group the totals for those applicants where no NUTS has been assigned (yet)
    where publn_auth = 'ep'  and publn_kind = 'B1'
	and applt_seq_nr > 0 
	and year(publn_date)  between 2008 and 2023
	and tls206_person.nuts like 'DE%'
group by  nuts2.nuts ,   nuts2.nuts_label
order by   nuts2.nuts  
Nuts-3 level

Code: Select all

SELECT nuts3.nuts ,   nuts3.nuts_label,
COUNT(distinct(case when year(publn_date)  = 2005 then tls211_pat_publn.appln_id end)) as "2005",
COUNT(distinct(case when year(publn_date)  = 2006 then tls211_pat_publn.appln_id end)) as "2006",
COUNT(distinct(case when year(publn_date)  = 2007 then tls211_pat_publn.appln_id end)) as "2007",
COUNT(distinct(case when year(publn_date)  = 2008 then tls211_pat_publn.appln_id end)) as "2008",
COUNT(distinct(case when year(publn_date)  = 2009 then tls211_pat_publn.appln_id end)) as "2009",
COUNT(distinct(case when year(publn_date)  = 2010 then tls211_pat_publn.appln_id end)) as "2010",
COUNT(distinct(case when year(publn_date)  = 2011 then tls211_pat_publn.appln_id end)) as "2011",
COUNT(distinct(case when year(publn_date)  = 2012 then tls211_pat_publn.appln_id end)) as "2012",
COUNT(distinct(case when year(publn_date)  = 2013 then tls211_pat_publn.appln_id end)) as "2013",
COUNT(distinct(case when year(publn_date)  = 2014 then tls211_pat_publn.appln_id end)) as "2014",
COUNT(distinct(case when year(publn_date)  = 2015 then tls211_pat_publn.appln_id end)) as "2015",
COUNT(distinct(case when year(publn_date)  = 2016 then tls211_pat_publn.appln_id end)) as "2016",
COUNT(distinct(case when year(publn_date)  = 2017 then tls211_pat_publn.appln_id end)) as "2017",
COUNT(distinct(case when year(publn_date)  = 2018 then tls211_pat_publn.appln_id end)) as "2018",
COUNT(distinct(case when year(publn_date)  = 2019 then tls211_pat_publn.appln_id end)) as "2019",
COUNT(distinct(case when year(publn_date)  = 2020 then tls211_pat_publn.appln_id end)) as "2020",
COUNT(distinct(case when year(publn_date)  = 2021 then tls211_pat_publn.appln_id end)) as "2021",
COUNT(distinct(case when year(publn_date)  = 2022 then tls211_pat_publn.appln_id end)) as "2022",
COUNT(distinct(case when year(publn_date)  = 2023 then tls211_pat_publn.appln_id end)) as "2023"
  FROM tls211_pat_publn join tls227_pers_publn on tls211_pat_publn.pat_publn_id = tls227_pers_publn.pat_publn_id
  join tls206_person  on tls206_person.person_id = tls227_pers_publn.person_id
  left join (select nuts, nuts_label from tls904_nuts where nuts_level = 3 ) nuts3 on nuts3.nuts = tls206_person.nuts
    where publn_auth = 'ep'  and publn_kind = 'B1'
	and applt_seq_nr > 0 
	and year(publn_date)  between 2008 and 2023
	and tls206_person.nuts like 'DE%'
group by  nuts3.nuts ,   nuts3.nuts_label
order by   nuts3.nuts 
Small side note: if you would compare the data with NUTS 3 level data, you will see that the sum of NUTS 3 is a bit more then the "grouped" result from NUTS 2. The reason for this is the (often overlooked) fact that patent applications can have multiple applicants living in different NUTS areas (or even in different countries). So an application having an applicant from Böblingen and from Esslingen will be counted for each of the NUTS 3 levels, but only once for the Stuttgart.
PATSTAT Support Team
EPO - Vienna
patstat @ epo.org


Post Reply