SQL queries to create simple applicant / year based table.

Here you can post your opinions, ask questions and share experiences on the PATSTAT product line. Please always indicate the PATSTAT edition (e.g. 2015 Autumn Edition) and the database (e.g. PATSTAT Online, MySQL, MS SQL Server, ...) you are using.
Post Reply

EPO / PATSTAT Support
Posts: 425
Joined: Thu Feb 22, 2007 5:33 pm
Contact:

SQL queries to create simple applicant / year based table.

Post by EPO / PATSTAT Support » Wed Apr 28, 2021 5:31 pm

Many researchers are looking for tables giving applicant rankings based on the number of applications.
With PATSTAT, there are endless possibilities to specify the "conditions" for those applicants (or patent applications).
Here are 5 variations on the same theme that have all been tested on PATSTAT Online.
Researchers can obtain a 1 month free test account on PATSTAT Online to get acquainted with the data and create their own search methodology.

1) A SQL query that creates a table containing the top 1000 applicants based on granted EP patents that were filed between 2010 and 2020.

Code: Select all

SELECT top 1000  psn_name, 
COUNT(distinct(case when appln_filing_year = 2010 then tls201_appln.appln_id end)) as '2010',
COUNT(distinct(case when  appln_filing_year = 2011 then tls201_appln.appln_id end)) as '2011',
COUNT(distinct(case when  appln_filing_year = 2012 then tls201_appln.appln_id end)) as '2012',
COUNT(distinct(case when  appln_filing_year = 2013 then tls201_appln.appln_id end)) as '2013',
COUNT(distinct(case when  appln_filing_year = 2014 then tls201_appln.appln_id end)) as '2014',
COUNT(distinct(case when  appln_filing_year = 2015 then tls201_appln.appln_id end)) as '2015',
COUNT(distinct(case when  appln_filing_year = 2016 then tls201_appln.appln_id end)) as '2016',
COUNT(distinct(case when  appln_filing_year = 2017 then tls201_appln.appln_id end)) as '2017',
COUNT(distinct(case when  appln_filing_year = 2018 then tls201_appln.appln_id end)) as '2018',
COUNT(distinct(case when  appln_filing_year = 2019 then tls201_appln.appln_id end)) as '2019',
COUNT(distinct(case when  appln_filing_year = 2020 then tls201_appln.appln_id end)) as '2020',
count (tls201_appln.appln_id) total
FROM   tls201_appln  join tls207_pers_appln on tls201_appln.appln_id = tls207_pers_appln.appln_id
join tls206_person on tls207_pers_appln.person_id = tls206_person.person_id
WHERE
appln_auth = 'EP'
AND appln_filing_year between 2010 and 2020
and applt_seq_nr > 0 
and granted = 'y' --remove this line if you want to have the number of patents filed instead of only granted patents
group by psn_name
order by total desc
2) The same as (1), but only taking into account those patents that have been filed by applicants having their place of business in the EU.

Code: Select all

SELECT top 1000  psn_name, 
COUNT(distinct(case when appln_filing_year = 2010 then tls201_appln.appln_id end)) as '2010',
COUNT(distinct(case when  appln_filing_year = 2011 then tls201_appln.appln_id end)) as '2011',
COUNT(distinct(case when  appln_filing_year = 2012 then tls201_appln.appln_id end)) as '2012',
COUNT(distinct(case when  appln_filing_year = 2013 then tls201_appln.appln_id end)) as '2013',
COUNT(distinct(case when  appln_filing_year = 2014 then tls201_appln.appln_id end)) as '2014',
COUNT(distinct(case when  appln_filing_year = 2015 then tls201_appln.appln_id end)) as '2015',
COUNT(distinct(case when  appln_filing_year = 2016 then tls201_appln.appln_id end)) as '2016',
COUNT(distinct(case when  appln_filing_year = 2017 then tls201_appln.appln_id end)) as '2017',
COUNT(distinct(case when  appln_filing_year = 2018 then tls201_appln.appln_id end)) as '2018',
COUNT(distinct(case when  appln_filing_year = 2019 then tls201_appln.appln_id end)) as '2019',
COUNT(distinct(case when  appln_filing_year = 2020 then tls201_appln.appln_id end)) as '2020',
count (tls201_appln.appln_id) total
FROM   tls201_appln  join tls207_pers_appln on tls201_appln.appln_id = tls207_pers_appln.appln_id
join tls206_person on tls207_pers_appln.person_id = tls206_person.person_id
join tls801_country on tls206_person.person_ctry_code = tls801_country.ctry_code
WHERE
appln_auth = 'EP'
AND appln_filing_year between 2010 and 2020
and applt_seq_nr > 0 
and granted = 'y' --remove this line if you want to have the number of patents filed instead of patents granted 
and eu_member = 'Y'
group by psn_name
order by total desc
3) The same as (2), but further restricted to patents that have been classified in the field of MEDICAL OR VETERINARY SCIENCE defined by IPC or CPC starting with A61.
https://worldwide.espacenet.com/patent/ ... #!/CPC=A61

Code: Select all

SELECT top 1000  psn_name, 
COUNT(distinct(case when appln_filing_year = 2010 then tls201_appln.appln_id end)) as '2010',
COUNT(distinct(case when  appln_filing_year = 2011 then tls201_appln.appln_id end)) as '2011',
COUNT(distinct(case when  appln_filing_year = 2012 then tls201_appln.appln_id end)) as '2012',
COUNT(distinct(case when  appln_filing_year = 2013 then tls201_appln.appln_id end)) as '2013',
COUNT(distinct(case when  appln_filing_year = 2014 then tls201_appln.appln_id end)) as '2014',
COUNT(distinct(case when  appln_filing_year = 2015 then tls201_appln.appln_id end)) as '2015',
COUNT(distinct(case when  appln_filing_year = 2016 then tls201_appln.appln_id end)) as '2016',
COUNT(distinct(case when  appln_filing_year = 2017 then tls201_appln.appln_id end)) as '2017',
COUNT(distinct(case when  appln_filing_year = 2018 then tls201_appln.appln_id end)) as '2018',
COUNT(distinct(case when  appln_filing_year = 2019 then tls201_appln.appln_id end)) as '2019',
COUNT(distinct(case when  appln_filing_year = 2020 then tls201_appln.appln_id end)) as '2020',
count (tls201_appln.appln_id) total
FROM   tls201_appln  join tls207_pers_appln on tls201_appln.appln_id = tls207_pers_appln.appln_id
join tls206_person on tls207_pers_appln.person_id = tls206_person.person_id
join tls801_country on tls206_person.person_ctry_code = tls801_country.ctry_code
WHERE
appln_auth = 'EP'
AND appln_filing_year between 2010 and 2020
and applt_seq_nr > 0 
and granted = 'y' --remove this line if you want to have the number of patents filed instead of only granted patents
and eu_member = 'Y'
and ((tls201_appln.appln_id in (select distinct appln_id from tls209_appln_ipc where left(ipc_class_symbol,3) = 'A61'))
	or (tls201_appln.appln_id in (select distinct appln_id from tls224_appln_cpc where left(cpc_class_symbol,3) = 'A61')))
group by psn_name
order by total desc
4) The same as (3), but with the extra conditions that renewal fees were paid in 'FR','DE','IT' and 'GB'.

Code: Select all

SELECT top 1000  psn_name, 
COUNT(distinct(case when appln_filing_year = 2010 then tls201_appln.appln_id end)) as '2010',
COUNT(distinct(case when  appln_filing_year = 2011 then tls201_appln.appln_id end)) as '2011',
COUNT(distinct(case when  appln_filing_year = 2012 then tls201_appln.appln_id end)) as '2012',
COUNT(distinct(case when  appln_filing_year = 2013 then tls201_appln.appln_id end)) as '2013',
COUNT(distinct(case when  appln_filing_year = 2014 then tls201_appln.appln_id end)) as '2014',
COUNT(distinct(case when  appln_filing_year = 2015 then tls201_appln.appln_id end)) as '2015',
COUNT(distinct(case when  appln_filing_year = 2016 then tls201_appln.appln_id end)) as '2016',
COUNT(distinct(case when  appln_filing_year = 2017 then tls201_appln.appln_id end)) as '2017',
COUNT(distinct(case when  appln_filing_year = 2018 then tls201_appln.appln_id end)) as '2018',
COUNT(distinct(case when  appln_filing_year = 2019 then tls201_appln.appln_id end)) as '2019',
COUNT(distinct(case when  appln_filing_year = 2020 then tls201_appln.appln_id end)) as '2020',
count (tls201_appln.appln_id) total
FROM   tls201_appln  join tls207_pers_appln on tls201_appln.appln_id = tls207_pers_appln.appln_id
join tls206_person on tls207_pers_appln.person_id = tls206_person.person_id
join tls801_country on tls206_person.person_ctry_code = tls801_country.ctry_code
WHERE
appln_auth = 'EP'
AND appln_filing_year between 2010 and 2020
and applt_seq_nr > 0 
and granted = 'y' --remove this line if you want to have the number of patents filed instead of only granted patents
and eu_member = 'Y'
and ((tls201_appln.appln_id in (select distinct appln_id from tls209_appln_ipc where left(ipc_class_symbol,3) = 'A61'))
	or (tls201_appln.appln_id in (select distinct appln_id from tls224_appln_cpc where left(cpc_class_symbol,3) = 'A61')))
and tls201_appln.appln_id in (select appln_id from tls231_inpadoc_legal_event where event_code = 'pgfp' and fee_country in ('FR','DE','IT','GB'))
group by psn_name
order by total desc
5) The same as (4) with the added condition that the patents were opposed. As a proxy to this, we use the fact that opponents data is available in the EPO register data for these granted EP patents. (not granted patents can of course not be opposed)

Code: Select all

SELECT top 1000  psn_name, 
COUNT(distinct(case when appln_filing_year = 2010 then tls201_appln.appln_id end)) as '2010',
COUNT(distinct(case when  appln_filing_year = 2011 then tls201_appln.appln_id end)) as '2011',
COUNT(distinct(case when  appln_filing_year = 2012 then tls201_appln.appln_id end)) as '2012',
COUNT(distinct(case when  appln_filing_year = 2013 then tls201_appln.appln_id end)) as '2013',
COUNT(distinct(case when  appln_filing_year = 2014 then tls201_appln.appln_id end)) as '2014',
COUNT(distinct(case when  appln_filing_year = 2015 then tls201_appln.appln_id end)) as '2015',
COUNT(distinct(case when  appln_filing_year = 2016 then tls201_appln.appln_id end)) as '2016',
COUNT(distinct(case when  appln_filing_year = 2017 then tls201_appln.appln_id end)) as '2017',
COUNT(distinct(case when  appln_filing_year = 2018 then tls201_appln.appln_id end)) as '2018',
COUNT(distinct(case when  appln_filing_year = 2019 then tls201_appln.appln_id end)) as '2019',
COUNT(distinct(case when  appln_filing_year = 2020 then tls201_appln.appln_id end)) as '2020',
count (tls201_appln.appln_id) total
FROM   tls201_appln  join tls207_pers_appln on tls201_appln.appln_id = tls207_pers_appln.appln_id
join tls206_person on tls207_pers_appln.person_id = tls206_person.person_id
join tls801_country on tls206_person.person_ctry_code = tls801_country.ctry_code
WHERE
appln_auth = 'EP'
AND appln_filing_year between 2010 and 2020
and applt_seq_nr > 0 
and granted = 'y' --remove this line if you want to have the number of patents filed instead of only granted patents
and eu_member = 'Y'
and ((tls201_appln.appln_id in (select distinct appln_id from tls209_appln_ipc where left(ipc_class_symbol,3) = 'A61'))
	or (tls201_appln.appln_id in (select distinct appln_id from tls224_appln_cpc where left(cpc_class_symbol,3) = 'A61')))
and tls201_appln.appln_id in (select appln_id from tls231_inpadoc_legal_event where event_code = 'pgfp' and fee_country in ('FR','DE','IT','GB'))
and tls201_appln.appln_id in (select appln_id from reg101_appln join reg130_opponent on reg101_appln.id = reg130_opponent.id)
group by psn_name
order by total desc
6) The last query (5) provides you a list with 602 companies (on PATSTAT 2021b) and involves 1.376 patent applications. Assume that you would now like to make a detailed analysis on this patent sample and therefore you need to extract a PATSTAT data base extraction that includes all the PATSTAT tables on those 1.376 patents. Simplify the above query to only list the application id's which is the only attribute you need to make an extraction. The query would look like this:

Code: Select all

SELECT tls201_appln.appln_id
FROM   tls201_appln  join tls207_pers_appln on tls201_appln.appln_id = tls207_pers_appln.appln_id
join tls206_person on tls207_pers_appln.person_id = tls206_person.person_id
join tls801_country on tls206_person.person_ctry_code = tls801_country.ctry_code
WHERE
appln_auth = 'EP'
AND appln_filing_year between 2010 and 2020
and applt_seq_nr > 0 
and granted = 'y' 
and eu_member = 'Y'
and ((tls201_appln.appln_id in (select distinct appln_id from tls209_appln_ipc where left(ipc_class_symbol,3) = 'A61'))
	or (tls201_appln.appln_id in (select distinct appln_id from tls224_appln_cpc where left(cpc_class_symbol,3) = 'A61')))
and tls201_appln.appln_id in (select appln_id from tls231_inpadoc_legal_event where event_code = 'pgfp' and fee_country in ('FR','DE','IT','GB'))
and tls201_appln.appln_id in (select appln_id from reg101_appln join reg130_opponent on reg101_appln.id = reg130_opponent.id)
group by tls201_appln.appln_id 
Once you have run the query, you can then download a PATSTAT extraction according to your needs; you can choose whether you want a MS ACCESS data base or a set of CSV files. You can also choose what tables you want to have included in the extraction. And you can even specify that you would like to have all the family members and/or cited and/or citing publications from the original set of 1.376 patents. Extracing all the tables, all family members and all citing and cited applications will result in a CSV data set of about 190 MB, containing data on 74.249 patent applications. The ready-to-use MS ACCESS data base will have a size of about 224 MB.

Users can easily adapt the technical scope by changing the the conditions in the WHERE clause of the SQL query. One can also add (or remove) tables depending on your data needs. The PATSTAT Data Catalog gives a comprehensive overview of all the tables and attributes in PATSTAT.
DataCatalog_Global_v5.17.pdf
(4.18 MiB) Downloaded 367 times
Other useful examples:
PATSTAT Support Team
EPO - Vienna
patstat @ epo.org


Post Reply