Understanding results of number of patent applications

Here you can post your opinions, ask questions and share experiences on the PATSTAT product line. Please always indicate the PATSTAT edition (e.g. 2015 Autumn Edition) and the database (e.g. PATSTAT Online, MySQL, MS SQL Server, ...) you are using.
Post Reply

khanhhienbi
Posts: 13
Joined: Thu Sep 03, 2020 9:49 am

Understanding results of number of patent applications

Post by khanhhienbi » Thu Jun 01, 2023 3:45 pm

Dear PATSTAT community,

I am currently working with the PATSTAT Spring 2023 edition. I would like to get the number of patent applications filed by Netherlands companies in the pharmaceutical field for the period of 12 years, from 2011 to 2022. Here are my codes:

Code: Select all

SELECT psn_name, person_ctry_code, appln_auth,
COUNT(distinct(case when t1.appln_filing_year = 2011 then t1.appln_id end)) as '2011',
COUNT(distinct(case when t1.appln_filing_year = 2012 then t1.appln_id end)) as '2012', 
COUNT(distinct(case when t1.appln_filing_year = 2013 then t1.appln_id end)) as '2013',
COUNT(distinct(case when t1.appln_filing_year = 2014 then t1.appln_id end)) as '2014',
COUNT(distinct(case when t1.appln_filing_year = 2015 then t1.appln_id end)) as '2015',
COUNT(distinct(case when t1.appln_filing_year = 2016 then t1.appln_id end)) as '2016',
COUNT(distinct(case when t1.appln_filing_year = 2017 then t1.appln_id end)) as '2017',
COUNT(distinct(case when t1.appln_filing_year = 2018 then t1.appln_id end)) as '2018',
COUNT(distinct(case when t1.appln_filing_year = 2019 then t1.appln_id end)) as '2019',
COUNT(distinct(case when t1.appln_filing_year = 2019 then t1.appln_id end)) as '2020',
COUNT(distinct(case when t1.appln_filing_year = 2019 then t1.appln_id end)) as '2021',
COUNT(distinct(case when t1.appln_filing_year = 2019 then t1.appln_id end)) as '2022'
FROM   tls201_appln t1 JOIN  tls230_appln_techn_field on t1.appln_id = tls230_appln_techn_field.appln_id
join tls207_pers_appln on t1.appln_id = tls207_pers_appln.appln_id
join tls206_person on tls207_pers_appln.person_id = tls206_person.person_id
WHERE
tls230_appln_techn_field.techn_field_nr = 16--> number used for techn_field = 'Pharmaceutical'
AND t1.appln_filing_year between 2011 and 2022
and applt_seq_nr > 0 and invt_seq_nr = 0
and person_ctry_code = 'NL'
group by psn_name, person_ctry_code, appln_auth
I'm not sure why some companies that have values of 0 for every year in the time period specified are included in the dataset?

I also want to look at the biotechnology industry. In this case, how can I deal with the duplications?

Many thanks for your support!


EPO / PATSTAT Support
Posts: 425
Joined: Thu Feb 22, 2007 5:33 pm
Contact:

Re: Understanding results of number of patent applications

Post by EPO / PATSTAT Support » Fri Jun 02, 2023 8:55 am

Hello khanhhienbi,
there is a small error in your query, the last for 3 lines in your SELECT clause have a correct "heading/alias" for the columns (2020,2021,2022), but they all show the values for 2019. So the real values are not shown. Below is a correction.

Code: Select all

SELECT psn_name, person_ctry_code, appln_auth,
COUNT(distinct(case when t1.appln_filing_year = 2011 then t1.appln_id end)) as '2011',
COUNT(distinct(case when t1.appln_filing_year = 2012 then t1.appln_id end)) as '2012', 
COUNT(distinct(case when t1.appln_filing_year = 2013 then t1.appln_id end)) as '2013',
COUNT(distinct(case when t1.appln_filing_year = 2014 then t1.appln_id end)) as '2014',
COUNT(distinct(case when t1.appln_filing_year = 2015 then t1.appln_id end)) as '2015',
COUNT(distinct(case when t1.appln_filing_year = 2016 then t1.appln_id end)) as '2016',
COUNT(distinct(case when t1.appln_filing_year = 2017 then t1.appln_id end)) as '2017',
COUNT(distinct(case when t1.appln_filing_year = 2018 then t1.appln_id end)) as '2018',
COUNT(distinct(case when t1.appln_filing_year = 2019 then t1.appln_id end)) as '2019',
COUNT(distinct(case when t1.appln_filing_year = 2020 then t1.appln_id end)) as '2020',
COUNT(distinct(case when t1.appln_filing_year = 2021 then t1.appln_id end)) as '2021',
COUNT(distinct(case when t1.appln_filing_year = 2022 then t1.appln_id end)) as '2022', 
COUNT(distinct(case when t1.appln_filing_year between 2011 and 2022 then t1.appln_id end)) as 'Total'
FROM   tls201_appln t1 JOIN  tls230_appln_techn_field on t1.appln_id = tls230_appln_techn_field.appln_id
join tls207_pers_appln on t1.appln_id = tls207_pers_appln.appln_id
join tls206_person on tls207_pers_appln.person_id = tls206_person.person_id
WHERE
tls230_appln_techn_field.techn_field_nr = 16--> number used for techn_field = 'Pharmaceutical'
AND t1.appln_filing_year between 2011 and 2022
and applt_seq_nr > 0 and invt_seq_nr = 0
and person_ctry_code = 'NL'
group by psn_name, person_ctry_code, appln_auth
order by total desc
On your question on separating Pharmaceuticals from Biotechnology; that is possible but does it make sense to do so because of the intertwining of the technological fields? From a database point of view, you can indeed take all patents that only have a Pharmaceuticals label separate from those that have a Biotechnology label, but (without checking), I expect that there will be large overlap.
In fact, also Organic fine chemistry and probably Macromolecular chemistry, polymers will probably have overlaps with Pharmaceuticals and Biotechnology.
The technical sectors are assigned based on an IPC concordance table. Because patents can be assigned multiple IPC codes, they can end up in multiple technologies. Therefore we have added a ,[weight] in table tls230_appln_techn_field, so that each patent has a total value of 1 (summing the weights). This can as such be used to analyse in depth the degree of a patent belonging to a certain technology group.
Or aggregated at company level: to label a company as rather Pharmaceutical or Biotech.(care needs to be taken to not draw wrong conclusions, as the "nature" of a company can have switched - which in itself might be an interesting area of research....)
Here is the example query that takes all Pharmaceuticals and excludes Biotechnology

Code: Select all

SELECT psn_name, person_ctry_code, appln_auth,
COUNT(distinct(case when t1.appln_filing_year = 2011 then t1.appln_id end)) as '2011',
COUNT(distinct(case when t1.appln_filing_year = 2012 then t1.appln_id end)) as '2012', 
COUNT(distinct(case when t1.appln_filing_year = 2013 then t1.appln_id end)) as '2013',
COUNT(distinct(case when t1.appln_filing_year = 2014 then t1.appln_id end)) as '2014',
COUNT(distinct(case when t1.appln_filing_year = 2015 then t1.appln_id end)) as '2015',
COUNT(distinct(case when t1.appln_filing_year = 2016 then t1.appln_id end)) as '2016',
COUNT(distinct(case when t1.appln_filing_year = 2017 then t1.appln_id end)) as '2017',
COUNT(distinct(case when t1.appln_filing_year = 2018 then t1.appln_id end)) as '2018',
COUNT(distinct(case when t1.appln_filing_year = 2019 then t1.appln_id end)) as '2019',
COUNT(distinct(case when t1.appln_filing_year = 2020 then t1.appln_id end)) as '2020',
COUNT(distinct(case when t1.appln_filing_year = 2021 then t1.appln_id end)) as '2021',
COUNT(distinct(case when t1.appln_filing_year = 2022 then t1.appln_id end)) as '2022', 
COUNT(distinct(case when t1.appln_filing_year between 2011 and 2022 then t1.appln_id end)) as 'Total'
FROM   tls201_appln t1
join tls207_pers_appln on t1.appln_id = tls207_pers_appln.appln_id
join tls206_person on tls207_pers_appln.person_id = tls206_person.person_id
WHERE
t1.appln_id in (select appln_id from tls230_appln_techn_field where tls230_appln_techn_field.techn_field_nr = 16)
--> number used to include techn_field = 'Pharmaceutical')
and t1.appln_id not in (select appln_id from tls230_appln_techn_field where tls230_appln_techn_field.techn_field_nr = 15)
--> number used to exclude techn_field = 'Biotechnology')
and t1.appln_filing_year between 2011 and 2022
and applt_seq_nr > 0 and invt_seq_nr = 0
and person_ctry_code = 'NL'
group by psn_name, person_ctry_code, appln_auth
order by total desc
PATSTAT Support Team
EPO - Vienna
patstat @ epo.org


khanhhienbi
Posts: 13
Joined: Thu Sep 03, 2020 9:49 am

Re: Understanding results of number of patent applications

Post by khanhhienbi » Sat Jun 03, 2023 9:15 am

Thank you very much for your response! All the codes provided work fine!

I'm wondering if there is any way to aggregate the number of patent applications filed by a Netherland firm both in domestic and foreign countries? As I can see from the result table after running the second code, one firm can file patent applications to different patent authorities then how can one identify if the patent applications are the same or different to correctly aggregate them?

Thank you in advance for your support!


Post Reply