IPC classification - How to get the number of patent applications in Biotechnology

Here you can post your opinions, ask questions and share experiences on the PATSTAT product line. Please always indicate the PATSTAT edition (e.g. 2015 Autumn Edition) and the database (e.g. PATSTAT Online, MySQL, MS SQL Server, ...) you are using.
Post Reply

khanhhienbi
Posts: 13
Joined: Thu Sep 03, 2020 9:49 am

IPC classification - How to get the number of patent applications in Biotechnology

Post by khanhhienbi » Tue Dec 08, 2020 11:15 am

I am currently working with PATSTAT Autumn 2020 edition. I would like to get the total number of applications of each company in the UK in the biotechnology field for the period of 10 years, from 2011 to 2020. I encountered some issues while working with the SQL queries:
1. I am uncertain if the IPC classifications for biotechnology, namely A01H, A61K, C40B, C12N, C12P, C12Q, C12S, G01N are the appropriate and updated ones to use. It would be great if you could confirm.
2. I initially wanted to look at the IPC classification C12N in my research. I then tried to expand my search to cover the whole field with the rest of the IPC classifications mentioned above. I then realised that some of the patent applications have more than one IPC classification. For example, a company can have 5 applications in 2015 under C12N but only 1 in the same year under C12Q. How should I know if a company filed one patent application or filed a total of six patent applications with different classifications (based on the example)?
3. Should the SQL queries be modified to retrieve better results?

Code: Select all

SELECT psn_name, 
COUNT(distinct(case when t1.appln_filing_year = 2011 then t1.appln_id end)) as '2011',
COUNT(distinct(case when t1.appln_filing_year = 2012 then t1.appln_id end)) as '2012',
COUNT(distinct(case when t1.appln_filing_year = 2013 then t1.appln_id end)) as '2013',
COUNT(distinct(case when t1.appln_filing_year = 2014 then t1.appln_id end)) as '2014',
COUNT(distinct(case when t1.appln_filing_year = 2015 then t1.appln_id end)) as '2015',
COUNT(distinct(case when t1.appln_filing_year = 2016 then t1.appln_id end)) as '2016',
COUNT(distinct(case when t1.appln_filing_year = 2017 then t1.appln_id end)) as '2017',
COUNT(distinct(case when t1.appln_filing_year = 2018 then t1.appln_id end)) as '2018',
COUNT(distinct(case when t1.appln_filing_year = 2019 then t1.appln_id end)) as '2019',
COUNT(distinct(case when t1.appln_filing_year = 2019 then t1.appln_id end)) as '2020',
count(t1.appln_id) total
FROM   tls201_appln t1 JOIN  tls209_appln_ipc t2 
	ON t1.appln_id = t2.appln_id
join tls207_pers_appln on t1.appln_id = tls207_pers_appln.appln_id
join tls206_person on tls207_pers_appln.person_id = tls206_person.person_id
WHERE
t2.ipc_class_symbol LIKE 'C12N%' --replace with the rest of the IPC classifications mentioned above
AND t1.appln_auth IN ('GB')
AND t1.appln_filing_year between 2010 and 2019
and applt_seq_nr > 0 and invt_seq_nr = 0
group by psn_name
order by total desc
Any help is much appreciated. Many thanks for you time and support!


EPO / PATSTAT Support
Posts: 425
Joined: Thu Feb 22, 2007 5:33 pm
Contact:

Re: IPC classification - How to get the number of patent applications in Biotechnology

Post by EPO / PATSTAT Support » Fri Dec 18, 2020 1:12 pm

Hello Khanhhienbi,
The criteria of using one or the other IPC or CPC classification codes to define a certain technology sector or fields lies with the researcher. It is rather difficult to draw strict lines between the various fields simply based on the classification codes, and there are "grey" areas.
An easy approach would be to use the WIPO definitions as described in this document.
wipo_ipc_technology.pdf
(110.71 KiB) Downloaded 221 times
This methodology has been integrated into PATSTAT. It is basically an IPC concordance table, generally accepted by many researchers - but you are free to adapt of fine tune it to your needs.
If you run the query below, you can see what IPC codes are considered to define biotechnology. (it is a rather limited list of IPC codes compared to your list)

Code: Select all

SELECT *
  FROM tls901_techn_field_ipc
  where techn_field = 'Biotechnology'
Using this methodology, it is straightforward to create a list of GB companies (applicants) filing patents in the Biotechnology in GB based on your example. (keep in mind that your example query only looks at patents filed in GB ! The number of patents filed as PCT of EP is much larger. )

Code: Select all

SELECT psn_name, 
COUNT(distinct(case when t1.appln_filing_year = 2011 then t1.appln_id end)) as '2011',
COUNT(distinct(case when t1.appln_filing_year = 2012 then t1.appln_id end)) as '2012',
COUNT(distinct(case when t1.appln_filing_year = 2013 then t1.appln_id end)) as '2013',
COUNT(distinct(case when t1.appln_filing_year = 2014 then t1.appln_id end)) as '2014',
COUNT(distinct(case when t1.appln_filing_year = 2015 then t1.appln_id end)) as '2015',
COUNT(distinct(case when t1.appln_filing_year = 2016 then t1.appln_id end)) as '2016',
COUNT(distinct(case when t1.appln_filing_year = 2017 then t1.appln_id end)) as '2017',
COUNT(distinct(case when t1.appln_filing_year = 2018 then t1.appln_id end)) as '2018',
COUNT(distinct(case when t1.appln_filing_year = 2019 then t1.appln_id end)) as '2019',
COUNT(distinct(case when t1.appln_filing_year = 2019 then t1.appln_id end)) as '2020',
count((distinct(t1.appln_id)) total
FROM   tls201_appln t1 JOIN  tls230_appln_techn_field on t1.appln_id = tls230_appln_techn_field.appln_id
join tls207_pers_appln on t1.appln_id = tls207_pers_appln.appln_id
join tls206_person on tls207_pers_appln.person_id = tls206_person.person_id
WHERE
tls230_appln_techn_field.techn_field_nr = 15 --> number used for techn_field = 'Biotechnology'
AND t1.appln_filing_year between 2010 and 2019
and applt_seq_nr > 0 and invt_seq_nr = 0--and psn_sector <> 'individual'
and appln_auth = 'GB' 
and person_ctry_code = 'GB'
group by psn_name
order by total desc
In case you would remove the and appln_auth = 'GB' restriction form the query, it might be appropriate to count patent families instead of patent applications.
Replace the lines: COUNT(distinct(case when t1.appln_filing_year = 2011 then t1.appln_id end)) as '2011',
by COUNT(distinct(case when t1.earliest_filing_year = 2011 then t1.appln_id end)) as '2011' .
The concordance table is bases on the IPC codes, you might want to extend it to CPC codes to increase the recall.
PATSTAT Support Team
EPO - Vienna
patstat @ epo.org


Post Reply