Help with SQL query - How to retrieve name of the patent's applicant(s)

Here you can post your opinions, ask questions and share experiences on the PATSTAT product line. Please always indicate the PATSTAT edition (e.g. 2015 Autumn Edition) and the database (e.g. PATSTAT Online, MySQL, MS SQL Server, ...) you are using.
Post Reply

khanhhienbi
Posts: 15
Joined: Thu Sep 03, 2020 9:49 am

Help with SQL query - How to retrieve name of the patent's applicant(s)

Post by khanhhienbi » Thu Sep 03, 2020 11:16 am

Hi everyone,

I am currently working with PATSTAT Spring 2020 edition. My aim is to retrieve all the name of the patent's applicant(s) for a limited number of UK biotech companies from 2016 to 2020.

Please find the codes I've written below. Here is a brief description of the methodology I used:

Use table tls201_appln to identify the appIn_id of the entries, appln_id and appIn_filing_date =YYYY-MM-DD
Use table tls209_appln_ipc to identify the ipc_class_symbol of the entries. I am only interested in patents with LIKE 'C12N%'
Use table tls202_aapln_title to identify the appln_title.
Join the three table person_id as a primary key.

Code: Select all

SELECT t1.appln_id, t1.appln_auth, t1.appln_filing_date, t2.ipc_class_symbol, t3.appln_title
FROM
 tls201_appln t1
 JOIN
 tls209_appln_ipc t2 ON t1.appln_id = t2.appln_id
 JOIN
 tls202_appln_title t3 ON t3.appln_id = t1.appln_id
WHERE
t2.ipc_class_symbol LIKE 'C12N%'
AND t1.appln_auth IN ('GB')
AND t1.appln_filing_date between '2016-01-01' and '2020-01-01'
My issue here is that I am not sure how to retrieve each patent's applicant(s) using SQL. I can match manually by checking the bibliographic data from Espacenet patent search (e.g., https://worldwide.espacenet.com/publica ... date=&FT=D). Is there any other way to retrieve the applicant's name using SQL?

On a side note, is it possible to merge PATSTAT with Orbis, to save time finding and matching each patent to a company's profile?

Any advice is much appreciated. Many thanks in advance.


EPO / PATSTAT Support
Posts: 433
Joined: Thu Feb 22, 2007 5:33 pm
Contact:

Re: Help with SQL query - How to retrieve name of the patent's applicant(s)

Post by EPO / PATSTAT Support » Thu Sep 03, 2020 1:45 pm

Hello Khanhhienbi,
a couple of remarks on your query:
the syntax is correct, but to include the applicant names, you need to link the applications in tls201_appln to the tls206_person table via tls207.
If you only joined the tls209_appln_ipc table in order to limit your data set, then there is no need to join the table, you can simply move the complete condition into the WHERE clause.
You said you are looking for UK biotech companies, and for this you need to have the the tls206_person.person_ctry_code attribute in your query. The condition "AND t1.appln_auth IN ('GB')" limits your data set to applications filed at the UK patent office, but those applications can (and will) also be filed by non GB applicants. You also have to keep in mind that applicants in GB will also file patents outside GB. I marked that condition as a comment in the query below but you can re-activate it by removing the "--"
Your "t1.appln_filing_date between '2016-01-01' and '2020-01-01'" condition is correct, but here you have to keep in mind that most patents are published (and available in public data bases) 18 months AFTER the application filing date. Therefore you will only find limited patents filed after 06/2018. This is inherent to the patent system; researchers now that this results in a large dip in the figures for the last 2-3 years when creating a timeline based on the applications filing date.
Keeping the above in mind, below is a reworked query. It comes as-is, and you should check and adapt it further to your needs. (I have used the STRING_AGG function to group all applicants in the same cell, but you can remove it and add PSN name in the GROUP BY clause in case you prefer to have a separate row for each individual applicant for those patent applications that have multiple applicants.)

Code: Select all

SELECT t1.appln_id, t1.appln_auth, t1.appln_nr, t1.appln_filing_date, 
STRING_AGG ((psn_name), ', ') applicants ,t3.appln_title
FROM
 tls201_appln t1
 join tls207_pers_appln on tls207_pers_appln.appln_id = t1.appln_id
 join tls206_person on tls207_pers_appln.person_id = tls206_person.person_id
 left JOIN tls202_appln_title t3 ON t3.appln_id = t1.appln_id
WHERE
t1.appln_id in (select distinct appln_id from tls209_appln_ipc where left(ipc_class_symbol,4) = 'C12N')
--AND t1.appln_auth IN ('GB')
AND person_ctry_code = 'GB'
AND applt_seq_nr > 0 and invt_seq_nr = 0
AND t1.appln_filing_date between '2016-01-01' and '2020-01-01'
group by t1.appln_id, t1.appln_auth, t1.appln_nr, t1.appln_filing_date ,t3.appln_title
order by appln_filing_date, t1.appln_id
About linking PATSTAT to ORBIS data, I have heard that ORBIS data contains the person_id used in PATSTAT, but you need to double check that with the supplier. A number of researchers also have concordance tables but EPO can not provide it.
PATSTAT Support Team
EPO - Vienna
patstat @ epo.org


Post Reply