a couple of remarks on your query:
the syntax is correct, but to include the applicant names, you need to link the applications in tls201_appln to the tls206_person table via tls207.
If you only joined the tls209_appln_ipc table in order to limit your data set, then there is no need to join the table, you can simply move the complete condition into the WHERE clause.
You said you are looking for UK biotech companies, and for this you need to have the the tls206_person.person_ctry_code attribute in your query. The condition "AND t1.appln_auth IN ('GB')" limits your data set to applications filed at the UK patent office, but those applications can (and will) also be filed by non GB applicants. You also have to keep in mind that applicants in GB will also file patents outside GB. I marked that condition as a comment in the query below but you can re-activate it by removing the "--"
Your "t1.appln_filing_date between '2016-01-01' and '2020-01-01'" condition is correct, but here you have to keep in mind that most patents are published (and available in public data bases) 18 months AFTER the application filing date. Therefore you will only find limited patents filed after 06/2018. This is inherent to the patent system; researchers now that this results in a large dip in the figures for the last 2-3 years when creating a timeline based on the applications filing date.
Keeping the above in mind, below is a reworked query. It comes as-is, and you should check and adapt it further to your needs. (I have used the STRING_AGG function to group all applicants in the same cell, but you can remove it and add PSN name in the GROUP BY clause in case you prefer to have a separate row for each individual applicant for those patent applications that have multiple applicants.)
Code: Select all
SELECT t1.appln_id, t1.appln_auth, t1.appln_nr, t1.appln_filing_date,
STRING_AGG ((psn_name), ', ') applicants ,t3.appln_title
join tls207_pers_appln on tls207_pers_appln.appln_id = t1.appln_id
join tls206_person on tls207_pers_appln.person_id = tls206_person.person_id
left JOIN tls202_appln_title t3 ON t3.appln_id = t1.appln_id
t1.appln_id in (select distinct appln_id from tls209_appln_ipc where left(ipc_class_symbol,4) = 'C12N')
--AND t1.appln_auth IN ('GB')
AND person_ctry_code = 'GB'
AND applt_seq_nr > 0 and invt_seq_nr = 0
AND t1.appln_filing_date between '2016-01-01' and '2020-01-01'
group by t1.appln_id, t1.appln_auth, t1.appln_nr, t1.appln_filing_date ,t3.appln_title
order by appln_filing_date, t1.appln_id
About linking PATSTAT to ORBIS data, I have heard that ORBIS data contains the person_id used in PATSTAT, but you need to double check that with the supplier. A number of researchers also have concordance tables but EPO can not provide it.