Help with SQL query - How to count number of citation per year

Here you can post your opinions, ask questions and share experiences on the PATSTAT product line. Please always indicate the PATSTAT edition (e.g. 2015 Autumn Edition) and the database (e.g. PATSTAT Online, MySQL, MS SQL Server, ...) you are using.
Post Reply

khanhhienbi
Posts: 13
Joined: Thu Sep 03, 2020 9:49 am

Help with SQL query - How to count number of citation per year

Post by khanhhienbi » Mon Sep 26, 2022 5:38 pm

Hi everyone,

I am currently working with PATSTAT Autumn 2021 edition. I would like to retrieve a sample of the total number of citations in the UK patent documents for biotechnology firms per year, from 2011 to 2021.

This is my code so far. I would like to use COUNT command for nb_citing_docdb_fam to create a panel data sample with companies in columns and years in rows. Any help is much appreciated. Many thanks in advance!

Code: Select all

SELECT distinct psn_name, tls201_appln.docdb_family_id
	  ,tls201_appln.appln_id
      ,[appln_auth]
      ,[appln_nr]
      ,[appln_kind]
      ,[appln_filing_date]
      ,[receiving_office]
      ,[earliest_publn_date]
      ,[granted]
      ,[nb_citing_docdb_fam]
      ,[nb_applicants]
      ,[nb_inventors]
FROM tls201_appln
JOIN tls207_pers_appln ON tls201_appln.appln_id = tls207_pers_appln.appln_id
join tls206_person on tls207_pers_appln.person_id = tls206_person.person_id
join tls230_appln_techn_field on tls201_appln.appln_id = tls230_appln_techn_field.appln_id
WHERE
tls230_appln_techn_field.techn_field_nr = 15
AND appln_filing_year between 2011 and 2021
AND person_ctry_code = 'GB'
order by docdb_family_id, appln_filing_date asc


EPO / PATSTAT Support
Posts: 425
Joined: Thu Feb 22, 2007 5:33 pm
Contact:

Re: Help with SQL query - How to count number of citation per year

Post by EPO / PATSTAT Support » Wed Sep 28, 2022 2:51 pm

Hello Khanhhienbi,
I am rather unsure how you would like your data to look and what you want to "count". The attribute nb_citing_docdb_fam provides you with the number of forward citations based on the family-family level. You could sum them up over years/applicant, but you need to check your level of aggregation. 1 patent family can have multiple patents filed in different years. They will all have the same number of forward citations, but you would not want to double count them. You need to give it a thought what methodology and level of aggregation you want to use for your calculations.
I would recommend you to take a sample data set at application level and then see how you can extract "the counts".
Here is a sample query that extracts 72 patent applications filed by 'ALMAC DIAGNOSTICS' between 2011 and 2021 that have a technical field number 15 (Chemistry, Biotechnology). Take a closer look at the data, and then decide how you want to "group" and "count". Once that is established, then you can adapt the query to produce the table as defined.

Code: Select all

SELECT *
FROM tls201_appln JOIN tls207_pers_appln ON tls201_appln.appln_id = tls207_pers_appln.appln_id
JOIN tls206_person on tls207_pers_appln.person_id = tls206_person.person_id
WHERE tls201_appln.appln_id in 
(select appln_id from tls230_appln_techn_field where tls230_appln_techn_field.techn_field_nr = 15)
AND appln_filing_year between 2011 and 2021
AND person_ctry_code = 'GB'
and applt_seq_nr > 0 and invt_seq_nr = 0
and psn_name = 'ALMAC DIAGNOSTICS'
order by docdb_family_id,appln_filing_year, tls201_appln.appln_id asc
PATSTAT Support Team
EPO - Vienna
patstat @ epo.org


khanhhienbi
Posts: 13
Joined: Thu Sep 03, 2020 9:49 am

Re: Help with SQL query - How to count number of citation per year

Post by khanhhienbi » Mon Mar 13, 2023 5:42 pm

Dear PATSTAT community,

I would like to count the number of forward citations per year for a set of biotechnology companies. The following queries work fine but would be great to double check with you to see if they're correct:

Code: Select all


SELECT psn_name, 
COUNT(distinct(case when  appln_filing_year = 2008 then tls201_appln.appln_id end)) as '2008',
COUNT(distinct(case when  appln_filing_year = 2009 then tls201_appln.appln_id end)) as '2009',
COUNT(distinct(case when appln_filing_year = 2010 then tls201_appln.appln_id end)) as '2010',
COUNT(distinct(case when  appln_filing_year = 2011 then tls201_appln.appln_id end)) as '2011',
COUNT(distinct(case when  appln_filing_year = 2012 then tls201_appln.appln_id end)) as '2012',
COUNT(distinct(case when  appln_filing_year = 2013 then tls201_appln.appln_id end)) as '2013',
COUNT(distinct(case when  appln_filing_year = 2014 then tls201_appln.appln_id end)) as '2014',
count (distinct (tls201_appln.appln_id)) total
FROM tls201_appln JOIN tls207_pers_appln ON tls201_appln.appln_id = tls207_pers_appln.appln_id
JOIN tls206_person ON tls207_pers_appln.person_id = tls206_person.person_id
JOIN tls211_pat_publn ON tls201_appln.appln_id = tls211_pat_publn.appln_id
JOIN tls212_citation ON tls211_pat_publn.pat_publn_id = tls212_citation.pat_publn_id
JOIN tls230_appln_techn_field on tls201_appln.appln_id = tls230_appln_techn_field.appln_id
WHERE
tls230_appln_techn_field.techn_field_nr = 15
and appln_auth = 'GB'
and applt_seq_nr > 0 
and granted = 'y' 
--and (psn_name = 'xx')
group by psn_name
order by total desc

Any help is much appreciated. Many thanks!


EPO / PATSTAT Support
Posts: 425
Joined: Thu Feb 22, 2007 5:33 pm
Contact:

Re: Help with SQL query - How to count number of citation per year

Post by EPO / PATSTAT Support » Thu Apr 06, 2023 4:27 pm

Hello khanhhienbi,
your query does not count forward citations. On the forum there are quite some posts that explain the concept of forward citations.
Looking at the WHERE clause of your query, I assume you want to count the forward citations of patents filed at the UK patent office that have been granted. (and classified techn_field_nr = 15).
So via the tls212_citation table, you count the number of publications that would cite the GB-granted- tech.field 15 applications.) Aggregated on the application filing year(s).
Here is a re-worked query:

Code: Select all

SELECT psn_name, 
COUNT((case when  appln_filing_year = 2008 then tls212_citation.pat_publn_id end)) as '2008',
COUNT((case when  appln_filing_year = 2009 then tls212_citation.pat_publn_id end)) as '2009',
COUNT((case when appln_filing_year = 2010 then tls212_citation.pat_publn_id end)) as '2010',
COUNT((case when  appln_filing_year = 2011 then tls212_citation.pat_publn_id end)) as '2011',
COUNT((case when  appln_filing_year = 2012 then tls212_citation.pat_publn_id end)) as '2012',
COUNT((case when  appln_filing_year = 2013 then tls212_citation.pat_publn_id end)) as '2013',
COUNT((case when  appln_filing_year = 2014 then tls212_citation.pat_publn_id end)) as '2014',
COUNT((case when  appln_filing_year = 2015 then tls212_citation.pat_publn_id end)) as '2015',
COUNT((case when  appln_filing_year = 2016 then tls212_citation.pat_publn_id end)) as '2016',
COUNT((case when appln_filing_year = 2017 then tls212_citation.pat_publn_id end)) as '2017',
COUNT((case when  appln_filing_year = 2018 then tls212_citation.pat_publn_id end)) as '2018',
COUNT((case when  appln_filing_year = 2019 then tls212_citation.pat_publn_id end)) as '2019',
COUNT((case when  appln_filing_year = 2020 then tls212_citation.pat_publn_id end)) as '2020',
COUNT((case when  appln_filing_year = 2021 then tls212_citation.pat_publn_id end)) as '2021',
count ((tls212_citation.pat_publn_id)) total
FROM tls201_appln JOIN tls207_pers_appln ON tls201_appln.appln_id = tls207_pers_appln.appln_id
JOIN tls206_person ON tls207_pers_appln.person_id = tls206_person.person_id
JOIN tls211_pat_publn ON tls201_appln.appln_id = tls211_pat_publn.appln_id
JOIN tls212_citation ON tls211_pat_publn.pat_publn_id = tls212_citation.cited_pat_publn_id
WHERE
appln_auth = 'GB'
and applt_seq_nr > 0 
and granted = 'y' 
and tls201_appln.appln_id in (select appln_id from tls230_appln_techn_field where tls230_appln_techn_field.techn_field_nr = 15)
and appln_filing_year between 2008 and 2021
group by psn_name
order by total desc
And to explicitly illustrate where the data comes from for each applicant,
here is the query that shows all the data records used for psn_name = 'UNIVERSITY OF CALIFORNIA'. (number 12 in the result list from the above query.)
The above query tells us that there are 22 citations for applications filed in 2013 by psn_name = 'UNIVERSITY OF CALIFORNIA' in techn_field_nr = 15. (For 2014 there are 2 citations).
The query below illustrates in detail the list of those 22+2 citations.

Code: Select all

SELECT distinct tls201_appln.appln_id,tls201_appln.appln_auth + tls201_appln.appln_nr cited_application,
tls201_appln.appln_filing_year, psn_name, tls211_pat_publn.pat_publn_id,
tls211_pat_publn.publn_auth+tls211_pat_publn.publn_nr+tls211_pat_publn.publn_kind cited_publication,
tls211_pat_publn.publn_date, tls212_citation.cited_pat_publn_id, tls212_citation.pat_publn_id,
citing.publn_auth+citing.publn_nr+citing.publn_kind citing_publication, citing.publn_date citing_publication_date
FROM tls201_appln JOIN tls207_pers_appln ON tls201_appln.appln_id = tls207_pers_appln.appln_id
JOIN tls206_person ON tls207_pers_appln.person_id = tls206_person.person_id
JOIN tls211_pat_publn ON tls201_appln.appln_id = tls211_pat_publn.appln_id
JOIN tls212_citation ON tls211_pat_publn.pat_publn_id = tls212_citation.cited_pat_publn_id
join tls211_pat_publn citing on tls212_citation.pat_publn_id = citing.pat_publn_id
WHERE
appln_auth = 'GB'
and applt_seq_nr > 0 
and granted = 'y' 
and tls201_appln.appln_id in (select appln_id from tls230_appln_techn_field where tls230_appln_techn_field.techn_field_nr = 15)
and appln_filing_year between 2008 and 2021
and psn_name = 'UNIVERSITY OF CALIFORNIA'
order by appln_filing_year, tls201_appln.appln_id, tls212_citation.pat_publn_id
So from the details you can see that for filing year 2013, 4 patent applications were cited by 22 other patent publications. For 2014, 1 application was cited by 2 patent publication.
Generally spoken, I am not sure why one would specifically count the forward citations of patents filed at the UK patent office.
PATSTAT Support Team
EPO - Vienna
patstat @ epo.org


Post Reply