Help with counting backward citation

Here you can post your opinions, ask questions and share experiences on the PATSTAT product line. Please always indicate the PATSTAT edition (e.g. 2015 Autumn Edition) and the database (e.g. PATSTAT Online, MySQL, MS SQL Server, ...) you are using.
Post Reply

Angel
Posts: 2
Joined: Wed Apr 18, 2018 3:33 pm

Help with counting backward citation

Post by Angel » Wed Apr 18, 2018 3:53 pm

Hi everyone,

I need a bit of help with a complicated (for me atleast) SQL query.
The goal is to find all patents in a given time frame from a selected number of countires, that are citing at least three patents. However these three (or more) cited patens need to belong to a different Section/ Classes or Subclasses. I'm not sure which granularity I can make the most use of.
Example:
For Sections: Patent xxxxxxx is in Section A and is quoting three patents, which are cetegorised (first position only) in respectively Sections A, D and F

For Classes: Patent xxxxxxx is in A23 and is quoting three patents, which are cetegorised (first position only) in respectively A18, A09 and F12

etc.
There are two lines where I'm struggling. Line two (mainly) and I'm not sure about line 21. Both are commented in the query below.

Here my query so far (and I needed help with that too):

Code: Select all

select citing_pub.publn_nr, cited_pub.publn_nr, more2.number_bw_citation 
from -- !!!This is where I need a count function for the cited publication and distinction between the IPC Sections/Classes!!!
tls209_appln_ipc as citing_ipc  
join tls201_appln as citing_ap on citing_ipc.appln_id = citing_ap.appln_id
join tls211_pat_publn as citing_pub on citing_ap.appln_id = citing_pub.appln_id
join tls212_citation on citing_pub.pat_publn_id = tls212_citation.pat_publn_id
join tls211_pat_publn as cited_pub on tls212_citation.cited_pat_publn_id = cited_pub.pat_publn_id
join tls201_appln as cited_ap on cited_pub.appln_id = cited_ap.appln_id
join tls209_appln_ipc as cited_ipc on cited_ap.appln_id = cited_ipc.appln_id
join  

(Select citing_publication.appln_id,  count (cited_publication.appln_id) number_bw_citation
from
tls211_pat_publn cited_publication

join tls212_citation as citation on cited_publication.pat_publn_id = citation.pat_publn_id
join tls211_pat_publn as citing_publication on citation.pat_publn_id = citing_publication.pat_publn_id

where citing_publication.appln_id in ('145086','258027','217882')
group by citing_publication.appln_id
having count(cited_publication.appln_id)>2) as more2 on citing_ap.appln_id = more2.appln_id  -- Is this line correct? I'm a bit confused about this one.

where citing_pub.publn_date BETWEEN '1991-01-01' AND '2016-07-31'
and cited_ap. appln_kind = 'A' 
AND citing_ap.appln_auth IN ('AL','AT','BA','BE','BG','CH','CS','CY','CZ','DE','DK','EE','ES','FI','FR','GB','GE','GR','HR','HU','IE','IS','IT','LI','LT','LU','LV','MC','MD','ME','MK','MT','NL','NO','PL','PT','RO','RS','RU','SE','SI','SK','SM','TR','UA','YU','EP')
AND citing_ap.granted =1 
order by citing_ap.appln_id
As a side note:
I'm using a local (University) PATSTAT database. Version 2016b (thus the end date end of July)

Kind regards
Angel


EPO / PATSTAT Support
Posts: 425
Joined: Thu Feb 22, 2007 5:33 pm
Contact:

Re: Help with counting backward citation

Post by EPO / PATSTAT Support » Fri Apr 27, 2018 1:13 pm

Hello Angel,
the problem with your IPC criteria is that each citing and cited application can have multiple IPC classification codes. So if you want to do this very strict, I fear you will exclude most of the citations. The fact that they are cited, mostly means that there is "some" kind of technical relevant link, and therefore there will be some kind of overlap on IPC classification codes, especially at the higher levels.
Maybe a solution would be to "assign" a main classification code (weighing? or based on the tls230 technical field table ?) on both cited and citing publication pairs and then see how it looks.
I would definitely use intermediate tables (this is not possible in PATSTAT Online), you can start with making a list of all applications that have at least 3 citations, and then aggregate the IPC codes for all what is left. (using whatever approach you decide on for the IPC conditions.)
PATSTAT Support Team
EPO - Vienna
patstat @ epo.org


torben
Posts: 13
Joined: Thu Nov 12, 2015 9:59 am

Re: Help with counting backward citation

Post by torben » Wed May 16, 2018 1:02 pm

Hi,
with the above mentioned search query its easy to create a table with patents which have at least three backward citations. Next step would be to add these table to the database but how would you program the code if you need at least three different technology fields or at least three different main/first IPC-class?


EPO / PATSTAT Support
Posts: 425
Joined: Thu Feb 22, 2007 5:33 pm
Contact:

Re: Help with counting backward citation

Post by EPO / PATSTAT Support » Thu May 17, 2018 2:41 pm

And EPO patent data does not have a concept of "first" or "main classfication".
Therefore you as a researcher will need to define the criteria for "main" classification.
(Weighing at subsclass level ?)
PATSTAT Support Team
EPO - Vienna
patstat @ epo.org


torben
Posts: 13
Joined: Thu Nov 12, 2015 9:59 am

Re: Help with counting backward citation

Post by torben » Fri May 18, 2018 10:15 am

Hi, thanks for your comment. I thought about the german concept of IPC classification.

But the other aspect of my comment is the question whether it is possible to select patents with three backward citations (yes first query) and three different technology fields (this is the unknown part of the query)?


Angel
Posts: 2
Joined: Wed Apr 18, 2018 3:33 pm

Re: Help with counting backward citation

Post by Angel » Sat May 26, 2018 5:30 pm

Hi again and thanks for the replies.
I've tried following query:

Code: Select all

select cited_publication.publn_auth cited_publn_auth 
,cited_publication.publn_nr cited_publn_nr
,cited_publication.publn_kind cited_publn_kind
,cited_publication.publn_date cited_publn_date
,cited_publication.appln_id cited_appln_id
, tech_field.techn_field_nr cited_Tech_Field
, tech_field.Weight cited_tech_field_weight

,citing_publn.publn_auth citing_publn_auth
,citing_publn.publn_nr citing_publn_nr
,citing_publn.publn_kind citing_publn_kind 
,citing_publn.publn_date citing_publn_date
, citing_publn.appln_id citing_appln_id
, more2.number_bw_citation 

from 
tls211_pat_publn cited_publication 
join tls212_citation on cited_publication.pat_publn_id = tls212_citation.cited_pat_publn_id
join tls211_pat_publn citing_publn on tls212_citation.pat_publn_id = citing_publn.pat_publn_id 
join  

(Select distinct count (cited_publication.appln_id) number_bw_citation, citing_publication.appln_id  
from
tls211_pat_publn cited_publication

join tls212_citation as citation on cited_publication.pat_publn_id = citation.pat_publn_id
join tls211_pat_publn as citing_publication on citation.pat_publn_id = citing_publication.pat_publn_id

where citing_publication.publn_date >'1991-01-01'
group by citing_publication.appln_id
having count(cited_publication.appln_id)>2) as more2 on citing_publn.appln_id = more2.appln_id  -- Überprüfen
join tls201_appln as citing_ap on citing_publn.appln_id = citing_ap.appln_id

Join tls230_appln_techn_field Tech_field on cited_publication.appln_id = tech_field.appln_id 

where citing_publn.publn_date BETWEEN '1991-01-01' AND '2016-07-31'
and citing_ap.appln_kind = 'A' 
AND citing_ap.appln_auth IN ('AL','AT','BA','BE','BG','CH','CS','CY','CZ','DE','DK','EE','ES','FI','FR','GB','GE','GR','HR','HU','IE','IS','IT','LI','LT','LU','LV','MC','MD','ME','MK','MT','NL','NO','PL','PT','RO','RS','RU','SE','SI','SK','SM','TR','UA','YU','EP')
AND citing_ap.granted =1 

Order by citing_appln_id, cited_appln_id, cited_tech_field_weight
However it returns 12.873.887 rows. So a second query will be needed.

This is a sample of the three fields query

Code: Select all

SELECT tls201_appln.appln_id,appln_auth,appln_nr,appln_kind,appln_filing_date,
granted, nb_citing_docdb_fam, nb_applicants,nb_inventors, weight,
tls230_appln_techn_field.techn_field_nr
FROM tls201_appln
join tls230_appln_techn_field on tls201_appln.appln_id = tls230_appln_techn_field.appln_id
join (select appln_id , count(distinct(techn_field_nr)) count_techn_field from tls230_appln_techn_field 
group by appln_id having count(distinct(techn_field_nr)) > 2) more_then_2 
on tls201_appln.appln_id = more_then_2.appln_id


where appln_auth = 'DE' -- + others
and appln_kind = 'A'
and granted = 1
and earliest_filing_year > '1991'
group by tls201_appln.appln_id,appln_auth,appln_nr,appln_kind,appln_filing_date,granted,
nb_citing_docdb_fam,nb_applicants,nb_inventors,weight,tls230_appln_techn_field.techn_field_nr
order by tls201_appln.appln_id
So is it possible to somehow "merge" those two queries? Or modify the second query to return just the highest weighted field/class?

Thanks


Post Reply