Help with backward citation at industry-level

Here you can post your opinions, ask questions and share experiences on the PATSTAT product line. Please always indicate the PATSTAT edition (e.g. 2015 Autumn Edition) and the database (e.g. PATSTAT Online, MySQL, MS SQL Server, ...) you are using.
Post Reply

andyluong.vn
Posts: 3
Joined: Wed Mar 22, 2017 5:00 pm

Help with backward citation at industry-level

Post by andyluong.vn » Wed Mar 22, 2017 5:10 pm

Hi everyone!
I'm currently doing PhD in Economics, and trying to construct the technological spillovers measure at industry-level. I'm just wondering is it by any chance that I can collect the number of backward citations for a period (2005-2013) for all manufacturing industries (like how many backward citations from industry A to industry B and vice versa)? If not, is it possible to do it for different IPC?

I am new to SQL and would be really grateful for any support I can get regarding to this.
I'm trying PATSTAT online autumn version.

Kind regards!
Andy Luong


Geert Boedt
Posts: 176
Joined: Tue Oct 19, 2004 10:36 am
Location: Vienna

Re: Help with backward citation at industry-level

Post by Geert Boedt » Tue Mar 28, 2017 10:05 am

Hello Andy,
There is no ready aggregated data available that gives industry-industry citations. But the principle is the same as all citation analysis; you only have to aggregate it at a different level. The PATSTAT website contains a number of self-learning publications that have sample queries. In particular, have a look at : "An introduction to the PATSTAT database with example queries" ; it gives an example on how to calculate the number of received citations. (forward and backward citations are in principle the same, but seen from a different position.) You simply have to bring this to "industry level" via the IPC classification codes. (Assuming you use the NACE classification scheme to define the "industries".)
Also check Google Scholar for "patent citations technology flows", you will find plenty of relevant publications.

The PATSTAT database does contain a ready made family-family citation table (tls228). But this table includes all the different categories of citations, and many researchers want to exclude applicant citations (or only take citations from search and examination - so called examiner citations). So if you want to exclude the "self citation bias", you will need to do it from scratch via the tls212_citation tables. You will also have to make some decisions what to do about patents that belong to more then 1 "industry" because of the assigned IPCs. Weighing might be appropriate, but that depends on the level of analysis you want to do. If you compare industries, weighing will bias negatively applications that cover multiple technical fields.

Doing this kind of analysis on PATSTAT Online will probably not be possible as you will need to work with intermediate tables to keep it manageable. If your data sample is limited (<100.000 applications), you could make a MS ACCESS extraction, and then continue working offline. The data extraction tool offers an option to include cited, citing and family members, so your final sample will be more then 100.000 applications.
I hope this information is helpful.
Best regards,

Geert Boedt
PATSTAT support
Business Use of Patent Information
EPO Vienna


andyluong.vn
Posts: 3
Joined: Wed Mar 22, 2017 5:00 pm

Re: Help with backward citation at industry-level

Post by andyluong.vn » Thu Apr 27, 2017 4:03 pm

Hi Geert!
This is Andy. I hope you are well.

Thank you very much for your reply Geert. I've followed your advice and learned about the SQL from several materials.
However, I'm really struggling with my own query right now. Can you please take a look and let me know if I did something wrong here. I did try to correct it for several days but could not have a solution. Basically what I want is to have a table of 4 variables: the citing_appln_id, citing_ipc, cited_appln_id, cited_ipc, then I can add the count for the number of patent citation later. For the first part, I try to make a sample "t5" for the citing_appln_id and citing_ipc, and for the second part I do it for the cited application, by going backward from tls212.
I hope my question is clear enough for you Geert. If not, please feel free to ask, I will get back to you as soon as I can. And I'm really looking forward to hearing from you Geert.

My query:
SELECT distinct t5.appln_id, t1.appln_filing_year, t5.citing_ipc, t10.cited_id, t10.cited_ipc
FROM (select distinct t1.appln_id, t1.apln_filing_year, t1.appln_kind, t2.ipc_class_symbol as citing_ipc
From (((tls201_appln as t1
INNER JOIN tls209_appln_ipc as t2 on t1.appln_id = t2.appln_id)
INNER JOIN tls211_pat_publn as t3 ON t1.appln_id = t3.appln_id)
INNER JOIN tls212_citation as t4 on t4.pat_publn_id = t3.pat_publn_id
WHERE ((t1.appln_filing_year) >= 2005
AND (t1.appln_filing_year) <= 2013
AND (t1.appln_kind = 'A'))) as t5
INNER JOIN (select distinct t9.ipc_class_symbol as cited_ipc, t9.appln_id as cited_id
From (((tls212_citation as t6
INNER JOIN tls211_pat_publn as t7 ON t6.cited_pat_publn_id = t7.pat_publn_id)
INNER JOIN tls201_appln as t8 on t7.appln_id = t8.appln_id)
INNER JOIN tls209_appln_ipc as t9 on t8.appln_id = t9.appln_id
WHERE ((citing_ipc LIKE 'D03B%'
OR citing_ipc LIKE 'A14G%'
OR citing_ipc LIKE 'D02M%')
AND (cited_ipc LIKE 'A47R%'
OR cited_ipc LIKE 'B68G%'))) as t10 on t10.cited_id = t5.appln_id
GROUP BY t5.appln_id

Problem: 3:51:20 PM [SELECT - 0 row(s), 0 secs] [Error Code: 156, SQL State: S0001] Incorrect syntax near the keyword 'WHERE'.


Geert Boedt
Posts: 176
Joined: Tue Oct 19, 2004 10:36 am
Location: Vienna

Re: Help with backward citation at industry-level

Post by Geert Boedt » Tue May 02, 2017 8:29 am

Hello Andy,
I had some problems with the re-naming and aliases in your query, so I therefore decided to make a query from scratch which I think will serve your quest.
You can adapt it and add some "counts" if you need numbers.
I assume that you will want to look at family/family citations, so I have added the docdb_family_id. You can then remove all the attributes at application level (ID, number,...and IPC codes) to get a very strict family/family citation table.

Code: Select all

Select distinct citing_ap.appln_id, citing_ap.docdb_family_id,
citing_ap.appln_nr_epodoc, citing_ipc.ipc_class_symbol, 
cited_ap.appln_id, cited_ap.docdb_family_id, cited_ap.appln_nr_epodoc,
cited_ipc.ipc_class_symbol
from
tls209_appln_ipc as citing_ipc  join tls201_appln as citing_ap on citing_ipc.appln_id = citing_ap.appln_id
join tls211_pat_publn as citing_pub on citing_ap.appln_id = citing_pub.appln_id
join tls212_citation on citing_pub.pat_publn_id = tls212_citation.pat_publn_id
join tls211_pat_publn as cited_pub on tls212_citation.cited_pat_publn_id = cited_pub.pat_publn_id
join tls201_appln as cited_ap on cited_pub.appln_id = cited_ap.appln_id
join tls209_appln_ipc as cited_ipc on cited_ap.appln_id = cited_ipc.appln_id
where citing_ap.appln_filing_year between 2005 and 2013 
and cited_ap. appln_kind = 'A' 
and left (citing_ipc.ipc_class_symbol,4) in ('A47B','A47C%','A47D','A47F')
and left (cited_ipc.ipc_class_symbol,4) in ('B29B','B29D','B60C', 'B67D','C08C')
order by citing_ap.appln_id
At family /family level (then you loose the granularity for each of the 20 -4x5- possible IPC pairs):

Code: Select all

Select distinct citing_ap.docdb_family_id, cited_ap.docdb_family_id
from
tls209_appln_ipc as citing_ipc  join tls201_appln as citing_ap on citing_ipc.appln_id = citing_ap.appln_id
join tls211_pat_publn as citing_pub on citing_ap.appln_id = citing_pub.appln_id
join tls212_citation on citing_pub.pat_publn_id = tls212_citation.pat_publn_id
join tls211_pat_publn as cited_pub on tls212_citation.cited_pat_publn_id = cited_pub.pat_publn_id
join tls201_appln as cited_ap on cited_pub.appln_id = cited_ap.appln_id
join tls209_appln_ipc as cited_ipc on cited_ap.appln_id = cited_ipc.appln_id
where citing_ap.appln_filing_year between 2005 and 2013 and cited_ap. appln_kind = 'A'
	and left (citing_ipc.ipc_class_symbol,4) in ('A47B','A47C%','A47D','A47F')
	and left (cited_ipc.ipc_class_symbol,4) in ('B29B','B29D','B60C', 'B67D','C08C')
	order by citing_ap.docdb_family_id
Best regards,

Geert Boedt
PATSTAT support
Business Use of Patent Information
EPO Vienna


andyluong.vn
Posts: 3
Joined: Wed Mar 22, 2017 5:00 pm

Re: Help with backward citation at industry-level

Post by andyluong.vn » Fri May 05, 2017 4:39 pm

Thank you very much for these queries Geert. I really appreciate your help.

Have a nice weekend!
Best,
Andy Luong


Post Reply