counting backward and self-citation

Here you can post your opinions, ask questions and share experiences on the PATSTAT product line. Please always indicate the PATSTAT edition (e.g. 2015 Autumn Edition) and the database (e.g. PATSTAT Online, MySQL, MS SQL Server, ...) you are using.
Post Reply

PIERG
Posts: 1
Joined: Fri Aug 12, 2016 6:10 am

counting backward and self-citation

Post by PIERG » Fri Aug 12, 2016 6:17 am

Hi, I am currently working on my thesis, and i'm using the version of Patstat (October 2013).
I want to count the number of forward and backward citations of patents related PV technologies which are applicated in US by Chinese inventor.
I searched this forum for a solution, I found the the solution for forward citation.
However, I couldn't find self-citation and backward citation.

It's my first time with SQL language and I couldn't find solution.
Would you help me, please?

Here is my query that I am working on.

SELECT
DISTINCT year(t1.appln_filing_date) AS year,
t1.appln_auth AS office_ctry,
t1.appln_kind,
t1.appln_id,
t1.appln_nr,
t6.PERSON_ID,
t6.PERSON_CTRY_CODE,
t6.PERSON_NAME,
t9.IPC_CLASS_SYMBOL,
COUNT(distinct t12.PAT_PUBLN_ID) AS forward_citation_5yrs
FROM
tls201_appln t1
INNER JOIN
tls207_pers_appln t7 ON t1.appln_id = t7.APPLN_ID
INNER JOIN
tls206_person t6 ON t7.PERSON_ID = t6.PERSON_ID
INNER JOIN
tls209_appln_ipc t9 ON t1.appln_id = t9.APPLN_ID
INNER JOIN
(SELECT appln_id, MIN(publn_date) AS earliest_date
FROM tls211_pat_publn GROUP BY appln_id) tt ON t1.appln_id = tt.appln_id
INNER JOIN
tls211_pat_publn t11a ON t11a.appln_id = tt.appln_id
INNER JOIN
tls212_citation t12 ON t11a.pat_publn_id = t12.cited_pat_publn_id
INNER JOIN
tls211_pat_publn t11b ON t12.pat_publn_id = t11b.pat_publn_id
WHERE
year(t1.appln_filing_date) >= 1981 AND year(t1.appln_filing_date) <= 2010
AND t1.appln_kind = 'A'
AND t1.appln_auth = 'US'
AND t6.PERSON_CTRY_CODE = 'CN'
AND (t9.IPC_CLASS_SYMBOL LIKE 'F03G 6%'
OR t9.IPC_CLASS_SYMBOL LIKE 'F24J 2%'
OR t9.IPC_CLASS_SYMBOL LIKE 'H01L 27/142%'
OR t9.IPC_CLASS_SYMBOL LIKE 'H01L 31%'
OR t9.IPC_CLASS_SYMBOL LIKE 'H02N 6%'
OR t9.IPC_CLASS_SYMBOL LIKE 'E04D 13/18%')
AND t11b.publn_date <= date_add(tt.earliest_date, INTERVAL 5 YEAR)
GROUP BY t1.appln_id
ORDER BY year(t1.appln_filing_date), t1.appln_id;


Geert Boedt
Posts: 176
Joined: Tue Oct 19, 2004 10:36 am
Location: Vienna

Re: counting backward and self-citation

Post by Geert Boedt » Fri Aug 19, 2016 10:06 am

Hello Pierg,
forward and backward citations are basically the same, but looked at from 2 different directions. If A cites B then B has been cited by A. This reciprocal relation is defined through the tls212_citation table. The test data base (in MS ACCESS) and related documentation which you can find at the link below give a good example of forward citations. (Query 8)
https://publication.epo.org/raw-data/pr ... roductId=1
And the attached presentation also explains the principles. All types of citation analysis are variations on the principle explained in Query8. The query calculates the number of forward citations, if you want to "count by applicant" instead of application then you will need to make the sum of all the citations grouped over applicant. The same if you want to look at technology-technology, inventor or country-country citations (to analyse technology flows between the "entities".

Your query contains syntax errors, all attributes in the SELECT clause should also appear in the GROUP BY clause (except the attribute you are counting.)
Query 8 from the test data base, also described in the documentation, does not work "straight out of the box" on PATSTAT Online because of slightly different SQL syntax rules between MS ACCESS, MS SQL and MySQL. I have adapted it so it works on PATSTAT Online (which is MS SQL based).
It becomes rather entangled because all the "sub queries" generating intermediate tables have to be embedded in one single query. PATSTAT Online does not allow users to upload their own queries.

Code: Select all

SELECT t1.appln_id, Count(Query8_t3.pat_publn_id) AS cites_3y
FROM ((((SELECT DISTINCT t1.appln_id, t1.appln_auth, t1.appln_nr, t1.appln_kind
				FROM tls201_appln AS t1 INNER JOIN tls209_appln_ipc AS t2 ON t1.appln_id = t2.appln_id WHERE (((t1.appln_kind)='A' Or (t1.appln_kind)='W') AND ((Year([t1].[appln_filing_date]))=2005) AND ((t2.ipc_class_symbol) Like 'F03D%'))) AS t1 
                INNER JOIN (SELECT appln_id, MIN(publn_date) AS earliest_date FROM tls211_pat_publn GROUP BY appln_id ) as Query8_t2 ON t1.appln_id = Query8_t2.appln_id) 
                INNER JOIN tls211_pat_publn AS t2b ON Query8_t2.appln_id = t2b.appln_id)
				INNER JOIN (SELECT DISTINCT tls212_citation.pat_publn_id, tls212_citation.cited_pat_publn_id FROM tls212_citation GROUP BY tls212_citation.pat_publn_id, tls212_citation.cited_pat_publn_id)  as Query8_t3 
                     ON t2b.pat_publn_id = Query8_t3.cited_pat_publn_id) INNER JOIN tls211_pat_publn AS t4 ON Query8_t3.pat_publn_id = t4.pat_publn_id
WHERE (((t2b.publn_auth)='DE') 
	AND ((t4.publn_auth)='EP') 
    AND ((Year([t4].[publn_date]))<>9999) 
    AND ((Year([earliest_date]))<>9999) 
    AND ((DateDiff("yyyy",[Query8_t2].[earliest_date],[t4].[publn_date]))<4))
GROUP BY t1.appln_id
ORDER BY Count(Query8_t3.pat_publn_id) DESC , t1.appln_id;
In case you have MS ACCESS available on your local computer, my advice would be to extract a data set that contains the applications according to your query, and then make an extraction in PATSTAT Online (via download) that contains the cited and citing documents. (The family members you do not need to add). After that is done, you can import the queries from the test data base, and simply adapt them to your needs.
Below is the query based on your conditions, which you can use to extract the data sample. This only results in 61 distinct applications which is rather low for statistical analysis. You might want to review the conditions in the the WHERE clause and make them less stringent.

Code: Select all

select distinct tls201_appln.appln_id
from  tls201_appln join tls209_appln_ipc on tls201_appln.appln_id = tls209_appln_ipc.appln_id
join tls207_pers_appln on tls201_appln.appln_id = tls207_pers_appln.appln_id
join tls206_person on tls207_pers_appln.person_id = tls206_person.person_id
where appln_filing_year between 1981 AND 2010
AND appln_kind = 'A' -- this will exclude PCT applications !
AND appln_auth = 'US' -- applied in the US
AND PERSON_CTRY_CODE = 'CN' --CN applicants or inventors
AND (IPC_CLASS_SYMBOL LIKE 'F03G   6%'
	OR IPC_CLASS_SYMBOL LIKE 'F24J   2%'
	OR IPC_CLASS_SYMBOL LIKE 'H01L  27/142%'
	OR IPC_CLASS_SYMBOL LIKE 'H01L 31%'
	OR IPC_CLASS_SYMBOL LIKE 'H02N   6%'
	OR IPC_CLASS_SYMBOL LIKE 'E04D  13/18%')
I hope this helps you forward.
Best regards,

Geert Boedt
PATSTAT support
Business Use of Patent Information
EPO Vienna


Post Reply