Citations by industry and country filed at the EPO

Here you can post your opinions, ask questions and share experiences on the PATSTAT product line. Please always indicate the PATSTAT edition (e.g. 2015 Autumn Edition) and the database (e.g. PATSTAT Online, MySQL, MS SQL Server, ...) you are using.
Post Reply

nicole.crichton
Posts: 3
Joined: Tue Aug 16, 2022 3:28 pm

Citations by industry and country filed at the EPO

Post by nicole.crichton » Mon Aug 29, 2022 5:11 pm

Hi, I would like to extract data on the number of patent citations by industry (industries 10 to 33) and country filed at the EPO from 2007-2018.

I have formed the following code and would like to know if this is correct for the data I would like to collect:

Code: Select all

SELECT DISTINCT tls201_appln.earliest_publn_year , person_ctry_code, tls229_appln_nace2.nace2_code, COUNT(DISTINCT tls212_citation.cited_appln_id) NumberOfCitations
FROM tls201_appln JOIN tls207_pers_appln ON tls201_appln.appln_id = tls207_pers_appln.appln_id
JOIN tls206_person ON tls207_pers_appln.person_id = tls206_person.person_id
JOIN tls212_citation ON tls201_appln.appln_id = tls212_citation.cited_appln_id
JOIN tls229_appln_nace2 ON tls201_appln.appln_id = tls229_appln_nace2.appln_id
JOIN tls211_pat_publn ON tls212_citation.pat_publn_id = tls211_pat_publn.pat_publn_id
WHERE tls201_appln.appln_filing_year BETWEEN 2007 AND 2018
AND publn_auth = 'EP'
AND person_ctry_code = 'BE'
AND nace2_code BETWEEN '10%' AND '33%'
GROUP BY  person_ctry_code, tls201_appln.earliest_publn_year, tls229_appln_nace2.nace2_code
ORDER BY  person_ctry_code, tls201_appln.earliest_publn_year, tls229_appln_nace2.nace2_code


EPO / PATSTAT Support
Posts: 425
Joined: Thu Feb 22, 2007 5:33 pm
Contact:

Re: Citations by industry and country filed at the EPO

Post by EPO / PATSTAT Support » Wed Aug 31, 2022 10:40 am

Hello Nicole,

I think the logics in your query are flawed. (But I am not 100 % sure what you want to do neither.)
The tls212_citation.cited_pat_publn_id has no connection to tls212_citation.cited_appln_id in the tls212_citation records, apart from the fact that both are cited by the same publication.
So with other words: tls212_citation.cited_appln_id is not the appln_id from tls212_citation.pat_publn_id.
You can easily observer this because they are mutely exclusive, if one is 0 then the other is not. They are never both <> 0.

Have a look at forward-and-backward-citation-7892

The large majority (99,8%) of citation records are linking cited publications. Cited applications is only a minority. So if you only take the publications you will cover most of the citations. (But to be 100 % complete, one would need to use both approaches.)

I would execute a query/search strategy on a limited number of applications, then manually double check to see if the "count" and search strategy works correct for a limited number, and then expand the boundaries and limitations.

Code: Select all

SELECT DISTINCT tls201_appln.earliest_publn_year , person_ctry_code, tls229_appln_nace2.nace2_code,
count (tls212_citation.cited_pat_publn_id) NumberOfCitations
FROM tls201_appln JOIN tls207_pers_appln ON tls201_appln.appln_id = tls207_pers_appln.appln_id
JOIN tls206_person ON tls207_pers_appln.person_id = tls206_person.person_id
JOIN tls211_pat_publn ON tls201_appln.appln_id = tls211_pat_publn.appln_id
JOIN tls212_citation ON tls211_pat_publn.pat_publn_id = tls212_citation.pat_publn_id
JOIN tls229_appln_nace2 ON tls201_appln.appln_id = tls229_appln_nace2.appln_id
WHERE tls201_appln.appln_filing_date between  '2007/1/10' and '2007/1/15'
AND publn_auth = 'EP'
AND person_ctry_code = 'BE'
AND nace2_code BETWEEN '10%' AND '33%'
and tls201_appln.appln_id in (32715,265128,332473813 ,332473787,332473813)
and tls212_citation.cited_pat_publn_id = 0
GROUP BY person_ctry_code, tls201_appln.earliest_publn_year, tls229_appln_nace2.nace2_code
ORDER BY person_ctry_code, tls201_appln.earliest_publn_year, tls229_appln_nace2.nace2_code
PATSTAT Support Team
EPO - Vienna
patstat @ epo.org


Post Reply