Page 1 of 1

Data retrieval of patent references possible?

Posted: Sun Oct 26, 2014 3:44 am
by EngineeringIntern
Hello,

I am new to patent research and I could use some help. I am researching about a half a dozen patents, and I need help creating a code to find the patents that each of them reference. I need the publication date, filing date, and priority date of each patent that is referenced from the original patent. I would then like to search for each of the resulting patents using the same method and record the third tier of patents that are referenced.

Also, what would be the best method to extract the data and put it in Microsoft Excel? Is all of this possible using this service? If not, is there a way to complete my research using a more effective method than just searching through each patent individually? Thank you for your help.

Re: Data retrieval of patent references possible?

Posted: Tue Nov 11, 2014 12:11 pm
by Geert Boedt
Hello,
If it is effectively only half a dozen of patents, it might be quicker & easier to do it manually via ESPACENET: http://worldwide.espacenet.com
ESPACENET allows you to download the cited (and also citing) documents as an excel or CSV file for any specific patent application. If you want to have the citing documents from the citing documents, you would need to repeat the process for each of the citing documents from the first cycle.

In PATSTAT one could write an sql query that does it in one go. From a data aggregation point of view, you first would need to define your "half dozen" of patents, then look op the citations, and then repeat the process for the second set.

On a locally installed PATSTAT database one would ideally create an intermediate table or view. Via PATSTAT on-line, you need to squeeze it all in one query, or create a second query which you feed with the results from the first query. Below is an example query that will give you all the citations from the citations from US8112496 .
The first "join via the tls212_citation table" will give you 185 cited publications from US8112496, the second "join via the tls212_citation_1 table) will expand these 185 publications to 1952 publications. I excluded non-patent-literature citations by forcing the pat_citn_seq_nr > 0 .

You can paste/copy this in PATSTAT (on-line - APRIL 2014 release) and it should give a table with the 1952 publications.

SELECT tls211_pat_publn.publn_auth, tls211_pat_publn.publn_nr, tls211_pat_publn.publn_kind, tls211_pat_publn.publn_date, tls212_citation.citn_id, tls211_pat_publn_1.publn_auth, tls211_pat_publn_1.publn_nr, tls211_pat_publn_1.publn_kind, tls211_pat_publn_1.publn_date, tls212_citation_1.citn_id, tls211_pat_publn_2.publn_auth, tls211_pat_publn_2.publn_nr, tls211_pat_publn_2.publn_kind, tls211_pat_publn_2.publn_date
FROM tls211_pat_publn INNER JOIN tls212_citation ON tls211_pat_publn.pat_publn_id = tls212_citation.pat_publn_id
INNER JOIN tls211_pat_publn AS tls211_pat_publn_1 ON tls212_citation.cited_pat_publn_id = tls211_pat_publn_1.pat_publn_id
INNER JOIN tls212_citation AS tls212_citation_1 ON tls211_pat_publn_1.pat_publn_id = tls212_citation_1.pat_publn_id
INNER JOIN tls211_pat_publn AS tls211_pat_publn_2 ON tls212_citation_1.cited_pat_publn_id = tls211_pat_publn_2.pat_publn_id
WHERE tls211_pat_publn.publn_auth='us' AND tls211_pat_publn.publn_nr = "8112496" AND tls212_citation.pat_citn_seq_nr >0 AND tls212_citation_1.pat_citn_seq_nr > 0
GROUP BY tls211_pat_publn.pat_publn_id, tls211_pat_publn.publn_auth, tls211_pat_publn.publn_nr, tls211_pat_publn.publn_kind, tls211_pat_publn.publn_date, tls212_citation.citn_id, tls211_pat_publn_1.publn_auth, tls211_pat_publn_1.publn_nr, tls211_pat_publn_1.publn_nr, tls211_pat_publn_1.publn_kind, tls211_pat_publn_1.publn_date, tls212_citation_1.citn_id, tls211_pat_publn_2.publn_auth, tls211_pat_publn_2.publn_nr, tls211_pat_publn_2.publn_kind, tls211_pat_publn_2.publn_date
ORDER BY tls212_citation.citn_id, tls212_citation_1.citn_id;

Depending on the nature of your research, you would probably look at the family-family citations and not at the publication level citations. You would probably also distinguish between (or exclude) citations given by the examiner/patent office or citations by the applicant (or any other origins of the citations -via the citn_origin attribute.) To make the above query complete according to your requirement of having the filing date and the priority date, you would need to make an extra join with the tls201_appln table in order to pick-up those extra attributes.

Geert Boedt
EPO - PATSTAT team

Re: Data retrieval of patent references possible?

Posted: Thu Nov 27, 2014 1:25 pm
by curcitiz
Hi,

I am doing a MSc thesis and I need the number of patents published and the number of citing for each patent for a particular set of companies. I just read your comment that "ESPACENET allows you to download the cited (and also citing) documents as an excel or CSV file for any specific patent application".When i download as an excel file the list of patents of a company the different columns provide many information but not the CITING DOCUMENTS.
I would like to know if it is possible to dowload the list of the patents with also the citing documents as a column.

Thank you very much

Re: Data retrieval of patent references possible?

Posted: Thu Dec 04, 2014 6:08 pm
by Geert Boedt
ESPACENET allows to download of a list of Cited or Citing document from 1 specific patent.
It does not allow you to download the "list" of a "list" of patents belonging to an applicant.
So if your list of patents from an applicant only contains a "couple of dozens" of patents, then it might be quicker to do it via ESPACNET. For larger groups, one could do it via PATSTAT (on-line).

Here below is a sample query that retrieves all the citations from all patents filed by the company: 'ENTROPY SOLUTIONS' .
'ENTROPY SOLUTIONS' has 10 applications in PATSTAT.
These 10 applications have in total 16 publications (normally only 1 publication from each application will have citations, but there are exceptions: f.ex. supplementary search reports.)

These 16 publications have in total 106 citations, but some are non-patent literature documents.
To force the final result list to only contain cited patents (and not XP documents), we add the condition cited_pat_publn_id > 0.
This gives a final list of 99 publications cited by the original applications.
The interesting analysis are of course to see who is citing who.
With other words, if the 99 publications cited by 'ENTROPY SOLUTIONS' belong to company A and B, this might be an indication that 'ENTROPY SOLUTIONS' is building technology on A and B..... Partners or competition ?

select ap.appln_id, ap.appln_auth, ap.appln_nr, ap.appln_filing_date, pub.publn_auth, pub.publn_nr,
cit.citn_id as id,
citations.publn_auth as cited_author,
citations.publn_nr as cited_number,
citations.publn_kind as cited_kind,
citations.publn_first_grant as cited_grant
from tls201_appln ap
join tls207_pers_appln pa on ap.appln_id = pa.appln_id
join tls206_person per on per.person_id = pa.person_id
join tls211_pat_publn pub on ap.appln_id = pub.appln_id
join tls212_citation cit on pub.pat_publn_id = cit.pat_publn_id
join tls211_pat_publn citations on cit.cited_pat_publn_id = citations.pat_publn_id
where hrm_l2 like 'ENTROPY SOLUTIONS' -- name of the person
and applt_seq_nr > 0 -- to force the query to look only at persons taht are applicants
and cit.cited_pat_publn_id > 0 -- to exclude the links to non-patent literature citations
order by ap.appln_id, pub.pat_publn_id, cit.citn_id


You can paste/copy the above query in PATSTAT on-line.
Geert Boedt
Patent Information for Business Use