A query for my MA final paper

Here you can post your opinions, ask questions and share experiences on the PATSTAT product line. Please always indicate the PATSTAT edition (e.g. 2015 Autumn Edition) and the database (e.g. PATSTAT Online, MySQL, MS SQL Server, ...) you are using.
Post Reply

Posts: 2
Joined: Mon Feb 03, 2014 7:49 pm

A query for my MA final paper

Post by URL260 » Mon Feb 03, 2014 10:23 pm

Hi everyone!
I was a complete newbie two weeks ago, so I hope that someone can help me out with this huge problem.
My aim is to gather informations about all the patents filed by a set german and polish companies from 1995 till 2012.
the informations i need are:
-patent specification (title, date of filing, date of publication)
-number of years in which the company has paid the fees
-number of citation received by the patent (divided if possibile in: citations done by the company itsself, and citation done by others).

I've found the hrm_l2_id for each company cause I guess (correct me if I'm wrong) it's the safest way to pick the company's patents.

the query I've set up is this:

SELECT han_id,hrm_l2,appln_title,publn_auth,publn_nr,appln_filing_year,publn_earliest_year,MAX(l520ep)
FROM tls201_appln a
JOIN tls207_pers_appln b ON a.appln_id = b.appln_id
JOIN tls202_appln_title t ON a.appln_id =t.appln_id
JOIN tls206_person c ON b.person_id = c.person_id
JOIN tls211_pat_publn d ON a.appln_id = d.appln_id
JOIN tls221_inpadoc_prs i ON i.appln_id = d.appln_id
join tls212_citation e on a.appln_id = e.cited_appln_id
( hrm_l2_id='2480779'
OR hrm_l2_id='3396740'
OR hrm_l2_id='3396741'
OR hrm_l2_id='4284824'
OR hrm_l2_id='4284826'
OR hrm_l2_id='5079271'
OR hrm_l2_id='5079272'
OR hrm_l2_id='5079273'
OR hrm_l2_id='5079337'
OR hrm_l2_id='6318438'
OR hrm_l2_id='6318440'
OR hrm_l2_id='6318439'
OR hrm_l2_id='6318442'
OR hrm_l2_id='10240046'
OR hrm_l2_id='10240047'
OR hrm_l2_id='10240048'
OR hrm_l2_id='10240049'
OR hrm_l2_id='10240051'
OR hrm_l2_id='10240052'
OR hrm_l2_id='10240053'
OR hrm_l2_id='10240054'

AND appln_auth='EP'
AND publn_auth='EP'
AND applt_seq_nr > 0
AND appln_filing_year BETWEEN '1995' AND '2012'
GROUP BY a.appln_id
ORDER BY hrm_l2_id,appln_filing_year, publn_nr

I can't figure out a way to receive the number of citations.
Can some good-willing soul help me out with this?
I need the data only for my final paper, and i think i won't be using the query more than 3-4 times :/
thank you in advance

Posts: 140
Joined: Wed Jul 08, 2009 5:51 pm

Re: A query for my MA final paper

Post by nico.rasters » Fri Feb 21, 2014 7:50 pm

It's best not to double post. Not a very easy question btw.

I assume you are using PATSTAT Online?
Instead of the long WHERE (hrm_l2_id= OR OR OR) you could do WHERE `hrm_l2_id` IN (1,2,3,4,etc.)
And those IDs are numbers, so no need to put them within quotes.

Also you should do a LEFT OUTER JOIN with TLS202 because not all patents have a title.

Leave out the ORDER BY (slows things down) and expand the GROUP BY so it lists all non-aggregate fields. Otherwise your results will be non-deterministic aka wrong.

You are looking for FORWARD citations only? You mention "citations received".
And you want to split those in self-citations and the rest? You probably need to use the DOCDB_FAMILY_CITATION table for starters. I assume you don't care about subsidiaries (you count those as "others"), as it's "only" a MA paper.

I'd say start with a single company just to understand the process.
Retrieve all of its patents (and their DOCDB family IDs).
Retrieve all patents citing those patents (and their DOCDB family IDs).
The overlap between these two tables would be self-citations. It's up to you if you want to count each patent as it is, or if you want to count the families.

To do this for all companies on your list you will probably have to download all data to Excel (or STATA, SPSS, whatever) and process it further.
Nico Doranov
Data Manager

Daigu Academic Services & Data Stewardship

Post Reply