Backward and forward citations; Problems with syntax

Here you can post your opinions, ask questions and share experiences on the PATSTAT product line. Please always indicate the PATSTAT edition (e.g. 2015 Autumn Edition) and the database (e.g. PATSTAT Online, MySQL, MS SQL Server, ...) you are using.
Post Reply

lukassbe
Posts: 2
Joined: Fri Jul 18, 2014 4:41 pm

Backward and forward citations; Problems with syntax

Post by lukassbe » Fri Jul 18, 2014 6:29 pm

I am currently working on my thesis, where I identified a sample of firms. For each of those firms I need the patents and the corresponding backward and forward citation. My plan is to be able to create a patent portfolio for each firm that allows me to construct aggregated variables. Therefore I want a query that results in a list of patents from all of my firms.

I have 3 problems I am struggling with and some help would really be appreciated.

1. The query below does not seem to function, due to an error in the syntax. I cant seem to find error in it .....
(This query is not complete, in order to get faster results. For the final query I would leave out the WHERE restriction on country, the LIMIT to 10 and then I would also add all the names from my sample in a similar fashion)

Code: Select all

SELECT f.appln_filing_year, f.appln_id, f.appln_filing_date, f.nb_citations, f.granted, d.person_id, d.doc_std_name, d.doc_std_name_id, a.*

FROM tls212_citation a 
JOIN tls211_pat_publn b ON a.pat_publn_id = b.pat_publn_id
JOIN tls227_pers_publn c ON b.pat_publn_id = c.pat_publn_id
JOIN tls206_person d ON c.person_id = d.person_id
JOIN tls207_pers_appln e ON d.person_id = e.person_id
JOIN tls201_appln f ON e.appln_id = f.appln_id

WHERE d.person_ctry_code = 'AT'
AND f.ipr_type = 'PI'
AND (d.doc_std_name LIKE ‘%novartis%’
		OR d.doc_std_name LIKE ‘%vaillant%’)

ORDER BY f.appln_filing_year 
LIMIT 10
2. I am still missing the forward and backward citations for the patents, yet I do not know how to solve this issue, although I read the older posts about this topic. How can this be done?


3. My sample consists of US companies and due to the nature of my research it would be necessary to retrieve data ranging from around 1990 until today. Is such data available for the US?
(I am using PATSTAT Online)

It would really help me out a lot if anyone is able to help me with some of my problems. Thank you in advance.

Lukas


nico.rasters
Posts: 140
Joined: Wed Jul 08, 2009 5:51 pm
Contact:

Re: Backward and forward citations; Problems with syntax

Post by nico.rasters » Fri Aug 08, 2014 4:14 pm

I prefer to split these kind of queries into subqueries, using intermediate tables.
I'm using PATSTAT October 2013 btw.

Step 1: Finding all patent by Novartis. Which Novartis? There are so many...

SELECT `tls208_doc_std_nms`.`doc_std_name_id`
, `tls208_doc_std_nms`.`doc_std_name`
, COUNT(DISTINCT(`tls207_pers_appln`.`appln_id`))
FROM `tls208_doc_std_nms`
INNER JOIN `tls206_person` ON `tls208_doc_std_nms`.`doc_std_name_id` = `tls206_person`.`doc_std_name_id`
INNER JOIN `tls207_pers_appln` ON `tls206_person`.`person_id` = `tls207_pers_appln`.`person_id`
WHERE `doc_std_name` LIKE "%novartis%"
GROUP BY `tls208_doc_std_nms`.`doc_std_name_id`
, `tls208_doc_std_nms`.`doc_std_name`
ORDER BY COUNT(DISTINCT(`tls207_pers_appln`.`appln_id`)) DESC
LIMIT 10;

NOVARTIS AG with 38214 is your best bet. Its doc_std_name_id is 3396.
Create an intermediate table called "Novartis" and store the application IDs in it.
You can also call your table "Focals" and store the applications IDs of all your firms in it, but I'd like to keep this example short and simple.

INSERT INTO `Novartis` (`appln_id`)
SELECT DISTINCT `tls207_pers_appln`.`appln_id`
FROM `tls206_person`
INNER JOIN `tls207_pers_appln` ON `tls206_person`.`person_id` = `tls207_pers_appln`.`person_id`
WHERE `doc_std_name_id`=3396 AND `tls207_pers_appln`.`applt_seq_nr`>0;
# 38,203 total

You could add: AND `tls206_person`.`person_ctry_code` IN ("AT","") to the query above, or just have faith in the standardisation procedure.

Now that we have all Novartis patents, we can focus on those citations which are stored in tls212.
Every application has one or more publications, and every publication has zero or more citations.
There is a double count in citations because of `citn_origin`. In other words, the same citation may occur more than once if has more than one origin (source).

The first application that I found has id 1561.

SELECT `pat_publn_id` FROM `tls211_pat_publn` WHERE `appln_id` =1561;
This returns 2510 and 2511 (both EP but different kind codes).

Backward and forward citations:
SELECT DISTINCT `cited_appln_id` FROM `tls212_citation` WHERE `pat_publn_id` IN (2510, 2511) AND `pat_citn_seq_nr`>0;
SELECT DISTINCT `pat_publn_id` FROM `tls212_citation` WHERE `cited_appln_id` IN (2510, 2511) AND `pat_citn_seq_nr`>0;

I would suggest to build an intermediate table called `bf_citations` for this, with the following fields:
`ID`, `citing_publn_id`, `citing_appln_id`, `cited_publn_id`, `cited_appln_id`

Simply copy tls212, delete the NPL citations, and update the citing_ and cited_ appln_ids by joining with tls211.

US data should be available, but PATSTAT coverage is only decent up to the release data minus two years.
________________________________________
Nico Doranov
Data Manager

Daigu Academic Services & Data Stewardship
http://www.daigu.nl/


curcitiz
Posts: 2
Joined: Thu Nov 27, 2014 11:56 am

Re: Backward and forward citations; Problems with syntax

Post by curcitiz » Fri Nov 28, 2014 9:27 am

I am doing a research for my MSc, and I will need those information for a bunch of firms operating in the wind energy industry.

1) Publication number
2) Publication date year: patent from 2006 to 2010 included.
3) Name of the applicant (in my case will be the firm (example VESTAS). I know there are a lot of VESTAS but I will delete manually the ones I don't need)
4) Cited by (So how many another patents cited the one I am viewing. I will use this data to assess the quality of that patent).

I try already to put some queries in PATSTAT, and I was suggested to use GPI but it always show me data I don't need and moreover I m not able to download the data in excel in order to delete the noisy results.

Could you please help me out with the query (PATSTAT or even better GPI)?

Thank you very much


Patrick Le Gonidec
Posts: 23
Joined: Mon Nov 17, 2014 1:31 pm

Re: Backward and forward citations; Problems with syntax

Post by Patrick Le Gonidec » Thu Dec 11, 2014 10:33 am

A possible GPI query would be:
pud[2006,2010] and app=vestas
Today, it returns 1.528 publications (corresponding to 504 simple families and 1.277 applications)

In terms of citation data, a GPI document includes:
- Backward citations: They are searchable and can be added in your result list for download
- Forward citations: They are not searchable and are include on-the-fly when displaying a GPI document. They cannot be included in your result list.

In terms of download:
- Current download limit is 1.500 items (GPI documents or result list entries)
- Result list download formats are: PDF HTML XLS CSV XML - let me know the kind of issue you had when downloading in XLS.
- Document download formats are: PDF RTF XML - PDF and RTF include forward citations.
- Result list and document contents can be customised for display and download (e.g. add new columns in your result list, remove data in a document).

For future posts about GPI, please use the newly created GPI forum http://forums.epo.org/global-patent-index-gpi/


Geert Boedt
Posts: 176
Joined: Tue Oct 19, 2004 10:36 am
Location: Vienna

Re: Backward and forward citations; Problems with syntax

Post by Geert Boedt » Thu Jan 08, 2015 10:47 am

@ curcitiz

Additional to the previous replies, I would like to shortly explain how it can be done in PATSTAT on-line.
Your question:
1) Publication number
2) Publication date year: patent from 2006 to 2010 included.
3) Name of the applicant (in my case will be the firm (example VESTAS). I know there are a lot of VESTAS but I will delete manually the ones I don't need)
4) Cited by (So how many another patents cited the one I am viewing. I will use this data to assess the quality of that patent).


Small observation: most researchers will NOT count citations based on publications, but based on (simple) patent families. This is called family-family citations. The main reason to do a family-family citation count is to avoid introduction of artefacts based on family size or citation practices in patent offices (by patent examiners). A family-family citation is simply looking at "how many inventions, cite another invention", independent from the geographical scope of the protection (family size). The concept of family-family citations is well explained in the "OECD Patent Statistics Manual": http://www.oecd.org/science/inno/oecdpa ... manual.htm

Because the EPO knows that many researchers are interest in this kind of data, we have pre-calculated it for you in PATSTAT on-line. Each application in the TLS201_APPLN table has an attribute NB_CITATIONS that holds the number of family citations for the family to which that specific application belongs. (each application belongs to one and only one simple family, so this value is the same for each simple family member).

The query below will give you a list of all applications filed > 2004 that have the applicant name starting with "Vestas", an the number of times this application has been cited by another family.

Code: Select all

Select  tls201_appln.appln_id, tls201_appln.appln_auth, tls201_appln.appln_nr, tls201_appln.appln_kind, tls201_appln.appln_filing_date, tls201_appln.nb_citations, tls206_person.hrm_l2  from
tls201_appln join tls207_pers_appln on tls201_appln.appln_id = tls207_pers_appln.appln_id
join tls206_person on tls207_pers_appln.person_id = tls206_person.person_id
where applt_seq_nr > 0 and hrm_l2 like 'vestas%' and tls201_appln.appln_filing_year > '2004'
group by tls201_appln.appln_id, tls201_appln.appln_auth, tls201_appln.appln_nr, tls201_appln.appln_kind, tls201_appln.appln_filing_date, tls201_appln.nb_citations, tls206_person.hrm_l2 
order by  tls201_appln.nb_citations desc
Geert Boedt
Patent information for business use
EPO Vienna
Best regards,

Geert Boedt
PATSTAT support
Business Use of Patent Information
EPO Vienna


Post Reply