Page 1 of 1

Collaborative and non-collaborative patents

Posted: Fri Jun 04, 2021 11:12 am
by Isabel C.
Hello,

I need some help to improve a code that I am using to retrieve ALL French patents involving universities (both collaborative and non-collaborative) from 1978 to 2018. We will have to replicate the same thing for Germany and Italy, for universities and gov non-profit.

To get the collaborative patents, I am using the following code:

Code: Select all

SELECT distinct  tls201_appln.appln_id, appln_auth,appln_nr, appln_filing_year, nb_applicants, nb_inventors, psn_name, tls206_person.NUTS, nuts_label , person_ctry_code, psn_sector, granted ,techn_field.techn_sector,techn_field.techn_field, docdb_family_size, nb_citing_docdb_fam
  FROM tls201_appln
  join tls207_pers_appln on tls201_appln.appln_id = tls207_pers_appln.appln_id
  join tls206_person on tls207_pers_appln.person_id = tls206_person.person_id
  join tls230_appln_techn_field on tls201_appln.appln_id = tls230_appln_techn_field.appln_id
  join (SELECT distinct techn_field_nr, techn_sector, techn_field FROM tls901_techn_field_ipc) techn_field
  on tls230_appln_techn_field.techn_field_nr = techn_field.techn_field_nr
  join tls904_nuts on tls206_person.nuts = tls904_nuts.nuts
WHERE nb_applicants > 1
and
tls201_appln.appln_id in
(select appln_id from tls207_pers_appln join tls206_person on tls207_pers_appln.person_id = tls206_person.person_id where psn_sector in 
([b]'UNIVERSITY',
		'UNIVERSITY GOV NON-PROFIT',
		'UNIVERSITY GOV NON-PROFIT COMPANY',
		'UNIVERSITY HOSPITAL',
		'UNIVERSITY UNIVERSITY'[/b]))
and
tls201_appln.appln_id in
(select appln_id from tls207_pers_appln join tls206_person on tls207_pers_appln.person_id = tls206_person.person_id where psn_sector in 
('[b]GOV NON-PROFIT UNIVERSITY',
		'HOSPITAL',
		'UNIVERSITY',
		'UNIVERSITY HOSPITAL'[/b]
		))
and applt_seq_nr > 0
and appln_filing_year between 1977 and 2018
and appln_auth = 'EP'
and tls201_appln.appln_id in (select appln_id from tls207_pers_appln
                                                     join tls206_person on tls207_pers_appln.person_id = tls206_person.person_id
                                                     where applt_seq_nr > 0 and person_ctry_code ='FR')
and techn_sector = 'Chemistry'
order by appln_filing_year, tls201_appln.appln_id
Yet, we also need to get the patents owned also only by universities. You already helped me develop a code to get all French patents involving universities. The distinction on who is the collaborative actor is not clear (COMPANY? HOSPITAL? etc.). We will probably use the following code to generate an index of output by universities in an econometric model:

Code: Select all

SELECT distinct  tls201_appln.appln_id, appln_auth,appln_nr, appln_filing_year, nb_applicants, nb_inventors,
psn_name, tls206_person.NUTS, nuts_label , person_ctry_code, psn_sector, granted , techn_sector,
STRING_AGG(CAST(techn_field.techn_field AS NVARCHAR(MAX)), CHAR(13) + CHAR(10)) techn_fields
  FROM tls201_appln
  join tls207_pers_appln on tls201_appln.appln_id = tls207_pers_appln.appln_id
  join tls206_person on tls207_pers_appln.person_id = tls206_person.person_id
  join tls230_appln_techn_field on tls201_appln.appln_id = tls230_appln_techn_field.appln_id
  join (SELECT distinct techn_field_nr, techn_sector, techn_field FROM tls901_techn_field_ipc) techn_field
  on tls230_appln_techn_field.techn_field_nr = techn_field.techn_field_nr
  join tls904_nuts on tls206_person.nuts = tls904_nuts.nuts
WHERE psn_sector = 'UNIVERSITY'
and applt_seq_nr > 0
and appln_filing_year between 1977 and 2018
and appln_auth = 'EP'
and person_ctry_code ='DE'
and techn_sector = 'Chemistry'
group by tls201_appln.appln_id, appln_auth,appln_nr, appln_filing_year, nb_applicants, nb_inventors,
psn_name, tls206_person.NUTS, nuts_label , person_ctry_code, psn_sector, granted, techn_sector
order by appln_filing_year, tls201_appln.appln_id
It would be better for my colleagues and me to have already a unique dataset. Nonetheless, if it is not feasible, we will proceed by merging and further cleaning manually.

Last but not least, we would like to get information on the patent family and see where those patents are protected and its quality. We assume that the higher the countries' number where a patent is protected, the higher its economic value. I used docdb_family_size and nb_citing_docdb_fam to get this information. If I am right, the former corresponds to the size of the family of a given application. The same invention disclosed by a common inventor is patented in more than one country. The latter indicates the number of forward citations (simple families citing at least one of the publications or applications of the DOCDB simple family of the current application). Hence, it should be a good indicator of the quality and impact of the patent. Is this correct as a proxy for our assumption or would you have any other suggestion?
On the contrary, granted is assumed to indicate the novelty of the discovery.

Sorry for this very long question. Hope you could help us improve our work.

Best,

I.C.

Re: Collaborative and non-collaborative patents

Posted: Mon Jun 07, 2021 11:15 am
by EPO / PATSTAT Support
Hello Isabel,
here is a simple query that selects all EP patents (labelled "Chemistry") having at least 1 German University as applicant and nothing else. (Some patents have more then 1 German applicant that is labelled as a university: https://worldwide.espacenet.com/patent/ ... ep07000507 )

Code: Select all

SELECT distinct  tls201_appln.appln_id, appln_auth,appln_nr, appln_filing_year, nb_applicants, nb_inventors,
psn_name, tls206_person.NUTS, nuts_label , person_ctry_code, psn_sector, granted , techn_sector,
STRING_AGG(CAST(techn_field.techn_field AS NVARCHAR(MAX)), CHAR(13) + CHAR(10)) techn_fields
  FROM tls201_appln
  join tls207_pers_appln on tls201_appln.appln_id = tls207_pers_appln.appln_id
  join tls206_person on tls207_pers_appln.person_id = tls206_person.person_id
  join tls230_appln_techn_field on tls201_appln.appln_id = tls230_appln_techn_field.appln_id
  join (SELECT distinct techn_field_nr, techn_sector, techn_field FROM tls901_techn_field_ipc) techn_field
  on tls230_appln_techn_field.techn_field_nr = techn_field.techn_field_nr
  join tls904_nuts on tls206_person.nuts = tls904_nuts.nuts
WHERE psn_sector = 'UNIVERSITY'
and applt_seq_nr > 0
and appln_filing_year between 1977 and 2018
and appln_auth = 'EP'
and person_ctry_code ='DE'
and techn_sector = 'Chemistry'
and tls201_appln.appln_id not in 
(select appln_id from tls207_pers_appln join tls206_person on tls206_person.person_id = tls207_pers_appln.person_id and applt_seq_nr > 0
where psn_sector in (' ',
'COMPANY',
'COMPANY GOV NON-PROFIT',
'COMPANY GOV NON-PROFIT UNIVERSITY',
'COMPANY HOSPITAL',
'COMPANY INDIVIDUAL',
'COMPANY UNIVERSITY',
'GOV NON-PROFIT',
'GOV NON-PROFIT HOSPITAL',
'GOV NON-PROFIT UNIVERSITY',
'HOSPITAL',
'INDIVIDUAL',
'UNKNOWN'))
group by tls201_appln.appln_id, appln_auth,appln_nr, appln_filing_year, nb_applicants, nb_inventors,
psn_name, tls206_person.NUTS, nuts_label , person_ctry_code, psn_sector, granted, techn_sector
Extracting a data set is very easy in PATSTAT, just go to the download section and download a subset in CSV or MS ACCESS format. You can then use that data set for whatever other analysis you want to do.

On the family information: the docdb_family_size (in tls201) gives indeed the number of family members. It is generally accepted (and confirmed in many research papers) that a larger family is a proxy of value because the applicant "is willing to invest resources" in protecting his invention in a larger geographical region.
Forward citations have nothing to do with the family as such, although larger families probably receive more forward citations as they are more visible (in different languages). There are many forum posts on forward patent citations.

Re: Collaborative and non-collaborative patents

Posted: Mon Jun 07, 2021 11:36 am
by Isabel C.
Hello,

Thank you for the answer. I know how to download from PATSTAT. I already extracted many data from this database. Yet, this time I needed clarifications on how to improve the code. I need both collaborative and non-collaborative data. In the first code that I uploaded, I can get all the info for universities co-patenting with another actor (GOV NON-PROFIT, COMPANY, etc.). With the second, code I already get all the patents owned only by one university and owned by a university and another actor, which I cannot see.
I would like to improve the first code by getting in one unique dataset also non-collaborative patents. That is I need to see in one unique dataset all patents involving universities and when there are two applicants or more, who is the external collaborator.

Thank you very much for the clarification on the forward citation.

Re: Collaborative and non-collaborative patents

Posted: Mon Jun 07, 2021 12:25 pm
by EPO / PATSTAT Support
The queries are all variations on the same theme; in the WHERE part of the SQL query, you specify what "group" of patents the final result list should look like. The "AND tls201_appln.appln_id in (select.....)" defines intersections of samples of patents.
So here is a query that selects all the patents that have at least 1 university and at least one non-university (this can be a physical person, like the professor working at the university); and it shows in the final list ALL the applicants.

Code: Select all

SELECT distinct  tls201_appln.appln_id, appln_auth,appln_nr, appln_filing_year, nb_applicants, nb_inventors,
psn_name, tls206_person.NUTS, nuts_label , person_ctry_code, psn_sector, granted , techn_sector,
STRING_AGG(CAST(techn_field.techn_field AS NVARCHAR(MAX)), CHAR(13) + CHAR(10)) techn_fields
  FROM tls201_appln
  join tls207_pers_appln on tls201_appln.appln_id = tls207_pers_appln.appln_id
  join tls206_person on tls207_pers_appln.person_id = tls206_person.person_id
  join tls230_appln_techn_field on tls201_appln.appln_id = tls230_appln_techn_field.appln_id
  join (SELECT distinct techn_field_nr, techn_sector, techn_field FROM tls901_techn_field_ipc) techn_field
  on tls230_appln_techn_field.techn_field_nr = techn_field.techn_field_nr
  join tls904_nuts on tls206_person.nuts = tls904_nuts.nuts
WHERE
appln_filing_year between 1977 and 2018
and appln_auth = 'EP'
and techn_sector = 'Chemistry'
and applt_seq_nr > 0
and invt_seq_nr = 0 --remove this line if you als want to see more applicants/inventors in your list.
and person_ctry_code = 'DE'

and  tls201_appln.appln_id in
(select appln_id from tls207_pers_appln join tls206_person on tls206_person.person_id = tls207_pers_appln.person_id and applt_seq_nr > 0
where psn_sector in ('university') and person_ctry_code = 'DE' and applt_seq_nr > 0 and invt_seq_nr = 0)

and tls201_appln.appln_id in 
(select appln_id from tls207_pers_appln join tls206_person on tls206_person.person_id = tls207_pers_appln.person_id 
where psn_sector in (' ',
'COMPANY',
'COMPANY GOV NON-PROFIT',
'COMPANY GOV NON-PROFIT UNIVERSITY',
'COMPANY HOSPITAL',
'COMPANY INDIVIDUAL',
'COMPANY UNIVERSITY',
'GOV NON-PROFIT',
'GOV NON-PROFIT HOSPITAL',
'GOV NON-PROFIT UNIVERSITY',
'HOSPITAL',
'INDIVIDUAL',
'UNKNOWN') and person_ctry_code = 'DE' and applt_seq_nr > 0 and invt_seq_nr= 0)

group by tls201_appln.appln_id, appln_auth,appln_nr, appln_filing_year, nb_applicants, nb_inventors,
psn_name, tls206_person.NUTS, nuts_label , person_ctry_code, psn_sector, granted, techn_sector

Re: Collaborative and non-collaborative patents

Posted: Sat Jul 03, 2021 1:17 pm
by jackyjoy123
Isabel C. wrote:
Mon Jun 07, 2021 11:36 am
Hello,

Thank you for the answer. I know how to download from PATSTAT. I already extracted many data from this database. Yet, this time I needed clarifications on how to improve the code. I need both collaborative and non-collaborative data. In the first code that I uploaded, I can get all the info for universities co-patenting with another actor (GOV NON-PROFIT, COMPANY, etc.). With the second, code I already get all the patents owned only by one university and owned by a university and another actor, which I cannot see.
I would like to improve the first code by getting in one unique dataset also non-collaborative patents. That is I need to see in one unique dataset all patents involving universities and when there are two applicants or more, who is the external collaborator.

Thank you very much for the clarification on the forward citation.
thanks my issue has been fixed.