SQL Query on Patent Data from 2005 - 2017

Here you can post your opinions, ask questions and share experiences on the PATSTAT product line. Please always indicate the PATSTAT edition (e.g. 2015 Autumn Edition) and the database (e.g. PATSTAT Online, MySQL, MS SQL Server, ...) you are using.
Post Reply

Lennart_O
Posts: 1
Joined: Fri Sep 20, 2019 1:19 pm

SQL Query on Patent Data from 2005 - 2017

Post by Lennart_O » Fri Sep 20, 2019 1:30 pm

Hello!

I am using PATSTAT Online with the Spring 2019 data base.

I want to extract patent data on a firm-level. I have tried to apply the SQL Query, which I found in the Help document ("5 easy to understand SQL queries") with the only adjustment that I deleted the italic entry as I am interested in patents by all companies.

SELECT tls201_appln.appln_id, appln_auth +appln_nr + appln_kind number, appln_filing_date, psn_name, appln_nr_epodoc, appln_title
FROM tls201_appln JOIN tls207_pers_appln ON tls201_appln.appln_id = tls207_pers_appln.appln_id
JOIN tls206_person ON tls207_pers_appln.person_id = tls206_person.person_id
JOIN tls202_appln_title ON tls202_appln_title.appln_id = tls201_appln.appln_id
WHERE psn_name = 'NOKIA SOLUTIONS AND NETWORKS'
-- all application filed by = 'NOKIA SOLUTIONS AND NETWORKS'
ORDER BY appln_filing_date desc -- orders the results on the application filing date

However this returns a data table that does not include the company name.

Just as an FYI I am intending to create a data file that displays for each year from 2005-2015:
- company name of the applicant
- the company's NACE code
- the ECLA code of the patent

- all limited to EU member states.

I am also not confident whether the standardized company name is sufficient as I have a lot of smaller companies. Can you help out here?

Thank you for your help and sorry if this is a rather trivial task, but my experience with SQL is very limited.

Kind regards
Lennart


EPO / PATSTAT Support
Posts: 425
Joined: Thu Feb 22, 2007 5:33 pm
Contact:

Re: SQL Query on Patent Data from 2005 - 2017

Post by EPO / PATSTAT Support » Tue Sep 24, 2019 11:10 am

Hello Lennart,
a couple of small observations:
In PATSTAT we used the IPC/NACE concordance table to link NACE codes to patent applications.
In fact, 1 application can belong to multiple NACE codes and will then be given a weight. (table tls229)
Non EPO data sources mostly assign a (or multiple) NACE codes to the company / applicant. But such is not the case in PATSTAT. Assigning NACE codes at application level gives a much more granular overview on the various industrial areas a company is active in. Researchers can create their own concordance table by for example assigning 1 NACE code based on an application count for the respective companies. We have done this for this study:
ip_intensive_industries_en.pdf
(1.02 MiB) Downloaded 261 times
About ECLA: this has been replaced by CPC.
1 patent application can have (and mostly has) multiple CPC codes assigned. If you aggregate applications at the same time at CPC and NACE level, then you will have multiple rows (sometimes > 20) for each application. I aggregated the CPC codes at 4 digits to reduce the number, but this kind of query will always give many rows for 1 application.

If your data is aggregated at application level, then you can just as well use the original applicant name as filed. In the example, I kept both to illustrate the difference; and this does not create extra rows.
Your conditions "all limited to EU member states"; I assume you mean the country of the applicants.
I further limited the query to EP applications to reduce the scope.

The query below will give you a data table more or less to your description. But as it results in more then 2.1 million rows, I would apply more stringent conditions, or maybe remove the CPC table/data from the query if you don't intend to use if for further analysis.

Code: Select all

SELECT tls201_appln.appln_id, appln_auth +appln_nr + appln_kind number, appln_filing_date, psn_name,
person_name, left (cpc_class_symbol, 4) as CPC, tls229_appln_nace2.nace2_code, nace_descrip.nace2_descr
FROM tls201_appln JOIN tls207_pers_appln ON tls201_appln.appln_id = tls207_pers_appln.appln_id
JOIN tls206_person ON tls207_pers_appln.person_id = tls206_person.person_id
JOIN tls224_appln_cpc on tls201_appln .appln_id = tls224_appln_cpc.appln_id 
join tls229_appln_nace2 on tls201_appln.appln_id = tls229_appln_nace2.appln_id
join (SELECT distinct nace2_code,nace2_descr FROM tls902_ipc_nace2) 
nace_descrip on tls229_appln_nace2.nace2_code = nace_descrip.nace2_code
WHERE appln_filing_year between 2005 and 2015
and appln_auth = 'EP'
and person_ctry_code  in (select  ctry_code from tls801_country where eu_member = 'Y')
and applt_seq_nr > 0 and invt_seq_nr = 0 -- excluding pure inventors and inventor/applicants 
group by tls201_appln.appln_id, appln_auth +appln_nr + appln_kind, appln_filing_date, psn_name,
person_name, left (cpc_class_symbol, 4), tls229_appln_nace2.nace2_code, nace_descrip.nace2_descr
ORDER BY psn_name, appln_filing_date, tls201_appln.appln_id, left (cpc_class_symbol,4)
,tls229_appln_nace2.nace2_code, nace_descrip.nace2_descr
PATSTAT Support Team
EPO - Vienna
patstat @ epo.org


Post Reply