US Patents per company and year - SQL issue

Here you can post your opinions, ask questions and share experiences on the PATSTAT product line. Please always indicate the PATSTAT edition (e.g. 2015 Autumn Edition) and the database (e.g. PATSTAT Online, MySQL, MS SQL Server, ...) you are using.
Post Reply

Research_Account
Posts: 1
Joined: Thu Jun 11, 2020 4:15 pm

US Patents per company and year - SQL issue

Post by Research_Account » Thu Jun 11, 2020 4:46 pm

Dear all,

I am currently writing a research paper on corporate innovativeness in the US and trying to retrieve patent data from PATSTAT Online (Spring 2020), instead of using the query form provided by the USPTO.

However, I am currently struggling a bit with the SQL coding and hoping to find some help here.

My aim is to collect the number of patent application per US company per year (for the period of 2013-2019). For example:

Company name - Fiscal year - Number of patent applications
Microsoft Corporation - 2013 - 2108
Microsoft Corporation - 2014 - 1496
Microsoft Corporation - 2015 - …
Microsoft Corporation - …
Agilent Technologies - 2013 - 108
Agilent Technologies - 2014 - …
Agilent Technologies - ...

So far, I managed to work out the following code:

Code: Select all

SELECT psn_name,
count(distinct((case when appln_filing_year = 2013 then tls201_appln.appln_id end))) as "2013",
count(distinct((case when appln_filing_year = 2014 then tls201_appln.appln_id end))) as "2014",
count(distinct((case when appln_filing_year = 2015 then tls201_appln.appln_id end))) as "2015",
count(distinct((case when appln_filing_year = 2016 then tls201_appln.appln_id end))) as "2016",
count(distinct((case when appln_filing_year = 2017 then tls201_appln.appln_id end))) as "2017",
count(distinct((case when appln_filing_year = 2018 then tls201_appln.appln_id end))) as "2018",
count(distinct((case when appln_filing_year = 2019 then tls201_appln.appln_id end))) as "2019",
count(distinct(tls201_appln.appln_id )) as "total"
FROM tls201_appln 
  join tls207_pers_appln on tls201_appln.appln_id = tls207_pers_appln.appln_id
  join tls206_person on tls207_pers_appln.person_id = tls206_person.person_id
where 
(psn_name like 'MICROSOFT CORP%' or psn_name like 'AGILENT TECHNO%')
and applt_seq_nr > 0
and (appln_auth = 'US' or receiving_office = ‘US’)
group by psn_name
order by psn_name
However, I am facing now two key issues that I would really appreciate to receive some help with:
  1. Starting with year 2015 onwards, the count of patents is drastically lower in comparison to numbers for 2013, 2014 (e.g. Microsoft 437 in 2015 vs 3128 in 2013) as well as in comparison with data provided by the USPTO (e.g., Microsoft 437 in 2015 via PATSTAT vs 2296 in 2015 via USPTO). Am I missing anything in my code to ensure the number of patents revealed via PASTAT is similar to the numbers provided by USPTO?
  2. As my sample of US companies exceeds 1200, is there an easier way to access the patent data for these corporations besides putting the individual’s name in? It would also be absolutely fine if the list just contains the counts of US patent applications for the period from 2013 to 2019 for all US companies, and I extract the necessary companies in Excel at a later stage.
Thank you so much in advance.
Best, Katharina


EPO / PATSTAT Support
Posts: 433
Joined: Thu Feb 22, 2007 5:33 pm
Contact:

Re: US Patents per company and year - SQL issue

Post by EPO / PATSTAT Support » Fri Jun 19, 2020 10:07 am

Dear Katharina,
your query is fine, and it is indeed good practice to question results when they do not seem to fit the expected pattern.
In this case, the applicant "Microsoft Corp" is being phased out - and replaced by "MICROSOFT TECHNOLOGY LICENSING" as the main applicant name used by Microsoft to file its patents.
Have a look at this:

Code: Select all

SELECT psn_name,
count(distinct((case when appln_filing_year = 2013 then tls201_appln.appln_id end))) as "2013",
count(distinct((case when appln_filing_year = 2014 then tls201_appln.appln_id end))) as "2014",
count(distinct((case when appln_filing_year = 2015 then tls201_appln.appln_id end))) as "2015",
count(distinct((case when appln_filing_year = 2016 then tls201_appln.appln_id end))) as "2016",
count(distinct((case when appln_filing_year = 2017 then tls201_appln.appln_id end))) as "2017",
count(distinct((case when appln_filing_year = 2018 then tls201_appln.appln_id end))) as "2018",
count(distinct((case when appln_filing_year = 2019 then tls201_appln.appln_id end))) as "2019",
count(distinct(tls201_appln.appln_id )) as "total"
FROM tls201_appln 
  join tls207_pers_appln on tls201_appln.appln_id = tls207_pers_appln.appln_id
  join tls206_person on tls207_pers_appln.person_id = tls206_person.person_id
where 
(psn_name like 'MICROSOFt%' or psn_name like 'AGILENT TECHNO%')
and applt_seq_nr > 0
and (appln_auth = 'US' or receiving_office = 'US')
group by psn_name
order by psn_name
One could of course question whether the PATSTAT team should "harmonise" further the applicant names and group them; but doing so would also mean that some applicant filing behaviour would disappear in obscurity.
Many multinationals have created separate legal entities to manage their IP. In Europe this is very visible in the pharmaceutical sector, where companies have IP holding entities in Switzerland (with high turnover and low numbers of staff).
"Patent boxes" are partly the reason for such strategy. (https://qdd.oecd.org/data/IP_Regimes)
On your second question: indeed, putting the name of 1200 companies is not possible in PATSTAT Online.
On a locally installed PATSTAT data base, one would work with a file containing the company names, and then use some kind of matching algorithm.
On PATSTAT Online, I would remove the company name and wild cards, and extract a list of companies based on a criteria based on number of filings. You can adapt the "HAVING"clause conditions to your needs.
Something like this:

Code: Select all

SELECT psn_name,
count(distinct((case when appln_filing_year = 2013 then tls201_appln.appln_id end))) as "2013",
count(distinct((case when appln_filing_year = 2014 then tls201_appln.appln_id end))) as "2014",
count(distinct((case when appln_filing_year = 2015 then tls201_appln.appln_id end))) as "2015",
count(distinct((case when appln_filing_year = 2016 then tls201_appln.appln_id end))) as "2016",
count(distinct((case when appln_filing_year = 2017 then tls201_appln.appln_id end))) as "2017",
count(distinct((case when appln_filing_year = 2018 then tls201_appln.appln_id end))) as "2018",
count(distinct((case when appln_filing_year = 2019 then tls201_appln.appln_id end))) as "2019",
count(distinct((case when appln_filing_year between 2013 and 2019 then tls201_appln.appln_id end))) as "total_2013-2019"
FROM tls201_appln 
  join tls207_pers_appln on tls201_appln.appln_id = tls207_pers_appln.appln_id
  join tls206_person on tls207_pers_appln.person_id = tls206_person.person_id
where 
applt_seq_nr > 0 and invt_seq_nr = 0
and (appln_auth = 'US' or receiving_office = 'US')
and appln_filing_year between 2013 and 2019
group by psn_name
having count(distinct((case when appln_filing_year between 2013 and 2019 then tls201_appln.appln_id end))) > 20
order by psn_name
You can then export the list and do further selecting or grouping in for example Excel.
PATSTAT Support Team
EPO - Vienna
patstat @ epo.org


Post Reply