I am currently writing a research paper on corporate innovativeness in the US and trying to retrieve patent data from PATSTAT Online (Spring 2020), instead of using the query form provided by the USPTO.
However, I am currently struggling a bit with the SQL coding and hoping to find some help here.
My aim is to collect the number of patent application per US company per year (for the period of 2013-2019). For example:
Company name - Fiscal year - Number of patent applications
Microsoft Corporation - 2013 - 2108
Microsoft Corporation - 2014 - 1496
Microsoft Corporation - 2015 - …
Microsoft Corporation - …
Agilent Technologies - 2013 - 108
Agilent Technologies - 2014 - …
Agilent Technologies - ...
So far, I managed to work out the following code:
Code: Select all
SELECT psn_name,
count(distinct((case when appln_filing_year = 2013 then tls201_appln.appln_id end))) as "2013",
count(distinct((case when appln_filing_year = 2014 then tls201_appln.appln_id end))) as "2014",
count(distinct((case when appln_filing_year = 2015 then tls201_appln.appln_id end))) as "2015",
count(distinct((case when appln_filing_year = 2016 then tls201_appln.appln_id end))) as "2016",
count(distinct((case when appln_filing_year = 2017 then tls201_appln.appln_id end))) as "2017",
count(distinct((case when appln_filing_year = 2018 then tls201_appln.appln_id end))) as "2018",
count(distinct((case when appln_filing_year = 2019 then tls201_appln.appln_id end))) as "2019",
count(distinct(tls201_appln.appln_id )) as "total"
FROM tls201_appln
join tls207_pers_appln on tls201_appln.appln_id = tls207_pers_appln.appln_id
join tls206_person on tls207_pers_appln.person_id = tls206_person.person_id
where
(psn_name like 'MICROSOFT CORP%' or psn_name like 'AGILENT TECHNO%')
and applt_seq_nr > 0
and (appln_auth = 'US' or receiving_office = ‘US’)
group by psn_name
order by psn_name
- Starting with year 2015 onwards, the count of patents is drastically lower in comparison to numbers for 2013, 2014 (e.g. Microsoft 437 in 2015 vs 3128 in 2013) as well as in comparison with data provided by the USPTO (e.g., Microsoft 437 in 2015 via PATSTAT vs 2296 in 2015 via USPTO). Am I missing anything in my code to ensure the number of patents revealed via PASTAT is similar to the numbers provided by USPTO?
- As my sample of US companies exceeds 1200, is there an easier way to access the patent data for these corporations besides putting the individual’s name in? It would also be absolutely fine if the list just contains the counts of US patent applications for the period from 2013 to 2019 for all US companies, and I extract the necessary companies in Excel at a later stage.
Best, Katharina