Multiple companies

Here you can post your opinions, ask questions and share experiences on the PATSTAT product line. Please always indicate the PATSTAT edition (e.g. 2015 Autumn Edition) and the database (e.g. PATSTAT Online, MySQL, MS SQL Server, ...) you are using.
Post Reply

stanpat
Posts: 1
Joined: Sat Feb 15, 2020 11:01 pm

Multiple companies

Post by stanpat » Sat Feb 15, 2020 11:14 pm

Hello guys,

I'm not sure if this is possible at all, was hoping that some of you could provide some guidance on the feasibility of my idea. I have an excel file with the names of companies (around 3,000) and different dates for each of the companies. I need to find the number of patents and citations (by technology class) for each of those companies 5 years before and 5 years after those dates (each company has a different unique date). I was wondering if there is a way to avoid doing it manually for each of the companies?

Cheers


EPO / PATSTAT Support
Posts: 230
Joined: Thu Feb 22, 2007 5:33 pm
Contact:

Re: Multiple companies

Post by EPO / PATSTAT Support » Thu Feb 20, 2020 8:39 am

Hello Stanpat,
The data is there, the problem is to link your list with company names to the applicant names in the tls206_person table from PATSTAT.
With PATSTAT Online, this is basically not possible as you can not upload your own data into the hosted database to start your matching and data aggregation process. Currently, the only option is to get a local PATSTAT installation. Are there common characteristics amongst the 3000 companies that would allow grouping them ?
PATSTAT Support Team
EPO - Vienna
patstat @ epo.org


Rutgervtspijker
Posts: 3
Joined: Mon Apr 27, 2020 12:09 pm

Re: Multiple companies

Post by Rutgervtspijker » Mon Apr 27, 2020 5:41 pm

Hi,

My question relates to the above question. I would need the same amount of data for around 3000 companies. However, by doing this manually it could be something like this, with one applicant at the time.

SELECT psn_name ,
COUNT(distinct(case when year(appln_filing_date) = 2002 then tls201_appln.appln_id end)) as "2002",
COUNT(distinct(case when year(appln_filing_date) = 2003 then tls201_appln.appln_id end)) as "2003",
COUNT(distinct(case when year(appln_filing_date) = 2004 then tls201_appln.appln_id end)) as "2004",
COUNT(distinct(case when year(appln_filing_date) = 2005 then tls201_appln.appln_id end)) as "2005",
COUNT(distinct(case when year(appln_filing_date) = 2006 then tls201_appln.appln_id end)) as "2006",
COUNT(distinct(case when year(appln_filing_date) = 2007 then tls201_appln.appln_id end)) as "2007",
COUNT(distinct(case when year(appln_filing_date) = 2008 then tls201_appln.appln_id end)) as "2008",
COUNT(distinct(case when year(appln_filing_date) = 2009 then tls201_appln.appln_id end)) as "2009",
COUNT(distinct(case when year(appln_filing_date) = 2010 then tls201_appln.appln_id end)) as "2010",
COUNT(distinct(case when year(appln_filing_date) = 2011 then tls201_appln.appln_id end)) as "2011",
COUNT(distinct(case when year(appln_filing_date) = 2012 then tls201_appln.appln_id end)) as "2012",
COUNT(distinct(case when year(appln_filing_date) = 2013 then tls201_appln.appln_id end)) as "2013",
COUNT(distinct(case when year(appln_filing_date) = 2014 then tls201_appln.appln_id end)) as "2014",
COUNT(distinct(case when year(appln_filing_date) = 2015 then tls201_appln.appln_id end)) as "2015",
COUNT(distinct(case when year(appln_filing_date) = 2016 then tls201_appln.appln_id end)) as "2016",
COUNT(distinct(case when year(appln_filing_date) = 2017 then tls201_appln.appln_id end)) as "2017",
COUNT(distinct(case when year(appln_filing_date) = 2018 then tls201_appln.appln_id end)) as "2018",
COUNT(distinct(case when year(appln_filing_date) = 2019 then tls201_appln.appln_id end)) as "2019"
FROM tls201_appln
join tls207_pers_appln on tls201_appln.appln_id = tls207_pers_appln.appln_id
join tls206_person on tls206_person.person_id = tls207_pers_appln.person_id
where psn_name = 'samsung' and applt_seq_nr > 0 and appln_filing_year between 2002 and 2020
group by psn_name
order by psn_name

I would really like to link this to citations, if possible. My question to you guys is, is this a feasible way to do it and besides count can I link this to citations? I am new to patstat so any help is welcome

Best regards,
Rutger


EPO / PATSTAT Support
Posts: 230
Joined: Thu Feb 22, 2007 5:33 pm
Contact:

Re: Multiple companies

Post by EPO / PATSTAT Support » Tue Apr 28, 2020 9:29 am

Hello Rutger,
doing an analysis on a user-defined list of 3000 companies is not possible in PATSTAT Online because users can not upload their own list to our hosted data base.
Options are: you purchase PATSTAT and install it on a local server.
Or you find a "best common denominator in the patent data" that would group the majority of your companies. The most obvious candidates would be "the country" of the applicant or maybe an IPC code that defines a technology those 3000 companies have in common.
In the example: Samsung being a Korean company, one could make a table for all Korean companies, and then sort & group out what you need after you have downloaded the results.
Something like the query below.
Pitfall: one has to keep in mind that the precision of the numbers depends on the precision of the PSN name and the completeness of the data. In this case, applications filed in China by Korean applicants will hardly be in the data because the applicant country data received from China is incomplete.
I also filtered away all applicants that have less then 20 applications, but this might result in some applicants being filtered out because of spelling variations that have not been picked up by the PSN process. So to improve quality, set the filter at a lower valued, but you will get more manual work to do in cleaning & grouping names that obviously belong together.

Code: Select all

COUNT(distinct(case when year(appln_filing_date) = 2003 then tls201_appln.appln_id end)) as "2003",
COUNT(distinct(case when year(appln_filing_date) = 2004 then tls201_appln.appln_id end)) as "2004",
COUNT(distinct(case when year(appln_filing_date) = 2005 then tls201_appln.appln_id end)) as "2005",
COUNT(distinct(case when year(appln_filing_date) = 2006 then tls201_appln.appln_id end)) as "2006",
COUNT(distinct(case when year(appln_filing_date) = 2007 then tls201_appln.appln_id end)) as "2007",
COUNT(distinct(case when year(appln_filing_date) = 2008 then tls201_appln.appln_id end)) as "2008",
COUNT(distinct(case when year(appln_filing_date) = 2009 then tls201_appln.appln_id end)) as "2009",
COUNT(distinct(case when year(appln_filing_date) = 2010 then tls201_appln.appln_id end)) as "2010",
COUNT(distinct(case when year(appln_filing_date) = 2011 then tls201_appln.appln_id end)) as "2011",
COUNT(distinct(case when year(appln_filing_date) = 2012 then tls201_appln.appln_id end)) as "2012",
COUNT(distinct(case when year(appln_filing_date) = 2013 then tls201_appln.appln_id end)) as "2013",
COUNT(distinct(case when year(appln_filing_date) = 2014 then tls201_appln.appln_id end)) as "2014",
COUNT(distinct(case when year(appln_filing_date) = 2015 then tls201_appln.appln_id end)) as "2015",
COUNT(distinct(case when year(appln_filing_date) = 2016 then tls201_appln.appln_id end)) as "2016",
COUNT(distinct(case when year(appln_filing_date) = 2017 then tls201_appln.appln_id end)) as "2017",
COUNT(distinct(case when year(appln_filing_date) = 2018 then tls201_appln.appln_id end)) as "2018",
COUNT(distinct(case when year(appln_filing_date) = 2019 then tls201_appln.appln_id end)) as "2019",
COUNT(distinct( tls201_appln.appln_id )) as "total"
FROM tls201_appln
join tls207_pers_appln on tls201_appln.appln_id = tls207_pers_appln.appln_id
join tls206_person on tls206_person.person_id = tls207_pers_appln.person_id
where person_ctry_code =  'KR' 
and applt_seq_nr > 0 
and invt_seq_nr =  0 
and appln_filing_year between 2002 and 2020
group by psn_name,tls206_person.person_ctry_code
having COUNT(distinct( tls201_appln.appln_id )) > 20
order by psn_name
PATSTAT Support Team
EPO - Vienna
patstat @ epo.org


Post Reply