
different number country-specific applications

Posted: Mon Jun 14, 2021 11:18 am
by polina_u
Hi,

I previously worked with PATSTAT 2016 and have recently switched to the latest version, PATSTAT Spring 2021. I only extract data for patent applications filed by applicants with a Swedish address.

When I compared the data extracted from PATSTAT 2016 and PATSTAT 2021 for the overlapping years (2000-2015), I noticed that some applications previously identified as Swedish (i.e., PERSON_CTRY_CODE=="SE" in TLS206) have a non-Swedish country code in PATSTAT 2021. The number of such applications is relatively low, however: approx. 4580 of 288386, or 1.6%.
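A comparison like the one described above could be sketched in Python as follows. This is only a minimal illustration: the two dictionaries stand in for extracts from the two releases, and all appln_id values and the function names are hypothetical. Only the final share uses the figures quoted in the post.

```python
# Hypothetical toy extracts: appln_id -> set of applicant country codes.
extract_2016 = {101: {"SE"}, 102: {"SE"}, 103: {"SE", "DE"}, 104: {"SE"}}
extract_2021 = {101: {"SE"}, 102: {"JP"}, 103: {"SE", "DE"}, 105: {"SE"}}


def lost_swedish(old, new):
    """appln_ids with an SE applicant in `old` that exist in `new` without one."""
    return sorted(
        appln_id
        for appln_id, countries in old.items()
        if "SE" in countries and appln_id in new and "SE" not in new[appln_id]
    )


def missing_in_new(old, new):
    """appln_ids present in `old` that no longer exist in `new` at all."""
    return sorted(appln_id for appln_id in old if appln_id not in new)


changed = lost_swedish(extract_2016, extract_2021)
gone = missing_in_new(extract_2016, extract_2021)

# Share quoted in the post: 4580 of 288386 applications.
share = 4580 / 288386

print("country changed:", changed)
print("no longer present:", gone)
print(f"share: {share:.1%}")
```

Separating "country changed" from "no longer present" keeps the two suspected causes apart: the first group points to a change of applicant, the second to application ids that were removed between releases.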

I tried to investigate the issue and found two reasons why that might occur:
1. There was a Swedish applicant when the application was filed, but the latest data for the application lists a new applicant with an address in another country, for example after a change in patent ownership.
2. It seems that in PATSTAT 2016, one application could have multiple observations. For example, for appln_nr==10530001, there are at least two different appln_id and appln_nr_epodoc values. The appln_nr_epodoc values are almost identical, but one has the letter "D" at the end. It seems that in PATSTAT 2021 there are no such duplicates, i.e., there is only one unique application id for each real patent application.

Do my findings make sense? Are there any other reasons why the number of "Swedish applications" in PATSTAT 2016 might be larger than in PATSTAT 2021 or why they don't match exactly? Is there anything else I should be aware of?

Re: different number country-specific applications

Posted: Thu Jun 17, 2021 7:14 pm
by EPO / PATSTAT Support
Hello Polina,
Your reason 1 is correct.
In PATSTAT, applicant and inventor names are aggregated at application level via the tls207 table. Generally speaking, this means we take the names from the latest publication instance of the application. So when a change of ownership happens between the A and B publications, the country of the applicant changes accordingly. (This should not happen with inventors, unless corrections have been made to the inventor country.) This business rule and its exceptions are described in the PATSTAT data catalog under TLS207.
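To make the rule concrete, here is a minimal Python sketch of "take the names from the latest publication instance". It is a simplification under stated assumptions: the `Publication` tuple, the sample names and the function are hypothetical, and the exceptions documented in the data catalog are not modelled.

```python
from collections import namedtuple

# Hypothetical, simplified publication record (not a PATSTAT table layout).
Publication = namedtuple("Publication", "appln_id publn_date publn_kind applicants")

publications = [
    Publication(1, "2010-03-01", "A1", [("EXAMPLE SVERIGE AB", "SE")]),
    Publication(1, "2013-06-12", "B1", [("EXAMPLE HOLDING", "JP")]),
    Publication(2, "2011-01-15", "A1", [("OTHER AB", "SE")]),
]


def application_level_applicants(pubs):
    """Keep, per application, the applicants of its most recent publication."""
    latest = {}
    for pub in pubs:
        current = latest.get(pub.appln_id)
        # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
        if current is None or pub.publn_date > current.publn_date:
            latest[pub.appln_id] = pub
    return {appln_id: pub.applicants for appln_id, pub in latest.items()}


aggregated = application_level_applicants(publications)
print(aggregated)
```

In this toy data, application 1 was published with an SE applicant but ends up with a JP applicant at application level, which is the effect described above.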
PATSTAT also has the tls227 table, which links the names as published to each publication. This allows you to analyse the topic in detail. The query below selects all applications that had an SE applicant in at least one publication instance, but not in the "final" tls207 table.
The list shows that some companies do this systematically: SONY MOBILE COMMUNICATIONS SE -> SONY CORPORATION JP; ABB (ASEA BROWN BOVERI) SE -> ABB SCHWEIZ CH, ARCELORMITTAL LU; 3M SVENSKA SE -> 3M INNOVATIVE PROPERTIES COMPANY (MINNESOTA MINING AND MANUFACTURING INNOVATIVE PROPERTIES COMPANY) US; DOLBY SWEDEN SE -> DOLBY INTERNATIONAL NL; PHILIPS NORDEN SE -> PHILIPS ELECTRONICS NL; etc.
This list could be the subject of some interesting research.

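Code:

-- NOTE: the SELECT list is missing from the post as archived; the columns
-- below are an assumed reconstruction, not the original author's list.
SELECT tls211_pat_publn.appln_id, tls211_pat_publn.pat_publn_id, tls211_pat_publn.publn_kind,
	tls206_person.psn_name, tls206_person.person_ctry_code,
	tls227_pers_publn.applt_seq_nr, tls227_pers_publn.invt_seq_nr, non_swedes.names
FROM tls211_pat_publn
	JOIN tls227_pers_publn ON tls211_pat_publn.pat_publn_id = tls227_pers_publn.pat_publn_id
	JOIN tls206_person ON tls227_pers_publn.person_id = tls206_person.person_id
	-- applicants currently attached to the application in tls207 (the "final" names)
	LEFT JOIN (SELECT appln_id, STRING_AGG((psn_name + ' ' + person_ctry_code), ', ') names
		FROM tls207_pers_appln JOIN tls206_person ON tls207_pers_appln.person_id = tls206_person.person_id
		WHERE applt_seq_nr > 0 GROUP BY appln_id) non_swedes
	ON tls211_pat_publn.appln_id = non_swedes.appln_id
WHERE
	-- the application has no SE applicant at application level (tls207) ...
	tls211_pat_publn.appln_id NOT IN (SELECT appln_id FROM tls207_pers_appln JOIN tls206_person
		ON tls207_pers_appln.person_id = tls206_person.person_id
		WHERE person_ctry_code = 'SE' AND applt_seq_nr > 0)
	-- ... but an SE applicant is named on at least one of its EP publications (tls227)
	AND tls206_person.person_ctry_code = 'SE'
	AND tls211_pat_publn.publn_auth = 'EP'
	AND tls227_pers_publn.applt_seq_nr > 0
ORDER BY tls211_pat_publn.appln_id DESC, tls211_pat_publn.pat_publn_id,
	tls227_pers_publn.applt_seq_nr, tls227_pers_publn.invt_seq_nr
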
As the "later publication" is usually the publication of a B1 (grant), it is normal that these appear 2, 3 or 4 years after the first publication, and therefore they only show up in later PATSTAT releases.
Maybe a proxy for IP in/out flow in a country... I assume CH would show the opposite...
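As a rough illustration of the in/out flow idea, one could tally applicant-country changes per application. The sketch below is hypothetical throughout: the list of (country as first published, country at application level) pairs is made-up toy data, and counting applications this way is only one possible proxy.

```python
from collections import Counter

# Hypothetical (country_as_published, country_at_application_level) pairs,
# one per application whose applicant country changed.
transfers = [
    ("SE", "JP"),
    ("SE", "CH"),
    ("SE", "US"),
    ("SE", "NL"),
    ("SE", "NL"),
    ("CH", "SE"),
]


def net_flow(pairs, country):
    """Applications gained minus applications lost by `country`."""
    outflow = sum(1 for src, dst in pairs if src == country and dst != country)
    inflow = sum(1 for src, dst in pairs if dst == country and src != country)
    return inflow - outflow


# Where do applications first published with an SE applicant end up?
destinations = Counter(dst for src, dst in transfers if src == "SE")

print("net flow SE:", net_flow(transfers, "SE"))
print("destinations:", destinations.most_common())
```

A negative net flow for a country would indicate that more applications moved away from applicants in that country than towards them, within whatever sample is used.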

On point (2), the applications with a "D" in the old PATSTAT releases: filter them out. They were there in the past for reasons of database integrity. As far as I know, they have all been removed.
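When working with an old extract, the filtering could look like the Python sketch below. It assumes, based only on the description earlier in this thread, that the artificial entries can be recognised by a trailing "D" in appln_nr_epodoc. The sample rows and the function name are hypothetical, so it is worth checking the rule against your own data first.

```python
# Hypothetical rows from an old extract; the values are made up for illustration.
rows = [
    {"appln_id": 1, "appln_nr_epodoc": "SE20010530001"},
    {"appln_id": 2, "appln_nr_epodoc": "SE20010530001D"},
    {"appln_id": 3, "appln_nr_epodoc": "SE20020123456"},
]


def drop_d_applications(records):
    """Remove rows whose appln_nr_epodoc ends with the letter 'D' (assumed marker)."""
    return [r for r in records if not r["appln_nr_epodoc"].endswith("D")]


cleaned = drop_d_applications(rows)
kept_ids = [r["appln_id"] for r in cleaned]
print(kept_ids)
```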

Geert Boedt

Re: different number country-specific applications

Posted: Fri Jun 18, 2021 7:35 am
by polina_u
Thank you very much for your reply!