PSN_ID changed?


Tim Grünebaum
Posts: 18
Joined: Thu Aug 27, 2015 12:43 pm

PSN_ID changed?

Post by Tim Grünebaum » Tue Mar 31, 2020 11:16 pm

Hello,

I was wondering if the PSN_ID changed at some point.
In my "old" data from PATSTAT Spring 2018 (not online) I have, for example, for PSN_NAME=SIEMENS the PSN_ID=26339403.
However, in PATSTAT Online Autumn 2019 the value for Siemens is PSN_ID=27767728.
This also holds for other companies, which is bothering me right now and prevents proper matching.
Is there a reason for this, or am I comparing apples and oranges in some way?

Best
Tim

PS: Were there any significant changes made in 2016 when HRM_L2_ID was changed to PSN_ID or was it a simple renaming? I also observe some discrepancies in my Patstat 2015 data.
TU Dortmund


mkracker
Posts: 114
Joined: Wed Sep 04, 2013 6:17 am
Location: Vienna

Re: PSN_ID changed?

Post by mkracker » Wed Apr 01, 2020 7:12 am

Hi Tim,

Only some of the identifiers in PATSTAT data are "stable", meaning they identify the same records across editions of PATSTAT. For details please see the Data Catalog of PATSTAT Global, section 4.3.2.

While PERSON_ID is such a stable identifier, PSN_ID (the identifier of the PATSTAT Standardised Name) is not. So, as you observed, the same value of PSN_ID will identify different records of the person table in different PATSTAT editions.

Conceptually it is not possible to have both PERSON_ID and PSN_ID stable within the same records if the assignment of a harmonised name to a person is not absolutely fixed - which it is not, due to improvements in the harmonisation algorithm and data corrections.

So to compare data across versions, you must also keep stable identifiers like PERSON_ID, APPLN_ID, PAT_PUBLN_ID etc. If you have not done so, then you could re-match person records of different versions via name (and address). If this is not possible, please contact the PATSTAT helpdesk at patstat@epo.org for a remedy.
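
A minimal sketch of the name-and-address fallback, for the case where an older extract was saved without PERSON_ID. The field names follow the person table's naming as commonly documented (person_name, person_address, person_ctry_code) and should be checked against the Data Catalog; the sample rows, the helper names and the normalisation rule (upper-casing and collapsing whitespace) are assumptions made for illustration, not the EPO's harmonisation logic.

```python
def normalise(text):
    """Upper-case and collapse whitespace (illustrative rule, not the EPO's)."""
    return " ".join((text or "").upper().split())


def match_key(person):
    """Build a comparison key from name, address and country code."""
    return (
        normalise(person["person_name"]),
        normalise(person["person_address"]),
        normalise(person["person_ctry_code"]),
    )


def rematch_by_name_address(old_persons, new_persons):
    """Map the position of each old record to the PERSON_IDs of new records
    that share the same normalised name, address and country code."""
    index = {}
    for person in new_persons:
        index.setdefault(match_key(person), []).append(person["person_id"])
    return {i: index.get(match_key(p), []) for i, p in enumerate(old_persons)}


# Hypothetical rows: the old extract has no PERSON_ID column.
old_persons = [
    {"person_name": "Example  GmbH", "person_address": "Main Street 1", "person_ctry_code": "de"},
    {"person_name": "Unknown Co", "person_address": "", "person_ctry_code": "US"},
]
new_persons = [
    {"person_id": 101, "person_name": "EXAMPLE GMBH", "person_address": "Main Street 1", "person_ctry_code": "DE"},
    {"person_id": 102, "person_name": "OTHER AG", "person_address": None, "person_ctry_code": "DE"},
]

matches = rematch_by_name_address(old_persons, new_persons)
print(matches)
```

Old records that come back with an empty list or with several candidate PERSON_IDs would still need manual review.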

As a side note: until autumn 2015 the attribute now called PSN_ID was called HRM_L2_ID. For better comprehensibility, we did a simple rename.

So in short: the PSN_ID is not stable across releases, but data can be matched via the PERSON_ID.
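
As an illustration of matching via PERSON_ID, the following sketch builds a crosswalk from the PSN_ID values of an older edition to those of a newer edition. It assumes both extracts are available as (person_id, psn_id) pairs; the function name and the numbers are made up for the example.

```python
from collections import defaultdict


def psn_id_crosswalk(old_rows, new_rows):
    """Return {old_psn_id: set of new_psn_id}, linked through shared PERSON_IDs.

    old_rows and new_rows are iterables of (person_id, psn_id) pairs,
    one pair per person record in the respective edition.
    """
    new_psn_by_person = dict(new_rows)
    crosswalk = defaultdict(set)
    for person_id, old_psn_id in old_rows:
        new_psn_id = new_psn_by_person.get(person_id)
        if new_psn_id is not None:  # person exists in both editions
            crosswalk[old_psn_id].add(new_psn_id)
    return dict(crosswalk)


# Hypothetical data: persons 1 and 2 shared one PSN_ID in the old edition
# but were assigned to different standardised names in the new one.
old_rows = [(1, 500), (2, 500), (3, 600)]
new_rows = [(1, 900), (2, 903), (3, 901), (4, 902)]

crosswalk = psn_id_crosswalk(old_rows, new_rows)
print(crosswalk)
```

An old PSN_ID that maps to more than one new PSN_ID indicates that the harmonisation grouped those persons differently between the two editions, so a one-to-one translation of PSN_ID values cannot be assumed.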

Best regards,
-------------------------------------------
Martin Kracker / EPO


Tim Grünebaum
Posts: 18
Joined: Thu Aug 27, 2015 12:43 pm

Re: PSN_ID changed?

Post by Tim Grünebaum » Wed Sep 23, 2020 11:57 am

Hi mkracker,

thank you very much for your advice, which solved the basic problem and avoided many mismatches in my studies.

However, I have tried to match PATSTAT autumn 2019 data with my older downloads from PATSTAT autumn 2015 or 2018 via PERSON_ID, and there were many records where a PERSON_ID from 2019 could not be found in 2015, even though I restricted the data to applications filed before 2010.
So when I draw information from PATSTAT 2019 about a PERSON_ID that filed an application in 2010, I would expect it to already be included in PATSTAT 2015. These cases are not rare either. If I did this correctly, many new IDs were introduced after 2015 (for applicants from 2010 and earlier).
Do you have any idea about this, or am I missing something?

Best,
Tim
TU Dortmund


EPO / PATSTAT Support
Posts: 227
Joined: Thu Feb 22, 2007 5:33 pm

Filing date on divisional applications & new person_id's

Post by EPO / PATSTAT Support » Fri Sep 25, 2020 10:13 am

Hello Tim,
I would say that your question touches on 2 different but related issues:
a) how is it possible that patent applications with a filing year < 2010 were not in PATSTAT 2015,
b) and why the person_id's from those applications do not appear in PATSTAT 2015 but suddenly appear in a later PATSTAT release.

On a) there are 2 main reasons for this to occur. The main one is the so-called divisional applications. There is a whole set of rules and case law governing this kind of application, all very well defined in the EPC. But for the purpose of this forum (which addresses researchers rather than patent attorneys), I think this Wikipedia article gives a good overview: https://en.wikipedia.org/wiki/Divisiona ... Convention
The bottom line: divisional applications can be filed while an earlier EP application is still pending (i.e. not yet granted), and will get the application filing date of the "parent application". This can result in "many years" between the earlier parent and the child application(s). The link between those applications is documented via the tls216_appln_contn table. The US patent system has a similar process via so-called continuations.
A second, smaller reason is data related: sometimes patent offices provide "backlog data" which is then added to the DocDB database and later to PATSTAT. There is also a minor number of corrections (many of them on the application kind code) that might result in applications receiving a "new" appln_id, or being removed from the tls201_appln table.
b) Divisionals and continuations are in principle new applications, each going their own way. The applicant and inventor data is not "copied" from the parent application to the child(ren), and therefore "new persons and person_id's" might be assigned to those child applications. But if the names of the applicants have the same spelling, address and country code, then the person_id's from the parent application will be linked to the children. This follows the normal rules for person_id's as described in the Data Catalog.
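
To check how many of the "new" pre-2010 applications are explained by divisionals or continuations, a query along the following lines may help. It is a sketch on an in-memory SQLite database with toy rows: old_appln is a hypothetical table holding the appln_id values kept from the older edition, the column names of tls201_appln and tls216_appln_contn are assumed to match the Data Catalog, and the contn_type value 'DIV' is only illustrative.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- Hypothetical table: appln_id values saved from the older edition.
CREATE TABLE old_appln (appln_id INTEGER PRIMARY KEY);
-- Reduced versions of two tables from the newer edition (column names assumed).
CREATE TABLE tls201_appln (appln_id INTEGER PRIMARY KEY, appln_filing_year INTEGER);
CREATE TABLE tls216_appln_contn (appln_id INTEGER, parent_appln_id INTEGER, contn_type TEXT);

INSERT INTO old_appln VALUES (1), (2);
INSERT INTO tls201_appln VALUES (1, 2005), (2, 2008), (3, 2005), (4, 2009);
-- Application 3 is a child of application 1 and carries the parent's filing year.
INSERT INTO tls216_appln_contn VALUES (3, 1, 'DIV');
""")

# Applications filed before 2010 that are absent from the older edition,
# together with their parent application if one is recorded.
rows = con.execute("""
    SELECT a.appln_id, a.appln_filing_year, c.parent_appln_id, c.contn_type
    FROM tls201_appln AS a
    LEFT JOIN old_appln AS o ON o.appln_id = a.appln_id
    LEFT JOIN tls216_appln_contn AS c ON c.appln_id = a.appln_id
    WHERE o.appln_id IS NULL
      AND a.appln_filing_year < 2010
    ORDER BY a.appln_id
""").fetchall()

for row in rows:
    print(row)
```

Rows with a parent_appln_id point to divisionals or continuations; rows without one would fall under the backlog or correction cases described in a).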

In the end: PATSTAT is a snapshot of the situation at the moment the backfile is extracted. Later PATSTAT releases will always contain more and better data about the past.

Geert BOEDT
PATSTAT Support Team
EPO - Vienna
patstat @ epo.org

