Dear all
According to the PATSTAT data catalogue, appln_nr should be unique in combination with appln_auth and appln_kind. However, in the PATSTAT Autumn 2022 edition there seem to be duplicates among PCT applications: I find several entries with the same appln_nr and identical appln_auth (WO) and appln_kind (W), but with different appln_id and docdb_family_id values. The receiving office also differs.
Did I misunderstand something, and can there be duplicates after all when appln_kind is W? Or should these entries actually have different appln_nr values?
Thanks a lot in advance for your help!
Duplicate appln_nr
Re: Duplicate appln_nr
Dear customer,
Please see the PATSTAT data catalogue: https://documents.epo.org/projects/baby ... _19_en.pdf
There are indeed duplicates when looking only at appln_nr, appln_auth (WO) and appln_kind (W). That is why the attribute RECEIVING_OFFICE is part of the so-called alternate key.
Section 5.1, TLS201_APPLN: Application, states that the alternate key is
APPLN_AUTH, APPLN_NR, APPLN_KIND, RECEIVING_OFFICE
and for this alternate key there are no duplicates.
In section 6.6, APPLN_AUTH, the data catalogue says:
Description: The competent authority, which is the national, international or regional patent office responsible for the processing of the patent application.
…
Since PATSTAT Spring 2018 there is an exception to the rules below: The value of APPLN_AUTH is set at “WO” if APPLN_KIND = “W”; in that case the attribute RECEIVING_OFFICE will contain the original (DOCDB) version of APPLN_AUTH. The motivation for the change was to make retrieval of "WO" (PCT) applications more coherent and to avoid the pitfall of PCT applications being counted as "national" applications.
If you would like to check this in the database, the following two queries return no rows on patstat2022b.
-- 1) Duplicates on the full alternate key (including receiving_office)
select * from
(
    select count(distinct appln_id) as mycount, appln_auth, appln_nr, appln_kind, receiving_office
    from tls201_appln
    group by appln_auth, appln_nr, appln_kind, receiving_office
) mytable
where mycount > 1;
-- 2) Duplicates on appln_auth, appln_nr, appln_kind only, leaving out WO / W rows
select * from
(
    select count(distinct appln_id) as mycount, appln_auth, appln_nr, appln_kind
    from tls201_appln
    group by appln_auth, appln_nr, appln_kind
) mytable
where mycount > 1 and appln_auth <> 'WO' and appln_kind <> 'W';
In other words, the exception is limited to WO applications.
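To try the idea without access to a PATSTAT database, here is a minimal sketch using Python's built-in sqlite3 module. It runs the same kind of grouping on a three-row stand-in for tls201_appln. The rows, the application numbers, the in-memory database and the helper name duplicate_groups are all invented for illustration and are not PATSTAT data.

```python
import sqlite3

# Toy stand-in for tls201_appln; every row below is invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    create table tls201_appln (
        appln_id integer primary key,
        appln_auth text,
        appln_nr text,
        appln_kind text,
        receiving_office text
    )
    """
)
conn.executemany(
    "insert into tls201_appln values (?, ?, ?, ?, ?)",
    [
        (1, "WO", "2020000001", "W", "US"),  # same number, filed via one office
        (2, "WO", "2020000001", "W", "JP"),  # same number, filed via another office
        (3, "EP", "20000001", "A", ""),      # non-PCT row (hypothetical values)
    ],
)


def duplicate_groups(columns):
    """Return the groups of `columns` that are shared by more than one appln_id."""
    cols = ", ".join(columns)
    sql = (
        f"select {cols}, count(distinct appln_id) as mycount "
        f"from tls201_appln group by {cols} "
        f"having count(distinct appln_id) > 1"
    )
    return conn.execute(sql).fetchall()


# Grouping without receiving_office reports the WO / W pair as a duplicate.
three_col = duplicate_groups(["appln_auth", "appln_nr", "appln_kind"])

# Grouping on the full alternate key reports nothing.
four_col = duplicate_groups(
    ["appln_auth", "appln_nr", "appln_kind", "receiving_office"]
)

print(three_col)
print(four_col)
```

On this toy data the three-column grouping returns the WO / W pair, while the four-column grouping returns nothing, which mirrors the behaviour described above. The same having clause can be used directly against tls201_appln instead of the subquery with where mycount > 1.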
Kind regards
Patent Information Marketing