missing prior_appln_id

clem stern
Posts: 3
Joined: Fri Oct 26, 2018 10:46 pm

Post by clem stern » Tue Jan 15, 2019 4:11 pm

Hello,

I'm using an equivalent-family definition based on priorities to group patents and compute statistics. I have noticed that a significant number of patents (mostly Asian, from Japan, China and Korea) don't have any prior_appln_id in table tls204. Most of them are single-member families under the DOCDB family definition, so it doesn't matter for them, but many are not, and this could affect my results.
I would like to know the reason behind this. Is there any solution?

Thanks a lot


EPO / PATSTAT Support
Posts: 425
Joined: Thu Feb 22, 2007 5:33 pm

Re: missing prior_appln_id

Post by EPO / PATSTAT Support » Tue Jan 29, 2019 5:18 pm

Hi,
Can you please provide some specific examples, so I can have a look?
And with which version of PATSTAT are you working?

Best regards,
Martin / EPO PATSTAT
PATSTAT Support Team
EPO - Vienna
patstat @ epo.org


clem stern

Re: missing prior_appln_id

Post by clem stern » Wed Jan 30, 2019 1:16 pm

Hello, thank you for your response.
I'm using PATSTAT Online, 2018 Autumn Edition.

I'm using code similar to this:
SELECT distinct t201.appln_id
from TLS201_Appln as t201
inner join tls209_appln_ipc as t209 on t201.appln_id= t209.appln_id
WHERE ( t209.ipc_class_symbol LIKE 'A45D%')

This gave me 191 896 results, but when I join with the tls204 table using this code:

SELECT distinct t201.appln_id, t204.prior_appln_id
from TLS201_Appln as t201
inner join tls209_appln_ipc as t209 on t201.appln_id= t209.appln_id
inner join tls204_appln_prior as t204 on t201.appln_id= t204.appln_id
WHERE ( t209.ipc_class_symbol LIKE 'A45D%' )

I get only 79 822 results, so I'm guessing that no prior_appln_id is available for a very large number of patents.
Furthermore, a single appln_id can have several different priority numbers, which means I have actually lost well over 120 000 patents.
I have analysed the patents without priority data, and many of them were filed at only one office (mostly in China, Japan or Korea).

Thank you for your help,
Best


EPO / PATSTAT Support

Re: missing prior_appln_id

Post by EPO / PATSTAT Support » Mon Feb 04, 2019 6:31 pm

Hello Clem,
your observation is correct: many applications are removed when you additionally join your initial query with the tls204_appln_prior table, because an inner join drops every application that has no priority. In fact, the number of DISTINCT applications in your final result is even lower (64 850), because some applications claim more than one priority and therefore appear several times in the result list.
This query will give you the unique applications:

Code:

SELECT DISTINCT t201.appln_id
FROM tls201_appln AS t201
INNER JOIN tls209_appln_ipc AS t209 ON t201.appln_id = t209.appln_id
INNER JOIN tls204_appln_prior AS t204 ON t201.appln_id = t204.appln_id
WHERE LEFT(t209.ipc_class_symbol, 4) = 'A45D'

You can avoid this reduction by using a "LEFT JOIN"; this will again introduce duplicates for the applications that have multiple priorities, and you will get a NULL value for those applications that have no priority.

Code:

SELECT DISTINCT t201.appln_id, t204.prior_appln_id
FROM tls201_appln AS t201
INNER JOIN tls209_appln_ipc AS t209 ON t201.appln_id = t209.appln_id
LEFT JOIN tls204_appln_prior AS t204 ON t201.appln_id = t204.appln_id
WHERE LEFT(t209.ipc_class_symbol, 4) = 'A45D'
ORDER BY t201.appln_id

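To see the effect of the two join types on a small scale, here is a minimal sketch in Python using an in-memory SQLite database. The two tables are toy stand-ins that reuse the PATSTAT table and column names; the rows are invented for illustration and are not real PATSTAT data.

```python
import sqlite3

# Toy stand-ins for tls201_appln and tls204_appln_prior (invented rows).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tls201_appln (appln_id INTEGER PRIMARY KEY);
CREATE TABLE tls204_appln_prior (appln_id INTEGER, prior_appln_id INTEGER);
INSERT INTO tls201_appln VALUES (1), (2), (3), (4);
-- application 1 claims two priorities, application 2 claims one,
-- applications 3 and 4 claim none
INSERT INTO tls204_appln_prior VALUES (1, 10), (1, 11), (2, 10);
""")

def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]

# INNER JOIN: applications without a priority disappear,
# applications with several priorities appear several times.
inner_rows = scalar("""
    SELECT COUNT(*)
    FROM tls201_appln AS t201
    INNER JOIN tls204_appln_prior AS t204 ON t201.appln_id = t204.appln_id
""")
inner_distinct = scalar("""
    SELECT COUNT(DISTINCT t201.appln_id)
    FROM tls201_appln AS t201
    INNER JOIN tls204_appln_prior AS t204 ON t201.appln_id = t204.appln_id
""")

# LEFT JOIN: every application is kept; prior_appln_id is NULL if none exists.
left_rows = scalar("""
    SELECT COUNT(*)
    FROM tls201_appln AS t201
    LEFT JOIN tls204_appln_prior AS t204 ON t201.appln_id = t204.appln_id
""")

# Applications that have no priority at all.
no_priority = [row[0] for row in conn.execute("""
    SELECT t201.appln_id
    FROM tls201_appln AS t201
    LEFT JOIN tls204_appln_prior AS t204 ON t201.appln_id = t204.appln_id
    WHERE t204.appln_id IS NULL
    ORDER BY t201.appln_id
""")]

print(inner_rows, inner_distinct, left_rows, no_priority)  # 3 2 5 [3, 4]
```

The same "LEFT JOIN ... WHERE t204.appln_id IS NULL" pattern should work on the real tables to list the applications without any priority.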
You also mentioned that you were looking at the DOCDB family definitions in relation to the priority filings. This is not so easy to explain. For many DOCDB families (size > 1) that have a JP family member, there is no hard data in PATSTAT that shows or confirms how the family was created. Families are created based on a number of rules and (EPO-internal) business decisions, which are explained up to a certain level of detail in this document:
Patent_Families_at_the_EPO_en.pdf
But in principle, it is not possible to replicate the DOCDB family business process with PATSTAT data alone, because part of the process relies on intellectual and manual work and not only on analysing priorities, continuations or manually assigned "technical relations" (data available via tls204, tls205 and tls216). In many of those cases, an earlier PCT filing creates the link between the family members, so that the "national phase applications" of an earlier PCT are grouped in the same DOCDB family.
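
As a rough illustration of the PCT link, the sketch below looks for applications that have no row in tls204_appln_prior but do point to an earlier international application. It assumes that tls201_appln has an internat_appln_id column that is 0 when there is no earlier PCT filing; please check this against the data catalog of your edition. The rows are invented, and the tables are toy SQLite stand-ins, not PATSTAT itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tls201_appln (
    appln_id INTEGER PRIMARY KEY,
    appln_auth TEXT,
    internat_appln_id INTEGER  -- assumption: 0 means "no earlier PCT filing"
);
CREATE TABLE tls204_appln_prior (appln_id INTEGER, prior_appln_id INTEGER);
-- invented rows: one PCT filing (100), two national phase entries, one stand-alone CN filing
INSERT INTO tls201_appln VALUES
    (100, 'WO', 0),
    (101, 'JP', 100),
    (102, 'KR', 100),
    (103, 'CN', 0);
""")

# Applications without any claimed priority that are nevertheless linked
# to an earlier PCT application through internat_appln_id.
rows = conn.execute("""
    SELECT t201.appln_id, t201.appln_auth, t201.internat_appln_id
    FROM tls201_appln AS t201
    LEFT JOIN tls204_appln_prior AS t204 ON t201.appln_id = t204.appln_id
    WHERE t204.appln_id IS NULL
      AND t201.internat_appln_id > 0
    ORDER BY t201.appln_id
""").fetchall()

print(rows)  # [(101, 'JP', 100), (102, 'KR', 100)]
```

Such a query only surfaces one possible link; it does not reproduce the DOCDB family rules.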

Geert BOEDT
PATSTAT Support Team
EPO - Vienna
patstat @ epo.org


clem stern

Re: missing prior_appln_id

Post by clem stern » Tue Feb 05, 2019 4:52 pm

Thank you for this document, which is full of useful insights for my research.
I only use the DOCDB family size as a proxy for the size of those patent families, since I can't group them using priorities; I'm not trying to replicate the DOCDB process. The main idea was that, although most of those patents belong to a "single patent family", some of them most likely belong to larger families but can't be linked to them (under my definition of a family) because of the missing priorities.

I understand that I can keep patents that don't have any priority with a "left join" or even an "outer join", but my main questions are:
- why don't those patents have priority numbers, and why does this mostly concern Asian patents?
- if those priority numbers exist, is there any way to get access to them at a large scale?

Once again, thank you for your replies and for the document you provided.


EPO / OPS Support
Posts: 1298
Joined: Thu Feb 22, 2007 5:32 pm

Re: missing prior_appln_id

Post by EPO / OPS Support » Tue Feb 12, 2019 1:14 pm

Hi,

Not all applications claim priority, and not all claimed priorities, especially for older data, are available at the EPO. For the purpose of building DOCDB simple families and INPADOC extended families in Espacenet, we populate the priority field for every record in the database with one of the following:
- active priorities (real priorities as claimed, INID codes 30-31, domestic priorities)
- inactive priorities (related applications that do not add new technical content: divisionals, CIPs, WO provisional filings)
- dummy priorities (this can be the application date or, for very old documents that lack an application date, the publication date; also technical priorities)

So you could partially build your own family rules by adding data as we do for Espacenet. But to avoid doing all that and get the INPADOC extended family as is, you can use OPS or, more precisely, our OPS INPADOC service: https://www.epo.org/searching-for-paten ... html#tab-1 .
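
To give an idea of what "building your own family rules" could look like, here is a minimal sketch that groups applications into connected components over their priority links, using a small union-find structure. It is not the EPO's family algorithm. The function name build_families, the sample IDs, and the choice to treat an application without any link as its own one-member family (a stand-in for a dummy priority) are all assumptions made for illustration.

```python
from collections import defaultdict

def build_families(appln_ids, priority_links):
    """Group applications connected by (appln_id, prior_appln_id) links.

    Hypothetical helper: applications without any link stay alone,
    which mimics giving them a dummy priority pointing at themselves.
    """
    parent = {a: a for a in appln_ids}

    def find(x):
        # Register unseen IDs (e.g. priorities outside the sample) on the fly.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for appln_id, prior_appln_id in priority_links:
        root_a, root_b = find(appln_id), find(prior_appln_id)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    families = defaultdict(set)
    for a in list(parent):
        families[find(a)].add(a)
    return sorted(sorted(members) for members in families.values())

# Invented sample: applications 1 and 2 both claim priority 10,
# applications 3 and 4 claim nothing.
families = build_families(
    appln_ids=[1, 2, 3, 4, 10],
    priority_links=[(1, 10), (2, 10)],
)
print(families)  # [[1, 2, 10], [3], [4]]
```

Because any shared link merges two groups, this is closer in spirit to an extended family than to a strict "same set of priorities" definition; continuations or technical relations could be fed in as additional links in the same way.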

You can combine PATSTAT with the OPS INPADOC family service if your aim is to get extended families for your research.

Please note that the more recent a bibliographic record is, the more likely it is that its extended and simple families will change from one week to the next (or, for the simple family, from one day to the next); the INPADOC family is recalculated every weekend.

Vesna for OPS support

