
Re: Issue with Collecting Patent Data

Posted: Sun Jan 20, 2019 3:18 am
by jay2018
Hi,

Thank you for your explanation. I have one more question here.
For US patents, I see two types of publn_nr: ten-digit numbers such as "2001000344" and seven-digit numbers such as "6077663". The seven-digit number matches the patent number I find on the USPTO website, so what does the ten-digit number mean?
Is it the so-called "dummy publication" mentioned in the previous post: "When the EPO receives data that a certain publication/application has been cited without that publication/application being in our database, then the EPO will create a so-called 'dummy' publication"? If so, once the "dummy" publication gets a real publication number from the USPTO, do you consolidate the dummy one with the real one?
For Chinese patents, there are also two types of publn_nr: nine-digit numbers such as "100334102" and seven-digit numbers such as "1665922". Does that difference mean the same thing as for US patents? Thank you.

Best regards,
Wei

Re: Issue with Collecting Patent Data

Posted: Thu Jan 24, 2019 5:58 pm
by EPO / PATSTAT Support
Hello Wei,
These are NOT dummy publications or applications. Artificial applications or publications will have an appln_id or publn_id > 900000000.
The EPO has to adapt the number formats so that they become distinct and unique, in order to keep the database coherent. These adapted numbers are stored in appln_nr and publn_nr.

However, to allow linking to national patent databases, we have added the attribute "appln_nr_original" to PATSTAT. It should normally let you link to the application numbers used in data sets provided by the national patent offices (no guarantee; we just know it mostly works fine).
This "appln_nr_original" number is mostly the number "as printed on the document".
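
As a rough illustration of the two rules above, here is a minimal Python sketch. It assumes rows have already been exported from PATSTAT into plain dictionaries; the helper names (is_artificial, linking_number), the fallback to appln_nr when appln_nr_original is empty, and the sample values are all hypothetical and only there for illustration.

```python
# Threshold taken from the reply above: artificial applications or
# publications have an appln_id or publn_id greater than 900000000.
ARTIFICIAL_ID_THRESHOLD = 900_000_000


def is_artificial(record_id: int) -> bool:
    """Return True if an appln_id or publn_id marks an artificial record."""
    return record_id > ARTIFICIAL_ID_THRESHOLD


def linking_number(row: dict) -> str:
    """Pick the number to use when linking to national patent office data.

    Prefers appln_nr_original (mostly the number as printed on the document).
    Falling back to the adapted appln_nr is an assumption of this sketch,
    not documented PATSTAT behaviour.
    """
    original = (row.get("appln_nr_original") or "").strip()
    return original if original else row["appln_nr"]


# Hypothetical sample rows; the values are made up.
rows = [
    {"appln_id": 12345678, "appln_nr": "00111222", "appln_nr_original": "00/111,222"},
    {"appln_id": 900000123, "appln_nr": "X999", "appln_nr_original": ""},
]

for row in rows:
    kind = "artificial" if is_artificial(row["appln_id"]) else "regular"
    print(row["appln_id"], kind, linking_number(row))
```

The same check works for publn_id, since the reply gives one threshold for both identifiers.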

Geert BOEDT

Re: Issue with Collecting Patent Data

Posted: Thu Jan 24, 2019 9:00 pm
by jay2018
Hi Geert,

I'm sorry but I don't quite understand. Let's use US patent data as an example.
For patents with a ten-digit publn_nr, such as "2001000344", I also notice that they all have zero publn_claims. However, every granted patent should have at least one claim. So, I wonder what kind of patents these are with a ten-digit publn_nr?

Best regards,
Wei