What does it mean when DOCDB is 0 (zero)?

Posted: Mon Feb 04, 2019 4:36 pm
by swimwolfe
When I look at the TLS201 table of applications, there are roughly 9.2 million appln_id values that have a docdb_family_id of 0 (Spring 2016 vintage of PATSTAT). What does it mean when an application has a patent family of 0? Does it imply it's a lone application, with no other patents/applications that the family-grouping algorithm could lump together with it?

Thanks!
Brian

Re: What does it mean when DOCDB is 0 (zero)?

Posted: Mon Feb 04, 2019 5:01 pm
by EPO / PATSTAT Support
Hi Brian,

For the vintage :-) 2016 Spring Edition the Data Catalog explains the attribute DOCDB_FAMILY_ID like this:
Domain: Number 0 … 999 999 999; A value 0 indicates that the application does not belong to any DOCDB family. This is only the case for the dummy application (APPLN_ID = 0) and for artificial applications (APPLN_ID ≥ 900 000 000)
Default value: 0
I strongly recommend checking the Data Catalog (available on the EPO website) when working with PATSTAT. Ideally, this document should answer all of your data-related questions.
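
To check whether a given edition follows the catalog rule quoted above, the family-0 rows can be split by kind. This is only a minimal sketch: it assumes a table shaped like TLS201_APPLN reduced to the appln_id and docdb_family_id columns, loaded into an in-memory SQLite database with made-up rows. The function name classify_zero_family is hypothetical and not part of PATSTAT.

```python
import sqlite3

ARTIFICIAL_MIN = 900_000_000  # threshold quoted from the Data Catalog


def classify_zero_family(conn):
    """Count applications with docdb_family_id = 0 by kind."""
    sql = """
        SELECT CASE
                 WHEN appln_id = 0 THEN 'dummy'
                 WHEN appln_id >= ? THEN 'artificial'
                 ELSE 'other'
               END AS kind,
               COUNT(*) AS n
        FROM tls201_appln
        WHERE docdb_family_id = 0
        GROUP BY kind
    """
    return dict(conn.execute(sql, (ARTIFICIAL_MIN,)).fetchall())


# Toy data (assumption): two ordinary applications sharing a family,
# the dummy application, and two artificial applications.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tls201_appln ("
    "appln_id INTEGER PRIMARY KEY, docdb_family_id INTEGER)"
)
conn.executemany(
    "INSERT INTO tls201_appln VALUES (?, ?)",
    [(0, 0), (1, 101), (2, 101), (900000001, 0), (900000002, 0)],
)
counts = classify_zero_family(conn)
print(counts)
```

Any rows that land in the 'other' bucket would be ordinary applications with family 0, which the quoted rule says should not exist.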

The newer versions of PATSTAT contain improved data and data structures, and in them every application belongs to a DOCDB family. To make use of all improvements, I recommend subscribing to the newest PATSTAT edition, which is currently 2018 Autumn. (We have just started production of 2019 Spring, which will be ready in April/May.)

Best regards,
Martin

Re: What does it mean when DOCDB is 0 (zero)?

Posted: Mon Feb 04, 2019 5:09 pm
by EPO / PATSTAT Support
Hello Brian,
Up to the 2016 Spring "vintage", we did not assign family IDs to replenished applications. They all had the default value 0, and you can also observe that they all have an appln_id > 900000000 (plus the dummy application with appln_id = 0).
As a result, they formed one very big family, which always ended up at the top of all statistics unless it was actively excluded.
In the 2016 Autumn (and later) editions we slightly changed the business rule so that all applications (including the replenished ones) belong to exactly 1 DOCDB and 1 INPADOC patent family. To avoid confusion, the assigned family IDs are identical to the application IDs (so they all have a value above 900 000 000).
They no longer end up at the top of "family size statistics".
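
For anyone still on an older edition, the later business rule can be approximated by hand. The sketch below is an illustration, not the EPO's production code: it assumes a table shaped like TLS201_APPLN with only appln_id and docdb_family_id, uses made-up rows in an in-memory SQLite database, and the function names are hypothetical. It also assumes that no genuine DOCDB family ID falls in the range used by artificial application IDs, which should be verified against the actual data.

```python
import sqlite3


def assign_singleton_families(conn):
    """Give every family-0 application (except the dummy) its own family ID."""
    cur = conn.execute(
        "UPDATE tls201_appln SET docdb_family_id = appln_id "
        "WHERE docdb_family_id = 0 AND appln_id <> 0"
    )
    return cur.rowcount  # number of rows changed


def largest_family(conn):
    """Return (docdb_family_id, size) of the biggest family, dummy excluded."""
    return conn.execute(
        "SELECT docdb_family_id, COUNT(*) AS n FROM tls201_appln "
        "WHERE appln_id <> 0 "
        "GROUP BY docdb_family_id ORDER BY n DESC, docdb_family_id LIMIT 1"
    ).fetchone()


# Toy data (assumption): one real two-member family and three
# replenished applications that still carry the default family 0.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tls201_appln ("
    "appln_id INTEGER PRIMARY KEY, docdb_family_id INTEGER)"
)
conn.executemany(
    "INSERT INTO tls201_appln VALUES (?, ?)",
    [(0, 0), (1, 101), (2, 101),
     (900000001, 0), (900000002, 0), (900000003, 0)],
)

before = largest_family(conn)   # family 0 dominates the ranking
updated = assign_singleton_families(conn)
after = largest_family(conn)    # the real family is now on top
print(before, updated, after)
```

In the toy data, family 0 tops the size ranking before the update and drops out of it afterwards, which is the effect described above.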

Geert BOEDT

Re: What does it mean when DOCDB is 0 (zero)?

Posted: Mon Feb 04, 2019 5:51 pm
by swimwolfe
Martin/Geert -

Thanks so much for the VERY quick reply. Much appreciated and very helpful. That was exactly what I needed to understand.

I'm trying to manually calculate patent statistics such as citations, originality, generality, etc., similar to the Hall et al. literature for US patents. Given your responses, I think I need to assign a fake family ID to these applications/patents and keep them around for citation counting.
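
Roughly, the idea would look like the sketch below. Everything in it is an assumption for illustration: the data is made up, the helper names are hypothetical, citations are reduced to plain (citing appln_id, cited appln_id) pairs, and citations within the same family are ignored by choice. It is not a description of how PATSTAT computes its own citation counts.

```python
from collections import defaultdict


def family_of(appln_id, docdb_family_id):
    """Use the real family ID, or appln_id as a stand-in when it is 0."""
    return docdb_family_id if docdb_family_id != 0 else appln_id


def citing_family_counts(families, citations):
    """Count distinct citing families per cited family.

    families:  {appln_id: docdb_family_id}
    citations: iterable of (citing_appln_id, cited_appln_id)
    """
    citing = defaultdict(set)
    for citing_id, cited_id in citations:
        src = family_of(citing_id, families[citing_id])
        dst = family_of(cited_id, families[cited_id])
        if src != dst:  # assumption: skip citations inside one family
            citing[dst].add(src)
    return {fam: len(srcs) for fam, srcs in citing.items()}


# Toy data (assumption): family 101 is cited by family 202 and by two
# replenished applications that have family 0 in the older edition.
families = {1: 101, 2: 101, 3: 202, 900000001: 0, 900000002: 0}
citations = [(3, 1), (3, 2), (900000001, 1), (900000002, 2), (2, 1)]

counts = citing_family_counts(families, citations)
print(counts)
```

With the stand-in IDs, the two replenished applications count as two separate citing families instead of collapsing into a single family 0.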

I know TLS201 has NB_CITING_DOCDB_FAM, but I'm guessing this omits any citations from the set of replenished applications. Is this correct?

Cheers,
Brian