Family Cleaned Forward Patent Citations


VanessaB

Post by VanessaB » Fri Nov 30, 2018 9:40 am

Hello Geert,

I would like to count the number of family-cleaned forward citations. However, about half of my forward citations appear to have been made before the cited patent was published. The gap is not just a couple of months: a substantial number of the citations were made many years earlier.

I got this result by joining TLS228_docdb_fam_citn to the patents I am interested in, matching their docdb_family_id against cited_docdb_family_id (inner join TLS228_docdb_fam_citn t2 on t1.docdb_family_id = t2.cited_docdb_family_id), and then taking the min(earliest_filing_date) of the citing docdb_family_id.

I just do not know what to do with the citations made before the cited patent was even published/filed. Do you have any advice?

Best wishes,
Vanessa


EPO / PATSTAT Support

Re: Family Cleaned Forward Patent Citations

Post by EPO / PATSTAT Support » Mon Dec 10, 2018 10:22 am

Hello Vanessa,
To keep it simple, I only look at citations of publications; citations of applications are minimal in comparison.
The basics:
A forward citation is in principle the same thing as a backward citation seen from the other side: if A cites B, then B is cited by A.
Logically, A can only cite B if B was already published while A was not yet published.
Here is a straightforward query looking at the documents cited by WO2002US35880:

Code:

-- Documents cited by WO2002US35880
SELECT citing.appln_id, citing.publn_auth, citing.publn_nr, citing.publn_kind, citing.publn_date,
       tls212_citation.citn_origin, tls212_citation.cited_pat_publn_id, tls212_citation.citn_gener_auth,
       cited.publn_auth, cited.publn_nr, cited.publn_kind, cited.publn_date
FROM tls211_pat_publn citing
JOIN tls212_citation ON citing.pat_publn_id = tls212_citation.pat_publn_id
JOIN tls211_pat_publn cited ON tls212_citation.cited_pat_publn_id = cited.pat_publn_id
JOIN tls201_appln citing_app ON citing.appln_id = citing_app.appln_id
WHERE citing_app.appln_nr_epodoc = 'WO2002US35880'
  AND tls212_citation.cited_pat_publn_id <> 0   -- keep only citations of patent publications
ORDER BY citing.appln_id
Attachment: result1.xlsx
Result: you can see that all publication dates of the cited documents (last column) are before the publn_date of the A2 or A3.
Observe also that the citations given by the applicant (APP) are linked to the A2 publication, while the International Search Report citations (citn_origin = ISR) are linked to the A3.
Depending on your research, you might also exclude the APP citations and only consider citations that come from search and examination. (For a complete overview of the possible values, see the attribute citn_origin in the data catalog.)
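
To illustrate the exclusion of applicant citations, here is a minimal sketch using Python's sqlite3 module and a toy copy of tls212_citation. The rows are invented for the example; only the column names and the citn_origin codes come from PATSTAT, and on a real installation you would run the SELECT directly against your database.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE tls212_citation (
        pat_publn_id INTEGER,
        citn_origin TEXT,
        cited_pat_publn_id INTEGER
    )
""")
# Invented toy rows: (citing publication, citation origin, cited publication)
con.executemany(
    "INSERT INTO tls212_citation VALUES (?, ?, ?)",
    [
        (1, "APP", 10),
        (1, "APP", 11),
        (2, "ISR", 10),
        (2, "ISR", 12),
        (3, "SEA", 0),  # 0 = not a citation of a patent publication
    ],
)

# Keep citations of patent publications that were not supplied by the applicant
rows = con.execute("""
    SELECT pat_publn_id, citn_origin, cited_pat_publn_id
    FROM tls212_citation
    WHERE cited_pat_publn_id <> 0
      AND citn_origin <> 'APP'
    ORDER BY pat_publn_id, cited_pat_publn_id
""").fetchall()
print(rows)
```

Only the two ISR rows remain in this toy data set. Whether to drop APP citations at all depends on the research question.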

I then adapted the query to find all cases where the publication date of the cited document is later than the publication date of the citing document.

Code:

-- Citations where the cited document was published after the citing document
SELECT DISTINCT citing.appln_id, citing.publn_auth, citing.publn_nr, citing.publn_kind, citing.publn_date,
       tls212_citation.citn_origin, tls212_citation.cited_pat_publn_id, tls212_citation.citn_gener_auth,
       cited.publn_auth, cited.publn_nr, cited.publn_kind, cited.publn_date
FROM tls211_pat_publn citing
JOIN tls212_citation ON citing.pat_publn_id = tls212_citation.pat_publn_id
JOIN tls211_pat_publn cited ON tls212_citation.cited_pat_publn_id = cited.pat_publn_id
JOIN tls201_appln citing_app ON citing.appln_id = citing_app.appln_id
WHERE tls212_citation.cited_pat_publn_id <> 0
  AND cited.publn_date > citing.publn_date
  AND cited.publn_date < '9999-12-31'    -- 9999-12-31 is the placeholder for an unknown date
  AND citing.publn_date < '9999-12-31'
ORDER BY tls212_citation.cited_pat_publn_id
There are about 3.8 million of them; that seems a lot, but it is only about 1.3% of the total.
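
A share like this can be computed in a single pass. The sketch below uses Python's sqlite3 module with invented toy rows, so the resulting numbers are only illustrative. The choice of denominator (all citations of patent publications) is an assumption for the example, not necessarily the total used for the 1.3% above.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE tls211_pat_publn (pat_publn_id INTEGER PRIMARY KEY, publn_date TEXT);
    CREATE TABLE tls212_citation (pat_publn_id INTEGER, cited_pat_publn_id INTEGER);
""")
# Invented toy publications: (pat_publn_id, publn_date)
con.executemany("INSERT INTO tls211_pat_publn VALUES (?, ?)", [
    (1, "2005-03-10"), (2, "2001-06-01"), (3, "2007-09-20"), (4, "9999-12-31"),
])
# Invented toy citations: (citing pat_publn_id, cited_pat_publn_id)
con.executemany("INSERT INTO tls212_citation VALUES (?, ?)", [
    (1, 2),  # cited published before citing: consistent
    (1, 3),  # cited published after citing: inconsistent
    (3, 1),  # consistent
    (3, 4),  # cited date unknown (9999-12-31): not counted as inconsistent
    (3, 0),  # not a citation of a patent publication
])

# In SQLite a comparison evaluates to 0 or 1, so SUM counts the matching rows
total, late = con.execute("""
    SELECT COUNT(*),
           SUM(cited.publn_date > citing.publn_date
               AND cited.publn_date < '9999-12-31'
               AND citing.publn_date < '9999-12-31')
    FROM tls212_citation c
    JOIN tls211_pat_publn citing ON citing.pat_publn_id = c.pat_publn_id
    JOIN tls211_pat_publn cited ON cited.pat_publn_id = c.cited_pat_publn_id
    WHERE c.cited_pat_publn_id <> 0
""").fetchone()
share = late / total
print(f"{late} of {total} citations ({share:.1%}) point to a later-published document")
```

On MS SQL Server the SUM over a comparison would have to be written as SUM(CASE WHEN ... THEN 1 ELSE 0 END).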

I then looked at the distribution of those citations by origin: 3 million are US citations linked via the Pre Search Citations. These are citations that are not printed on the document; they are only available via the US databases and were added to the database later. (I assume US examiners do not take them into account as prior art.)
The next big chunk are the ones that come in via the search (see the table in the Excel sheet).
I checked where they come from: out of 740,000, 575,000 were assigned by the CN patent office.
I checked 3 of them manually: all 3 are data errors, basically wrongly assigned citations that are out of scope. (I reported them as data errors ... quality at source :)
Then I also observed that, of the 575,678 citations assigned by CN, 555,000 cite utility models.
I also checked a couple of those, and they all seem to be utility models filed on the same date as the patent by the same applicant, only published 6 months ahead of the patent. This results in a negative date difference. I assume that CN patent law has provisions for this.

Code:

-- Search citations (SEA) counted per generating authority
SELECT citn_origin, citn_gener_auth, COUNT(*) AS nr_citations
FROM [patstat2018b].[dbo].[citat]
WHERE citat.citn_origin = 'SEA'
GROUP BY citn_origin, citn_gener_auth
Maybe the EPs are worth looking at in detail, but I know that sometimes the B1 publication is cited even though the A1 was published years earlier:
https://worldwide.espacenet.com/publica ... le=en_EP
I assume that in such cases the A document should have been cited instead of the B document.

4629 out of 4891 cite a B1 publication; the corresponding A1 or A2 was probably published before the publication (and application) date of the citing document.
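
One way to check this B1 explanation is to compare the citing date with the earliest publication of the cited application rather than with the cited publication itself. The sketch below does that with Python's sqlite3 module on invented toy rows. Taking MIN(publn_date) per appln_id is an assumption made for the illustration, not something tested against the real data here.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE tls211_pat_publn (
        pat_publn_id INTEGER PRIMARY KEY, appln_id INTEGER,
        publn_kind TEXT, publn_date TEXT);
    CREATE TABLE tls212_citation (pat_publn_id INTEGER, cited_pat_publn_id INTEGER);
""")
# Invented toy publications: (pat_publn_id, appln_id, publn_kind, publn_date)
con.executemany("INSERT INTO tls211_pat_publn VALUES (?, ?, ?, ?)", [
    (1, 100, "A1", "2010-05-01"),  # citing document
    (2, 200, "A1", "2006-03-03"),  # A1 of the cited application
    (3, 200, "B1", "2012-01-15"),  # B1 of the same application (the one cited)
    (4, 300, "A1", "2013-07-07"),  # a document that really is later
])
con.executemany("INSERT INTO tls212_citation VALUES (?, ?)", [(1, 3), (1, 4)])

# For each date-inconsistent citation, flag whether an earlier publication
# of the same cited application predates the citing document.
rows = con.execute("""
    SELECT c.pat_publn_id, c.cited_pat_publn_id, cited.publn_kind,
           first_pub.earliest_date <= citing.publn_date AS explained_by_earlier_publn
    FROM tls212_citation c
    JOIN tls211_pat_publn citing ON citing.pat_publn_id = c.pat_publn_id
    JOIN tls211_pat_publn cited ON cited.pat_publn_id = c.cited_pat_publn_id
    JOIN (SELECT appln_id, MIN(publn_date) AS earliest_date
          FROM tls211_pat_publn
          GROUP BY appln_id) first_pub ON first_pub.appln_id = cited.appln_id
    WHERE cited.publn_date > citing.publn_date
    ORDER BY c.cited_pat_publn_id
""").fetchall()
print(rows)
```

In the toy data the B1 citation is flagged with 1 because its A1 appeared before the citing document, while the other citation is flagged with 0.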
Bottom line: I would exclude all cases where the publication date of the cited document is later than the publication date of the citing document.
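
Applied to family-cleaned forward citations, this exclusion could look roughly like the sketch below. It filters at publication level first and then counts distinct citing DOCDB families per cited family. It uses Python's sqlite3 module with invented toy rows. Building the count from tls212_citation rather than from tls228_docdb_fam_citn is an assumption made here so that the publication dates are available for the filter.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE tls201_appln (appln_id INTEGER PRIMARY KEY, docdb_family_id INTEGER);
    CREATE TABLE tls211_pat_publn (
        pat_publn_id INTEGER PRIMARY KEY, appln_id INTEGER, publn_date TEXT);
    CREATE TABLE tls212_citation (pat_publn_id INTEGER, cited_pat_publn_id INTEGER);
""")
# Invented toy data: family 10 is cited by families 20 and 30
con.executemany("INSERT INTO tls201_appln VALUES (?, ?)",
                [(1, 10), (5, 10), (2, 20), (3, 20), (4, 30)])
con.executemany("INSERT INTO tls211_pat_publn VALUES (?, ?, ?)", [
    (1, 1, "2005-01-01"), (5, 5, "2006-01-01"),   # cited family 10
    (2, 2, "2008-02-02"), (3, 3, "2009-03-03"),   # citing family 20
    (4, 4, "2003-04-04"),                         # citing family 30, published earlier
])
con.executemany("INSERT INTO tls212_citation VALUES (?, ?)",
                [(2, 1), (3, 1), (3, 5), (4, 1)])

QUERY = """
    SELECT cited_app.docdb_family_id, COUNT(DISTINCT citing_app.docdb_family_id)
    FROM tls212_citation c
    JOIN tls211_pat_publn citing ON citing.pat_publn_id = c.pat_publn_id
    JOIN tls211_pat_publn cited ON cited.pat_publn_id = c.cited_pat_publn_id
    JOIN tls201_appln citing_app ON citing_app.appln_id = citing.appln_id
    JOIN tls201_appln cited_app ON cited_app.appln_id = cited.appln_id
    WHERE c.cited_pat_publn_id <> 0 {date_filter}
    GROUP BY cited_app.docdb_family_id
"""
# Without the date filter, and with the exclusion of later-published cited documents
raw = con.execute(QUERY.format(date_filter="")).fetchall()
cleaned = con.execute(
    QUERY.format(date_filter="AND cited.publn_date <= citing.publn_date")
).fetchall()
print("raw:", raw, "cleaned:", cleaned)
```

In the toy data the unfiltered count for family 10 is 2 citing families, and the filtered count is 1, because the citation from family 30 predates the cited publication.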
PATSTAT Support Team
EPO - Vienna
patstat @ epo.org

