Non-patent-literature, encoding error?

Posted: Thu Apr 30, 2020 5:33 pm
by elisa
Dear PATSTAT community,

I am using PATSTAT for the first time and am working with the non-patent literature (table: tls214_npl_publn).
Several non-patent publications show strange characters, which I think is due to encoding errors.
Here is an example query that returns a problematic publication:

select *
from tls214_npl_publn
where npl_publn_id = 985931482

This returns a string that is not intelligible.
Has anyone had the same issue? Are you aware of a solution?

Thanks a lot,
Elisa

ps. I am using Azure Data Studio.

Re: Non-patent-literature, encoding error?

Posted: Mon May 04, 2020 10:58 am
by EPO / PATSTAT Support
Hello Elisa,
The data is correct: that NPL document was cited in the Chinese search report, and no translation is currently available. (In fact, there are 2 CN NPLs cited.)
The query below retrieves from PATSTAT all the citations made by the CN application that cites that NPL. There are 3 references to patent publications and 2 citations to (CN) NPL. My colleagues from the Asian patent information desk confirmed that those citations are correct and were indeed cited in the CN search report, which you can consult through Global Dossier if you understand Chinese:
https://worldwide.espacenet.com/patent/ ... 151621859

SELECT citing.appln_id, citing.publn_auth, citing.publn_nr, citing.publn_kind,
       citing.publn_date, tls212_citation.*, cited.publn_auth, cited.publn_nr,
       cited.publn_kind, cited.publn_date, tls214_npl_publn.*
FROM tls211_pat_publn citing
  JOIN tls212_citation
    ON citing.pat_publn_id = tls212_citation.pat_publn_id
  JOIN tls211_pat_publn cited
    ON tls212_citation.cited_pat_publn_id = cited.pat_publn_id
  JOIN tls214_npl_publn
    ON tls212_citation.cited_npl_publn_id = tls214_npl_publn.npl_publn_id
  JOIN tls201_appln citing_app
    ON citing.appln_id = citing_app.appln_id
WHERE citing.pat_publn_id = 449498851
ORDER BY citn_id

Geert Boedt

Re: Non-patent-literature, encoding error?

Posted: Mon May 04, 2020 11:18 am
by mkracker
Hi Elisa,

Make sure you imported the data as UTF-8. In UTF-8, a Chinese character is typically encoded as 3 bytes.
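
To illustrate why a wrong import encoding produces unreadable text, here is a minimal Python sketch. The sample string is made up for the example and is not taken from PATSTAT, and Latin-1 is only an assumed stand-in for whatever single-byte encoding an import tool might wrongly apply.

```python
# Sample Chinese text chosen for illustration (not a PATSTAT record).
text = "专利"

# Each of these characters takes 3 bytes in UTF-8.
utf8_bytes = text.encode("utf-8")
print(len(text), "characters ->", len(utf8_bytes), "bytes")

# Reading the same bytes with a single-byte encoding (Latin-1 is an
# assumption here) turns every byte into its own character, which is
# the kind of garbage you see after a wrong import.
garbled = utf8_bytes.decode("latin-1")
print(repr(garbled))

# Because Latin-1 maps every byte value, the damage is reversible in this case.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired == text)
```

If the strings in your database look like the garbled variant, the bytes were most likely interpreted with the wrong encoding during import, so re-importing as UTF-8 is the cleaner fix.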

The result of your query should look like the attached screenshot.

Best regards,
Martin