Non-patent-literature, encoding error?


Posts: 1
Joined: Thu Apr 30, 2020 5:21 pm

Non-patent-literature, encoding error?

Post by elisa » Thu Apr 30, 2020 5:33 pm

Dear PATSTAT community,

I am using PATSTAT for the first time and I am working with the non-patent literature (table: tls214_npl_publn).
I have found several non-patent publications that display strange characters, which I think is due to encoding errors.
Here is an example of a query that returns a problematic publication:

select *
from tls214_npl_publn
where npl_publn_id=985931482

This returns a string that is not intelligible.
Has anyone had the same issue? Are you aware of a solution?

Thanks a lot,

P.S. I am using Azure Data Studio.

Posts: 230
Joined: Thu Feb 22, 2007 5:33 pm

Re: Non-patent-literature, encoding error?

Post by EPO / PATSTAT Support » Mon May 04, 2020 10:58 am

Hello Elisa,
The data is correct: that NPL document was cited in a Chinese search report, and no translation is currently available. (In fact, 2 CN NPLs are cited.)
The query below retrieves from PATSTAT all the citations made by the CN application that cites that NPL. There are 3 references to patent publications and 2 citations of (CN) NPL. My colleagues from the Asian patent information desk confirmed that those citations are correct and were indeed cited in the CN search report (which you can consult through Global Dossier... if you understand Chinese: ... 151621859 ).

Code:

SELECT  citing.appln_id, citing.publn_auth, citing.publn_nr, citing.publn_kind, 
citing.publn_date, tls212_citation.*, cited.publn_auth, cited.publn_nr,
cited.publn_kind, cited.publn_date, tls214_npl_publn.*
FROM tls211_pat_publn citing join tls212_citation 
  	on citing.pat_publn_id = tls212_citation.pat_publn_id
  join tls211_pat_publn cited 
  	on tls212_citation.cited_pat_publn_id = cited.pat_publn_id
  join tls214_npl_publn 
  	on tls212_citation.cited_npl_publn_id = tls214_npl_publn.npl_publn_id 
  join tls201_appln citing_app 
  	on citing.appln_id = citing_app.appln_id
WHERE citing.pat_publn_id = 449498851
ORDER BY citn_id
Geert Boedt
PATSTAT Support Team
EPO - Vienna
patstat @

Posts: 111
Joined: Wed Sep 04, 2013 6:17 am
Location: Vienna

Re: Non-patent-literature, encoding error?

Post by mkracker » Mon May 04, 2020 11:18 am

Hi Elisa,

Make sure you imported the data as UTF-8. In UTF-8, a Chinese character is typically encoded as 3 bytes.
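
To illustrate the 3-byte point, here is a minimal Python sketch. The sample text and the helper name utf8_byte_lengths are made up for illustration and are not taken from PATSTAT. It shows that each Chinese character takes 3 bytes in UTF-8, and what happens when those bytes are read with a single-byte code page instead (Latin-1 is used here only as an example). Whether this is what happened in your import is an assumption you would need to check.

```python
# Illustration only: the sample text is hypothetical, not a PATSTAT record.

def utf8_byte_lengths(text):
    """Return the number of UTF-8 bytes used by each character of text."""
    return [len(ch.encode("utf-8")) for ch in text]

sample = "专利文献"  # four Chinese characters
utf8_bytes = sample.encode("utf-8")

print(utf8_byte_lengths(sample))      # bytes per character
print(len(sample), len(utf8_bytes))   # character count vs. byte count

# Reading the same bytes as Latin-1 turns every byte into its own
# character, which is the typical "strange characters" symptom.
garbled = utf8_bytes.decode("latin-1")
print(repr(garbled))

# If no bytes were lost, the mistake can be reversed by re-encoding
# with the wrong code page and decoding as UTF-8.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired == sample)
```

If the text in your table looks like the garbled variant, the import encoding is the likely culprit. If it shows proper Chinese characters, the data arrived intact.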

The result of your query should look like the attached screenshot.

Best regards,
