
Wrong UTF-8 caracters

Posted: Tue Mar 11, 2014 11:02 am
by AstriumServices
All,

During a REST request to the OPS server on March 5th, I received an XML response containing an invalid UTF-8 character sequence " ".
Neither a standard text editor such as UltraEdit nor my C# HttpWebRequest code is able to parse this sequence properly.
Moreover, I could not find this sequence in the UTF-8 code chart.
The sequence appears for a few families (for example famn=33457358) captured during my update run, but not for all families.

Could you confirm whether these invalid characters originate on the OPS server side?
If so, should I expect other kinds of invalid sequences in data retrieved from the OPS server, and how can I work around the problem?

Thanks
Michel

Re: Wrong UTF-8 caracters

Posted: Tue Mar 11, 2014 3:48 pm
by EPO / OPS Support
Dear user,

I am sorry, but OPS does not offer a way to search by famn. Could you give me the patent number of a document whose family contains the members in question? Also, have you tried the same request via our Developers portal (API console): https://developers.epo.org/?

Thank you in advance,

OPS support

Re: Wrong UTF-8 caracters

Posted: Thu Mar 13, 2014 9:15 am
by AstriumServices
Thanks for your quick response.
Please find below the request that returns the invalid UTF-8 character sequence in my XML file.
https://ops.epo.org/3.1/rest-services/p ... n=33457358

In IE 9, switch the encoding from UTF-8 to Western European (ISO), for instance, to see the sequence.
It appears for some members of the family in the epodoc applicant or epodoc inventor field.

I didn't find this sequence in standard UTF-8 code charts, and it was not present in my previous requests, which I made weekly over a two-month period.
Any idea what's wrong with this?

Re: Wrong UTF-8 caracters

Posted: Thu Mar 13, 2014 12:49 pm
by EPO / OPS Support
Dear user,

I see I was a bit premature in claiming that famn does not work in CQL. It seems that several new field identifiers were added in the latest documentation. Thanks for pointing that out to us :-)

As for the characters, when I display the response on my computer I don't see anything wrong with the epodoc format, the inventor or the applicant, so I will ask our technical team whether they have any idea what is causing your problem.

I will get back to you once I hear something from our colleagues,

Kind regards,

OPS support

Re: Wrong UTF-8 caracters

Posted: Thu Mar 13, 2014 1:57 pm
by EPO / OPS Support
Dear user,

We are very sorry, but no one here can reproduce your problem in our system. We all get the proper encoding, and all 3 symbols we see are completely valid UTF-8 (0xE2 0x20AC 0x201A). We can only conclude that this issue is at your end, and we cannot be of help this time.
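For readers wondering where those three values come from, here is a minimal Python sketch. It assumes the character in the response is U+2002 (EN SPACE) and that the viewer decoded the UTF-8 bytes as Windows-1252; both are assumptions for illustration, not something confirmed by OPS.

```python
# U+2002 (EN SPACE) encoded as UTF-8 is a three-byte sequence.
en_space = "\u2002"
utf8_bytes = en_space.encode("utf-8")

# Assumption: the viewer applies a single-byte Western code page
# (Windows-1252 here), so each byte becomes its own character.
misread = utf8_bytes.decode("cp1252")

byte_hex = [f"0x{b:02X}" for b in utf8_bytes]
code_points = [f"0x{ord(ch):X}" for ch in misread]

print(byte_hex)      # the raw UTF-8 bytes
print(code_points)   # the code points of the three misread characters
```

Under these assumptions the bytes are 0xE2 0x80 0x82, and the misread characters have the code points 0xE2, 0x20AC and 0x201A, which matches the three symbols quoted above.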

Kind regards,

OPS support

Re: Wrong UTF-8 caracters

Posted: Mon Mar 17, 2014 5:56 pm
by AstriumServices
Dear support,

Deeper investigation has led me to identify the offending UTF-8 character as [EN SPACE], i.e. [\u2002].
I confirm that the problem comes from a downstream process: MySQL fails to convert this character from UTF-8 to UTF-16 when I export the database content to a UTF-16 CSV file.
Replacing the character with a standard [SPACE] during the import process solves the issue.
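The replacement step could look roughly like the following Python sketch. The function name and the sample value are hypothetical; the actual import process in this thread is not shown.

```python
EN_SPACE = "\u2002"

def normalize_en_space(text: str) -> str:
    """Replace every EN SPACE (U+2002) with an ordinary space (U+0020)."""
    return text.replace(EN_SPACE, " ")

# Hypothetical field value containing an EN SPACE before the country code.
sample = "SMITH JOHN\u2002[GB]"
cleaned = normalize_en_space(sample)
print(repr(cleaned))
```

Applying this to each text field before it is written to the database should keep the EN SPACE out of the later export.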

The misleading points were that this [EN SPACE] character appeared only in recent requests and not before, and that standard text editors such as UltraEdit or Visual Studio do not display it properly.
Thanks for your help.
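Since text editors may not show such a character clearly, one way to spot it is to list every non-ASCII character with its code point and Unicode name. This is a minimal Python sketch; the function name and the sample string are hypothetical.

```python
import unicodedata

def describe_non_ascii(text: str) -> list[tuple[int, str, str]]:
    """Return (index, code point, Unicode name) for each non-ASCII character."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for i, ch in enumerate(text)
        if ord(ch) > 0x7F
    ]

# Hypothetical field value with an EN SPACE hidden in it.
sample = "SMITH JOHN\u2002[GB]"
for entry in describe_non_ascii(sample):
    print(entry)
```

Running this over the decoded response text would flag an EN SPACE by name even when it looks like an ordinary space on screen.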

Re: Wrong UTF-8 caracters

Posted: Thu Apr 27, 2017 8:14 pm
by Kerstin Thoma
3 years later and I find these characters. â€

To Reproduce:
Http://ops.epo.org/3.1/rest-services/pu ... iblio.json

Bibliographic-data / parties / inventors / inventor / [0] / inventor-name / name / $
Value: RAST UWEâ €, [DE]

Seen in: Chrome, Firefox and Edge.
You can also see these characters in https://developers.epo.org.

Some applicant-name (@ data-format: epodoc) also have these characters.

Is this a temporary problem?
Also, is there a way to get the country value for each inventor separately?
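As a client-side fallback, the country code could be split off the name string. This Python sketch assumes the value always has the shape "NAME, [CC]" as in the example above, possibly with stray whitespace such as an EN SPACE before the comma; the function name is hypothetical and the format is an assumption based on that single value.

```python
import re

# Assumed shape: "<NAME>, [<CC>]". In Python 3, \s also matches
# Unicode whitespace such as EN SPACE (U+2002).
_NAME_COUNTRY = re.compile(r"^(?P<name>.*?)\s*,\s*\[(?P<country>[A-Z]{2})\]\s*$")

def split_name_and_country(value: str):
    """Return (name, country_code); country_code is None if no match."""
    match = _NAME_COUNTRY.match(value)
    if match is None:
        return value.strip(), None
    return match.group("name"), match.group("country")

print(split_name_and_country("RAST UWE\u2002, [DE]"))
```

Values that do not follow the assumed shape come back unchanged (apart from trimming) with None as the country.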

Thanks for your support,
Best Regards
Kerstin Thoma.

Re: Wrong UTF-8 caracters

Posted: Fri Apr 28, 2017 9:13 am
by EPO / OPS Support
Hi,

When I do a search using API Console I get this:

[Attachment: Capture4.PNG]

Kind regards,
OPS support

Re: Wrong UTF-8 caracters

Posted: Fri Apr 28, 2017 1:38 pm
by Kerstin Thoma
When I do a search using API Console I get this:

Image

I'm working on Windows 10 and see this in Chrome, Firefox, Edge and Opera.
Thank you very much for your support!

Re: Wrong UTF-8 caracters

Posted: Fri Apr 28, 2017 1:42 pm
by EPO / OPS Support
You are using 3.1 instead of the official production environment, which is 3.2. Please check the announcements section of this forum. Version 3.1 is no longer updated.

Regards,

OPS support