
Re: Wrong UTF-8 characters

Posted: Thu May 04, 2017 9:45 am
by martien
Hello,

The bad characters are not a problem of the interface.
They seem to be inserted when the epodoc/docdb data are generated, e.g. for inventor or applicant data. The country code is appended to the name within the same field.

Have a look at the XML transformation (in Perl):


$data->[0]->{'bibliographic-data'}->[0]->{'parties'}->[0]->{'applicants'}->[0]->{'applicant'}->[0]->{'applicant-name'}->[0]->{'name'}->[0]->{'content'} = "RAST UWE\x{2002}[DE]";
$data->[0]->{'bibliographic-data'}->[0]->{'parties'}->[0]->{'applicants'}->[0]->{'applicant'}->[0]->{'data-format'} = 'epodoc';
$data->[0]->{'bibliographic-data'}->[0]->{'parties'}->[0]->{'applicants'}->[0]->{'applicant'}->[0]->{'sequence'} = '1';
$data->[0]->{'bibliographic-data'}->[0]->{'parties'}->[0]->{'applicants'}->[0]->{'applicant'}->[1]->{'applicant-name'}->[0]->{'name'}->[0]->{'content'} = " SCHMIDT KUPPLUNG GMBH\x{2002}[DE]";
$data->[0]->{'bibliographic-data'}->[0]->{'parties'}->[0]->{'applicants'}->[0]->{'applicant'}->[1]->{'data-format'} = 'epodoc';
$data->[0]->{'bibliographic-data'}->[0]->{'parties'}->[0]->{'applicants'}->[0]->{'applicant'}->[1]->{'sequence'} = '2';
$data->[0]->{'bibliographic-data'}->[0]->{'parties'}->[0]->{'applicants'}->[0]->{'applicant'}->[2]->{'applicant-name'}->[0]->{'name'}->[0]->{'content'} = 'RAST UWE, ';
$data->[0]->{'bibliographic-data'}->[0]->{'parties'}->[0]->{'applicants'}->[0]->{'applicant'}->[2]->{'data-format'} = 'original';
$data->[0]->{'bibliographic-data'}->[0]->{'parties'}->[0]->{'applicants'}->[0]->{'applicant'}->[2]->{'sequence'} = '1';
$data->[0]->{'bibliographic-data'}->[0]->{'parties'}->[0]->{'applicants'}->[0]->{'applicant'}->[3]->{'applicant-name'}->[0]->{'name'}->[0]->{'content'} = 'SCHMIDT-KUPPLUNG GMBH';
$data->[0]->{'bibliographic-data'}->[0]->{'parties'}->[0]->{'applicants'}->[0]->{'applicant'}->[3]->{'data-format'} = 'original';
$data->[0]->{'bibliographic-data'}->[0]->{'parties'}->[0]->{'applicants'}->[0]->{'applicant'}->[3]->{'sequence'} = '2';
$data->[0]->{'bibliographic-data'}->[0]->{'parties'}->[0]->{'inventors'}->[0]->{'inventor'}->[0]->{'data-format'} = 'epodoc';
$data->[0]->{'bibliographic-data'}->[0]->{'parties'}->[0]->{'inventors'}->[0]->{'inventor'}->[0]->{'inventor-name'}->[0]->{'name'}->[0]->{'content'} = "RAST UWE\x{2002}[DE]";
$data->[0]->{'bibliographic-data'}->[0]->{'parties'}->[0]->{'inventors'}->[0]->{'inventor'}->[0]->{'sequence'} = '1';
$data->[0]->{'bibliographic-data'}->[0]->{'parties'}->[0]->{'inventors'}->[0]->{'inventor'}->[1]->{'data-format'} = 'original';
$data->[0]->{'bibliographic-data'}->[0]->{'parties'}->[0]->{'inventors'}->[0]->{'inventor'}->[1]->{'inventor-name'}->[0]->{'name'}->[0]->{'content'} = 'RAST UWE';
$data->[0]->{'bibliographic-data'}->[0]->{'parties'}->[0]->{'inventors'}->[0]->{'inventor'}->[1]->{'sequence'} = '1';


The original-format data are OK; epodoc/docdb insert \x{2002} (U+2002, EN SPACE) instead of a plain blank.
If you want to use the field to look up more information (e.g. about an inventor), you will
have to correct the field yourself.
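
A minimal sketch of such a correction in Perl, assuming the name always has the shape shown above (name, EN SPACE, two-letter country code in square brackets). The helper name split_epodoc_name is hypothetical and not part of any library; it only illustrates the idea:

```perl
use strict;
use warnings;

# Hypothetical helper (not from any library): split an epodoc-style name
# such as "RAST UWE\x{2002}[DE]" into the bare name and the country code.
# Assumption: the country is a two-letter code in square brackets at the end.
sub split_epodoc_name {
    my ($raw) = @_;
    my $name = $raw;
    my $country;

    # Strip a trailing "[XX]" together with the EN SPACE (or other
    # whitespace) in front of it, and remember the country code.
    if ($name =~ s/[\x{2002}\s]*\[([A-Z]{2})\]\s*\z//) {
        $country = $1;
    }

    # Replace any remaining EN SPACE with a plain blank and trim the ends.
    $name =~ s/\x{2002}/ /g;
    $name =~ s/^\s+|\s+$//g;

    return ($name, $country);
}

my ($name, $country) = split_epodoc_name("RAST UWE\x{2002}[DE]");
print "$name | $country\n";    # prints "RAST UWE | DE"
```

Names without a country suffix (the original-format entries) pass through with only the whitespace trimmed, and $country stays undefined in that case, so check it before use.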