Searchable PDF

This space is made available to users of the Open Patent Services (OPS) web service and now also to users of the EPO's bulk data subscription products, such as 14. EPO worldwide bibliographic database (DOCDB), 14.11 EPO worldwide legal status database (INPADOC), 14.12 EP full text data, 14.1 EP bibliographic data (EBD), and more.

Users can ask each other questions, exchange experiences and solutions, and post ideas. The moderator will use this space to announce changes or other relevant information.

Romina Mossi
Posts: 14
Joined: Tue Dec 05, 2017 10:18 pm

Searchable PDF

Post by Romina Mossi » Thu Dec 14, 2017 8:15 pm

Hi!

In response to an image request, the document format options returned are TIFF and PDF:

<ops:document-format-options>
<ops:document-format>application/pdf</ops:document-format>
<ops:document-format>application/tiff</ops:document-format>
</ops:document-format-options>

Could you please tell me whether the PDFs are searchable (OCR)? In Epoline the PDF documents are searchable, and it would be nice to have searchable PDFs in OPS as well.
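
For anyone who wants to check the offered formats programmatically, here is a minimal Python sketch that reads the fragment quoted above. The namespace URI bound to the `ops` prefix and the helper name `list_document_formats` are assumptions made for this illustration; please verify the namespace against a real OPS response.

```python
import xml.etree.ElementTree as ET

# Assumption: the "ops" prefix is bound to this namespace URI.
# Check the root element of a real OPS response before relying on it.
OPS_NS = "http://ops.epo.org"

# The fragment quoted in the post, wrapped with a namespace declaration
# so that it parses as a standalone document.
SAMPLE = f"""<ops:document-format-options xmlns:ops="{OPS_NS}">
  <ops:document-format>application/pdf</ops:document-format>
  <ops:document-format>application/tiff</ops:document-format>
</ops:document-format-options>"""


def list_document_formats(xml_text, ns_uri=OPS_NS):
    """Return the MIME types listed in a document-format-options fragment."""
    root = ET.fromstring(xml_text)
    tag = f"{{{ns_uri}}}document-format"
    # iter() walks the whole tree, so this also works when the fragment
    # is nested inside a larger response document.
    return [el.text.strip() for el in root.iter(tag) if el.text]


formats = list_document_formats(SAMPLE)
print(formats)
```

Note that the MIME type alone does not say whether a PDF carries a text layer; it only confirms which container formats are on offer.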

Thank you very much and kind regards
Romina Mossi


EPO / OPS Support
Posts: 1298
Joined: Thu Feb 22, 2007 5:32 pm

Re: Searchable PDF

Post by EPO / OPS Support » Fri Dec 15, 2017 7:10 am

Hi,

Our PDFs are data received from offices worldwide, which is why they mostly come in different formats. We can only load what we get; we are not a commercial company producing patent data, so we don't have much influence on what we receive. But we do offer character-coded full text (constituents: full text, claims, description) for some countries, including EP.

Regards,
OPS support


Romina Mossi
Posts: 14
Joined: Tue Dec 05, 2017 10:18 pm

Re: Searchable PDF

Post by Romina Mossi » Fri Dec 15, 2017 10:01 am

Hi,
Thank you for your answer. I am aware that you depend on the quality of what you receive. However, Swiss patents are sent to you in searchable format, yet in Espacenet/OPS they are no longer searchable...
I was also thinking of the EP patents, which are searchable in Epoline but not in OPS.
Some years ago, when I asked about the reason at an OPS user meeting in Vienna, they explained that the PDFs for Epoline and Espacenet/OPS come from different databases, one searchable and one not. They did not know why the Swiss patents were transformed.
They said they would think about it.
That is the reason for my question. I think it's a pity to have OCRed documents and transform them back into images.
Can you please tell me whether this will be changed one day?
Thank you!


EPO / OPS Support
Posts: 1298
Joined: Thu Feb 22, 2007 5:32 pm

Re: Searchable PDF

Post by EPO / OPS Support » Fri Dec 15, 2017 1:14 pm

Hi,

Please note that in the meantime we have already changed databases, so whatever you were told then no longer applies. Our new "MOSES" is capable of storing different formats, which BNS was not. BNS could only store pure image data. We are only now finishing the BNS-to-MOSES project, so in the future we will be able to add different sets of data to MOSES too.

But as I said, we have a better option than searchable PDF: we offer character-coded full text for EP, CH and a few other countries, also in OPS.
We also have the PDF/A format of EP documents in the European Publication Server, which has a web service as well, and in 14.12 EP full text data (raw data bulk product), and you can use those if you don't want the XML full text.
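
As a rough illustration of the character-coded full-text route, here is a small Python sketch that only builds the request URL for one of the constituents named earlier in this thread. The base URL, the path layout and the helper name `fulltext_url` are assumptions for this example and should be checked against the current OPS documentation. A real request also needs an access token, which is not shown here.

```python
from urllib.parse import quote

# Assumption: base URL and path layout of the OPS REST service.
# Verify both against the current OPS documentation before use.
OPS_BASE = "https://ops.epo.org/3.2/rest-services"

# Constituents mentioned in this thread: full text, claims, description.
ALLOWED_CONSTITUENTS = {"fulltext", "claims", "description"}


def fulltext_url(number, constituent="description", number_format="epodoc"):
    """Build a (hypothetical) URL for a character-coded full-text request."""
    if constituent not in ALLOWED_CONSTITUENTS:
        raise ValueError(f"unknown constituent: {constituent!r}")
    # Percent-encode the publication number so it is safe as a path segment.
    safe_number = quote(number, safe="")
    return (
        f"{OPS_BASE}/published-data/publication/"
        f"{number_format}/{safe_number}/{constituent}"
    )


print(fulltext_url("EP1000000", "claims"))
```

The returned XML can then be indexed or searched directly, which avoids depending on a text layer in the PDF.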

Regards,
OPS support


Romina Mossi
Posts: 14
Joined: Tue Dec 05, 2017 10:18 pm

Re: Searchable PDF

Post by Romina Mossi » Mon Dec 18, 2017 8:54 pm

Thank you very much for your answer.
Regards
Romina Mossi

