docdb biblio queries are extremely slow for recent US patents


jalipert
Posts: 35
Joined: Wed Nov 15, 2017 5:16 am

docdb biblio queries are extremely slow for recent US patents

Post by jalipert » Tue Feb 25, 2020 3:28 pm

Example:

https://ops.epo.org/3.2/rest-services/p ... cdb/biblio
US.9750499.B2

takes over a minute to get a response, whereas if I take an older patent such as:
US.7123123.B2
the response is nearly instant.

This time delay seems even worse in batch. When querying this batch of 10 patents at once, for example:
US.9750499.B2
US.9750498.B2
US.9743994.B2
US.9743929.B2
US.9743928.B2
US.9743927.B2
US.9737355.B2
US.9737303.B2
US.9737302.B2
US.9737301.B2

I let Postman wait for over 40 minutes, and still no response from the API. Any idea why this is happening and if it might be fixed soon?

Thanks!
Joe
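
For anyone reproducing the batch request outside Postman, here is a minimal Python sketch that sends the numbers as a newline-separated POST body and gives up after a timeout instead of waiting indefinitely. The `text/plain` content type, the bearer-token header, and the function names are assumptions, not taken from this thread; the endpoint URL is passed in by the caller because it is abbreviated above.

```python
import urllib.request


def build_batch_request(url: str, numbers: list[str], token: str) -> urllib.request.Request:
    """Build a POST request with one docdb number per line in the body.

    Assumptions: the service accepts a text/plain body and an OAuth
    bearer token; neither detail is confirmed in this thread.
    """
    body = "\n".join(numbers).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "text/plain",
            "Authorization": f"Bearer {token}",
        },
    )


def send(request: urllib.request.Request, timeout_s: float = 60.0) -> bytes:
    """Send the request and return the raw response body.

    The timeout applies to each blocking socket operation, so a server
    that stays silent for timeout_s seconds raises an exception instead
    of hanging for tens of minutes.
    """
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return response.read()
```

Calling `send(build_batch_request(url, numbers, token), timeout_s=120)` would then either return the XML or fail within a bounded time, which makes the slow cases easier to log.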


jalipert
Posts: 35
Joined: Wed Nov 15, 2017 5:16 am

Re: docdb biblio queries are extremely slow for recent US patents

Post by jalipert » Tue Feb 25, 2020 4:08 pm

The issue seems to have resolved itself as of 4:08 CET.

- Joe


jalipert
Posts: 35
Joined: Wed Nov 15, 2017 5:16 am

Re: docdb biblio queries are extremely slow for recent US patents

Post by jalipert » Wed Feb 26, 2020 6:06 pm

Having the issue again today. Making an API request for the biblio data of docdb for this list of patents just runs indefinitely with no response. Can someone from the support staff try this and see if it can be reproduced?

API endpoint: https://ops.epo.org/3.2/rest-services/p ... cdb/biblio

Post body:
US.9750499.B2
US.9750498.B2
US.9743994.B2
US.9743929.B2
US.9743928.B2
US.9743927.B2
US.9737355.B2
US.9737303.B2
US.9737302.B2
US.9737301.B2


EPO / OPS Support
Posts: 1298
Joined: Thu Feb 22, 2007 5:32 pm

Re: docdb biblio queries are extremely slow for recent US patents

Post by EPO / OPS Support » Thu Feb 27, 2020 10:04 am

Hi,

What does URL throttling say when you do that? Is the system overloaded at the time? Is it really only US data, or do other documents queried at the same time load more quickly?


Regards,
Vesna for OPS support
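
To answer the throttling question programmatically, the throttling state can be read from the response headers. The sketch below assumes the header is named `X-Throttling-Control` and looks like `idle (images=green:200, inpadoc=green:60, ...)`; that name and format are assumptions based on typical OPS responses, not something stated in this thread, and `parse_throttling` is a hypothetical helper.

```python
import re


def parse_throttling(header: str) -> dict:
    """Parse a throttling header value into a status and per-service limits.

    Assumed format (not confirmed in this thread):
        'idle (images=green:200, inpadoc=green:60, other=green:1000)'
    """
    match = re.match(r"\s*(\w+)\s*\((.*)\)\s*$", header)
    if match is None:
        raise ValueError(f"unrecognised throttling header: {header!r}")
    status, body = match.groups()
    services = {}
    for item in body.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate an empty or trailing entry
        name, _, rest = item.partition("=")
        colour, _, limit = rest.partition(":")
        services[name] = (colour, int(limit))
    return {"status": status, "services": services}
```

A status other than `idle`, or a service that is not `green`, would point to server load rather than to the specific documents requested.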


jalipert
Posts: 35
Joined: Wed Nov 15, 2017 5:16 am

Re: docdb biblio queries are extremely slow for recent US patents

Post by jalipert » Thu Feb 27, 2020 6:01 pm

Hi Vesna,

I just ran a comparison between two patents: US7123123B2 and US9750499B2.

US.7123123.B2 took 1 second to get a response from the API.
US.9750499.B2 took 40 seconds to get a response from the API.

See the images below of the Postman responses. Maybe it has something to do with the second patent having a larger response size, but 40 seconds to get biblio for 1 patent is pretty much unusable for our solution. Are you seeing similar response times on your end?

Image

Image

Best,
Joe
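
The comparison above can be repeated for a whole list by timing each number separately and recording the response size. This is a minimal sketch: `time_requests` is a hypothetical helper, and `fetch` stands for whatever callable performs the actual OPS request for a single number and returns the response body.

```python
import time
from typing import Callable, Iterable


def time_requests(
    numbers: Iterable[str],
    fetch: Callable[[str], bytes],
) -> list[tuple[str, float, int]]:
    """Fetch each number on its own and measure latency and size.

    Returns (number, seconds, bytes) tuples, slowest first, so that
    outliers in a batch stand out immediately.
    """
    results = []
    for number in numbers:
        start = time.perf_counter()
        body = fetch(number)
        elapsed = time.perf_counter() - start
        results.append((number, elapsed, len(body)))
    # Sort by elapsed time, largest first
    return sorted(results, key=lambda row: row[1], reverse=True)
```

Running this over the ten numbers from the batch shows whether the delay is spread evenly or concentrated in a few documents.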


jalipert
Posts: 35
Joined: Wed Nov 15, 2017 5:16 am

Re: docdb biblio queries are extremely slow for recent US patents

Post by jalipert » Thu Feb 27, 2020 6:26 pm

OK, so I dug a little deeper and narrowed the issue down to these three patents in my list, all of which have thousands of references cited and thus a large response size. See below:

US.9750499.B2 - 40 seconds, 3.87 MB, ~5,000 references cited
US.9743929.B2 - 45 seconds, 3.79 MB, ~4,900 references cited
US.9737301.B2 - 47 seconds, 3.87 MB, ~5,000 references cited

As it happens, we don't even need the references cited for our purposes. We only really need the basic info like Inventor, Assignee, Title, Abstract, Issue Date, etc. Given that some patents have many thousands of references cited lumped in with the bibliographic data, would it be possible to request an update to the API that makes the return of references cited optional? That would likely improve the API's response time for us considerably, and also save you lots of bandwidth.

Thanks!
Joe
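
Until the API offers such an option, the citations can only be discarded on the client after download. The sketch below drops every element with a given local name from the response XML; it assumes the citations sit in an element named `references-cited`, which is an assumption about the response schema rather than something shown in this thread. Note that this does not shorten the 40-second transfer; it only reduces what downstream code has to parse and store.

```python
import xml.etree.ElementTree as ET


def strip_elements(xml_text: str, local_name: str = "references-cited") -> str:
    """Return the XML with all elements of the given local name removed.

    The namespace is ignored, so '{any-namespace}references-cited'
    matches as well. The element name is an assumed default.
    """
    root = ET.fromstring(xml_text)
    # Materialise the traversal first: mutating a tree while iterating
    # over it lazily is not safe.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag.rsplit("}", 1)[-1] == local_name:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")
```

The remaining elements (title, applicants, inventors, and so on) are left untouched, although ElementTree may rewrite namespace prefixes when serialising.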


EPO / OPS Support
Posts: 1298
Joined: Thu Feb 22, 2007 5:32 pm

Re: docdb biblio queries are extremely slow for recent US patents

Post by EPO / OPS Support » Thu Feb 27, 2020 10:11 pm

Hi Joe,

Unfortunately, this is not possible at present, but I agree with you: citations could be optional. I will see whether we can implement a change like this in the future. It makes sense to offer citations as a separate constituent, outside the basic biblio. On the other hand, US patents are known for their size and number of citations, so this is mostly an issue for some PCT data and US national data. Still, let me see what can be done here. I will talk to the developers on Monday and let you know if and when we can do something.

Regards
Vesna for OPS forum


jalipert
Posts: 35
Joined: Wed Nov 15, 2017 5:16 am

Re: docdb biblio queries are extremely slow for recent US patents

Post by jalipert » Mon Mar 02, 2020 6:13 pm

Thank you for looking into supporting this feature. That would be an excellent addition to the API.

Best,
Joe


EPO / OPS Support
Posts: 1298
Joined: Thu Feb 22, 2007 5:32 pm

Re: docdb biblio queries are extremely slow for recent US patents

Post by EPO / OPS Support » Tue Mar 03, 2020 7:19 am

Hi

I spoke with the developers, and we are already exploring this possibility to see whether we can implement it in the next OPS version (the version that will connect with the same data as in New Espacenet).


Regards,
Vesna for OPS

