
docdb biblio queries are extremely slow for recent US patents

Posted: Tue Feb 25, 2020 3:28 pm
by jalipert
Example:

https://ops.epo.org/3.2/rest-services/p ... cdb/biblio
US.9750499.B2

takes over a minute to get a response, whereas if I take an older patent such as:
US.7123123.B2
the response is nearly instant.

This time delay seems even worse in batch. When querying this batch of 10 patents at once, for example:
US.9750499.B2
US.9750498.B2
US.9743994.B2
US.9743929.B2
US.9743928.B2
US.9743927.B2
US.9737355.B2
US.9737303.B2
US.9737302.B2
US.9737301.B2

I let Postman wait for over 40 minutes, and still no response from the API. Any idea why this is happening and if it might be fixed soon?

Thanks!
Joe

Re: docdb biblio queries are extremely slow for recent US patents

Posted: Tue Feb 25, 2020 4:08 pm
by jalipert
The issue seems to have resolved itself as of 4:08 CET.

- Joe

Re: docdb biblio queries are extremely slow for recent US patents

Posted: Wed Feb 26, 2020 6:06 pm
by jalipert
Having the issue again today. Making an API request for the biblio data of docdb for this list of patents just runs indefinitely with no response. Can someone from the support staff try this and see if it can be reproduced?

API endpoint: https://ops.epo.org/3.2/rest-services/p ... cdb/biblio

Post body:
US.9750499.B2
US.9750498.B2
US.9743994.B2
US.9743929.B2
US.9743928.B2
US.9743927.B2
US.9737355.B2
US.9737303.B2
US.9737302.B2
US.9737301.B2
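
For anyone reproducing this, here is a minimal sketch of how a post body like the one above could be assembled. It assumes the body is simply one docdb-format number per line, as shown in this post; `to_docdb` and `build_batch_body` are hypothetical helper names, and the compact-number pattern is an assumption that only covers simple cases such as US7123123B2.

```python
import re
from typing import Iterable

# Assumed compact form: two-letter country code, digits, kind code (e.g. "B2").
_COMPACT = re.compile(r"^([A-Z]{2})(\d+)([A-Z]\d?)$")


def to_docdb(number: str) -> str:
    """Turn 'US9750499B2' into 'US.9750499.B2'; dotted input passes through."""
    cleaned = number.strip().upper()
    if cleaned.count(".") == 2:
        return cleaned
    match = _COMPACT.match(cleaned)
    if match is None:
        raise ValueError(f"unrecognised publication number: {number!r}")
    return ".".join(match.groups())


def build_batch_body(numbers: Iterable[str]) -> str:
    """Join the numbers one per line, matching the post body shown above."""
    return "\n".join(to_docdb(n) for n in numbers)


if __name__ == "__main__":
    print(build_batch_body(["US9750499B2", "US.9750498.B2"]))
```

The resulting string would then be sent as the request body to the biblio endpoint with whatever HTTP client you use.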

Re: docdb biblio queries are extremely slow for recent US patents

Posted: Thu Feb 27, 2020 10:04 am
by EPO / OPS Support
Hi,

What does the URL throttling say when you do that? Is the system overloaded at the time? Is it really only US data, or do other documents load faster if you query them at the same time?
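
As a rough illustration of checking the throttling state from a client: the sketch below assumes the state arrives in an `X-Throttling-Control` response header whose value looks like `busy (images=green:100, ...)`. The header name and layout are given from memory of the OPS documentation and should be verified against a real response; the sample value is made up, and `parse_throttling` is a hypothetical helper.

```python
import re


def parse_throttling(header: str) -> dict:
    """Split a throttling header value into overall state and per-service limits."""
    state, _, rest = header.partition("(")
    services = {}
    # Each entry is assumed to look like "retrieval=green:100".
    for name, colour, limit in re.findall(r"(\w+)=(\w+):(\d+)", rest):
        services[name] = (colour, int(limit))
    return {"state": state.strip(), "services": services}


if __name__ == "__main__":
    sample = "busy (images=green:100, inpadoc=green:45, retrieval=green:100)"
    info = parse_throttling(sample)
    print(info["state"], info["services"]["retrieval"])
```

Logging this next to each request time would show whether the slow responses coincide with an overloaded state or with a non-green service.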


Regards,
Vesna for OPS support

Re: docdb biblio queries are extremely slow for recent US patents

Posted: Thu Feb 27, 2020 6:01 pm
by jalipert
Hi Vesna,

I just ran a comparison between two patents: US7123123B2 and US9750499B2.

US.7123123.B2 took 1 second to get a response from the API.
US.9750499.B2 took 40 seconds to get a response from the API.

See the images below of the Postman responses. It may have something to do with the second patent having a larger response size, but 40 seconds to get biblio for one patent is pretty much unusable for our solution. Are you seeing similar response times on your end?

Image

Image

Best,
Joe

Re: docdb biblio queries are extremely slow for recent US patents

Posted: Thu Feb 27, 2020 6:26 pm
by jalipert
OK, so I dug a little deeper and narrowed the issue down to these three patents in my list, all of which have thousands of references cited and thus a large response size. See below:

US.9750499.B2 - 40 seconds, 3.87 MB, ~5,000 references cited
US.9743929.B2 - 45 seconds, 3.79 MB, ~4,900 references cited
US.9737301.B2 - 47 seconds, 3.87 MB, ~5,000 references cited
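
The narrowing-down above can be sketched as timing each number on its own and recording the response size. This is only an outline: `fetch` is a hypothetical callable standing in for whatever performs the actual biblio request and returns the response body, and the 30-second threshold is arbitrary.

```python
import time
from typing import Callable, Dict, Iterable, List, Tuple


def time_each(
    numbers: Iterable[str],
    fetch: Callable[[str], bytes],
    clock: Callable[[], float] = time.perf_counter,
) -> Dict[str, Tuple[float, int]]:
    """Request each number separately; record (seconds, response size in bytes)."""
    timings = {}
    for number in numbers:
        start = clock()
        body = fetch(number)
        timings[number] = (clock() - start, len(body))
    return timings


def slowest(timings: Dict[str, Tuple[float, int]], threshold: float) -> List[str]:
    """Numbers that took at least `threshold` seconds, slowest first."""
    hits = [n for n, (secs, _) in timings.items() if secs >= threshold]
    return sorted(hits, key=lambda n: timings[n][0], reverse=True)


if __name__ == "__main__":
    fake_fetch = lambda number: b"<xml/>"  # placeholder, no network access
    results = time_each(["US.7123123.B2", "US.9750499.B2"], fake_fetch)
    print(slowest(results, threshold=30.0))
```

Passing the clock in as a parameter keeps the helper testable without real waiting.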

As it happens, we don't even need the references cited for our purposes. We only really need the basic info like Inventor, Assignee, Title, Abstract, Issue Date, etc. Given that some patents have many thousands of references cited lumped in with the bibliographic data, would it be possible to request an update to the API that makes returning the references cited optional? That would likely improve the API's response time for us considerably, and also save you a lot of bandwidth.
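
Until something like that exists, the citations can only be dropped on the client after the download, which trims what gets stored or parsed later but does nothing for the transfer time. A minimal sketch, assuming the citations sit in an element whose local name is `references-cited` (worth checking against a real response); the sample document in the usage part is invented for illustration.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Drop the '{namespace}' prefix ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def strip_references(xml_text: str) -> str:
    """Return the document with every 'references-cited' element removed."""
    root = ET.fromstring(xml_text)
    # Snapshot the tree first so removing children does not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if local_name(child.tag) == "references-cited":
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = (
        '<exchange-document xmlns="http://www.epo.org/exchange">'
        "<bibliographic-data>"
        '<invention-title lang="en">Example</invention-title>'
        "<references-cited><citation/><citation/></references-cited>"
        "</bibliographic-data>"
        "</exchange-document>"
    )
    print(strip_references(sample))
```

Matching on the local name avoids hard-coding a namespace URI that may differ between response formats.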

Thanks!
Joe

Re: docdb biblio queries are extremely slow for recent US patents

Posted: Thu Feb 27, 2020 10:11 pm
by EPO / OPS Support
Hi Joe,

Unfortunately, this is not possible at present, but I agree with you that citations could be optional. I will see whether we can implement a change like this in the future. It makes sense to treat citations as a separate constituent, outside the basic biblio. On the other hand, US patents are known for their size and number of citations, so this is mostly an issue for some PCT data and US national data. Still, let me see what can be done here. I will talk to the developers on Monday and let you know if and when we can do something.

Regards
Vesna for OPS forum

Re: docdb biblio queries are extremely slow for recent US patents

Posted: Mon Mar 02, 2020 6:13 pm
by jalipert
Thank you for looking into supporting this feature. That would be an excellent addition to the API.

Best,
Joe

Re: docdb biblio queries are extremely slow for recent US patents

Posted: Tue Mar 03, 2020 7:19 am
by EPO / OPS Support
Hi

I spoke with the developers, and we are already exploring this possibility to see whether we can implement it in the next OPS version (the version that will connect to the same data as New Espacenet).


Regards,
Vesna for OPS