Robot Detected Error only from one computer

Posted: Wed May 05, 2021 11:31 pm
by dtandon
Hi,

I have tweaked my scripts to execute within the 1Mbps limit.

They are working fine on one computer. But when I execute the same script from my second computer, it fails with a "Robot Detected Error". I am using a different App (Consumer Key/Secret) on each computer. I restarted the second computer and retried, but I face the same issue again. On the first computer everything is working fine.

What could be the issue? Can you please help?

Re: Robot Detected Error only from one computer

Posted: Thu May 06, 2021 7:15 am
by EPO / OPS Support
Hi,

I am afraid I cannot help you there. If it works on one computer, then this means that you are reaching the service correctly. Clearly, from the response given, one of your computers doesn't authenticate properly for whatever reason...

Maybe some of the developers reading this will have an idea of what you could look at next. Let's see if some of them will take the time to help you.

Regards,
Vesna for OPS support

Re: Robot Detected Error only from one computer

Posted: Thu May 06, 2021 8:47 am
by dtandon
No, there is no authentication issue. The script is authenticating properly, but we are getting the "Robot Detected" error on only one computer. Could you please check your logs and let me know what the issue is? We are not hitting any limits yet, and that’s why the script is working fine on one computer.

Re: Robot Detected Error only from one computer

Posted: Thu May 06, 2021 9:54 am
by EPO / OPS Support
Hi,

If you need that, then please send us an email at PATENTDATA (at) epo.org and tell us your user name in that email.

Regards,
Vesna for OPS support

Re: Robot Detected Error only from one computer

Posted: Tue May 11, 2021 9:20 am
by gerben
dtandon,

"Robot Detected Error" is something I relate to usage of espacenet, not to OPS. OPS was intended to be used programmatically and therefore should not complain using "Robot Detection", although it does have "fair use limitations".
In the espacenet context the outgoing IP address is an important parameter. In our company we frequently use espacenet for reading patent publications, and we ran into "Robot Detected" issues when multiple colleagues were using espacenet at the same moment. This is because all our computers communicate through the same outgoing IP address (our company's outside network connection), so multiple colleagues all sending requests to the espacenet servers can easily create a usage pattern that appears to be "robot-like behaviour". At home, where I work using my private connection to the rest of the world, I never run into such issues with this same computer.

Do your testing machine and production machine communicate to the outside world via the same connection, or do they use different connections (different outgoing IP addresses)? Secondly, does your tool include usage of any non-OPS service? If the tool includes a non-OPS service and the machines use different connections, the outgoing IP address of the production machine communicating with a non-OPS service may be the key factor in being detected as a robot.


Gerben

Re: Robot Detected Error only from one computer

Posted: Tue May 11, 2021 9:22 pm
by dtandon
Hi Gerben,

Thanks for your message. I appreciate your response.

>>> "Robot Detected Error" is something I relate to usage of espacenet, not to OPS. OPS was intended to be used programmatically and therefore should not complain using "Robot Detection", although it does have "fair use limitations".

I agree. Unfortunately, it does complain though with "Robot Detection" errors in the form of HTTP 403.

>>> In the espacenet context the outgoing IP address is an important parameter. ...

Yes, this is a typical scenario. A network is expected to use a few public IP addresses, and EPO servers unfortunately throttle traffic on the basis of the IP address they see (the public IP). I was unaware that even manual access to espacenet leads to "Robot Detection" errors. This is something EPO must look into.

>>> Do your testing machine and production machine communicate to the outside world via the same connection ...

Both my machines use different public IPs. All our traffic to EPO servers is through OPS APIs so I am not using non-OPS service traffic to EPO servers. Traffic to other non-EPO systems should not affect EPO behavior.

As suggested by EPO Support, I have implemented throttling at our end. This largely solves the problem, as we are able to throttle our download rate based on the current load of the EPO servers.
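A minimal sketch of this kind of load-based throttling, not the actual script used here. It assumes that each OPS response carries an "X-Throttling-Control" header shaped like "busy (images=green:100, inpadoc=yellow:45, search=red:5)", that is, an overall state followed by a colour and a per-minute quota for each service. That format is not shown in this thread and should be checked against the OPS documentation. The function names, the default delay and the one-minute pause for a "black" service are hypothetical choices.

```python
import re

# Assumed header shape (not shown in this thread):
#   "busy (images=green:100, inpadoc=yellow:45, search=red:5)"
_SERVICE_RE = re.compile(r"(\w+)=(\w+):(\d+)")


def parse_throttling_control(value):
    """Return (system_state, {service: (colour, per_minute_quota)})."""
    state = value.split("(", 1)[0].strip()
    services = {
        m.group(1): (m.group(2), int(m.group(3)))
        for m in _SERVICE_RE.finditer(value)
    }
    return state, services


def delay_for(value, service, default_delay=6.0):
    """Seconds to pause before the next request to `service`."""
    _, services = parse_throttling_control(value)
    if service not in services:
        # Header missing or service not listed: use a conservative default.
        return default_delay
    colour, quota = services[service]
    if colour == "black" or quota <= 0:
        # Assumption: treat "black" or a zero quota as "blocked for now".
        return 60.0
    # Spread the allowed requests evenly over one minute.
    return 60.0 / quota
```

A caller would sleep for delay_for(header_value, "search") seconds between search requests and re-read the header after every response, so the pace follows whatever quota the server currently advertises.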

However, it doesn't guarantee that we won't receive "Robot Detected" errors. I still see them, and after a while the EPO servers block our IPs from downloading for a while, leading to download failures, and we end up downloading more data than what is required (because we have to re-trigger downloads after a while).

This is one of the issues EPO needs to look into, since it is making their systems unreliable.

I am not sure if there is any other solution to this problem, and if we can do anything different in our code?

Re: Robot Detected Error only from one computer

Posted: Wed May 12, 2021 7:17 am
by EPO / OPS Support
Hi,

Manual access only leads to robot detection when companies use one or only a few IP addresses but have many users who are all using Espacenet/Register at the same time. This is normal since Espacenet is open to public and requires no registration so only an IP address can be an indicator that we have to focus on when we try to deter machines/bots off of manual search Interfaces. If you have a look at the OPS video, you will see this example as one of the main reasons for such companies to use OPS instead of tools on the internet.

Please note that OPS RESTful is by now quite old technology and that at this point we don't invest in major corrections of those services because we, like the entire Office, are following the SP2023 strategic plan programmes and will hopefully soon be able to start working on and building new tools and services for external clients that will replace the existing tools.

Regards,
Vesna for OPS

Re: Robot Detected Error only from one computer

Posted: Wed May 12, 2021 8:59 am
by gerben
Vesna,

According to the response of dtandon, his tool relies on OPS only. Nevertheless he reports receiving a "robot detected" message.

@Vesna
Do you know of any condition in which the OPS servers may respond with a "robot detected" message?

@dtandon
If not, then the most likely explanation would be that there is an error in your tool which makes it communicate to a non-OPS server.
Is there any additional information in the "robot detected" response that may identify the server that produced the response? This may help the OPS team in locating the origin of the error.
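One way to collect that information is to log a short summary of every 403 response. The sketch below is only an illustration: describe_response is a hypothetical helper, and the headers it picks out (Server, Via, Date, Content-Type, plus the X-Throttling-Control header mentioned in the OPS documentation) are a guess at what might help identify the responding server.

```python
def describe_response(status, headers, body, max_body=200):
    """Summarise the response fields most likely to identify the server."""
    # Header names are case-insensitive, so normalise them before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    interesting = ("server", "via", "date", "content-type", "x-throttling-control")
    lines = [f"status: {status}"]
    for name in interesting:
        if name in lowered:
            lines.append(f"{name}: {lowered[name]}")
    # Keep only the start of the body; error pages can be long.
    lines.append("body: " + body[:max_body].replace("\n", " "))
    return "\n".join(lines)
```

Calling it with the status code, the header dictionary and the decoded body of a failed request gives a few lines of text that can be pasted into an email to OPS support.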


Gerben

Re: Robot Detected Error only from one computer

Posted: Wed May 12, 2021 10:31 am
by EPO / OPS Support
Hi,
Yes, that is the message that appears with a 403 error. As stated in the OPS documentation, it mostly appears when there are problems with authentication, but there are other reasons why OPS will give that error, including throttling-related actions.
[Attached screenshot: Capture_error.PNG]
As for your second question, anything above 10 search-related actions per minute from one IP address can already cause the message on Espacenet. This is also stated in the Fair use charter: https://www.epo.org/service-support/ord ... r-use.html

Regards,
Vesna for OPS

Re: Robot Detected Error only from one computer

Posted: Wed May 12, 2021 7:48 pm
by dtandon
Vesna,
>>> This is normal since Espacenet is open to public and requires no registration so only an IP address can be an indicator that we have to focus on when we try to deter machines/bots off of manual search Interfaces

The modern approach to solving this problem is to rely on Captcha. An IP address is not a reliable indicator, since organizations commonly have multiple users (computers) behind a few public IPs. With proper Captcha techniques, you can at least allow manual searches.

@gerben

>>> Is there any additional information in the "robot detected" response that may identify the server that produced the response, this may help the OPS team in locating the origin of the error.

Can an OPS API lead to traffic on a non-OPS server? All our requests begin with:

https://ops.epo.org/3.2/rest-services so all traffic should be going to OPS servers only.
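A quick way to confirm this from the script's own request log is to check the host of every URL it builds. This is a sketch under assumptions: is_ops_url and non_ops_urls are hypothetical helpers, and rejecting plain "http" URLs is an extra choice made here, not something OPS is known to require.

```python
from urllib.parse import urlsplit

OPS_HOST = "ops.epo.org"  # host taken from the base URL quoted above


def is_ops_url(url):
    """True if `url` uses HTTPS and points at the OPS host."""
    parts = urlsplit(url)
    return parts.scheme == "https" and parts.hostname == OPS_HOST


def non_ops_urls(urls):
    """Return the URLs that would not be sent to the OPS host."""
    return [url for url in urls if not is_ops_url(url)]
```

If non_ops_urls returns an empty list for all logged request URLs, the requests themselves are addressed to OPS. It does not rule out a redirect answered by another host, so the final URL of each response is worth logging as well.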

I can provide the Source IPs of the servers that are responding with the 403 if that can help.

The second problem is that the documentation indicates a "Retry-After" field should be present alongside the "X-Throttling-Control" header, but I never see it in a 403 response header. So we don't know how long we have to wait before we can resend traffic, and retrying too early probably extends the rolling window so that we get blocked for longer.
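When the wait time is missing, one common workaround is capped exponential backoff with jitter. The sketch below is an assumption-laden illustration, not documented OPS behaviour: wait_seconds is a hypothetical helper, the base and cap values are arbitrary, and a "Retry-After" value is interpreted as whole seconds as in standard HTTP, so the unit should be checked against the OPS documentation.

```python
import random


def wait_seconds(headers, attempt, base=5.0, cap=300.0):
    """Seconds to wait before retrying after a 403 or throttling response.

    Uses Retry-After when it is present and numeric; otherwise falls back
    to capped exponential backoff with jitter.
    """
    # Header names are case-insensitive, so normalise them before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    retry_after = lowered.get("retry-after", "").strip()
    if retry_after.isdigit():
        # Assumption: the value is a number of seconds.
        return float(retry_after)
    # No usable hint from the server: double the ceiling per attempt, up to `cap`.
    ceiling = min(cap, base * (2 ** attempt))
    # Jitter keeps several clients from retrying at the same instant.
    return random.uniform(ceiling / 2, ceiling)
```

Here attempt starts at 0 for the first retry and increases by one after each consecutive failure, so the pause grows from a few seconds to at most five minutes while the block persists.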