
(approximate) truncatedFamily limits

Posted: Tue May 01, 2018 12:47 pm
by gerben
Dear OPS team,

I might have overlooked it, but in the manual I have only found some general indication for truncated families on page 88.
In situations when the requested family is very large (several hundred members), all of the members cannot be included in the response due to technical reasons. The patent-family structure is returned with attribute truncatedFamily="true" and only a limited number of family members are returned.
Do you have some indication as to how many records can be included in one request? I'm looking for upper limits that can still be handled in one request for:
  1. family requests listing only numbers
  2. family requests including biblio
  3. family requests including legal
  4. family requests including biblio and legal
I consider tackling the families issue as follows:
  1. Start with requesting a number list for the family
  2. Let a parser count the number of applications and publications
  3. If the counters permit, request the family with biblio and legal
  4. If the counters do not permit all at once, but still allow all biblio at once and all legal at once, run these two (with sufficient delay depending on the throttling parameters)
  5. Or, if over all limits, request the biblio information in batches of 100 using the bulk download as described on page 50 and request the legal status one by one, again with sufficient delays depending on the throttling parameters returned.
I can of course start with the easy all-in-one attempt and, if a truncated family is returned, fall back to groups of smaller requests until the system no longer responds with a truncated family. But that would keep your server busy for quite some time and return a lot of data that counts against our quota, while producing large sets of incomplete, and therefore useless, intermediate data. I prefer to limit the requests to only useful requests.
Have you defined such upper limits, or, if they are related to the throttling parameters, what is the throttling-parameter-based algorithm for determining such upper limits?
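
The numbered plan above can be sketched roughly as follows. This is only an illustration: the `Limits` class, the `plan_requests` function and the step labels are hypothetical names, and the threshold values are placeholders, since the real ceilings are exactly what this post is asking about. The member count is assumed to come from steps 1 and 2 (number list plus parser count).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Limits:
    """Hypothetical per-request ceilings (placeholders, not OPS values)."""
    biblio_and_legal: int  # max members for one request with biblio and legal
    biblio_only: int       # max members for one request with biblio
    legal_only: int        # max members for one request with legal
    bulk_batch: int = 100  # bulk batch size mentioned in the post


def plan_requests(member_count: int, limits: Limits) -> List[str]:
    """Return the list of request steps for a family of the given size."""
    # Step 3: everything fits in one combined request.
    if member_count <= limits.biblio_and_legal:
        return ["family+biblio+legal"]
    # Step 4: biblio and legal each fit, but not together.
    if member_count <= limits.biblio_only and member_count <= limits.legal_only:
        return ["family+biblio", "family+legal"]
    # Step 5: bulk biblio in batches, legal status one by one.
    batch_count = -(-member_count // limits.bulk_batch)  # ceiling division
    return [f"bulk-biblio x{batch_count}", f"legal x{member_count}"]
```

Delays between the returned steps would still have to be derived from the throttling parameters in each response; that part is left out here.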

Re: (approximate) truncatedFamily limits

Posted: Wed May 02, 2018 1:52 pm
by EPO / OPS Support

Truncated family means that we display only 700 family members of a specific extended INPADOC patent family. This would be one such example:
You can see that it is truncated when you find this in your response:
legal: false - truncatedFamily: true - total-result-count: 5504
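
As a minimal sketch, a client could check for these attributes as shown below. The sample XML is a hand-written stand-in, not a real OPS response, and the namespace URI in it is an assumption; the function `family_status` is a hypothetical helper that matches the `patent-family` element by local name so that it does not depend on the namespace.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for a response; the namespace URI is an assumption.
SAMPLE = """<ops:world-patent-data xmlns:ops="http://ops.epo.org">
  <ops:patent-family legal="false" truncatedFamily="true" total-result-count="5504"/>
</ops:world-patent-data>"""


def family_status(xml_text):
    """Return truncation flag and total count of the first patent-family element."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        # Strip a "{namespace}" prefix, if any, to get the local tag name.
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "patent-family":
            return {
                "truncated": elem.get("truncatedFamily") == "true",
                "total": int(elem.get("total-result-count", "0")),
            }
    return None  # no patent-family element found


status = family_status(SAMPLE)
```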
In this case, you cannot query biblio at all, because the request is already truncated and the query would be too heavy:
You have to query biblio from the OPS Published service with biblio (you can do batches of 100 here for your 701 results).
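
Splitting the number list into batches of 100 could look like the sketch below. The helper name `batches` and the placeholder document numbers are made up for illustration; how each batch is then sent to the Published service is not shown.

```python
from typing import Iterator, List, Sequence


def batches(items: Sequence[str], size: int = 100) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# Placeholder identifiers standing in for 701 family members.
numbers = [f"DOC{i}" for i in range(701)]
chunks = list(batches(numbers))
# 701 members give seven full batches and one final batch with a single entry.
```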

You can query legal, but even here, your response can get truncated at 1300 lines of received data. The success of this request depends on how many members actually have legal events and how many events there are altogether.
You can query all legal events available (we don’t have legal status for all collections that we have biblio for) by using OPS Legal

Any family request with added constituents, truncated or not, will cut (truncate) any data from line 1300 on, which is another reason to rather use separate requests when you have bigger families.

Any un-truncated family (so, those with 1-699 members) can only give you biblio in OPS family for the first 100 results. We asked the developers' team a few weeks ago to raise this from 50 biblios to 100, so it may not be implemented yet, but it will be by the time of the next release, which we expect by the end of May or the beginning of June.

I hope this helps,

Re: (approximate) truncatedFamily limits

Posted: Wed May 02, 2018 4:01 pm
by gerben
Thanks for the information,

One thing is not clear to me: the cut-off after 1300 lines of information. Does this refer to the entire xml result or to the legal status information only?
If it refers to the entire result, it cannot contain up to 700 publications in the family list.
One entry contains application information in docdb, publication info in docdb and epodoc and at least one priority reference in docdb. Therefore the minimum number of lines for one entry is 31 lines of xml code.
With the limit being 1300 lines of information this will be about 40 entries. 700 entries require at least 21,000 lines of code, and 100 biblio datasets (several hundred lines of code each) add several tens of thousands of lines of code to that.
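
The estimate above, written out as a small calculation. The constant names are only for illustration; the figures (31 lines per entry, a 1300-line limit, 700 entries) are the ones used in this post.

```python
LINES_PER_ENTRY = 31  # minimum xml lines per family member, as counted above
LINE_LIMIT = 1300     # cut-off as quoted in the earlier reply
MAX_MEMBERS = 700     # members shown for a truncated family

# Whole entries that fit below the cut-off if it covered the entire result.
entries_within_limit = LINE_LIMIT // LINES_PER_ENTRY

# Minimum lines needed for a full list of 700 members, numbers only.
lines_for_full_list = MAX_MEMBERS * LINES_PER_ENTRY
```

This gives 41 entries within the limit and 21,700 lines for the full list, in line with the rounded figures above.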

Is it correct if I conclude that this 1300 lines limitation only covers the legal status parts of the xml-output, not the number information or biblio information?

Re: (approximate) truncatedFamily limits

Posted: Thu May 03, 2018 7:56 am
by EPO / OPS Support

I see I wasn't clear enough, and I now also noticed I wrote 1300 instead of 1500 lines, so thank you for correcting me; I would not notice my typos otherwise.
Any family request with added constituent(s) in the Family service, truncated or not, will cut (truncate) any constituent-related data (biblio or legal) from line 1500 on, which is another reason to rather use separate requests when you have bigger families.
I hope that makes it clearer.
Vesna for OPS support

Re: (approximate) truncatedFamily limits

Posted: Thu May 03, 2018 9:54 am
by gerben

Yes, this makes it clear.

Thank you.