
Update for "EP full-text data for text analytics"

Posted: Fri Mar 13, 2020 10:09 am
by mkracker
This product is a free bulk data set consisting of XML-tagged titles, abstracts, descriptions, claims and search reports covering all EP publications, designed to facilitate natural language processing work. It can be used on its own, but works best when combined with bibliographic patent data such as that available in PATSTAT, Global Patent Index, Open Patent Service and other sources. It is released under the open data license CC-BY.

A new version was recently released. It contains all EP publications from 1978 (publication number 1) to week 08 of 2020.

The data can be downloaded from Google Cloud from the bucket “epo-public”. After signing in to Google, you may access the bucket via https://console.cloud.google.com/storag ... po-public/ .

To download larger volumes, the "gsutil" CLI from Google’s Cloud SDK is recommended. Details and explanations can be found in the User Guide of the “EP full-text data for text analytics” data set, available at the bottom of its product page https://www.epo.org/searching-for-paten ... html#tab-1 .
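
As a rough illustration of the gsutil route, here is a minimal Python sketch that builds (and optionally runs) a parallel recursive copy from the bucket. Only the bucket name "epo-public" comes from this post. The prefix "EXAMPLE_PREFIX" and the local target folder are hypothetical placeholders, and the helper names are made up for this example. The actual folder layout inside the bucket is described in the User Guide, so please check there before running anything.

```python
import shutil
import subprocess
from typing import List

BUCKET = "epo-public"  # bucket name as given in this post


def build_copy_command(prefix: str, dest: str) -> List[str]:
    """Build a gsutil command that recursively copies one bucket prefix.

    `prefix` is a placeholder for a folder inside the bucket; the real
    layout is an assumption here and should be taken from the User Guide.
    """
    source = f"gs://{BUCKET}/{prefix.strip('/')}"
    # -m runs transfers in parallel, cp -r copies recursively
    return ["gsutil", "-m", "cp", "-r", source, dest]


def download(prefix: str, dest: str) -> None:
    """Run the copy, failing early if gsutil is not installed."""
    if shutil.which("gsutil") is None:
        raise RuntimeError("gsutil not found; install the Google Cloud SDK first")
    subprocess.run(build_copy_command(prefix, dest), check=True)


if __name__ == "__main__":
    # Print the command only; call download(...) to actually transfer data.
    print(" ".join(build_copy_command("EXAMPLE_PREFIX/", "./ep_fulltext")))
```

Printing the command first makes it easy to verify the source path (for example with "gsutil ls gs://epo-public/") before starting a large transfer.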

Re: Update for "EP full-text data for text analytics"

Posted: Thu Oct 08, 2020 2:22 pm
by mkracker
There has been another update this week. The new release contains all EP publications from 1978 (publication number 1) to week 37 of 2020.