Survey on the Implementation and Promotion of WIPO Standard ST.22
Response submitted by the International Bureau of the World Intellectual Property Organization (WIPO) in 2011
The present questionnaire addresses issues concerning WIPO Standard ST.22 (Recommendation for the authoring of patent applications for the purpose of facilitating optical character recognition (OCR)) and patent applications submitted on paper or submitted electronically (e-filed) but having the text body of the application submitted in image form (e.g., PDF or TIFF images). Even if your Office does not perform OCR on its documents, please respond to the questions which are applicable.
A revised version of WIPO Standard ST.22 was adopted by the Standards and Documentation Working Group (SDWG) on November 21, 2008. It is available at: https://www.wipo.int/standards/en/pdf/03-22-01.pdf
The results of the survey will be presented for consideration by the Committee on WIPO Standards (CWS).
Please take note that some questions might not be displayed depending on the response(s) to preceding question(s). This could cause gaps in numbering of displayed questions and sections.
Section 1: Patent filing in your Office
1. Does your Office accept patent applications submitted on paper or submitted electronically but having the text body of the application submitted in image form (e.g., PDF or TIFF images)?
Yes
Please comment if necessary:
Both paper filings and e-filings in image form are accepted.
2. If applicable, please indicate the percentage, with respect to the total number of applications received by your Office, and the year of reference (e.g., 60% in 2008), of the following:
Applications filed on paper: 18.6% in 2010
Applications filed electronically but having the text body of the application submitted in image form: 54.9% in 2010
3. Does your Office perform optical character recognition (OCR) on patent applications?
Yes
Section 2: Promotion and use of WIPO Standard ST.22
4. Has your Office adapted the filing guidance that it provides to applicants to take into account the recommendations of the revised version of WIPO Standard ST.22?
Yes
5. Has your Office promoted the use by applicants of the recommendations provided by WIPO Standard ST.22?
Yes
Please comment if necessary:
Indirectly: PCT filing requirements are synchronised with ST.22.
6. If applicable, what publication means has your Office used to promote the use of WIPO Standard ST.22 (e.g., article in the official gazette, amendment of Office’s filing recommendations, publication on the Office's website, newsletters)?
Please specify the details (e.g., entry or section of the official gazette, URL of the location where the announcement is available):
7. Has your Office promoted WIPO Standard ST.22 in any other way (e.g., conferences, information circulars)?
No
Section 3: Implementation of WIPO Standard ST.22
This Section 3 refers to the applications that are filed on paper or electronically (e-filed) but having the text body of the application submitted in image form (e.g., PDF or TIFF images).
8. Has your Office noticed any improvement in the quality of the formal presentation and layout of the applications that follow the recommendations of WIPO Standard ST.22?
No improvement
Please comment if necessary:
"No improvement" was checked for Questions 8, 9 and 10 because we have not measured this. A large proportion of the low-confidence OCR documents we proofread each week are those not conforming to ST.22, especially the recommendations for 300 dpi, black and white, and A4.
9. Has your Office noticed any improvement in the OCR quality output that resulted from the applicants’ awareness of WIPO Standard ST.22?
No improvement
10. Has your Office noticed any decrease in the OCR costs that have resulted from the applicants’ awareness of WIPO Standard ST.22?
No decrease
11. Does your Office use the non-conformity to WIPO Standard ST.22 as a reason to request replacement sheets of the application?
Yes
Please comment if necessary:
When documents do not comply with PCT requirements
12. If applicable, please indicate the percentage of applications for which replacement sheets are requested with respect to the total number of applications (filed on paper or e-filed) having the text body of the application submitted in image form, and the period of time of reference:
(e.g., 15% in the first half of 2009)
Percentage:
Please comment if necessary:
Statistic not measured
13. Does your Office have the intention to take into account, for the calculation of fees, the level of compliance with WIPO Standard ST.22 of the applications filed on paper or e-filed but having the text body of the application submitted in image form?
No
If applicable, please explain how:
Section 4: OCR practices of IPOs
14. Since your answer to Question 3 was “Yes”, please indicate if the following purposes are applicable and, if “Yes”, the accuracy requirements established by your Office:
a) Security screening of patent applications
No
b) Publication of the patent applications
Yes
Accuracy requirements:
The requirement is search quality (>99.5%), not publication quality (>99.95%).
c) Publication of the granted patents
No
d) Please indicate other purpose(s) and corresponding accuracy requirements if necessary:
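The two accuracy levels quoted under Question 14(b) can be compared with a short calculation. The sketch below is only an illustration: it assumes the percentages are measured per character, which the response does not state, and the names `accuracy` and `meets` are hypothetical. Exact fractions are used so that a document sitting precisely on a threshold is not misjudged by floating-point rounding.

```python
from fractions import Fraction

# Thresholds quoted in the response to Question 14(b).
# Assumption: they are character-level accuracy rates.
SEARCH_QUALITY = Fraction(995, 1000)         # "> 99.5%"
PUBLICATION_QUALITY = Fraction(9995, 10000)  # "> 99.95%"


def accuracy(total_chars: int, wrong_chars: int) -> Fraction:
    """Share of correctly recognised characters, as an exact fraction."""
    if total_chars <= 0:
        raise ValueError("total_chars must be positive")
    if not 0 <= wrong_chars <= total_chars:
        raise ValueError("wrong_chars must lie between 0 and total_chars")
    return Fraction(total_chars - wrong_chars, total_chars)


def meets(total_chars: int, wrong_chars: int, threshold: Fraction) -> bool:
    """True when the accuracy is strictly above the threshold."""
    return accuracy(total_chars, wrong_chars) > threshold


if __name__ == "__main__":
    # A hypothetical page of 3,000 characters with 10 misrecognised characters.
    print(meets(3000, 10, SEARCH_QUALITY))
    print(meets(3000, 10, PUBLICATION_QUALITY))
```

Under this assumption, a 3,000-character page may contain at most 14 errors to stay above 99.5%, but at most 1 error to stay above 99.95%.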
15. Does your Office have in-house quality checking measures in place to control the quality of the OCRed patent documents?
Yes
Please provide a concise description of the measures (e.g., refer to the relative automation of the quality checking indicating if it comprises the review by staff of randomly selected output, and/or if it is based on the accuracy confidence metrics produced by the OCR software):
QA and correction are based on the OCR confidence reported by the software.
16. Does your Office OCR patent documents in foreign languages?
Yes
Please indicate which foreign language(s):
en,fr,de,es,pt,ko,zh,ja,ru
17. Does your Office outsource the OCR of patent documents?
No
18. If you answered “Yes” to the previous Question:
(a) If applicable, please indicate any comments or feedback that your Office might have received from the contractor about the recommendations of WIPO Standard ST.22:
(b) Please also describe the quality checking measures used to control the quality of the OCRed patent documents that are performed by your contractor:
(c) Since your answer to Questions 4, 8 or 13 was “Yes”, please indicate whether your Office has renegotiated, or intends to renegotiate, the service contract with its contractor as a consequence of the adoption of the revised version of WIPO Standard ST.22 by the Standards and Documentation Working Group (SDWG) on November 21, 2008:
Section 5: Software and hardware used to OCR
19. What software tools does your Office, or its contractor, use to perform the OCR of patent documents?
ABBYY FineReader
20. Has your Office, or its contractor, developed OCR software extensions specific to patent documents?
Yes
Please provide a concise description of the specific features:
XML Output
21. What hardware does your Office, or its contractor, use to perform the OCR of patent documents?
Linux PC servers
Section 6: Workflow
22. Please describe the workflow for the OCR of your patent documents:
Automatic OCR in batch; human proofreading of the worst cases, identified by the character recognition confidence levels reported by the OCR software; export of the OCR output in XML and HTML.
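The workflow described in this answer (batch OCR, confidence-based selection for proofreading, XML export) could be sketched roughly as follows. This is not the Office's implementation: the `OcrResult` type, the mean-confidence rule, the 0.9 threshold and the XML element names are all assumptions made for illustration, and the OCR step itself is taken as already done.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class OcrResult:
    """Hypothetical per-document OCR output."""
    doc_id: str
    text: str
    char_confidences: list[float]  # one value in [0, 1] per recognised character


def mean_confidence(result: OcrResult) -> float:
    """Average character confidence; empty output is treated as worst case."""
    if not result.char_confidences:
        return 0.0
    return sum(result.char_confidences) / len(result.char_confidences)


def select_for_proofreading(results: list[OcrResult],
                            threshold: float = 0.9) -> list[OcrResult]:
    """Return documents below the threshold, worst first."""
    flagged = [r for r in results if mean_confidence(r) < threshold]
    return sorted(flagged, key=mean_confidence)


def to_xml(result: OcrResult) -> str:
    """Serialise one document to a minimal, made-up XML layout."""
    root = ET.Element("document", attrib={"id": result.doc_id})
    ET.SubElement(root, "text").text = result.text
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    batch = [
        OcrResult("A", "clean page", [0.99, 0.98]),
        OcrResult("B", "noisy fax", [0.5, 0.7]),
        OcrResult("C", "grey scan", [0.8, 0.85]),
    ]
    for doc in select_for_proofreading(batch):
        print(doc.doc_id, round(mean_confidence(doc), 3))
    print(to_xml(batch[0]))
```

Sorting the flagged documents worst first means a fixed weekly proofreading capacity is spent on the output most likely to contain errors.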
23. Does your Office itself check the quality of its OCRed patent documents?
Yes
Please provide a concise description of how the quality check is carried out:
Human proof reading of selected documents
24. Please describe how your Office handles patent documents found to be defective later in the process (e.g., after publication):
The OCR of published backfile documents is improved by using external contractors.
25. Please provide a concise response to the following issues concerning the storage of OCRed patent documents:
(a) Format(s) in which your Office stores the OCRed patent documents:
proprietary binary format containing all the information coming out of the OCR process (notably position and recognition confidence level for each character)
(b) Does the storage format(s) used by your Office allow for later quality improvements either by your Office or by external contractors?
Yes
(c) Does the storage format used by your Office allow for quick identification of patent documents with OCR defects?
Yes
(d) Does the storage format used by your Office allow for different renditions to view or exchange the OCRed patent documents (e.g., PDF, HTML)?
Yes
(e) Does the storage format used by your Office retain all the raw detailed information obtained from the OCR process (e.g., individual character accuracy estimation, position in image, etc.)?
Yes
(f) Does the storage format used by your Office capture, in text format, table contents, and mathematical and chemical formulae?
No
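The answers to Question 25 describe a store that keeps, for each character, its position and recognition confidence, and that supports both defect identification and several renditions. The Office's format is proprietary and its layout is not given here, so the sketch below is purely illustrative: the `OcrChar` fields, the 0.8 threshold and the use of `<mark>` for highlighting are assumptions.

```python
from dataclasses import dataclass
from html import escape


@dataclass(frozen=True)
class OcrChar:
    """Hypothetical record for one recognised character."""
    char: str
    x: int            # left edge in the page image, in pixels
    y: int            # top edge in the page image, in pixels
    width: int
    height: int
    confidence: float  # recognition confidence in [0, 1]


def to_text(chars: list[OcrChar]) -> str:
    """Plain-text rendition: the characters in reading order."""
    return "".join(c.char for c in chars)


def suspects(chars: list[OcrChar], threshold: float = 0.8) -> list[OcrChar]:
    """Characters whose confidence falls below the threshold."""
    return [c for c in chars if c.confidence < threshold]


def to_html(chars: list[OcrChar], threshold: float = 0.8) -> str:
    """HTML rendition that highlights low-confidence characters."""
    parts = []
    for c in chars:
        safe = escape(c.char)
        parts.append(f"<mark>{safe}</mark>" if c.confidence < threshold else safe)
    return "<p>" + "".join(parts) + "</p>"


if __name__ == "__main__":
    page = [
        OcrChar("O", 0, 0, 10, 12, 0.99),
        OcrChar("C", 10, 0, 10, 12, 0.55),
        OcrChar("R", 20, 0, 10, 12, 0.97),
    ]
    print(to_text(page))
    print(to_html(page))
```

Because the raw per-character data is retained, a later correction pass (in-house or by a contractor) can revisit only the flagged characters and regenerate every rendition from the same records.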
26. Is the OCR of patent documents also used to increase the efficiency of the work of the Office, (e.g., bibliographic data input from paper applications can be considerably speeded up with accurate OCR)?
Yes
Please provide a concise description of specific features indicating how the efficiency is increased:
OCR is used to assist the translation process (abstracts and reports) and is used to complete a full text publication product (for applications where the description is in TIFF)
27. Does your Office OCR documents other than patent documents?
No
28. If it is known by your Office, please provide a description of the usages by your customers of the documents OCRed by your Office (e.g., internal office patent application searches by examiners, Internet patent application searches by the public, electronic products sold to private subscribers, etc.):
OCR output is used for search purposes
29. Does your Office use OCRed patent documents provided by other offices?
Yes
Please indicate from which office(s) and for which documents, formats and purposes:
Yes, for the PATENTSCOPE national collections. We OCR documents from Mexico, South Africa, Morocco, Israel, Brazil, Panama, Cuba, Spain (very old documents), the Dominican Republic, ARIPO and Kenya.
Notably published EP documents searchable in PATENTSCOPE
Section 7: Additional comments
30. Please provide further comments regarding the implementation and promotion of WIPO Standard ST.22, as well as OCR practices of your Office, if you feel it is necessary: