Survey on the Implementation and Promotion of WIPO Standard ST.22
Response submitted by China in 2011
The present questionnaire addresses issues concerning WIPO Standard ST.22 (Recommendation for the authoring of patent applications for the purpose of facilitating optical character recognition (OCR)) and patent applications submitted on paper or submitted electronically (e-filed) but having the text body of the application submitted in image form (e.g., PDF or TIFF images). Even if your Office does not perform OCR on its documents, please respond to the questions which are applicable.
A revised version of WIPO Standard ST.22 was adopted by the Standards and Documentation Working Group (SDWG) on November 21, 2008. It is available at: https://www.wipo.int/standards/en/pdf/03-22-01.pdf
The results of the survey will be presented for consideration by the Committee on WIPO Standards (CWS).
Please take note that some questions might not be displayed depending on the response(s) to preceding question(s). This could cause gaps in numbering of displayed questions and sections.
Section 1: Patent filing in your Office
1. Does your Office accept patent applications submitted on paper or submitted electronically but having the text body of the application submitted in image form (e.g., PDF or TIFF images)?
Yes
Please comment if necessary:
Patent applications submitted electronically with the text body in PDF form are accepted.
2. If applicable, please indicate the percentage, with respect to the total number of applications received by your Office, and the year of reference (e.g., 60% in 2008), of the following:
applications filed on paper: 78.5% for inventions in 2010; 74.1% for utility models in 2010.
applications filed electronically but having the text body of the application submitted in image form: 9.4% for inventions in 2010; 10.2% for utility models in 2010.
3. Does your Office perform optical character recognition (OCR) on patent applications?
Yes
Please comment if necessary:
Since 2001
Section 2: Promotion and use of WIPO Standard ST.22
4. Has your Office adapted the filing guidance that it provides to applicants to take into account the recommendations of the revised version of WIPO Standard ST.22?
No
Please comment if necessary:
The writing rules for application filing in SIPO's examination guidance have not been revised since the first edition in 1993 and have been carried over consistently from one edition to the next. To respect applicants' writing habits, the rules have not yet been revised to correspond with ST.22. However, they partly conform to ST.22, and research on WIPO standards is ongoing.
5. Has your Office promoted the use by applicants of the recommendations provided by WIPO Standard ST.22?
No
6. If applicable, what publication means has your Office used to promote the use of WIPO Standard ST.22 (e.g., article in the official gazette, amendment of Office's filing recommendations, publication on the Office's website, newsletters)?
Please specify the details (e.g., entry or section of the official gazette, URL of the location where the announcement is available):
7. Has your Office promoted WIPO Standard ST.22 in any other way (e.g., conferences, information circulars)?
No
Section 4: OCR practices of IPOs
14. Since your answer to Question 3 was "Yes", please indicate if the following purposes are applicable and, if "Yes", the accuracy requirements established by your Office:
a) Security screening of patent applications
Yes
Accuracy requirements:
99.99%
b) Publication of the patent applications
Yes
Accuracy requirements:
99.99%
c) Publication of the granted patents
Yes
Accuracy requirements:
As OCR is already carried out for new filing documents, follow-up documents and amendment documents, there is no need for a separate OCR process for the publication of patent applications and granted patents.
d) Please indicate other purpose(s) and corresponding accuracy requirements if necessary:
15. Does your Office have in-house quality checking measures in place to control the quality of the OCRed patent documents?
Yes
Please provide a concise description of the measures (e.g., refer to the relative automation of the quality checking indicating if it comprises the review by staff of randomly selected output, and/or if it is based on the accuracy confidence metrics produced by the OCR software):
The review by staff of randomly selected output is carried out.
16. Does your Office OCR patent documents in foreign languages?
No
17. Does your Office outsource the OCR of patent documents?
Yes
At what stage(s) of the procedure does your Office forward the patent documents to the external contractor?
As soon as the documents are received, they are forwarded to the contractor.
18. If you answered "Yes" to the previous Question:
(a) If applicable, please indicate any comments or feedback that your Office might have received from the contractor about the recommendations of WIPO Standard ST.22:
No.
(b) Please also describe the quality checking measures used to control the quality of the OCRed patent documents that are performed by your contractor:
The quality checking measures include a vertical word check, a horizontal word check, a text check and a tag check.
(c) Since your answer to Questions 4, 8 or 13 was "Yes", please indicate whether your Office has renegotiated, or intends to renegotiate, the service contract with its contractor as a consequence of the adoption of the revised version of WIPO Standard ST.22 by the Standards and Documentation Working Group (SDWG) on November 21, 2008:
Section 5: Software and hardware used to OCR
19. What software tools does your Office, or its contractor, use to perform the OCR of patent documents?
SIPO's contractor uses several commercial software packages with high Chinese character recognition accuracy for the OCR of patent documents.
20. Has your Office, or its contractor, developed OCR software extensions specific to patent documents?
No
Please comment if necessary:
If the different software packages produce the same recognition result for a character, no manual check is carried out. If the results differ, the character is highlighted and checked manually.
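The cross-checking described in this answer can be illustrated with a minimal sketch. It assumes that each OCR package returns a string for the same text line and that the strings are already aligned character by character; the function names are hypothetical and do not represent the contractor's actual tooling.

```python
from typing import List, Sequence


def positions_for_manual_check(engine_outputs: Sequence[str]) -> List[int]:
    """Return the character positions where the OCR engines disagree.

    Assumption: all outputs describe the same text line and have equal
    length, so position i in one output corresponds to position i in
    every other output.
    """
    if not engine_outputs:
        raise ValueError("at least one engine output is required")
    length = len(engine_outputs[0])
    if any(len(text) != length for text in engine_outputs):
        raise ValueError("engine outputs must be aligned to the same length")

    flagged = []
    # zip(*outputs) yields, for each position, the candidate character
    # proposed by every engine.
    for position, candidates in enumerate(zip(*engine_outputs)):
        if len(set(candidates)) > 1:
            # Engines disagree: highlight this character for manual check.
            flagged.append(position)
    return flagged


def highlight(text: str, flagged: Sequence[int]) -> str:
    """Mark flagged characters with square brackets for a human reviewer."""
    marked = set(flagged)
    return "".join(f"[{ch}]" if i in marked else ch for i, ch in enumerate(text))
```

For example, given the three outputs "专利申请", "专利中请" and "专利申请", only the third character would be flagged, and characters on which all engines agree would pass without manual review.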
21. What hardware does your Office, or its contractor, use to perform the OCR of patent documents?
SIPO uses standard PCs for the OCR of patent documents.
Section 6: Workflow
22. Please describe the workflow for the OCR of your patent documents:
SIPO's OCR workflow consists of eight steps: scanning, recognition, vertical word check, horizontal word check, text check, tagging, tag check and quality check.
23. Does your Office itself check the quality of its OCRed patent documents?
Yes
Please provide a concise description of how the quality check is carried out:
The OCRed documents are sampled and the samples are manually checked.
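A sampling check of this kind could look roughly like the sketch below. The sample size, the use of a fixed random seed, and the application of the 99.99% figure from Question 14 to the sampled documents are all assumptions made for illustration; the response does not state how the samples are drawn or evaluated.

```python
import random
from typing import List, Optional, Sequence


def draw_sample(document_ids: Sequence[str], sample_size: int,
                seed: Optional[int] = None) -> List[str]:
    """Pick documents at random (without replacement) for manual checking."""
    rng = random.Random(seed)  # a seed makes the selection reproducible
    k = min(sample_size, len(document_ids))
    return rng.sample(list(document_ids), k)


def meets_accuracy(total_chars: int, error_chars: int,
                   max_errors_per_10k: int = 1) -> bool:
    """Check a manually reviewed sample against an accuracy requirement.

    99.99% accuracy corresponds to at most 1 wrong character per 10,000.
    Integer arithmetic avoids floating-point rounding at the threshold.
    """
    if total_chars <= 0:
        raise ValueError("total_chars must be positive")
    return error_chars * 10_000 <= total_chars * max_errors_per_10k
```

Under these assumptions, a reviewed sample of 10,000 characters with one error would just meet the requirement, while two errors in the same sample would not.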
24. Please describe how your Office handles patent documents found to be defective later in the process (e.g., after publication):
The OCRed documents are first returned to applicants so that they can spot defects, and examiners are also expected to check the accuracy of the OCRed documents. Documents identified as defective are sent back to the contractor for reprocessing. Documents found to be defective after publication are republished or re-announced, with A8, A9, etc. as the document type identification.
25. Please provide a concise response to the following issues concerning the storage of OCRed patent documents:
(a) Format(s) in which your Office stores the OCRed patent documents:
In XML format.
(b) Does the storage format(s) used by your Office allow for later quality improvements either by your Office or by external contractors?
Yes
(c) Does the storage format used by your Office allow for quick identification of patent documents with OCR defects?
No
(d) Does the storage format used by your Office allow for different renditions to view or exchange the OCRed patent documents (e.g., PDF, HTML)?
Yes
(e) Does the storage format used by your Office retain all the raw detailed information obtained from the OCR process (e.g., individual character accuracy estimation, position in image, etc.)?
Yes
Please comment if necessary:
For Complex Work Units (CWUs) such as mathematical and chemical formulae and tables, the position on the page can be retained.
(f) Does the storage format used by your Office capture, in text format, table contents, and mathematical and chemical formulae?
No.
26. Is the OCR of patent documents also used to increase the efficiency of the work of the Office, (e.g., bibliographic data input from paper applications can be considerably speeded up with accurate OCR)?
Yes
Please provide a concise description of specific features indicating how the efficiency is increased:
The OCR of patent documents facilitates security screening, the writing of notifications and the rewriting of abstracts by examiners, among other tasks.
27. Does your Office OCR documents other than patent documents?
Yes
Please indicate which documents:
The amendment documents and observations from applicants are also OCRed.
28. If it is known by your Office, please provide a description of the usages by your customers of the documents OCRed by your Office (e.g., internal office patent application searches by examiners, Internet patent application searches by the public, electronic products sold to private subscribers, etc.):
SIPO's OCRed documents are incorporated in the search database so that they can be searched by examiners and the public, among other uses.
29. Does your Office use OCRed patent documents provided by other offices?
No
Section 7: Additional comments
30. Please provide further comments regarding the implementation and promotion of WIPO Standard ST.22, as well as OCR practices of your Office, if you feel it is necessary: