
User-Adaptive System For Document Authoring

1 AUTOPAT OVERVIEW

AutoPat is an NLP application consisting of an interactive technical knowledge elicitation module and a fully automatic text generation module. The input to the system is natural language phrases. Apart from the automatic generator, AutoPat includes morphological and semantico-syntactic analyzers that convert natural language input into a shallow knowledge representation. The two stages of AutoPat are not strictly pipelined: lexical selection and some other text planning tasks are interleaved with the process of content specification. The latter results in the production of a "draft" claim. This draft, while not yet an English text, is a list of proposition-level structures ("templates") specifying the proposition head and case role values filled by POS-tagged word strings. The draft is then submitted to an automatic text planner which, using weak methods, outputs a hierarchical structure of templates ordered according to rhetorical and stylistic requirements.
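As an illustration only, the "draft" claim described above could be modelled roughly as follows. This is a minimal sketch, not AutoPat's actual representation: the names TaggedWord, Template and role_text, the role labels and the POS tags are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class TaggedWord:
    text: str
    pos: str  # part-of-speech tag; this tag set is hypothetical


@dataclass
class Template:
    head: str  # proposition head (a predicate)
    # case role name -> filler, stored as a POS-tagged word string
    roles: dict = field(default_factory=dict)


def role_text(template, role):
    """Return the plain-text filler of a case role, or '' if the role is unfilled."""
    return " ".join(word.text for word in template.roles.get(role, []))


# A draft claim is a flat list of templates; ordering them into a
# hierarchy is the job of the text planner and is not modelled here.
draft = [
    Template(
        head="mount",
        roles={
            "what": [TaggedWord("door", "N")],
            "where": [TaggedWord("on", "PREP"), TaggedWord("frame", "N")],
        },
    ),
]
```

Keeping the fillers as tagged word strings rather than parsed structures reflects the "shallow" character of the representation described in the text.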
      The analysis and generation algorithms are programmed as a dynamic link library, while the user interface is an executable module that uses the functions of this library. Hash tables and caching are used to speed up data access and processing that would otherwise be rather slow due to the complexity of the processing algorithms. The knowledge supplied by the user is saved in a special *.pcp file format, which stores both the internal representation of the elicited knowledge and its "human" image on the interface screen. The user can quit the program at any moment of an elicitation session and, the next time she starts it, resume her work where she left off. The selection is reproduced exactly as it was, and it is possible to continue the session without delay.
      The AutoPat knowledge base is corpus-based and draws heavily on the sublanguage. It contains AutoPat inherent knowledge and authoring memory (cf. "translation memory"). The inherent knowledge includes two lexicons. The first is a shallow lexicon of lexical units simply listed with their class membership, that is, a morpho-semantic classification of words and phrases; this lexicon is used for content support in claim composition and for morphological analysis of the input. The second is a deep (information-rich) lexicon of predicates (heads of predicative phrases describing essential features of an invention). This deep lexicon is the main part of the AutoPat static knowledge and covers both the lexical and, crucially for our system, the syntactic knowledge. It is used both to provide content support for technical knowledge elicitation and for generation heuristics. The user can automatically customize this lexicon.
      The AutoPat authoring memory contains lists of terminological units (words and phrases) that were used during AutoPat sessions. Each unit is annotated with the document(s) it was used in. This supports content specification and terminology consistency.
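The idea of an authoring memory whose units are annotated with their source documents can be sketched as below. The class name AuthoringMemory, its methods, the document names and the case-insensitive matching are assumptions for illustration, not a description of AutoPat's implementation.

```python
from collections import defaultdict


class AuthoringMemory:
    """Hypothetical store of terminological units and the documents that used them."""

    def __init__(self):
        self._docs_by_term = defaultdict(set)

    def record(self, term, document):
        # Assumption: terms are matched case-insensitively.
        self._docs_by_term[term.lower()].add(document)

    def documents_for(self, term):
        """Return the documents in which the term was used, sorted by name."""
        return sorted(self._docs_by_term.get(term.lower(), set()))

    def terms(self):
        """Return all remembered terms, e.g. to populate a menu."""
        return sorted(self._docs_by_term)


memory = AuthoringMemory()
memory.record("hinged door", "claim_a")
memory.record("Hinged door", "claim_b")
memory.record("steel frame", "claim_a")
```

Looking a term up by document makes it possible to offer the same wording again in a later session, which is one way the terminology consistency mentioned above could be supported.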
      Other characteristic features of AutoPat are a developer's tool kit (hidden from the user) for knowledge acquisition and for testing rules at different levels of linguistic analysis and generation, and an intelligent user-adapted interface for eliciting technical knowledge from the user, which is the main subject of the present paper. The tool kit includes a number of convenient interfaces and compilers that make it possible for a linguist to improve AutoPat output without programmers' help; it will be described in a separate paper.

2 USER INTERFACE

2.1 Interface overview
Our elicitation technique is a domain-dependent automated mixed-initiative interview. All user-computer communication is done in a natural language (English in our case). The knowledge elicitation scenario consists of the system requesting the user, in English, to supply information about the invention. Using common graphical interface tools (mouse support, dialogue boxes, menus, templates and slide bars), the interface draws the user through a step-by-step procedure of describing every essential feature of the invention. It provides content, composition and terminology maintenance support through choices of standing and pull-down menus. These menus supply access to words and phrases required in a claim. Though the user is encouraged to use the AutoPat controlled language given in the menus, she always has the option of typing in the active text areas of the interface windows. If a word is in a menu, it is automatically completed as soon as its first characters are typed. If the word cannot be found in the inherent knowledge of the system, the user is asked to add it through an easy-to-use pop-up entry box. To add a new word to the dictionary, the user is presented with a word template whose slots are automatically filled out with the semantic class of the word and its grammatical forms required by the generator. The user only has to check the fillers and correct them if necessary. All new words added in this way subsequently appear in the interface menus. Phrases constructed by the user are put in the authoring memory and stay displayed in the "Your terminology" screen area through the end of the session. These and some other lexical units displayed on the screen can be transferred to a new text area if the user clicks on them. All transferred phrases can be edited. The user can check the content elicited so far in an output window where the immediate results of each quantum of acquisition are displayed.
If the content appears incorrect, the user can undo the latest quantum of acquisition and do it again correctly. The interface has two main components: the background window, where the results of the elicitation procedure stay displayed through the whole session, and a set of pop-up windows corresponding to elicitation steps. The two modes of the interface share the background window, while the sets of pop-up windows are mainly different. All pop-up windows in both modes can be moved freely around the screen to allow the user to see any part of the background window at any time.
Background Window (see AutoPat screen shots). The left pane of this window is headed "Your invention comprises" and displays a graphical representation of the hierarchy of all main elements and sub-elements once the user has supplied the knowledge about them to the system. The names of the elements at its nodes can be transferred to any of the pop-up windows by simply clicking on them. The right pane is headed "Essential features of your invention". It displays the title of the invention and every essential feature of the invention in the form of a simple sentence that is generated every time the user supplies a quantum of technical knowledge. The results of the elicitation procedure are visualized only to make it possible and convenient for the user to control the results of her session. The simple sentences correspond to statements in the system's internal knowledge representation language that are created by the knowledge elicitation procedures. At the stage of eliciting knowledge about relations between invention elements, a new section headed "Your terminology" appears at the bottom of the left pane. From now on, all phrases used in relation descriptions stay displayed and clickable there for further reuse.
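The automatic completion of menu words mentioned in the interface overview can be illustrated with a simple prefix lookup over a sorted word list. This is a sketch under assumptions: the function name complete, the sample menu and the case-sensitive matching are invented for the example, and the source does not say how AutoPat implements completion.

```python
import bisect


def complete(prefix, sorted_words):
    """Return every word in the sorted menu that starts with the typed prefix."""
    # Binary search finds the first word that is not smaller than the prefix.
    start = bisect.bisect_left(sorted_words, prefix)
    matches = []
    for word in sorted_words[start:]:
        if not word.startswith(prefix):
            break  # the list is sorted, so no later word can match
        matches.append(word)
    return matches


# Hypothetical menu of shape words; the menu must be kept sorted.
menu = sorted(["circle", "cylinder", "cone", "cube"])
```

An empty result corresponds to the case where the word is not known to the system, at which point the interface would offer the pop-up entry box for adding it.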

2.2 Wizard Guide mode
This mode of the interface is highly recommended for a beginner. It guides the user through a step-by-step procedure of describing the essential features of the invention. The main screen elements of the interface are the background window and Wizard windows that contain detailed instructions, the "Help" button and the "Back" button. A brief description of the Wizard windows and their functionalities is given below.
      Title. Helps the user to select the most appropriate title for the invention. This window contains a title template. The slots of this template contain menus of words and phrases for optional inclusion in the title. To compose the title of an invention the user can either select words from the template slot menus or type them in.
      Main Elements. Prompts the user to describe the main elements of the prototype of the invention. This window displays a template of menus similar to that in the Title window.
      Complex Element. Makes the user specify (by highlighting it in the element tree in the background window) the element whose sub-elements it is necessary to include in the claim. The name of the selected element is transferred to the next window to help the user keep in mind what she is working on.
      Sub-Elements. Prompts the user to describe parts of the element selected at the previous step. This window displays a template of menus similar to that in the Title window.
      Element with Novel Characteristics. Makes the user specify (by highlighting it in the element tree) the element whose novel properties (which, according to Patent Law, can only be its shape or material) it is necessary to include in the claim. After an element is selected in the tree, it appears in the active text area of this window, where it can be edited. For example, the user selects the node "four doors" in the element tree and it appears in the text area of the window. The user may now edit it into, say, "one of the doors", and this new phrase will appear in the next window, where its shape or material is described.
      Shape/Material. Prompts the user to describe novel shapes or materials of the elements specified in the previous window. This window is divided into parts. One part contains two menus of shapes and another displays two menus of materials. This gives the user two ways to describe an element. If the word is selected from the pop-up menus in the "shape" or "material" part of the window, AutoPat generates a sentence such as "An element is in the shape of a circle." If the word is selected from one of the standing menus, the user gets a description such as "An element is round". The knowledge about the shape and material of one element can be elicited in one take by just selecting the words in the shape and material menus. This window has an area where the sentences resulting from the elicitation step are generated (in addition to being generated in the background window). This makes it more convenient for the user to control her input.
      Relations. In this procedure the user selects two or more objects in the element tree and then specifies the relation between them. The initial setup in this window involves two menus, one listing names of relation types (semantic classes) and another listing words (predicates) that can describe these relations. One can start by selecting a relation type; once a semantic class is selected, the second menu displays the predicates that belong to this class for further selection. By checking the corresponding radio button it is possible to start directly by selecting a predicate among all the predicates included in the AutoPat knowledge base and listed in the predicate menu. If the selected predicate is polysemantic, i.e. belongs to more than one semantic class, these classes appear in the semantic class menu and the user is asked to select one of them to specify the meaning of the predicate. The user can also type in a new predicate if she does not find the word in the menu. In that case she is guided through a semi-automated and extremely easy procedure for introducing a new word into the underlying predicate dictionary. Selecting a predicate constitutes lexical selection, whereupon the system determines the roles played by the highlighted elements.
      Relation Specification. Presents the user with a predicate (sentence) template based on knowledge about the case-roles (semantic arguments) of the semantic class underlying the selected dictionary item. The user fills the appropriate slots, "What", "Where", "How", and so forth (see AutoPat screen shots). (The system records the boundaries of the fillers and their case-role status; these are used later for morphological disambiguation and syntactic analysis by AutoPat's automatic components.) To make this easier, apart from the clickable nodes in the element tree and the phrases in the "Your terminology" section, every template slot has a pop-up menu of auxiliary phrases from the underlying predicate dictionary entry.
      Co-reference. Highlights coreference candidates and asks the user to mark any elements among them that are coreferential. The coreference candidates are found by the morphosyntactic analyzer and are noun lexemes regardless of their grammatical form.
      Main Claim Format-All. Presents a "checkable" menu of all generated sentences-features. The user can either check the novel features of the invention to thus have a final claim text containing generic and difference parts with the "characterized in that" expression between them, which is a must according to the European Patent Office, or skip this stage. In the latter case the final claim text will be generated without generic and difference parts in the format accepted by the US Patent Office.
      Main Claim Format-Generic and Main Claim Format-Difference. Appear only if the underlying meaning representation (hidden from the user) of the generic or difference part of the claim built by the generator exceeds a given threshold of complexity. Each presents a "checkable" menu of generic/difference sentences-features for the user to check those features of the invention that are more closely related. This breaks the corresponding knowledge representation into two parts, thus improving the quality of the generator output.
      Main Claim Text. Presents the output of the AutoPat generator: the main claim text in a legally acceptable format (see AutoPat screen shots). If necessary, the user may edit the text right in this interface window.
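The choice made in the Main Claim Format-All window, described above, amounts to partitioning the feature sentences into a generic part and a difference part. The sketch below shows only that partitioning step; the function name split_claim, the returned dictionary layout and the sample sentences are assumptions, and the actual text planning and generation done by AutoPat are not modelled.

```python
CONNECTIVE = "characterized in that"


def split_claim(features, checked):
    """Partition feature sentences by the indices the user checked as novel.

    If nothing is checked, the stage is treated as skipped and the claim
    is left without generic and difference parts.
    """
    if not checked:
        return {"generic": list(features), "connective": None, "difference": []}
    checked = set(checked)
    generic = [f for i, f in enumerate(features) if i not in checked]
    difference = [f for i, f in enumerate(features) if i in checked]
    return {"generic": generic, "connective": CONNECTIVE, "difference": difference}


# Hypothetical feature sentences of the kind shown in the background window.
features = [
    "The cabinet comprises a frame.",
    "The door is mounted on the frame.",
    "The door is round.",
]
two_part = split_claim(features, checked=[2])
unmarked = split_claim(features, checked=[])
```

Here two_part corresponds to the format with the "characterized in that" expression, and unmarked to the format generated when the user skips this stage.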

The initiative in the Wizard scenario is mixed: the human can use any number of iterations working with windows eliciting elements, shapes, materials or relations but the order in which the user is guided from window to window and, in the case of eliciting coreferences, the order of the presentation of candidates is controlled by the interface.
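The interplay between the Relations and Relation Specification windows (selecting a predicate, resolving its semantic class if it is polysemantic, then filling a case-role template) might be sketched as follows. The mini-lexicon, the class names, the "From what" role and all function names are hypothetical; only the slot labels "What", "Where" and "How" come from the text.

```python
# Hypothetical predicate lexicon: predicate -> semantic classes it belongs to.
PREDICATE_CLASSES = {
    "mount": ["attachment"],
    "cover": ["location", "protection"],
}

# Hypothetical case-role slots for each semantic class.
CLASS_ROLES = {
    "attachment": ["What", "Where", "How"],
    "location": ["What", "Where"],
    "protection": ["What", "From what"],
}


def classes_for(predicate):
    """Return the semantic classes of a predicate, or [] if it is unknown."""
    return PREDICATE_CLASSES.get(predicate, [])


def needs_disambiguation(predicate):
    """A polysemantic predicate requires the user to pick one class."""
    return len(classes_for(predicate)) > 1


def empty_template(predicate, semantic_class):
    """Build an unfilled sentence template for the chosen predicate sense."""
    if semantic_class not in classes_for(predicate):
        raise ValueError(f"{predicate!r} does not belong to class {semantic_class!r}")
    slots = {role: None for role in CLASS_ROLES[semantic_class]}
    return {"head": predicate, "class": semantic_class, "slots": slots}
```

An unknown predicate yields an empty class list, which in the interface corresponds to the point where the user is guided through adding a new word to the predicate dictionary.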

2.3 Professional Mode
The Professional interface mode is designed for a trained user, who is instructed by window buttons. The initiative in the Professional scenario lies mainly with the human. The content of knowledge elicitation and its output are the same as in Wizard Guide, but Professional allows for more speed and flexibility when authoring a claim: the user may freely navigate among the stages of claim composition, authoring them in any order. In the case of coreferences, for example, the user is presented with a list of coreference candidates and is free to decide whether to check any of them and, if so, which. The user can also see a generated claim part at any authoring stage. Professional is especially convenient when editing a claim draft but can also be used for composing a claim from scratch. This mode of the interface keeps the standing background window while the elicitation windows replace one another. The difference is that they do not appear after the user completes a certain part of the interview, as in Wizard, but are called through the Main Menu in any order. The setups of the changeable windows are mainly different from those of Wizard Guide. They are augmented with extra buttons, functionalities and different menus. For example, the Elements window is a merger of the four Wizard windows designed to elicit knowledge about elements and sub-elements. In this window the element tree can easily be restructured, and the names of elements can be edited. All changes made at any stage of authoring propagate through the rest of the draft. Deleting an element in the element tree automatically deletes all invention features-sentences containing this element. Changing an element name in the tree automatically changes its name in all corresponding sentences. Using this mode of the interface it is possible to delete, add or edit any essential feature in a claim draft while keeping the rest of the content intact.
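One way to obtain the propagation of edits described above is to let feature sentences refer to elements by identifier rather than by name, so that renaming or deleting an element is reflected everywhere automatically. The sketch below illustrates that design under assumptions: the classes Feature and Draft, the sentence patterns and the flat (non-hierarchical) element store are invented for the example and are not taken from AutoPat.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    pattern: str    # sentence pattern with positional slots, e.g. "The {0} is round."
    elements: list  # ids of the elements that fill the slots


class Draft:
    """Hypothetical claim draft: element names plus features that reference them."""

    def __init__(self):
        self.names = {}     # element id -> current name
        self.features = []

    def add_element(self, element_id, name):
        self.names[element_id] = name

    def add_feature(self, pattern, *element_ids):
        self.features.append(Feature(pattern, list(element_ids)))

    def rename(self, element_id, new_name):
        # Features store ids, so every sentence picks up the new name.
        self.names[element_id] = new_name

    def delete(self, element_id):
        # Removing an element also removes every feature that mentions it.
        del self.names[element_id]
        self.features = [f for f in self.features if element_id not in f.elements]

    def sentences(self):
        return [
            f.pattern.format(*(self.names[e] for e in f.elements))
            for f in self.features
        ]
```

For example, after adding elements "door" and "frame" and the feature "The {0} is mounted on the {1}.", renaming the first element to "hinged door" changes the rendered sentence without touching the feature itself. Deleting sub-elements together with their parent is not handled in this sketch.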
Pop-up dialogue boxes for dictionary customizing and content support through the menus of words and phrases are provided in the same way as in Wizard. Professional has two extra windows: Dependent Claims and Dependent Claim Text. The former appears only if called by a user who wants to compose a dependent claim; it elicits information about which of the other claims the current one depends on and lets the user return to the feature elicitation pages. The latter presents the text of the main claim and all dependent claims.

>>> see AutoPat screen shots