Patents are among the few types of public information that have a major impact on the European economy and whose proper monitoring, retrieval, representation, interpretation, and assessment so clearly depend on access to their content and, thus, on advances in semantics-based techniques. However, research and development in patent processing still focuses on selected traditional tasks such as text retrieval, classification, and shallow linguistic analysis. Recent initiatives that target automatic access to the content of patents attempt to cover all knowledge areas, which forces them to rely on term frequency, term co-occurrence, and grammatical term categories. That is, despite the use of a Semantic Web-based formalism, the resulting representation is not a genuine content representation. As a consequence, tasks that ultimately require knowledge-based multimedia techniques (content-oriented search, assessment, abstracting, etc.) are still, to a large extent, carried out manually.
PatExpert’s overall scientific goal is to change the paradigm currently followed in patent processing from textual processing (viewing patents as text blocks enriched by “canned” picture material, sequences of morpho-syntactic tokens, or collections of syntactic structures) to semantic processing (viewing patents as multimedia knowledge objects). PatExpert will develop a multimedia content representation formalism based on Semantic Web technologies for selected technology areas. It will investigate retrieval, classification, multilingual generation of concise patent information, assessment, and visualization of patent material encoded in this formalism, taking into account the information needs of all user types as defined in a user typology. PatExpert’s technological goal is to develop a showcase that demonstrates the viability of its approach to content representation for real applications. The composition and competence of the Consortium ensure the achievement of these goals.