{"id":24165,"date":"2021-01-21T19:24:30","date_gmt":"2021-01-21T19:24:30","guid":{"rendered":"https:\/\/www.rightsdirect.com\/?post_type=blog_post&p=24165"},"modified":"2021-01-21T19:24:30","modified_gmt":"2021-01-21T19:24:30","slug":"fuenf-empfehlungen-fuer-smartes-text-und-data-mining","status":"publish","type":"blog_post","link":"https:\/\/www.rightsdirect.com\/de\/blog\/fuenf-empfehlungen-fuer-smartes-text-und-data-mining\/","title":{"rendered":"F\u00fcnf Empfehlungen f\u00fcr smartes Text- und Data-Mining"},"content":{"rendered":"

Recent studies<\/a> show that submission rates at scientific journals rose exponentially in the first months of 2020. As the volume of available information keeps growing, R&D-intensive companies are increasingly turning to text and data mining of full-text scientific literature, both at scale and within individual projects, to extract information and strengthen their knowledge supply chain. Naturally, these efforts vary in scope and requirements from company to company, or even from project to project. When working with full-text content, a knowledge manager has many factors to consider in developing an optimal workflow that fits the needs of the organization. Below, we take a closer look at a few of these factors.<\/p>\n

1) End-to-End Workflow<\/strong><\/p>\n

As a knowledge manager, you need to understand your company\u2019s expected end-to-end workflow for text mining full-text literature. It can be useful to map the anticipated inputs and outputs at each phase of the workflow and to clarify expected timelines and business criticality. This applies both to any backend data processing pipeline and to the end-user workflows that depend on it. By treating the workflow as one continuous stream, a knowledge manager can ensure that adjustments upstream do not break processes downstream.<\/p>\n

2) Corpus Parameters<\/strong><\/p>\n

The parameters for defining a full-text corpus of scientific literature will vary depending on the organization\u2019s end-to-end workflow. For example, the dimensions of a corpus used for text mining in a specific project \u2013 such as a pharmacovigilance workflow \u2013 will differ from those used in broader initiatives to process scientific information at scale, apply machine learning or artificial intelligence capabilities, or construct knowledge graph representations. In narrower use cases, specific queries may rely on keywords or subject-related metadata, such as Medical Subject Headings (MeSH) or other indexing aids, to pull relevant content based on the project specifications. The broader the use case, the less likely an organization is to be able to pre-filter for specific topics; in these cases, broader selection criteria, such as publication date range or journal title, need to be applied. Based on the end-to-end workflow envisioned, knowledge managers can help their stakeholders by identifying key questions that will define the approach to creating a useful corpus, such as:<\/p>\n