{"id":41489,"date":"2022-11-30T15:25:57","date_gmt":"2022-11-30T15:25:57","guid":{"rendered":"https:\/\/www.rightsdirect.com\/?post_type=blog_post&#038;p=41489"},"modified":"2023-02-16T13:49:35","modified_gmt":"2023-02-16T13:49:35","slug":"3-tipps-zur-datenpipeline","status":"publish","type":"blog_post","link":"https:\/\/www.rightsdirect.com\/de\/blog\/3-tipps-zur-datenpipeline\/","title":{"rendered":"3 Tipps zum Einbinden von Volltextartikeln in Ihre Datenpipeline"},"content":{"rendered":"\n<p><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">R&amp;D organizations today invest heavily in scientific literature. Sophisticated knowledge management teams recognize the value of research data across the enterprise and increasingly supplement traditional journal subscription packages with data feeds and the corresponding rights, so that they can use them in a data pipeline with AI and machine learning techniques.<\/h2>\n\n\n\n<h2 class=\"wp-block-heading\">Read the following blog post, first published on the Copyright Clearance Center blog, to learn how to incorporate full-text articles into your data pipeline.<\/h2>\n\n\n\n<p><\/p>\n\n\n\n<p>Once these feeds are ingested, project teams, internal data lakes, and applications can use the data for anything from early-phase R&amp;D to competitive intelligence, M&amp;A, and licensing, to post-market surveillance and pharmacovigilance.<\/p>\n\n\n\n<p>These groups often face a variety of challenges, and opportunities, in getting the most return on the organization\u2019s investment. Here are a few examples of challenges you may face when normalizing full-text XML data, with tips to overcome them:<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Delivery method integration<\/strong><\/h3>\n\n\n\n<p>A data provider may deliver materials via SFTP, API, AWS S3 bucket, or some other option, requiring proper scheduling of data transfer 
jobs. Ideally, these jobs need minimal manual intervention while still giving you oversight of data feeds that have missed scheduled deliveries or produced anomalies (such as an unusual volume of data). The farther upstream such anomalies can be noticed and acted on, the better.<\/p>\n\n\n\n<p>Tip: Take a baseline of your data feeds and then regularly calculate variance against this baseline to detect potential underlying changes. Some changes may turn out to be perfectly explainable, such as a change in the ownership of a journal that results in its disappearance from a longstanding delivery. In other cases, the comparison may turn up discrepancies that are true mistakes, giving you the opportunity to rectify them.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Data parsing<\/strong><\/h3>\n\n\n\n<p>Across data providers, and even within a single provider, there can and will be format variations. Changes over time, and from imprint to imprint in the case of published journals that may have changed ownership, require attention at the parsing stage. In one project at CCC, ingesting full-text data from more than 50 STM publishers meant accounting not only for variations across more than 10 stated XML formats (including NLM, JATS, and proprietary formats), but also for varying levels of adherence to those stated formats.<\/p>\n\n\n\n<p>Tip: Discuss the potential for variation with your data provider(s) in advance, probing for differences over time and across the provider\u2019s lines of business.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Desired user experience<\/strong><\/h3>\n\n\n\n<p>Someone, somewhere downstream in the data pipeline, will do something with these data. 
How do they expect to interact with the data, and does that require further processing to meet their needs?<\/p>\n\n\n\n<p>Examples to consider:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>If the user needs to interact with tabular data or figures, are these provided in a consistent manner and properly extracted from the data?<\/li><li>If the data feed(s) are to be merged with other aggregate feeds of data from repositories like MEDLINE\/PubMed, how will the pipeline identify and manage duplicate records, ensuring the appropriate metadata survives for the end user\u2019s needs?<\/li><li>Does longer textual information need to be enriched or annotated using vocabularies or ontologies to support consistent search\/discovery, analytics, or knowledge graph applications?<\/li><\/ul>\n\n\n\n<p>Tip: Establish a clear set of requirements for data parsing, linking those to the business benefits your stakeholders expect downstream. With these requirements in hand, you can then prioritize work and define phases of scope. 
For example, tabular data and figures might be difficult to extract in a first phase and could be set aside while you refine your approach.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>A single point of access for article content in normalized XML format<\/strong><\/h3>\n\n\n\n<p>Insights that can be found only in the full text of scientific articles undoubtedly enrich AI, machine learning, and data visualization projects. <a href=\"https:\/\/www.copyright.com\/solutions-rightfind-xml\/\">With RightFind XML<\/a>, organizations can choose from a variety of flexible models to access normalized, full-text scientific literature in XML format, licensed for commercial text and data mining.<\/p>\n\n\n\n<p class=\"has-text-align-center has-small-font-size\"><em>This post is by Michael Iarrobino, Director of Product Management, and originally appeared on CCC&#8217;s Velocity of Content blog.<\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>F&amp;E-Organisationen t\u00e4tigen heute erhebliche Investitionen in wissenschaftliche Literatur. 
Anspruchsvolle Wissensmanagement-Teams erkennen die Bedeutung von Forschungsdaten im gesamten Unternehmen und erg\u00e4nzen&nbsp;&hellip;<\/p>\n","protected":false},"author":242,"featured_media":41490,"template":"","meta":{"_acf_changed":false,"inline_featured_image":false,"footnotes":"","_links_to":"","_links_to_target":""},"internal_tag":[],"topic":[],"coauthors":[],"class_list":["post-41489","blog_post","type-blog_post","status-publish","has-post-thumbnail","hentry"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.2 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>3 Tipps zum Einbinden von Volltextartikeln in Ihre Datenpipeline - RightsDirect<\/title>\n<meta name=\"description\" content=\"Dieser Artikel enth\u00e4lt drei Tipps f\u00fcr die Aufnahme von Volltextartikeln in die Datenpipeline Ihres Unternehmens.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.rightsdirect.com\/de\/blog\/3-tipps-zur-datenpipeline\/\" \/>\n<meta property=\"og:locale\" content=\"de_DE\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"3 Tipps zum Einbinden von Volltextartikeln in Ihre Datenpipeline - RightsDirect\" \/>\n<meta property=\"og:description\" content=\"Dieser Artikel enth\u00e4lt drei Tipps f\u00fcr die Aufnahme von Volltextartikeln in die Datenpipeline Ihres Unternehmens.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.rightsdirect.com\/de\/blog\/3-tipps-zur-datenpipeline\/\" \/>\n<meta property=\"og:site_name\" content=\"RightsDirect\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/RightsDirect\" \/>\n<meta property=\"article:modified_time\" content=\"2023-02-16T13:49:35+00:00\" \/>\n<meta property=\"og:image\" 
content=\"https:\/\/www.rightsdirect.com\/wp-content\/uploads\/sites\/6\/2022\/11\/3-tips-to-incorporate-data-pipeline-1.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"3000\" \/>\n\t<meta property=\"og:image:height\" content=\"1275\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"twitter:label1\" content=\"Gesch\u00e4tzte Lesezeit\" \/>\n\t<meta name=\"twitter:data1\" content=\"3\u00a0Minuten\" \/>\n\t<meta name=\"twitter:label2\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data2\" content=\"RD\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.rightsdirect.com\/de\/blog\/3-tipps-zur-datenpipeline\/\",\"url\":\"https:\/\/www.rightsdirect.com\/de\/blog\/3-tipps-zur-datenpipeline\/\",\"name\":\"3 Tipps zum Einbinden von Volltextartikeln in Ihre Datenpipeline - RightsDirect\",\"isPartOf\":{\"@id\":\"https:\/\/www.rightsdirect.com\/de\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.rightsdirect.com\/de\/blog\/3-tipps-zur-datenpipeline\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.rightsdirect.com\/de\/blog\/3-tipps-zur-datenpipeline\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.rightsdirect.com\/wp-content\/uploads\/sites\/6\/2022\/11\/3-tips-to-incorporate-data-pipeline-1.jpg\",\"datePublished\":\"2022-11-30T15:25:57+00:00\",\"dateModified\":\"2023-02-16T13:49:35+00:00\",\"description\":\"Dieser Artikel enth\u00e4lt drei Tipps f\u00fcr die Aufnahme von Volltextartikeln in die Datenpipeline Ihres 
Unternehmens.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.rightsdirect.com\/de\/blog\/3-tipps-zur-datenpipeline\/#breadcrumb\"},\"inLanguage\":\"de\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.rightsdirect.com\/de\/blog\/3-tipps-zur-datenpipeline\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"de\",\"@id\":\"https:\/\/www.rightsdirect.com\/de\/blog\/3-tipps-zur-datenpipeline\/#primaryimage\",\"url\":\"https:\/\/www.rightsdirect.com\/wp-content\/uploads\/sites\/6\/2022\/11\/3-tips-to-incorporate-data-pipeline-1.jpg\",\"contentUrl\":\"https:\/\/www.rightsdirect.com\/wp-content\/uploads\/sites\/6\/2022\/11\/3-tips-to-incorporate-data-pipeline-1.jpg\",\"width\":3000,\"height\":1275,\"caption\":\"Datenpipeline, KI, Machine Learning\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.rightsdirect.com\/de\/blog\/3-tipps-zur-datenpipeline\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.rightsdirect.com\/de\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Blog Posts\",\"item\":\"https:\/\/www.rightsdirect.com\/de\/blog\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"3 Tipps zum Einbinden von Volltextartikeln in Ihre Datenpipeline\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.rightsdirect.com\/de\/#website\",\"url\":\"https:\/\/www.rightsdirect.com\/de\/\",\"name\":\"RightsDirect\",\"description\":\"Global Copyright Compliance Solutions | Rights Licensing | Copyright 
Education\",\"publisher\":{\"@id\":\"https:\/\/www.rightsdirect.com\/de\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.rightsdirect.com\/de\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"de\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.rightsdirect.com\/de\/#organization\",\"name\":\"RightsDirect\",\"url\":\"https:\/\/www.rightsdirect.com\/de\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"de\",\"@id\":\"https:\/\/www.rightsdirect.com\/de\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.rightsdirect.com\/wp-content\/uploads\/sites\/6\/2016\/05\/RightsDirect-Logo.RGB-300ppi.jpg\",\"contentUrl\":\"https:\/\/www.rightsdirect.com\/wp-content\/uploads\/sites\/6\/2016\/05\/RightsDirect-Logo.RGB-300ppi.jpg\",\"width\":2000,\"height\":1200,\"caption\":\"RightsDirect\"},\"image\":{\"@id\":\"https:\/\/www.rightsdirect.com\/de\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/RightsDirect\",\"https:\/\/x.com\/RightsDirect\",\"https:\/\/www.linkedin.com\/company\/rightsdirect\",\"https:\/\/www.youtube.com\/user\/copyrightclear\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"3 Tipps zum Einbinden von Volltextartikeln in Ihre Datenpipeline - RightsDirect","description":"Dieser Artikel enth\u00e4lt drei Tipps f\u00fcr die Aufnahme von Volltextartikeln in die Datenpipeline Ihres Unternehmens.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.rightsdirect.com\/de\/blog\/3-tipps-zur-datenpipeline\/","og_locale":"de_DE","og_type":"article","og_title":"3 Tipps zum Einbinden von Volltextartikeln in Ihre Datenpipeline - RightsDirect","og_description":"Dieser Artikel enth\u00e4lt drei Tipps f\u00fcr die Aufnahme von Volltextartikeln in die Datenpipeline Ihres Unternehmens.","og_url":"https:\/\/www.rightsdirect.com\/de\/blog\/3-tipps-zur-datenpipeline\/","og_site_name":"RightsDirect","article_publisher":"https:\/\/www.facebook.com\/RightsDirect","article_modified_time":"2023-02-16T13:49:35+00:00","og_image":[{"width":3000,"height":1275,"url":"https:\/\/www.rightsdirect.com\/wp-content\/uploads\/sites\/6\/2022\/11\/3-tips-to-incorporate-data-pipeline-1.jpg","type":"image\/jpeg"}],"twitter_misc":{"Gesch\u00e4tzte Lesezeit":"3\u00a0Minuten","Written by":"RD"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.rightsdirect.com\/de\/blog\/3-tipps-zur-datenpipeline\/","url":"https:\/\/www.rightsdirect.com\/de\/blog\/3-tipps-zur-datenpipeline\/","name":"3 Tipps zum Einbinden von Volltextartikeln in Ihre Datenpipeline - 
RightsDirect","isPartOf":{"@id":"https:\/\/www.rightsdirect.com\/de\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.rightsdirect.com\/de\/blog\/3-tipps-zur-datenpipeline\/#primaryimage"},"image":{"@id":"https:\/\/www.rightsdirect.com\/de\/blog\/3-tipps-zur-datenpipeline\/#primaryimage"},"thumbnailUrl":"https:\/\/www.rightsdirect.com\/wp-content\/uploads\/sites\/6\/2022\/11\/3-tips-to-incorporate-data-pipeline-1.jpg","datePublished":"2022-11-30T15:25:57+00:00","dateModified":"2023-02-16T13:49:35+00:00","description":"Dieser Artikel enth\u00e4lt drei Tipps f\u00fcr die Aufnahme von Volltextartikeln in die Datenpipeline Ihres Unternehmens.","breadcrumb":{"@id":"https:\/\/www.rightsdirect.com\/de\/blog\/3-tipps-zur-datenpipeline\/#breadcrumb"},"inLanguage":"de","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.rightsdirect.com\/de\/blog\/3-tipps-zur-datenpipeline\/"]}]},{"@type":"ImageObject","inLanguage":"de","@id":"https:\/\/www.rightsdirect.com\/de\/blog\/3-tipps-zur-datenpipeline\/#primaryimage","url":"https:\/\/www.rightsdirect.com\/wp-content\/uploads\/sites\/6\/2022\/11\/3-tips-to-incorporate-data-pipeline-1.jpg","contentUrl":"https:\/\/www.rightsdirect.com\/wp-content\/uploads\/sites\/6\/2022\/11\/3-tips-to-incorporate-data-pipeline-1.jpg","width":3000,"height":1275,"caption":"Datenpipeline, KI, Machine Learning"},{"@type":"BreadcrumbList","@id":"https:\/\/www.rightsdirect.com\/de\/blog\/3-tipps-zur-datenpipeline\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.rightsdirect.com\/de\/"},{"@type":"ListItem","position":2,"name":"Blog Posts","item":"https:\/\/www.rightsdirect.com\/de\/blog\/"},{"@type":"ListItem","position":3,"name":"3 Tipps zum Einbinden von Volltextartikeln in Ihre Datenpipeline"}]},{"@type":"WebSite","@id":"https:\/\/www.rightsdirect.com\/de\/#website","url":"https:\/\/www.rightsdirect.com\/de\/","name":"RightsDirect","description":"Global Copyright Compliance Solutions 
| Rights Licensing | Copyright Education","publisher":{"@id":"https:\/\/www.rightsdirect.com\/de\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.rightsdirect.com\/de\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"de"},{"@type":"Organization","@id":"https:\/\/www.rightsdirect.com\/de\/#organization","name":"RightsDirect","url":"https:\/\/www.rightsdirect.com\/de\/","logo":{"@type":"ImageObject","inLanguage":"de","@id":"https:\/\/www.rightsdirect.com\/de\/#\/schema\/logo\/image\/","url":"https:\/\/www.rightsdirect.com\/wp-content\/uploads\/sites\/6\/2016\/05\/RightsDirect-Logo.RGB-300ppi.jpg","contentUrl":"https:\/\/www.rightsdirect.com\/wp-content\/uploads\/sites\/6\/2016\/05\/RightsDirect-Logo.RGB-300ppi.jpg","width":2000,"height":1200,"caption":"RightsDirect"},"image":{"@id":"https:\/\/www.rightsdirect.com\/de\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/RightsDirect","https:\/\/x.com\/RightsDirect","https:\/\/www.linkedin.com\/company\/rightsdirect","https:\/\/www.youtube.com\/user\/copyrightclear"]}]}},"acf":[],"publishpress_future_workflow_manual_trigger":{"enabledWorkflows":[]},"_links":{"self":[{"href":"https:\/\/www.rightsdirect.com\/de\/wp-json\/wp\/v2\/blog_post\/41489","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.rightsdirect.com\/de\/wp-json\/wp\/v2\/blog_post"}],"about":[{"href":"https:\/\/www.rightsdirect.com\/de\/wp-json\/wp\/v2\/types\/blog_post"}],"author":[{"embeddable":true,"href":"https:\/\/www.rightsdirect.com\/de\/wp-json\/wp\/v2\/users\/242"}],"version-history":[{"count":1,"href":"https:\/\/www.rightsdirect.com\/de\/wp-json\/wp\/v2\/blog_post\/41489\/revisions"}],"predecessor-version":[{"id":41493,"href":"https:\/\/www.rightsdirect.com\/de\/wp-json\/wp\/v2\/blog_post\/41489\/revisions\/41493"}],"wp:featuredmedia":[{"embeddabl
e":true,"href":"https:\/\/www.rightsdirect.com\/de\/wp-json\/wp\/v2\/media\/41490"}],"wp:attachment":[{"href":"https:\/\/www.rightsdirect.com\/de\/wp-json\/wp\/v2\/media?parent=41489"}],"wp:term":[{"taxonomy":"internal_tag","embeddable":true,"href":"https:\/\/www.rightsdirect.com\/de\/wp-json\/wp\/v2\/internal_tag?post=41489"},{"taxonomy":"topic","embeddable":true,"href":"https:\/\/www.rightsdirect.com\/de\/wp-json\/wp\/v2\/topic?post=41489"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.rightsdirect.com\/de\/wp-json\/wp\/v2\/coauthors?post=41489"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}