As computing power and access to information have grown, data and the knowledge derived from it have become an increasingly important factor in business decision-making. In almost every role today, decisions are made on the basis of collecting and analyzing data. That is true of scholarly publishers as well.

Learn more about this topic in the blog post below.

In my world, there isn’t a day that goes by when I am not doing something with data. The scholarly publishing community is certainly not an exception. Publishers rely on data in a variety of ways: modeling OA institutional agreements, studying publishing trends in order to increase diversity and inclusivity in authorship, and shortening the time between the submission and publication of research.

Understand the Question First

When given a large data set, the temptation is almost always to be drawn into the data and to start exploring, but in practice, I have found that exploring the data should not be the first step. The first step for any analyst is to understand the question that is being asked or the problem that needs to be solved. Data can only be understood in that problem context, and I won’t know which data I need, or what problems that data might have, until I know what I am trying to do with it. As you explore the data, you begin to discover new questions, which lets you return to your stakeholders, gather more information, and gain a deeper understanding of what is being asked. Analytics always starts with a question, and it is worth spending time defining that question before any data analysis.

Finding Value in Cleaning Data

There is a common saying that cleaning data takes 80%-90% of a data scientist’s or analyst’s time. There is truth to this statement, but I think that its real meaning is often overlooked. This statement is often spoken with a certain amount of derision as if time is being wasted by having to clean the data. However, I find that the time spent cleaning data is where value is added. I equate data cleaning to data quality, ensuring the data is fit for purpose and a fair representation of the real-world entities and events it depicts. It is in data cleaning that you learn just how well your data represents the problem you seek to solve.
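As a minimal sketch of what "fit for purpose" checks can look like, the Python below counts three common problems in a small set of records: missing dates, duplicate identifiers, and publication dates that precede submission dates. The field names (`manuscript_id`, `submitted`, `published`) and the sample records are hypothetical and chosen only for illustration; real checks would depend on the question being asked.

```python
from datetime import date

def quality_report(records, key="manuscript_id"):
    """Count basic data-quality problems in a list of dict records.

    Field names are hypothetical; adapt them to the data at hand.
    """
    seen = set()
    report = {
        "missing_dates": 0,
        "duplicate_keys": 0,
        "published_before_submitted": 0,
    }
    for rec in records:
        # A repeated identifier suggests the same entity was loaded twice.
        if rec[key] in seen:
            report["duplicate_keys"] += 1
        seen.add(rec[key])

        submitted = rec.get("submitted")
        published = rec.get("published")
        if submitted is None or published is None:
            report["missing_dates"] += 1
        elif published < submitted:
            # An event order that cannot happen in the real world.
            report["published_before_submitted"] += 1
    return report

# Made-up sample records, for illustration only.
records = [
    {"manuscript_id": "A1", "submitted": date(2023, 1, 10), "published": date(2023, 4, 2)},
    {"manuscript_id": "A2", "submitted": date(2023, 2, 1), "published": None},
    {"manuscript_id": "A2", "submitted": date(2023, 2, 1), "published": date(2023, 1, 15)},
]
report = quality_report(records)
print(report)
```

Each count points back to a question for stakeholders (for example, whether a missing publication date means "not yet published" or "not recorded"), which is where the cleaning work tends to add value.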

Data Analysis Doesn’t Stop

It is important to recognize that data cleaning is not entirely an ‘up-front’ process. Your data will shape-shift over time. You need to keep paying attention to how your data is changing so that your models and analyses don’t become invalid. Just as the data will shift, so may the problem you are attempting to solve. Iterative understanding and frequent communication with stakeholders are integral to the data process, to make sure that you are answering the questions you’ve been presented with and that those questions are the ones that matter to your users.
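One simple way to watch for this kind of shift, sketched here under stated assumptions, is to compare a summary statistic from a baseline period against the same statistic from a recent period and flag large changes. The metric (days from submission to publication), the sample values, and the 20% tolerance are all hypothetical; a real check would use whatever measure and threshold suit the problem.

```python
from statistics import mean

def mean_shift(baseline, current):
    """Relative change in the mean of `current` versus `baseline`.

    Assumes the baseline mean is nonzero.
    """
    base = mean(baseline)
    return (mean(current) - base) / base

def has_drifted(baseline, current, tolerance=0.2):
    """Flag a shift when the mean moves by more than `tolerance` (a fraction)."""
    return abs(mean_shift(baseline, current)) > tolerance

# Made-up values: days from submission to publication in two periods.
baseline_days = [80, 90, 100, 110, 120]
current_days = [120, 130, 140, 150, 160]

drifted = has_drifted(baseline_days, current_days)
print(drifted)
```

A flag like this does not say why the data changed; it is only a prompt to go back to the data and to stakeholders to find out.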

Take Action with Your Data

There are plenty of well-known examples from the history of statistics and analytics that illustrate the power of data to reveal previously unseen trends, from Florence Nightingale’s work during the Crimean War, which showed that more British soldiers were dying from contagious diseases than in battle, to W.E.B. Du Bois’s vivid infographics that helped to explain systematic racism. I come from the world of product management, and I have a very simple yet powerful model for explaining how to build products: we start with human experiences, we find words for them, and then we build systems around them.

As Senior Product Manager, Analytics, I work with the foundational engineering teams to define and build the data processing pipelines and analytics capabilities needed to support our data operations, internal reporting and intelligence, and external analytics and data product needs. If you stopped by my office at any given time, I would probably be writing queries to answer a question, looking at our data to learn how we can do things better, or having a conversation about data models and technology.

In my work at CCC, we use data to drive what we do. We use it when we are analyzing problems and trying to find their root causes. We also use data proactively, to find potential problems to fix or new areas of interest that can provide value to our partners. And above all, we use data to learn how we are doing and how we can improve.

This was originally written by Stephen Howe and published on the Velocity of Content blog on copyright.com.

Author: RD

RightsDirect, a subsidiary of Copyright Clearance Center, provides advanced information and data integration solutions to organizations across Europe and Asia. A pioneer in voluntary collective licensing, CCC is a leading provider of information solutions to organizations around the world. With deep expertise in copyright, technology, specialized content, PIDs, FAIR data principles, metadata, and more, CCC works to strengthen copyright, accelerate the exchange of knowledge, and advance innovation. CCC and its subsidiary RightsDirect help companies harness the power of data, AI, and machine learning to make strategic decisions, grow their business, and gain a competitive advantage.