Data and Information Quality: Dimensions, Principles and Techniques by Carlo Batini, Monica Scannapieco

By Carlo Batini, Monica Scannapieco

This book provides a systematic and comparative description of the vast number of research issues related to the quality of data and information. It does so by delivering a sound, integrated and comprehensive overview of the state of the art and future development of data and information quality in databases and information systems.

To this end, it presents an extensive description of the techniques that constitute the core of data and information quality research, including record linkage (also called object identification), data integration, and error localization and correction, and examines the related techniques in a comprehensive and original methodological framework. Quality dimension definitions and adopted models are also analyzed in detail, and differences between the proposed solutions are highlighted and discussed. Furthermore, while the book systematically describes data and information quality as an autonomous research area, it also includes paradigms and influences deriving from other areas, such as probability theory, statistical data analysis, data mining, knowledge representation, and machine learning. Last but not least, the book also highlights very practical solutions, such as methodologies, benchmarks for the most effective techniques, case studies, and examples.

The book has been written primarily for researchers in the fields of databases and information management or in the natural sciences who are interested in investigating properties of data and information that have an impact on the quality of experiments, processes and on real life. The material presented is also sufficiently self-contained for masters or PhD-level courses, and it covers all the fundamentals and topics without the need for other textbooks. Data and information system administrators and practitioners, who deal with systems exposed to data-quality issues and therefore need a systematization of the field and practical methods in the area, will also benefit from the combination of concrete practical approaches with sound theoretical formalisms.



Similar information theory books

Networks and Grids: Technology and Theory

This textbook is intended for an undergraduate/graduate course on computer networks and for introductory courses dealing with performance evaluation of computers, networks, grids and telecommunication systems. Unlike other books on the subject, this text presents a balanced approach between technology and mathematical modeling.

Future Information Technology - II

The new multimedia standards (for example, MPEG-21) facilitate the seamless integration of multiple modalities into interoperable multimedia frameworks, transforming the way people work and interact with multimedia data. These key technologies and multimedia solutions interact and collaborate with each other in increasingly effective ways, contributing to the multimedia revolution and having a significant impact across a wide spectrum of consumer, business, healthcare, education, and governmental domains.

Extra info for Data and Information Quality: Dimensions, Principles and Techniques

Example text

8 deals with schema dimensions, briefly describing correctness, minimality, completeness, and pertinence and, in more detail, readability and normalization.

In practical cases, coarser accuracy definitions and metrics may be applied. As an example, it is possible to calculate the accuracy of an attribute, called attribute (or column) accuracy, of a relation (relation accuracy), or of a whole database (database accuracy). When considering accuracy for sets of values instead of single values, a further notion of accuracy can be introduced, namely, duplication. Duplication occurs when a real-world entity is stored twice or more in a data source. Of course, if a primary key consistency check is performed when populating a relational table, a duplication problem does not occur, provided that the primary key assignment has been made with a reliable procedure.
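The coarser metrics described above can be illustrated with a minimal Python sketch. It assumes that accuracy is measured as the fraction of stored values that exactly match a trusted reference copy; this ratio, the function names, and the sample records are illustrative assumptions, not definitions taken from the book.

```python
from collections import Counter


def attribute_accuracy(values, reference):
    """Fraction of stored values equal to the matching reference value.

    Assumption: accuracy is an exact-match ratio against a trusted reference.
    """
    if len(values) != len(reference):
        raise ValueError("values and reference must have the same length")
    if not values:
        return 1.0
    return sum(v == r for v, r in zip(values, reference)) / len(values)


def relation_accuracy(rows, reference_rows):
    """Fraction of all cells in a relation that match the reference relation."""
    total = 0
    correct = 0
    for row, ref in zip(rows, reference_rows):
        for column, value in row.items():
            total += 1
            correct += value == ref[column]
    return correct / total if total else 1.0


def find_duplicates(rows, key_columns):
    """Return the key tuples that occur in more than one row."""
    counts = Counter(tuple(row[c] for c in key_columns) for row in rows)
    return [key for key, n in counts.items() if n > 1]


# Hypothetical sample data: row 2 has a misspelled name, and rows 1 and 4
# describe the same real-world person under different surrogate keys.
stored = [
    {"id": 1, "name": "John Smith", "city": "Rome"},
    {"id": 2, "name": "Ana Rossi", "city": "Turin"},
    {"id": 3, "name": "Mary Jones", "city": "Milan"},
    {"id": 4, "name": "John Smith", "city": "Rome"},
]
reference = [
    {"id": 1, "name": "John Smith", "city": "Rome"},
    {"id": 2, "name": "Anna Rossi", "city": "Turin"},
    {"id": 3, "name": "Mary Jones", "city": "Milan"},
    {"id": 4, "name": "John Smith", "city": "Rome"},
]

name_accuracy = attribute_accuracy(
    [r["name"] for r in stored], [r["name"] for r in reference]
)
table_accuracy = relation_accuracy(stored, reference)
duplicates_by_id = find_duplicates(stored, ["id"])
duplicates_by_name_city = find_duplicates(stored, ["name", "city"])

print(name_accuracy)            # 0.75
print(duplicates_by_id)         # []
print(duplicates_by_name_city)  # [('John Smith', 'Rome')]
```

In this sample the check on `id` finds nothing, while the check on `name` and `city` exposes the repeated entity. This mirrors the caveat in the text: a primary key check prevents duplication only when the keys themselves are assigned reliably.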

2 A Classification Framework for Data and Information Quality Dimensions

Dimensions for data quality introduced in this chapter and dimensions for information quality discussed in subsequent chapters of the book can be characterized by a common classification framework that allows us to compare dimensions across different information types. The framework is based on a classification in clusters of dimensions proposed in [45] where dimensions are included in the same cluster according to their similarity.
