The Inverse Tree-OLAP Problem: Definitions, Models, Complexity Analysis, and a Possible Solution

Document Type

Conference Proceeding

Publication Date

2018

DOI

https://doi.org/10.1145/3216122.3216129

Abstract

A count constraint is a data dependency that requires the results of given count operations on a relation to fall within a certain range. Count constraints have recently been used to introduce a new decision problem, called Inverse OLAP: given a flat fact table, does there exist an instance satisfying a set of given count constraints? This paper focuses on a special case of Inverse OLAP, called Inverse Tree-OLAP, in which the key of the flat fact table is modeled by a Dimensional Fact Model (DFM) with a tree structure. The count constraints define aggregation patterns to be respected by both the many-to-many relationship among the basic dimensions and the one-to-many relationships within dimension hierarchies. Each count constraint is required to have a particular structure, so that the problem of handling fact table projections with duplicates is avoided. This simplified structure enables an effective solution method consisting of three main steps: (1) using some of the count constraints to extract a subproblem that is formulated as a known data mining problem (inverse frequent itemset mining); (2) solving the subproblem with a recent method that has been shown to be effective in practice, even on large instances; and (3) enforcing the remaining count constraints on the solution returned by step 2 using a system of linear equations. The overall approach can be used to generate OLAP cubes for benchmarking that reflect the patterns of real datasets.
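
To make the notion of a count constraint more concrete, the following is a minimal Python sketch of the forward check only: given a small fact table instance, it tests whether a set of range constraints on counts holds. It is not the paper's formal definition and does not address the inverse problem of constructing an instance. The attribute names (`product`, `store`), the sample rows, and the helpers `count_matching` and `satisfies` are all hypothetical, and a constraint is assumed here to be an attribute-value pattern with a lower bound and an optional upper bound.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, str]
# Assumed constraint shape: (pattern, lower bound, upper bound or None for unbounded).
Constraint = Tuple[Dict[str, str], int, Optional[int]]


def count_matching(rows: List[Row], pattern: Dict[str, str]) -> int:
    """Count the rows whose attributes equal every value fixed in `pattern`."""
    return sum(
        all(row.get(attr) == value for attr, value in pattern.items())
        for row in rows
    )


def satisfies(rows: List[Row], pattern: Dict[str, str],
              lo: int, hi: Optional[int]) -> bool:
    """Return True if the count for `pattern` lies in the range [lo, hi]."""
    n = count_matching(rows, pattern)
    return lo <= n and (hi is None or n <= hi)


# Hypothetical fact table instance with two basic dimensions.
fact_table: List[Row] = [
    {"product": "tea", "store": "rome"},
    {"product": "tea", "store": "milan"},
    {"product": "coffee", "store": "rome"},
]

# Hypothetical count constraints.
constraints: List[Constraint] = [
    ({"product": "tea"}, 1, 2),                       # between 1 and 2 tea facts
    ({"store": "rome"}, 2, None),                     # at least 2 facts in rome
    ({"product": "coffee", "store": "milan"}, 0, 0),  # this combination is forbidden
]

all_ok = all(satisfies(fact_table, p, lo, hi) for p, lo, hi in constraints)
```

Under these assumptions, Inverse OLAP asks the reverse question: whether any instance of the fact table exists for which such a check succeeds on every given constraint.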
