Thesis abstract:
The term information overload refers to the difficulty of understanding a situation and making decisions when too much information is available. In the era of Big Data this problem is becoming acute, since users may be literally overwhelmed by the flood of data accessible in the most varied forms. With context-aware data tailoring, given a target application, the system lets the user access, in each specific context, only the view (over a global data schema) that is relevant for that application in that context. This normally produces a large reduction in the mass of available data, along with a specialization of the data to the user's current personal interests.
This doctoral research begins by considering an existing context model, the Context Dimension Model (CDM), which represents all the available contexts in a given scenario (together constituting the context schema) through a tree-shaped structure. Starting from the CDM, some context-related issues particularly relevant for data tailoring are studied.
First, an RDF representation of context schemas is proposed, providing suitable RDFS classes and properties. A complete and independent set of RDF integrity constraints is used to guarantee that the representation complies with the CDM definition. To this end, some categories of constraints already defined in the literature are employed together with novel ones; SPARQL queries that check whether the new kinds of constraints are satisfied are proposed, and some theoretical properties are investigated.
Second, context schema evolution is considered. The perspectives that are useful in context-aware data reduction depend on the application requirements, which are intrinsically dynamic and thus can evolve. In this scenario it is natural that some context-aware applications are not up to date, since they still use obsolete context schemas. This issue is tackled by defining a set of evolution operators that the designer must employ to perform the updates; each operator is also associated with the changes that have to be applied to the contexts defined according to the old schema in order to make them compliant with the new one. Moreover, we study the implications of schema evolution for the association between contexts and data, and provide techniques to optimize sequences of operators. A prototype tool implementing the proposed operators confirms the effectiveness of our strategy.
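The pairing of a schema change with a matching change to existing contexts can be illustrated with a minimal Python sketch. It is not the operator set defined in the thesis: the schema is flattened to a mapping from dimensions to admissible values (ignoring the tree nesting of the CDM), and the operator names `rename_value` and `delete_value`, the helper `is_valid`, and the sample dimensions are all hypothetical.

```python
from typing import Callable, Dict, Set, Tuple

# Assumption: a flat simplification of a context schema (dimension -> values).
Schema = Dict[str, Set[str]]
# Assumption: a context assigns at most one value to each dimension.
Context = Dict[str, str]
Migration = Callable[[Context], Context]


def is_valid(schema: Schema, context: Context) -> bool:
    """A context is valid if every element names a known dimension and value."""
    return all(d in schema and v in schema[d] for d, v in context.items())


def rename_value(schema: Schema, dimension: str, old: str, new: str) -> Tuple[Schema, Migration]:
    """Hypothetical operator: rename a value and return how to update old contexts."""
    new_schema = {d: set(vs) for d, vs in schema.items()}  # copy, do not mutate input
    new_schema[dimension].remove(old)
    new_schema[dimension].add(new)

    def migrate(context: Context) -> Context:
        # Rewrite only the element that refers to the renamed value.
        return {d: (new if d == dimension and v == old else v) for d, v in context.items()}

    return new_schema, migrate


def delete_value(schema: Schema, dimension: str, value: str) -> Tuple[Schema, Migration]:
    """Hypothetical operator: delete a value and drop it from old contexts."""
    new_schema = {d: set(vs) for d, vs in schema.items()}
    new_schema[dimension].discard(value)

    def migrate(context: Context) -> Context:
        # Remove the element that refers to the deleted value; keep the rest.
        return {d: v for d, v in context.items() if not (d == dimension and v == value)}

    return new_schema, migrate


schema = {"role": {"guest", "manager"}, "location": {"office", "home"}}
old_context = {"role": "guest", "location": "home"}

renamed_schema, migrate_rename = rename_value(schema, "role", "guest", "visitor")
deleted_schema, migrate_delete = delete_value(schema, "location", "home")
```

In this sketch each operator returns both the new schema and a migration function, so a context written against the old schema can be brought into line with the new one by applying the migrations in the same order as the operators.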
Finally, we leave context modeling issues to deal with the interaction between context and user preferences. To determine the most suitable data portion for a certain user in a certain context, the contextual information may be coupled with the user's personal preferences. Contexts are obtained by combining the values of the various dimensions, and the number of possible configurations may rapidly grow to several hundred even for small context schemas; requiring users to manually specify long lists of preferences for each possible context is therefore unrealistic. In this research, we propose a methodology in which contextual preferences on tuples and attributes of a relational database are learned from the user's previous querying activity, gathering knowledge in the form of association rules. Experimental results highlight both the effectiveness of the approach and the utility of enriching user preferences with contextual information.
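As a rough illustration of learning contextual preferences as association rules, the following Python sketch mines rules of the form "context element implies chosen value" from a toy query log, using the standard support and confidence measures. It is a simplification under stated assumptions, not the methodology of the thesis: rules have a single context element as antecedent, the log format and the function `mine_rules` are hypothetical, and the sample data is invented for the example.

```python
from collections import Counter
from typing import Dict, FrozenSet, List, Tuple

# Assumption: each log entry pairs the active context (a set of
# "dimension=value" strings) with the value the user's query selected.
LogEntry = Tuple[FrozenSet[str], str]
Rule = Tuple[str, str]  # (context element, chosen value)


def mine_rules(
    log: List[LogEntry], min_support: float = 0.2, min_confidence: float = 0.6
) -> Dict[Rule, Tuple[float, float]]:
    """Return rules element -> chosen value with their (support, confidence)."""
    n = len(log)
    antecedent_count: Counter = Counter()
    pair_count: Counter = Counter()
    for context, chosen in log:
        for element in context:
            antecedent_count[element] += 1
            pair_count[(element, chosen)] += 1

    rules = {}
    for (element, chosen), count in pair_count.items():
        support = count / n  # fraction of all entries matching the rule
        confidence = count / antecedent_count[element]  # fraction within the context element
        if support >= min_support and confidence >= min_confidence:
            rules[(element, chosen)] = (support, confidence)
    return rules


# Invented sample log, for illustration only.
log = [
    (frozenset({"time=lunch", "location=office"}), "cuisine=salad"),
    (frozenset({"time=lunch", "location=office"}), "cuisine=salad"),
    (frozenset({"time=dinner", "location=home"}), "cuisine=pizza"),
    (frozenset({"time=lunch", "location=home"}), "cuisine=salad"),
    (frozenset({"time=dinner", "location=home"}), "cuisine=pizza"),
]
rules = mine_rules(log)
```

With this log, the rule from `time=lunch` to `cuisine=salad` is kept, while the rule from `location=home` to `cuisine=salad` is discarded because its confidence falls below the threshold. A fuller treatment would consider antecedents made of several context elements.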