A Fuzzy Search Model for Dealing with Retrieval Issues in
Some Classes of Dirty Data
Author | Olufade, F. W. ONIFADE, Oladeji, P. AKOMOLAFE
ISSN | 2079-8407
On Pages | 615-626
Volume No. | 2
Issue No. | 11
Issue Date | November 01, 2011
Publishing Date | November 01, 2011
Keywords | Dirty data, Fuzzy search, Fuzzy string matching, Data quality
Abstract
Poor data quality management carries an inherent risk of capital losses and heightened exposure. Existing
efforts, such as treating data as products, capturing metadata to manage data quality, statistical techniques, source calculus
and algebra, data stewardship, and dimensional gap analysis, have all failed to incorporate the contextual factors, which are
fuzzy in nature. The conventional way of using information requires discrete values that are precise and free of
ambiguity. This is not realizable, however, because human beings employ imprecise expressions with a high level of uncertainty
or no clear boundaries to describe a situation, e.g. “I am very hungry” or “it is going to be cloudy today”. The bulk of the
challenges posed by dirty data can be seen to stem from “not missing, but wrong” data. These result from differing data across
databases, ambiguous data, the use of abbreviations or incomplete text, and non-standard data, which includes different
representations of compound data. This research employs a fuzzy model to facilitate retrieval despite this myriad of dirty data problems.
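
As a rough illustration of the kind of retrieval the abstract describes, the sketch below ranks records by string similarity so that abbreviated or misspelled entries are still found. It is not the authors' fuzzy model: it substitutes the similarity ratio from Python's standard `difflib` module, and the function names, the sample records, and the 0.6 threshold are all assumptions made for this example.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a similarity score in [0, 1], ignoring case and outer whitespace."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def fuzzy_search(query: str, records: list[str], threshold: float = 0.6):
    """Return (score, record) pairs at or above the threshold, best match first.

    The threshold is an assumed cut-off, not a value taken from the paper.
    """
    scored = [(similarity(query, record), record) for record in records]
    matches = [pair for pair in scored if pair[0] >= threshold]
    return sorted(matches, key=lambda pair: pair[0], reverse=True)


# Hypothetical records showing the dirty-data classes named in the abstract:
# an abbreviation, a misspelling, and an unrelated entry.
records = [
    "University of Ibadan",
    "Univ. of Ibadan",
    "Universty of Ibadan",
    "Lagos State Polytechnic",
]

if __name__ == "__main__":
    for score, record in fuzzy_search("University of Ibadan", records):
        print(f"{score:.2f}  {record}")
```

With these sample records, the exact spelling ranks first, the misspelled and abbreviated variants are still retrieved, and the unrelated record falls below the threshold. A crisp equality test would have returned only the first record.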