CDPSM: A New Optimized Progressive Big Data Analytics For Partial Cancer Data using Amazon EMR
Identifying symptoms and treating cancer requires thorough investigation and research, which in turn requires analysis of the available cancer data, whether partial or full, at multiple levels. Cancer data is spread across multiple data sources and data warehouses that are decentralized and located in different places. Therefore, only partial data is available. Progressive analytics provides an efficient way to query data from various data clusters, where each cluster contains only a piece of the examined data. We propose an effective framework for performing analytics over the available cancer data, the Cancer Data Progressive Sampling Model (CDPSM), built for partially available cancer data and deployed on Amazon EMR. Through a large number of experiments, we reveal the advantages of the proposed model and give numerical results comparing it with a deterministic model. These results indicate that the proposed model can efficiently reduce the time needed to perform progressive data analytics over partial cancer data while maintaining the quality of the result at a high level.
Keywords
Big Data, Progressive Sampling.