
Minimization of Datasets: Using a Master Interlinked Dataset


Affiliations
1 Student, Bharati Vidyapeeth Deemed University College of Engineering, Pune - 411043, India
2 Assistant Professor, Computer Engineering, Bharati Vidyapeeth Deemed University College of Engineering, Pune - 411043, India



A large number of datasets exist, and each dataset corresponds to the contents of a single statistical database. Datasets have several properties, based on statistical measures, that depend on the number and type of their attributes or variables. Here, the focus is mainly on statistics, i.e., the sampling of data based on observation and analysis. Each data item of a dataset is sampled quantitatively using binary encoding. Sampling a dataset using a predictor can often result in error. However, these errors can show a trend that may be related to one or more datasets. This can differentiate every variable of one dataset from those of the remaining datasets. All of these datasets can then be unified into a single master dataset based on user requirements.
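As an illustration only, the two ideas named in the abstract (binary encoding of data items and unification into a master dataset) can be sketched as follows. The abstract does not specify the encoding scheme or the linking rule, so everything here is an assumption: the function names `binary_encode` and `build_master`, the `source` tag, and the sample records are hypothetical and are not taken from the paper.

```python
def build_master(datasets):
    """Unify named datasets into one list, tagging each row with its origin.

    Assumption: "interlinking" is modeled simply as a `source` field that
    records which dataset a row came from.
    """
    master = []
    for name, records in datasets.items():
        for record in records:
            master.append({**record, "source": name})
    return master


def binary_encode(records, attribute):
    """Assign each distinct value of `attribute` a fixed-width binary code.

    Assumption: values are ordered alphabetically and numbered from zero;
    the paper may use a different scheme.
    """
    values = sorted({record[attribute] for record in records})
    # Number of bits needed to index every distinct value (at least one bit).
    width = max(1, (len(values) - 1).bit_length())
    return {value: format(index, f"0{width}b") for index, value in enumerate(values)}


# Hypothetical sample data: two small datasets sharing one attribute.
weather = [{"sky": "sunny"}, {"sky": "rain"}]
traffic = [{"sky": "rain"}, {"sky": "cloudy"}]

master = build_master({"weather": weather, "traffic": traffic})
codes = binary_encode(master, "sky")
encoded = [{**row, "sky": codes[row["sky"]]} for row in master]
```

Because the codes are computed over the master dataset rather than per source, the same value receives the same code regardless of which dataset it came from, which is one simple way to keep the unified rows comparable.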

Keywords

Controlled Datasets, Dataset Binary Encoding, Machine Learning, Master Datasets, Progressive Sampling of Data

No Classification

Publishing Chronology: Manuscript received July 28, 2018; revised August 12, 2018; accepted August 14, 2018. Date of publication: September 6, 2018.


Authors

Syed Ausaf Haider
Student, Bharati Vidyapeeth Deemed University College of Engineering, Pune - 411043, India
N. S. Patil
Assistant Professor, Computer Engineering, Bharati Vidyapeeth Deemed University College of Engineering, Pune - 411043, India





DOI: https://doi.org/10.17010/ijcs%2F2018%2Fv3%2Fi5%2F138778