Data Processing :-
Data processing is the process of converting raw facts or data into meaningful information. Everyone is familiar with the term "word processing," but computers were really developed for "data processing": the organization and manipulation of large amounts of numeric data, or in computer jargon, "number crunching." Some examples of data processing are the calculation of satellite orbits, weather forecasting, statistical analyses, and, in a more practical sense, business applications such as accounting, payroll, and billing.
Since the beginning of time people have sought ways to help in the computing, handling, merging, and sorting of numeric data. Think of all the labour that Bob Cratchit performed when keeping track of Ebenezer Scrooge's figures and accounts. Certainly Cratchit wished for an easier approach and undoubtedly Mr. Scrooge longed for a more accurate method to keep track of his accounts.
Stages of Data Processing :-
There are six main stages in the data processing cycle :
1. Data Collection
2. Data Preparation
3. Data Input
4. Data Processing
5. Data Output / Interpretation
6. Data Storage
Figure : Cycle of stages in data processing
1. Data Collection :-
i. The collection of raw data is the first step of the data processing cycle.
ii. The type of raw data collected has a huge impact on the output produced.
iii. It is important that the raw data sources are trustworthy and well-built so that the data collected is of the highest possible quality.
iv. Raw data can include monetary figures, website cookies, profit/loss statements of a company, user behaviour, etc.
2. Data Preparation :-
i. Data preparation or data cleaning is the process of sorting and filtering the raw data to remove unnecessary and inaccurate data.
ii. Raw data is checked for errors and then transformed into a suitable form for further analysis and processing.
iii. This is done to ensure that only the highest quality data is fed into the processing unit.
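As an illustration of the preparation step, the sketch below filters and sorts a small batch of raw records. The record layout (customer, amount) and the rule that an amount must not be negative are assumptions made for this example, not part of any particular system.

```python
# Hypothetical raw records; the field names are assumptions for illustration.
raw_records = [
    {"customer": "Asha", "amount": 120.0},
    {"customer": "Ben", "amount": -5.0},    # inaccurate: negative amount
    {"customer": "", "amount": 40.0},       # unnecessary: no customer name
    {"customer": "Asha", "amount": 120.0},  # duplicate of the first record
    {"customer": "Chen", "amount": 75.5},
]


def prepare(records):
    """Drop invalid and duplicate records, then sort by customer name."""
    seen = set()
    prepared = []
    for record in records:
        key = (record["customer"], record["amount"])
        is_valid = bool(record["customer"]) and record["amount"] >= 0
        if is_valid and key not in seen:
            seen.add(key)
            prepared.append(record)
    return sorted(prepared, key=lambda r: r["customer"])


prepared_records = prepare(raw_records)
for record in prepared_records:
    print(record)
```

Only the records that pass both checks reach the next stage. In practice the validity rules would come from the data's own domain.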
3. Data Input :-
i. In this step, the raw data is converted into machine-readable form and fed into the processing unit.
ii. Data input is the first stage in which raw data begins to take the form of usable information.
iii. This can be in the form of data entry through a keyboard, scanner or any other input source.
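One way to picture the input step is reading raw text and turning it into typed records that a program can work with. The sketch below uses Python's standard csv module. The column names (item, quantity, unit_price) are hypothetical and chosen only for this example.

```python
import csv
import io

# Hypothetical raw input, as it might arrive from a keyboard or a scanned form.
raw_text = """item,quantity,unit_price
pen,10,1.50
notebook,3,4.25
"""


def read_records(text):
    """Parse CSV text into dictionaries with typed numeric fields."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        records.append({
            "item": row["item"],
            "quantity": int(row["quantity"]),
            "unit_price": float(row["unit_price"]),
        })
    return records


records = read_records(raw_text)
print(records)
```

After this step the quantities and prices are numbers rather than strings, so later stages can calculate with them directly.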
4. Data Processing :-
i. During this stage, the data entered in the previous stage is processed for interpretation.
ii. The data is subjected to various data processing methods, which may include machine learning and artificial intelligence algorithms, to generate the desired output.
iii. The process itself may vary slightly depending on the source of data being processed and its intended use.
6. Data Storage :-
Forms of Data Processing :-
1. Data Cleaning :-
Data cleaning is the process of removing noisy data, filling in missing values, and correcting inconsistencies in the data.
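A minimal sketch of two of these cleaning tasks follows: filling a missing value and correcting inconsistent labels. The sample rows, and the choice to fill a missing age with the mean of the known ages, are assumptions for illustration. Other fill strategies, such as the median or a fixed default, are equally possible.

```python
from statistics import mean

# Hypothetical survey rows: None marks a missing age, and the city labels
# are spelled inconsistently.
rows = [
    {"age": 30, "city": "Delhi"},
    {"age": None, "city": "delhi "},
    {"age": 40, "city": "DELHI"},
    {"age": 50, "city": "Mumbai"},
]


def clean(rows):
    """Fill missing ages with the mean age and standardise city labels."""
    known_ages = [row["age"] for row in rows if row["age"] is not None]
    fill_value = mean(known_ages)
    cleaned = []
    for row in rows:
        age = row["age"] if row["age"] is not None else fill_value
        city = row["city"].strip().title()
        cleaned.append({"age": age, "city": city})
    return cleaned


cleaned_rows = clean(rows)
print(cleaned_rows)
```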
2. Data Integration :-
Data integration is a technique that combines the data from multiple heterogeneous data sources into a coherent data store. Data integration may involve inconsistent data and therefore needs data cleaning.
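To make the idea concrete, the sketch below combines two small sources that identify the same customers in different ways. Both sources, their field names, and the rule for matching "001" to 1 are hypothetical. Real integration work usually needs more careful matching.

```python
# Two hypothetical sources that describe the same customers differently.
crm_source = [
    {"customer_id": 1, "name": "Asha Rao"},
    {"customer_id": 2, "name": "Ben Ortiz"},
]
billing_source = [
    {"cust": "001", "total_due": "250.00"},
    {"cust": "002", "total_due": "99.50"},
]


def integrate(crm, billing):
    """Combine both sources into one store keyed by integer customer id."""
    store = {
        row["customer_id"]: {"name": row["name"], "total_due": None}
        for row in crm
    }
    for row in billing:
        customer_id = int(row["cust"])  # reconcile "001" with 1
        entry = store.setdefault(customer_id, {"name": None, "total_due": None})
        entry["total_due"] = float(row["total_due"])
    return store


customer_store = integrate(crm_source, billing_source)
print(customer_store)
```

Converting the billing identifiers and amounts to a common type is itself a small cleaning step, which is why integration and cleaning tend to go together.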
3. Data Transformation :-
a). Smoothing :-
b). Aggregation :-
Aggregation is a process where summary or aggregation operations are applied to the data. For example, daily sales figures may be aggregated into monthly or annual totals.
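A short sketch of one aggregation operation, summing daily amounts into monthly totals, is given here. The sales figures and the date format are made up for the example.

```python
from collections import defaultdict

# Hypothetical daily sales as (ISO date, amount) pairs.
daily_sales = [
    ("2024-01-05", 100.0),
    ("2024-01-20", 50.0),
    ("2024-02-03", 75.0),
    ("2024-02-14", 25.0),
]


def monthly_totals(sales):
    """Sum daily amounts into one total per 'YYYY-MM' month."""
    totals = defaultdict(float)
    for date, amount in sales:
        month = date[:7]  # "2024-01-05" -> "2024-01"
        totals[month] += amount
    return dict(totals)


totals = monthly_totals(daily_sales)
print(totals)
```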
c). Generalization :-
In generalization, low-level data are replaced with higher-level concepts by climbing concept hierarchies. For example, a city can be generalized to its state or country.
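Climbing a concept hierarchy can be sketched with a simple parent lookup. The hierarchy below (city to state to country) is a hypothetical example, and the function name generalize is chosen only for this illustration.

```python
# Hypothetical concept hierarchy: each low-level value maps to its parent.
hierarchy = {
    "Pune": "Maharashtra",
    "Mumbai": "Maharashtra",
    "Jaipur": "Rajasthan",
    "Maharashtra": "India",
    "Rajasthan": "India",
}


def generalize(value, levels=1):
    """Climb the hierarchy by the given number of levels, stopping at the top."""
    for _ in range(levels):
        if value not in hierarchy:
            break
        value = hierarchy[value]
    return value


cities = ["Pune", "Jaipur", "Mumbai"]
states = [generalize(city) for city in cities]
print(states)
```

Calling generalize with levels=2 would climb one step further, from city to country.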
d). Normalization :-
Normalization scales attribute data so that it falls within a small specified range, such as 0.0 to 1.0. Two common types are :
i. Min-max normalization : Rescales each value linearly using the attribute's minimum and maximum, so that the data falls between 0 and 1.
ii. z-score normalization : Transforms the data by converting the values to a common scale with a mean of zero and a standard deviation of one.
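Both types can be written down in a few lines. Min-max normalization computes (v - min) / (max - min) for each value v, and z-score normalization computes (v - mean) / standard deviation. The sketch below is one possible implementation. It assumes the population standard deviation is wanted and that a constant attribute should map to 0.0, and the sample values are invented for the example.

```python
from statistics import mean, pstdev


def min_max_normalize(values):
    """Scale values linearly so the minimum maps to 0.0 and the maximum to 1.0."""
    low, high = min(values), max(values)
    if high == low:
        # All values are identical; map them to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def z_score_normalize(values):
    """Shift and scale values to mean 0 and (population) standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # No spread in the data; every value sits exactly at the mean.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


incomes = [20.0, 30.0, 40.0, 50.0, 60.0]
print(min_max_normalize(incomes))
print(z_score_normalize(incomes))
```

Min-max normalization keeps every value inside a fixed range but is sensitive to extreme values, whereas z-score results are not bounded and are less affected by a single outlier.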
e). Attribute/feature construction :-
New attributes are constructed from the given ones, for example an area attribute derived from height and width.
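A small sketch of attribute construction follows, building a new area attribute from two existing ones. The records and the attribute names are hypothetical.

```python
# Hypothetical records with two given attributes.
rectangles = [
    {"height": 2.0, "width": 3.0},
    {"height": 4.0, "width": 0.5},
]


def add_area(records):
    """Return copies of the records with a constructed 'area' attribute."""
    return [{**r, "area": r["height"] * r["width"]} for r in records]


with_area = add_area(rectangles)
print(with_area)
```

The original records are left unchanged, so the constructed attribute can be dropped again if it turns out not to be useful.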