In mathematical and statistical analysis, data is a collection of information. Information, in this case, is anything that may be used to support or refute a hypothesis during an experiment.
Data collected may be age, name, a person’s opinion, type of pet, hair colour, etc. Although there is no restriction on the form this data may take, it is classified into two main categories depending on its nature: categorical and numerical data.
Categorical data, as the name implies, is grouped into one or more categories. Similarly, numerical data, as the name implies, deals with numeric variables.
Categorical data is a collection of information that is divided into groups. For example, if an organisation or agency collects the biodata of its employees, the resulting data is categorical. It is called categorical because it may be grouped according to the variables present in the biodata, such as sex and state of residence.
Categorical data can take on numerical values (such as “1” indicating Yes and “2” indicating No), but those numbers don’t have mathematical meaning. One can neither add them together nor subtract them from each other.
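As a rough illustration of why such codes carry no mathematical meaning, the sketch below uses a small set of made-up responses coded 1 for Yes and 2 for No. The response list and the names `CODE_LABELS`, `responses` and `average_code` are assumptions for illustration only.

```python
from collections import Counter

# Hypothetical survey responses, coded 1 = Yes and 2 = No.
CODE_LABELS = {1: "Yes", 2: "No"}
responses = [1, 2, 1, 1, 2]

# Counting how often each code occurs is meaningful.
counts = Counter(CODE_LABELS[code] for code in responses)
print(counts["Yes"], counts["No"])

# Averaging the codes runs without error, but the result does not
# correspond to any answer a respondent could actually give.
average_code = sum(responses) / len(responses)
print(average_code)
```

The counts describe the data, while the average falls between the two codes and matches neither answer.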
Types of Categorical Data
There are two types of categorical data: nominal and ordinal data.
Nominal data is a type of data used to name variables without providing any numerical value. The term derives from the Latin word “nomen” (meaning name), and nominal data is a subcategory of categorical data.
Nominal data is sometimes called “labelled” or “named” data. Examples of nominal data include name, hair colour, sex etc.
Mostly collected using surveys or questionnaires, this data type is descriptive, as it sometimes allows respondents the freedom to type in responses. Although this characteristic helps in arriving at better conclusions, it sometimes poses problems for researchers as they have to deal with so much irrelevant data.
Ordinal data is a data type with a set order or scale to it. However, this order does not have a standard scale on which the difference between values is measured.
Although mostly classified as categorical data, ordinal data exhibits characteristics of both categorical and numerical data, which places it in between the two. It is classified under categorical data because it shares more characteristics with it.
Some ordinal data examples include the Likert scale, interval scale, bug severity and customer satisfaction survey data. Each of these examples may have different collection and analysis techniques, but they are all ordinal data.
Categorical data consists of two categories: nominal data and ordinal data. Nominal data, also known as named data, is the type of data used to name variables, while ordinal data is a type of data with a scale or order to it.
Categorical data is qualitative. That is, it describes an event using a string of words rather than numbers.
Categorical data is summarised using the mode and the median: nominal data is analysed with the mode, while ordinal data can use both. In some cases, ordinal data may also be analysed using univariate statistics, bivariate statistics, regression applications, linear trends and classification methods.
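The difference between the two summaries can be sketched with Python's standard `statistics` module. The hair colour responses, the five-level satisfaction scale and the ratings below are made-up assumptions, not data from a real survey.

```python
from statistics import mode, median_low

# Hypothetical nominal responses: hair colour has no order,
# so the mode is the only one of the two summaries that applies.
hair_colours = ["black", "brown", "black", "blonde", "black"]
print(mode(hair_colours))

# Hypothetical ordinal responses on an assumed scale, lowest to highest.
SCALE = ["very poor", "poor", "fair", "good", "excellent"]
ratings = ["good", "fair", "excellent", "good", "poor"]

# Rank each rating by its position on the scale, then take the middle rank.
# median_low always returns an actual data point, so it maps back to a label.
ranks = [SCALE.index(rating) for rating in ratings]
median_rating = SCALE[median_low(ranks)]
print(median_rating)
```

`median_low` is used instead of `median` so that an even number of responses never produces a value halfway between two categories.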
Categorical data can also be analysed graphically using a bar chart or a pie chart. A bar chart is mostly used to show frequencies, while a pie chart shows percentages. This is done after grouping the data into a table.
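The grouping step that comes before either chart can be sketched as a small frequency table. The helper name `frequency_table` and the pet responses are hypothetical. The counts are what a bar chart would plot and the percentages are what a pie chart would show.

```python
from collections import Counter

def frequency_table(values):
    """Return (category, count, percentage) rows, most frequent first."""
    counts = Counter(values)
    total = sum(counts.values())
    return [(cat, n, round(100 * n / total, 1)) for cat, n in counts.most_common()]

# Hypothetical answers to "Which type of pet do you own?"
pets = ["dog", "cat", "dog", "fish", "dog", "cat", "dog", "cat", "dog", "fish"]

for category, count, percent in frequency_table(pets):
    print(f"{category:<6}{count:>3}{percent:>7}%")
```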
In the case of ordinal data, which has a given order or scale, the scale does not have a standardised interval. This does not apply to nominal data, which has no order at all.
Although categorical data is qualitative, it may sometimes take numerical values. However, these values do not exhibit quantitative characteristics. Arithmetic operations cannot be performed on them.
Categorical data may also be classified into binary and non-binary depending on its nature. A given question with options “Yes” or “No” is classified as binary because it has two options while adding “Maybe” to the given options will make it non-binary.
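A minimal sketch of this rule, assuming a question is described only by its list of answer options (the function name `is_binary` is hypothetical):

```python
def is_binary(options):
    """A question is binary when it offers exactly two distinct options."""
    return len(set(options)) == 2

print(is_binary(["Yes", "No"]))
print(is_binary(["Yes", "No", "Maybe"]))
```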
1. Household Income: Categorical data is mostly used by businesses when investigating the spending power of their target audience, to conclude on an affordable price for their products. For example:
What is your household income?
This is a closed-ended nominal data example.
2. Education Level: The level of education of a respondent may be requested for when filling forms for job applications, admission, training etc. This is used to assess their qualification for a specific role. Consider the example below:
What is your highest level of education?
This is also a closed-ended nominal data example.
3. Gender: Respondents are asked for their gender when filling out a biodata. This is mostly categorised as male or female, but may also be nonbinary. For example:
What is your gender?
This is a binary and closed-ended nominal data example.
What is your gender? (Others specify)
This is a non-binary, partly open-ended nominal data example: it combines closed options with a free-text “other” response.
4. Customer satisfaction: After rendering a service, businesses like to get feedback from customers so that they can improve it. For example:
Kindly rate your customer service experience with us
The above is an example of an ordinal data collection process. The responses have a specific order to them, listed in ascending order.
5. Brand of soaps: When doing competitive analysis research, a soap brand may want to study the popularity of its competitors among its target audience. In this case, we have something of this nature:
Which of the following soap brands are you familiar with?
This is a multiple-choice nominal data collection example.
6. Hair colour
This is a key categorical data example used in profiling a respondent. Although not accurate, a person’s hair colour together with some racially prominent traits may be used to predict whether the person is Black, Caucasian, Hispanic, etc. For example:
What is your hair colour?
This is a closed-ended example of nominal data.
7. Surveys or Questionnaires: Online surveys are commonly used to carry out investigations on certain topics. The data gathered is, in some cases, categorical. For example:
How many siblings do you have?
The above is an example of an open-ended nominal data collection form. The response may be quantitative but will possess qualitative properties.
8. Happiness level: This example may be used by a therapist or psychologist when examining a patient for mental illness. It is usually collected together with some important data that may affect a person’s mental health.
Rate your happiness level on a scale of 1-5.
This is an ordinal data example.
9. Motives for employees to work better: Companies who want to improve employee productivity may use this method to discover what motivates employees to work better. For example:
What motivates you to work better? (Others specify)
This is a partly open-ended nominal data collection example: it combines closed options with a free-text “other” response.
10. Motives for travelling: Travel and tourism companies ask their customers or target audience this question to inform marketing strategies.
What are your motives for travelling? (Others specify)
This is a partly open-ended nominal data collection example: it combines closed options with a free-text “other” response.
11. Interval scale: An event planning company may use an interval scale to get the demographics of attendees of a particular event. It is also used by Instagram and Facebook to give audience insights. For example:
In which of the following age brackets do you fall?
This is an example of ordinal data collection.
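One way to turn a raw age into an ordered bracket is sketched below. The bracket boundaries and labels are assumptions chosen for illustration, since the survey's actual brackets are not given, and `age_bracket` is a hypothetical helper.

```python
from bisect import bisect_right

# Hypothetical bracket boundaries; the survey's actual brackets are not given.
LOWER_BOUNDS = [0, 18, 25, 35, 50]
LABELS = ["under 18", "18-24", "25-34", "35-49", "50 and over"]

def age_bracket(age):
    """Map a non-negative age in years to its bracket label."""
    if age < 0:
        raise ValueError("age must be non-negative")
    # bisect_right finds how many lower bounds are <= age;
    # subtracting 1 gives the index of the bracket the age falls in.
    return LABELS[bisect_right(LOWER_BOUNDS, age) - 1]

for age in (17, 18, 34, 62):
    print(age, age_bracket(age))
```

Because the labels are listed from youngest to oldest, the resulting variable keeps an order even though the exact ages are no longer recorded.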
12. Checking account location: Some timesheet calculator tools collect real-time employee location so that employers can know which employees are at work and which are not. Location data is also used in several other cases. For example:
When a user gives Instagram access to his/her location, it uses this data to give insights using a bar chart, e.g. 50% are from Texas, 30% from another state and 20% from Colorado.
13. Bug severity: When software companies perform quality assurance testing to discover bugs in the software, the bugs are treated according to their severity level.
When a bug bounty hunter submits a bug to a company, it is given a severity level like critical, medium or low. This is an example of ordinal data.
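Because severity is ordinal, bugs can be sorted by it even though the levels are words rather than numbers. The sketch below assumes the three levels named above and a made-up list of bug reports with hypothetical identifiers.

```python
# Severity levels from the example above, ordered from most to least severe.
SEVERITY_ORDER = ["critical", "medium", "low"]
RANK = {level: position for position, level in enumerate(SEVERITY_ORDER)}

# Hypothetical bug reports as (identifier, severity) pairs.
bugs = [
    ("BUG-3", "low"),
    ("BUG-1", "critical"),
    ("BUG-2", "medium"),
    ("BUG-4", "critical"),
]

# Sort by position on the severity scale, not alphabetically by label.
triaged = sorted(bugs, key=lambda bug: RANK[bug[1]])
for identifier, severity in triaged:
    print(identifier, severity)
```

Sorting the labels alphabetically would put "low" ahead of "medium", which is why the explicit ranking is needed.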
14. Likert scale: A Likert scale is a point scale used by researchers to take surveys and get people’s opinions on a subject matter. Consider this example:
How will you rate the dessert served tonight?
This is a 5-point Likert scale, a common example of ordinal data.
15. Proficiency level: Employers measure a job applicant’s proficiency level in the skills required to perform well in the job. This helps in choosing the best applicant for the job. For example:
What is your proficiency level in Excel?
This is a simple example of ordinal data.
A categorical variable is a variable type with two or more categories. Sometimes called a discrete variable, it is mainly classified into two (nominal and ordinal).
For example, if a restaurant is trying to collect data on the amount of pizza ordered in a day according to type, we regard this as categorical data. When gathering the data, the restaurant will group the number of orders according to the type of pizza (e.g. pepperoni, chicken etc.) ordered.
In this case, the type of pizza ordered is the categorical variable. Categorical variables are divided into two types: ordinal variables and nominal variables.
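The restaurant's grouping step might look like the following sketch, where the order log and its quantities are made up for illustration:

```python
from collections import defaultdict

# Hypothetical order log for one day: (pizza type, quantity ordered).
orders = [
    ("pepperoni", 2),
    ("chicken", 1),
    ("pepperoni", 1),
    ("chicken", 3),
    ("pepperoni", 4),
]

# Group by the categorical variable (pizza type) and total the quantities.
totals = defaultdict(int)
for pizza_type, quantity in orders:
    totals[pizza_type] += quantity

for pizza_type, total in totals.items():
    print(pizza_type, total)
```

Here the pizza type is the categorical variable used for grouping, while the quantity is a numerical value that can be added up within each group.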
A nominal variable has no intrinsic ordering to its categories. For example, marital status is a categorical variable with two categories (single and married) that have no intrinsic ordering.
There are two main categories of nominal data variables: matched and unmatched categories. Below are the tests carried out on each category:
Matched Category in Nominal Data Variables
Unmatched Category in Nominal Data Variables
An ordinal variable has an intrinsic ordering to its categories. For example, when studying the severity of bugs in software, severity is a categorical variable with ordered categories: critical, medium and low.
There are two main categories of ordinal data variables: matched and unmatched categories. Below are the tests carried out on each category:
Matched Category in Ordinal Data Variables
Unmatched Category in Ordinal Data Variables
When people apply for jobs, employers collect both nominal and ordinal data. This includes the job seeker’s biodata and a combination of relevant skills and experience. Employers do this to determine the best candidate for the job.
When placing an order for a product or service on an e-commerce website, one is required to input some details which are regarded as categorical data. The data collected in this case is nominal.
Users of online dating platforms are usually required to input a set of categorical data to match them with the right person. This data may include personal information and partner preferences.
Organisations or companies collect customer feedback after selling a product or service. The feedback shows how the customer feels about the company’s service and is used to improve the overall customer experience.
Categorical data is used to gather information from both online and offline surveys or questionnaires as the case may be. The type of categorical data used may differ depending on the aim of data collection.
A personality test is commonly used to investigate the personality traits a respondent possesses. Companies use it to check whether a respondent’s personality traits are compatible with the company’s work culture.
Categorical data may easily be collected through various collection techniques using Formplus form builder. This online form builder provides effective categorical data gathering and management.
Formplus not only provides easy data collection through its customisable form features but also creates data analytics that help drive easy and proper decision-making. It also contains useful statistical data analysis features, which makes it well suited to collecting categorical data.
Categorical and Numerical data are the main types of data. These data types may have the same number of subcategories, with two each, but they have many differences. These differences give them unique attributes which are equally useful in statistical analysis.
Numerical data are quantitative data types. For example, weight, temperature, height, GPA, annual income, etc. are classified under numerical or quantitative data.
In comparison, categorical data are qualitative data types. Some examples include: name, hair colour, qualification etc.
Unlike categorical data, which deals with groups and categories, continuous data focuses on numerical values. Continuous data are numerical variables that can take an infinite number of values. The value could be a number, date or time: for example, the date payment is received for a transaction.
Another difference is that categorical data, such as gender or hair colour, might not have a logical order, while continuous data, such as the duration of a video, does.
As the examples above show, many different kinds of categorical data can help clarify the meaning and purpose of qualitative data. When working in data management, it’s crucial to clearly understand the main terms, including quantitative and categorical data, and the role each plays.
The distinction between categorical and quantitative variables is crucial for deciding which types of data analysis methods to use. The first step towards selecting the right data analysis method is understanding categorical data.
Quantitative data are analysed using descriptive statistics, time series, linear regression models, and much more. For categorical data, typically only graphical and descriptive methods are used.