
14 Quantitative analysis: Descriptive statistics

Numeric data collected in a research project can be analysed quantitatively using statistical tools in two different ways. Descriptive analysis refers to statistically describing, aggregating, and presenting the constructs of interest or associations between these constructs. Inferential analysis refers to the statistical testing of hypotheses (theory testing). In this chapter, we will examine statistical techniques used for descriptive analysis, and the next chapter will examine statistical techniques for inferential analysis. Much of today’s quantitative data analysis is conducted using software programs such as SPSS or SAS. Readers are advised to familiarise themselves with one of these programs for understanding the concepts described in this chapter.

Data preparation

In research projects, data may be collected from a variety of sources: postal surveys, interviews, pretest or posttest experimental data, observational data, and so forth. These data must be converted into a machine-readable, numeric format, such as in a spreadsheet or a text file, so that they can be analysed by computer programs like SPSS or SAS. Data preparation usually involves the following steps:

Data coding. Coding is the process of converting data into numeric format. A codebook should be created to guide the coding process. A codebook is a comprehensive document containing a detailed description of each variable in a research study, items or measures for that variable, the format of each item (numeric, text, etc.), the response scale for each item (i.e., whether it is measured on a nominal, ordinal, interval, or ratio scale, and whether this scale is a five-point, seven-point scale, etc.), and how to code each value into a numeric format. For instance, if we have a measurement item on a seven-point Likert scale with anchors ranging from ‘strongly disagree’ to ‘strongly agree’, we may code that item as 1 for strongly disagree, 4 for neutral, and 7 for strongly agree, with the intermediate anchors in between. Nominal data such as industry type can be coded in numeric form using a coding scheme such as: 1 for manufacturing, 2 for retailing, 3 for financial, 4 for healthcare, and so forth (of course, nominal data cannot be analysed statistically). Ratio scale data such as age, income, or test scores can be coded as entered by the respondent. Sometimes, data may need to be aggregated into a different form than the format used for data collection. For instance, if a survey measuring a construct such as ‘benefits of computers’ provided respondents with a checklist of benefits that they could select from, and respondents were encouraged to choose as many of those benefits as they wanted, then the total number of checked items could be used as an aggregate measure of benefits. Note that many other forms of data—such as interview transcripts—cannot be converted into a numeric format for statistical analysis. Codebooks are especially important for large complex studies involving many variables and measurement items, where the coding process is conducted by different people, to help the coding team code data in a consistent manner, and also to help others understand and interpret the coded data.
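The coding scheme itself can be written down directly in code. Below is a minimal sketch in Python: the Likert anchors and industry categories mirror the examples above, but the dictionary names and raw responses are hypothetical illustrations, not data from the chapter.

```python
# Minimal coding sketch. The names and raw responses below are hypothetical.

# Codebook entry: seven-point Likert anchors -> numeric codes (1-7)
likert_codes = {
    "strongly disagree": 1, "disagree": 2, "somewhat disagree": 3,
    "neutral": 4, "somewhat agree": 5, "agree": 6, "strongly agree": 7,
}

# Codebook entry: nominal industry type -> arbitrary numeric labels
industry_codes = {"manufacturing": 1, "retailing": 2,
                  "financial": 3, "healthcare": 4}

raw = ["agree", "strongly disagree", "neutral"]
coded = [likert_codes[r] for r in raw]
print(coded)  # [6, 1, 4]
```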

Data entry. Coded data can be entered into a spreadsheet, database, text file, or directly into a statistical program like SPSS. Most statistical programs provide a data editor for entering data. However, these programs store data in their own native format—e.g., SPSS stores data as .sav files—which makes it difficult to share that data with other statistical programs. Hence, it is often better to enter data into a spreadsheet or database where it can be reorganised as needed, shared across programs, and subsets of data can be extracted for analysis. Smaller data sets with fewer than 65,000 observations and 256 items can be stored in a spreadsheet created using a program such as Microsoft Excel, while larger datasets with millions of observations will require a database. Each observation can be entered as one row in the spreadsheet, and each measurement item can be represented as one column. Data should be checked for accuracy during and after entry via occasional spot checks on a set of items or observations. Furthermore, while entering data, the coder should watch out for obvious evidence of bad data, such as a respondent selecting the ‘strongly agree’ response to all items irrespective of content, including reverse-coded items. If so, such data can be entered but should be excluded from subsequent analysis.
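One way to automate that spot check is to flag respondents whose answers show zero variance across all items, since someone who selects the same response throughout (including on reverse-coded items) answers identically everywhere. A minimal pandas sketch, with hypothetical item columns q1–q5:

```python
import pandas as pd

# Hypothetical data: each row is a respondent, each column a Likert item.
df = pd.DataFrame({
    "q1": [5, 7, 2], "q2": [4, 7, 3], "q3": [5, 7, 2],
    "q4": [6, 7, 3], "q5": [5, 7, 2],
})
items = ["q1", "q2", "q3", "q4", "q5"]

# Zero variance across items means the same answer was given to every
# question: a candidate for exclusion from subsequent analysis.
straight_liners = df[df[items].var(axis=1) == 0]
print(straight_liners.index.tolist())  # [1]: respondent 1 answered 7 throughout
```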


Data transformation. Sometimes, it is necessary to transform data values before they can be meaningfully interpreted. For instance, reverse-coded items—items that convey a meaning opposite to that of their underlying construct—should be reversed (e.g., on a 1-7 scale, 8 minus the observed value reverses the value) before they can be compared or combined with items that are not reverse-coded. Other kinds of transformations may include creating scale measures by adding individual scale items, creating a weighted index from a set of observed measures, and collapsing multiple values into fewer categories (e.g., collapsing incomes into income ranges).
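The reverse-coding rule and a simple summated scale can be sketched as follows; the item names are hypothetical, and the 8-minus rule assumes a 1–7 scale as in the text:

```python
import pandas as pd

# q2_rev is a hypothetical reverse-coded item on a 1-7 scale.
df = pd.DataFrame({"q1": [2, 5, 7], "q2_rev": [6, 3, 1]})

# Reverse the item: on a 1-7 scale, 8 minus the observed value
# maps 1 -> 7, 4 -> 4, and 7 -> 1.
df["q2"] = 8 - df["q2_rev"]

# Create a scale measure by adding the (now consistently keyed) items.
df["scale"] = df[["q1", "q2"]].sum(axis=1)
print(df)
```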

Univariate analysis

Univariate analysis—or analysis of a single variable—refers to a set of statistical techniques that can describe the general properties of one variable. Univariate statistics include: frequency distribution, central tendency, and dispersion. The frequency distribution of a variable is a summary of the frequency—or percentages—of individual values or ranges of values for that variable. For instance, we can measure how many times a sample of respondents attend religious services—as a gauge of their ‘religiosity’—using a categorical scale: never, once per year, several times per year, about once a month, several times per month, several times per week, and an optional category for ‘did not answer’. If we count the number or percentage of observations within each category—except ‘did not answer’ which is really a missing value rather than a category—and display it in the form of a table, as shown in Figure 14.1, what we have is a frequency distribution. This distribution can also be depicted in the form of a bar chart, as shown on the right panel of Figure 14.1, with the horizontal axis representing each category of that variable and the vertical axis representing the frequency or percentage of observations within each category.

Figure 14.1. Frequency distribution of religiosity
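A frequency distribution such as the one in Figure 14.1 amounts to counting observations per category. A minimal sketch with hypothetical religiosity responses, treating ‘did not answer’ as a missing value that is excluded from the counts:

```python
import pandas as pd

# Hypothetical responses on the categorical religiosity scale; None marks
# 'did not answer', which is a missing value rather than a category.
responses = pd.Series([
    "never", "once per year", "never", "about once a month",
    "several times per year", "never", "several times per week", None,
])

counts = responses.value_counts()                      # missing values excluded
percent = responses.value_counts(normalize=True) * 100
print(pd.DataFrame({"frequency": counts, "percent": percent.round(1)}))
```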

With very large samples, where observations are independent and random, the frequency distribution tends to follow a plot that looks like a bell-shaped curve—a smoothed bar chart of the frequency distribution—similar to that shown in Figure 14.2. Here most observations are clustered toward the centre of the range of values, with fewer and fewer observations clustered toward the extreme ends of the range. Such a curve is called a normal distribution.

Figure 14.2. Normal distribution

Central tendency is an estimate of the centre of a distribution of values, and the most common estimate is the arithmetic mean, which is the simple average of all values in a distribution. Consider a set of eight test scores: 15, 20, 21, 20, 36, 15, 25, 15. The mean of these scores is

\[ (15 + 20 + 21 + 20 + 36 + 15 + 25 + 15)/8 = 20.875 \]

A second measure of central tendency is the median, the middle value when the values are arranged in ascending order. For the eight sorted scores 15, 15, 15, 20, 20, 21, 25, 36, the median is the average of the two middle values: \( (20 + 20)/2 = 20 \).

Lastly, the mode is the most frequently occurring value in a distribution of values. In the previous example, the most frequently occurring value is 15, which is the mode of the above set of test scores. Note that any value estimated from a sample (such as the mean, median, mode, or any of the estimates discussed later) is called a statistic.

Dispersion refers to the way values are spread around the central tendency. The simplest measure of dispersion is the range, the difference between the highest and lowest values in a distribution. For the test scores above, the range is

\[ 36 - 15 = 21 \]
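These central tendency and dispersion estimates can be verified with Python's standard statistics module, using the same eight test scores:

```python
import statistics

scores = [15, 20, 21, 20, 36, 15, 25, 15]

print(statistics.mean(scores))    # 20.875: the arithmetic mean
print(statistics.median(scores))  # 20.0: average of the two middle sorted values
print(statistics.mode(scores))    # 15: the most frequently occurring value
print(max(scores) - min(scores))  # 21: the range (highest minus lowest)
```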

Bivariate analysis

Bivariate analysis examines how two variables are related to one another. The most common bivariate statistic is the bivariate correlation—often simply called ‘correlation’—which is a number between -1 and +1 denoting the strength of the relationship between two variables. Say that we wish to study how age is related to self-esteem in a sample of 20 respondents—i.e., as age increases, does self-esteem increase, decrease, or remain unchanged? If self-esteem increases, then we have a positive correlation between the two variables; if self-esteem decreases, then we have a negative correlation; and if it remains the same, we have a zero correlation. To calculate the value of this correlation, consider the hypothetical dataset shown in Table 14.1.


After computing bivariate correlation, researchers are often interested in knowing whether the correlation is significant (i.e., a real one) or caused by mere chance. Answering such a question would require testing the following hypothesis:

\[H_0:\quad r = 0 \]
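As a sketch of how this test is carried out in practice, scipy's pearsonr returns both the correlation coefficient and the p-value for the test of H0: r = 0. The age and self-esteem values below are hypothetical, not the Table 14.1 data:

```python
from scipy import stats

# Hypothetical paired observations of age and a self-esteem score.
age = [21, 25, 30, 34, 40, 45, 52, 58, 63, 70]
self_esteem = [3.1, 3.4, 3.3, 3.8, 4.0, 3.9, 4.2, 4.1, 4.5, 4.4]

r, p = stats.pearsonr(age, self_esteem)
print(f"r = {r:.2f}, p = {p:.4f}")

# If p falls below the chosen significance level (commonly 0.05), we
# reject H0: r = 0 and conclude the correlation is unlikely due to chance.
```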

Social Science Research: Principles, Methods and Practices (Revised edition) Copyright © 2019 by Anol Bhattacherjee is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.


Descriptive Research: Definition, Characteristics, Methods + Examples


Suppose an apparel brand wants to understand the fashion purchasing trends among New York’s buyers, then it must conduct a demographic survey of the specific region, gather population data, and then conduct descriptive research on this demographic segment.

The study will then uncover details on “what is the purchasing pattern of New York buyers,” but will not cover any investigative information about “why” the patterns exist. For the apparel brand trying to break into this market, understanding the nature of its market is the study’s main goal. Let’s talk about it.

What is descriptive research?

Descriptive research is a research method describing the characteristics of the population or phenomenon studied. This descriptive methodology focuses more on the “what” of the research subject than the “why” of the research subject.

The method primarily focuses on describing the nature of a demographic segment without focusing on “why” a particular phenomenon occurs. In other words, it “describes” the research subject without covering “why” it happens.

Characteristics of descriptive research

The term descriptive research refers to the research questions, the design of the study, and the data analysis conducted on the topic. We call it an observational research method because none of the research study variables are influenced in any capacity.

Some distinctive characteristics of descriptive research are:

  • Quantitative research: It is a quantitative research method that attempts to collect quantifiable information for statistical analysis of the population sample. It is a popular market research tool that allows us to collect and describe the demographic segment’s nature.
  • Uncontrolled variables: None of the variables are influenced in any way; the research relies on observational methods, so the nature of the variables and their behavior are not in the researcher’s hands.
  • Cross-sectional studies: It is generally a cross-sectional study where different sections belonging to the same group are studied.
  • The basis for further research: Researchers further research the data collected and analyzed from descriptive research using different research techniques. The data can also help point towards the types of research methods used for the subsequent research.

Applications of descriptive research with examples

A descriptive research method can be used in multiple ways and for various reasons. Before getting into any survey, though, the survey goals and survey design are crucial. Even after following these steps, there is no guarantee the research outcome will be met. How is descriptive research used? Below are some ways organizations use descriptive research today:

  • Define respondent characteristics: The aim of using close-ended questions is to draw concrete conclusions about the respondents. This could be the need to derive patterns, traits, and behaviors of the respondents, or to understand a respondent’s attitude or opinion about the phenomenon; for example, understanding how many hours per week millennials spend browsing the internet. All this information helps the organization conducting the research make informed business decisions.
  • Measure data trends: Researchers measure data trends over time with a descriptive research design’s statistical capabilities. Consider an apparel company that researches different demographics, such as age groups 24-35 and 36-45, for a new autumn-wear launch. If one of those groups doesn’t take well to the new launch, it provides insight into which clothes are liked and which are not, and the brand drops the clothes that customers don’t like.
  • Conduct comparisons: Organizations also use a descriptive research design to understand how different groups respond to a specific product or service. For example, an apparel brand creates a survey asking general questions that measure the brand’s image. The same study also asks demographic questions like age, income, gender, geographical location, geographic segmentation, etc. This consumer research helps the organization understand what aspects of the brand appeal to the population and what aspects do not. It also helps make product or marketing fixes, or even create a new product line to cater to high-growth potential groups.
  • Validate existing conditions: Researchers widely use descriptive research to help ascertain the research object’s prevailing conditions and underlying patterns. Thanks to the non-invasive research method and the use of quantitative observation, along with some aspects of qualitative observation, researchers observe each variable and conduct an in-depth analysis. Researchers also use it to validate any existing conditions that may be prevalent in a population.
  • Conduct research at different times: The analysis can be conducted at different periods to ascertain any similarities or differences. This also allows any number of variables to be evaluated. For verification, studies on prevailing conditions can also be repeated to draw trends.

Advantages of descriptive research

Some of the significant advantages of descriptive research are:


  • Data collection: A researcher can conduct descriptive research using specific methods like the observational method, case study method, and survey method. Among these three, all primary data collection methods are covered, which provides a lot of information. This can be used for future research or even for developing a hypothesis for your research object.
  • Varied: Since the data collected is qualitative and quantitative, it gives a holistic understanding of a research topic. The information is varied, diverse, and thorough.
  • Natural environment: Descriptive research allows for the research to be conducted in the respondent’s natural environment, which ensures that high-quality and honest data is collected.
  • Quick to perform and cheap: As the sample size is generally large in descriptive research, the data collection is quick to conduct and is inexpensive.

Descriptive research methods

There are three distinctive methods to conduct descriptive research. They are:

Observational method

The observational method is the most effective method to conduct this research, and researchers make use of both quantitative and qualitative observations.

A quantitative observation is the objective collection of data primarily focused on numbers and values. It suggests “associated with, of or depicted in terms of a quantity.” Results of quantitative observation are derived using statistical and numerical analysis methods. It implies observation of any entity associated with a numeric value such as age, shape, weight, volume, scale, etc. For example, the researcher can track if current customers will refer the brand using a simple Net Promoter Score question.

Qualitative observation doesn’t involve measurements or numbers but instead just monitoring characteristics. In this case, the researcher observes the respondents from a distance. Since the respondents are in a comfortable environment, the characteristics observed are natural and effective. In a descriptive research design, the researcher can choose to be either a complete observer, an observer as a participant, a participant as an observer, or a full participant. For example, in a supermarket, a researcher can from afar monitor and track the customers’ selection and purchasing trends. This offers a more in-depth insight into the purchasing experience of the customer.

Case study method

Case studies involve in-depth research and study of individuals or groups. Case studies lead to a hypothesis and widen a further scope of studying a phenomenon. However, case studies should not be used to determine cause and effect as they can’t make accurate predictions because there could be a bias on the researcher’s part. The other reason why case studies are not a reliable way of conducting descriptive research is that there could be an atypical respondent in the survey. Describing them leads to weak generalizations and moving away from external validity.

Survey research

In survey research, respondents answer through surveys, questionnaires, or polls. These are a popular market research tool to collect feedback from respondents. A study gathering useful data should have the right survey questions: a balanced mix of open-ended and closed-ended questions. The survey method can be conducted online or offline, making it the go-to option for descriptive research where the sample size is enormous.

Examples of descriptive research

Some examples of descriptive research are:

  • A specialty food group launching a new range of barbecue rubs would like to understand which rub flavors are favored by different people. To understand the preferred flavor palette, they conduct this type of research study using methods like observation in supermarkets. Surveying while also collecting in-depth demographic information offers insights into the preferences of different markets, and can help tailor the rubs and spreads to the preferred meats in each demographic. Conducting this type of research helps the organization tweak its business model and amplify marketing in core markets.
  • Another example of where this research can be used is if a school district wishes to evaluate teachers’ attitudes about using technology in the classroom. By conducting surveys and observing teachers’ comfort with technology, the researcher can gauge whether a full-fledged implementation would face issues. This also helps in understanding whether the students are impacted in any way by this change.

Some other research problems and research questions that can lead to descriptive research are:

  • Market researchers want to observe the habits of consumers.
  • A company wants to evaluate the morale of its staff.
  • A school district wants to understand if students will access online lessons rather than textbooks.
  • A company wants to understand if its wellness programs enhance the overall health of its employees.



Likert scale interpretation: How to analyze the data with examples


What are Likert scale and Likert scale questionnaires?


Likert scaling consists of questions answered on a scale of 5 or 7 options from which the respondent can choose.

Have you ever answered a survey question that asks to what extent you agree with a statement? The answers were probably: strongly disagree, disagree, neither disagree nor agree, agree, or strongly agree. Well, that’s a Likert question.

Regardless of the name—a satisfaction scale, an agree-disagree scale, or a strongly agree scale—the format is pretty powerful and a widely used means of survey measurement, primarily used in customer experience and employee satisfaction surveys.

In this article, we’ll answer some common questions about Likert scales and how they are used, though most importantly Likert scale scoring and interpretation. Learn our advice about how to benefit from conclusions drawn from satisfaction surveys and how to use them to implement changes that will improve your business!

A Likert scale usually contains 5 or 7 response options—ranging from strongly agree to strongly disagree—with differing nuances between these and a mandatory mid-point of neither agree nor disagree (for those who hold no opinion). The Likert-type scale got its name from psychologist Rensis Likert, who developed it in 1932.

Likert scales are a type of closed-ended question: like common yes-or-no questions, they allow participants to choose from a predefined set of answers, as opposed to phrasing their opinions in their own words. But unlike yes-or-no questions, satisfaction-scale questions allow for the measurement of people’s views on a specific topic with a greater degree of nuance.

Since these questions are predefined, it’s essential to include questions that are as specific and understandable as possible.

Answer presets can be numerical, descriptive, or a combination of both numbers and words. Responses range from one extreme attitude to the other, while always including a neutral opinion in the middle.

A Likert scale question is one of the most commonly used in surveys to measure how satisfied a customer or employee is. The most common example of their use is in customer satisfaction surveys , which are an integral part of market research .

Are satisfaction-scale questions the best survey questions?

Maybe you’ve answered one too many customer satisfaction surveys with Likert scales in your lifetime and now consider them way too generic and bland. But, the fact is they are one of the most popular types of survey questions.

First of all, they are pretty appealing to respondents because they are easy to understand and do not require too much thinking to answer.

And, while binary (yes-or-no) questions offer only two response options (i.e., if a customer is satisfied with your products and services or not), satisfaction-scale questions provide a clearer understanding of customers’ thoughts and opinions.

By using well-prepared additional questions, questions about particular products or service segments can be asked. That way, getting to the bottom of customer dissatisfaction is possible, making it easier to find a way to address their complaints and improve their experience.

Such surveys enable figuring out why customers are satisfied with one product but not another. This empowers the recognition of products and service areas that customers are confident in while helping to find ways to improve others.

When it comes to analyzing and interpreting survey scale results, Likert questions are helpful because they provide quantitative data that is easy to code and interpret. Results can also be analyzed through cross-tabulation analysis (we’ll get back to that later).

Likert questions can be used for many kinds of research: for example, to determine the level of customer satisfaction with the latest product, assess employee satisfaction, or get post-event feedback from attendees after a specific event.

Questions can take different forms, but the most common is the 5-point or 7-point Likert scale question. There are 4-point and even 10-point Likert scale questions as well.

How to choose from these options?

The most common is the 5-point question. Most researchers advise the use of at least five response options (if not more). This ensures that respondents have enough choices to express their opinion as accurately as possible.

Some researchers suggest always using an even number of responses so respondents are not presented with a neutral answer, therefore having to “choose a side.” This is to avoid a tepid response even when respondents have an opinion, which is one of the most common types of errors in surveying .

Likert scale interpretation involves analyzing the responses to understand the participants’ attitudes toward the statements.

It’s important to note that Likert scales provide a quantitative representation of attitudes but do not necessarily capture underlying reasoning or motivations. Qualitative methods, such as interviews or open-ended questions, are often used in conjunction with Likert scales to gain a deeper understanding of participants’ perspectives.

Overall, Likert scale interpretation of data involves analyzing the numerical ratings, considering the directionality of the scale, examining central tendency and variability, identifying response patterns, and conducting comparative analyses to draw meaningful conclusions about people’s attitudes or opinions.

How to analyze satisfaction survey scale questions

For a survey to be its best, how the gathered information is analyzed is as important as the gathering itself. That’s why we’ll now turn to the most effective ways of analyzing responses from satisfaction survey scales.

When using Likert scale questions, the analysis tools used are mean, median, and mode. These help better understand the information collected.

The mean (or average) is calculated by adding all the values and dividing the sum by the number of responses. The median is the middle value of a data set, while the mode is the value that occurs most often.

Some other useful ways of analyzing information are filtering and cross tabulation.

Using a filter, the responses of one particular group of respondents are focused upon and the rest are filtered out. For example, how female customers rate a product can be determined by filtering out male respondents, while the views of customers aged 20 to 30 can be examined by filtering out older respondents.

Cross tabulation, on the other hand, is a method to compare two sets of information in one chart and analyze the relationship between multiple variables. In other words, it can show the responses of a particular subgroup, or of several subgroups combined.

Say you want to look at the responses of unemployed female respondents aged 20 to 30. By using cross tabulation, all three parameters—gender, age, and employment status—can be combined and correlation calculated.
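Both filtering and cross tabulation are short one-liners in pandas. A minimal sketch with hypothetical respondent data (the column names are invented for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    "gender":    ["F", "M", "F", "F", "M", "F"],
    "age_group": ["20-30", "31-40", "20-30", "20-30", "20-30", "31-40"],
    "employed":  [False, True, False, True, False, False],
    "rating":    [4, 3, 5, 4, 2, 5],
})

# Filtering: focus on unemployed female respondents aged 20 to 30.
subset = df[(df["gender"] == "F") & (df["age_group"] == "20-30") & ~df["employed"]]
print(subset["rating"].mean())

# Cross tabulation: mean rating broken down by gender and age group.
print(pd.crosstab(df["gender"], df["age_group"],
                  values=df["rating"], aggfunc="mean"))
```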

If this all sounds confusing, SurveyPlanet luckily doesn’t just offer great examples of surveys and the ability to create custom themes, but also the power to export survey results into several different formats, such as Microsoft Excel and Word, CSV, PDF, and JSON files.

How to interpret Likert scale data?

When information has been gathered and analyzed, it’s time to present it to stakeholders. This is the final stage of research. Analyzing the results of Likert scale questionnaires is a vital way to improve services and grow a business. Presenting the results correctly is a key step.

Here’s how to develop a clear goal and present it understandably and engagingly.

Compare the newly obtained information with data gathered from previous surveys. Sure, information gathered from the latest research is valuable on its own, but not helpful enough. For example, it tells you if customers are currently satisfied with products or services, but not whether things are better or worse than last year.

The key to improving customer service—and thus developing a business—is comparing current responses with previous ones. This is called longitudinal analysis. It can provide valuable insights about how a business is developing, if things are improving or declining, and what issues need to be solved.

If there is no previous data, then start collecting feedback immediately in order to compare results with future surveys. This is called benchmarking. It helps keep track of progress and how products, services, and overall customer satisfaction changes over time.

The most crucial comparison is between new findings and previous surveys. But it is highly recommended to also compare findings with other types of information, such as Google Analytics, sales data, and other objective indicators.

Another good practice is comparing qualitative with quantitative data. The more information, the more accurate the research results, which will help better convey findings to stakeholders. This will also improve business decision-making, strengthening the experiences of customers and employees.

Numbers are easier to understand when suitable visual representation is provided. However, it is essential to use a medium that adequately highlights key findings.

Line graphs, pie charts, bar charts, histograms, scatterplots, infographics, and many more techniques can be used.

But don’t forget good old tables. Even if they’re not so visually dynamic and a little harder on the eyes, some information is simply best presented in tables, especially numerical data.

Working with all of these options, more satisfactory presentations can be created.

When presenting findings to stakeholders, don’t just focus on the numbers. Instead, highlight the conclusions about customer or employee satisfaction drawn from the research. That way, everyone present at the meeting will gain a deeper understanding of what you’re trying to convey.

A valuable and exciting piece of advice is to focus on the story the numbers tell. Don’t simply list the numbers collected. Instead, use relevant examples and connect all the information, building on each dataset to make a meaningful whole.

Define and describe problems that need to be solved in engaging and easy-to-understand terms so that listeners don’t have a hard time understanding what is being shared. Include suggestions that could improve, for example, customer experience outcomes. It is also important to share findings with the relevant teams, listen to their perspectives, and find solutions together.

An example of Likert scale data analysis and interpretation

Let’s consider an example scenario and go through the steps of analyzing and interpreting Likert scale data.

Scenario: A company conducts an employee satisfaction survey using a Likert scale to measure employees’ attitudes toward various aspects of their work environment. The scale ranges from 1 (Strongly Disagree) to 5 (Strongly Agree).

Item 1: “I feel valued and appreciated at work.”

Item 2: “My workload is manageable.”

Item 3: “I receive adequate training and support.”

Item 4: “I have opportunities for growth and advancement.”

Item 5: “My supervisor provides constructive feedback.”

Step 1: Calculate mean scores by summing up the responses and dividing by the number of respondents.

Item 1: Mean score = (4+5+5+4+3)/5 = 4.2

Item 2: Mean score = (3+4+3+3+4)/5 = 3.4

Item 3: Mean score = (4+4+5+4+3)/5 = 4.0

Item 4: Mean score = (3+4+3+2+4)/5 = 3.2

Item 5: Mean score = (4+3+4+3+5)/5 = 3.8

Step 2: Assess central tendency by looking at the distribution of responses to identify the most frequent response or central point.

Item 1: 4 (Agree) is the most frequent response.

Item 2: 3 (Neutral) is the most frequent response.

Item 3: 4 (Agree) is the most frequent response.

Item 4: 3 (Neutral) is the most frequent response.

Item 5: 4 (Agree) is the most frequent response.

Step 3: Consider variability by assessing the range or spread of responses to understand the diversity of opinions.

Item 1: Range = 5-3 = 2 (relatively low variability)

Item 2: Range = 4-3 = 1 (low variability)

Item 3: Range = 5-3 = 2 (relatively low variability)

Item 4: Range = 4-2 = 2 (relatively low variability)

Item 5: Range = 5-3 = 2 (relatively low variability)
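Steps 1 through 3 can be reproduced in a few lines. A minimal sketch, using the five hypothetical responses per item implied by the example:

```python
import pandas as pd

# Hypothetical answers of five respondents to the five items (1-5 scale).
items = pd.DataFrame({
    "item1": [4, 5, 5, 4, 3],
    "item2": [3, 4, 3, 3, 4],
    "item3": [4, 4, 5, 4, 3],
    "item4": [3, 4, 3, 2, 4],
    "item5": [4, 3, 4, 3, 5],
})

print(items.mean())               # Step 1: mean score per item (4.2, 3.4, ...)
print(items.mode())               # Step 2: most frequent response(s); ties give rows
print(items.max() - items.min())  # Step 3: range as a simple spread measure
```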

Step 4: Identify response patterns by looking for consistent agreement or disagreement across items, or patterns of response clusters.

Step 5: Compare responses among different groups, such as departments or job positions, to identify attitude variations.

In this example, there is a pattern of agreement on items related to feeling valued at work (Item 1), receiving training and support (Item 3), and receiving constructive feedback (Item 5). However, there is a relatively neutral response pattern for workload manageability (Item 2) and growth opportunities (Item 4).

For example, you could compare responses between different departments to see if there are significant differences in employee satisfaction levels.

Based on the analysis, employees feel valued and appreciated at work (Item 1) and perceive adequate training and support (Item 3). However, there may be room for improvement regarding workload manageability (Item 2), opportunities for growth (Item 4), and the provision of constructive feedback (Item 5).

The relatively low variability across items suggests moderate agreement within the group. However, the neutral response pattern for workload manageability and opportunities for growth may indicate areas that require attention to enhance employee satisfaction.

Likert scales are a highly effective way of collecting quantitative data about attitudes. They help you gain a deeper understanding of customers’ or employees’ opinions and needs.

Make this kind of vital research easier. Discover our unique features —like exporting and printing results —that will save time and energy. Let SurveyPlanet take care of your surveys!


Likert Scale


Takashi Yamashita and Roberto J. Millar

Synonyms: Likert-type scale; Rating scale

Likert scaling is one of the most fundamental and frequently used assessment strategies in social science research (Joshi et al. 2015). A social psychologist, Rensis Likert (1932), developed the Likert scale to measure attitudes. Although attitudes and opinions had been popular research topics in the social sciences, the measurement of these concepts was not established until this time. In a groundbreaking study, Likert (1932) introduced this new approach of measuring attitudes toward internationalism with a 5-point scale – (1) strongly approve, (2) approve, (3) undecided, (4) disapprove, and (5) strongly disapprove. For example, one of nine internationalism scale items measured attitudes toward statements like, “All men who have the opportunity should enlist in the Citizen’s Military Training Camps.” Based on the survey of 100 male students from one university, Likert showed the sound psychometric properties (i.e., validity and...


Baker TA, Buchanan NT, Small BJ, Hines RD, Whitfield KE (2011) Identifying the relationship between chronic pain, depression, and life satisfaction in older African Americans. Res Aging 33(4):426–443


Bishop PA, Herron RL (2015) Use and misuse of the Likert item responses and other ordinal measures. Int J Exerc Sci 8(3):297

Carifio J, Perla R (2008) Resolving the 50-year debate around using and misusing Likert scales. Med Educ 42(12):1150–1152

DeMaris A (2004) Regression with social data: modeling continuous and limited response variables. Wiley, Hoboken

Femia EE, Zarit SH, Johansson B (1997) Predicting change in activities of daily living: a longitudinal study of the oldest old in Sweden. J Gerontol 52B(6):P294–P302. https://doi.org/10.1093/geronb/52B.6.P294


Gomez RG, Madey SF (2001) Coping-with-hearing-loss model for older adults. J Gerontol 56(4):P223–P225. https://doi.org/10.1093/geronb/56.4.P223

Joshi A, Kale S, Chandel S, Pal D (2015) Likert scale: explored and explained. Br J Appl Sci Technol 7(4):396

Kong J (2017) Childhood maltreatment and psychological well-being in later life: the mediating effect of contemporary relationships with the abusive parent. J Gerontol 73(5):e39–e48. https://doi.org/10.1093/geronb/gbx039

Kuzon W, Urbanchek M, McCabe S (1996) The seven deadly sins of statistical analysis. Ann Plast Surg 37:265–272

Likert R (1932) A technique for the measurement of attitudes. Arch Psychol 22(140):55–55

Pruchno RA, McKenney D (2002) Psychological well-being of black and white grandmothers raising grandchildren: examination of a two-factor model. J Gerontol 57(5):P444–P452. https://doi.org/10.1093/geronb/57.5.P444

Sullivan GM, Artino AR Jr (2013) Analyzing and interpreting data from Likert-type scales. J Grad Med Educ 5(4):541–542

Trochim WM, Donnelly JP, Arora K (2016) Research methods: the essential knowledge base. Cengage Learning, Boston


Likert Scale Questionnaire: Examples & Analysis

Saul Mcleod, PhD, and Olivia Guy-Evans, MSc

Various kinds of rating scales have been developed to measure attitudes directly (i.e., the person knows their attitude is being studied). The most widely used is the Likert scale (Likert, 1932).

In its final form, the Likert scale is a five (or seven) point scale that is used to allow an individual to express how much they agree or disagree with a particular statement.

The Likert scale (typically) provides five possible answers to a statement or question that allows respondents to indicate their positive-to-negative strength of agreement or strength of feeling regarding the question or statement.

Example Likert item: “I believe that ecological questions are the most important issues facing human beings today.” (rated on an agreement scale from strongly agree to strongly disagree)

A Likert scale assumes that the strength/intensity of an attitude is linear, i.e., on a continuum from strongly agree to strongly disagree, and makes the assumption that attitudes can be measured.

For example, each of the five (or seven) responses would have a numerical value that would be used to measure the attitude under investigation.

Examples of Items for Surveys

In addition to measuring statements of agreement, Likert scales can measure other variations such as frequency, quality, importance, and likelihood, etc.

Analyzing Data

The response categories in the Likert scales have a rank order, but the intervals between values cannot be presumed equal. Therefore, the mean (and standard deviation) are inappropriate for ordinal data (Jamieson, 2004).

Statistics you can use are:

  • Summarize using a median or a mode (not a mean, as it is ordinal scale data); the mode is probably the most suitable for easy interpretation.
  • Display the distribution of observations in a bar chart (it can’t be a histogram, because the data is not continuous). A minimal sketch of both summaries follows this list.
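As a sketch, assuming hypothetical responses coded 1 (strongly disagree) to 5 (strongly agree):

```python
import pandas as pd

# Hypothetical ordinal Likert responses coded 1-5.
responses = pd.Series([4, 5, 3, 4, 2, 4, 5, 3, 4, 1])

print(responses.median())                     # ordinal-appropriate summary
print(responses.mode()[0])                    # most frequent response
print(responses.value_counts().sort_index())  # category counts for a bar chart
```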

Critical Evaluation

Likert Scales have the advantage that they do not expect a simple yes / no answer from the respondent but rather allow for degrees of opinion and even no opinion at all.

Therefore, quantitative data is obtained, which means that the data can be analyzed relatively easily.

Offering anonymity on self-administered questionnaires should further reduce social pressure and thus may likewise reduce social desirability bias.

Paulhus (1984) found that more desirable personality characteristics were reported when people were asked to write their names, addresses, and telephone numbers on their questionnaire than when they were told not to put identifying information on the questionnaire.

Limitations

However, like all surveys, the validity of the Likert scale attitude measurement can be compromised due to social desirability.

This means that individuals may lie to put themselves in a positive light.  For example, if a Likert scale was measuring discrimination, who would admit to being racist?

Bowling, A. (1997). Research Methods in Health. Buckingham: Open University Press.

Burns, N., & Grove, S. K. (1997). The Practice of Nursing Research: Conduct, Critique, & Utilization. Philadelphia: W.B. Saunders and Co.

Jamieson, S. (2004). Likert scales: How to (ab)use them. Medical Education, 38(12), 1217-1218.

Likert, R. (1932). A technique for the measurement of attitudes. Archives of Psychology, 140, 1–55.

Paulhus, D. L. (1984). Two-component models of socially desirable responding. Journal of Personality and Social Psychology, 46(3), 598.

Further Information

  • History of the Likert Scale
  • Essential Elements of Questionnaire Design and Development




Likert Scale Research: A Comprehensive Guide

The Likert scale is a versatile research tool for gathering quantitative data about people’s attitudes and opinions. This comprehensive guide provides a thorough overview of the use of the Likert scale in research, from its historical origins to modern applications. We begin with an exploration of the history and development of this tool, followed by discussions of how it can be applied in different types of studies. Next, we explore the advantages and disadvantages of using a Likert-type survey, as well as best practices for designing one. Finally, we provide examples demonstrating appropriate techniques for analyzing data collected through these surveys. With this guide at their disposal, researchers have what they need to employ Likert scales effectively in robust scientific inquiry.

I. Introduction to Likert Scale Research


Likert Scale: Likert scale research is a powerful tool for understanding the attitudes and beliefs of individuals in various areas. The scale measures responses on a five-point, seven-point, or nine-point numerical continuum that ranges from “Strongly Agree” to “Strongly Disagree.” It allows researchers to accurately gauge people’s feelings about an issue or topic by asking them to choose one option out of many.

By using this type of survey instrument, researchers can explore differences between groups in terms of opinions on certain topics. This information can be used for academic research papers as well as marketing initiatives. For instance, a researcher might use the Likert scale when studying how different types of customers respond differently towards advertising campaigns.

  • The data obtained through such surveys can help businesses better tailor their strategies so they are more effective at reaching potential customers.
  • In academia it may also provide useful insight into why some student populations have stronger feelings than others concerning particular issues facing education today.

Exploring the Use of Likert Scales

The use of a survey instrument to measure attitudes dates back to 1932, when social scientist Rensis Likert published his landmark paper, “A Technique for the Measurement of Attitudes,” in which he introduced what we now refer to as the Likert scale. The most widely known version involves rating responses across a continuum from strongly agree to strongly disagree with respect to one’s opinion on any given topic or situation.

This technique has become popular among researchers over time due to its accuracy at measuring sentiment about certain issues. It can be used for both structured and unstructured surveys because it allows respondents to provide either quantitative answers, such as numerical ratings (1-5), or qualitative answers, such as descriptive statements that are then scored against predetermined criteria by the researcher(s). Additionally, this scale works well with large sample sizes, so it is often utilized in research involving multiple participants in order to obtain more accurate results than would be possible with smaller samples. Furthermore, since these scales are relatively easy for people unfamiliar with statistics or research methods to understand, they are ideal tools for reaching broad populations who may not have had previous experience participating in studies on topics like psychology or sociology.

Choosing the Right Survey Format

When designing a survey, it is important to choose the right format. Likert scale surveys are one of the most popular methods for measuring people’s attitudes and opinions on a particular topic. A Likert survey typically involves providing respondents with several statements related to a specific area or subject and asking them to indicate their level of agreement or disagreement with each statement using an anchored rating scale. This type of survey can be used in many research contexts, such as market research studies and academic papers.

To create an effective Likert-scale survey, there are certain key elements that should be considered. The questions should cover all points within your study’s scope; they must also be worded clearly so that respondents will understand what is being asked of them. Furthermore, when constructing a five-point (or seven-point) rating scale for each question – which provides gradations between strong agreement/disagreement – make sure you use clear labels at either end (e.g., Strongly Agree / Strongly Disagree). As demonstrated by Yeung et al. (2017), this step allows researchers to measure nuances in opinion more accurately than if only two ratings were offered.

  • Ensure all questions relate back to your original goal.
  • Make sure wording of each question is precise yet concise.

Advantages of the Likert Scale

The Likert scale is a well-established tool used to collect survey data in research. It has long been valued for its simplicity and ease of use, as well as its ability to effectively measure responses across multiple variables or questions. The structure of the scale – ranging from ‘strongly agree’ to ‘strongly disagree’ – allows researchers to quickly and reliably capture attitudes towards any particular topic or issue under investigation. Moreover, recent studies suggest that this type of measurement can be advantageous when compared with other approaches such as semantic differential scales (Smith & Johnson, 2020).

Furthermore, another advantage is its flexibility: it can be used in both quantitative and qualitative research settings without needing significant adjustments or reconfigurations (Matzler et al., 2018). In addition, if deemed necessary by an investigator’s design choices, a traditional 5-point Likert item can easily be modified into a higher-order 7-point item based on the study’s own criteria and objectives. Consequently, this makes it ideal for conducting large surveys that must collect data from numerous respondents within short time frames while maintaining accuracy throughout the entire process (Matzler et al., 2018).

Limitations

Despite these advantages, there are certain limitations compared with more established methods such as interviews and focus groups. For instance, self-reported measures like Likert scales rely heavily on respondent honesty when filling out questionnaires, which may lead to artificially inflated results due to desirability bias: a participant may respond positively because they want people to perceive them favourably, rather than answering honestly about what they actually think (Mavridis & Zafra‐Gómez, 2017). Additionally, some individuals do not hold strong opinions on specific topics, making it difficult to score their attitudes accurately with conventional rating systems, versus alternative options available depending upon project requirements (Aluya, 2017). Furthermore, investigators often find themselves unable to identify adequately detailed contextual factors surrounding individual responses, since all information gathered must conform to the rigid formatting constraints imposed by standardised Likert-scale instruments, particularly in high-volume cases involving large sample sizes (Babenko & Babenko, 2016).

Analyzing Survey Results

Quantitative data collected from a Likert survey can provide valuable insight into customer preferences and behavior. The use of a scale with which to measure responses gives us the ability to compare and contrast answers across different groups, allowing for more precise analysis than if respondents were simply asked open-ended questions.

In order to analyze the results of a Likert survey effectively, it is important that researchers understand how ratings are assigned and interpreted. A traditional five-point rating scale may be used, in which responders rate items or statements on an increasing level of agreement (e.g., 1 = strongly disagree; 5 = strongly agree). As outlined by Krosnick & Alwin (1989) in their research paper “The Reliability of Trend Estimates From Surveys: The Role of Question Context”, when using this kind of rating system it is beneficial for researchers to ask questions that include verbal labels along with numerical equivalents.

  • This helps eliminate confusion among respondents who may not have experience taking surveys.
  • It also increases consistency in scoring between participants as they should all assign similar meanings to each number provided.

Additionally, conducting focus groups before deploying any survey can be useful when trying to identify what types of feedback will best help answer research objectives.

  • These types of qualitative methods allow investigators the opportunity to gain further context surrounding the topics being discussed within the study.

Practical Strategies

  • Offer incentives: Providing incentives such as vouchers or gift cards has been proven to be an effective way of encouraging respondents to complete surveys. This was demonstrated in a recent research paper that found offering participants monetary rewards increased response rates for Likert scale surveys.
  • Timely reminders: Follow-up emails sent at regular intervals can help remind people who haven’t responded yet and encourage them to take part. Several studies have shown this is an efficient way of increasing completion rates, especially when the reminder messages are personalised.

Survey Design Tips

Using best practices during survey design helps increase engagement from the respondent and maximises response rate potential.

  • Keep it concise: Long questionnaires tend to lead to disengagement and frustration which will negatively impact responses. Consider how essential each section is before adding extra questions.
  • Layout matters: Make sure your layout follows good design principles – make use of white space, visual cues (such as arrows), font size changes, etc., so it’s easy on the eye but still clear enough for instructions not to go unnoticed by your audience.

Summary of Findings

The research conducted using a Likert scale revealed several key insights. First, there is a clear trend among respondents towards favouring the new product launch: over 70% of survey participants indicated that they would be interested in purchasing it when available. There was also strong interest in features such as customisation and personalised recommendations, with almost all participants indicating interest to some degree.

Another finding from the survey is that customers are willing to pay more for enhanced services and experiences associated with the company's products; however, cost remains an important factor in their decision-making. Customer service expectations also remain high: satisfaction ratings were generally positive but leave room for improvement.

This article on Likert scale research is intended as a resource for anyone looking to conduct their own studies or gain a better understanding of the topic. It provides a comprehensive overview, along with tips and advice from experts in the field. We hope this guide has helped demystify what can be an intimidating subject and will serve as a useful reference point as researchers delve deeper into conducting surveys using the scale. With continued innovation, Likert scale research should remain a core part of survey methodology, providing valuable data for organisations across all industries.

A Review of Key Likert Scale Development Advances: 1995–2019

Andrew T. Jebb

1 Department of Psychological Sciences, Purdue University, West Lafayette, IN, United States

2 Department of Psychology, University of Houston, Houston, TX, United States

Abstract

Developing self-report Likert scales is an essential part of modern psychology. However, it is hard for psychologists to remain apprised of best practices as methodological developments accumulate. To address this, the current paper offers a selective review of advances in Likert scale development that have occurred over the past 25 years. We reviewed six major measurement journals (e.g., Psychological Methods, Educational and Psychological Measurement) between the years 1995–2019 and identified key advances, ultimately including 40 papers and offering written summaries of each. We supplemented this review with an in-depth discussion of five particular advances: (1) conceptions of construct validity, (2) creating better construct definitions, (3) readability tests for generating items, (4) alternative measures of precision [e.g., coefficient omega and item response theory (IRT) information], and (5) ant colony optimization (ACO) for creating short forms. The Supplementary Material provides further technical details on these advances and offers guidance on software implementation. This paper is intended to be a resource for psychological researchers to be informed about more recent psychometric progress in Likert scale creation.

Introduction

Psychological data are diverse and range from observations of behavior to face-to-face interviews. However, in modern times, one of the most common measurement methods is the self-report Likert scale (Baumeister et al., 2007; Clark and Watson, 2019). Likert scales provide a convenient way to measure unobservable constructs, and published tutorials detailing the process of their development have been highly influential, such as Clark and Watson (1995) and Hinkin (1998) (cited over 6,500 and 3,000 times, respectively, according to Google Scholar).

Notably, however, it has been roughly 25 years since these seminal papers were published, and specific best practices have changed or evolved since then. Recently, Clark and Watson (2019) gave an update to their 1995 article, integrating some newer topics into a general tutorial of Likert scale creation. However, scale creation—from defining the construct to testing nomological relationships—is such an extensive process that it is challenging for any paper to give full coverage to each of its stages. The authors were quick to note this themselves several times, e.g., “[w]e have space only to raise briefly some key issues” and “unfortunately we do not have the space to do justice to these developments here” (p. 5). Therefore, a contribution to psychology would be a paper that provides a review of advances in Likert scale development since classic tutorials were published. This paper would not be a general tutorial in scale development like Clark and Watson (1995, 2019), Hinkin (1998), or others. Instead, it would focus on more recent advances and serve as a complement to these broader tutorials.

The present paper seeks to serve as such a resource by reviewing developments in Likert scale creation from the past 25 years. However, given that scale development is such an extensive topic, the limitations of this review should be made explicit. The first limitations concern scope. This is not a review of psychometrics, which would be impossibly broad, or of advances in self-report in general, which would also be unwieldy (e.g., it would include measurement techniques like implicit measures and forced-choice scales). This is a review of the initial development and validation of self-report Likert scales. Therefore, we also excluded measurement topics related to the use of self-report scales, such as identifying and controlling for response biases.1 Although this scope omits many important aspects of measurement, it was necessary to make the review feasible.

Importantly, like Clark and Watson (1995, 2019) and Hinkin (1998), this paper was written at the level of the general psychologist, not the methodologist, in order to benefit the field of psychology most broadly. This also meant that our scope was to find articles broad enough to apply to most cases of Likert scale development. As a result, we omitted articles that only discussed measuring certain types of constructs [e.g., Haynes and Lench's (2003) paper on the incremental validation of new clinical measures].

The second major limitation concerns objectivity. Performing any review of what is “significant” requires, at some point, making subjective judgment calls. The majority of the papers we reviewed were fairly easy to decide on. For example, we included Simms et al. (2019) because they tackled a major Likert scale issue: the ideal number of response options (as well as the comparative performance of visual analog scales). By contrast, we excluded Permut et al. (2019) because their advance was about monitoring the attention of subjects taking surveys online, not about scale development per se. However, other papers were more difficult to decide on. Our method of handling this ambiguity is described below, but we do not claim that subjectivity played no part in the review process.

Additionally, (a) we did not survey every single journal where advances may have been published2 and (b) articles published after 2019 were not included. Despite all these limitations, this review was still worth performing. Self-report Likert scales are an incredibly dominant source of data in psychology and the social sciences in general. The divide between methodological and substantive literatures—and between methodologists and substantive researchers (Sharpe, 2013)—can increase over time, but it can also be reduced by good communication and dissemination (Sharpe, 2013). The current review is our attempt to bridge, in part, that gap.

To conduct this review, we examined every issue of six major journals related to psychological measurement from January 1995 to December 2019 (inclusive), screening articles by title and/or abstract. The full text of any potentially relevant article was reviewed by either the first or second author, and borderline cases were discussed until a consensus was reached. A PRISMA flowchart of the process is shown in Figure 1. The journals we surveyed were: Applied Psychological Measurement, Psychological Assessment, Educational and Psychological Measurement, Psychological Methods, Advances in Methods and Practices in Psychological Science, and Organizational Research Methods. For inclusion, our criteria were that the advance had to be: (a) related to the creation of self-report Likert scales (seven excluded), (b) broad and significant enough for a general psychological audience (23 excluded), and (c) not superseded or encapsulated by newer developments (11 excluded). The advances we included are shown in Table 1, along with a short descriptive summary of each. Scale developers should not feel compelled to use all of these techniques, just those that contribute to better measurement in their context. More specific contexts (e.g., measuring socially sensitive constructs) can utilize additional resources.

Figure 1. PRISMA flowchart of the review process.

Table 1. Summary of Likert scale creation developments, 1995–2019.

To supplement this literature review, the remainder of the paper provides a more in-depth discussion of five of these advances, which span a range of topics. These were chosen for their importance, uniqueness, or ease of use, and for their lack of coverage in classic scale creation papers. They are: (1) conceptualizations of construct validity, (2) approaches for creating more precise construct definitions, (3) readability tests for generating items, (4) alternative measures of precision (e.g., coefficient omega), and (5) ant colony optimization (ACO) for creating short forms. These developments are presented roughly in the order of the stage at which they occur in the process of scale creation, a schematic diagram of which is shown in Figure 2.

Figure 2. Schematic diagram of Likert scale development (advances discussed in the current paper, bolded).

Conceptualizing Construct Validity

Two Views of Validity

Psychologists recognize validity as the fundamental concept of psychometrics and one of the most critical aspects of psychological science (Hood, 2009; Cizek, 2012). However, what is “validity”? Despite the widespread agreement about its importance, there is disagreement about how validity should be defined (Newton and Shaw, 2013). In particular, there are two divergent perspectives on the definition. The first major perspective defines validity not as a property of tests but as a property of the interpretations of test scores (Messick, 1989; Kane, 1992). This view can therefore be called the interpretation camp (Hood, 2009) or validity as construct validity (Cronbach and Meehl, 1955), which is the perspective endorsed by Clark and Watson (1995, 2019) and by the standards set forth by governing agencies for the North American educational and psychological measurement supracommunity (Newton and Shaw, 2013). Construct validity is based on a synthesis and analysis of the evidence that supports a certain interpretation of test scores, so validity is a property of interpretive inferences about test scores (Messick, 1989, p. 13), especially interpretations of score meaning (Messick, 1989, p. 17). Because the context of measurement affects test scores (Messick, 1989, pp. 14–15), the results of any validation effort are conditional upon the context of, and group characteristics of, the sample with which the studies were done, as are claims of validity drawn from these empirical results (Newton, 2012; Newton and Shaw, 2013).

The other major perspective (Borsboom et al., 2004) revivifies one of the oldest and most intuitive definitions of validity: “…whether or not a test measures what it purports to measure” (Kelley, 1927, p. 14). In other words, on this view, validity is a property of tests rather than of interpretations. Validity is simply whether or not the statement “test X measures attribute Y” is true. To be true, it requires (a) that Y exists and (b) that variations in Y cause variations in X (Borsboom et al., 2004). This definition can be called the test validity view and finds ample precedent in psychometric texts (Hood, 2009). However, Clark and Watson (2019), citing the Standards for Educational and Psychological Testing (American Educational Research Association et al., 2014), reject this conception of validity.

Ultimately, this disagreement shows no signs of resolving, and interested readers can consult papers that have attempted to integrate or adjudicate between the two views (Lissitz and Samuelson, 2007; Hood, 2009; Cizek, 2012).

There Aren’t “Types” of Validity; Validity Is “One”

Even though there are stark differences between these two definitions of validity, one thing they do agree on is that there are not different “types” of validity (Newton and Shaw, 2013). Language like “content validity” and “criterion-related validity” is misleading because it implies that their typical analytic procedures produce empirical evidence that does not bear on the central inference of interpreting the score's meaning (i.e., construct validity; Messick, 1989, pp. 13–14, 17, 19–21). Rather, there is only (construct) validity, and different validation procedures and types of evidence all contribute to making inferences about score meaning (Messick, 1980; Binning and Barrett, 1989; Borsboom et al., 2004).

Despite the agreement that validity is a unitary concept, psychologists seem to disagree in practice; as of 2013, there were 122 distinct subtypes of validity (Newton and Shaw, 2013), many of them coined after the fourth edition of the Standards stated that validity-type language was inappropriate (American Educational Research Association et al., 1985). A consequence of speaking this way is that it perpetuates the view (a) that there are independent “types” of validity (b) that entail different analytic procedures to (c) produce corresponding types of evidence that (d) themselves correspond to different categories of inference (Messick, 1989). This is why even speaking of content, construct, and criterion-related “analyses” (e.g., Lawshe, 1985; Landy, 1986; Binning and Barrett, 1989) can be problematic, since it misleads researchers into thinking that these produce distinct kinds of empirical evidence with a direct, one-to-one correspondence to the three broad categories of inferences with which they are typically associated (Messick, 1989).

However, an analytic procedure traditionally associated with a certain “type” of validity can be used to produce empirical evidence for another “type” of validity not typically associated with it. For instance, showing that the focal construct is empirically discriminable from similar constructs would constitute strong evidence for the inference of discriminability (Messick, 1989). However, the researcher could use analyses typically associated with “criterion and incremental validity” (Sechrest, 1963) to investigate discriminability as well (e.g., Credé et al., 2017). Thus, the key takeaway is to think not of “discriminant validity” or distinct “types” of validity, but to use a wide variety of research designs and statistical analyses to potentially provide evidence that may or may not support a given inference under investigation (e.g., discriminability). This demonstrates that thinking about validity “types” can be unnecessarily restrictive because it misleads researchers into thinking about validity as a fragmented concept (Newton and Shaw, 2013), leading to negative downstream consequences in validation practice.

Creating Clearer Construct Definitions

Ensuring Concept Clarity

Defining the construct one is interested in measuring is a foundational part of scale development; failing to do so properly undermines every scientific activity that follows (Thorndike, 1904; Kelley, 1927; Mackenzie, 2003; Podsakoff et al., 2016). However, there are lingering issues with conceptual clarity in the social sciences. Locke (2012) noted, “As someone who has been reviewing journal articles for more than 30 years, I estimate that about 90% of the submissions I get suffer from problems of conceptual clarity” (p. 146), and Podsakoff et al. (2016) stated that “it is…obvious that the problem of inadequate conceptual definitions remains an issue for scholars in the organizational, behavioral, and social sciences” (p. 160). To support this effort, we surveyed key papers on construct clarity and integrated their recommendations into Table 2, adding our own comments where appropriate. We cluster this advice into three “aspects” of formulating a construct definition, each of which contains several specific strategies.

Table 2. Integrative summary of advice for defining constructs.

Specifying the Latent Continuum

In addition to clearly articulating the concept, there are other parts to defining a psychological construct for empirical measurement. Another recent development demonstrates the importance of incorporating the latent continuum in measurement (Tay and Jebb, 2018). Briefly, many psychological concepts like emotion and self-esteem are conceived as having degrees of magnitude (e.g., “low,” “moderate,” and “high”), and these degrees can be represented by a construct continuum. The continuum was originally a primary focus in early psychological measurement, but the advent of convenient Likert(-type) scaling (Likert, 1932) pushed it into the background.

However, defining the characteristics of this continuum is needed for proper measurement. For instance, what do the poles (i.e., endpoints) of the construct represent? Is the lower pole its absence , or is it the presence of an opposing construct (i.e., a unipolar or bipolar continuum)? And, what do the different continuum degrees actually represent? If the construct is a positive emotion, do they represent the intensity of experience or the frequency of experience? Quite often, scale developers do not define these aspects but leave them implicit. Tay and Jebb (2018) discuss different problems that can arise from this.

In addition to defining the continuum, there is also the practical issue of fully operationalizing it (Tay and Jebb, 2018). This involves ensuring that the whole continuum is well represented when creating items. It also means being mindful when including reverse-worded items in a scale. These items may measure an opposite construct, which is desirable if the construct is bipolar (e.g., positive emotion as including both happy and sad) but contaminates measurement if the construct is unipolar (e.g., positive emotion as only including feeling happy). Finally, developers should choose a response format that aligns with whether the continuum has been specified as unipolar or bipolar. For example, a numerical rating of 0–4 typically implies a unipolar scale to the respondent, whereas a −3-to-3 response scale implies a bipolar scale. Verbal labels like “Not at all” to “Extremely” imply unipolarity, whereas formats like “Strongly disagree” to “Strongly agree” imply bipolarity. Tay and Jebb (2018) also discuss operationalizing the continuum with regard to two other issues: assessing the dimensionality of the scale and assuming the correct response process.
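On the practical side, reverse-worded items are typically re-scored before totals are computed. The snippet below is a minimal sketch of the standard transformation (reversed = scale minimum + scale maximum − raw response); the scale bounds and data are assumed for illustration.

    # Reverse-coding a negatively worded item on an assumed 1-5 scale.
    SCALE_MIN, SCALE_MAX = 1, 5

    def reverse_code(response: int) -> int:
        # Maps 1 -> 5, 2 -> 4, ..., 5 -> 1.
        return SCALE_MIN + SCALE_MAX - response

    raw = [2, 5, 1, 4]                     # hypothetical responses
    print([reverse_code(r) for r in raw])  # [4, 1, 5, 2]

Note that this transformation is only appropriate once the unipolar/bipolar question above has been settled; mechanically reverse-coding items that measure an opposite construct on a unipolar continuum does not remove the contamination.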

Readability Tests for Items

The current psychometric practice is to keep item statements short and simple, with language that is familiar to the target respondents (Hinkin, 1998). Instructions like these alleviate readability problems because psychologists are usually good at identifying and revising difficult items. However, professional psychologists also have a much higher degree of education than the rest of the population. In the United States, less than 2% of adults have doctorates, and a majority do not have a degree past high school (U.S. Census Bureau, 2014). The average United States adult has an estimated 8th-grade reading level, with 20% of adults falling below a 5th-grade level (Doak et al., 1998). Researchers can probably catch and remove scale items that are extremely verbose (e.g., “I am garrulous”), but items that might not be easily understood by target respondents may slip through the item creation process. Social science samples frequently consist of university students (Henrich et al., 2010), but this subpopulation has a higher reading level than the general population (Baer et al., 2006), and issues that would manifest for other respondents might not be evident in such samples.

In addition to asking respondents directly (see Parrigon et al., 2017 for an example), another way to assess readability is to use readability tests, which have already been used by scale developers in psychology (e.g., Lubin et al., 1990; Ravens-Sieberer et al., 2014). Readability tests are formulas that score the readability of a piece of writing, often as a function of the number of words per sentence and the number of syllables per word. These tests take only seconds to implement and can serve as an additional check on item language beyond the intuitions of scale developers. When these tests are used, scale items should be analyzed individually, as testing the readability of the whole scale together can hide one or more difficult items. If an item receives a low readability score, the developer can revise it.

There are many different readability tests available, such as the Flesch Reading Ease test, the Flesch-Kincaid Grade Level test, the Gunning fog index, the SMOG index, the Automated Readability Index, and the Coleman-Liau Index. These operate in much the same way, outputting an estimated grade level based on sentence and word length.

We reviewed their formulas and reviews on the topic (e.g., Benjamin, 2012). At the outset, we state that no statistic is univocally superior to all the others, and it is possible to implement several tests and compare the results. However, we recommend the Flesch-Kincaid Grade Level test because it (a) is among the most commonly used, (b) is expressed in grade school levels, and (c) is easily implemented in Microsoft Word. The score indicates the United States school grade for which the text's readability is suited. Given average reading grade levels in the United States, researchers can aim for a readability score of 8.0 or below for their items. There are several examples of scale developers using this reading test. Lubin et al. (1990) found that 80% of the Depression Adjective Check Lists was at an eighth-grade reading level. Ravens-Sieberer et al. (2014) used the test to check whether a measure of subjective well-being was suitable for children. As our own exercise, we took three recent instances of scale development in the Journal of Applied Psychology and ran readability tests on their items. This analysis is presented in the Supplementary Material.
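For readers who want to script this check item by item, the sketch below computes an approximate Flesch-Kincaid Grade Level. The grade-level formula is the published one (0.39 × words-per-sentence + 11.8 × syllables-per-word − 15.59), but the syllable counter here is a crude vowel-group heuristic of our own, so scores will differ slightly from those of dedicated tools such as Microsoft Word.

    # Approximate Flesch-Kincaid Grade Level, scored item by item.
    # The syllable counter is a rough heuristic (an assumption), so treat
    # the output as indicative rather than exact.
    import re

    def count_syllables(word: str) -> int:
        # Count runs of consecutive vowels as syllables; floor at one.
        return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

    def fk_grade(text: str) -> float:
        sentences = max(1, len(re.findall(r"[.!?]+", text)))
        words = re.findall(r"[A-Za-z']+", text)
        syllables = sum(count_syllables(w) for w in words)
        return (0.39 * (len(words) / sentences)
                + 11.8 * (syllables / len(words)) - 15.59)

    # Score items individually; whole-scale averages can hide hard items.
    for item in ["I am the life of the party.", "I am garrulous."]:
        print(f"{fk_grade(item):5.2f}  {item}")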

Alternative Estimates of Measurement Precision

Alpha and Omega

A major focus of scale development is demonstrating its reliability, defined formally as the proportion of true score variance to total score variance (Lord and Novick, 1968). The most common estimator of reliability in psychology is coefficient alpha (Cronbach, 1951). However, alpha is sometimes a less-than-ideal measure because it assumes that all scale items have the same true score variance (Novick and Lewis, 1967; Sijtsma, 2009; Dunn et al., 2014; McNeish, 2018). Put in terms of latent variable modeling, this means that alpha estimates true reliability only if the factor loadings across items are the same (Graham, 2006),3 something that is “rare for psychological scales” (Dunn et al., 2014, p. 409). Violating this assumption makes alpha underestimate true reliability. Often, this underestimation may be small, but it increases for scales with fewer items and with greater differences in population factor loadings (Raykov, 1997; Graham, 2006).
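For reference, the standard formula for coefficient alpha on a k-item scale, with item variances \sigma^2_{Y_i} and total score variance \sigma^2_X, is:

    \alpha = \frac{k}{k - 1}\left(1 - \frac{\sum_{i=1}^{k} \sigma^{2}_{Y_i}}{\sigma^{2}_{X}}\right)

The equal-loadings assumption discussed above is what licenses treating this quantity as the reliability itself; when loadings differ, the same expression becomes a lower bound.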

A proposed solution is to relax this assumption and adopt the less stringent congeneric model of measurement. The most prominent estimator in this group is coefficient omega (McDonald, 1999),4 which uses a factor model to obtain reliability estimates. Importantly, omega performs at least as well as alpha when alpha's assumptions hold (Zinbarg et al., 2005). One caveat, however, is that the estimator requires a good-fitting factor model. Omega and its confidence interval can be computed with the psych package in R (for unidimensional scales, the “omega.tot” statistic from the function “omega”; Revelle, 2008). McNeish (2018) provides a software tutorial in R and Excel [see also Dunn et al. (2014) and Revelle and Condon (2019)].
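Assuming a unidimensional congeneric model with the latent variance fixed to one, loadings \lambda_i, and error variances \theta_{ii}, omega takes the familiar form (a standard formula, reproduced here for convenience):

    \omega = \frac{\left(\sum_{i=1}^{k} \lambda_i\right)^{2}}{\left(\sum_{i=1}^{k} \lambda_i\right)^{2} + \sum_{i=1}^{k} \theta_{ii}}

When all \lambda_i are equal, this reduces to the same value alpha estimates, which is why omega performs at least as well as alpha when alpha's assumptions hold.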

Reliability vs. IRT Information

Alpha, omega, and other reliability estimators stem from the classical test theory paradigm of measurement, where the focus is on the overall reliability of the psychological scale. The other measurement paradigm, item response theory (IRT), focuses on the “reliability” of the scale at a given level of the latent trait or at the level of the item (DeMars, 2010). In IRT, this is operationalized as information IRT (Mellenbergh, 1996).5

Although they are analogous concepts, information IRT and reliability are different. Whereas traditional reliability is assessed only at the scale level, information IRT can be assessed at three levels: the response category, the item, and the test. Information IRT is a full mathematical function that shows how precision changes across latent trait levels. These features translate into several advantages for the scale developer.

First, items can be evaluated for how much precision they have. Items that are not informative can be eliminated in favor of items that are (for a tutorial, see Edelen and Reeve, 2007 ). Second, the test information function shows how precisely the full scale measures each region of the latent trait. If a certain region is deficient, items can be added to better capture that region (or removed, if the region has been measured enough). Finally, suppose the scale developer is only interested in measuring a certain region of the latent trait range, such as middle-performers or high and low performers. In that case, information IRT can help them do so. Further details are provided in the Supplementary Material .
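For concreteness, these quantities have simple closed forms in the dichotomous two-parameter logistic (2PL) model; Likert items require polytomous analogues such as the graded response model, but the logic is the same. With item discrimination a_j and difficulty b_j:

    P_j(\theta) = \frac{1}{1 + e^{-a_j(\theta - b_j)}}, \qquad
    I_j(\theta) = a_j^{2}\, P_j(\theta)\,\bigl[1 - P_j(\theta)\bigr], \qquad
    I(\theta) = \sum_{j} I_j(\theta)

The conditional standard error of measurement is then SE(\theta) = 1/\sqrt{I(\theta)}, which makes explicit how precision varies across the latent trait rather than being a single scale-level number.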

Maximizing Validity in Short Forms Using Ant Colony Optimization

Increasingly, psychologists wish to use short scales in their work (Leite et al., 2008),6 as they reduce respondent time, fatigue, and required financial compensation. To date, the most common approaches aim to maintain reliability (Leite et al., 2008; Kruyen et al., 2013) and include retaining items with the highest factor loadings and item-total correlations. However, these strategies can incidentally impair measurement (Janssen et al., 2015; Olaru et al., 2015; Schroeders et al., 2016), as items with higher intercorrelations will usually have more similar content, resulting in narrower scale content (i.e., the attenuation paradox; Loevinger, 1954).

A more recent method for constructing short forms is a computational algorithm called ACO (Dorigo, 1992; Dorigo and Stützle, 2004). Instead of just maximizing reliability, this method can incorporate any number of evaluative criteria, such as associations with other variables, factor model fit, and others. When reducing a Big 5 personality scale, Olaru et al. (2015) found that, for a mixture of criteria (e.g., CFA fit indices, latent correlations), ACO either equaled or surpassed the alternative methods for creating short forms, such as maximizing factor loadings, minimizing modification indices, a genetic algorithm, and the PURIFY algorithm (see also Schroeders et al., 2016). Since its introduction to psychology, ACO has been used in the creation of real psychological scales for proactive personality and supervisor support (Janssen et al., 2015), psychological situational characteristics (Parrigon et al., 2017), and others (Olaru et al., 2015; Olderbak et al., 2015).

The logic of ACO comes from how ants resolve the problem of determining the shortest path to their hive when they find food (Deneubourg et al., 1983). The ants solve it by (a) randomly sampling different paths toward the food and (b) laying down chemical pheromones that attract other ants. The paths that provide quicker solutions acquire pheromones more rapidly, attracting more ants and thus more pheromone. Ultimately, a positive feedback loop is created until the ants converge on the best path (the solution).

The ACO algorithm works similarly. When creating a short form of N items, ACO first randomly samples N items from the full scale (the N “paths”). Next, the performance of that short form is evaluated by one or more statistical measures, such as the association with another variable, reliability, and/or factor model fit. Based on these measures, if the sampled items performed well, their probability weight is increased (the amount of “pheromone”). Over repeated iterations, the items that led to good performance will become increasingly weighted for selection, creating a positive feedback loop that eventually converges to a final solution. Thus, ACO, like the ants, does not search and test all possible solutions. Instead, it uses some criterion for evaluating the items and then uses this to update the probability of selecting those items.
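The following is a deliberately simplified sketch of that sample-evaluate-reinforce loop. The criterion function is assumed to be supplied by the user (e.g., a reliability estimate or a fit index computed on the candidate subset), and the pheromone-update and evaporation rules here are illustrative only; published implementations (e.g., Leite et al., 2008) differ in these details.

    # Simplified ant-colony-style item selection for a short form.
    # `criterion` scores a candidate item subset (higher = better); it is
    # a user-supplied assumption, not part of any published ACO package.
    import random

    def weighted_sample(n_items, weights, k):
        # Weighted sampling of k distinct item indices (the k "paths").
        pool, wts, chosen = list(range(n_items)), list(weights), []
        for _ in range(k):
            i = random.choices(range(len(pool)), weights=wts, k=1)[0]
            chosen.append(pool.pop(i))
            wts.pop(i)
        return chosen

    def aco_short_form(n_items, form_size, criterion,
                       n_iterations=200, ants_per_iter=20, evaporation=0.95):
        pheromone = [1.0] * n_items                 # initial weights
        best_subset, best_score = None, float("-inf")
        for _ in range(n_iterations):
            for _ant in range(ants_per_iter):
                subset = weighted_sample(n_items, pheromone, form_size)
                score = criterion(subset)
                if score > best_score:
                    best_subset, best_score = subset, score
                # Deposit pheromone on chosen items, scaled by performance.
                for i in subset:
                    pheromone[i] += max(score, 0.0)
            pheromone = [p * evaporation for p in pheromone]  # evaporate
        return best_subset, best_score

As the next paragraph stresses, convergence of such a loop says nothing about content adequacy: the selected items still require substantive inspection.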

ACO is an automated procedure, but this does not mean that researchers should accept its results automatically. Foremost, ACO does not guarantee that the shortened scale has satisfactory content ( Kruyen et al., 2013 ). Therefore, the items that comprise the final scale should always be examined to see if their content is sufficient.

We also strongly recommend that authors using ACO be explicit about the specifications of the algorithm. Authors should always report (a) what criteria they are using to evaluate short form performance and (b) how these are mathematically translated into pheromone weights. Authors should also report all the other relevant details of conducting the algorithm (e.g., the software package, the number of total iterations). In the Supplementary Material , we provide further details and a full R software walkthrough. For more information, the reader can consult additional resources ( Marcoulides and Drezner, 2003 ; Leite et al., 2008 ; Janssen et al., 2015 ; Olaru et al., 2015 ; Schroeders et al., 2016 ).

Discussion

Measurement in psychology comes in many forms, and for many constructs one of the best methods is the psychological Likert scale. A recent review suggests that, in the span of just a few years, dozens of scales are added to the psychological science literature (Colquitt et al., 2019). Thus, psychologists must have a clear understanding of the proper theory and procedures for scale creation. The present article aims to increase this clarity by offering a selective review of Likert scale development advances over the past 25 years. Classic papers delineating the process of Likert scale development have proven immensely useful to the field (Clark and Watson, 1995, 2019; Hinkin, 1998), but it is difficult to do justice to this whole topic in a single paper, especially as methodological developments accumulate.

Though this paper reviewed past work, we end with some notes about the future. First, as methods progress they become more sophisticated, but sophistication should not be mistaken for accuracy. This applies even to some of the techniques discussed here, such as ACO, which has crucial limitations (e.g., it depends on which predicted external variable is chosen and requires a subjective examination of content sufficiency).

Second, we are concerned with the problem of construct proliferation , as are other social scientists (e.g., Shaffer et al., 2016 ; Colquitt et al., 2019 ). Solutions to this problem include paying close attention to the constructs that have already been established in the literature, as well as engaging in a critical and honest reflection on whether one’s target construct is meaningfully different. In cases of scale development, the developer should provide sufficient arguments for these two criteria: the construct’s (a) importance and (b) distinctiveness. Although scholars are quite adept at theoretically distinguishing a “new” construct from a prior one ( Harter and Schmidt, 2008 ), empirical methods should only be enlisted after this has been established.

Finally, as psychological theory progresses, it tends to become more complex. One issue with this increasing complexity is the danger of creating incoherent constructs. Borsboom (2005, p. 33) provides an example of a scale with three items: (1) “I would like to be a military leader,” (2) “.10 sqrt (0.05+0.05) = …,” and (3) “I am over six feet tall.” Although no common construct exists among these items, the scale can certainly be scored and will probably even be reliable, as the random error variance will be low (Borsboom, 2005). Therefore, measures of such incoherent constructs can display good psychometric properties, and psychologists cannot rely merely on empirical evidence to justify them. Thus, the challenges of scale development, present and future, are equally empirical and theoretical.

Author Contributions

LT conceived the idea for the manuscript and provided feedback and editing. AJ conducted most of the literature review and wrote much of the manuscript. VN assisted with the literature review and contributed writing. All authors contributed to the article and approved the submitted version.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

1 We also do not include the topic of measurement invariance, as this is typically done to validate a Likert scale with regard to a new population.

2 Nor is it true that a paper is a significant advance just because it has been published. A good example is Westen and Rosenthal's (2003) two coefficients for quantifying construct validity, which were shown to be severely limited by Smith (2005).

3 Alpha also assumes normal and uncorrelated errors.

4 There are several versions of omega, such as hierarchical omega for multidimensional scales. McNeish (2018) provides an exceptional discussion of alternatives to alpha, including software tutorials in R and Excel.

5 The word “information” is used in two ways in this section: as the formal IRT statistic and in the general, everyday sense (“We don't have enough information.”). For the technical term we use information IRT; the latter we leave simply as “information.”

6 One important distinction is between short scales and short forms. Short forms are a type of short scale, but of course not all short scales were taken from a larger measure. In this section, we are concerned only with the process of developing a short form from an original scale.

Supplementary Material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2021.637547/full#supplementary-material

  • American Educational Research Association, American Psychological Association, and National Council on Measurement in Education (1985). Standards for Educational and Psychological Testing. Washington, DC: American Educational Research Association. [ Google Scholar ]
  • American Educational Research Association, American Psychological Association, National Council on Measurement in Education, and Joint Committee on Standards for Educational and Psychological Testing (AERA, APA, & NCME) (2014). Standards for Educational and Psychological Testing. Washington, DC: American Educational Research Association. [ Google Scholar ]
  • Anderson J. C., Gerbing D. W. (1991). Predicting the performance of measures in a confirmatory factor analysis with a pretest assessment of their substantive validities. J. Appl. Psychol. 76 732–740. 10.1037/0021-9010.76.5.732 [ CrossRef ] [ Google Scholar ]
  • Baer J. D., Baldi S., Cook S. L. (2006). The Literacy of America’s College Students. Washington, DC: American Institutes for Research. [ Google Scholar ]
  • Barchard K. A. (2012). Examining the reliability of interval level data using root mean square differences and concordance correlation coefficients. Psychol. Methods 17 294–308. 10.1037/a0023351 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Baumeister R. F., Vohs K. D., Funder D. C. (2007). Psychology as the science of self-reports and finger movements: whatever happened to actual behavior? Perspect. Psychol. Sci. 2 396–403. 10.1111/j.1745-6916.2007.00051.x [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Benjamin R. G. (2012). Reconstructing readability: recent developments and recommendations in the analysis of text difficulty. Educ. Psychol. Rev. 24 63–88. 10.1007/s10648-011-9181-8 [ CrossRef ] [ Google Scholar ]
  • Binning J. F., Barrett G. V. (1989). Validity of personnel decisions: a conceptual analysis of the inferential and evidential bases. J. Appl. Psychol. 74 478–494. 10.1037/0021-9010.74.3.478 [ CrossRef ] [ Google Scholar ]
  • Borsboom D. (2005). Measuring the Mind: Conceptual Issues in Contemporary Psychometrics. Cambridge: Cambridge University Press. [ Google Scholar ]
  • Borsboom D., Mellenbergh G. J., van Heerden J. (2004). The concept of validity. Psychol. Rev. 111 1061–1071. 10.1037/0033-295X.111.4.1061 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Calderón J. L., Morales L. S., Liu H., Hays R. D. (2006). Variation in the readability of items within surveys. Am. J. Med. Qual. 21 49–56. 10.1177/1062860605283572 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Cizek G. J. (2012). Defining and distinguishing validity: interpretations of score meaning and justifications of test use. Psychol. Methods 17 31–43. 10.1037/a0026975 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Clark L. A., Watson D. (1995). Constructing validity: basic issues in objective scale development. Psychol. Assess. 7 309–319. 10.1037/1040-3590.7.3.309 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Clark L. A., Watson D. (2019). Constructing validity: new developments in creating objective measuring instruments. Psychol. Assess. 31 : 1412 . 10.1037/pas0000626 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Colquitt J. A., Sabey T. B., Rodell J. B., Hill E. T. (2019). Content validation guidelines: evaluation criteria for definitional correspondence and definitional distinctiveness. J. Appl. Psychol. 104 1243–1265. 10.1037/apl0000406 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Cooksey R. W., Soutar G. N. (2006). Coefficient beta and hierarchical item clustering: an analytical procedure for establishing and displaying the dimensionality and homogeneity of summated scales. Organ. Res. Methods 9 78–98. 10.1177/1094428105283939 [ CrossRef ] [ Google Scholar ]
  • Credé M., Tynan M. C., Harms P. D. (2017). Much ado about grit: a meta-analytic synthesis of the grit literature. J. Pers. Soc. Psychol. 113 492–511. 10.1093/oxfordjournals.bmb.a072872 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Cronbach L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika 16 297–334. 10.1007/BF02310555 [ CrossRef ] [ Google Scholar ]
  • Cronbach L. J., Meehl P. E. (1955). Construct validity in psychological tests. Psychol. Bull. 52 281–302. 10.1037/h0040957 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Cronbach L. J., Shavelson R. J. (2004). My current thoughts on coefficient alpha and successor procedures. Educ. Psychol. Meas. 64 391–418. 10.1177/0013164404266386 [ CrossRef ] [ Google Scholar ]
  • DeMars C. (2010). Item Response Theory. Oxford: Oxford University Press. [ Google Scholar ]
  • Deneubourg J. L., Pasteels J. M., Verhaeghe J. C. (1983). Probabilistic behaviour in ants: a strategy of errors? J. Theor. Biol. 105 259–271. 10.1016/s0022-5193(83)80007-1 [ CrossRef ] [ Google Scholar ]
  • DeSimone J. A. (2015). New techniques for evaluating temporal consistency. Organ. Res. Methods 18 133–152. 10.1177/1094428114553061 [ CrossRef ] [ Google Scholar ]
  • Doak C., Doak L., Friedell G., Meade C. (1998). Improving comprehension for cancer patients with low literacy skills: strategies for clinicians. CA Cancer J. Clin. 48 151–162. 10.3322/canjclin.48.3.151 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Dorigo M. (1992). Optimization, Learning, and Natural Algorithms . Ph.D. thesis. Milano: Politecnico di Milano. [ Google Scholar ]
  • Dorigo M., Stützle T. (2004). Ant Colony Optimization. Cambridge, MA: MIT Press. [ Google Scholar ]
  • Dunn T. J., Baguley T., Brunsden V. (2014). From alpha to omega: a practical solution to the pervasive problem of internal consistency estimation. Br. J. Psychol. 105 399–412. 10.1111/bjop.12046 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Edelen M. O., Reeve B. B. (2007). Applying item response theory (IRT) modeling to questionnaire development, evaluation, and refinement. Qual. Life Res. 16 ( Suppl. 1 ) 5–18. 10.1007/s11136-007-9198-0 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Ferrando P. J., Lorenzo-Seva U. (2018). Assessing the quality and appropriateness of factor solutions and factor score estimates in exploratory item factor analysis. Educ. Pyschol. Meas. 78 762–780. 10.1177/0013164417719308 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Ferrando P. J., Lorenzo-Seva U. (2019). An external validity approach for assessing essential unidimensionality in correlated-factor models. Educ. Psychol. Meas. 79 437–461. 10.1177/0013164418824755 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Graham J. M. (2006). Congeneric and (essentially) tau-equivalent estimates of score reliability: what they are and how to use them. Educ. Psychol. Meas. 66 930–944. 10.1177/0013164406288165 [ CrossRef ] [ Google Scholar ]
  • Green S. B. (2003). A coefficient alpha for test-retest data. Psychol. Methods 8 88–101. 10.1037/1082-989X.8.1.88 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Hardy B., Ford L. R. (2014). It’s not me, it’s you: miscomprehension in surveys. Organ. Res. Methods 17 138–162. 10.1177/1094428113520185 [ CrossRef ] [ Google Scholar ]
  • Harter J. K., Schmidt F. L. (2008). Conceptual versus empirical distinctions among constructs: Implications for discriminant validity. Ind. Organ. Psychol. 1 36–39. 10.1111/j.1754-9434.2007.00004.x [ CrossRef ] [ Google Scholar ]
  • Haynes S. N., Lench H. C. (2003). Incremental validity of new clinical assessment measures. Psychol. Assess. 15 456–466. 10.1037/1040-3590.15.4.456 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Haynes S. N., Richard D. C. S., Kubany E. S. (1995). Content validity in psychological assessment: a functional approach to concepts and methods. Psychol. Assess. 7 238–247. 10.1037/1040-3590.7.3.238 [ CrossRef ] [ Google Scholar ]
  • Henrich J., Heine S. J., Norenzayan A. (2010). The weirdest people in the world? Behav. Brain Sci. 33 61–135. 10.1017/S0140525X0999152X [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Henson R. K., Roberts J. K. (2006). Use of exploratory factor analysis in published research: common errors and some comment on improved practice. Educ. Psychol. Meas. 66 393–416. 10.1177/0013164405282485 [ CrossRef ] [ Google Scholar ]
  • Hinkin T. R. (1998). A brief tutorial on the development of measures for use in survey questionnaires. Organ. Res. Methods 1 104–121. 10.1177/109442819800100106 [ CrossRef ] [ Google Scholar ]
  • Hinkin T. R., Tracey J. B. (1999). An analysis of variance approach to content validation. Organ. Res. Methods 2 175–186. 10.1177/109442819922004 [ CrossRef ] [ Google Scholar ]
  • Hood S. B. (2009). Validity in psychological testing and scientific realism. Theory Psychol. 19 451–473. 10.1177/0959354309336320 [ CrossRef ] [ Google Scholar ]
  • Hunsley J., Meyer G. J. (2003). The incremental validity of psychological testing and assessment: conceptual, methodological, and statistical issues. Psychol. Assess. 15 446–455. 10.1037/1040-3590.15.4.446 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Janssen A. B., Schultze M., Grotsch A. (2015). Following the ants: development of short scales for proactive personality and supervisor support by ant colony optimization. Eur. J. Psychol. Assess. 33 409–421. 10.1027/1015-5759/a000299 [ CrossRef ] [ Google Scholar ]
  • Johanson G. A., Brooks G. P. (2010). Initial scale development: sample size for pilot studies. Educ. Psychol. Meas. 70 394–400. 10.1177/0013164409355692 [ CrossRef ] [ Google Scholar ]
  • Kane M. T. (1992). An argument-based approach to validity in evaluation. Psychol. Bull. 112 527–535. 10.1177/1356389011410522 [ CrossRef ] [ Google Scholar ]
  • Kelley K. (2016). MBESS (Version 4.0.0) [Computer Software and Manual]. [ Google Scholar ]
  • Kelley K., Pornprasertmanit S. (2016). Confidence intervals for population reliability coefficients: Evaluation of methods, recommendations, and software for composite measures. Psychological Methods 21 69–92. 10.1037/a0040086 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Kelley T. L. (1927). Interpretation of Educational Measurements. New York, NY: World Book Company. [ Google Scholar ]
  • Knowles E. S., Condon C. A. (2000). Does the rose still smell as sweet? Item variability across test forms and revisions. Psychol. Assess. 12 245–252. 10.1037/1040-3590.12.3.245 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Kruyen P. M., Emons W. H. M., Sijtsma K. (2013). On the shortcomings of shortened tests: a literature review. Int. J. Test. 13 223–248. 10.1080/15305058.2012.703734 [ CrossRef ] [ Google Scholar ]
  • Landy F. J. (1986). Stamp collecting versus science: validation as hypothesis testing. Am. Psychol. 41 1183–1192. 10.1037/0003-066X.41.11.1183 [ CrossRef ] [ Google Scholar ]
  • Lawshe C. H. (1985). Inferences from personnel tests and their validity. J. Appl. Psychol. 70 237–238. 10.1037/0021-9010.70.1.237 [ CrossRef ] [ Google Scholar ]
  • Leite W. L., Huang I.-C., Marcoulides G. A. (2008). Item selection for the development of short forms of scales using an ant colony optimization algorithm. Multivariate Behav. Res. 43 411–431. 10.1080/00273170802285743 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Li X., Sireci S. G. (2013). A new method for analyzing content validity data using multidimensional scaling. Educ. Psychol. Meas. 73 365–385. 10.1177/0013164412473825 [ CrossRef ] [ Google Scholar ]
  • Likert R. (1932). A technique for the measurement of attitudes. Arch. Psychol. 140 5–53. [ Google Scholar ]
  • Lissitz R. W., Samuelson K. (2007). A suggested change in the terminology and emphasis regarding validity and education. Educ. Res. 36 437–448. 10.3102/0013189X0731 [ CrossRef ] [ Google Scholar ]
  • Locke E. A. (2012). Construct validity vs. concept validity. Hum. Resour. Manag. Rev. 22 146–148. 10.1016/j.hrmr.2011.11.008 [ CrossRef ] [ Google Scholar ]
  • Loevinger J. (1954). The attenuation paradox in test theory. Pschol. Bull. 51 493–504. 10.1037/h0058543 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Lord F. M., Novick M. R. (1968). Statistical Theories of Mental Test Scores. Reading, MA: Addison-Wesley. [ Google Scholar ]
  • Lubin B., Collins J. E., Seever M., Van Whitlock R., Dennis A. J. (1990). Relationships among readability, reliability, and validity in a self-report adjective check list. Psychol. Assess. J. Consult. Clin. Psychol. 2 256–261. 10.1037/1040-3590.2.3.256 [ CrossRef ] [ Google Scholar ]
  • Mackenzie S. B. (2003). The dangers of poor construct conceptualization. J. Acad. Mark. Sci. 31 323–326. 10.1177/0092070303254130 [ CrossRef ] [ Google Scholar ]
  • Marcoulides G. A., Drezner Z. (2003). Model specification searches using ant colony optimization algorithms. Struct. Equ. Modeling 10 154–164. 10.1207/S15328007SEM1001 [ CrossRef ] [ Google Scholar ]
  • McDonald R. (1999). Test Theory: A Unified Treatment. Mahwah, NJ: Lawrence Erlbaum. [ Google Scholar ]
  • McNeish D. (2018). Thanks coefficient alpha, we’ll take it from here. Psychol. Methods 23 412–433. 10.1037/met0000144 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • McPherson J., Mohr P. (2005). The role of item extremity in the emergence of keying-related factors: an exploration with the life orientation test. Psychol. Methods 10 120–131. 10.1037/1082-989X.10.1.120 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Mellenbergh G. J. (1996). Measurement precision in test score and item response models. Psychol. Methods 1 293–299. 10.1037/1082-989X.1.3.293 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Messick S. (1980). Test validity and the ethics of assessment. Am. Psychol. 35 1012–1027. 10.1037/0003-066X.35.11.1012 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Messick S. (1989). “ Validity ,” in Educational Measurement , 3rd Edn, ed. Linn R. L. (New York, NY: American Council on Education and Macmillan; ), 13–103. [ Google Scholar ]
  • Newton P. E. (2012). Questioning the consensus definition of validity. Measurement 10 110–122. 10.1080/15366367.2012.688456 [ CrossRef ] [ Google Scholar ]
  • Newton P. E., Shaw S. D. (2013). Standards for talking and thinking about validity. Psychol. Methods 18 301–319. 10.1037/a0032969 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Novick M. R., Lewis C. (1967). Coefficient alpha and the reliability of composite measurements. Psychometrika 32 1–13. 10.1007/BF02289400 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Olaru G., Witthöft M., Wilhelm O. (2015). Methods matter: testing competing models for designing short-scale big-five assessments. J. Res. Pers. 59 56–68. 10.1016/j.jrp.2015.09.001 [ CrossRef ] [ Google Scholar ]
  • Olderbak S., Wilhelm O., Olaru G., Geiger M., Brenneman M. W., Roberts R. D. (2015). A psychometric analysis of the reading the mind in the eyes test: toward a brief form for research and applied settings. Front. Psychol. 6 : 1503 . 10.3389/fpsyg.2015.01503 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Parrigon S., Woo S. E., Tay L., Wang T. (2017). CAPTION-ing the situation: a lexically-derived taxonomy of psychological situation characteristics. J. Pers. Soc. Psychol. 112 642–681. 10.1037/pspp0000111 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Permut S., Fisher M., Oppenheimer D. M. (2019). TaskMaster: a tool for determining when subjects are on task. Adv. Methods Pract. Psychol. Sci. 2 188–196. 10.1177/2515245919838479 [ CrossRef ] [ Google Scholar ]
  • Peter S. C., Whelan J. P., Pfund R. A., Meyers A. W. (2018). A text comprehension approach to questionnaire readability: an example using gambling disorder measures. Psychol. Assess. 30 1567–1580. 10.1037/pas0000610 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Podsakoff P. M., Mackenzie S. B., Podsakoff N. P. (2016). Recommendations for creating better concept definitions in the organizational, behavioral, and social sciences. Organ. Res. Methods 19 159–203. 10.1177/1094428115624965 [ CrossRef ] [ Google Scholar ]
  • Ravens-Sieberer U., Devine J., Bevans K., Riley A. W., Moon J., Salsman J. M., et al. (2014). Subjective well-being measures for children were developed within the PROMIS project: Presentation of first results. J. Clin. Epidemiol. 67 207–218. 10.1016/j.jclinepi.2013.08.018 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Raykov T. (1997). Scale reliability, Cronbach’s coefficient alpha, and violations of essential tau-equivalence with fixed congeneric components. Multivariate Behav. Res. 32 329–353. 10.1207/s15327906mbr3204_2 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Raykov T., Marcoulides G. A., Tong B. (2016). Do two or more multicomponent instruments measure the same construct? Testing construct congruence using latent variable modeling. Educ. Psychol. Meas. 76 873–884. 10.1177/0013164415604705 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Raykov T., Pohl S. (2013). On studying common factor variance in multiple-component measuring instruments. Educ. Psychol. Meas. 73 191–209. 10.1177/0013164412458673 [ CrossRef ] [ Google Scholar ]
  • Reise S. P., Ainsworth A. T., Haviland M. G. (2005). Item response theory: fundamentals, applications, and promise in psychological research. Curr. Dir. Psychol. Sci. 14 95–101. 10.1016/B978-0-12-801504-9.00010-6 [ CrossRef ] [ Google Scholar ]
  • Reise S. P., Waller N. G., Comrey A. L. (2000). Factor analysis and scale revision. Psychol. Assess. 12 287–297. 10.1037/1040-3590.12.3.287 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Revelle W. (1978). ICLUST: a cluster analytic approach for exploratory and confirmatory scale construction. Behav. Res. Methods Instrum. 10 739–742. 10.3758/bf03205389 [ CrossRef ] [ Google Scholar ]
  • Revelle W. (2008). psych: Procedures for Personality and Psychological Research. (R package version 1.0-51). [ Google Scholar ]
  • Revelle W., Condon D. M. (2019). Reliability from α to ω: a tutorial. Psychol. Assess. 31 1395–1411. 10.1037/pas0000754 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Schmidt F. L., Le H., Ilies R. (2003). Beyond alpha: an empirical examination of the effects of different sources of measurement error on reliability estimates for measures of individual differences constructs. Psychol. Methods 8 206–224. 10.1037/1082-989X.8.2.206 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Schroeders U., Wilhlem O., Olaru G. (2016). Meta-heuristics in short scale construction: ant colony optimization and genetic algorithm. PLoS One 11 : e0167110 . 10.5157/NEPS [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Sechrest L. (1963). Incremental validity: a recommendation. Educ. Psychol. Meas. 23 153–158. 10.1177/001316446302300113 [ CrossRef ] [ Google Scholar ]
  • Sellbom M., Tellegen A. (2019). Factor analysis in psychological assessment research: common pitfalls and recommendations. Psychol. Assess. 31 1428–1441. 10.1037/pas0000623 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Shaffer J. A., DeGeest D., Li A. (2016). Tackling the problem of construct proliferation: a guide to assessing the discriminant validity of conceptually related constructs. Organ. Res. Methods 19 80–110. 10.1177/1094428115598239 [ CrossRef ] [ Google Scholar ]
  • Sharpe D. (2013). Why the resistance to statistical innovations? Bridging the communication gap. Psychol. Methods 18 572–582. 10.1037/a0034177 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Sijtsma K. (2009). On the use, the misuse, and the very limited usefulness of cronbach’s alpha. Psychometrika 74 107–120. 10.1007/s11336-008-9101-0 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Simms L. J., Zelazny K., Williams T. F., Bernstein L. (2019). Does the number of response options matter? Psychometric perspectives using personality questionnaire data. Psychol. Assess. 31 557–566. 10.1037/pas0000648.supp [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Smith G. T. (2005). On construct validity: issues of method and measurement. Psychol. Assess. 17 396–408. 10.1037/1040-3590.17.4.396 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Smith G. T., Fischer S., Fister S. M. (2003). Incremental validity principles in test construction. Psychol. Assess. 15 467–477. 10.1037/1040-3590.15.4.467 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Tay L., Jebb A. T. (2018). Establishing construct continua in construct validation: the process of continuum specification. Adv. Methods Pract. Psychol. Sci. 1 375–388. 10.1177/2515245918775707 [ CrossRef ] [ Google Scholar ]
  • Thorndike E. L. (1904). An Introduction to the Theory of Mental and Social Measurements. New York, NY: Columbia University Press, 10.1037/13283-000 [ CrossRef ] [ Google Scholar ]
  • U.S. Census Bureau (2014). Educational Attainment in the United States: 2014. Washington, DC: U.S. Census Bureau. [ Google Scholar ]
  • Vogt D. S., King D. W., King L. A. (2004). Focus groups in psychological assessment: enhancing content validity by consulting members of the target population. Psychol. Assess. 16 231–243. 10.1037/1040-3590.16.3.231 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Weijters B., De Beuckelaer A., Baumgartner H. (2014). Discriminant validity where there should be none: positioning same-scale items in separated blocks of a questionnaire. Appl. Psychol. Meas. 38 450–463. 10.1177/0146621614531850 [ CrossRef ] [ Google Scholar ]
  • Weng L. J. (2004). Impact of the number of response categories and anchor labels on coefficient alpha and test-retest reliability. Educ. Psychol. Meas. 64 956–972. 10.1177/0013164404268674 [ CrossRef ] [ Google Scholar ]
  • Westen D., Rosenthal R. (2003). Quantifying construct validity: two simple measures. J. Pers. Soc. Psychol. 84 608–618. 10.1037/0022-3514.84.3.608 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Zhang X., Savalei V. (2016). Improving the factor structure of psychological scales: the expanded format as an alternative to the Likert scale format. Educ. Psychol. Meas. 76 357–386. 10.1177/0013164415596421 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Zhang Z., Yuan K. H. (2016). Robust coefficients alpha and omega and confidence intervals with outlying observations and missing data: methods and software. Educ. Psychol. Meas. 76 387–411. 10.1177/0013164415594658 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Zijlmans E. A. O., van der Ark L. A., Tijmstra J., Sijtsma K. (2018). Methods for estimating item-score reliability. Appl. Psychol. Meas. 42 553–570. 10.1177/0146621618758290 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Zinbarg R. E., Revelle W., Yovel I., Li W. (2005). Cronbach’s, α Revelle’s β and McDonald’s ωH: their relations with each other and two alternative conceptualizations of reliability. Psychometrika 70 123–133. 10.1007/s11336-003-0974-7 [ CrossRef ] [ Google Scholar ]


What is a Likert scale?

In this guide, we'll cover everything you need to know about Likert scales: what a Likert scale is, how it works, and how you can use Likert scale questions.

Understanding customer sentiment towards your brand, product or service is complex. You have to account for attitudes, opinions and perceptions, all of which influence how much a customer likes (or dislikes) what you offer.

You need a more comprehensive way to measure customer sentiment, and one of the best ways to do so is with Likert scale questions.


Likert scale definition

A Likert scale, or rating scale, is a measurement method used in research to evaluate attitudes, opinions and perceptions.

Likert scale questions are highly adaptable and can be used across a range of topics, from customer satisfaction surveys, to employee engagement surveys, to market research.

For each question or statement, subjects choose from a range of answer options. For example:

  • Strongly agree
  • Agree
  • Neither agree nor disagree
  • Disagree
  • Strongly disagree

In studies where answer options are coded numerically, each response is assigned a value, with 'Strongly agree' coded 5 and 'Strongly disagree' coded 1 (or the reverse). In the above example, the options would be coded 5, 4, 3, 2 and 1.
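
As a minimal sketch of this coding step, assuming Python with pandas and a hypothetical set of responses (none of these names come from the article itself):

```python
import pandas as pd

# Hypothetical raw answers to a single Likert item
responses = pd.Series([
    "Strongly agree", "Agree", "Neither agree nor disagree",
    "Disagree", "Strongly disagree", "Agree",
])

# Numeric codes: 5 = Strongly agree ... 1 = Strongly disagree
codes = {
    "Strongly agree": 5,
    "Agree": 4,
    "Neither agree nor disagree": 3,
    "Disagree": 2,
    "Strongly disagree": 1,
}

scores = responses.map(codes)
print(scores.tolist())  # [5, 4, 3, 2, 1, 4]
```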

Some surveys use a seven-point Likert scale, with 1 being 'Strongly agree' and 7 being 'Strongly disagree' (or the reverse), and a neutral option such as 'Neither agree nor disagree' in the middle.

As well as judging agreement with positive and negative statements, a Likert scale survey question can judge frequency, quality, or feelings of importance. You could use a Likert scale to understand how customers view product features, or which product upgrades they'd most like to see next.

The granularity it provides over simple yes or no responses means you can uncover degrees of opinion, giving an accurate and representative understanding of feedback.

Here are a few Likert scale examples:

[Image: Likert scale examples]

Benefits of using likert scale questions

Likert scale questions have several benefits, especially if you want to align data to a specific scale. Here are a few:

1.   They’re easy to understand

A Likert scale is easy to understand: respondents simply rate their preference on whichever scale you choose.

For example, they simply select the response that matches how strongly they agree or disagree. This is sometimes referred to as a symmetric agree-disagree scale.

A Likert scale is also easy to analyze, as responses can be collated numerically and filtered.

2.   Ideal for single topic surveys

A Likert scale question is ideal for single-topic surveys, as the data can be easily analyzed to judge sentiment or feelings towards particular things.

NPS surveys often use a Likert-style scale to judge sentiment towards customer service.

Rather than ranging from strongly agree to strongly disagree, you'd use 'highly satisfied' to 'highly dissatisfied'.

You can use Likert scales to judge customers' feelings about specific parts of your service, product or brand, then follow up with a more detailed study.

3.   Likert scale questionnaires are versatile

Likert scale questionnaires help you evaluate preferences, sentiment, perspectives, behaviors or opinions.

You can implement them in a standard questionnaire, or use site intercepts on specific pages. You could have a Likert scale questionnaire pop up after a webinar to get feedback on content and ideas.

4.   They don’t force specific responses

Rather than forcing extreme response categories, e.g. giving respondents only two options when discussing polarizing topics, Likert scales provide flexibility.

However, for difficult topics, respondents may still feel they have to answer a certain way to avoid being seen as 'extreme'. Just remind them that survey responses are anonymous.

5.   Likert scale questions are great for sentiment analysis

A Likert scale is effective when trying to assess sentiment towards your business, brand, product or service.

Likert scale responses can help you judge sentiment and, with follow-up questions, the reasons for it. For example, you could collate data in a statistical analysis platform, filtering responses to see what percentage of customers are satisfied versus those that aren't.

You could go a step further and break the percentages down, e.g. those who are highly satisfied versus those who are just satisfied. How can you convert those customers into true evangelists?
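
A minimal sketch of that kind of breakdown, assuming Python with pandas and hypothetical response data (the column name and answer labels are made up for illustration):

```python
import pandas as pd

# Hypothetical answers to "How satisfied are you with our service?"
df = pd.DataFrame({"satisfaction": [
    "Highly satisfied", "Satisfied", "Neutral", "Satisfied",
    "Dissatisfied", "Highly satisfied", "Satisfied",
]})

# Percentage of respondents selecting each option
pct = df["satisfaction"].value_counts(normalize=True) * 100
print(pct.round(1))

# Break the satisfied group down: highly satisfied vs just satisfied
overall = pct.get("Highly satisfied", 0) + pct.get("Satisfied", 0)
print(f"Satisfied overall: {overall:.1f}%")
```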

It’s important you only use a likert scale questionnaire when asking about a singular topic, otherwise you risk confusing respondents and damaging the legitimacy of your study.

6.   They keep respondents happy

One of the pitfalls of conventional survey design is that researchers can use overly broad questions limited to yes or no answers. These sorts of questions can frustrate respondents, as they give them no real way to provide context or accurate answers, leading them to rush through surveys and affecting the quality of your data.

What are the limitations of Likert scales?

While Likert scaling is highly effective at measuring opinions and sentiments, it does have some limitations:

1.   Response choices limit real understanding

While Likert scales help determine sentiment, they aren't as effective at helping you understand why people feel a certain way. Nor do they reveal how respondents interpret the gap between adjacent choices, whether positive or negative.

For example, a respondent might 'slightly agree' with a statement, but why? What made them feel that way, and what influences their responses? This kind of granularity can only be achieved with qualitative methods.

With this in mind, to increase the accuracy of your survey data, it's worth running any Likert-based questionnaires in conjunction with qualitative research methods.

2.   Respondents might focus on one side of the sentiment

Depending on how questions are written, respondents might focus on one side of the scale. If they feel their answers might affect their reputation or lifestyle, or portray them negatively, they'll tend to pick positive responses.

Also, depending on the topic, respondents may be less likely to choose the extreme ends of the Likert scale, instead simply agreeing, disagreeing or remaining neutral.

3.   Previous questions can influence responses

With any quantitative survey, respondents can get into a ‘rhythm’ of answering questions. The result is that they start to respond a certain way (this can be exacerbated by poor questioning, long surveys and/or flicking between themes).

When to use a Likert scale question

A Likert scale question works best when assessing responses based on variables such as sentiment, satisfaction, quality, importance, or likelihood.

For example, you might ask a respondent: "How would you rate the quality of our products?", and provide a response scale ranging from, say, 'very poor' to 'very good'.

Respondents get a range across which they can express their view, rather than a simple yes or no answer, which is often insufficient. Ultimately, you'll use Likert scales to measure sentiment about something in more detail.


How to write Likert scale survey questions

When writing Likert scale questions, there are several things to consider to ensure you get accurate responses:

1.   Keep them simple

The best way to get accurate results is to ask simple, specific questions. Be crystal clear about what you're asking respondents to judge, whether it's their preference, opinion or something else.

For example, asking "How satisfied are you with our service?" and providing a standard scale from very satisfied to very dissatisfied leaves no room for confusion.

2.   Make sure they’re consistent

Respondents should fully understand the Likert scale they're recording answers against. This means the answer options on either side of the scale should mirror each other.

For example, if you say “completely agree” at one extreme, the other extreme should be “completely disagree”.

3.   Use appropriate scaling – unipolar scales and bipolar scales

Any Likert scale will use either a unipolar or a bipolar scale.

A bipolar scale should be used when you want respondents to answer with either an extremely positive or negative opinion. Sometimes, an even-point scale is used, where the middle option of “neither agree nor disagree” or “neutral” is unavailable. This is sometimes referred to as a “forced choice” method.

A unipolar scale works in the same way, but it starts from zero at one end, while an extreme is at the other. For example, if you ask how appealing your product is, your unipolar responses would go from “not appealing at all” to “extremely appealing”.

You should also generally keep your scales odd, since an odd number of values ensures there's a midpoint, and limit them to five or seven points.

4.   Don’t make statements, ask questions

Creating an effective Likert scale means asking questions rather than presenting statements. This way, you avoid bias.

This is because people tend to automatically agree with positive or established statements, or unconsciously respond in a positive way (acquiescence bias). This can damage your study.

Asking questions rather than making statements encourages less biased responses because respondents have to think about their answers.

For example, asking “How satisfied are you with the quality of this service?” provides respondents with a chance to answer truthfully.

5.   Switch your scale points

Switching your rating scale prevents respondents from falling into a rhythm and giving biased responses.

For example, if your scale starts at 1, 'completely agree', and ends with 5, 'completely disagree', then for a few questions you switch these around so that 1 is 'completely disagree' and 5 is 'completely agree'. This keeps respondents on their toes and engaged with the survey. Remember to reverse-code the switched items before analysis, so that high scores always mean the same thing.
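
A minimal sketch of that reverse-coding step, assuming Python with pandas, a 5-point scale, and hypothetical item names:

```python
import pandas as pd

# Hypothetical coded responses; q2 was presented with its scale reversed
df = pd.DataFrame({
    "q1": [1, 2, 5, 4],  # 1 = completely agree ... 5 = completely disagree
    "q2": [5, 4, 1, 2],  # reversed: 1 = completely disagree ... 5 = completely agree
})

# Reverse-code q2 on a 5-point scale: recoded value = (5 + 1) - original
df["q2"] = 6 - df["q2"]
print(df)  # now 1 means 'completely agree' on every item
```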

Here are some Likert scale examples:

[Image: Likert scale survey question examples]

How to analyze Likert scale survey data

Unlike with many survey types, you can't simply use the mean as a measure of central tendency: the 'average' of responses such as 'strongly agree' and 'disagree' has no clear meaning.

Instead, when analyzing Likert scale data, look at the most frequent response (the mode) to understand the overall sentiment of respondents.

For example, 87% ‘strongly agree’ that you offer a good service.

You can also compare the percentages for each response to see where respondents ultimately fall.

This is incredibly useful when you want to nurture customers — perhaps there’s something you can do for those who answered ‘agree’ rather than ‘strongly agree’.

The easiest way to present Likert scale survey results is a simple bar or pie chart showing the distribution of responses across the answer options.

[Image: bar chart showing the distribution of response types]
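
If you want to tabulate and chart a single item yourself, here is a minimal sketch, assuming Python with pandas and matplotlib and entirely hypothetical responses:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical responses to one Likert item
responses = pd.Series([
    "Strongly agree", "Agree", "Agree", "Neutral",
    "Strongly agree", "Disagree", "Agree",
])

# Count responses in scale order rather than by frequency
order = ["Strongly agree", "Agree", "Neutral", "Disagree", "Strongly disagree"]
counts = responses.value_counts().reindex(order, fill_value=0)

print("Most frequent response:", counts.idxmax())
print((counts / counts.sum() * 100).round(1))  # percentage per option

counts.plot(kind="bar")  # simple bar chart of the distribution
plt.ylabel("Number of respondents")
plt.tight_layout()
plt.show()
```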

You could also visualize your responses using a diverging stacked bar chart:

[Image: diverging stacked bar chart]

Image Source: mbounthavong

Likert scale questions

One of the biggest benefits of using Likert scale survey questions is that they can be used across a variety of topics to gather quantitative data.

Below are some Likert scale examples to give you an idea of when you can use them in market research, and what kind of insights you can generate using Likert scale surveys.

Customer satisfaction surveys

How do you rate the quality of service you received?

  • Exceptional

This kind of Likert scale question can benefit from further qualitative questions to gather valuable feedback on why survey respondents feel the way they do.

Employee engagement survey

How satisfied do you feel in your current position?

  • Extremely happy
  • Somewhat happy
  • Neither happy nor unhappy
  • Somewhat unhappy
  • Extremely unhappy

Education engagement survey

How would you rate your satisfaction with your child’s education?

  • Completely satisfied
  • Moderately satisfied
  • Neither satisfied nor dissatisfied
  • Moderately dissatisfied
  • Completely dissatisfied

Marketing engagement survey

A business’ social responsibility score is more important than price

  • Completely agree
  • Somewhat agree
  • Neither agree nor disagree
  • Somewhat disagree
  • Completely disagree

Go beyond standard Likert scale questions with Qualtrics

Understanding engagement or sentiment towards your products or services is an essential part of collecting data to improve your business. And with Qualtrics CoreXM, you can go even further.

Designed to empower everyone to gather experience insights and take action, Qualtrics CoreXM is an all-in-one solution to carry out customer, product, marketing and brand research, and then implement effective strategies.

From customer satisfaction surveys and event feedback to product concept testing and simple polls, create and deploy the research projects you need to enhance every aspect of your business.

Listen to everyone wherever they are providing feedback — whether directly in surveys and chatbot windows or indirectly via online reviews. Capture experience data across more than 125 resources and use that data for more targeted research and highly personalized experiences.

Use advanced analytics to interpret the feedback data and then automatically alert the right people to tell them what actions to take. All in real-time, with no legwork required. It’s time to go from measuring to acting and start closing experience gaps across your business.



Using a Likert Scale in Psychology

Kendra Cherry, MS, is a psychosocial rehabilitation specialist, psychology educator, and author of the "Everything Psychology Book."


A Likert scale is a type of psychometric scale frequently used in psychology questionnaires. It was developed by and named after organizational psychologist Rensis Likert. Self-report inventories are one of the most widely used tools in psychological research.

On a Likert scale, respondents are asked to rate the level to which they agree with a statement. Such scales are often used to assess personality , attitudes , and behaviors.

At a Glance

While you might not have known what they were called, you've probably encountered many different Likert scales. Simply put, a Likert scale is a type of assessment item that asks you to rate your agreement with a statement (often from "Strongly Agree" to "Strongly Disagree.") Such scales can be a great way to get a nuanced look at how people feel about a particular topic, which is why you'll often see this type of item on political surveys and psychological questionnaires.

On a survey or questionnaire, a typical Likert item usually takes the following format:

  • Strongly disagree
  • Disagree
  • Neither agree nor disagree
  • Agree
  • Strongly agree

It is important to note that the individual questions that take this format are known as Likert items, while the Likert scale is the format of these items.

Other Items on a Likert Scale

In addition to looking at how much respondents agree with a statement, Likert items may also focus on likelihood, frequency, or importance. In such cases, survey takers would be asked to identify:

  • How likely they believe something to be true (Always true, Usually true, Sometimes true, Usually not true, Never true)
  • How frequently they engage in a behavior or experience a particular thought (Very frequently, Frequently, Occasionally, Rarely, or Never)
  • How important they feel something is to them (Very important, Important, Somewhat important, Not very important, Not important)

A Note on Pronunciation

If you've ever taken a psychology course, you've probably heard the term pronounced "lie-kurt." Since the term is named after Rensis Likert, the correct pronunciation should be "lick-urt."

When creating items for a Likert scale, experts who are very knowledgeable about the subject matter might develop the items on their own. Oftentimes, it is helpful to have a group of experts help brainstorm different ideas to include on a scale. The process typically involves the following steps:

  • Start by creating a large pool of potential items to draw from.
  • Select a group of judges to score the items.
  • Sum the item scores given by the judges.
  • Calculate intercorrelations between paired items.
  • Eliminate items that have a low correlation with the summed scores.
  • Find averages for the top quarter and the bottom quarter of judges and run a t-test of the means between the two groups. Eliminate questions with low t-values, which indicate a poor ability to discriminate.

After weeding out the questions deemed irrelevant or insufficiently discriminating, the Likert scale is ready to be administered.
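
To make those item-analysis steps concrete, here is a minimal sketch in Python, assuming pandas and SciPy are available and using simulated judge ratings (all names and thresholds here are hypothetical, not part of the original procedure):

```python
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(0)

# Simulated data: 40 judges each score 6 candidate items on a 1-5 scale
judges = pd.DataFrame(
    rng.integers(1, 6, size=(40, 6)),
    columns=[f"item{i}" for i in range(1, 7)],
)

# Each judge's summed score across all items
total = judges.sum(axis=1)

# Screen out items that correlate weakly with the summed scores
# (a stricter version would exclude each item from its own total)
item_total_r = judges.corrwith(total)
keep = item_total_r[item_total_r >= 0.3].index
print("Items kept after the correlation screen:", list(keep))

# Discrimination check: t-test of top-quartile vs bottom-quartile judges
top = judges[total >= total.quantile(0.75)]
bottom = judges[total <= total.quantile(0.25)]
for item in keep:
    t, p = stats.ttest_ind(top[item], bottom[item])
    print(f"{item}: t = {t:.2f}, p = {p:.3f}")  # low t => poor discrimination
```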

Experts suggest that when creating Likert scale items, survey creators should pay careful attention to wording and clearly define target constructs.

Some researchers have questioned whether having an even or odd number of response options might influence the usefulness of such data. Some research has found that having five options increases psychometric precision but found no advantages to having six or more response options.

Advantages of a Likert Scale

Because Likert items are not simply yes or no questions, researchers are able to look at the degree to which people agree or disagree with a statement.

Research suggests that Likert scales are a valuable and convenient way for psychologists to measure characteristics that cannot be readily observed.

Likert scales are often used in political polling in order to obtain a more nuanced look at how people feel about particular issues or certain candidates.

Disadvantages of a Likert Scale

Likert scales are convenient and widely used, but that doesn't mean that they don't have some drawbacks. As with other assessment forms, Likert scales can also be influenced by the need to appear socially desirable or acceptable.

People may not be entirely honest or forthright in their answers or may even answer items in ways that make themselves appear better than they are. This effect can be particularly pronounced when looking at behaviors that are viewed as socially unacceptable.

What This Means For You

The next time you fill out a questionnaire or survey, notice if they use Likert scales to evaluate your feelings about a subject. Such surveys are common in doctor's offices to help assess your symptoms and their severity. They are also often used in political or consumer polls to judge your feelings about a particular issue, candidate, or product.

Joshi A, Kale S, Chandel S, Pal DK. Likert scale: Explored and explained . British Journal of Applied Science & Technology. 2015;7(4):396-403. doi:10.9734/BJAST/2015/14975

East Carolina University Psychology Department. How do you pronounce "Likert?" What is a Likert scale?

Clark LA, Watson D. Constructing validity: New developments in creating objective measuring instruments .  Psychol Assess . 2019;31(12):1412-1427. doi:10.1037/pas0000626

Simms LJ, Zelazny K, Williams TF, Bernstein L. Does the number of response options matter? Psychometric perspectives using personality questionnaire data .  Psychol Assess . 2019;31(4):557-566. doi:10.1037/pas0000648

Jebb AT, Ng V, Tay L. A review of key Likert scale development advances: 1995-2019 .  Front Psychol . 2021;12:637547. doi:10.3389/fpsyg.2021.637547

Sullman MJM, Taylor JE. Social desirability and self-reported driving behaviours: Should we be worried? Transportation Research Part F: Traffic Psychology and Behavior. 2010;13(3):215-221. doi:10.1016/j.trf.2010.04.004

Likert R. A technique for the measurement of attitudes . Archives of Psychology. 1932;22(140):1–55.



What Is a Likert Scale? | Guide & Examples

Published on 6 May 2022 by Pritha Bhandari and Kassiani Nikolopoulou. Revised on 16 January 2023.

A Likert scale is a rating scale used to measure opinions, attitudes, or behaviours.

It consists of a statement or a question, followed by a series of five or seven answer statements. Respondents choose the option that best corresponds with how they feel about the statement or question.

Because respondents are presented with a range of possible answers, Likert scales are great for capturing their level of agreement or their feelings regarding the topic in a more nuanced way. However, Likert scales are prone to response bias, where respondents either agree or disagree with all the statements due to fatigue or social desirability.

Likert scales are common in survey research , as well as in fields like marketing, psychology, or other social sciences.

Table of contents

  • What are Likert scale questions?
  • When to use Likert scale questions
  • How to write strong Likert scale questions
  • How to write Likert scale responses
  • How to analyse data from a Likert scale
  • Advantages and disadvantages of Likert scales
  • Frequently asked questions about Likert scales

What are Likert scale questions?

Likert scales commonly comprise either five or seven options. The options on each end are called response anchors. The midpoint is often a neutral item, with positive options on one side and negative options on the other. Each item is given a score from 1 to 5 or 1 to 7.

The format of a typical five-level Likert question, for example, could be:

  • Strongly disagree
  • Disagree
  • Neither agree nor disagree
  • Agree
  • Strongly agree

In addition to measuring the level of agreement or disagreement, Likert scales can also measure other spectrums, such as frequency, satisfaction, or importance.


When to use Likert scale questions

Researchers use Likert scale questions when they are seeking a greater degree of nuance than is possible from a simple 'yes or no' question.

For example, let’s say you are conducting a survey about customer views on a pair of running shoes. You ask survey respondents ‘Are you satisfied with the shoes you purchased?’

A dichotomous question like the above gives you very limited information. There is no way you can tell how satisfied or dissatisfied customers really are. You get more specific and interesting information by asking a Likert scale question instead:

‘How satisfied are you with the shoes you purchased?’

  • 1 – Very dissatisfied
  • 2 – Dissatisfied
  • 3 – Neither satisfied nor dissatisfied
  • 4 – Satisfied
  • 5 – Very satisfied

Likert scales are most useful when you are measuring unobservable individual characteristics, or characteristics that have no concrete, objective measurement. These can be elements like attitudes, feelings, or opinions that cause variations in behaviour.

How to write strong Likert scale questions

Each Likert scale–style question should assess a single attitude or trait. In order to get accurate results, it is important to word your questions precisely. As a rule of thumb, make sure each question only measures one aspect of your topic.

For example, if you want to assess attitudes towards environmentally friendly behaviours, you can design a Likert scale with a variety of questions that measure different aspects of this topic.

Here are a few pointers:

Include both questions and statements

A good rule of thumb is to use a mix of both to keep your participants engaged during the survey. When deciding how to phrase questions and statements, it's important that they are easily understood and do not bias your respondents in one way or another.

Use both positive and negative framing

If all of your questions only ask about things in socially desirable ways, your participants may be biased towards agreeing with all of them to show themselves in a positive light. To counter this, pair positively framed statements with negatively framed counterparts: respondents who agree with the positive version should disagree with the negative one. Including both versions in a long survey also lets you check whether participants' responses are reliable and consistent.

Avoid double negatives

Double negatives can lead to confusion and misinterpretations, as respondents may be unsure of what they are agreeing or disagreeing with.

Ask about only one thing at a time

Avoid double-barrelled questions (asking about two different topics within the same question). When faced with such questions, your respondents may selectively answer about one topic and ignore the other. Questions like this may also confuse respondents, leading them to choose a neutral but inaccurate answer in an attempt to answer both questions simultaneously.

Be crystal clear

The accuracy of your data also relies heavily on word choice:

  • Pose your questions clearly, leaving no room for misunderstanding.
  • Make language and stylistic choices that resonate with your target demographic.
  • Stay away from jargon that could discourage or confuse your respondents.

How to write Likert scale responses

When using Likert scales, how you phrase your response options is just as crucial as how you phrase your questions. Here are a few tips to keep in mind.

Decide on a number of response options

More options give you deeper insights but can make it harder for participants to decide on one answer. Fewer options mean you capture less detail, but the scale is more user-friendly.

Usually, researchers include five or seven response options. It’s a good idea to include an odd number so that there is a midpoint. However, if you want to force your respondents to choose, an even number of responses removes the neutral option.

Choose the type of response option

You can measure a wide range of perceptions, motivations, and intentions using Likert scales. Response options should strive to cover the full range of opinions you anticipate a participant can have.

Some of the most common types of items include:

  • Agreement: Strongly Agree, Agree, Neither Agree nor Disagree, Disagree, Strongly Disagree
  • Quality: Very Poor, Poor, Fair, Good, Excellent
  • Likelihood: Extremely Unlikely, Somewhat Unlikely, Neither Likely nor Unlikely, Somewhat Likely, Extremely Likely
  • Experience: Very Negative, Somewhat Negative, Neutral, Somewhat Positive, Very Positive

Some researchers also include a ‘don’t know’ option. This allows them to distinguish between respondents who do not feel sufficiently informed to give an opinion and those who are ‘neutral’ on the topic. However, including a ‘don’t know’ option may trigger unmotivated respondents to select that for every question.

Choose between unipolar and bipolar options

On a unipolar scale, you measure only one attribute (e.g., satisfaction). On a bipolar scale, you can measure two attributes (e.g., satisfaction or dissatisfaction) along a continuum.

Your choice depends on your research questions and aims. If you want finer-grained details about one attribute, select unipolar items. If you want to allow a broader range of responses, select bipolar items.

Unipolar scales are most accurate when a five-point scale is used. Conversely, bipolar scales are most accurate when a seven-point scale is used (with three scale points on each side of a truly neutral midpoint).

Make sure that you use mutually exclusive options

Avoid overlaps in the response items. If two items have similar meanings, it risks making your respondent's choice random.

How to analyse data from a Likert scale

Before analysing your data, it's important to consider what type of data you are dealing with. Likert-derived data can be treated either as ordinal-level or interval-level data. However, most researchers treat Likert-derived data as ordinal, assuming there is not an equal distance between responses.

Furthermore, you need to decide which descriptive statistics and/or inferential statistics may be used to describe and analyse the data obtained from your Likert scale.

You can use descriptive statistics to summarise the data you collected in simple numerical or visual form.

  • Ordinal data: To get an overall impression of your sample, you find the mode, or most common score, for each question. You can also create a bar chart for each question to visualise the frequency of each item choice.
  • Interval data: You add up the scores from each question to get the total score for each participant. You find the mean, or average, score and the standard deviation, or spread, of the scores for your sample.

You can use inferential statistics to test hypotheses, such as correlations between different responses or patterns in the whole dataset.

  • Ordinal data: You hypothesise that knowledge of climate change is related to belief that environmental damage is a serious problem. You use a chi-square test of independence to see if these two attributes are associated.
  • Interval data: You investigate whether age is related to attitudes towards environmentally friendly behaviour. Using a Pearson correlation test, you assess whether the overall score for your Likert scale is related to age.

Lastly, be sure to clearly state in your analysis whether you treat the data at interval level or at ordinal level.

Analysing data at the ordinal level

Researchers usually treat Likert-derived data as ordinal. Here, response categories are presented in a ranking order, but the distances between the categories cannot be presumed to be equal.

For example, consider a scale where 1 = strongly agree, 2 = agree, 3 = neutral, 4 = disagree, and 5 = strongly disagree.

In this scale, 4 is more negative than 3, 2, or 1. However, it cannot be inferred that a response of 4 is twice as negative as a response of 2.

Treating Likert-derived data as ordinal, you can use descriptive statistics to summarise the data you collected in simple numerical or visual form. The median or mode is generally used as the measure of central tendency. In addition, you can create a bar chart for each question to visualise the frequency of each item choice.

Appropriate inferential statistics for ordinal data are, for example, Spearman's correlation or a chi-square test of independence.
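
For illustration, here is a minimal Python sketch of these ordinal-level analyses, assuming pandas and SciPy and using made-up responses (the column names and data are hypothetical):

```python
import pandas as pd
from scipy import stats

# Hypothetical ordinal responses (1 = strongly agree ... 5 = strongly disagree)
df = pd.DataFrame({
    "climate_knowledge": [1, 2, 2, 3, 4, 5, 2, 1, 3, 2],
    "damage_serious":    [1, 1, 2, 3, 4, 5, 3, 2, 3, 2],
})

# Descriptives: mode and median as measures of central tendency
print("Mode:", df["climate_knowledge"].mode().iloc[0])
print("Median:", df["climate_knowledge"].median())

# Spearman's rank correlation between the two items
rho, p = stats.spearmanr(df["climate_knowledge"], df["damage_serious"])
print(f"Spearman rho = {rho:.2f}, p = {p:.3f}")

# Chi-square test of independence on the cross-tabulated responses
table = pd.crosstab(df["climate_knowledge"], df["damage_serious"])
chi2, p, dof, expected = stats.chi2_contingency(table)
print(f"chi-square = {chi2:.2f}, p = {p:.3f}")
```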

Analysing data at the interval level

However, you can also choose to treat Likert-derived data at the interval level. Here, response categories are presented in a ranking order, and the distance between categories is presumed to be equal.

Appropriate inferential statistics used here are an analysis of variance (ANOVA) or Pearson's correlation. Such analysis is legitimate, provided that you state the assumption that the data are at interval level.

In terms of descriptive statistics, you add up the scores from each question to get the total score for each participant. You find the mean, or average, score and the standard deviation, or spread, of the scores for your sample.
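
A companion sketch for the interval-level treatment, again assuming Python with pandas and SciPy and hypothetical item names and data:

```python
import pandas as pd
from scipy import stats

# Hypothetical data: five Likert items (1-5) and respondent age
items = pd.DataFrame({
    "q1": [4, 2, 5, 3, 4, 1],
    "q2": [5, 2, 4, 3, 4, 2],
    "q3": [4, 1, 5, 2, 5, 1],
    "q4": [3, 2, 4, 3, 4, 2],
    "q5": [4, 2, 5, 3, 5, 1],
})
age = pd.Series([34, 51, 28, 45, 30, 62])

# Total scale score per participant, with mean and standard deviation
total = items.sum(axis=1)
print(f"Mean = {total.mean():.2f}, SD = {total.std():.2f}")

# Pearson correlation between the overall scale score and age
r, p = stats.pearsonr(total, age)
print(f"Pearson r = {r:.2f}, p = {p:.3f}")
```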

Advantages and disadvantages of Likert scales

Likert scales are a practical and accessible method of collecting data.

  • Quantitative: Likert scales easily operationalise complex topics by breaking down abstract phenomena into recordable observations. This enables statistical testing of your hypotheses.
  • Fine-grained: Because Likert-type questions aren't binary (yes/no, true/false, etc.), you can get detailed insights into perceptions, opinions, and behaviours.
  • User-friendly: Unlike open-ended questions, Likert scales are closed-ended and don’t ask respondents to generate ideas or justify their opinions. This makes them quick for respondents to fill in and ensures they can easily yield data from large samples.

Problems with Likert scales often come from inappropriate design choices.

  • Response bias: Due to social desirability bias, people often avoid selecting the extreme items or disagreeing with statements in order to seem more 'normal' or show themselves in a favourable light.
  • Fatigue/inattention: In Likert scales with many questions, respondents can get bored and lose interest. They may absent-mindedly select responses regardless of their true feelings. This results in invalid responses.
  • Subjective interpretation: Some items can be vague and interpreted very differently by respondents. Words like ‘somewhat’ or ‘fair’ don’t have precise or narrow definitions.
  • Restricted choice: Since Likert-type questions are closed-ended, respondents sometimes have to choose the most relevant answer even if it may not accurately reflect reality.

Frequently asked questions about Likert scales

A Likert scale is a rating scale that quantitatively assesses opinions, attitudes, or behaviours. It is made up of four or more questions that measure a single attitude or trait when response scores are combined.

To use a Likert scale in a survey, you present participants with Likert-type questions or statements and a continuum of items, usually with five or seven possible responses, to capture their degree of agreement.

Individual Likert-type questions are generally considered ordinal data, because the items have clear rank order but don't have an even distribution.

Overall Likert scale scores are sometimes treated as interval data. These scores are considered to have directionality and even spacing between them.

The type of data determines what statistical tests you should use to analyse your data.

Operationalisation means turning abstract conceptual ideas into measurable observations.

For example, the concept of social anxiety isn’t directly observable, but it can be operationally defined in terms of self-rating scores, behavioural avoidance of crowded places, or physical anxiety symptoms in social situations.

Before collecting data, it's important to consider how you will operationalise the variables that you want to measure.

