A Feature Analysis of Food Tracking Applications

Mixed Methods: Quantitative/Qualitative Analysis
Project Overview
In this analysis, we address the issue of high abandonment rates among food tracking applications by collecting and analyzing App Store and Google Play reviews in order to gain a better understanding of user needs.
PROJECT: University research project for the course Research in Human Centered Computing
TASK: Apply research methods to research questions
MY ROLE: Data Collection/Analysis, Literature Review
TEAM: 3 MS Students
DURATION: 10 Weeks

Introduction

For this project, we seek to address the issue of high abandonment rates among food tracking applications by collecting and analyzing data from App Store and Google Play reviews in order to gain a better understanding of user needs. With this information, we propose a set of ideal features for future prototypes of food tracking applications that address the gaps found in our study.

Methods: Data Collection

To better understand which features draw users to specific food or diet tracking applications, we compiled the list of applications to include from popular websites that review and recommend tracking apps. To gather a comprehensive list of the most popular and most recommended apps, we used multiple search engines (Google, Qwant, DuckDuckGo), taking advantage of their different ranking algorithms, with the search terms “(top OR best) (diet OR food) tracking apps (2020 OR 2021)”. Data collection was performed in early January 2021. We included “2021” in the search terms because a number of popular sites had recently published lists of suggested diet apps to attract readers with health-related New Year's resolutions.

Methods: Data Analysis

After data collection, two main routes of data analysis were followed: 1) qualitative coding and 2) mixed methods feature extraction from reviews using NLP and manually grouping overlapping extracted features. 




First Route: Qualitative Coding

In the first route, we qualitatively coded each application for the presence or absence of a number of health-, preference-, and usability-related features (n=21). We then compared this array of binary values (1=present, 0=absent for each of the 21 features) for each application to its average app store rating using three different methods.

First, we performed individual correlation strength analysis using Pearson’s correlation to determine the strength of each factor’s association with the app store rating. Second, we selected the best linear regression model with the features as binary x-values and the app store ratings as the y-values. Third, we used the matrix of data (iOS: 21 features x 23 apps; Android: 21 features x 24 apps) as the features to train a machine learning model.
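As a minimal sketch of the first method, the snippet below computes Pearson's correlation between each binary feature column and the app ratings. The feature names, the feature matrix, and the ratings are hypothetical placeholders, not data from the study, and the helper functions pearson and correlate_features are illustrative names rather than the code used in the project.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # a constant column has no defined correlation
    return cov / sqrt(var_x * var_y)

def correlate_features(features, ratings, names):
    """Correlate each binary feature column (1=present, 0=absent) with ratings."""
    result = {}
    for j, name in enumerate(names):
        column = [row[j] for row in features]
        result[name] = pearson(column, ratings)
    return result

# Hypothetical data: one row per app, one column per coded feature.
feature_names = ["exercise_integration", "free_version", "single_diet"]
features = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 1],
    [0, 0, 1],
    [1, 1, 0],
]
ratings = [4.7, 4.4, 3.9, 3.2, 4.8]  # hypothetical average store ratings

correlations = correlate_features(features, ratings, feature_names)
for name, r in sorted(correlations.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: r = {r:+.2f}")
```

With a binary x-variable, Pearson's coefficient is equivalent to the point-biserial correlation, which is why it can be applied directly to presence/absence codes.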

Second Route: Mixed Methods

We employed methods that fall into the categories of “in vivo codes” and “meta codes”. The tokenization and collocation features of the NLTK package in Python were used to extract the most significant bigrams and trigrams from the full text of the 8753 reviews. The top bigrams and trigrams were extracted by both statistical occurrence probability and raw frequency.
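The extraction step can be illustrated with a small sketch. To keep it dependency-free, it uses only the Python standard library as a stand-in for NLTK's collocation tools, and it scores bigrams by raw frequency and by pointwise mutual information (PMI), which is one common association measure and an assumption here, since the exact measure used is not stated. The sample reviews, the tokenize and bigram_scores helpers, and the min_count threshold are all hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def bigram_scores(reviews, min_count=2):
    """Return {bigram: (raw_count, pmi)} for bigrams seen at least min_count times."""
    unigram_counts = Counter()
    bigram_counts = Counter()
    for review in reviews:
        tokens = tokenize(review)
        unigram_counts.update(tokens)
        # Pair each token with its successor; bigrams never span two reviews.
        bigram_counts.update(zip(tokens, tokens[1:]))
    total_tokens = sum(unigram_counts.values())
    total_bigrams = sum(bigram_counts.values())
    scores = {}
    for (w1, w2), count in bigram_counts.items():
        if count < min_count:
            continue  # rare bigrams give unstable PMI estimates
        p_bigram = count / total_bigrams
        p_w1 = unigram_counts[w1] / total_tokens
        p_w2 = unigram_counts[w2] / total_tokens
        scores[(w1, w2)] = (count, math.log2(p_bigram / (p_w1 * p_w2)))
    return scores

# Hypothetical review texts, not taken from the study's data set.
reviews = [
    "Great for weight loss and easy to use",
    "Helped my weight loss goals, very easy to use",
    "Barcode scanner is broken but weight loss tracking works",
]

scores = bigram_scores(reviews)
top_by_frequency = sorted(scores, key=lambda bg: scores[bg][0], reverse=True)
top_by_pmi = sorted(scores, key=lambda bg: scores[bg][1], reverse=True)
print(top_by_frequency)
print(top_by_pmi)
```

Ranking by both raw frequency and an association measure surfaces different candidates: frequency favors common phrases, while PMI favors word pairs that co-occur more often than their individual frequencies would predict.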

Results

Manual Qualitative Coding

The Pearson correlation analysis of the strength of association between each factor and the app store rating is shown in the heatmaps in Figure 1. Across both app stores, exercise integration and the availability of a free trial or free version had the strongest positive correlations with the rating, while being a single-diet app had the strongest negative correlation. The iOS and Google Play app features also showed differing effects. In the Google Play store, offline capacity was the second-most negatively correlated feature. In the iOS App Store, by contrast, the meal planning code was the second-most negatively correlated feature.

Figure 1: A heatmap visualization of the correlation between each factor against other factors and against the respective iOS or Google Play app store rating.
Table 1: iOS app features presented as coefficients from step-wise linear model building.
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1.152 on 16 degrees of freedom
Multiple R-squared: 0.9266, Adjusted R-squared: 0.8945
F-statistic: 28.87 on 7 and 16 DF, p-value: 6.298e-08
Figure 3: Comparison of predicted versus actual ratings for Google Play apps (top) and iOS apps (bottom)

Semi-Automated Qualitative Coding

Semi-automated qualitative coding produced a set of 30 features on which to perform the analysis. Pearson correlation strength analysis showed that, of the extracted features, three codes were most strongly correlated with the individual review’s rating: weight-related (a meta code grouping n-grams that mention weight-related topics), goals (a meta code grouping n-grams related to individuals’ goals), and user-friendly (a meta code grouping n-grams related to the app’s ease of use).
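To make the idea of a meta code concrete, the sketch below groups several n-grams under one code and flags each review with a binary value per code, which yields the kind of per-review feature vector that can then be correlated with that review's rating. The phrase lists in META_CODES are hypothetical examples chosen for illustration, not the n-gram groupings produced in the study, and normalize and code_review are illustrative helper names.

```python
import re

# Hypothetical mapping from meta code to the n-grams grouped under it.
META_CODES = {
    "weight-related": ["lose weight", "weight loss", "lost pounds"],
    "goals": ["my goals", "calorie goal", "daily goal"],
    "user-friendly": ["easy to use", "user friendly", "simple to use"],
}

def normalize(text):
    """Lowercase the text and reduce it to space-separated word tokens."""
    return " ".join(re.findall(r"[a-z']+", text.lower()))

def code_review(text, meta_codes=META_CODES):
    """Return {meta_code: 1 or 0} depending on whether any grouped n-gram occurs."""
    # Pad with spaces so phrases only match on whole-word boundaries.
    padded = f" {normalize(text)} "
    return {
        code: int(any(f" {phrase} " in padded for phrase in phrases))
        for code, phrases in meta_codes.items()
    }

# Hypothetical review text.
print(code_review("User-friendly app for weight loss"))
```

Matching on normalized tokens means that punctuation and hyphenation variants such as "User-friendly" and "user friendly" fall under the same code.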

Limitations

We acknowledge a few limitations of our study. We do not expect any one application to satisfy every type of user and their needs. As new diets and technologies continue to emerge, it is almost impossible for a single application to keep pace with all of them. For the purposes of this study, we have highlighted the features that correlated with more positive reviews and that suggest long-term use among a variety of users. Many of our limitations stem from the fact that we conducted this study with only 28 applications. This small selection may not be representative of every type of user’s opinions.

We acknowledge that in order for these technologies to encourage and promote positive behavioral change, they need to be used over an extended period of time. Some reviews have been left by users who have sampled and abandoned the application after a few days or an unpleasant experience with a particular feature.