We conducted a survey on what types of photos are likely to obtain “likes,” a form of engagement, from young female users on Instagram. In this study, we examined the relationships between the presence or absence of a face in a photo and its physical attractiveness, the degree of smiling, the presence or absence of facial editing, text length, the number of tags, and the number of “likes” (likes) for each category of photos (hereafter, contexts). The contexts included “coming-of-age ceremony,” “graduation ceremony,” “café,” and “domestic travel.” We collected 200 Instagram posts for each context and assigned the above features manually or with a computer program. A correlation analysis and tests of representative values for the two groups were performed on these data. The results showed that photos that included the poster’s face received more likes in the contexts other than the graduation ceremony. In the graduation ceremony and café contexts, more likes were obtained when physical attractiveness was high. The obtained likes for the coming-of-age ceremony; the café; and, to a lesser extent, the graduation ceremony were lower for those with facial editing.
In recent years, social networking services (SNSs), which are internet services that allow people to build social relationships and communicate with each other, have gained popularity. SNSs are used not only for community-building and communication but also for self-expression, where users can obtain responses (initially comments) from other people to their posts (boyd & Ellison, 2007). In recent years, responses have included “likes” and “shares” as well as comments (Tominaga et al., 2022). These are actions that readers take in response to posts and are also called “engagement”.
The concept of engagement is often discussed in the context of consumer behavior research. Specifically, engagement is considered a construct that describes a participant’s interaction or interactive experience with a subject (product or brand) (Brodie et al., 2011; Kietzmann et al., 2011). In the early days, it was defined as a consumer’s motivation to interact with the subject or with community members in the field (Algesheimer et al., 2005). In recent years, with the development of SNSs, engagement has come to be seen as a measure of an individual’s response to self-presentation and is defined as the audience’s actions in response to posts, profiles, etc., on online communication platforms (Pletikosa & Michahelles, 2013).
Engagement is important for businesses in building relationships with consumers and for individuals in building relationships with friends and strangers because sharing and commenting on others’ posts can lead to an intention to revisit/recontact the account and posts in the future (Burke et al., 2009; Vasalou et al., 2010). Thus, increasing engagement is important for both firms and individuals to confirm the success or failure of impression management by self-presentation and to build rich firm‒consumer relationships (BRQ: brand relationship quality) (Fournier, 1998) and personal relationships.
Many studies have investigated which types of posts are more likely to gain engagement online. As an example of research conducted before the invention of SNSs, Arguello et al. (2006) reported that newcomers were less likely to receive replies in Usenet newsgroups. It was also found that a self-disclosure, such as age, in the first post was more likely to receive replies (Burke et al., 2007) and that politeness and rudeness in messages make them easier or harder to receive replies depending on the news group (Burke & Kraut, 2008).
In SNSs, studies have explored the relationship between the textual expressions of posts and user engagement with them (Aldous et al., 2019; Banhawi & Ali, 2011; Burney, 2016; Jaakonmäki et al., 2017; Naveed et al., 2011; Suh et al., 2010). Studies on text formats have shown that posts on Twitter are more likely to be retweeted when they include URLs or hashtags (Naveed et al., 2011; Suh et al., 2010) and less likely to be retweeted when they include exclamation points (Naveed et al., 2011). It has been reported that in news article posts on Instagram, emojis and exclamation marks increase the number of likes while decreasing the number of comments, whereas question marks decrease the number of likes but increase the number of comments. On the other hand, Burney (2016) reported that in users’ Instagram posts, the number of hashtags and question marks increases likes, whereas the number of exclamation points decreases likes. Additionally, it has been reported that emojis increase both the number of likes and comments (Jaakonmäki et al., 2017). On Facebook, it has been reported that exclamation marks increase both likes and comments, whereas longer text reduces them (Banhawi & Ali, 2011). Thus, because of differences in platforms and post topics, findings on the use of exclamation points and question marks in SNS posts are inconsistent.
Research has been conducted on the relationship between posted content and engagement (Cvijikj & Michahelles, 2013; Hong et al., 2011; Hua et al., 2016; Rietveld et al., 2019; Wu & Shen, 2015). A study examining text polarity (Wu & Shen, 2015) reported that posts with negative content were more likely to receive engagements on Twitter. Hong et al. (2011) investigated the number of tweets that would be retweeted on Twitter based on the content of the tweets, information on the poster, and the time of posting. Cvijikj and Michahelles (2013) demonstrated that posts with entertaining and informative content increased likes and that posts with rewarding content decreased them in company accounts on Facebook.
In the context of early SNSs such as Twitter and Facebook, text-based messaging and communication were the most common forms of engagement. However, in recent years, Instagram, a photo-based SNS, has become popular. Consumer behavior research has shown that human information processing differs depending on the modality. Specifically, humans (i.e., consumers) process visual information faster than textual information, and images tend to evoke human emotions more than words do (De Houwer et al., 2001; Holmes & Mathews, 2005; Houston et al., 1987). Modality differences also lead to differences in interpretation levels, with visual modalities stimulating more localized and concrete ways of processing information, whereas words have been found to trigger more global and abstract thinking (Amit et al., 2013). Because of these modality differences, mobile photographic communication is gaining ground as a new medium of human communication (Kindberg et al., 2005). A qualitative study by Lin and Faste (2012) showed that people are socially motivated by photographs and use the stories they tell to interact with others. As that study shows, photo-centered SNSs play an important role as a place for people to express themselves.
In recent years, engagement research has focused on images (i.e., photographs) as modalities shift (Bakhshi et al., 2014, 2015, 2019; Yu et al., 2011). For example, Yu et al. (2011) compared posts with images to those with links or videos on Facebook and reported that the former received more likes. Bakhshi et al. (2014) investigated the relationship between the presence or absence of faces in photos posted on Instagram and their engagement and reported that photos with faces received more likes and comments than those without faces. Bakhshi et al. (2015, 2019) also investigated the relationship between the presence or absence of filters on entire photos and engagement in Flickr posts. A filter, also called an image filter, is a function that adds color tones, blurring, outlines, and other effects to a captured photo. Bakhshi et al. reported that photos with filters received more views and comments than those without filters.
Engagement research focusing on people in photos has been conducted, but traditionally, social psychology has shown that physical attractiveness strongly influences the impressions others form and their degrees of affection (Luo & Zhang, 2009; Snyder & Tanke, 1977; Walster et al., 1966). For example, in a classic study, Walster et al. (1966) revealed that the only factor related to the success of dating at a dance party at a university was physical attractiveness. Additionally, research on the influence of facial expressions on facial attractiveness has been conducted, confirming a positive correlation between the strength of a smile and the attractiveness of a face (Cunningham, 1986; Harker & Keltner, 2001). Furthermore, in recent research on SNSs, Wang et al. (2010) revealed that on Facebook, when profile pages were highly physically attractive, both men and women were more likely to accept friend requests from the owners of the profile pages (when they were of the opposite sex). It has also been confirmed that other users who view the profile page of a user with low physical attractiveness spend more time looking at the advertisement section than they do the profile picture (Seidman & Miller, 2012).
It thus becomes clear that there is a significant research gap regarding the extent to which facial features contribute to engagement with photos posted by users on SNSs. In the context of the engagement research on photos posted on SNSs, the relationship between the physical attractiveness or facial expressions of the people in the photos and engagement has not been clarified. While filter processing for entire photos has been studied, facial editing differs significantly in nature from applying a filter to an entire photo. Recent smartphone applications offer features that enhance facial attractiveness, such as enlarging the eyes or smoothing the skin texture, but it remains unclear whether such edits are related to engagement. Additionally, it is not known whether basic textual expressions, which have been traditionally studied, contribute to engagement in photos featuring people.
This is the first study to investigate the relationship between engagement and features related to faces (i.e., attractiveness, smiles, and facial editing) and to analyze the relationship between basic text expressions and engagement. The research questions (RQs) for this study are as follows:
RQ1: Does the physical attractiveness of the poster relate to engagement?
RQ2: Does a smile relate to engagement?
RQ3: Does facial editing relate to engagement?
RQ4: Are the length of the text and the number of hashtags related to engagement?
Generally, engagement includes not only likes but also comments and shares, so it is necessary to decide which type of engagement to analyze. Among these engagement types, comments and shares may not occur for some posts. Furthermore, some SNS platforms do not have sharing functions implemented (for example, Instagram does not have a sharing feature). Therefore, this study focuses on likes as a form of engagement. The most popular image-based SNS, Instagram, is analyzed.
In addition, this study is limited by poster demographics and posting topics. This study investigates facial attractiveness and the use of modern face filters, but the perceptions of these aspects may differ depending on age and gender. The importance of the face may also vary depending on the post topic. In topics where information is important, such as food or travel, the audience may not give as much attention to the faces of the posters. Specifically, we target young women because the Instagram usage rate is higher among women than among men in Japan, with the highest usage rate among users in their 20s (Institute for Information and Communications Policy [IICP], 2024). Finally, the post topics (hereafter, context) include “coming-of-age ceremony,” “graduation ceremony,” “café,” and “domestic travel”.
In this study, we examine a single post on Instagram with two types of features: photo and text. Every Instagram post is accompanied by a photo (image). Photo features relate to a person (especially a face) and are assigned by one or more human evaluators after reviewing the single submitted photo. Text features relate to the surface form of the text and are extracted automatically by a computer program from the text contained in the post. The five examined photo features are whether the poster’s own face is in the photo (“isFace”) and, if so, their physical attractiveness (“PA”), the degree of smiling (“smiling”), and whether the face has been edited (i.e., whether a facial filter has been applied) (“FaceEdit”), as well as overall photo quality (“PhotoQual”). The examined text features are the number of hashtags and the text length (characters).
The presence of a face (isFace) and facial editing (FaceEdit) are expressed on a nominal (binary) scale. Facial editing shows whether the area of the poster’s face in the photo has been edited by a facial filter to improve its appearance, such as by enlarging the eyes, making the face smaller, making the skin more beautiful, or improving the complexion. Physical attractiveness (PA), smiling, and overall photo quality (PhotoQual) are expressed on 3-, 4-, and 3-point ordinal scales, respectively. The values for facial editing, physical attractiveness, and smiling are not assigned to photos that do not include the poster’s face (i.e., these categories are scored as NA). When these items are analyzed, posts designated NA are excluded from the analysis. Smiles are categorized as follows: (1) straight face: corners of the mouth are not raised at all (teeth not visible); (2) slight smile: corners of the mouth are slightly lifted (teeth barely visible); (3) smile: corners of the mouth are raised (teeth are visible); and (4) full smile: corners of the mouth are maximally raised (full smile with teeth showing).
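The feature set above can be pictured as one record per post. The following is a minimal sketch, not the authors' data format: the class name `PostFeatures`, the field names, and the helper `non_na` are hypothetical, and NA is represented by `None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PostFeatures:
    """Features of one post; names are illustrative, not taken from the study."""
    is_face: bool                      # isFace (binary)
    photo_qual: float                  # PhotoQual, 3-point ordinal scale
    num_hashtags: int                  # text feature
    text_length: int                   # text feature (characters)
    pa: Optional[float] = None         # PA, 3-point scale; None means NA
    smiling: Optional[float] = None    # 1 = straight face ... 4 = full smile
    face_edit: Optional[bool] = None   # FaceEdit (binary); None means NA

    def __post_init__(self):
        # PA, smiling, and FaceEdit are only defined when the poster's face is shown.
        face_only = (self.pa, self.smiling, self.face_edit)
        if not self.is_face and any(v is not None for v in face_only):
            raise ValueError("face-dependent features must be NA when no face is shown")


def non_na(posts, name):
    """Return the values of feature `name`, skipping posts where it is NA."""
    return [getattr(p, name) for p in posts if getattr(p, name) is not None]
```

Calling `non_na(posts, "pa")` before an analysis mirrors the rule that posts designated NA are excluded for that item.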
2. Data collection
In this study, we limited the number of users to those who have a certain number of follows and followers and those who have made a certain number of posts. Specifically, users were limited to those with 100–1,500 follows, 100–1,500 followers, and more than 20 posts. This restriction was applied because users who are extremely active may benefit greatly from Instagram’s post-presentation algorithm, whereas users who are extremely inactive may have too small an audience viewing their posts on a regular basis, which may increase the noise.
We collect posts in four contexts: coming-of-age ceremonies, graduation ceremonies, cafés, and domestic travel. In social surveys for quantitative research, it is common to randomly select the data to be studied. However, it is extremely difficult to randomly collect posts from Instagram for all users because even if a search is conducted using keywords or hashtags related to the above context, Instagram’s search algorithm is subject to bias due to its preferential selection and ordering methods.
Therefore, in this study, we collected posts as randomly as possible using the following method (i.e., the “quartile sampling method”). First, a hashtag/keyword search was performed using keywords and hashtags that represent the specified context, and target posts were identified from the top of the search results (the poster is treated as the seed user). Next, we opened the seed user’s “List of Follows” (which is ordered by the order in which the seed user followed other users). Starting from each of the three quartile points of this list (first quartile, median, and third quartile), we checked the users individually, moving both backward and forward, and looked for users who had posted about the context. This process was repeated until 800 posts in four contexts were collected, with 200 in each context. There were no duplicate users.
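The scan over the “List of Follows” can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the paper does not say how the quartile positions were rounded or in what order backward and forward steps alternated, so the integer division and the backward-first alternation below are choices made for the example. The function names and the predicate `posts_about_context` are hypothetical, and no Instagram API is involved.

```python
def quartile_start_indices(n):
    """Indices of the first quartile, median, and third quartile in a list of length n."""
    return [n // 4, n // 2, (3 * n) // 4]


def outward_order(start, n):
    """Yield list indices starting at `start`, then alternating backward and forward."""
    yield start
    for step in range(1, n):
        back, forward = start - step, start + step
        if back >= 0:
            yield back
        if forward < n:
            yield forward


def find_context_poster(follows, start, posts_about_context):
    """Return the first followed user, scanning outward from `start`, who posted about the context."""
    for i in outward_order(start, len(follows)):
        if posts_about_context(follows[i]):
            return follows[i]
    return None  # nobody in this list posted about the context
```

In use, `find_context_poster` would be called once per quartile index of a seed user's list, with a manual or scripted check standing in for `posts_about_context`.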
Finally, Instagram allows up to 10 photos to be attached to a single post, but when the post has several photos, it is difficult to know which photos resulted in positive emotions for the viewer. Therefore, only posts with one photo attached are included in the study.
3. Labeling
The number of likes was obtained from the collected posts, and photo and text features were assigned based on the posts. Photo features were assigned manually by looking at the images. There are two types of photo features: one is a value that is fixed regardless of who looks at the image, and the other is a subjective value that varies by person. The latter type, which comprises physical attractiveness, smiling, and overall photo quality, was assigned by three evaluators. The three evaluators checked the images individually and assigned the features, and the authors averaged the results to determine the value. Because the criteria for physical attractiveness may differ between men and women (Feingold, 1991), we asked three female raters and three male raters to rate the images separately. We denote these ratings as “PA (woman)” and “PA (man).”
Regarding the presence of a face and facial editing, one of the authors assigned values manually. The determination of whether the person in the photo was the poster was made by looking for faces tagged with that poster’s account across Instagram and by determining whether the same face appeared in other photos on the poster’s account two or more times. Photos were excluded from the analysis if it was not possible to determine whether the person in the photo was the poster or a stranger. Therefore, these photos were not included in the set of 800 posts. Regarding facial editing, photos were determined to have facial edits if anyone looking at the photo could determine that edits were clearly made to the poster’s face area. The number of tags and text length, i.e., the text features, were automatically calculated by the program from the post text.
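The two text features can be computed with a few lines of code. The sketch below is not the program used in the study: the function name `text_features` is hypothetical, a hashtag is assumed to be `#` followed by one or more word characters, and the text length is assumed to count every character of the caption, hashtags included, which the paper does not specify.

```python
import re

# Assumption: a hashtag is '#' followed by word characters.
# In Python 3, \w also matches Japanese characters in str patterns.
HASHTAG = re.compile(r"#\w+")


def text_features(caption):
    """Return the number of hashtags and the text length (characters) of a caption."""
    return {
        "num_hashtags": len(HASHTAG.findall(caption)),
        "text_length": len(caption),
    }
```

For example, `text_features("卒業しました #美容学生 #卒業式")` counts two hashtags.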
The reliability of the features assigned by multiple raters was checked using Fleiss’s kappa coefficient (see Table 1). The smiling ratings showed a high degree of agreement, whereas the ratings of physical attractiveness, for which the evaluation criteria may differ from person to person, presented slightly lower values. We judged that the ratings were generally consistent and used these data for the analysis.

Table 1. Fleiss’s kappa coefficients for the ratings
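For readers unfamiliar with the agreement statistic, Fleiss’s kappa can be computed directly from its definition. The following is a minimal sketch that treats the rating levels as unordered categories; the function names are hypothetical, and the toy ratings in the usage note are invented for illustration and are not the study’s data.

```python
def ratings_to_counts(ratings, categories):
    """Convert per-item rater labels into per-item category counts."""
    return [[row.count(c) for c in categories] for row in ratings]


def fleiss_kappa(counts):
    """Fleiss's kappa; counts[i][j] = number of raters who put item i in category j.

    Every item must be rated by the same number of raters (at least two).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    # Observed agreement for each item, then averaged over items.
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_items) / n_items
    # Chance agreement from the overall share of each category.
    shares = [sum(col) / (n_items * n_raters) for col in zip(*counts)]
    p_e = sum(s * s for s in shares)
    return (p_bar - p_e) / (1 - p_e)
```

With three raters and a 3-point scale, `fleiss_kappa(ratings_to_counts([[1, 1, 1], [3, 3, 3]], [1, 2, 3]))` evaluates to 1.0 because the raters agree on every item.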
The number of likes a post receives depends on the number of followers. Therefore, we normalized the number of likes on the post by dividing it by the number of followers (normalized likes). To clarify the relationship between the photo or text features of a post and its normalized likes, we used a correlation analysis and a test of differences between two groups. First, the normality of the data was determined to select the statistical method to be used in the analysis: for normalized likes, the Kolmogorov‒Smirnov test was performed. The results revealed that the p value was 0.4978 for the coming-of-age context, 0.0765 for the graduation context, 0.0008 for the café context, and 0.0011 for the domestic travel context. Since there was a mixture of variables above and below the significance level, we considered the context as a whole to be nonnormal and employed a nonparametric test.
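The normalization and the decision to use nonparametric tests can be sketched as below. The p values in `REPORTED_P` are the ones reported above; everything else is an assumption for illustration. In particular, the paper does not state which reference normal distribution the Kolmogorov‒Smirnov test used, so `ks_statistic_normal` fits the mean and standard deviation from the sample, and it returns only the test statistic, not a p value.

```python
import math
import statistics


def normalized_likes(likes, followers):
    """Likes divided by the poster's number of followers."""
    return likes / followers


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def ks_statistic_normal(sample):
    """Kolmogorov-Smirnov distance between the sample and a fitted normal distribution."""
    xs = sorted(sample)
    n = len(xs)
    mu = statistics.mean(xs)
    sigma = statistics.stdev(xs)  # assumption: parameters estimated from the sample
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = normal_cdf(x, mu, sigma)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


# p values of the normality test as reported in the text, per context.
REPORTED_P = {
    "coming-of-age ceremony": 0.4978,
    "graduation ceremony": 0.0765,
    "café": 0.0008,
    "domestic travel": 0.0011,
}


def use_nonparametric(p_values, alpha=0.05):
    """Treat the data as nonnormal if any context rejects normality at level alpha."""
    return any(p < alpha for p in p_values.values())
```

With the reported values, `use_nonparametric(REPORTED_P)` is true because the café and domestic travel contexts fall below the 5% level.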
Since some features are expressed on ordinal and ratio scales and others on nominal scales (binary values: true or false), the analysis was conducted according to the type of scale. For the ordinal and ratio scales, Spearman’s rank correlation coefficients were computed and tests of no correlation were performed (see Table 2). The significance level was set at 5%, and a Bonferroni correction was used to determine significant differences. The Mann‒Whitney U test was used for the nominal measures, and Table 3 shows the means of the normalized likes by feature and the p values of the tests.

Table 2. Spearman’s rank correlation coefficients between post features and normalized likes
Note: + p < .1, * p < .05
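The coefficients in Table 2 are rank correlations between a feature and normalized likes. The sketch below computes Spearman’s coefficient from its definition (Pearson correlation of average ranks) after dropping NA pairs, and shows the Bonferroni-adjusted threshold. The function names are hypothetical, the number of comparisons used for the correction is not stated in the paper, and the significance test itself is omitted here; in practice a library routine such as `scipy.stats.spearmanr` returns both the coefficient and a p value.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(feature, likes):
    """Spearman's rank correlation, skipping posts whose feature is NA (None)."""
    pairs = [(f, l) for f, l in zip(feature, likes) if f is not None]
    xs = [f for f, _ in pairs]
    ys = [l for _, l in pairs]
    return pearson(average_ranks(xs), average_ranks(ys))


def bonferroni_alpha(alpha, n_comparisons):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_comparisons
```

The helper `spearman_rho` assumes at least two non-NA pairs and some variation in both variables; otherwise the denominator is zero.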

Table 3. Mean of normalized likes by feature presence and Mann‒Whitney U test results
Note: + p < .1, * p < .05
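For the binary features in Table 3, the comparison is between two groups of normalized likes. The sketch below computes the Mann‒Whitney U statistic with a two-sided p value from the normal approximation. It is an illustration only: the function names are hypothetical, no tie or continuity correction is applied, and the paper does not state which variant of the test was used. A library routine such as `scipy.stats.mannwhitneyu` is the usual choice in practice.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U, two-sided p) using the normal approximation without tie correction."""
    n1, n2 = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n1])
    u_a = rank_sum_a - n1 * (n1 + 1) / 2
    u_b = n1 * n2 - u_a
    u = min(u_a, u_b)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u, p


def group_means(group_a, group_b):
    """Means of normalized likes in each group, as listed per feature in Table 3."""
    return sum(group_a) / len(group_a), sum(group_b) / len(group_b)
```

Here `group_a` and `group_b` would hold the normalized likes of posts with and without a feature (for example, isFace true versus false) within one context.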
Table 3 shows that normalized likes differed significantly depending on whether the poster’s face was included in the photo for the coming-of-age ceremony, café, and domestic travel contexts, and a significant trend was observed for the graduation ceremony context. These results indicate that showing the poster’s face in the photo was associated with more likes from the audience. Table 2 shows that the PA (woman) of the posters in the photos was correlated with normalized likes in the graduation ceremony and café contexts. The poster’s PA (man) was correlated with normalized likes in the graduation ceremony context. These findings indicate that whether physical attractiveness leads to like acquisition depends on the context (RQ1). In the café context, the correlation between physical attractiveness and the number of likes differed between the evaluations from female and male raters. Smiling was not correlated with normalized likes in any context (RQ2).
Table 3 shows a significant difference in the relationship between facial editing and normalized likes in the coming-of-age ceremony and café contexts. A significant trend was also observed for the graduation ceremony. This finding indicates that facial editing does not contribute to the acquisition of likes (RQ3). Table 2 shows that text length and the number of tags were correlated with normalized likes in the graduation ceremony and café contexts. These results indicate that some contexts are more likely to receive likes when the text is long and the number of hashtags is high (RQ4).
We found that in some contexts, posting the face of the poster obtains more likes (RQ1). The finding that the presence of a face obtains more likes supports findings from previous engagement studies (Bakhshi et al., 2014; Bakhshi et al., 2019). These results can also be explained by the finding that self-disclosure leads to more replies (Burke et al., 2007). We found that the PA (woman) of the poster in the posted photo was associated with more likes from audiences (RQ1) in the graduation ceremony and café contexts. This finding supports previous findings on physical attractiveness and impressions (Luo & Zhang, 2009; Snyder & Tanke, 1977; Walster et al., 1966; Wang et al., 2010). However, there was no correlation between physical attractiveness and the number of likes for the coming-of-age ceremony or domestic travel contexts. The reasons for this are discussed in more detail later in this chapter under “Text features (RQ4),” but in the graduation ceremony and café contexts, many audience members do not know the posters, and they may have liked the posts on the basis of their content, including physical attractiveness. In the café context, the physical attractiveness rated by women was correlated with the acquisition of likes, but the same was not true of the physical attractiveness rated by men. Since many of the readers of the café posts are women, it is likely that the attractiveness ratings given by the women raters correlated with the number of likes the posts received.
2. Smiling (RQ2)
Smiling was not confirmed to be associated with obtaining more likes (RQ2). Based on the previous findings on facial expressions (Cunningham, 1986; Harker & Keltner, 2001), we expected that posters who smiled more would obtain more likes, but these were not the results we obtained. Because the effect of the smile level on likes may be moderated by physical attractiveness, we also examined the relationship between normalized likes and the smile level (straight face/slight smile/smile/full smile) within each attractiveness level (low/medium/high). However, we did not confirm that the normalized likes varied with the level of smiling, even when physical attractiveness was constrained at the same level. The photographs used in previous studies were profile photos with uniform shooting conditions, such as background and face sizes (e.g., profile photos of 4 cm×5 cm) (Cunningham, 1986; Harker & Keltner, 2001). However, the photos used in this study were actual Instagram posts, and not only did the backgrounds and face sizes vary by user, but so did the clothing and poses. Even if the poster did not smile in the photo, the photo overall might have been viewed as positive by the audience. These differences in conditions between our study and previous studies may explain why no correlation was found between smiling and obtaining likes.
3. Facial editing (RQ3)
Table 3 shows that there is a significant difference between the coming-of-age ceremony and the café contexts in terms of the presence or absence of facial editing in the face photos, and a significant trend occurs in the graduation ceremony context (RQ3). This result suggests that editing a face does not result in more likes but, conversely, may result in fewer likes. This result differs from previous findings that applying filters to an entire photo increases engagement (Bakhshi et al., 2015) and from previous findings that applying filters to an entire photo with a person in it increases engagement (Bakhshi et al., 2019).
Unlike previous studies, this study focuses on facial edits that make the face more attractive. Such edits are photo filters such as those making the eyes larger, making the face smaller, making the skin more beautiful, and improving the complexion. These edits might be expected to increase likes from the audience. In fact, people edit their own photos so others will see them as special (Petrelli & Whittaker, 2010). In this study, we judged a photo to be edited if a person who did not know the poster could tell from looking at the photo that editing had been done to the face. However, such photographs are different from how the person’s face truly looks. SNSs are personal networks, and people expect to be provided with content about the posters themselves (Ellison et al., 2011). For this reason, people may dislike photos with facial editing, which makes it difficult to understand the poster’s true face.
4. Text features (RQ4)
The number of hashtags was found to contribute to the acquisition of likes in the context of the graduation ceremony and the café (RQ4). This finding supports previous findings (Suh et al., 2010), suggesting that the use of hashtags increases engagement. In these contexts, the influence of the community and the presence of special hashtags are likely to be significant. In these contexts, we reviewed the profiles of the posters and their other posts and found that in the café context, many posters liked visiting cafés (mainly users who post about cafés), and in the graduation context, many posters were beauty students (students who study haircuts and hair arrangements, 48 out of 200 posters). For the graduation ceremony and café contexts, hashtags specific to these contexts were used. For example, “#beauty_students” (“#美容学生”) (students who want to become beauticians), “#want_to_connect_with_beauty_students” (“#美容学生と繋がりたい”) were used for graduation ceremonies, and “#café_tour” (“#カフェ巡り”) and “#want_to_connect_with_café_lovers” (“#カフェ好きと繋がりたい”) were used for cafés. These tags are considered meaningful in encouraging mutual followings and comments. The presence of such hashtags and communities in the context may have made a difference in the number of likes. This result is similar to that of a previous study (DeMasi et al., 2016) that showed that the amount of community activity is influenced by the frequency of use of certain types of hashtags.
Finally, text length (number of characters) was associated with like acquisition on Instagram. Specifically, significant correlations were found in the context of graduation ceremonies and cafés (RQ4). This result was surprising to the authors, since previous studies have indicated that images tend to evoke more human emotion than words do (De Houwer et al., 2001; Holmes & Mathews, 2005; Houston et al., 1987), and engagement on SNSs is thought to be the result of the audience’s positive emotions expressed in action (Berger & Milkman, 2012). In addition, since text length was negatively correlated with the acquisition of likes on Facebook (Banhawi & Ali, 2011), the same result might have been expected on Instagram. One possible explanation is that people who do not know the poster may not be able to understand what the poster experienced or how he or she felt at the time from the photo alone and thus may not have liked posts with little accompanying text.
5. Limitations
This study has the following limitations. The study included four contexts and analyzed them separately, and it revealed that the features related to like acquisition differ depending on the context. Therefore, it is not clear whether findings obtained in one context are applicable to other contexts. To apply the results of this study, it is necessary to consider the degree to which the contexts of interest to users and PR professionals are similar in content and community to the four contexts discussed in this study.
In addition, young women were targeted in this survey because they are the main users of Instagram. However, Instagram also has many male and older users. Male users may not give as much attention to people as female users do and may respond by focusing on the contents of photos and texts. In addition, many older users are unlikely to put their own faces on their posts, and they may actively give likes to the few posts of their peers that do show their faces. All of these are interesting research topics from a social psychology perspective.
6. Design implications
Finally, we discuss the applicability of this study. Clearly, the findings of this study can be of great help to individuals who want to obtain engagement from others on SNSs. Such individuals could obtain more engagement by taking their own photos containing the features that were correlated with engagement in the study findings and posting them on SNSs. This behavior corresponds to approval from others and may lead to an increase in one’s self-esteem (Leary et al., 1995). Since the 2010s, SNSs have become a valuable place of self-actualization for those who have difficulty expressing themselves in real time in the real world. The findings of this study may enhance the power of SNSs as a place for self-presentation.
Although this study was conducted to assess the engagement of general users, we believe that the results can be applied to corporate accounts, as many of them are anthropomorphic or reflect the personality of the person in charge (account manager) (Kwon & Sung, 2011). If the company continues to post with controlled topics and photographic expressions, it is likely to maintain a high level of long-term engagement from its customers. We hope that both companies and individuals use the results of this study for their own branding, depending on the context in which they work.
This work was supported by JSPS KAKENHI Grant Numbers JP15K12150 and JP23K28194.
The dataset generated and analyzed in this article is publicly unavailable due to legal restrictions and privacy concerns.
Yoshinori Hijikata
Yoshinori Hijikata (Ph.D.) is a professor in the Graduate School of Information Science, University of Hyogo. In 1998, he joined IBM Research, Tokyo Research Laboratory. He was an assistant professor at the Graduate School of Engineering Science, Osaka University, from 2002–2007 and an associate professor from 2007–2017. He was an associate professor at the School of Business Administration, Kwansei Gakuin University, from 2017–2019 and a professor from 2019–2024. One of his books is “Social media study: Human society and psychology revealed by behavioral data”.
Kayako Morimoto
Kayako Morimoto is a system engineer in IBM Consulting. She received a B.B.A. degree from Kwansei Gakuin University in 2022. She studied user engagement on SNSs at Hijikata Laboratory, Kwansei Gakuin University. Her research interests are cyber psychology and digital marketing.