{"id":4559,"date":"2018-02-13T20:24:25","date_gmt":"2018-02-13T19:24:25","guid":{"rendered":"https:\/\/www.lesestunden.de\/?p=4559"},"modified":"2022-09-07T08:58:36","modified_gmt":"2022-09-07T06:58:36","slug":"worueber-schreiben-buchblogger-analyse-mit-visualisierung-und-statistiken","status":"publish","type":"post","link":"https:\/\/www.lesestunden.de\/en\/2018\/02\/what-do-book-bloggers-write-about-analysis-with-visualization-and-statistics\/","title":{"rendered":"What do book bloggers write about: Analysis with visualization and statistics"},"content":{"rendered":"\r\n<p class=\"wp-block-paragraph\">Since starting this blog, I\u2019ve repeatedly analyzed the book-blogging blogosphere and published posts with detailed statistics and visualizations. I\u2019ve already looked at <a href=\"https:\/\/www.lesestunden.de\/en\/2015\/03\/buchblogger-eine-analyse-mit-topliste-visualisierungen-und-statistiken\/\">networking<\/a>, the <a href=\"https:\/\/www.lesestunden.de\/en\/2016\/05\/woher-kommen-buchblogger-eine-neue-kleine-analyse-mit-visualisierung-und-statistiken\/\">origins of bloggers<\/a>, and <a href=\"https:\/\/www.lesestunden.de\/en\/2016\/07\/was-lesen-buchblogger-eine-neue-analyse-mit-visualisierungen-und-statistiken\/\">what book bloggers read<\/a>. This post now looks at the most interesting and at the same time most central topic of this select little circle: What do book bloggers write about? This evaluation is by far the most labor-intensive and, in terms of data, the most extensive analysis I\u2019ve done so far. The basis is nothing less than a complete copy of all articles ever published by all book blogs. This time there\u2019s a particular focus on publishers and the question of how successful their marketing work is. 
And of course, this post also includes some exciting general insights into the world of book bloggers.<\/p>\r\n\r\n\r\n\r\n<!--more-->\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">To understand the following statistics and visualizations, it\u2019s important to know what data they\u2019re based on. I processed all blogs from the <a href=\"https:\/\/www.lesestunden.de\/topliste\/\">Top List<\/a>, i.e., all book blogs that have published a post within the last six months. I developed a program that fully crawls all these blogs and loads all posts\u2014with title, article content (without images and formatting), and date\u2014into an efficiently searchable index. Then I obtained a database of all books published in Germany to date. I subsequently searched all posts for these books, once using the ISBN and once by looking for the occurrence of title and author name in the article text. The result is a list of books and information on when they were mentioned in which blog and in which post. Based on this information, I determined the results that I present in this post.<\/p>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">If you\u2019re interested in the data basis and the approach and would like to learn more, I can point you to a post that I will publish in a few days. I will make the program that forms the basis for this analysis available as open source. That way anyone can understand what I did, validate the data, create a study based on it, or simply take a look at how I proceeded.<\/p>\r\n\r\n\r\n\r\n<h3 class=\"wp-block-heading\">Blogging behavior<\/h3>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">In total, I captured <strong>616,044 blog posts<\/strong> across <strong>1,271 blogs<\/strong>. I found <strong>688,508 books<\/strong> mentioned in those posts. Bloggers typed <strong>2,188,957,324 characters<\/strong>, which corresponds to <strong>1,216,087 book pages<\/strong>, which in turn equals <strong>1,216 one-thousand-page tomes<\/strong>. 
On average, a blog article has <strong>3,553 characters<\/strong>, which is just under two book pages.<\/p>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">The following histogram shows the distribution of post lengths in a bit more detail (the X-axis shows post length; the Y-axis shows the number of posts that fall into that length range).<\/p>\r\n\r\n\r\n\r\n<figure class=\"wp-block-image\"><a href=\"https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/beitragslaenge_histogramm.jpg\"><img loading=\"lazy\" decoding=\"async\" width=\"1081\" height=\"317\" src=\"https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/beitragslaenge_histogramm.jpg\" alt=\"\" class=\"wp-image-4602\" srcset=\"https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/beitragslaenge_histogramm.jpg 1081w, https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/beitragslaenge_histogramm-300x88.jpg 300w, https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/beitragslaenge_histogramm-768x225.jpg 768w, https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/beitragslaenge_histogramm-1024x300.jpg 1024w, https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/beitragslaenge_histogramm-1080x317.jpg 1080w\" sizes=\"auto, (max-width: 959px) 688px, (max-width: 1023px) 768px, (max-width: 1279px) 848px, 100vw\" \/><\/a><\/figure>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">It\u2019s very clear that most posts have between 1,800 and 2,999 characters\u2014that is, between one and two book pages. I only considered posts with more than 600 characters. As I noticed in isolated cases, some bloggers use their blog almost like a kind of Twitter and write only a few lines there, often with completely trivial content. Very few bloggers write very extensive blog posts. It\u2019s noticeable that some blogs with longer posts indeed offer very good content. 
Below is a list of blogs with the greatest average post length:<\/p>\r\n\r\n\r\n\r\n<ul class=\"wp-block-list\"><li>14,633 characters: http:\/\/www.buchkolumne.de<\/li><li>11,914 characters: https:\/\/stefanmesch.wordpress.com<\/li><li>11,352 characters: https:\/\/phantastikon.de<\/li><li>9,096 characters: https:\/\/steglitzmind.wordpress.com<\/li><li>8,654 characters: https:\/\/radiergummi.wordpress.com<\/li><li>8,547 characters: http:\/\/buchwurm.org<\/li><li>7,944 characters: https:\/\/astrolibrium.wordpress.com<\/li><li>7,753 characters: https:\/\/postmondaen.net<\/li><li>7,692 characters: https:\/\/ivybooknerd.com<\/li><li>7,428 characters: https:\/\/theworldofvioletbooklady.blogspot.de<\/li><li>7,301 characters: https:\/\/kueckibooks.blogspot.de<\/li><li>7,261 characters: http:\/\/inkofbooks.com<\/li><li>7,202 characters: https:\/\/jljordan.net<\/li><li>7,168 characters: https:\/\/crimealleyblog.wordpress.com<\/li><li>7,135 characters: https:\/\/dasgrauesofa.com<\/li><li>7,124 characters: https:\/\/www.lesestunden.de<\/li><li>7,119 characters: https:\/\/thomasbrasch.wordpress.com<\/li><li>7,078 characters: https:\/\/literarischer.blogspot.de<\/li><li>6,998 characters: https:\/\/claudiabett.com<\/li><li>6,905 characters: https:\/\/ltrtr.wordpress.com<\/li><\/ul>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">Karla from Buchkolumne, for example, writes very few posts, but when she does, they\u2019re well-researched and rich in content. Stefan Mesch and Gesine von Prittwitz also have extensive blog posts and interviews that I remember as high-quality. And of course my blog too\u2014obviously ;)<\/p>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">The available article data allows me to determine the age of blogs very precisely. On average, blogs are 4.2 years old. 
In my <a href=\"https:\/\/www.lesestunden.de\/en\/2015\/03\/buchblogger-eine-analyse-mit-topliste-visualisierungen-und-statistiken\/\">first evaluation<\/a> at the beginning of 2015, I arrived at a value of 3 years. That may have been a measurement error (at the time I estimated the age from the first archive.org entry), or some bloggers have indeed stuck with it since then, pushing the average up. Judging by the blogs I follow, the age structure seems plausible.<\/p>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">I don\u2019t want to withhold the twenty oldest blogs from you. The first blog opened its doors in 1998, at least if you trust the article date (which I\u2019m not entirely sure about in that blog\u2019s case).<\/p>\r\n\r\n\r\n\r\n<ul class=\"wp-block-list\"><li>1998: https:\/\/www.lesenmitlinks.de<\/li><li>1999: http:\/\/www.die-leselust.de<\/li><li>1999: http:\/\/lesekreis.org<\/li><li>1999: http:\/\/buecher.ueber-alles.net<\/li><li>2000: http:\/\/www.buchrebellin.de<\/li><li>2002: https:\/\/literatenwelt.blogspot.de<\/li><li>2002: http:\/\/buchwurm.org<\/li><li>2003: https:\/\/moyasbuchgewimmel.de<\/li><li>2003: http:\/\/influenza-bookosa.de<\/li><li>2004: http:\/\/www.lesensiegut.de<\/li><li>2005: https:\/\/www.lesefieber.ch<\/li><li>2005: http:\/\/www.bonaventura.blog<\/li><li>2005: http:\/\/blog.literaturwelt.de<\/li><li>2005: https:\/\/sunsys-blog.blogspot.de<\/li><li>2005: http:\/\/www.laubet.de<\/li><li>2006: http:\/\/www.wortmax.de<\/li><li>2006: http:\/\/pinkfisch.net<\/li><li>2007: http:\/\/www.schattenwege.net<\/li><li>2007: http:\/\/www.bluecher.blog<\/li><li>2007: http:\/\/www.kossis-welt.de\/blog<\/li><\/ul>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">An exciting question is what the trend looks like. How many blogs were launched over the years? It\u2019s clearly visible that 2015 was the year of the big book-blogger hype. 
That\u2019s when book blogging made big waves in the publishing world, was suddenly discovered for marketing, and was really noticed at fairs for the first time (I\u2019m thinking of the first blogger lounge or blogger sessions at the book fairs).<\/p>\r\n\r\n\r\n\r\n<figure class=\"wp-block-image\"><a href=\"https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/anzahl_blogs.jpg\"><img loading=\"lazy\" decoding=\"async\" width=\"1093\" height=\"334\" src=\"https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/anzahl_blogs.jpg\" alt=\"\" class=\"wp-image-4599\" srcset=\"https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/anzahl_blogs.jpg 1093w, https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/anzahl_blogs-300x92.jpg 300w, https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/anzahl_blogs-768x235.jpg 768w, https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/anzahl_blogs-1024x313.jpg 1024w\" sizes=\"auto, (max-width: 959px) 688px, (max-width: 1023px) 768px, (max-width: 1279px) 848px, 100vw\" \/><\/a><\/figure>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">But it\u2019s also very clear that the trend is strongly downward. In 2015, 100 more blogs were started than in 2017. Blogging about books is apparently no longer quite as hip as it was three years ago. That prompted me to dig deeper. How active have book bloggers been over the last few years? 
Here\u2019s another chart that arranges the number of blog posts over time, grouped by month:<\/p>\r\n\r\n\r\n\r\n<figure class=\"wp-block-image\"><a href=\"https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/artikel_nach_monat.jpg\"><img loading=\"lazy\" decoding=\"async\" width=\"1092\" height=\"331\" src=\"https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/artikel_nach_monat.jpg\" alt=\"\" class=\"wp-image-4600\" srcset=\"https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/artikel_nach_monat.jpg 1092w, https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/artikel_nach_monat-300x91.jpg 300w, https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/artikel_nach_monat-768x233.jpg 768w, https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/artikel_nach_monat-1024x310.jpg 1024w\" sizes=\"auto, (max-width: 959px) 688px, (max-width: 1023px) 768px, (max-width: 1279px) 848px, 100vw\" \/><\/a><\/figure>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">This also confirms: 2015 was the big blogger year. Bloggers were most active then and published the most articles. By the end of 2017, however, the level had returned to that of mid-2014. Apparently, bloggers\u2019 motivation dropped somewhat, and after the initial euphoria, a somewhat calmer everyday blogging routine returned. From my point of view, this is also reflected in the topics. In 2015 there was a lot of discussion about blogging, and debates repeatedly flared up and were conducted with a lot of emotion (e.g., bloggers vs. the feuilleton, professionalization of book blogs, etc.). That has subsided considerably in recent months.<\/p>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">At the time I noticed that Sunday in particular is a popular day for publishing blog posts. Overall, it is indeed the most popular day, but not as significantly as I had subjectively perceived. Blog posts appear fairly evenly throughout the week. 
However, these data also contain some error, because for a few blogs I couldn\u2019t determine the exact day of publication.<\/p>\r\n\r\n\r\n\r\n<figure class=\"wp-block-image\"><a href=\"https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/artikel_nach_wochentag.jpg\"><img loading=\"lazy\" decoding=\"async\" width=\"1087\" height=\"328\" src=\"https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/artikel_nach_wochentag.jpg\" alt=\"\" class=\"wp-image-4618\" srcset=\"https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/artikel_nach_wochentag.jpg 1087w, https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/artikel_nach_wochentag-300x91.jpg 300w, https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/artikel_nach_wochentag-768x232.jpg 768w, https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/artikel_nach_wochentag-1024x309.jpg 1024w\" sizes=\"auto, (max-width: 959px) 688px, (max-width: 1023px) 768px, (max-width: 1279px) 848px, 100vw\" \/><\/a><\/figure>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">One question I asked myself was how present giveaways are in the book-blogosphere. <strong>4%<\/strong> of the posts contain the word <strong>Gewinnspiel<\/strong> (\u201cgiveaway\u201d) in their text. That doesn\u2019t necessarily mean that all of these 26,400 posts are giveaways. Some are the posts announcing the draw results; others certainly write about a book they won. Still, it\u2019s a plausible figure. Giveaways pop up now and then, but they\u2019re a rather insignificant side issue overall.<\/p>\r\n\r\n\r\n\r\n<h3 class=\"wp-block-heading\">Popular content: books and authors<\/h3>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">Of course I was curious which books and authors are very popular. I actively follow only a very small slice of book blogs, and naturally they have a reading taste similar to mine. But from browsing the Top List I already had an impression that\u2019s confirmed here once again. 
At the top of the list are YA and children\u2019s books by authors who are now familiar even to me and who also rank high on sales lists in bookstores and online shops. The number indicates how many posts the book was found in.<\/p>\r\n\r\n\r\n\r\n<ul class=\"wp-block-list\"><li>528:&nbsp;<label>The Neverending Story:<\/label>&nbsp;Michael Ende<\/li><li>461:&nbsp;<label>Shiver:<\/label>&nbsp;Maggie Stiefvater<\/li><li>424:&nbsp;<label>The Selection:<\/label>&nbsp;Kiera Cass<\/li><li>419:&nbsp;<label>Erebos:<\/label>&nbsp;Ursula Poznanski<\/li><li>393:&nbsp;<label>Sherlock Holmes:<\/label>&nbsp;Arthur Conan Doyle<\/li><li>376:&nbsp;<label>Miss Peregrine\u2019s Home for Peculiar Children:<\/label>&nbsp;Ransom Riggs<\/li><li>370:&nbsp;<label>Ruby Red:<\/label>&nbsp;Kerstin Gier<\/li><li>345:&nbsp;<label>Starters:<\/label>&nbsp;Lissa Price<\/li><li>344:&nbsp;<label>City of Bones:<\/label>&nbsp;Cassandra Clare<\/li><li>327:&nbsp;<label>Momo:<\/label>&nbsp;Michael Ende<\/li><li>326:&nbsp;<label>Splitterherz:<\/label>&nbsp;Bettina Belitz<\/li><li>325:&nbsp;<label>Graceling:<\/label>&nbsp;Kristin Cashore<\/li><li>322:&nbsp;<label>The Hunger Games:<\/label>&nbsp;Suzanne Collins<\/li><li>290:&nbsp;<label>Magie:<\/label>&nbsp;Trudi Canavan<\/li><li>289:&nbsp;<label>Linger:<\/label>&nbsp;Maggie Stiefvater<\/li><li>286:&nbsp;<label>A Monster Calls:<\/label>&nbsp;Patrick Ness<\/li><li>282:&nbsp;<label>Die Dreizehnte Fee:<\/label>&nbsp;Julia Adri\u00e1n<\/li><li>275:&nbsp;<label>Cherry Red Summer:<\/label>&nbsp;Carina Bartsch<\/li><li>269:&nbsp;<label>Sapphire Blue:<\/label>&nbsp;Kerstin Gier<\/li><li>266:&nbsp;<label>The Raven Boys:<\/label>&nbsp;Maggie Stiefvater<\/li><li>258:&nbsp;<label>Es wird keine Helden geben:<\/label>&nbsp;Anna Seidl<\/li><\/ul>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">I do need to point out the method used to find books in the articles. There is, of course, a degree of fuzziness. I search blog posts for ISBNs, but also for title and author name. 
Kiera Cass\u2019s <em>The Selection<\/em>, for example, naturally matches all three volumes of the series. Every mention is counted; the book doesn\u2019t have to have been reviewed in the post. In terms of magnitude, though, this list seems plausible to me. You really do encounter these books often on blogs.<\/p>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">The selection of most popular authors is quite similar. Again, the number is the number of mentions:<\/p>\r\n\r\n\r\n\r\n<ul class=\"wp-block-list\"><li>1,937: Maggie Stiefvater<\/li><li>1,525: Cassandra Clare<\/li><li>1,427: Nina Blazon<\/li><li>1,073: J. R. Ward<\/li><li>1,015: Ursula Poznanski<\/li><li>978: Jennifer Estep<\/li><li>963: Michael Ende<\/li><li>936: Bettina Belitz<\/li><li>920: Richard Bachman<\/li><li>867: Abbi Glines<\/li><li>848: Simon Beckett<\/li><li>844: Cecelia Ahern<\/li><li>776: Kerstin Gier<\/li><li>694: Kiera Cass<\/li><li>689: Christoph Marzi<\/li><li>687: J. Lynn<\/li><li>686: Jennifer Benkau<\/li><li>674: Cornelia Funke<\/li><li>640: Karin Slaughter<\/li><li>608: Samantha Young<\/li><\/ul>\r\n\r\n\r\n\r\n<h3 class=\"wp-block-heading\">Similarity of blog content<\/h3>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">Given the volume of blog posts, the number of mentions that even the most popular books receive is very low. I wondered how large the overlap in books and authors is between blogs. Do all blogs read the same books and authors? To find out, I calculated for every blog how similar it is to each other blog in its choice of books. The similarity score between two blogs ranges from 0 (no similarity) to 100 (very similar). If the exact same book is mentioned in both blogs, 15 points are added; if the same author is mentioned, I add 5 points. The number of books and authors of the blog with fewer books determines the maximum achievable score. 
I then divided the points actually achieved by the maximum possible score, yielding a value between 0 and 1, which I scale to a range of 0 to 100. A high value indicates that both blogs discuss many identical books and authors.<\/p>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">The following visualization shows a matrix in which all blogs are listed on the X and Y axes. The higher the similarity value, the darker the dot at the intersection between two blogs. I sorted very similar blogs to the beginning. You can click the image to open it at full resolution. Be warned, though: the file is very large.<\/p>\r\n\r\n\r\n\r\n<figure class=\"wp-block-image alignwide\"><a href=\"https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/matrix.png#\"><img decoding=\"async\" src=\"https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/matrix.jpg\" alt=\"\" class=\"wp-image-4576\"\/><\/a><\/figure>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">At first glance you can see that overlaps between blogs are very small. There are a few blogs that are very similar in content (visible in the somewhat darker upper-left corner), but the vast majority each have their own selection of books. An average content overlap of 6 (out of a possible 100) makes this even clearer. Bloggers make a very individual selection when it comes to their reading. This confirms what I already found in my evaluation <a href=\"https:\/\/www.lesestunden.de\/en\/2016\/07\/was-lesen-buchblogger-eine-neue-analyse-mit-visualisierungen-und-statistiken\/\">What do book bloggers read<\/a>: A book is read by only a few book bloggers.<\/p>\r\n\r\n\r\n\r\n<h3 class=\"wp-block-heading\">Genres<\/h3>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">Does this great diversity also indicate a kind of perfect literary variety? Do bloggers really cover such a wide thematic range? Not at all, as a closer look at the genres of the books found reveals. To determine genres, I retrieved Amazon\u2019s categories for all the books found. 
A book can belong to multiple categories. I then grouped the categories\u2014which Amazon catalogues excellently and very granularly\u2014accordingly. The following bar chart shows how many of the books found can be assigned to each genre. 15% children\u2019s books means that 15% of all individual hits can be assigned to this genre.<\/p>\r\n\r\n\r\n\r\n<figure class=\"wp-block-image\"><a href=\"https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/top_genres.jpg\"><img loading=\"lazy\" decoding=\"async\" width=\"1093\" height=\"500\" src=\"https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/top_genres.jpg\" alt=\"\" class=\"wp-image-4605\" srcset=\"https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/top_genres.jpg 1093w, https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/top_genres-300x137.jpg 300w, https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/top_genres-768x351.jpg 768w, https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/top_genres-1024x468.jpg 1024w\" sizes=\"auto, (max-width: 959px) 688px, (max-width: 1023px) 768px, (max-width: 1279px) 848px, 100vw\" \/><\/a><\/figure>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">This also confirms what I found in my previous evaluation <a href=\"https:\/\/www.lesestunden.de\/en\/2016\/07\/was-lesen-buchblogger-eine-neue-analyse-mit-visualisierungen-und-statistiken\/\">What do book bloggers read<\/a>: YA is unchallenged in first place. The book-blogging scene is dominated primarily by genre and popular fiction. I would also attribute the above-average share of YA and children\u2019s books to the age structure of bloggers. Without being able to back it up with numbers, it\u2019s noticeable when surfing book blogs that bloggers are primarily on the younger side.<\/p>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">Classics, at 0.56%, are a marginal phenomenon. Contemporary literature is primarily located under \u201cOther\u201d at 8.5%. The feuilleton vs. 
book blogger debate is once again countered here. There may be a few blogs that focus on highbrow literature\u2014and I know some that are truly excellent\u2014but the majority of book bloggers primarily devote themselves to entertainment and place little value on intellectual challenges.<\/p>\r\n\r\n\r\n\r\n<h3 class=\"wp-block-heading\">Publishers<\/h3>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">For publishers, book bloggers have been an interesting target group for various marketing activities in recent years. That ultimately fueled the professionalization debate, and many bloggers reflected on the value of their blogs. I asked myself how strongly individual publishers are represented on blogs. Based on the data, that\u2019s not so simple. For my book database I used the database of the German National Library. Its holdings are very extensive, but not all editions are listed. For example, Tad Williams\u2019s \u201cDer Drachenbeinthron\u201d appeared at Kr\u00fcger, at Fischer, and in the newest edition at Klett-Cotta. The latter, however, is not found in the DNB database. 
For the evaluation, this means that statements regarding title, author, and later also genre\/category are reliable, whereas statements about the publisher are reliable only for hits identified via ISBN.<\/p>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">The distribution of publishers across all book hits (i.e., with the described source of error) looks like this:<\/p>\r\n\r\n\r\n\r\n<figure class=\"wp-block-image\"><a href=\"https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/verlage_absolut.jpg\"><img loading=\"lazy\" decoding=\"async\" width=\"1046\" height=\"2423\" src=\"https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/verlage_absolut.jpg\" alt=\"\" class=\"wp-image-4606\" srcset=\"https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/verlage_absolut.jpg 1046w, https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/verlage_absolut-130x300.jpg 130w, https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/verlage_absolut-768x1779.jpg 768w, https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/verlage_absolut-442x1024.jpg 442w\" sizes=\"auto, (max-width: 959px) 688px, (max-width: 1023px) 768px, (max-width: 1279px) 848px, 100vw\" \/><\/a><\/figure>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">This mirrors the observation from the similarity analysis: there is no publisher that dominates book blogs or is excessively present. 
The books read by bloggers cover a very wide range of publishers, which corresponds to the diversity in their choice of books.<\/p>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">Restricting to hits found via ISBN\u2014where the publisher can be determined reliably\u2014shows a very similar distribution and leads to identical conclusions:<\/p>\r\n\r\n\r\n\r\n<figure class=\"wp-block-image\"><a href=\"https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/verlage_isbn.jpg\"><img loading=\"lazy\" decoding=\"async\" width=\"1052\" height=\"2364\" src=\"https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/verlage_isbn.jpg\" alt=\"\" class=\"wp-image-4607\" srcset=\"https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/verlage_isbn.jpg 1052w, https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/verlage_isbn-134x300.jpg 134w, https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/verlage_isbn-768x1726.jpg 768w, https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/verlage_isbn-456x1024.jpg 456w\" sizes=\"auto, (max-width: 959px) 688px, (max-width: 1023px) 768px, (max-width: 1279px) 848px, 100vw\" \/><\/a><\/figure>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">In a final step, I searched blog posts for publisher names and determined how often a publisher was mentioned. 
Here, too, there is no publisher that outshines all others and is far more important to book bloggers than any other house.<\/p>\r\n\r\n\r\n\r\n<figure class=\"wp-block-image\"><a href=\"https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/verlage_name.jpg\"><img loading=\"lazy\" decoding=\"async\" width=\"1061\" height=\"2430\" src=\"https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/verlage_name.jpg\" alt=\"\" class=\"wp-image-4608\" srcset=\"https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/verlage_name.jpg 1061w, https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/verlage_name-131x300.jpg 131w, https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/verlage_name-768x1759.jpg 768w, https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/verlage_name-447x1024.jpg 447w\" sizes=\"auto, (max-width: 959px) 688px, (max-width: 1023px) 768px, (max-width: 1279px) 848px, 100vw\" \/><\/a><\/figure>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">But how successful are publishers\u2019 marketing activities really? To explore this, I took a closer look at the Random House publishing group. On March 2, 2015, the group launched its <a href=\"https:\/\/blogger.randomhouse.de\/\">blogger portal<\/a>, creating a platform through which bloggers can easily get in touch and order review copies with just a few clicks. 
How effective was\u2014and is\u2014this marketing strategy?<\/p>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">To begin, I determined in absolute numbers the books mentioned from Random House imprints versus other publishers:<\/p>\r\n\r\n\r\n\r\n<figure class=\"wp-block-image\"><a href=\"https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/randomhouse_vs_andere_numerisch.jpg\"><img loading=\"lazy\" decoding=\"async\" width=\"1092\" height=\"328\" src=\"https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/randomhouse_vs_andere_numerisch.jpg\" alt=\"\" class=\"wp-image-4604\" srcset=\"https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/randomhouse_vs_andere_numerisch.jpg 1092w, https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/randomhouse_vs_andere_numerisch-300x90.jpg 300w, https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/randomhouse_vs_andere_numerisch-768x231.jpg 768w, https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/randomhouse_vs_andere_numerisch-1024x308.jpg 1024w\" sizes=\"auto, (max-width: 959px) 688px, (max-width: 1023px) 768px, (max-width: 1279px) 848px, 100vw\" \/><\/a><\/figure>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">You can clearly see how the number of books from the Random House group discussed jumped\u2014from 651 to 980 books. However, the total number of books discussed also shot up. 
To see how strongly Random House imprints were represented, I calculated their share in percent, which makes for a much more meaningful chart:<\/p>\r\n\r\n\r\n\r\n<figure class=\"wp-block-image\"><a href=\"https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/randomhouse_vs_andere_anteile.jpg\"><img loading=\"lazy\" decoding=\"async\" width=\"1092\" height=\"328\" src=\"https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/randomhouse_vs_andere_anteile.jpg\" alt=\"\" class=\"wp-image-4603\" srcset=\"https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/randomhouse_vs_andere_anteile.jpg 1092w, https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/randomhouse_vs_andere_anteile-300x90.jpg 300w, https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/randomhouse_vs_andere_anteile-768x231.jpg 768w, https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/randomhouse_vs_andere_anteile-1024x308.jpg 1024w\" sizes=\"auto, (max-width: 959px) 688px, (max-width: 1023px) 768px, (max-width: 1279px) 848px, 100vw\" \/><\/a><\/figure>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">This shows that in March, compared to February, there was no effect yet. Books from Random House imprints were equally present at 9% in February and March. 2015 was the big year of book blogging, and the overall volume of articles rose sharply. Over the course of the year, however, the share grew: from April it increased by two percentage points, and in September it reached a full 15%. Of course, bloggers first needed time to read, and it took a while for the portal to spread by word of mouth. It\u2019s also interesting that by the end of the year this effect dissipated again.<\/p>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">From my point of view, all of these evaluations make it very clear that book bloggers are influenced only to a limited extent by publishers\u2019 marketing campaigns. They choose their books according to their own taste. 
I consider the fear that book bloggers let themselves be \u201cbought\u201d with review copies to be at least partly refuted by this evaluation. If all book bloggers were that easily bought, this blogger portal would have had a much greater impact, because Random House makes it very easy for bloggers to get their books. However, it\u2019s possible that the Random House group has a very restrictive policy when awarding review copies. Or the activities of other publishers dampened the effect; after all, book bloggers have limited capacity and are also specifically integrated into other houses\u2019 marketing.<\/p>\r\n\r\n\r\n\r\n<h3 class=\"wp-block-heading\">Sentiment analysis<\/h3>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">Once I had the full text of all blog articles, I wondered whether I could determine how positive or negative the judgments expressed in them are. Book bloggers are often accused of being uncritical. There is in fact a method from machine-learning-based text processing that can measure this. The keyword here is <em>sentiment analysis<\/em>, i.e., detecting the mood of a text. For this I used a library that analyzes the mood of blog articles using the sentiment lexicon that Leipzig University provides <a href=\"http:\/\/wortschatz.uni-leipzig.de\/de\/download\">for download<\/a>. For each post I calculated a sentiment score and then averaged it across all blog posts. A negative score indicates a post with a negative mood; a positive score indicates a correspondingly positive review. The value typically ranges between \u22124 (very negative) and 4 (very positive).<\/p>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">For all blogs combined, the result is a <strong>score of 3.15<\/strong>, which is very, very positive. That fits past evaluations, in which I already found that book bloggers rate very positively. It also seems very plausible to me, because book bloggers write about books they like and share their enthusiasm. 
Once again, this confirms a widely held assumption. Publishers therefore have little reason to fear a negative review when a blogger takes on one of their books.<\/p>\r\n\r\n\r\n\r\n<h3 class=\"wp-block-heading\">Reliability of the data<\/h3>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">I\u2019ve shown several visualizations and statistics, so how reliable are these values and comparisons? To answer that, you have to consider the processing steps by which the data were collected. Searching for books in posts is not an exact method and involves some fuzziness. For one thing, a hit doesn\u2019t necessarily mean the book was reviewed. Some bloggers maintain overviews that list read and unread books over and over, and each listing is found and counted again. In addition, a title doesn\u2019t always appear in a post exactly as it is recorded in the German National Library database. Spot checks showed that the hit rate is very good, but if a book is titled \u201cSelection \u2013 Die Krone,\u201d that exact character string must appear in the post to register a hit. There are also hits in the audiobook segment and among ebooks. For example, Random House published all ebooks under one imprint, which therefore has an above-average number of hits. Naturally, some books are found twice when multiple editions exist. I eliminated such duplicate hits as best I could.<\/p>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">Books in other languages are completely excluded. Here I rely entirely on the German National Library\u2019s database, which does include foreign-language books but focuses on the German-speaking world.<\/p>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">The algorithm for scraping blog posts is also subject to inaccuracies. I ran spot checks and verified that blogs with very many or very few posts were processed correctly. But I didn\u2019t do this for all blogs. 
It\u2019s therefore entirely possible that some posts were missed, while others were captured twice.<\/p>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">Since the results match previous evaluations very well and numerous findings are confirmed again, I don\u2019t believe there\u2019s an error large enough to change the conclusions. The publisher breakdown illustrates this: even when the method of evaluation varies, the outcomes are similar. However, there are certainly inaccuracies in the exact values. To be as transparent as possible, I will also release the source code for this evaluation. Anyone who wants to conduct further analyses or scrutinize the results closely is free to do so. Perhaps some of the statements I make here can even be refuted.<\/p>\r\n\r\n\r\n\r\n<h3 class=\"wp-block-heading\">Conclusion<\/h3>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">A closer analysis of blog articles shows that the range of books discussed is very broad and that bloggers make a very individual selection. There is little overlap in reading choices, and aside from a handful of slightly more popular books and authors, each blogger makes their own picks. This confirms the results of past evaluations. However, the range in terms of genre is quite narrow. The focus is clearly on YA, children\u2019s books, and popular fiction. Demanding literature is a marginal phenomenon here, and the average post length of one to two book pages shows that extensive, in-depth discussion is likewise the exception.<\/p>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">Publishers are very evenly represented, and there is no single publishing house that dominates the book-blogging scene. The major trade publishers do have a higher share, but overall the spectrum is broad. 
As a closer look at the Random House group and the launch of its blogger portal shows, bloggers can indeed be influenced by successful marketing, but the effect is small and not lasting.<\/p>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">2015 was clearly the big year of book blogging. Many blogs were created then, and the most blog articles were written during that period. That\u2019s not surprising\u2014at that time the literary world suddenly paid a great deal of attention to book blogs. But the big hype has subsided again, and a distinct downward trend is visible. In 2017 the volume of articles fell back to the 2014 level before the big peak, and only about half as many new blogs were created in 2017 as in 2015.<\/p>\r\n\r\n\r\n\r\n<p class=\"wp-block-paragraph\">There\u2019s still a lot of movement in this community. Even though book bloggers have already carved out a place in the literary world, it remains to be seen how sustainable this trend is. By the numbers, book blogs are a niche that still has plenty of potential in terms of quantitative output, quality, and thematic breadth. I\u2019m curious to see how the book-blogosphere will continue to change in the coming years.<\/p>\r\n\r\n","protected":false},"excerpt":{"rendered":"<p>Since starting this blog, I\u2019ve repeatedly analyzed the book-blogging blogosphere and published posts with detailed statistics and visualizations. I\u2019ve already looked at networking, the origins of bloggers, and what book bloggers read. 
This post now looks at the most interesting and at the same time most central topic of this select little circle: What do &hellip; <\/p>\n<p><a href=\"https:\/\/www.lesestunden.de\/en\/2018\/02\/what-do-book-bloggers-write-about-analysis-with-visualization-and-statistics\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;What do book bloggers write about: Analysis with visualization and statistics&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":4583,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"slim_seo":{"title":"Wor\u00fcber schreiben Buchblogger: Analyse mit Visualisierung und Statistiken - lesestunden","description":"Seit ich diesen Blog habe, analysiere ich auch immer wieder die Blogosph\u00e4re der Buchblogger und ver\u00f6ffentliche dazu Beitr\u00e4ge mit detaillierten Statistiken und V"},"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_feature_clip_id":0,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_post_was_ever_published":false},"categories":[154],"tags":[],"class_list":["post-4559","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-book-blogosphere"],"jetpack_featured_media_url":"https:\/\/www.lesestunden.de\/wp-content\/uploads\/2018\/02\/worueber_bloggen_buchblogger_beitrag.jpg","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/www.lesestunden.de\/en\/wp-json\/wp\/v2\/posts\/4559","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.lesestunden.de\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.lesestunden.de\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.lesestunden.de\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.lese
stunden.de\/en\/wp-json\/wp\/v2\/comments?post=4559"}],"version-history":[{"count":0,"href":"https:\/\/www.lesestunden.de\/en\/wp-json\/wp\/v2\/posts\/4559\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.lesestunden.de\/en\/wp-json\/wp\/v2\/media\/4583"}],"wp:attachment":[{"href":"https:\/\/www.lesestunden.de\/en\/wp-json\/wp\/v2\/media?parent=4559"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.lesestunden.de\/en\/wp-json\/wp\/v2\/categories?post=4559"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.lesestunden.de\/en\/wp-json\/wp\/v2\/tags?post=4559"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}