It’s nearly impossible to have a meaningful discussion on the issue of media piracy. Strong opinions and anecdotal evidence dominate every conversation. There is seldom any hard data to back up the various claims of damage or lack thereof.
The recent New York Times piece on book piracy is typical of the kind of coverage we’ve come to expect from major news sources. The story is long on speculation and short on deep thinking or meaningful data.
Meanwhile, O’Reilly Media has just published a new research report on the Impact of P2P and Free Distribution on Book Sales. Written and researched by Brian O’Leary, the report is an all too rare attempt to quantify the impact that various types of freely available content have on sales.
Free content has long been used to promote all forms of media. Is it possible that pirated content might serve a similar role in promoting the purchase of content? O’Leary’s early results seem to indicate that might be the case.
In this podcast I talk with O’Leary about his research. A full transcript of our talk will be available in the next couple of days.
Kirk Biglione: This is Kirk Biglione with medialoper.com, and I’m talking today with Brian O’Leary of Magellan Media Partners. Brian has just authored an interesting new report titled “The Impact of P2P and Free Distribution on Book Sales.” Brian, thanks for being with us today. This is an interesting report. Can you give us an overview?
Brian O’Leary: Sure. Well, it started out actually as looking purely at the impact of Peer-to-Peer or P2P networks on book sales. O’Reilly, which is the publisher, had an interest in opening up a conversation about whether piracy had an impact and, if it did, what kind of an impact, how significant, and did it vary across titles? We actually were lucky enough to get Random House to join in, and they contributed the results of some experiments that they’ve been conducting on the use of free content and its impact on essentially promoting paid sales in other arenas.
We really approached it as being neither anti- nor pro-piracy; there’s a lot of debate about this. We really just wanted to understand what practices would help boost overall publishing revenues, because we were looking at both pirated content and authorized tests of free content distribution. We applied a methodology that was pretty consistent and tied to actual sales data. We looked at the sales four weeks prior to either a pirated copy first being spotted on one of the networks or the start of a free test, and then we looked at sales during and after the distribution.
Generally, it was relatively easy to do because the sales data, as you know, is collected pretty regularly, weekly and in some cases daily, by title, for publishers like O’Reilly and Random.
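The before-and-after comparison O’Leary describes can be sketched in a few lines of Python. This is only a minimal illustration, assuming weekly unit sales for one title are held in a list; the function name `four_week_lift` and the sample figures are hypothetical and do not come from the report.

```python
from statistics import mean


def four_week_lift(weekly_sales, event_week, window=4):
    """Percent change in average weekly sales after an event.

    weekly_sales: unit sales per week for one title (hypothetical layout).
    event_week:   index of the first week after the pirated copy or free
                  test was first observed.
    window:       number of weeks compared on each side (four in the interview).
    """
    if event_week < window or event_week + window > len(weekly_sales):
        raise ValueError("not enough weeks of data around the event")
    before = weekly_sales[event_week - window:event_week]
    after = weekly_sales[event_week:event_week + window]
    return (mean(after) - mean(before)) / mean(before) * 100


# Made-up figures, for illustration only.
sample_sales = [100, 100, 100, 100, 120, 110, 115, 115]
sample_lift = four_week_lift(sample_sales, event_week=4)
```

A positive result only says that sales were higher after the event than before it. As the interview stresses, that is correlation, not causation.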
Biglione: And so, I had been thinking about this as kind of a study of piracy but really, you kind of mix two different channels for free content, one being the legitimate channel where it’s willingly provided as a kind of promo content and the other being the traditional Peer-to-Peer or illegal download sites.
O’Leary: We actually, to be honest, kind of backed into that. We initially started purely with piracy, and when we approached Random, they were very interested in having a methodology to help them better understand the impact of free. They had done some tests on their own but they had not necessarily applied this methodology. It’s a little bit easier with O’Reilly because, led by Tim O’Reilly, they have a fairly open view toward piracy. As Tim O’Reilly calls it, it’s essentially progressive taxation: more successful books are probably pirated more, but they also gain more and can afford it.
At Random, I think, the mindset is that piracy is potentially a lost sale, and you also have to negotiate it author by author. So they were more comfortable joining with simply the free distribution that they had direct control over and had already worked out with authors. However, one of the things that I think we either discovered or developed in the course of this report is a model that suggests that it’s not either/or. I mean, people have tended to think of piracy as theft, and of free content distribution, whether it’s in print form like an ARC or e-galley, or in digital form as PDFs or eBooks and the like, as marketing.
We found that, in fact, in the middle there’s a gray area or a gray market: it could be theft, but in many cases even Peer-to-Peer distribution seems to play at least some marketing role.
Biglione: That’s an interesting point. I think the discussion of piracy tends to get hung up on the morality of theft of intellectual property rights. In many cases, the discussion kind of ends there without looking deeper into why piracy exists, what factors affect piracy, what impact piracy might have on legitimate sales. Is that something you’re hoping to be able to uncover with further research?
O’Leary: If we could do one thing for the good of humanity, it would be to take a little bit of the sting out of this debate and really focus in on what works. One of the things that we tried really diligently to focus in on when we were writing as well as developing the presentation materials is the notion that free is not new. Publishers have been giving away content in different ways for as long as content has been created, and it’s an effective marketing mechanism. People want the sample, they want a sense of it. It works in a variety of different print and digital means. And you think about how Amazon has sold some of its books on the Kindle as the most recent example.
So free content or free digital content available to help promote a title is not necessarily a bad thing. What we were trying to figure out is what works. I mean, what helps boost revenues. And so we looked very carefully at a collection in the piracy case. We looked at all of O’Reilly’s front list, more than five dozen titles for 2008, and we found those that had been posted on any one of the set of Peer-to-Peer sites that we were tracking.
Then we looked at the impact on their overall sales: how they were selling before they were pirated and how they sold after. We found a number of interesting things on that front, and we did the same for the free part. But in both cases, we found that in general (not causation, but at least correlation) sales grew after free content was distributed, whether it was pirated or deliberate.
Biglione: Now, on those O’Reilly titles, I know O’Reilly is very aggressive about releasing digital and also releasing digital without DRM. This was 2008, so this is last year. Were these titles that were available from O’Reilly in their eBook bundles, and do you know if the versions that were on the pirate networks were from the original source? Is that something you looked at as well?
O’Leary: We weren’t able to go through and identify the source. But generally, they were digital so we would assume that they came from one or more of the digital sources that O’Reilly — they have an eBook bundle and it’s been expanding since the middle of 2008, they now offer it in ePub, Kindle as well as PDF.
In general though, there was no pattern that we could discern, partly because, of the 60-plus front list titles published in 2008 that we were looking for, we found only eight had been pirated on Peer-to-Peer sites in 2008. So it’s really hard to draw a conclusion from just eight titles. But that in and of itself is an interesting thing because there was this notion that everything that O’Reilly publishes is immediately posted on Peer-to-Peer sites. That was even something that we were told when we did some internal interviews at O’Reilly, that piracy was rampant.
When we did the research, we found that it was not, actually. It was a minority of the titles. And there was another component that’s spelled out in the paper: the average lag time between when a title was published and when one of those eight titles appeared on a Peer-to-Peer site was 20 weeks. That’s almost five months, so it was neither prevalent nor immediate.
Biglione: And you would expect for the titles that O’Reilly is publishing (and this is probably why they assumed internally that they had a high degree of piracy of their titles), because they have a more technical customer base, that those would be the kind of books that would be pirated, especially if they’re easily available in non-DRM format without any scanning required. You would think that those would be widely available on pirate networks. That really is surprising. So you’re saying the overall effect for the O’Reilly titles that showed up on the pirate networks was an increase in sales?
O’Leary: It was actually. Again, it’s a small sample set and we’re not calling it causation, just correlation.
Biglione: That’s the point I wanted to make clear. You’re not saying that it’s definitely tied to the free availability but you’re noticing that in this particular instance.
O’Leary: Yes. The average proceed sales, meaning that from the time that we first saw the appearance of a file on a Peer-to-Peer site, in the four weeks following, the average proceed sales were 6.5% higher than the four weeks prior.
Biglione: And what did you find for the Random House titles which were presumably given away free on their website or through some other channel?
O’Leary: Yes. Generally, we’re talking about eight different tests with 12 different total formats, sometimes on their own site, sometimes on a hosted site. In this case we looked at both the promotional period, which is typically about a week long (in three cases it was longer, up to three weeks), and the post-promotional period.
During the promotional period sales were up strongly, about 19%. And during the promotional plus post-promotional periods, I mean that if you’re promoted for a week, we would then measure that week plus the four weeks following. If you’re promoted for three weeks, we’d measure those three weeks plus the four weeks following, so a total of seven.
So it varied by title, but overall sales were up 6.5% during the promotional and post-promotional periods. And that was per week, so if the tests were longer, we would look at the lift per week, not the lift overall so that you wouldn’t get a favoritism for promotions that went on longer.
Overall then, it was positive, and there is a range, I mean in both the Random and the O’Reilly data, with a little bit tighter range for the Peer-to-Peer, where sales ran from 18% up to about 33% down. But on average, they were higher.
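The reason for measuring lift per week rather than in total can be shown with a small Python sketch. It assumes a four-week baseline and a measured period made up of the promotional weeks plus the four weeks that follow, as described above; the function names and all sales figures are hypothetical.

```python
def total_extra_units(baseline_weeks, measured_weeks):
    """Units sold above what the baseline weekly rate would predict."""
    base = sum(baseline_weeks) / len(baseline_weeks)
    return sum(measured_weeks) - base * len(measured_weeks)


def lift_per_week_pct(baseline_weeks, measured_weeks):
    """Percent change in average weekly sales versus the baseline weeks."""
    base = sum(baseline_weeks) / len(baseline_weeks)
    avg = sum(measured_weeks) / len(measured_weeks)
    return (avg - base) / base * 100


# Made-up figures: the same baseline, two promotions of different length.
baseline = [200, 200, 200, 200]
one_week_promo = [260, 210, 210, 210, 210]               # 1 promo week + 4 after
three_week_promo = [260, 260, 260, 190, 190, 190, 190]   # 3 promo weeks + 4 after

extra_short = total_extra_units(baseline, one_week_promo)
extra_long = total_extra_units(baseline, three_week_promo)
lift_short = lift_per_week_pct(baseline, one_week_promo)
lift_long = lift_per_week_pct(baseline, three_week_promo)
```

In this invented example the longer promotion shows more extra units in total simply because more weeks are counted, while the per-week lift is the same for both. That is the favoritism toward longer promotions that a per-week measure avoids.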
Biglione: It seems like with enough research, you could come up with some kind of formula that publishers could use to make intelligent decisions about how to use free content or even pirated content. As you were saying, free content has been around forever, I think in all forms of media, not just books.
There is usually a cost associated with that; you know, printing costs money, distribution costs money. Physical products end up in a secondary market, used books end up in used bookstores, and arguably those offset sales as well. If you get to the point that we have enough data, do you think that you’re going to get to the point where you can determine causation rather than correlation, or do you think that would just require too much data?
O’Leary: Well, the answer’s yes and yes. Yes, do I think we can get enough data so that we can start to draw some conclusions? Absolutely, because what you can do is focus on different aspects of it. For example, looking at the impact on front list versus back list promotion, or looking at specific genres or formats.
So if you focus in on that, you can get enough data points by correlating results from different publishers or across time to be able to draw reasonable conclusions. But the reality is we have two dozen different things that people have wanted to test and every time we go out and talk to folks, they get interested and suggest other ideas or segments that they’d like to test.
To get to a critical mass on all of those is tough. It’s tough on two fronts: One is that it takes time and energy and some negotiation to get publishers to participate. But the other is these tests take time to — generally, you know, four, six, eight months lead time to organize, conduct and then analyze. And even with lots of different publishers and every year you’re going to get perhaps several dozen to join in.
One thing that might be useful and we haven’t made any progress on this, is to have essentially a standardized form where publishers could submit their results, perhaps even anonymously, and then characterize the title by title tests in ways that could be rolled up in aggregate. That would at least distribute the work and maybe make it a little bit faster and bigger, but it’s a challenge.
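As a rough illustration of what such a standardized, anonymized submission might look like, here is a minimal Python sketch. No such form exists according to the interview, so every field name below is an assumption, loosely based on the segments O’Leary mentions (front list versus back list, genre, format).

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class TitleExperiment:
    """One title-level test result. All fields are hypothetical."""
    publisher_id: str       # anonymized label, not a real publisher name
    list_type: str          # "front" or "back"
    genre: str
    fmt: str                # e.g. "pdf", "ebook", "print"
    channel: str            # "p2p" or "authorized_free"
    weekly_lift_pct: float  # per-week sales lift versus the baseline


def average_lift_by(results, field):
    """Roll title-level results up into an average lift per value of `field`."""
    groups = defaultdict(list)
    for result in results:
        groups[getattr(result, field)].append(result.weekly_lift_pct)
    return {key: mean(values) for key, values in groups.items()}


# Made-up submissions from two anonymous publishers.
submissions = [
    TitleExperiment("pub-A", "front", "technical", "pdf", "p2p", 10.0),
    TitleExperiment("pub-A", "front", "technical", "ebook", "p2p", 20.0),
    TitleExperiment("pub-B", "back", "fiction", "pdf", "authorized_free", -5.0),
]
by_list_type = average_lift_by(submissions, "list_type")
```

Recording each test with the same fields is what would let results from different publishers be rolled up in aggregate without identifying who contributed them.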
Biglione: You’ve talked a little bit about — well, we’ve talked before this conversation about the challenges that you’ve had in getting buy-in or participation from publishers. Can you talk just a little bit about some of those issues, the challenge with getting people to actually — publishers to participate in this meaningful gathering of data that actually has a significant impact on their business as we move into the digital realm?
O’Leary: Well, I think it is an issue and I think it will continue to be an issue. O’Reilly and Random had been really helpful and really generous in both giving their staff time as well as direction.
They’re both profit-making companies and they compete in a market space where, for the most part, unit sales of books are not growing at any significant rate. And to the extent that they feel that they are sharing something that conveys a competitive advantage, they want to have their arms around that. That puts a little bit of overhead on my side in making sure that the participants are comfortable and the like.
Also, finding new participants, which is the thing that you have to do to grow the number of data points, takes time and energy. People need to trust me, for example, that I’m going to act in a way that they’re comfortable with, that they can defend to their bosses, that I’m not going to do something that’s going to put them at a disadvantage, and building those relationships also takes time.
I think that it might make sense down the road, or maybe even in the next phase of this project, to be looking at building something like a library of data, to build a consortium or work with smaller sets of publishers, small publishers overall. Because I think the smaller the publisher, even if they only have a few data points that they can contribute, the risk for them feels less. With larger publishers competing on a much bigger plane, there’s more handholding involved.
They can contribute very valuable results but there’s overhead involved in getting them to participate and do so comfortably.
Biglione: Now, you’ve talked a little about how you’ve built a framework for doing this research. Is this something you could hand off to publishers for them maybe to do the data gathering on their own?
O’Leary: It is. I don’t mean to undercut my own work, and I certainly would love for people to take a look at the paper and judge for themselves. But it’s not that hard. I mean, what you want to do are two things. One is to make sure that you collect data at a granular level about the type of book that you’re testing and the type of test that you’re conducting, so that you can compare and contrast results reasonably well.
The second is you want to employ a consistent methodology. I used that four weeks before, four weeks after, as well as, for the free distribution, during the promotional period. That’s essentially built on something like a Barnes & Noble cooperative marketing analysis framework. But that actually informed how we thought about piracy and the free content issue, if you thought about it as a marketing expense or a marketing effort.
So that worked for us. You could make the period longer; you could make it shorter. That’s not too hard to do and it probably wouldn’t show dramatically different results, although on the free distribution the shorter you make the post-review period, the higher the impact of the promotion itself. But that can be handed off and it might not be a bad way to go.
Biglione: Then hopefully if publishers were doing something like that, well, you would assume they would feel more comfortable potentially doing something like that. But hopefully, they would share their data in some way so that you could aggregate it all across publishers.
O’Leary: Agreed. It’s a funny thing when you do research, especially in an area of the business where everybody knows the benefit, but we’re not an association; we are not the Book Industry Study Group or the Association of American Publishers. Working as independent consultants, I’d like to say that this is a for-profit enterprise, but this paper is probably closer to a labor of love.
That’s okay, but without the association umbrella, it’s hard for folks to feel comfortable, and maybe one of the other things is not just to hand it off to publishers but also to get the engagement of a neutral, or substantially neutral, part of the industry to take on ownership and direction even if we continue to do the work.
Biglione: I think in the paper you talk about how this is — you’re capturing this data at a point in time but obviously we are seeing an evolution in the market place as consumers increasingly move towards digital. And that could have an impact on what your findings are as well. So this seems like it’s something that needs to take place on an ongoing basis.
O’Leary: I think it does. I think it’s important to put a stake in the ground. One of the things that we liked about the Rough Cut kind of approach is that we are not claiming this is the last word, but it’s at least something that’s akin to a baseline or a snapshot of where the industry is right now. There’s no question that there’s more digital content now than there ever has been. There are also better eBook readers, which are having the effect of pulling more digital content into the marketplace. But simultaneously, we’ve conflated that with the notion that there’s a piracy threat.
There’s a fair amount of debate, and I don’t want to take you off topic, about whether the music industry was hurt as badly as people presumed with respect to piracy. But the structure and the impact of piracy in the music business is very different from what we see for the book business; that’s touched upon in the paper as well.
So what we were trying to do is get at two things. One was, what’s it doing right now? And then, by continuing to test as you suggest, how is it changing? Because if the market goes from 1% eBooks to 10% eBooks and the prevalence of digital content grows significantly, we’d like to know: does this mean that digital marketing now is better, has essentially a multiplier effect, or is it a higher-risk thing because we put it out there and people can immediately supplant a paid copy with a digital copy?
No one knows the answer to that right now and we can’t project, but we think that by starting now, we at least can say, all right, things are shifting, and they could be shifting in your favor, if you’re a publisher, or they could be shifting against you.
Biglione: No one really knows, as you’re saying, what the uptake is going to be, how fast it’s going to happen. I think there is some sense that if you download a book from the internet it could lead you to buy a printed copy because obviously, reading from your computer isn’t the most enjoyable reading experience. As you were saying, with more e-readers out there, it becomes easier for someone to transfer to a portable device and it is a pretty good reading experience.
Then the other side of that is that as you have more content available for those devices, there’s this whole concept that — I can never figure out whether it’s Tim O’Reilly or Cory Doctorow who said that — I hear this attributed to either one of them at various times depending on who’s quoting them — “I’m more afraid of obscurity than piracy”, and there is this whole issue of discovery in the mass of available content, which is just getting greater and greater.
O’Leary: Well I did understand that to be Tim O’Reilly’s quote initially, but I’m perfectly willing to kick it down the street and say, I don’t know either.
Biglione: I always thought it was Tim O’Reilly’s. I first heard him saying it at TOC, but then after that, I’ve heard a lot of people attribute it to Cory Doctorow. But you know, that group of people share so many of the same opinions that I guess it applies equally to both of them.
O’Leary: Agreed. I think the thing that’s different is that, even though this is an O’Reilly paper, obviously it’s not their viewpoint necessarily. I was brought into it because my background is in publishing, in both magazines and books. I’m not as strongly steeped in books as many, and I really approached it with a kind of analytical framework of: if I wanted to figure out whether this was helping or hurting, what would I do? And that’s how we started the paper.
I don’t quite know, and sometimes I’m drawn into discussion threads and forums and the like, what to make of the “less filling, tastes great” debate that occurs with respect to piracy.
I’m not in favor of it, but I suppose people could pirate this paper. But ultimately, I think that people of good will are going to go out and pay for things once they have recognized that they have to continue to do so. I could be wrong though, and so we are testing it.
Biglione: And since we mentioned Cory Doctorow, I was going to ask you: you mention David Pogue in this paper, and we’ve talked about the reaction you’ve gotten from publishers. Obviously, authors have an opinion on these issues as well, and there are some very different points of view.
O’Leary: Yes. I think it’s one of the things we don’t have enough data to tell you the answer on yet, but we have a theory, which is about authors with a platform. David Pogue certainly has a platform; he is well known both through his books and through his writings in other media like the New York Times.
I think authors with platforms do lose more when piracy occurs, in part because their books are well known, their value is established, and readers don’t need to become familiar with them. I am a Macintosh user at home and at work, and David’s, I mean the whole Missing Manual series, is a bible for me, particularly for things related to the OS X system.
I wouldn’t even think about, you know, I don’t need a sample chapter, I know the book, I know the brand, I know the quality of the work that he and the folks working under his direction are capable of.
So if his book is pirated, it probably is a lost sale or at least could be conceived as a lost sale. But having read a lot of his comments and the like, one thing David does that I don’t agree with, at least not in all cases, is equate the presence of pirated content online with lost sales. I am willing to believe that with more data points some of that will be borne out in looking at his titles, or a group of titles like his, from authors with significant platforms.
But I don’t think that we have data that suggests that simply saying there are 11,000 titles that have been pirated and are online right now, and there were 3,000, has any correlation whatsoever with actual lost sales in the book business. Until we have that data, I am not saying it’s not an issue; I am just saying, let’s go get the data.
Biglione: Somewhat related to that, I have seen David Pogue also question whether or not the iPhone app of the iPhone Missing Manual is impacting sales, because it’s selling for such a low price and it’s available on the iPhone. He seems to have one opinion about it and his publisher seems to have a very different opinion about it. They are seeing it as an add-on to sales rather than a replacement of sales.
O’Leary: Well, there, again, it’s the question of whether or not an eBook or an app cannibalizes or augments total revenue. It’s something that’s almost chased at least at this point on a title by title level.
There is a debate going on that I saw described, sort of shouting in all caps, as CHEAP EBOOKS WILL KILL US ALL, and I laughed because I just don’t think we know. I mean, I think we know that consumers are more willing to buy things at lower price points, so that’s axiomatic, and at some point the falloff is greater than unitary, meaning that a small increase in price has an outsized impact on total sales and therefore lowers total revenue. But we are testing those boundaries right now.
We are talking about a segment of the marketplace that by anyone’s estimate is one or two percent of the total volume; apps, I don’t know the percentage, but it’s smaller than eBooks. I think that this is not a bad time to experiment and to do the kinds of things that Random House is doing in terms of free, and to allow the analysis, in O’Reilly’s case, of these peer-to-peer released titles.
It’s just we need data and figuring out what’s actually going on in the marketplace and how consumers behave. If anyone has a perfect answer to that I have not seen it.
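O’Leary’s remark about the falloff being “greater than unitary” refers to price elasticity of demand. The Python sketch below illustrates the idea under a textbook constant-elasticity assumption, where quantity sold scales as price raised to the elasticity. That demand model, the function name and the elasticity values are assumptions chosen for illustration; none of them come from the report.

```python
def revenue_change_pct(elasticity, price_change_pct):
    """Percent change in revenue after a price change.

    Assumes constant-elasticity demand: quantity is proportional to
    price ** elasticity, with elasticity negative for ordinary goods.
    """
    price_factor = 1 + price_change_pct / 100
    quantity_factor = price_factor ** elasticity
    return (price_factor * quantity_factor - 1) * 100


# A 10% price increase under three hypothetical elasticities.
elastic = revenue_change_pct(-1.5, 10)    # falloff greater than unitary
unitary = revenue_change_pct(-1.0, 10)    # exactly unitary
inelastic = revenue_change_pct(-0.5, 10)  # falloff less than unitary
```

When the elasticity is larger than one in magnitude, the drop in units sold outweighs the higher price and revenue falls. Below one, revenue rises. Where eBooks and apps sit on that scale is exactly what the interview says is still being tested.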
Biglione: And you need to get that data before the market tilts digitally in a meaningful way.
O’Leary: And it could be two or three years. I mean, it’s not a long time off potentially, and that’s true ultimately not just for guys like you and me who think about the industry and where it’s going, but for independent booksellers, perhaps even for Barnes & Noble or Borders, who can’t sustain a 15% loss in total revenue.
I think what it means is that the cost of the books that they do hold goes up. So it’s an interesting time, and I think that we can provide value, but if we can do it with more discussion and more moderation and less hysteria or immediate gut reaction, I think we’ll be better off.
Biglione: So where does this go next? Do you continue to try to expand your dataset? It seems like there is plenty of opportunity to spin off different types of research that seem to be somewhat related. You could get into the whole pricing issue and the impact it has on piracy. There is this whole related issue of DRM and the impact it has on piracy and the impact it has on consumer behavior.
Those all seem to be interesting factors too, which are really outside of the scope of this study, but you can easily see how they are related.
O’Leary: I do. In fact, one of the things that we talked about and wanted to layer in, and decided not to, when we talked about the piracy threat and the ongoing debate about free, was specifically what this implies for DRM, but I thought it was borrowing trouble there.
So it is certainly one of the things that we could go into, but to specifically answer your question what we are really interested in doing at this point is getting more data points and to that end over the course of the spring I have been talking to a number of the mainstream trade publishers and not one of them yet has said, “Yes,” but we are continuing to have discussions.
I am thinking, broadly, about trying to talk to a variety of smaller publishers, perhaps through associations like AAUP, the Association of American University Presses, to see if they are doing experiments that might be adequately rolled out. They seem to have less of a commercial bent and so therefore they might be interested in the academic research.
We have got an interesting carrot in that. Anyone who participates, obviously, gets access to all the data discussed to the extent that the participating publishers choose to discuss it.
So that’s a pretty good thing, if you are a small publisher and you have got ten experiments, or five, or even one, and you throw it into the hopper, you can look at the whole dataset, and that’s pretty cool.
So trying to get more participation, and to grow the paper, is important. If we don’t do that, this kind of dies on the vine a little bit, and that would not be a desirable outcome from my perspective.
Biglione: So if you are a publisher and you are listening to this or reading the transcript, and Brian O’Leary comes to you and asks for your help, say yes.
O’Leary: At least, at least say, yes, to the request for an interview, for a discussion.
Biglione: Right, that’s the take away on this interview.
O’Leary: Yes, I think we are effectively saying pool your data and the rising tide will lift all ships. I think publishers compete on things other than whether or not they are the most knowledgeable folks about what digital content works, and I think digital content as a marketing tool will work differently for different types of books. We don’t have this data in our dataset, but axiomatically, if you are a publisher of textbooks, I am not going to tell you to go and give away all your content. But there’s a lot of gray area, and this is a good opportunity to begin to shed some light on it.
Biglione: The full report is available from the O’Reilly website. I’ll link to it from the blog post that has this interview on it.
O’Leary: Great. Thank you.
Biglione: So anyone who sees the interview can see the full report and can download it. This is a Rough Cut.
O’Leary: It is.
Biglione: O’Reilly has Rough Cuts which are works in progress. So presumably as you update this, the purchase will buy you additional updates. Is that the way it works?
O’Leary: It’s more than presumably, that’s exactly how it works. If you bought it today and we did five more updates, you would get every one of them free of charge.
Biglione: So it’s like a subscription to the body of research.
O’Leary: It is, and it’s also available in both PDF and eBook formats, in the digital bundle that they sell, and I think it’s been added already to Safari.
Biglione: This is hugely important research at this point in time for the publishing industry, and I think probably for other content industries, in terms of changing the way we talk about things like piracy so that we can come to some more meaningful conclusions about what kind of impact it has, and ways that we can work with it or work around it or provide alternatives to just, you know, the kind of thing I referred to before, the moral imperative.
The thinking that piracy is bad and therefore we shouldn’t think about it doesn’t really solve any problems.
O’Leary: It certainly doesn’t from my perspective, but first we want to unbundle whether or not it is the problem, and after that start to talk about what we could do effectively to try and make sure that if it is a problem, we can adjust it and if it’s an opportunity, that we take advantage of it.
Biglione: Thanks for talking with us Brian and good luck with your research. If I see any publishers who you’ve talked to I will be sure to try to twist their arms.
O’Leary: Well, we will be doing an update and a presentation of this project at BEA, maybe some of them will come to see me there as well.
Biglione: Right, which day is that?
O’Leary: That will be on Thursday morning, I guess the 28th it is, at 9:30.
Biglione: Okay, thanks again, Brian.
O’Leary: My pleasure.