Big Data Analytics: Articles, Movies, Songs Robo-written by Computer? (Page 3)
Robo-journalism is nothing new
In some sense, of course, blog and newspaper articles generated at least partially by computer are nothing new. For at least ten years, articles have recommended optimizing content around advertiser or search keywords to drive traffic. Some go as far as recommending that content be chosen entirely around the most-searched-for keywords or the most expensive paid-search keywords. (Your blog will be very boring if you do this. The author remembers one such article from ten years back that recommended writing a blog, or blog articles, entirely around private jets, because private jet manufacturers at the time were paying several dollars for search keywords. Good luck if you attempt this.)
There are computer algorithms that analyze Google and Bing databases to identify which keywords or topics to write your blog article around, based on the most popular or the most expensive topics. If the content on some blogs has been selected on such a recommendation, are these blog and webzine articles not essentially computer-generated? Some publishers have taken this a step further, to the point where the content really is auto-generated from keyword databases or scraped from Twitter (minus even the human quality-control check that the LA Times imposes, and without the disclaimer that it is an auto-generated article). It’s usually immediately obvious that this is poor-quality, computer-generated content, and Google has implemented algorithms to try to weed out these publishers. (These kinds of auto-generated pages are sometimes referred to as ‘advertiser arbitrage’, where the page links one advertising system with another, using an auto-generated page as a go-between. ‘Arbitrage’ is usually seen as a good word in financial markets: the arbitrageur is a necessary market player that links different financial exchanges to improve price discovery, greatly increase liquidity, and decrease the spreads that are essentially a tax on the user. But in the world of online advertising it is considered something of a dirty word.)
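To make the keyword-selection idea concrete, here is a minimal sketch in Python of ranking candidate topics by popularity or by advertiser cost. Everything in it is an assumption for illustration: the Keyword class, the rank_keywords function, and the sample figures are invented, and no real Google or Bing keyword tool is being called.

```python
from dataclasses import dataclass


@dataclass
class Keyword:
    """Hypothetical record for one candidate keyword."""
    phrase: str
    monthly_searches: int   # made-up popularity figure
    cost_per_click: float   # made-up advertiser price, in dollars


def rank_keywords(keywords, by="popularity", top_n=3):
    """Return the top_n keywords, sorted by search volume or by ad cost."""
    if by == "popularity":
        sort_key = lambda k: k.monthly_searches
    elif by == "cost":
        sort_key = lambda k: k.cost_per_click
    else:
        raise ValueError("by must be 'popularity' or 'cost'")
    return sorted(keywords, key=sort_key, reverse=True)[:top_n]


# Illustrative data only; these numbers are not real search statistics.
SAMPLE = [
    Keyword("private jets", 9_000, 6.50),
    Keyword("celebrity gossip", 800_000, 0.20),
    Keyword("big data analytics", 40_000, 3.10),
    Keyword("cat videos", 1_200_000, 0.05),
]

if __name__ == "__main__":
    print([k.phrase for k in rank_keywords(SAMPLE, by="popularity", top_n=2)])
    print([k.phrase for k in rank_keywords(SAMPLE, by="cost", top_n=2)])
```

A blog that followed the "cost" ranking blindly would end up writing about private jets, which is exactly the kind of recommendation described above.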
Robo-journalism and the Singularity
As computers become increasingly powerful due to Moore’s law, we expect computer-generated content to become harder to distinguish from human-generated content as we approach the much-hyped Singularity described in Ray Kurzweil’s famous book and the related movies. (We previously touched on the Singularity in our coverage of Cisco’s IoT talk at CES.) Already, a Google News-style function is being used in many online newsrooms to select which human-generated content to give prominence, so even if the articles are not written by computer, computer algorithms are deciding which topics should be written about and which articles should be displayed. Is the next step, which the Los Angeles Times has now taken, of having the entire article also be written by computer, really that much of a novelty?
Speaking of Kurzweil and the Singularity, we should mention that Kurzweil, whose father was a musician, first achieved fame as a young man on a 1960s TV quiz show where he introduced ‘computer-generated’ music. (This was before he achieved commercial success with an early OCR machine for the blind, or his eponymous brand of commercial music synthesizers.) The idea, which has been copied and elaborated on many times since, is to randomly select different human-written musical phrases under rules that govern which pre-written phrase may transition into which other phrase. The author remembers a 1980s ‘Mozart Machine’ for home computers that did something similar; the music was vaguely Mozart-esque except long-winded and, due to the random selection of phrases, lacking the thematic discipline used by real composers. (In the same way, ‘Quakebot’, the Los Angeles Times’ robot journalist, lacks ‘elegant variation’ in its writing, as described in the Slate article. A human musician would normally have the different sections of a piece be related in some way, so that each section is an elegant variation of another, rather than the sections being either verbatim repetitions or entirely unrelated, as in the output of these early robo-composers.) Most of these attempts were limited to classical music because it was easy to synthesize on the 4-voice monaural synthesizer chips of the time, but the concept can be applied nowadays to any genre of music (although it may still sound bad). We will come back to the concept of computer-generated, or at least computer-analyzed, modern commercial pop music in a bit.
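The phrase-selection idea described above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the phrase names, the notes, and the transition table are invented, and this is not a reconstruction of Kurzweil's program or of the ‘Mozart Machine’.

```python
import random

# Hypothetical library of pre-written phrases (name -> notes).
PHRASES = {
    "opening": ["C4", "E4", "G4", "C5"],
    "answer": ["D4", "F4", "A4", "G4"],
    "bridge": ["E4", "G4", "B4", "A4"],
    "cadence": ["G4", "F4", "D4", "C4"],
}

# Rules governing which phrase may transition into which other phrase.
# An empty list means the piece ends after that phrase.
TRANSITIONS = {
    "opening": ["answer", "bridge"],
    "answer": ["bridge", "cadence"],
    "bridge": ["answer", "cadence"],
    "cadence": [],
}


def compose(start="opening", max_phrases=8, seed=None):
    """Chain phrases at random, respecting the transition rules."""
    rng = random.Random(seed)
    sequence = [start]
    while len(sequence) < max_phrases:
        options = TRANSITIONS[sequence[-1]]
        if not options:  # reached a phrase with no allowed successor
            break
        sequence.append(rng.choice(options))
    # Flatten the chosen phrases into one list of notes.
    notes = [note for name in sequence for note in PHRASES[name]]
    return sequence, notes


if __name__ == "__main__":
    names, notes = compose(seed=1)
    print(" -> ".join(names))
    print(" ".join(notes))
```

Because each choice depends only on the previous phrase, nothing ties a later section back to an earlier theme, which is one way to see why such output tends to sound long-winded and thematically undisciplined.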
Computer-generated feature films?
Let’s talk about computer-generated movies. Epagogix is a company, reportedly under contract to at least one major US studio, that uses computer algorithms to analyze movie scripts and predict their success. Apparently, it analyzes the themes used in the movie script and how closely they match certain well-known formulas. (It’s not immediately clear to us whether humans must first read the script and encode its themes in a machine-readable format.) According to an article by BBC Focus Magazine on Epagogix’s website, they predicted that the $50 million 2007 Drew Barrymore movie Lucky You would flop. (It did flop, making only $5.7 million.) This analysis is independent of the movie’s marketing budget (normally a very important predictor of box office success), in that the intention is to help executives decide which movies to invest in based purely on the script.
Although it is not mentioned in the BBC article, Epagogix supposedly scored the movie poorly because it was not shot in exotic locations, although the casting of Drew Barrymore was a factor in the movie’s favor. Obviously, at least on this particular movie, Epagogix did much better than the human studio executives who green-lit the movie; this isn’t a surprising result, as even expert human decision making is found to be biased and can be supplemented by computer models with domain expertise. Because movie scripts that are poorly scored by Epagogix are then rejected by movie studios, the criticism has been made that scripts nowadays are being ‘written’ by computer programs (in the sense that human writers will write and rewrite their scripts until the computer is pleased).
Search API will now always return "real" Twitter user IDs. The with_twitter_user_id parameter is no longer necessary. An era has ended. ^TS
Twitter API (@twitterapi), November 7, 2011
There are 7 comments so far
[…] First in a series on applications of big data analytics, we discuss articles, songs, movies & content created & curated by algorithms. (Is hollywood out of a job? Future movies & songs robo-written by #bigdata #analytics algos? […]
[…] Unless you’ve been living under a large rock these last few days, you’ve probably heard that Los Angeles was struck in the last two weeks by what the USGS describes as a “moderate” 5.1 earthquake with “light” fore and aftershocks of around 4.5. (The Saint Patrick’s Day foreshock trembler prompted our earlier article on robot-written newspaper articles , music, and movies.) […]
[…] how you can take the Social Progress Index and use it to implement robot governance, something we talked about at the end of a previous post on Big Data Analytics. First, you have to accept that a metric like the Social Progress Index is a better objective […]
[…] generation, resulting in an exponential acceleration of technology. Futurists like Kurzweil who we have often mentioned in these blog pages believe this will lead to the Singularity, the point at which computer intelligence surpasses […]
[…] you read that article title right. We’ve done past articles on how computers were now writing newspaper articles automatically. When we learned there was software to “auto-generate” blog content, we were […]
[…] readers of our blog will recall our earlier article on predictive analytics models for Hollywood movie script or a pop songs. So, if you happen to be trying to sell one of these to a major corporation, you already have a […]
[…] did lightheartedly promise a series of articles on using big data for territorial expansion. Although there is no shortage of countries interested in these analytics applications, this […]