
22
May

Ted Talks on IBM Watson & Bayes’ rule in evolution

Posted by Acculation in Art, artificial intelligence, Featured, Photos, Singularity, unstructured data, Video, watson with 2 comments.

The secret to IBM Watson is the same one discovered a decade ago in statistical inference research: distance metrics & Bayes' rule. Photo: Wikimedia/mattbuck/cc-by-SA-3. #black #office #art #artwork #data #science This photo originally appeared in our Instagram on January 8, 2015, as the final clue in our reader's puzzle on what ostriches had to do with data science. (The clue was Bayes' rule.)

However, IBM Watson must do a great deal more to win Jeopardy! Finding a close match between a question and some Wikipedia text is, in many cases, far from finding the correct answer. IBM Watson looks at many other factors, such as the implied historical period of the question (modern medicine in Wikipedia would give the wrong answer to a question about medieval medicine). It has distance metrics for the popularity and authenticity of the data source. (Popular data sources aren’t always right. One example given is the lengths of the borders of South American countries, where a common fact frequently quoted by newspapers is, in fact, wrong.) When these different distance metrics are in conflict, it then applies machine learning to learn which metrics should dominate in any given answer.

The corollary, of course, is that much of the development time for a novel Watson solution will go into writing code to compute distance metrics. For example, in our hypothetical FBI database Watson implementation, suppose a frequent use case is matching partial license plates using natural language. Let’s say witnesses frequently say things like “license plate started with NQZ, the suspect had blue eyes, and last name sounded like Mike.” You could pay an in-demand SQL programmer to write complex SOUNDEX and regex queries against some database, and maybe come back with an answer several hours and several hundred dollars later. Or you can have IBM Watson or another natural language processing system (hopefully) figure out how to retrieve this information from your natural language query, using much more computing power but presumably working much faster and at lower total cost than the dedicated SQL programmer. In order to do that, however, Watson would probably need new distance metrics written for things the FBI (or witnesses) would commonly search for, such as (in our example) license plates or similar-sounding names. Basically, new distance metrics probably have to be written for anything that doesn’t frequently come up in Jeopardy!
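
To make the idea concrete, here is a minimal Python sketch of what two such custom distance metrics might look like for the witness statement above. This is our own illustration, not anything taken from Watson: the function names `plate_prefix_distance` and `name_distance` are hypothetical, and we use the classic Soundex code as a stand-in for "sounds like."

```python
def soundex(name: str) -> str:
    """Classic four-character American Soundex code for a name."""
    codes = {}
    for group, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
        for ch in group:
            codes[ch] = digit
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = codes.get(letters[0], "")
    for ch in letters[1:]:
        digit = codes.get(ch, "")
        if digit and digit != prev:
            result += digit
        if ch not in "HW":  # H and W do not separate letters with equal codes
            prev = digit
    return (result + "000")[:4]


def name_distance(a: str, b: str) -> float:
    """Hypothetical metric: fraction of Soundex positions that differ (0.0 = sounds alike)."""
    code_a, code_b = soundex(a), soundex(b)
    if not code_a or not code_b:
        return 1.0
    return sum(x != y for x, y in zip(code_a, code_b)) / 4


def plate_prefix_distance(partial: str, plate: str) -> float:
    """Hypothetical metric: fraction of the remembered leading characters that do not match."""
    partial = partial.upper().replace(" ", "")
    plate = plate.upper().replace(" ", "")
    if not partial:
        return 0.0
    mismatches = sum(1 for i, ch in enumerate(partial)
                     if i >= len(plate) or plate[i] != ch)
    return mismatches / len(partial)


# "license plate started with NQZ ... last name sounded like Mike"
print(plate_prefix_distance("NQZ", "NQZ 4821"))  # plate begins with the remembered letters
print(name_distance("Mike", "Mick"))             # both names share one Soundex code
```

Each function scores one narrow kind of similarity and nothing else, which is the point: the metric for plates knows nothing about names, and vice versa.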

Not many questions involving similar license plates come up in Jeopardy!, although similar-sounding names might, so perhaps only one new metric has to be written in this example. This would create a new quantitative score that compares two cases in the database exclusively on how similar their license plates are. A separate metric (or at least separate treatment) is needed, because similar or matching license plates between two cases is a qualitatively very different signal from some random text matching between the cases. The choice of metrics presumably re-imposes some structure on the resulting system and data interpretation. In a more realistic example, an experienced agent would guide the Watson engineers in creating new quantitative metrics based on how agents compare cases or suspects in real life. They might create a metric that could compare two artists’ sketches, for example, or score how similar a sketch is to a photo. Machine learning would then take over to figure out how to integrate the different metrics in formulating responses to natural language questions. (For example: should a similar license plate dominate when appearances are different?)
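
As a rough sketch of that last step, the Python below learns how much weight each metric deserves from a handful of labeled case pairs. Everything here is an assumption made for illustration: the training data is invented, the two features (plate distance, appearance distance) are hypothetical, and plain logistic regression stands in for whatever learning method a real system would use.

```python
import math


def train_weights(examples, labels, epochs=2000, lr=0.5):
    """Fit logistic-regression weights with simple stochastic gradient descent."""
    weights = [0.0] * len(examples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(examples, labels):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - y  # gradient of the log loss with respect to z
            bias -= lr * err
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
    return weights, bias


def match_probability(weights, bias, distances):
    """Estimated probability that two cases are related, given their metric distances."""
    z = bias + sum(w * d for w, d in zip(weights, distances))
    return 1.0 / (1.0 + math.exp(-z))


# Invented training pairs: (plate distance, appearance distance) -> 1 if same suspect.
# In this toy data the plate decides the outcome and appearance is uninformative.
examples = [(0.0, 0.1), (0.0, 0.9), (0.1, 0.5),
            (1.0, 0.1), (1.0, 0.9), (0.9, 0.5)]
labels = [1, 1, 1, 0, 0, 0]

weights, bias = train_weights(examples, labels)
print("learned weights (plate, appearance):", weights)
print("similar plate, different appearance:",
      match_probability(weights, bias, (0.0, 0.9)))
```

On this made-up data the learned plate weight ends up far larger in magnitude than the appearance weight, so a similar plate does dominate. With real case data the answer could easily come out the other way, which is exactly why the weighting is learned rather than hard-coded.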

These considerations then get at the true cost of a Watson deployment. They also answer the question about the infrastructure that should be built out to develop a pre-Watson prototype: if the similarity you’re looking for isn’t asked for in Jeopardy!, write a custom distance metric for it.

Photo: Wikimedia/mattbuck/cc-by-SA-3. Black light office art & artwork: our featured photo is Bayes’ theorem in neon. When this photo was originally published on our Instagram feed, we used it to wrap up our final set of clues in our reader’s puzzle on the relationship between ostriches, Aristotle and data science.

This is Bayes’ theorem from statistics (and data science) spelled out in blue neon at the Cambridge, UK offices of data science firm Autonomy. (Apologies to the frequentists, or should we say frequentistas, among our readers. This supposedly rival but in reality complementary branch of statistics is in a holy war against Bayesians. We’re being satirical here in a nod to today’s (Jan 8, 2015’s) tragic events. Even bloggers have been targeted by dictators and fanatics, so these things make us all less free, but more on that later.)

The clue was Bayes’ rule. As we and others have argued elsewhere, there are only so many ways you can design an intelligent system. Convergent evolution requirements will dictate that such systems use statistical inference, and specifically Bayes’ rule, in essentially any such system. There is growing evidence in neuroscience that the human brain does, indeed, use Bayes’ rule, hardcoded by evolution. IBM Watson thus necessarily makes use of Bayes’ rule as one of many parts of a complex chain of statistical reasoning and machine learning heuristics.
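
For readers who want to see the rule itself at work, here is a small Python sketch. Bayes’ rule says P(H|E) = P(E|H) P(H) / P(E), where H is a hypothesis (say, "this candidate answer is correct") and E is a piece of evidence (say, "a trusted source matches it"). The numbers and the function name `posterior` are made up for illustration and are not taken from Watson.

```python
def posterior(prior: float, p_evidence_if_true: float, p_evidence_if_false: float) -> float:
    """Bayes' rule: probability the hypothesis is true after seeing the evidence."""
    # Total probability of seeing the evidence, whether or not the hypothesis holds.
    evidence = p_evidence_if_true * prior + p_evidence_if_false * (1.0 - prior)
    return p_evidence_if_true * prior / evidence


# Assumed numbers: a candidate answer starts with a 1% chance of being right;
# a matching source shows up 90% of the time for right answers and 5% for wrong ones.
updated = posterior(prior=0.01, p_evidence_if_true=0.9, p_evidence_if_false=0.05)
print(round(updated, 4))  # 0.1538
```

One matching source lifts the candidate from 1% to roughly 15%: useful, but far from conclusive, which is why a chain of many such updates (and many metrics) is needed.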


Tagged: art, artificial intelligence, artwork, blue, business, data, figure, historical, intelligent, light, more, novel, photo, science, singularity, us, water, watson

There are 2 comments so far

open semantic meaning platforms: alternatives to IBM Watson? « Acculation

9 years ago · Reply

[…] out our more recent post on Watson, which includes a selection of Ted Talks on Watson. This also talks about Watson’s use of distance matrices in statistical inference, […]

Signal processing, motion, and artificial intelligence: Ted Talk « Acculation

9 years ago · Reply

[…] AI (or GOFAI) has its uses. As we’ve previously argued in our discussion on Bayes’ Rule and IBM Watson, statistical inference is much more computationally expensive than GOFAI. We noticed a decade ago […]
