

24
Apr

Watson from IBM: Why semantic text tech helps analytics

Posted by Acculation in analytics, Art, artificial intelligence, crowdsourcing, education, Featured, Internet of Things, math, tech, unstructured data, watson with 7 comments.

IBM's Watson natural language processing software is named after IBM founder, Thomas J. Watson, Sr., pictured here in this 1920s photo from IBM's corporate archives. Photo: Wikimedia/IBM/CC-BY-SA-3.0

This example also illustrates nicely why this is a valuable computation technique. If you’re able to “lazily” (a technical term) leave data unstructured until its value is certain, you can eliminate the significant design, storage, and data entry costs associated with database schemas.

We briefly touched on IBM’s Watson previously in our article on Cisco’s CES talk on the Internet of Things. The Internet of Things will create a lot of data. Some of it, especially in hindsight, will not be optimally structured. And, as our example above illustrated, it is actually more efficient to leave data unstructured when there is a great deal of it and the relative future importance of various features is uncertain. That’s where Watson will come into play.
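To make the “lazy” idea concrete, here is a minimal Python sketch of structuring data only when a feature turns out to matter. It is not how Watson works internally. The sample notes, the `extract_lazily` helper, and the `temp=` pattern are all invented for illustration.

```python
import re
from typing import Iterable, Iterator, Optional

# Hypothetical raw records, stored exactly as they arrived (no schema).
RAW_NOTES = [
    "sensor 17 reported temp=71F at the loading dock; door left open",
    "technician note: vibration on pump 4, temp=88F, scheduled for review",
    "visitor badge scanned at gate B",
]

def extract_lazily(notes: Iterable[str], pattern: str) -> Iterator[Optional[str]]:
    """Yield the first capture group of `pattern` for each note, or None.

    This is a generator, so nothing is parsed until the caller iterates.
    The cost of structuring a feature is paid only when someone asks for it.
    """
    compiled = re.compile(pattern)
    for note in notes:
        match = compiled.search(note)
        yield match.group(1) if match else None

# Later, once temperature turns out to matter, structure it on demand.
temperatures = list(extract_lazily(RAW_NOTES, r"temp=(\d+)F"))
print(temperatures)
```

No column for temperature was ever designed, stored, or keyed in. If temperature had never become interesting, the extraction would simply never have run.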


Apparently, IBM Watson could mine text very well for Jeopardy answers, somehow. But what is this technology, exactly? Can it do something besides play a mean game of Jeopardy?

Medicine like playing Jeopardy!?

IBM’s first killer application for Watson is assisting medical doctors in keeping up with the latest research. A huge amount of medical research is published each year. Articles sometimes provide new insights into diagnosis and treatments of disease, especially the more exotic and interesting cases. But the amount of new literature is vast and nearly impossible for specialists to keep up with, let alone your average general practitioner.

Enter Watson, which can understand large quantities of text almost like a human. It can also answer “natural language” questions from humans (medical doctors) and respond to those questions in a natural way, as proved on Jeopardy. (It doesn’t really “understand” yet the way a human does, but it is able to create statistical models of the meanings of questions and text.) So, when it is asked a question by a human, it is able to find the medical articles that are statistically most likely to be relevant in answering that question, and present that result back to the doctor.
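For a rough feel of what “statistically most likely to be relevant” can mean, here is a small Python sketch that ranks texts against a question using TF-IDF weights and cosine similarity. This is a classic textbook stand-in, not IBM’s actual method, and the three abstracts are invented.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def tfidf_vectors(docs):
    """Return one {term: weight} vector per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed inverse document frequency: rarer terms get larger weights.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    vectors = [{t: cnt * idf[t] for t, cnt in Counter(toks).items()}
               for toks in tokenized]
    return vectors, idf

def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank(question, docs):
    """Return (score, doc_index) pairs, most relevant first."""
    vectors, idf = tfidf_vectors(docs)
    # Question terms that appear in no document get zero weight.
    q = {t: c * idf.get(t, 0.0) for t, c in Counter(tokenize(question)).items()}
    return sorted(((cosine(q, v), i) for i, v in enumerate(vectors)),
                  reverse=True)

# Invented abstracts, for illustration only.
ABSTRACTS = [
    "A rare presentation of fever and rash following travel",
    "Long term outcomes of knee replacement surgery",
    "Dietary fiber and cardiovascular risk in adults",
]

ranking = rank("What causes fever with a rash after travel?", ABSTRACTS)
print(ranking[0][1])  # index of the best-matching abstract
```

Nothing here “understands” the question. The first abstract wins only because it shares weighted words with the question, which is the statistical flavor described above in its simplest possible form.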

Now, to be perfectly fair, this may not be an entirely new concept. (More on why tech to search scientific articles isn’t new in a bit.) Moreover, medical articles play to the computer’s strengths (much like the game of chess in that other famous IBM exhibition). While they are pretty far along in the continuum of structured versus unstructured text, medical articles still have more structure than an average newspaper article or short story. Researchers go through a precise ritual when writing a medical article. There’s an abstract, introduction, conclusion, and so on. Space is very limited, so researchers describe concisely what they are doing in a set number of words in each section, following a pre-set scientific style. (This is unlike a classic novel by, say, Agatha Christie or Lewis Carroll, which might jump from first-person to third-person narration or switch from prose to verse mid-story.) Scientific articles use a lot of jargon. This also plays to the strength of computers, which can have a potentially unlimited vocabulary.

Searching scientific articles isn’t new tech

Scientific articles use citations to link paragraphs and sentences to other scientific articles. These citations also follow one of a small number of allowed formats, which provide a standardized reference intended to retrieve the cited article. The vast majority of these other articles, going back several decades, will already be online. Again, advantage computer, since it will be able to instantly retrieve and scan each citation to learn more about the meaning of the article. The poor human specialist must either already be familiar with the article (as is sometimes the case with highly cited articles in specialized fields) or spend time retrieving and reading it.
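Because citations follow a handful of fixed formats, even a simple pattern can turn one into structured fields. The Python sketch below assumes a simplified, Vancouver-like layout (authors, title, journal, then year;volume(issue):pages). The citation itself, including the journal name, is made up, and real reference lists have many more variations than this pattern handles.

```python
import re

# Assumed layout: "Authors. Title. Journal. Year;Volume(Issue):Pages."
CITATION_PATTERN = re.compile(
    r"^(?P<authors>.+?)\.\s+"
    r"(?P<title>.+?)\.\s+"
    r"(?P<journal>.+?)\.\s+"
    r"(?P<year>\d{4});(?P<volume>\d+)\((?P<issue>\d+)\):(?P<pages>[\d-]+)\.$"
)

def parse_citation(citation):
    """Return the citation's fields as a dict, or None if it does not fit."""
    match = CITATION_PATTERN.match(citation.strip())
    return match.groupdict() if match else None

# An invented citation, for illustration only.
example = "Doe J, Roe A. An invented example title. J Example Med. 2010;12(3):45-67."
fields = parse_citation(example)
print(fields)
```

Once the journal, year, volume, and pages are separate fields, a program has what it needs to look the article up, which is the retrieval step that costs the human specialist so much time.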

Moreover, in many cases the National Library of Medicine (NLM) and similar groups have electronically annotated cited articles in a machine-readable way. (Each discipline has its own system, but medicine and biology often use the MeSH ontology.) This was originally intended to speed up researchers’ searches for related articles in PubMed/Medline (the online electronic article abstract searching system set up by the NLM, which took the place of multiple similar commercial services in the 1990s). If you knew (or know) the MeSH terms for the subjects you are interested in, you can pull the abstracts of matching articles over the Internet via Medline. This, in turn, sometimes allows access to the full-text articles on publisher sites.
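As a rough illustration of pulling abstracts by MeSH term, here is a Python sketch against NCBI’s public E-utilities web service for PubMed. The helper names are our own, “Asthma” is just an example MeSH heading, and the JSON layout read in `fetch_abstracts` is our understanding of the ESearch response, so check the current NCBI documentation (and its usage limits) before relying on it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

def esearch_url(mesh_term, max_results=5):
    """Build a PubMed search URL restricted to one MeSH heading."""
    params = {
        "db": "pubmed",
        "term": f'"{mesh_term}"[MeSH Terms]',
        "retmax": max_results,
        "retmode": "json",
    }
    return f"{EUTILS}/esearch.fcgi?{urlencode(params)}"

def efetch_url(pmids):
    """Build a URL that returns plain-text abstracts for the given PubMed IDs."""
    params = {
        "db": "pubmed",
        "id": ",".join(pmids),
        "rettype": "abstract",
        "retmode": "text",
    }
    return f"{EUTILS}/efetch.fcgi?{urlencode(params)}"

def fetch_abstracts(mesh_term, max_results=5):
    """Search by MeSH term, then download the matching abstracts (uses the network)."""
    with urlopen(esearch_url(mesh_term, max_results)) as response:
        # Assumed response shape: {"esearchresult": {"idlist": [...]}}
        pmids = json.load(response)["esearchresult"]["idlist"]
    if not pmids:
        return ""
    with urlopen(efetch_url(pmids)) as response:
        return response.read().decode("utf-8")

# Building the URL does not touch the network.
print(esearch_url("Asthma"))
```

Calling `fetch_abstracts("Asthma")` would perform the two requests in sequence: one to find matching article IDs, and one to download their abstracts as text.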



There are 7 comments so far

Author

10 years ago

We were curious to know what the folks at IBM thought about some of our proposed uses for Watson, so we posted to the IBM developer forum. Will Sennett of IBM was kind enough to write a detailed response on the IBM site.

Here’s an excerpt: “I’d have to dig a bit more at the FBI assistant example … certainly solutions in the big data and analytics realm that are a great fit for government…. On the HR side, I think you’re spot on. In fact, one of our Watson Mobile Developer
[Watson] application difficulty and complexity is probably dependent on the data ….”

Read his full response on the IBM forum.

Reviews of our app, or working more with governments on air quality | Acculation

10 years ago

[…] recent articles on IBM Watson analytics and Google Glass generated a lot of interest with people contacting us privately to ask for advice […]

Oh send in the trolls. Oh where are the trolls? There aren't any trolls.... | Acculation

10 years ago

[…] are all the trolls on the Internet? We have done our best to tick people off in this blog. We skewered Google Glass. We did not have kind words for IBM Watson’s marketing department. We’ve even poked fun […]

open semantic meaning platforms: alternatives to IBM Watson? | Acculation

10 years ago

[…] been a fan of IBM’s Watson semantic meaning analytics system since IBM first announced they were opening up their ecosystem. Around the time of CES we pointed […]

Ebola: Can big data or semantic text help?

10 years ago

[…] up the topic of semantic text systems. In our earlier article from April, we mentioned a “bear in the woods” scenario. The idea there is that structured data, such as the forms used in hospital […]

Ted Talks on IBM Watson & Bayes' rule in evolution

9 years ago

[…] of our most read articles have been on IBM Watson, including suggestions & possible alternatives. We’ve pushed IBM several times to come up with better demos for […]

Acculation Author

9 years ago

Twitter comments updated.
