TextAnalytics


Intelligence from large texts:
transforming unstructured data


News

Software release

Release of Attensity 4. According to the company, the software offers:
"* Rapid and seamless identification of trends and issues hidden in text
* Enhanced ability to make decisions based on information in both structured and unstructured data
* Unified, integrated Text Analytics suite that offers the widest and most comprehensive array of Text Analytics approaches from search and targeted extraction to Attensity's patented Exhaustive Extraction™
* Fast and easy implementation allowing organizations, without extensive knowledge engineering, to analyze, understand and act on critical information hidden in text"

Webinars etc

Attensity: Advances in Text Analytics; Text Analytics for Insurance Early Warning and Detection. On-demand webinars.

Beyond Buzzword Bingo: Discover the Real Business Value of Search and Intelligence. Archived at Inxight.

On-demand presentations

Applying Text Analytics Solutions for Effective Claims Analysis. Attensity

Google/Inxight webinar. June 20. Slides available from Inxight.


Text Analytics Conference

The 2nd Annual Text Analytics Summit was held June 22-23 in Boston.  Some presentations will be made available.


Articles

What is Text Analytics?

6 AUGUST 2006

Data crunched by companies and government agencies is typically quantitative. These numbers are manipulated within relational databases to yield useful information. However, the intelligence potentially available to organizations is much larger than what is garnered from these traditional sources. Note the phrase “potentially available”. How do we get access to this vast potential resource? The problem is that useful business intelligence is buried within large amounts of text data, such as company documents, emails, customer survey reports, and so on. Text documents are structured for reading by people, but they are unstructured as far as data extraction is concerned.  The essence of text analytics is to take very large unstructured text documents and extract useful business intelligence.

Before examining text analytics in more detail, let’s consider a range of ways to extract data from large texts. We can distinguish two broad approaches: queries and transformations.

Queries. One way to extract information from large texts is to formulate a query. Once a query is specified, software routines trawl through the text to provide a response. A response may be something as simple as a list of all instances where the words “IBM” and “UIMA” occur within a certain span of each other, say within 10 words or so. The queries and the responses may be more complex than this, but what characterizes a query is that you have to specify it. To formulate a good query, you have to know what you want to know, and then decide how to structure the query, within the constraints of the query system software, to obtain the desired results. You also have to make assumptions about the kind of information contained in the text documents.
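
The proximity query described above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular text analytics product works: the function name find_near, its parameters, and the simple word tokenizer are all assumptions made for the example.

```python
import re

def find_near(text, term_a, term_b, window=10):
    """Return the word spans in which term_a and term_b both fall
    inside a span of at most `window` words (hypothetical helper)."""
    # Split the text into words; keep original casing for display.
    words = re.findall(r"\w+", text)
    lowered = [w.lower() for w in words]
    a, b = term_a.lower(), term_b.lower()

    # Record every position at which each term occurs.
    pos_a = [i for i, w in enumerate(lowered) if w == a]
    pos_b = [i for i, w in enumerate(lowered) if w == b]

    hits = []
    for i in pos_a:
        for j in pos_b:
            # Both terms fit in a `window`-word span when their
            # positions differ by fewer than `window`.
            if abs(i - j) < window:
                start, end = min(i, j), max(i, j)
                hits.append(" ".join(words[start:end + 1]))
    return hits

sample = "IBM released the UIMA framework as open source."
print(find_near(sample, "IBM", "UIMA"))
```

Note that the caller has to supply both terms and the window size, which is the point made above: a query only returns what you already knew to ask for.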

Transformations. A query can be considered to be a request to reveal specified data patterns hidden within a text. An alternative way to deal with texts is to give a request along the lines of: “transform yourself to reveal interesting data patterns”. A simple example of this notion might be a request for a summary of a document. Following this transformation metaphor, running summarization software can be viewed as asking a document to transform itself into a summary.
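
As a rough illustration of the summary example, the sketch below scores each sentence by how frequent its words are in the whole document and keeps the top-scoring sentences. Frequency-based extraction is just one simple technique chosen for the example; the function name summarize, the small stopword list, and the scoring rule are assumptions, not a description of any vendor's method.

```python
import re
from collections import Counter

# A deliberately tiny stopword list, for illustration only.
STOPWORDS = {"a", "an", "and", "as", "from", "in", "is", "of",
             "the", "to", "was", "were"}

def content_words(text):
    """Lowercase the text and return its non-stopword tokens."""
    return [w for w in re.findall(r"[a-z']+", text.lower())
            if w not in STOPWORDS]

def summarize(text, max_sentences=1):
    """Return the highest-scoring sentences, in document order."""
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text)
                 if s.strip()]
    # Word frequencies over the whole document.
    freq = Counter(content_words(text))

    def score(sentence):
        tokens = content_words(sentence)
        # Average frequency, so long sentences are not favoured.
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)

doc = ("Text analytics extracts intelligence from text. "
       "The weather was nice. "
       "Text analytics helps companies.")
print(summarize(doc, max_sentences=1))
```

Unlike the query case, the caller names no entities or terms in advance; whatever dominates the document is what surfaces.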

Both queries and transformations are useful and have their place. One interesting aspect of a transformation approach is that few assumptions are made about the content of the data patterns in the texts. If you want a broad picture of the content of texts, then in adopting a transformational approach, you are giving the data patterns a chance to reveal themselves. If, on the other hand, you know you want to find out about IBM and UIMA, then a query is the right way to go. You know what you are looking for and you know which entities are relevant.


Text Analytics Case Studies 

6 AUGUST 2006

Finding the best reviewers for particular grant applications (pdf) Content Analyst

© 2007 Michael Barlow